{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"_Alex Malz (NYU)_\n",
"_(Add your name here when contributing.)_\n",
"\n",
"# PRObabilistic CLAssification Metrics\n",
"\n",
"This notebook explores the behavior of a number of classification metrics, drawn from [discussions](https://docs.google.com/document/d/1pg0KUY0KihjlWKwoCE7Fc29u9pjv-fhwUnL8o34s58k/edit#) in the context of PLAsTiCC."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Many classification metrics are already implemented in [`scikit-learn`](http://scikit-learn.org/stable/modules/model_evaluation.html#classification-metrics)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import sklearn as skl"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Receiver Operating Curve (ROC)\n",
"\n",
"[on Wikipedia](https://en.wikipedia.org/wiki/Receiver_operating_characteristic)\n",
"\n",
"Pros\n",
"* Works with multi-label data\n",
"\n",
"Cons\n",
"* Doesn't naturally work with multi-class data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.metrics import roc_curve"
]
},
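{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch (toy labels and scores invented for illustration, not PLAsTiCC data), `roc_curve` returns the false positive rates, true positive rates, and the decision thresholds at which they were evaluated:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"y_true = np.array([0, 0, 1, 1])\n",
"y_score = np.array([0.1, 0.4, 0.35, 0.8])\n",
"fpr, tpr, thresholds = roc_curve(y_true, y_score)\n",
"fpr, tpr, thresholds"
]
},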
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### ROC Area Under Curve (AUC)\n",
"\n",
"Pros\n",
"* Commonly used\n",
"\n",
"Cons\n",
"* Not good for sparse classes\n",
"* \"Noisy\" metric"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.metrics import roc_auc_score"
]
},
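{
"cell_type": "markdown",
"metadata": {},
"source": [
"Collapsing the same toy example (made-up data, as above) to a single number with `roc_auc_score`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"y_true = np.array([0, 0, 1, 1])\n",
"y_score = np.array([0.1, 0.4, 0.35, 0.8])\n",
"auc_score = roc_auc_score(y_true, y_score)\n",
"auc_score  # 0.75: 3 of the 4 (positive, negative) pairs are ranked correctly"
]
},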
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Standard Score (zROC)\n",
"\n",
"[on Wikipedia](https://en.wikipedia.org/wiki/Standard_score)\n",
"\n",
"Pros\n",
"\n",
"Cons\n",
"* Not implemented in `scikit-learn`"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# write it!"
]
},
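{
"cell_type": "markdown",
"metadata": {},
"source": [
"One way to sketch it (an assumption: the zROC is taken here to be the ROC with both coordinates mapped through the inverse normal CDF, and `zroc_curve` is a hypothetical helper name):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import scipy.stats as sps\n",
"from sklearn.metrics import roc_curve\n",
"\n",
"def zroc_curve(y_true, y_score, eps=1.e-6):\n",
"    # hypothetical helper: ROC coordinates in standard-score (z) units\n",
"    fpr, tpr, thresholds = roc_curve(y_true, y_score)\n",
"    # clip to keep the z-scores finite at rates of exactly 0 or 1\n",
"    z_fpr = sps.norm.ppf(np.clip(fpr, eps, 1. - eps))\n",
"    z_tpr = sps.norm.ppf(np.clip(tpr, eps, 1. - eps))\n",
"    return z_fpr, z_tpr"
]
},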
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Detection Error Tradeoff (DET) Graph\n",
"\n",
"[on Wikipedia](https://en.wikipedia.org/wiki/Detection_error_tradeoff)\n",
"\n",
"Pros\n",
"* More sensitive to areas of interest than ROC\n",
"\n",
"Cons\n",
"* Not implemented in `scikit-learn`"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# write it!"
]
},
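{
"cell_type": "markdown",
"metadata": {},
"source": [
"A sketch under the usual DET convention: false negative rate against false positive rate. Since FNR = 1 - TPR, it follows directly from `roc_curve` (the normal-deviate axis scaling is left to the plotting step; `det_curve` here is a hypothetical helper name):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from sklearn.metrics import roc_curve\n",
"\n",
"def det_curve(y_true, y_score):\n",
"    # hypothetical helper: FNR vs. FPR at each threshold\n",
"    fpr, tpr, thresholds = roc_curve(y_true, y_score)\n",
"    fnr = 1. - tpr\n",
"    return fpr, fnr, thresholds"
]
},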
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Log-Loss\n",
"\n",
"[on Wikipedia](https://en.wikipedia.org/wiki/Loss_functions_for_classification)\n",
"\n",
"Pros\n",
"* `scikit-learn` implementation works with multi-class data\n",
"\n",
"Cons\n",
"* Doesn't naturally work with multi-class data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.metrics import log_loss"
]
},
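{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal multi-class-capable sketch (toy labels and predicted probabilities, invented for illustration; the columns of the probability array follow the sorted class labels, here [ham, spam]):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"y_true = ['spam', 'ham', 'ham', 'spam']\n",
"y_prob = [[0.1, 0.9], [0.9, 0.1], [0.8, 0.2], [0.35, 0.65]]\n",
"loss = log_loss(y_true, y_prob)\n",
"loss  # about 0.216"
]
},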
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Brier score\n",
"\n",
"[on Wikipedia](https://en.wikipedia.org/wiki/Brier_score)\n",
"\n",
"Pros\n",
"* Naturally works with multi-class data\n",
"* Intuitively interpretable\n",
"\n",
"Cons\n",
"* `scikit-learn` implementation only works with binary classes"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.metrics import brier_score_loss"
]
},
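{
"cell_type": "markdown",
"metadata": {},
"source": [
"`brier_score_loss` covers the binary case; for multiple classes, the score is just the mean over samples of the squared distance between each predicted probability vector and the one-hot encoding of the truth, so a hypothetical `multiclass_brier` helper is short to write. (Note that the binary `scikit-learn` convention scores only the positive class, so its value is half of this sum-over-classes convention in the two-class case.)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"def multiclass_brier(y_true, y_prob):\n",
"    # hypothetical helper: mean over samples of the squared distance\n",
"    # between the probability vector and the one-hot truth vector\n",
"    y_prob = np.asarray(y_prob)\n",
"    truth = np.zeros_like(y_prob)\n",
"    truth[np.arange(len(y_true)), y_true] = 1.\n",
"    return np.mean(np.sum((y_prob - truth) ** 2, axis=1))\n",
"\n",
"# binary sanity check against scikit-learn\n",
"brier_score_loss([0, 1, 1, 0], [0.1, 0.9, 0.8, 0.3])  # 0.0375"
]
},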
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Precision-Recall Curve (PRC)\n",
"\n",
"[on Wikipedia](https://en.wikipedia.org/wiki/Precision_and_recall)\n",
"\n",
"Pros\n",
"\n",
"Cons"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.metrics import precision_recall_curve"
]
},
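{
"cell_type": "markdown",
"metadata": {},
"source": [
"On the same toy data as the ROC examples (again invented for illustration), `precision_recall_curve` mirrors `roc_curve`; note that `precision` and `recall` carry one more entry than `thresholds`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"y_true = np.array([0, 0, 1, 1])\n",
"y_score = np.array([0.1, 0.4, 0.35, 0.8])\n",
"precision, recall, thresholds = precision_recall_curve(y_true, y_score)\n",
"precision, recall, thresholds"
]
},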
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### PRC Area Under Curve (AUC)\n",
"\n",
"[not on Wikipedia](https://andybeger.com/2015/03/16/precision-recall-curves/)\n",
"\n",
"Pros\n",
"* Better for sparse data than ROC AUC\n",
"\n",
"Cons\n",
"* Doesn't naturally work with multi-class data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.metrics import auc"
]
},
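{
"cell_type": "markdown",
"metadata": {},
"source": [
"`auc` is a generic trapezoidal integrator, so the PRC AUC can be assembled by feeding it the output of `precision_recall_curve` (same toy data):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from sklearn.metrics import precision_recall_curve\n",
"\n",
"y_true = np.array([0, 0, 1, 1])\n",
"y_score = np.array([0.1, 0.4, 0.35, 0.8])\n",
"precision, recall, thresholds = precision_recall_curve(y_true, y_score)\n",
"prc_auc = auc(recall, precision)\n",
"prc_auc"
]
},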
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### PRC Average Precision Score\n",
"\n",
"Pros\n",
"* Less optimistic than PRC AUC\n",
"\n",
"Cons"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.metrics import average_precision_score"
]
},
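{
"cell_type": "markdown",
"metadata": {},
"source": [
"And `average_precision_score` on the same toy data; instead of trapezoids, it computes the step-function sum over thresholds of (R_n - R_{n-1}) * P_n:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"y_true = np.array([0, 0, 1, 1])\n",
"y_score = np.array([0.1, 0.4, 0.35, 0.8])\n",
"ap = average_precision_score(y_true, y_score)\n",
"ap  # about 0.83"
]
},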
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Other modifications of deterministic metrics?"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Impact of converting classifications from deterministic to probabilistic?"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (not)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.14"
}
},
"nbformat": 4,
"nbformat_minor": 2
}