A Python package for calculating precision-recall-gain
In "Precision-Recall-Gain Curves: PR Analysis Done Right" by Flach and Kull (2015), the authors argue that the standard way of measuring the area under the precision-recall curve (the AUPRC metric) is flawed. They show that this metric lacks the useful properties of the area under an ROC curve. In particular, the score of a random model shifts as the class balance changes, straight lines between two points are not valid because the scale is curved, and the AUPRC is hard to interpret.
Let $\pi$ denote the proportion of positive examples. Flach and Kull replace precision and recall with their "gain" counterparts:

$$\text{precG} = \frac{\text{prec} - \pi}{(1 - \pi)\,\text{prec}}, \qquad \text{recG} = \frac{\text{rec} - \pi}{(1 - \pi)\,\text{rec}}$$
Now, random models score zero, lines between points are straight, and the Pareto frontier of models is a convex hull.
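To make the transformation concrete, here is a minimal sketch of the gain mapping in plain Python (the function name is my own, not from the paper or any library):

```python
def gain(value, pi):
    """Map precision or recall to its 'gain' version.

    pi is the proportion of positives in the data. A value equal to
    the random baseline pi maps to 0; a perfect value of 1 maps to 1.
    """
    return (value - pi) / ((1 - pi) * value)

# With 10% positives, a random model (precision = pi = 0.1) gains nothing:
print(gain(0.1, pi=0.1))   # 0.0
# A perfect model maps to 1:
print(gain(1.0, pi=0.1))   # 1.0
```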
Intuition on the origin shift
Let $\pi$ again be the proportion of positives. A classifier that labels examples positive at random has an expected precision of $\pi$, no matter how often it fires. So the performance score of a random baseline depends on the class distribution, which is not good. If you want to measure your "gain" over random chance, you should put that chance point at the origin. That's what subtracting $\pi$ in the numerator accomplishes.
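A quick simulation (my own illustration, not from the paper) shows that a random classifier's precision tracks the class prior, not 0.5:

```python
import random

random.seed(0)
pi = 0.1  # 10% of examples are positive
labels = [1 if random.random() < pi else 0 for _ in range(100_000)]
# A "classifier" that flags examples at random, ignoring the data:
preds = [1 if random.random() < 0.5 else 0 for _ in labels]

tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
precision = tp / (tp + fp)
print(precision)  # close to pi = 0.1, not 0.5
```

Change `pi` and the random baseline's precision moves with it, which is exactly the dependence on class balance that the gain transformation removes.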
Intuition on the stretching
The distances in the raw PR curve are deceptive. Moving from 0.02 to 0.04 precision is a 100% relative lift when the positive class is rare, yet it covers almost no distance on a linear axis; the same 0.02 absolute step from 0.90 to 0.92 is a marginal improvement but looks just as big. Vanilla precision spans $[\pi, 1]$ for any model at least as good as chance; the gain transformation rescales that range to $[0, 1]$.
However, the paper explains that this rescaling needs to be done on a harmonic scale (not a linear scale), which is why the equation is a bit different. In essence, the vanilla AUPRC metric takes the arithmetic mean of the precision scores, which is not appropriate for a quantity that lives on a harmonic scale.
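One way to see the harmonic scaling (my own check, using the gain formula from the paper): precision gain is a linear function of $1/\text{prec}$, so equal steps in gain correspond to equal steps on the reciprocal (harmonic) scale, not the linear one.

```python
pi = 0.2  # proportion of positives

def prec_gain(prec, pi=pi):
    # Gain formula from Flach and Kull (2015).
    return (prec - pi) / ((1 - pi) * prec)

def linear_in_reciprocal(prec, pi=pi):
    # Rescale 1/prec linearly so that prec = pi -> 0 and prec = 1 -> 1.
    return (1 / pi - 1 / prec) / (1 / pi - 1)

# The two expressions agree for every precision value:
for prec in (0.25, 0.5, 0.8, 1.0):
    assert abs(prec_gain(prec) - linear_in_reciprocal(prec)) < 1e-12
```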
Calculating precision-recall-gain in Python
There's an official implementation of PRG in MATLAB, R, and Python here, but the Python implementation was very broken. Someone had opened a pull request to scikit-learn, but unfortunately it looks like it won't get merged. So, I copied their implementation into a stand-alone Python library, precision-recall-gain. You can find it here: https://github.com/crypdick/precision-recall-gain.