revscoring.scoring.statistics¶

Statistics represent the fitness of a revscoring.Model. They can be fit() to scores and labels and then output using format(). Once initialized, a Statistics instance behaves like a dict of statistics values.

Classification¶

Classification statistics can be generated for “Classifiers”: models that produce factors (aka levels) as an output, e.g. True and False, or “A”, “B”, or “C”.

class revscoring.scoring.statistics.Classification(labels, multilabel=False, prediction_key='prediction', decision_key=None, threshold_ndigits=None, population_rates=None, **kwargs)[source]¶

fit(score_labels)[source]¶

Fit to scores and labels.

Parameters:
    score_labels : [( dict, mixed )]
        A collection of score-label pairs generated using `revscoring.Model.score`. Note that fitting is usually done using data withheld during model training.
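To make the shape of `score_labels` concrete, here is a minimal sketch that tallies confusion-matrix counts for one target label from score-label pairs. The layout of the score dict (a `'prediction'` key alongside a `'probability'` mapping) and the helper name `tally` are assumptions made for illustration; this is not revscoring's own counting code.

```python
from collections import Counter

# Hypothetical score-label pairs: (score dict, true label).
score_labels = [
    ({"prediction": True, "probability": {True: 0.9, False: 0.1}}, True),
    ({"prediction": False, "probability": {True: 0.4, False: 0.6}}, True),
    ({"prediction": True, "probability": {True: 0.7, False: 0.3}}, False),
    ({"prediction": False, "probability": {True: 0.2, False: 0.8}}, False),
    ({"prediction": False, "probability": {True: 0.1, False: 0.9}}, False),
]


def tally(score_labels, target, prediction_key="prediction"):
    """Count tp/fp/tn/fn for `target` (illustrative helper, not library API)."""
    counts = Counter()
    for score, label in score_labels:
        predicted = score[prediction_key] == target  # was the target predicted?
        actual = label == target                     # was the target the label?
        correctness = "t" if predicted == actual else "f"
        counts[correctness + ("p" if predicted else "n")] += 1
    return counts


counts = tally(score_labels, target=True)
```

Counts of this kind are what the statistics described further down (precision, recall, and so on) are computed from.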

format_json(path_tree, **kwargs)[source]¶: Formats the path tree as a JSON-able dictionary, rounding values to at most ndigits.

format_str(path_tree, **kwargs)[source]¶: Formats the path tree as a table, rounding values to at most ndigits.

lookup(path)[source]¶

Looks up a specific information value based on either a string pattern or a path.

For example, the pattern “stats.roc_auc.labels.true” is the same as the path ['stats', 'roc_auc', 'labels', True].

Parameters:
    path : str | list
        The location of the information to look up.
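The equivalence between a string pattern and a path can be sketched as follows. This is an illustration only, assuming that a pattern is a dot-separated list of keys and that the tokens `true` and `false` stand for the booleans; the real pattern grammar may support more than this, and the function names are hypothetical.

```python
def pattern_to_path(pattern):
    """Split a dotted pattern into path keys, mapping true/false to booleans."""
    literals = {"true": True, "false": False}
    return [literals.get(token.lower(), token) for token in pattern.split(".")]


def lookup_in_tree(tree, path):
    """Walk a nested dict using either a string pattern or a list of keys."""
    if isinstance(path, str):
        path = pattern_to_path(path)
    node = tree
    for key in path:
        node = node[key]  # raises KeyError if the path does not exist
    return node


# A made-up statistics tree for demonstration.
tree = {"stats": {"roc_auc": {"labels": {True: 0.91, False: 0.91}}}}
by_pattern = lookup_in_tree(tree, "stats.roc_auc.labels.true")
by_path = lookup_in_tree(tree, ["stats", "roc_auc", "labels", True])
```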

class revscoring.scoring.statistics.classification.Counts(labels, score_labels, prediction_key)[source]¶

class revscoring.scoring.statistics.classification.Rates(counts, population_rates=None)[source]¶

class revscoring.scoring.statistics.classification.MicroMacroStats(stats, field)[source]¶

class revscoring.scoring.statistics.classification.ScaledPredictionStatistics(y_preds=None, y_trues=None, counts=None, population_rate=None)[source]¶

accuracy()[source]¶: The proportion of predictions that were right.

accuracy = correct / n

f1()[source]¶: The harmonic mean of precision and recall, which balances the two in a single statistic.

f1 = 2 * (precision * recall) / (precision + recall)

filter_rate()[source]¶: The proportion of observations that are not matched.

filter-rate = 1 - match-rate

fpr()[source]¶: False-positive rate. The proportion of non-target class items that are incorrectly matched.

fpr = false-positives / !target-class

match_rate()[source]¶: The proportion of observations that are matched in prediction.

match-rate = positives / n

precision()[source]¶: The proportion of matched observations that are correctly matched. AKA “positive predictive value”.

precision = true-positives / true-predicions

recall()[source]¶: The proportion of the target class that the classifier matches. AKA “true-positive rate” and “sensitivity”.

recall = true-positives / target-class
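The formulas above can be collected into one small sketch that works from the four cells of a confusion matrix. The function name and the choice to return None when a denominator is zero are assumptions for illustration, not the library's behavior.

```python
def prediction_statistics(tp, fp, tn, fn):
    """Compute the statistics above from confusion-matrix counts."""
    n = tp + fp + tn + fn
    positives = tp + fp      # observations matched in prediction
    target_class = tp + fn   # observations labeled with the target class
    non_target = fp + tn     # observations labeled with any other class

    def ratio(numerator, denominator):
        # Assumption: an undefined ratio is reported as None.
        return numerator / denominator if denominator else None

    match_rate = ratio(positives, n)
    precision = ratio(tp, positives)
    recall = ratio(tp, target_class)
    if precision is None or recall is None or precision + recall == 0:
        f1 = None
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": ratio(tp + tn, n),
        "match_rate": match_rate,
        "filter_rate": None if match_rate is None else 1 - match_rate,
        "precision": precision,
        "recall": recall,
        "fpr": ratio(fp, non_target),
        "f1": f1,
    }


stats = prediction_statistics(tp=40, fp=10, tn=45, fn=5)
```

With these example counts, 50 of 100 observations are matched and 40 of those matches are correct, so match_rate is 0.5 and precision is 0.8.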

class revscoring.scoring.statistics.classification.ScaledThresholdStatistics(y_decisions, y_trues, population_rate=None, threshold_ndigits=None)[source]¶

class revscoring.scoring.statistics.classification.ScaledClassificationMatrix(y_preds=None, y_trues=None, counts=None, population_rate=None)[source]¶

fit(y_preds, y_trues)[source]¶

Parameters:
    y_preds : [ bool ]
        Predictions where True represents a prediction of the target class.
    y_trues : [ bool ]
        Labels where True represents a label matching the target class.

rescale(tp, fp, tn, fn)[source]¶

Re-scale a matrix based on sample counts

Parameters:
    tp : int
        True positives
    fp : int
        False positives
    tn : int
        True negatives
    fn : int
        False negatives
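One plausible reading of re-scaling, given that the class accepts a `population_rate`, is to reweight the sample counts so that the share of target-class observations matches the rate expected in the real population. The sketch below works under that assumption; the standalone function and its weighting scheme are illustrative and may differ from what the library actually does.

```python
def rescale_counts(tp, fp, tn, fn, population_rate=None):
    """Reweight counts so the target class occurs at `population_rate`."""
    if population_rate is None:
        return tp, fp, tn, fn  # nothing to re-scale against

    n = tp + fp + tn + fn
    sample_rate = (tp + fn) / n  # share of target-class labels in the sample
    if sample_rate in (0, 1):
        raise ValueError("both classes must be present in the sample")

    # Target-class rows shrink or grow by one weight, all other rows by another.
    pos_weight = population_rate / sample_rate
    neg_weight = (1 - population_rate) / (1 - sample_rate)
    return tp * pos_weight, fp * neg_weight, tn * neg_weight, fn * pos_weight


# A balanced sample (50% target class) re-scaled to a 10% population rate.
tp, fp, tn, fn = rescale_counts(40, 5, 45, 10, population_rate=0.1)
```

Under this scheme recall is unchanged, because tp and fn share a weight, while precision drops as false positives come to outweigh true positives at the lower population rate.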

class revscoring.scoring.statistics.classification.ThresholdOptimization(maximize, target_stat, cond_stat, greater, cond_value)[source]¶

get_optimal(threshold_statistics)[source]¶: Generates an optimized value by scanning a sequence of ScaledThresholdStatistics for the best threshold that matches the conditional criteria. This function returns the entire ScaledPredictionStatistics mapping at the optimal threshold.

optimize_from(threshold_statistics)[source]¶: Generates an optimized value by scanning a sequence of ScaledThresholdStatistics for the best threshold that matches the conditional criteria. This function returns the value of the optimized target statistic (or None).
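The scan described by these two methods can be sketched with plain Python data. The sketch assumes threshold statistics are available as `(threshold, stats_dict)` pairs and returns the winning pair, not a ScaledPredictionStatistics object. The standalone functions take the constructor's parameters as arguments. All of this is an illustrative stand-in, not the library's implementation.

```python
def get_optimal(threshold_statistics, maximize, target_stat, cond_stat,
                greater, cond_value):
    """Return the (threshold, stats) pair with the best target statistic
    among thresholds meeting the condition, or None if none qualifies."""
    best = None
    for threshold, stats in threshold_statistics:
        target, cond = stats.get(target_stat), stats.get(cond_stat)
        if target is None or cond is None:
            continue  # undefined statistics cannot be compared
        satisfied = cond >= cond_value if greater else cond <= cond_value
        if not satisfied:
            continue
        if best is None:
            best = (threshold, stats)
            continue
        best_target = best[1][target_stat]
        improved = target > best_target if maximize else target < best_target
        if improved:
            best = (threshold, stats)
    return best


def optimize_from(threshold_statistics, maximize, target_stat, cond_stat,
                  greater, cond_value):
    """Return only the optimized target statistic value (or None)."""
    best = get_optimal(threshold_statistics, maximize, target_stat,
                       cond_stat, greater, cond_value)
    return None if best is None else best[1][target_stat]


# Made-up statistics at three thresholds.
rows = [
    (0.1, {"precision": 0.60, "recall": 0.99}),
    (0.5, {"precision": 0.92, "recall": 0.80}),
    (0.9, {"precision": 0.97, "recall": 0.40}),
]
# "maximum recall @ precision >= 0.9"
best = get_optimal(rows, True, "recall", "precision", True, 0.9)
value = optimize_from(rows, True, "recall", "precision", True, 0.9)
```

Here the threshold 0.1 has the highest recall but fails the precision condition, so the scan settles on 0.5.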

classmethod parse(pattern)[source]¶

Parse a formatted string representing a threshold optimization. E.g. ‘maximum recall @ precision >= 0.9’ or ‘minimum match_rate @ recall >= 0.9’.

Parameters:
    pattern : str
        The optimization pattern to parse
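As an illustration of how such a pattern breaks down into the constructor's arguments (`maximize`, `target_stat`, `cond_stat`, `greater`, `cond_value`), here is a minimal regex-based sketch. The grammar it accepts is inferred from the two examples above, and the named tuple and function name are hypothetical stand-ins for the class itself.

```python
import re
from collections import namedtuple

# Stand-in for the real class; fields mirror the constructor's parameters.
Optimization = namedtuple(
    "Optimization",
    ["maximize", "target_stat", "cond_stat", "greater", "cond_value"])

# Assumed grammar: "<maximum|minimum> <stat> @ <stat> <>=|<=> <number>"
PATTERN_RE = re.compile(
    r"^\s*(maximum|minimum)\s+(\w+)\s*@\s*(\w+)\s*(>=|<=)\s*"
    r"([-+]?\d*\.?\d+)\s*$")


def parse_optimization(pattern):
    """Parse an optimization pattern into its five components."""
    match = PATTERN_RE.match(pattern)
    if match is None:
        raise ValueError("Cannot parse optimization: {0!r}".format(pattern))
    direction, target_stat, cond_stat, operator, value = match.groups()
    return Optimization(
        maximize=(direction == "maximum"),
        target_stat=target_stat,
        cond_stat=cond_stat,
        greater=(operator == ">="),
        cond_value=float(value))


first = parse_optimization("maximum recall @ precision >= 0.9")
second = parse_optimization("minimum match_rate @ recall >= 0.9")
```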

Abstract base class¶

class revscoring.scoring.Statistics(*args, **kwargs)[source]¶

fit(score_labels)[source]¶

Fit to scores and labels.

Parameters:
    score_labels : [( dict, mixed )]
        A collection of score-label pairs generated using `revscoring.Model.score`. Note that fitting is usually done using data withheld during model training.
