revscoring.scoring.models

This module contains a collection of models that implement a simple function: score(). Currently, all models are subclasses of revscoring.scoring.models.Learned, which means that they also implement train() and cross_validate().
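As a rough illustration of that interface, the sketch below defines a toy stand-in with train() and score() methods. MajorityModel and the feature names are hypothetical and are not part of revscoring; the return shapes (a dict with seconds_elapsed from train(), a dict with prediction from score()) follow the descriptions further down this page.

```python
import time
from collections import Counter


class MajorityModel:
    """Hypothetical stand-in, not part of revscoring: always predicts the
    most common label seen during training."""

    def __init__(self, features, labels):
        self.features = features  # ordered feature names
        self.labels = labels      # labels the model may predict
        self.majority = None

    def train(self, values_labels):
        # values_labels is an iterable of (feature_values, label) pairs.
        start = time.time()
        counts = Counter(label for _, label in values_labels)
        self.majority = counts.most_common(1)[0][0]
        return {"seconds_elapsed": time.time() - start}

    def score(self, feature_values):
        # One value is expected per feature, in constructor order.
        if len(feature_values) != len(self.features):
            raise ValueError("expected one value per feature")
        return {"prediction": self.majority}


model = MajorityModel(["chars_added", "is_anon"], [True, False])
stats = model.train([([10, 0], False), ([500, 1], True), ([3, 0], False)])
result = model.score([42, 1])
```

The real models delegate fitting and prediction to an sklearn estimator; only the calling pattern is meant to carry over.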

Gradient Boosting

A collection of Gradient Boosting type classifier models.

class revscoring.scoring.models.GradientBoosting(features, labels, multilabel=False, statistics=None, population_rates=None, threshold_ndigits=None, **kwargs)

Implements a Gradient Boosting model.

Estimator: alias of sklearn.ensemble.gradient_boosting.GradientBoostingClassifier

Naive Bayes

A collection of Naive Bayes type classifier models.

class revscoring.scoring.models.GaussianNB(features, labels, multilabel=False, statistics=None, population_rates=None, threshold_ndigits=None, **kwargs)

Implements a Gaussian Naive Bayes model.

Estimator: alias of sklearn.naive_bayes.GaussianNB

class revscoring.scoring.models.MultinomialNB(features, labels, multilabel=False, statistics=None, population_rates=None, threshold_ndigits=None, **kwargs)

Implements a Multinomial Naive Bayes model.

Estimator: alias of sklearn.naive_bayes.MultinomialNB

class revscoring.scoring.models.BernoulliNB(features, labels, multilabel=False, statistics=None, population_rates=None, threshold_ndigits=None, **kwargs)

Implements a Bernoulli Naive Bayes model.

Estimator: alias of sklearn.naive_bayes.BernoulliNB

Linear Regression

A collection of linear classifier models.

class revscoring.scoring.models.LogisticRegression(*args, label_weights=None, **kwargs)

Implements a Logistic Regression model.

Estimator: alias of sklearn.linear_model.logistic.LogisticRegression

Support Vector

A collection of Support Vector Machine type classifier models.

class revscoring.scoring.models.LinearSVC(features, labels, multilabel=False, statistics=None, population_rates=None, threshold_ndigits=None, **kwargs)

Implements a Support Vector Classifier model with a Linear kernel.

class revscoring.scoring.models.RBFSVC(features, labels, multilabel=False, statistics=None, population_rates=None, threshold_ndigits=None, **kwargs)

Implements a Support Vector Classifier model with an RBF kernel.

class revscoring.scoring.models.SVC(features, labels, multilabel=False, statistics=None, population_rates=None, threshold_ndigits=None, **kwargs)

Implements a Support Vector Classifier model.

Estimator: alias of sklearn.svm.classes.SVC

Random Forest

A collection of Random Forest type classifier models.

class revscoring.scoring.models.RandomForest(features, labels, multilabel=False, statistics=None, population_rates=None, threshold_ndigits=None, **kwargs)

Implements a Random Forest model.

Estimator: alias of sklearn.ensemble.forest.RandomForestClassifier

Abstract classes

All scoring models are implementations of revscoring.Model.

class revscoring.scoring.models.Learned(*args, scale=False, center=False, **kwargs)

cross_validate(values_labels, folds=10, processes=1)

Trains and tests the model against folds of labeled data.

Parameters:

values_labels : [( <feature_values>, <label> )]: an iterable of labeled data, where <feature_values> is an ordered collection of predictive values that correspond to the Feature objects provided to the constructor

folds : int: the number of folds to split the labeled data into (10 by default)

processes : int: when set to 1, cross-validation will run in the parent thread. When set to 2 or greater, a multiprocessing.Pool will be created.
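To make the idea of folds concrete, here is a minimal sketch of splitting labeled data into train/test pairs. The helper k_fold_splits is hypothetical, and the strided assignment of observations to folds is an assumption made for illustration; how cross_validate() actually partitions the data (for example, whether it shuffles first) is not stated here.

```python
def k_fold_splits(values_labels, folds=10):
    """Yield (train, test) pairs; each observation is tested exactly once."""
    data = list(values_labels)
    for i in range(folds):
        # Every folds-th observation, starting at offset i, is held out.
        test = data[i::folds]
        train = [pair for j, pair in enumerate(data) if j % folds != i]
        yield train, test


# Hypothetical labeled data: ([feature values], label)
values_labels = [([n, n % 2], n % 2 == 0) for n in range(20)]
splits = list(k_fold_splits(values_labels, folds=5))
```

Each model would be trained on the train portion and evaluated on the held-out test portion, once per fold.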

fit_scaler_and_transform(fv_vectors)

Fits the internal scaler to labeled data.

Parameters:

fv_vectors : iterable (( <feature_values>, <label> )): an iterable of labeled data, where <feature_values> is an ordered collection of predictive values that correspond to the Feature objects provided to the constructor

Returns:

A dictionary of model statistics.
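The scale and center constructor arguments suggest that feature values can be shifted and rescaled before fitting. This page does not say which scaler is used, so the sketch below assumes plain mean and standard deviation standardization purely for illustration; fit_scaler and transform are hypothetical helpers, not revscoring functions.

```python
from statistics import mean, pstdev


def fit_scaler(vectors):
    """Return one (mean, standard deviation) pair per feature column."""
    columns = list(zip(*vectors))
    return [(mean(col), pstdev(col)) for col in columns]


def transform(vectors, params, center=True, scale=True):
    """Apply centering and/or scaling column by column."""
    out = []
    for vec in vectors:
        row = []
        for value, (mu, sigma) in zip(vec, params):
            if center:
                value = value - mu
            if scale and sigma > 0:  # leave constant columns unscaled
                value = value / sigma
            row.append(value)
        out.append(row)
    return out


vectors = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
params = fit_scaler(vectors)
scaled = transform(vectors, params)
```

After the transform, both columns share the same range even though the raw values differ by a factor of ten, which is the usual reason for scaling before fitting distance-based or linear models.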

train(values_labels)

Fits the model using labeled data by learning its shape.

Parameters:

values_labels : [( <feature_values>, <label> )]: an iterable of labeled data, where <feature_values> is an ordered collection of predictive values that correspond to the revscoring.Feature objects provided to the constructor
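Because the values must line up with the features given to the constructor, it can help to build values_labels from named records. The following is a small sketch under that assumption; the feature names, the "damaging" label key, and the to_values_labels helper are all hypothetical.

```python
# Hypothetical feature names, in the order given to the model's constructor.
features = ["chars_added", "is_anon", "badwords"]

# Hypothetical labeled observations keyed by feature name.
observations = [
    {"chars_added": 120, "is_anon": False, "badwords": 0, "damaging": False},
    {"chars_added": 8, "is_anon": True, "badwords": 3, "damaging": True},
]


def to_values_labels(observations, features, label_key):
    """Build [(feature_values, label)] with values in constructor order."""
    return [
        ([obs[name] for name in features], obs[label_key])
        for obs in observations
    ]


values_labels = to_values_labels(observations, features, "damaging")
```

The resulting list has the [( <feature_values>, <label> )] shape that train() and cross_validate() are documented to accept.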

class revscoring.scoring.models.Classifier(features, labels, multilabel=False, population_rates=None, **kwargs)

SciKit Learn-based models

Implements the basics of all sklearn-based models.

class revscoring.scoring.models.sklearn.Classifier(features, labels, multilabel=False, version=None, label_weights=None, population_rates=None, scale=False, center=False, statistics=None, estimator=None, **estimator_params)

score(feature_values)

Generates a score for a single revision based on a set of extracted feature_values.

Parameters:

feature_values : collection(mixed): an ordered collection of values that correspond to the Feature objects provided to the constructor

Returns:

A dict with the fields:

prediction – The most likely class

score_many(feature_values)

Generates scores for a batch of revisions based on a set of extracted feature_values.

Parameters:

feature_values : collection(mixed): an ordered collection of values that correspond to the Feature objects provided to the constructor

Returns:

A dict with the fields:

prediction – The most likely class

train(values_labels, **kwargs)

Fits the internal model to the provided values_labels.

Returns:

A dictionary with the fields:

seconds_elapsed – Time in seconds spent fitting the model

class revscoring.scoring.models.sklearn.ProbabilityClassifier(features, labels, multilabel=False, statistics=None, population_rates=None, threshold_ndigits=None, **kwargs)

score(feature_values)

Generates a score for a single revision based on a set of extracted feature_values.

Parameters:

feature_values : collection(mixed): an ordered collection of values that correspond to the Feature objects provided to the constructor

Returns:

A dict with the fields:

prediction – The most likely class

probability – A mapping of probabilities for input classes corresponding to the classes the classifier was trained on. Generating this probability is slower than a simple prediction.
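The relationship between the two fields can be sketched as follows: given one probability per trained class, the prediction is the class with the highest probability. The helper build_score_doc and the example numbers are hypothetical, and this is only an assumption about how the fields relate, not a description of the library's internals.

```python
def build_score_doc(labels, probabilities):
    """Pair each class with its probability and pick the most likely one."""
    probability = dict(zip(labels, probabilities))
    prediction = max(probability, key=probability.get)
    return {"prediction": prediction, "probability": probability}


# Hypothetical binary classifier output for one revision.
doc = build_score_doc([True, False], [0.12, 0.88])
```

A caller who only needs the class can read doc["prediction"], while thresholding or ranking would use doc["probability"].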

score_many(feature_values)

Generates scores for a batch of revisions based on a set of extracted feature_values.

Parameters:

feature_values : array(collection(mixed)): an array of ordered collections of values, each corresponding to the Feature objects provided to the constructor

Returns:

A dict with the fields:

prediction – The most likely class

probability – A mapping of probabilities for input classes corresponding to the classes the classifier was trained on. Generating this probability is slower than a simple prediction.
