revscoring.scoring.models
This module contains a collection of models that implement a simple function: score(). Currently, all models are subclasses of revscoring.scoring.models.Learned, which means that they also implement train() and cross_validate().
Gradient Boosting
A collection of Gradient Boosting type classifier models.
Naive Bayes
A collection of Naive Bayes type classifier models.
class revscoring.scoring.models.GaussianNB(features, labels, multilabel=False, statistics=None, population_rates=None, threshold_ndigits=None, **kwargs)
Implements a Gaussian Naive Bayes model.
Estimator: alias of sklearn.naive_bayes.GaussianNB
class revscoring.scoring.models.MultinomialNB(features, labels, multilabel=False, statistics=None, population_rates=None, threshold_ndigits=None, **kwargs)
Implements a Multinomial Naive Bayes model.
Estimator: alias of sklearn.naive_bayes.MultinomialNB
class revscoring.scoring.models.BernoulliNB(features, labels, multilabel=False, statistics=None, population_rates=None, threshold_ndigits=None, **kwargs)
Implements a Bernoulli Naive Bayes model.
Estimator: alias of sklearn.naive_bayes.BernoulliNB
Linear Regression
A collection of linear classifier models.
Support Vector
A collection of Support Vector Machine type classifier models.
class revscoring.scoring.models.LinearSVC(features, labels, multilabel=False, statistics=None, population_rates=None, threshold_ndigits=None, **kwargs)
Implements a Support Vector Classifier model with a linear kernel.
Random Forest
A collection of Random Forest type classifier models.
Abstract classes
All scoring models are implementations of revscoring.Model.
class revscoring.scoring.models.Learned(*args, scale=False, center=False, **kwargs)
cross_validate(values_labels, folds=10, processes=1)
Trains and tests the model against folds of labeled data.
Parameters: - values_labels : [( <feature_values>, <label> )]
an iterable of labeled data, where <feature_values> is an ordered collection of predictive values that correspond to the Features provided to the constructor
- processes : int
When set to 1, cross-validation will run in the parent process. When set to 2 or greater, a multiprocessing.Pool will be created.
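The behaviour described above (run in the parent when the worker count is 1, otherwise create a multiprocessing.Pool) can be sketched in plain Python. This is only an illustration under stated assumptions, not revscoring's implementation: the fold-splitting scheme, the toy majority-label "model", and the helper names k_fold_indices, majority_label and evaluate_fold are all hypothetical. Only the argument names values_labels, folds and processes mirror the signature above.

```python
from multiprocessing import Pool


def k_fold_indices(n_items, folds):
    """Yield (train_idx, test_idx) pairs; every item is tested exactly once."""
    for k in range(folds):
        test_idx = [i for i in range(n_items) if i % folds == k]
        train_idx = [i for i in range(n_items) if i % folds != k]
        yield train_idx, test_idx


def majority_label(labels):
    """Toy stand-in for a model: predict the most common training label."""
    return max(sorted(set(labels)), key=labels.count)


def evaluate_fold(job):
    """Fit on one fold's training rows and count correct test predictions."""
    values_labels, train_idx, test_idx = job
    predicted = majority_label([values_labels[i][1] for i in train_idx])
    return sum(values_labels[i][1] == predicted for i in test_idx)


def cross_validate(values_labels, folds=10, processes=1):
    """Return the overall accuracy across all folds (hypothetical sketch)."""
    values_labels = list(values_labels)
    jobs = [(values_labels, train, test)
            for train, test in k_fold_indices(len(values_labels), folds)]
    if processes == 1:
        # No pool: every fold is evaluated in the calling process.
        correct = list(map(evaluate_fold, jobs))
    else:
        # Two or more: folds are distributed across worker processes.
        with Pool(processes=processes) as pool:
            correct = pool.map(evaluate_fold, jobs)
    return sum(correct) / len(values_labels)


# Seven positive and three negative observations (made-up data).
data = [([1.0, 0.0], True)] * 7 + [([0.0, 1.0], False)] * 3
accuracy = cross_validate(data, folds=5, processes=1)
```

With processes=1 no pool is created, which keeps debugging simple; a larger value trades process start-up cost for parallel fold evaluation.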
fit_scaler_and_transform(fv_vectors)
Fits the internal scaler to labeled data.
Parameters: - fv_vectors : iterable (( <feature_values>, <label> ))
an iterable of labeled data, where <feature_values> is an ordered collection of predictive values that correspond to the Features provided to the constructor
Returns: A dictionary of model statistics.
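As a rough illustration of what fitting a scaler and transforming feature vectors can involve, the sketch below assumes that center means subtracting each column's mean and scale means dividing by each column's standard deviation. The source does not spell out either behaviour, and the helpers fit_scaler and transform are hypothetical names that operate on bare feature vectors rather than on labeled pairs.

```python
from statistics import mean, pstdev


def fit_scaler(vectors, center=False, scale=False):
    """Learn per-column offsets and spreads from a list of feature vectors."""
    columns = list(zip(*vectors))
    offsets = [mean(col) if center else 0.0 for col in columns]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    spreads = [(pstdev(col) or 1.0) if scale else 1.0 for col in columns]
    return offsets, spreads


def transform(vectors, offsets, spreads):
    """Apply the learned offsets and spreads to each feature vector."""
    return [[(value - offset) / spread
             for value, offset, spread in zip(row, offsets, spreads)]
            for row in vectors]


# Made-up feature vectors: the second column is constant.
vectors = [[1.0, 10.0], [3.0, 10.0]]
offsets, spreads = fit_scaler(vectors, center=True, scale=True)
scaled = transform(vectors, offsets, spreads)
```

Fitting and transforming are kept as separate steps here so the same offsets and spreads can later be applied to the feature values of a revision being scored.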
train(values_labels)
Fits the model using labeled data by learning its shape.
Parameters: - values_labels : [( <feature_values>, <label> )]
an iterable of labeled data, where <feature_values> is an ordered collection of predictive values that correspond to the revscoring.Feature objects provided to the constructor
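To make the expected shape of values_labels concrete, here is a small sketch that orders each observation's values to match a fixed feature order and pairs them with a label. The feature names, the observations, and the helper to_values_labels are hypothetical and exist only for illustration; they are not part of revscoring.

```python
# Hypothetical feature order, standing in for the Features given to the constructor.
feature_names = ["chars_added", "is_anon"]

# Hypothetical observations: (named feature values, label) pairs.
observations = [
    ({"chars_added": 120, "is_anon": False}, False),
    ({"is_anon": True, "chars_added": 3}, True),
]


def to_values_labels(observations, feature_names):
    """Build [(feature_values, label)] with values ordered like feature_names."""
    return [([values[name] for name in feature_names], label)
            for values, label in observations]


values_labels = to_values_labels(observations, feature_names)
```

The ordering matters because the values are positional: the first value in every row must belong to the first feature, the second to the second, and so on.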
SciKit Learn-based models
Implements the basics of all sklearn based models.
class revscoring.scoring.models.sklearn.Classifier(features, labels, multilabel=False, version=None, label_weights=None, population_rates=None, scale=False, center=False, statistics=None, estimator=None, **estimator_params)
score(feature_values)
Generates a score for a single revision based on a set of extracted feature_values.
Parameters: - feature_values : collection(mixed)
an ordered collection of values that correspond to the Features provided to the constructor
Returns: A dict with the fields:
- prediction – The most likely class
score_many(feature_values)
Generates scores for multiple revisions based on sets of extracted feature_values.
Parameters: - feature_values : array(collection(mixed))
one ordered collection of values per revision, each corresponding to the Features provided to the constructor
Returns: A dict with the fields:
- prediction – The most likely class
class revscoring.scoring.models.sklearn.ProbabilityClassifier(features, labels, multilabel=False, statistics=None, population_rates=None, threshold_ndigits=None, **kwargs)
score(feature_values)
Generates a score for a single revision based on a set of extracted feature_values.
Parameters: - feature_values : collection(mixed)
an ordered collection of values that correspond to the Features provided to the constructor
Returns: A dict with the fields:
- prediction – The most likely class
- probability – A mapping of probabilities for input classes corresponding to the classes the classifier was trained on. Generating this probability is slower than a simple prediction.
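The returned dict described above can be pictured with a short sketch. The field names prediction and probability come from this page; the helper build_score and the rule of taking the highest-probability class as the prediction are assumptions made for illustration, not the library's code.

```python
def build_score(labels, probabilities):
    """Assemble a score dict from class labels and their probabilities."""
    # Map each class the classifier was trained on to its probability.
    probability = dict(zip(labels, probabilities))
    # Assumption: the most likely class is the one with the highest probability.
    prediction = max(probability, key=probability.get)
    return {"prediction": prediction, "probability": probability}


# Made-up probabilities for a binary classifier with labels True and False.
score = build_score([True, False], [0.2, 0.8])
```

Callers that only need the predicted class can read score["prediction"] and ignore the probability mapping.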
score_many(feature_values)
Generates scores for multiple revisions based on sets of extracted feature_values.
Parameters: - feature_values : array(collection(mixed))
one ordered collection of values per revision, each corresponding to the Features provided to the constructor
Returns: A dict with the fields:
- prediction – The most likely class
- probability – A mapping of probabilities for input classes corresponding to the classes the classifier was trained on. Generating this probability is slower than a simple prediction.