revscoring.features

This module implements a set of revscoring.Feature instances for use in scoring revisions. Lists of revscoring.Feature can be passed to revscoring.dependencies.solve() or, more commonly, to a revscoring.Extractor to obtain simple numerical or boolean values for use in modeling revision scores. The provided features are split conceptually into the following modules:

Feature collections

revision_oriented
Basic features of revisions. E.g. revision.user.text_matches(r'.*Bot')
bytes
Features of the number of bytes of content, byte length of characters, etc.
temporal
Features of the time between events of interest. E.g. revision.user.last_revision.seconds_since
wikibase
Features of wikibase items and changes made to them. E.g. revision.diff.property_changed('P31')
wikitext
Features of wikitext content and differences between revisions. E.g. revision.diff.uppercase_words_added

Functions

revscoring.features.trim(features, context=None)

Trims a feature set down to its bare Feature instances by removing every Modifier and Constant.

Parameters:
features : list ( revscoring.Feature )

A feature list to trim

context : dict | set

A context to apply while trimming
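
The idea behind trimming can be sketched in plain Python. The classes ToyFeature, ToyModifier and ToyConstant and the function toy_trim below are hypothetical stand-ins written for this illustration, not the revscoring implementation. The sketch assumes that a modifier is replaced by the bare features it depends on and that constants are dropped; it ignores the context parameter and, as a choice of the sketch, preserves first-seen order.

```python
class ToyFeature:
    """Hypothetical stand-in for a bare feature; not the revscoring class."""

    def __init__(self, name, depends_on=()):
        self.name = name
        self.depends_on = list(depends_on)


class ToyModifier(ToyFeature):
    """Hypothetical stand-in for a modifier wrapping other features."""


class ToyConstant(ToyFeature):
    """Hypothetical stand-in for a constant value."""


def toy_trim(features):
    """Yield each bare feature reachable from `features` exactly once."""
    seen = set()

    def walk(feature):
        if isinstance(feature, ToyConstant):
            return  # constants are dropped
        if isinstance(feature, ToyModifier):
            # A modifier is replaced by whatever it depends on.
            for dep in feature.depends_on:
                yield from walk(dep)
        elif feature.name not in seen:
            seen.add(feature.name)
            yield feature

    for feature in features:
        yield from walk(feature)


chars_added = ToyFeature("chars_added")
words_added = ToyFeature("words_added")
one = ToyConstant("1")

# chars_added / max(words_added, 1) and log(chars_added), as modifier trees
ratio = ToyModifier("div", [chars_added, ToyModifier("max", [words_added, one])])
log_chars = ToyModifier("log", [chars_added])

trimmed = [f.name for f in toy_trim([ratio, log_chars, words_added])]
print(trimmed)  # ['chars_added', 'words_added']
```

Only the two bare features survive, even though chars_added is reachable through two different modifiers.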

Meta-features

Meta-features are classes that extend Feature and implement common operations on a Datasource, such as sum and item_in_set. See revscoring.features.meta for the full list.
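
As a rough illustration of the pattern, the sketch below builds a sum-style and a membership-style feature on top of a datasource. ToyDatasource, ToyFeature, toy_sum and toy_item_in_set are hypothetical names invented for this example; the signatures of the real meta-features in revscoring.features.meta may differ.

```python
class ToyDatasource:
    """Hypothetical datasource: a named value that is not itself a feature."""

    def __init__(self, name, process):
        self.name = name
        self.process = process


class ToyFeature:
    """Hypothetical feature: a named process with a declared return type."""

    def __init__(self, name, process, returns):
        self.name = name
        self.process = process
        self.returns = returns


def toy_sum(datasource):
    """Build a numeric feature that sums the items of a datasource."""
    return ToyFeature(f"sum({datasource.name})",
                      lambda: float(sum(datasource.process())),
                      returns=float)


def toy_item_in_set(item, datasource):
    """Build a boolean feature that tests membership in a datasource."""
    return ToyFeature(f"{item!r} in {datasource.name}",
                      lambda: item in set(datasource.process()),
                      returns=bool)


# Assumed example data; real datasources are extracted from revisions.
token_lengths = ToyDatasource("token_lengths", lambda: [5, 5, 3])
user_groups = ToyDatasource("user_groups", lambda: ["bot", "sysop"])

total_length = toy_sum(token_lengths)
is_bot = toy_item_in_set("bot", user_groups)

print(total_length.name, total_length.process())  # sum(token_lengths) 13.0
print(is_bot.name, is_bot.process())              # 'bot' in user_groups True
```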

Modifiers

Modifiers are functions that can be applied to a revscoring.Feature to modify its value, e.g. log, max and add. See revscoring.features.modifiers for the full list.
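
The following sketch shows one way such a modifier could work: it returns a new feature that depends on the original and transforms its value. ToyFeature, toy_log and toy_add are hypothetical stand-ins for this example and are not the library's log and add modifiers.

```python
import math


class ToyFeature:
    """Hypothetical stand-in for a feature; not the revscoring class."""

    def __init__(self, name, process, depends_on=()):
        self.name = name
        self.process = process
        self.depends_on = list(depends_on)

    def compute(self):
        # Compute dependencies first, then pass them to process in order.
        args = [dep.compute() for dep in self.depends_on]
        return self.process(*args)


def toy_log(feature):
    """Wrap a feature so that its value is replaced by its natural log."""
    return ToyFeature(f"log({feature.name})", math.log, depends_on=[feature])


def toy_add(left, right):
    """Wrap two features so that their values are summed."""
    return ToyFeature(f"({left.name} + {right.name})",
                      lambda a, b: a + b, depends_on=[left, right])


bytes_added = ToyFeature("bytes_added", lambda: 99)  # assumed raw value
one = ToyFeature("1", lambda: 1)

# Adding one before taking the log avoids log(0) when nothing was added.
log_bytes = toy_log(toy_add(bytes_added, one))

print(log_bytes.name)  # log((bytes_added + 1))
print(log_bytes.compute())
```

Because each modifier returns another feature, modifiers can be nested freely, and the generated name records how the value was derived.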

Base classes

class revscoring.Feature(name, process=None, *, returns=None, depends_on=None)

Represents a predictive feature.

Parameters:
name : str

The name of the feature

process : func

A function that will generate a feature value

returns : type

A type to compare the return of this function to.

depends_on : list ( hashable )

An ordered list of dependencies that correspond to the *args of process
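
To show how these parameters relate, here is a minimal, self-contained sketch. ToyFeature and toy_solve are hypothetical stand-ins written for this example, not the revscoring classes; the library's own solver also supports a context, which is omitted here. The sketch assumes that each dependency is resolved first, that the results are passed to process as positional arguments in the listed order, and that the returned value is checked against the declared type.

```python
class ToyFeature:
    """Hypothetical stand-in: a named, typed function of its dependencies."""

    def __init__(self, name, process=None, *, returns=None, depends_on=None):
        self.name = name
        self.process = process
        self.returns = returns
        self.depends_on = list(depends_on or [])

    def validate(self, value):
        # Compare the produced value against the declared return type.
        if self.returns is not None and not isinstance(value, self.returns):
            raise ValueError(
                f"{self.name} expected {self.returns.__name__}, "
                f"got {type(value).__name__}")
        return value


def toy_solve(feature, cache=None):
    """Resolve dependencies depth-first, then call process(*args) in order."""
    cache = {} if cache is None else cache
    if feature.name not in cache:
        args = [toy_solve(dep, cache) for dep in feature.depends_on]
        cache[feature.name] = feature.validate(feature.process(*args))
    return cache[feature.name]


# Assumed example input: a fixed text and features derived from it.
text = ToyFeature("text", lambda: "Hello WORLD", returns=str)
chars = ToyFeature("chars", len, returns=int, depends_on=[text])
upper = ToyFeature("uppercase_chars",
                   lambda t: sum(c.isupper() for c in t),
                   returns=int, depends_on=[text])
proportion = ToyFeature("uppercase_proportion",
                        lambda u, c: u / max(c, 1),
                        returns=float, depends_on=[upper, chars])

values = [toy_solve(f) for f in (chars, upper, proportion)]
print(values[:2])  # [11, 6]
```

The order of depends_on matters: proportion lists upper before chars, so its process receives the uppercase count as the first argument and the total length as the second.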

class revscoring.features.Modifier(name, process=None, *, returns=None, depends_on=None)

Represents a modification of one or more predictive features.

Parameters:
name : str

The name of the feature

process : func

A function that will generate a feature value

returns : type

A type to compare the return of this function to.

depends_on : list ( hashable )

An ordered list of dependencies that correspond to the *args of process

class revscoring.features.Constant(value, name=None)

A special sub-type of revscoring.Feature that returns a constant value.

Parameters:
value : mixed

Any type of potential feature value

name : str

A name to give the feature