revscoring.datasources.revision_oriented
Implements a set of datasources oriented around a single revision. These datasources are useful for extracting features of edit and article quality.
revscoring.datasources.revision_oriented.revision = {revision}
    Represents the base revision of interest. Implements this structure:
Supporting classes
class revscoring.datasources.revision_oriented.Revision(name, include_parent=True, include_user=True, include_user_info=True, include_user_last_revision=False, include_page=True, include_page_creation=False, include_page_suggested=False, include_content=False)
    Represents a revision.

    id = None
        int : Revision ID

    timestamp_str = None
        str : Timestamp the revision was saved in ISO format

    timestamp = None
        mwtypes.Timestamp : Timestamp the revision was saved

    comment = None
        str : The comment saved with the revision

    byte_len = None
        int : The length of the revision content in bytes

    minor = None
        bool : Was the revision flagged as minor?

    content_model = None
        str : Describes the format of revision content

    text = None
        str : The decoded (Unicode) text of the revision content
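The relationship between some of these attributes can be illustrated with a minimal standard-library sketch. It does not use revscoring or mwtypes: the `raw` dictionary, its values, and both helper functions are hypothetical, and the sketch assumes a MediaWiki-style `YYYY-MM-DDTHH:MM:SSZ` timestamp and UTF-8 encoded content.

```python
from datetime import datetime, timezone

# Hypothetical raw values, shaped like the attributes listed above.
raw = {
    "id": 123456,
    "timestamp_str": "2015-06-01T12:34:56Z",
    "comment": "fix typo",
    "minor": True,
    "content_model": "wikitext",
    "text": "Café au lait",
}


def parse_iso_timestamp(timestamp_str):
    """Parse a 'YYYY-MM-DDTHH:MM:SSZ' string into an aware UTC datetime.

    A stdlib stand-in for the parsed timestamp, not mwtypes.Timestamp itself.
    """
    parsed = datetime.strptime(timestamp_str, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def utf8_byte_len(text):
    """Length of the text in bytes, assuming UTF-8 encoding."""
    return len(text.encode("utf-8"))


timestamp = parse_iso_timestamp(raw["timestamp_str"])
byte_len = utf8_byte_len(raw["text"])  # counts encoded bytes
char_len = len(raw["text"])            # counts Unicode code points
```

Under these assumptions, the byte length and the character count differ whenever the text contains non-ASCII characters, which is why a byte-oriented length and the decoded text are worth treating as separate values.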
class revscoring.datasources.revision_oriented.Diff(name)
    Represents the difference between two sequential revisions.
class revscoring.datasources.revision_oriented.Page(name, include_creation=False, include_suggested=False)
    Represents a revision’s page.

    id = None
        int : The page’s ID

    title = None
        str : The page’s title (namespace stripped)
class revscoring.datasources.revision_oriented.Namespace(name)
    Represents a page’s namespace.

    id = None
        int : The namespace’s ID

    name = None
        str : The name of the namespace
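To make "namespace stripped" concrete, here is a small sketch of how a full page title might be separated into a namespace ID, a namespace name, and a bare title. The `NAMESPACES` table and the `split_title` helper are hypothetical and are not part of revscoring; the table covers only a few of the standard MediaWiki namespace IDs.

```python
# Hypothetical subset of namespaces (standard MediaWiki IDs).
NAMESPACES = {"": 0, "Talk": 1, "User": 2, "User talk": 3}


def split_title(full_title, namespaces=NAMESPACES):
    """Return (namespace_id, namespace_name, title_without_namespace).

    A prefix before the first colon counts as a namespace only if it is
    a known namespace name; otherwise the page is treated as being in
    the main namespace (ID 0, empty name).
    """
    prefix, sep, rest = full_title.partition(":")
    if sep and prefix in namespaces:
        return namespaces[prefix], prefix, rest
    return 0, "", full_title


talk_page = split_title("Talk:Anarchism")
article = split_title("Star Wars: A New Hope")
```

Checking the prefix against a known table, rather than splitting on every colon, keeps article titles that happen to contain a colon intact.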
class revscoring.datasources.revision_oriented.User(name, include_info=True, include_last_revision=False)
    Represents a user’s id and name/ip.

    id = None
        int : The id of the user who saved the edit. 0 for IPs.

    text = None
        str : The user’s name or IP address
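Since the user ID is 0 for IP editors, a consumer of these values can distinguish anonymous from registered editors. The sketch below uses only the standard library; `is_anon` and `looks_like_ip` are hypothetical helper names, not revscoring functions.

```python
import ipaddress


def is_anon(user_id):
    """True when the ID marks an IP (unregistered) editor, i.e. ID 0."""
    return user_id == 0


def looks_like_ip(user_text):
    """True when the user text parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(user_text)
        return True
    except ValueError:
        return False


# Hypothetical (id, text) pairs for illustration.
editors = [(0, "192.0.2.17"), (0, "2001:db8::1"), (42, "ExampleUser")]
flags = [(is_anon(uid), looks_like_ip(text)) for uid, text in editors]
```

The two checks are kept separate because the ID is the signal described above, while parsing the text is only a secondary sanity check.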
class revscoring.datasources.revision_oriented.UserInfo(name)
    Represents a user’s information.

    editcount = None
        int : A count of edits the user has ever saved

    registration = None
        mwtypes.Timestamp : The date the user registered or None

    groups = None
        list ( str ) : The groups the user is a member of

    emailable = None
        bool : True if the user is emailable, False otherwise

    gender = None
        str : A string representing the user’s gender preference
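Values like these are typically turned into numeric or boolean signals. The following is a minimal sketch under stated assumptions: it uses `datetime` in place of mwtypes.Timestamp, the `user_info` values are made up, and `account_age_seconds` and `in_any_group` are hypothetical helpers rather than revscoring features.

```python
from datetime import datetime, timezone


def account_age_seconds(registration, revision_timestamp):
    """Seconds between registration and the revision, or None if unknown.

    The registration value may be None, so that case is handled
    explicitly instead of raising.
    """
    if registration is None:
        return None
    delta = revision_timestamp - registration
    return max(delta.total_seconds(), 0.0)


def in_any_group(groups, wanted):
    """True when the user belongs to at least one of the wanted groups."""
    return not set(groups or ()).isdisjoint(wanted)


# Hypothetical user information, shaped like the attributes listed above.
user_info = {
    "editcount": 1500,
    "registration": datetime(2015, 1, 1, tzinfo=timezone.utc),
    "groups": ["user", "autoconfirmed"],
    "emailable": True,
    "gender": "unknown",
}
saved_at = datetime(2015, 1, 2, tzinfo=timezone.utc)

age = account_age_seconds(user_info["registration"], saved_at)
is_privileged = in_any_group(user_info["groups"], {"sysop", "bot"})
```

Returning None for a missing registration date, rather than 0, keeps "unknown" distinct from "registered just now"; whether that distinction matters depends on the model consuming the value.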