Research:Revision scoring as a service/Revscoring library
Key features
Scorer abstraction
...todo...
Feature extraction garden
When supporting an ecosystem of multiple models that use similar features, it's important that features (1) are well defined and (2) don't duplicate work. The Feature dependencies figure depicts a set of example features and their dependencies on datasources and on other features. By using a dependency injection strategy to specify and resolve the relationships between features and datasources, new features can be developed easily on top of existing features and datasources. The same strategy minimizes the work the system must perform when building feature sets for many different models, because a shared datasource or feature only needs to be computed once.
Feature dependencies. Dependencies for features and datasources are presented. Datasources can depend on other datasources. Features can depend on both datasources and other features.
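The dependency injection strategy can be sketched in Python. This is a minimal illustration, not the revscoring library's actual API: the `Dependent` class, the `solve` function, and the toy datasource and features below are hypothetical names chosen for this example. The cache is what lets a shared datasource be computed once, even when several features depend on it.

```python
class Dependent:
    """A named value computed from other dependents (hypothetical helper)."""

    def __init__(self, name, process, depends_on=()):
        self.name = name
        self.process = process
        self.depends_on = tuple(depends_on)


def solve(dependent, cache):
    """Recursively compute `dependent`, storing every result in `cache`."""
    if dependent.name not in cache:
        # Resolve dependencies first, then inject them as arguments.
        args = [solve(dep, cache) for dep in dependent.depends_on]
        cache[dependent.name] = dependent.process(*args)
    return cache[dependent.name]


# Count how often the (toy) datasource is actually evaluated.
calls = {"revision_text": 0}


def fetch_text():
    calls["revision_text"] += 1
    return "the quick brown fox"


# One datasource and three features; two features share the datasource,
# and the third depends on the other two features.
revision_text = Dependent("revision_text", fetch_text)
words = Dependent("words", lambda text: len(text.split()), [revision_text])
chars = Dependent("chars", lambda text: len(text), [revision_text])
chars_per_word = Dependent("chars_per_word", lambda c, w: c / w, [chars, words])

# A single cache is shared across all features requested for one revision.
cache = {}
features = [solve(f, cache) for f in (words, chars, chars_per_word)]
```

Although `revision_text` is needed by every feature in the list, `fetch_text` runs only once, because later calls to `solve` find the value in the shared cache. A model that needs only `words` would pass a fresh cache and compute just that branch.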
Example Makefile-style dependency expression for MisspellingRatioDifferential
WordsAdded: RevisionDiff
    <parse revision diff> \
    return count
MisspellingsAdded: RevisionDiff Dictionary
    <parse revision diff and use Dictionary to find misspellings> \
    return count
PreviousWords: ParsedPreviousRevisionText
    <parse non-markup content> \
    return count
PreviousMisspellings: ParsedPreviousRevisionText Dictionary
    <parse non-markup content and use Dictionary to find misspellings> \
    return count
MisspellingRatioDifferential: WordsAdded MisspellingsAdded PreviousWords PreviousMisspellings
    return (MisspellingsAdded/WordsAdded) / \
           ((MisspellingsAdded/WordsAdded)+(PreviousMisspellings/PreviousWords))
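As a worked example, the final expression above can be written as a plain Python function. This is only a sketch: the function name and the counts are hypothetical, and the handling of zero denominators (returning 0.0) is an assumption added here, since the expression above doesn't say what should happen when no words were added or when both ratios are zero.

```python
def misspelling_ratio_differential(words_added, misspellings_added,
                                   previous_words, previous_misspellings):
    """Share of the combined misspelling rate contributed by the added text.

    Zero-denominator handling (returning 0.0) is an assumption of this
    sketch, not part of the original expression.
    """
    added_ratio = misspellings_added / words_added if words_added else 0.0
    previous_ratio = (previous_misspellings / previous_words
                      if previous_words else 0.0)
    total = added_ratio + previous_ratio
    return added_ratio / total if total else 0.0


# Hypothetical counts: 2 of 10 added words are misspelled, against
# 5 misspellings in 100 words of the previous revision.
value = misspelling_ratio_differential(10, 2, 100, 5)
```

With these counts the added text has a misspelling rate of 0.2 and the previous revision a rate of 0.05, so the result is 0.2 / 0.25, or about 0.8. Values near 1 mean the added text is misspelled at a much higher rate than the text it was added to.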
Model files
...todo...