Research:Automated classification of edit types/Taxonomy
This page documents a complete and inclusive taxonomy. The goal is to capture all potential change types that describe editing activity on Wikipedia. A practical subset will be used for the automated classification system, but identifying that subset is left to a separate discussion.
Syntactic
These classes describe "what" was done during an edit (as opposed to "why").
Mechanical operations
These types of changes can be detected with simple regular expressions.
- wiki links
  - insert/delete
  - modify
  - disambiguate
- inter-wiki links
  - insert/modify/delete
- external links
  - insert/modify/delete
- category
  - insert/modify/delete
- headers
  - insert/modify/delete
- table
  - insert/modify/delete
- image
  - insert/modify/delete
- references
  - insert/modify/delete
- content move / refactor
- redirect
- cleanup
  - punctuation
    - insert/delete
  - whitespace
    - insert/delete
  - formatting -- css/style/bold/italics
  - punctuation
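As an illustration of how a few of the mechanical classes above might be detected, here is a minimal Python sketch. The patterns, the `PATTERNS` table, and the `classify_mechanical` function are all hypothetical and not part of any existing classifier. The patterns are deliberately simplified: they ignore nested links inside image captions, and they count inter-wiki links as wiki links. The "modify" label is only a heuristic, assigned when an element of the same type disappears and another appears in the same edit.

```python
import re
from collections import Counter

# Hypothetical, simplified patterns for some element types in the list above.
PATTERNS = {
    "category": re.compile(r"\[\[\s*Category\s*:[^\]\|]+(?:\|[^\]]*)?\]\]", re.IGNORECASE),
    "image": re.compile(r"\[\[\s*(?:File|Image)\s*:[^\]]+\]\]", re.IGNORECASE),
    "wiki link": re.compile(r"\[\[(?!\s*(?:Category|File|Image)\s*:)[^\]]+\]\]", re.IGNORECASE),
    "external link": re.compile(r"(?<!\[)\[https?://[^\s\]]+[^\]]*\]"),
    "header": re.compile(r"^(={2,6})[^=\n].*?\1[ \t]*$", re.MULTILINE),
    "reference": re.compile(r"<ref[^>/]*>.*?</ref>|<ref[^>]*/>", re.DOTALL | re.IGNORECASE),
}


def classify_mechanical(old_text, new_text):
    """Return labels such as 'header: modify' for two revisions of a page."""
    labels = []
    for name, pattern in PATTERNS.items():
        # Count each distinct element in the old and new revision.
        old = Counter(m.group(0) for m in pattern.finditer(old_text))
        new = Counter(m.group(0) for m in pattern.finditer(new_text))
        inserted = sum((new - old).values())  # elements only in the new revision
        deleted = sum((old - new).values())   # elements only in the old revision
        if inserted and deleted:
            # Heuristic: one element gone and another added is treated as a modification.
            labels.append(f"{name}: modify")
        elif inserted:
            labels.append(f"{name}: insert")
        elif deleted:
            labels.append(f"{name}: delete")
    return labels


if __name__ == "__main__":
    before = "== History ==\nParis is a city.\n"
    after = "== Early history ==\n[[Paris]] is a city.<ref>Some source</ref>\n"
    print(classify_mechanical(before, after))
```

Comparing multisets of matched elements, rather than running the patterns over a line diff, keeps the sketch short. It cannot tell a moved element from an untouched one, so classes such as content move / refactor would need a real diff.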
Abstract/probabilistic operations
These classes cannot be detected trivially with regular expressions; they would require some form of machine prediction.
- Grammar (word-level)
  - punctuation, whitespace
  - spelling error, typo
  - capitalization
  - tense change
- Rephrase (word-level)
  - synonym
  - remove redundant words
- Sentence (sentence-level)
  - insert/modify/delete (substantive)
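The word-level classes above call for prediction rather than pattern matching, but a token-level diff can still supply candidate features for such a model. The following Python sketch uses the standard library's `difflib`. The function name `word_level_features`, the feature names, and the 0.8 similarity threshold are assumptions chosen for illustration. It does not attempt to recognise tense changes or synonyms, which would need linguistic resources beyond a diff.

```python
import difflib
import re

# Split into words and single punctuation marks; whitespace is dropped.
TOKEN = re.compile(r"\w+|[^\w\s]")


def is_punctuation(token):
    return re.fullmatch(r"[^\w\s]", token) is not None


def word_level_features(old_sentence, new_sentence):
    """Count rough word-level change signals between two versions of a sentence."""
    old_tokens = TOKEN.findall(old_sentence)
    new_tokens = TOKEN.findall(new_sentence)
    features = {"capitalization": 0, "near_spelling": 0, "punctuation": 0, "other": 0}
    matcher = difflib.SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        old_span, new_span = old_tokens[i1:i2], new_tokens[j1:j2]
        if all(is_punctuation(t) for t in old_span + new_span):
            # Only punctuation marks were inserted, deleted, or swapped.
            features["punctuation"] += 1
        elif tag == "replace" and len(old_span) == len(new_span) == 1:
            a, b = old_span[0], new_span[0]
            if a.lower() == b.lower():
                features["capitalization"] += 1
            elif difflib.SequenceMatcher(a=a, b=b).ratio() >= 0.8:
                # Assumed threshold: very similar strings suggest a spelling fix.
                features["near_spelling"] += 1
            else:
                features["other"] += 1
        else:
            features["other"] += 1
    return features


if __name__ == "__main__":
    print(word_level_features("the cat recieved food", "The cat received food."))
```

The counts are meant as inputs to a classifier, not as labels in their own right: a single-word replacement counted under "other" could be a synonym swap, a tense change, or a substantive edit.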
Semantic
These classes describe "why" an edit was made. They usually amount to subjective applications of policy.
- NPOV
- Vandalism
- Notable?
- External link policy
- Manual of style
- New topic (article creation)
Complex operations
These classes describe changes that are part of a multi-edit operation.
- Merge
- Archiving
Discussion
These classes describe actions relevant to a discussion.
- New topic
- Reply
- !Vote (Support/oppose)
- Comment signing
- Suggestion
- WP tagging/assessment