Objective Revision Evaluation Service/goodfaith
Detecting and removing damaging contributions is one of the most critical concerns for Wikimedia's open projects. This model was trained on human judgement[1] of whether an edit was probably made in good faith. It is useful for directing newcomer socialization efforts (e.g. en:User:HostBot) and for detecting vandals and spammers.
The model predicts whether an edit was made in good faith. Because of limitations in the field of natural language processing, sarcasm and other kinds of cleverness in vandalism are likely to fool the model. Keep this in mind when consuming scores.
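Because the model can be fooled, one cautious way to consume a score is to act automatically only on confident predictions and send everything else to a human. The following is a minimal sketch of that idea, not the service's own logic. It assumes a score shaped like `{"probability": {"true": ..., "false": ...}}`; that shape, the function name `triage`, and both cut-off values are assumptions made for illustration. Real cut-offs should come from the per-wiki threshold statistics listed under Contexts.

```python
def triage(score, good_threshold=0.9, bad_threshold=0.1):
    """Sort a goodfaith score into one of three buckets.

    `score` is assumed to look like
    {"probability": {"true": 0.97, "false": 0.03}} (hypothetical shape).
    The default thresholds are placeholders, not recommended values.
    """
    p_good_faith = score["probability"]["true"]
    if p_good_faith >= good_threshold:
        return "good-faith"
    if p_good_faith <= bad_threshold:
        return "bad-faith"
    # Mid-range probabilities are where sarcasm and clever vandalism
    # are most likely to hide, so leave them for human review.
    return "needs-review"


confident = {"probability": {"true": 0.97, "false": 0.03}}
borderline = {"probability": {"true": 0.55, "false": 0.45}}
suspicious = {"probability": {"true": 0.04, "false": 0.96}}

for example in (confident, borderline, suspicious):
    print(triage(example))
```

Widening the gap between the two thresholds sends more edits to human review and fewer to automatic handling.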
Contexts (wikis)
English Wikipedia (enwiki)
https://ores.wmflabs.org/v2/scores/enwiki/goodfaith/?model_info
ScikitLearnClassifier
 - type: GradientBoosting
 - params: subsample=1.0, max_features="log2", loss="deviance", learning_rate=0.01, center=true, verbose=0, warm_start=false, presort="auto", max_depth=7, scale=true, min_weight_fraction_leaf=0.0, balanced_sample_weight=true, random_state=null, init=null, n_estimators=700, min_samples_leaf=1, max_leaf_nodes=null, min_samples_split=2, balanced_sample=false
 - version: 0.3.0
 - trained: 2017-01-06T19:35:15.426659

Table:
         ~False    ~True
-----  --------  -------
False       428      212
True       1699    17194

Accuracy: 0.902

Precision:
-----  -----
False  0.201
True   0.988
-----  -----

Recall:
-----  -----
False  0.667
True   0.91
-----  -----

PR-AUC:
-----  -----
False  0.383
True   0.993
-----  -----

ROC-AUC:
-----  -----
False  0.907
True   0.905
-----  -----

Recall @ 0.1 false-positive rate:
label      threshold    recall    fpr
-------  -----------  --------  -----
False          0.475     0.688  0.098
True           0.88      0.704  0.097

Recall @ 0.98 precision:
label      threshold    recall    precision
-------  -----------  --------  -----------
False          0.96      0.046        1
True           0.24      0.977        0.981

Recall @ 0.9 precision:
label      threshold    recall    precision
-------  -----------  --------  -----------
False          0.957     0.053        0.99
True           0.038     1            0.968

Recall @ 0.45 precision:
label      threshold    recall    precision
-------  -----------  --------  -----------
False          0.808     0.364        0.481
True           0.038     1            0.968

Recall @ 0.15 precision:
label      threshold    recall    precision
-------  -----------  --------  -----------
False          0.322     0.777        0.155
True           0.038     1            0.968
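As a cross-check on the English Wikipedia statistics above, the summary metrics can be recomputed from the confusion table. The sketch below assumes that rows are the labeled classes and that the columns (~False, ~True) are the predictions; the source does not state this orientation, though it is consistent with the reported precision values. The recomputed figures land within a few thousandths of the reported ones (recall for False comes out near 0.669 against a reported 0.667), so small differences from rounding or from how the statistics were generated should be expected.

```python
# enwiki confusion table copied from the model_info output.
# Assumption: outer key = labeled class, inner key = predicted class.
table = {
    False: {False: 428, True: 212},
    True: {False: 1699, True: 17194},
}


def metrics(table):
    """Compute accuracy plus per-label precision and recall."""
    labels = list(table)
    total = sum(table[actual][pred] for actual in labels for pred in labels)
    correct = sum(table[label][label] for label in labels)
    # Precision: correct predictions of a label / all predictions of that label.
    precision = {
        label: table[label][label] / sum(table[actual][label] for actual in labels)
        for label in labels
    }
    # Recall: correct predictions of a label / all observations with that label.
    recall = {
        label: table[label][label] / sum(table[label][pred] for pred in labels)
        for label in labels
    }
    return {"accuracy": correct / total, "precision": precision, "recall": recall}


result = metrics(table)
print("accuracy:", result["accuracy"])
for label in (False, True):
    print(label, "precision:", result["precision"][label],
          "recall:", result["recall"][label])
```

The gap between precision for False (about 0.2) and for True (about 0.99) reflects how rare bad-faith edits are in the sample: even a small share of misclassified good-faith edits outnumbers the correctly flagged bad-faith ones.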
Persian Wikipedia (fawiki)
https://ores.wmflabs.org/v2/scores/fawiki/goodfaith/?model_info
ScikitLearnClassifier
 - type: GradientBoosting
 - params: presort="auto", max_features="log2", scale=true, max_leaf_nodes=null, init=null, verbose=0, random_state=null, learning_rate=0.01, balanced_sample_weight=true, n_estimators=700, balanced_sample=false, subsample=1.0, warm_start=false, min_samples_leaf=1, max_depth=7, min_weight_fraction_leaf=0.0, loss="deviance", center=true, min_samples_split=2
 - version: 0.3.0
 - trained: 2017-01-06T20:21:04.924687

Table:
         ~False    ~True
-----  --------  -------
False        87       77
True        472    19168

Accuracy: 0.972

Precision:
-----  -----
False  0.158
True   0.996
-----  -----

Recall:
-----  -----
False  0.532
True   0.976
-----  -----

PR-AUC:
-----  -----
False  0.211
True   0.995
-----  -----

ROC-AUC:
-----  -----
False  0.974
True   0.964
-----  -----

Recall @ 0.1 false-positive rate:
label      threshold    recall    fpr
-------  -----------  --------  -----
False          0.09      0.939  0.077
True           0.89      0.922  0.079

Recall @ 0.98 precision:
label      threshold    recall    precision
-------  -----------  --------  -----------
False          0.953     0.102        1
True           0.051     1            0.992

Recall @ 0.9 precision:
label      threshold    recall    precision
-------  -----------  --------  -----------
False          0.953     0.102        1
True           0.051     1            0.992

Recall @ 0.45 precision:
label      threshold    recall    precision
-------  -----------  --------  -----------
False          0.936     0.095        0.633
True           0.051     1            0.992

Recall @ 0.15 precision:
label      threshold    recall    precision
-------  -----------  --------  -----------
False          0.439     0.652        0.162
True           0.051     1            0.992
Dutch Wikipedia (nlwiki)
https://ores.wmflabs.org/v2/scores/nlwiki/goodfaith/?model_info
ScikitLearnClassifier
 - type: GradientBoosting
 - params: loss="deviance", max_features="log2", center=true, warm_start=false, subsample=1.0, scale=true, random_state=null, presort="auto", max_depth=5, min_samples_leaf=1, balanced_sample=false, n_estimators=700, min_samples_split=2, learning_rate=0.01, max_leaf_nodes=null, init=null, min_weight_fraction_leaf=0.0, balanced_sample_weight=true, verbose=0
 - version: 0.3.0
 - trained: 2017-01-06T21:54:13.608947

Table:
         ~False    ~True
-----  --------  -------
False       601       70
True       1500    17293

Accuracy: 0.919

Precision:
-----  -----
False  0.286
True   0.996
-----  -----

Recall:
-----  -----
False  0.896
True   0.92
-----  -----

PR-AUC:
-----  -----
False  0.677
True   0.995
-----  -----

ROC-AUC:
-----  -----
False  0.971
True   0.971
-----  -----

Recall @ 0.1 false-positive rate:
label      threshold    recall    fpr
-------  -----------  --------  -----
False          0.361     0.935  0.094
True           0.5       0.922  0.094

Recall @ 0.98 precision:
label      threshold    recall    precision
-------  -----------  --------  -----------
False          0.967     0.198        1
True           0.072     0.996        0.981

Recall @ 0.9 precision:
label      threshold    recall    precision
-------  -----------  --------  -----------
False          0.954     0.302        0.92
True           0.024     1            0.969

Recall @ 0.45 precision:
label      threshold    recall    precision
-------  -----------  --------  -----------
False          0.803     0.756        0.466
True           0.024     1            0.969

Recall @ 0.15 precision:
label      threshold    recall    precision
-------  -----------  --------  -----------
False          0.091     0.975        0.171
True           0.024     1            0.969
Polish Wikipedia (plwiki)
https://ores.wmflabs.org/v2/scores/plwiki/goodfaith/?model_info
ScikitLearnClassifier
 - type: RF
 - params: center=true, n_estimators=320, max_depth=null, balanced_sample_weight=true, min_samples_split=2, min_samples_leaf=1, verbose=0, min_weight_fraction_leaf=0.0, criterion="entropy", oob_score=false, n_jobs=1, class_weight=null, max_leaf_nodes=null, random_state=null, scale=true, max_features="log2", balanced_sample=false, bootstrap=true, warm_start=false
 - version: 0.3.0
 - trained: 2017-01-06T22:30:20.768873

Table:
         ~False    ~True
-----  --------  -------
False       527       67
True          4    11998

Accuracy: 0.994

Precision:
-----  -----
False  0.991
True   0.994
-----  -----

Recall:
-----  -----
False  0.888
True   1
-----  -----

PR-AUC:
-----  -----
False  0.953
True   0.995
-----  -----

ROC-AUC:
-----  -----
False  0.985
True   0.989
-----  -----

Recall @ 0.1 false-positive rate:
label      threshold    recall    fpr
-------  -----------  --------  -----
False          0.047     0.974  0.062
True           0.675     0.995  0.086

Recall @ 0.98 precision:
label      threshold    recall    precision
-------  -----------  --------  -----------
False          0.4       0.918        0.991
True           0.293     1            0.989

Recall @ 0.9 precision:
label      threshold    recall    precision
-------  -----------  --------  -----------
False          0.252     0.923        0.944
True           0.133     1            0.974

Recall @ 0.45 precision:
label      threshold    recall    precision
-------  -----------  --------  -----------
False          0.062     0.962        0.595
True           0.133     1            0.974

Recall @ 0.15 precision:
label      threshold    recall    precision
-------  -----------  --------  -----------
False          0.013     0.988        0.199
True           0.133     1            0.974
Portuguese Wikipedia (ptwiki)
https://ores.wmflabs.org/v2/scores/ptwiki/goodfaith/?model_info
ScikitLearnClassifier
 - type: GradientBoosting
 - params: scale=true, balanced_sample_weight=true, learning_rate=0.01, min_weight_fraction_leaf=0.0, max_depth=7, center=true, random_state=null, max_leaf_nodes=null, init=null, presort="auto", warm_start=false, min_samples_leaf=1, subsample=1.0, min_samples_split=2, verbose=0, loss="deviance", balanced_sample=false, n_estimators=700, max_features="log2"
 - version: 0.3.0
 - trained: 2017-01-06T22:48:01.162565

Table:
         ~False    ~True
-----  --------  -------
False       935      258
True       2173    16447

Accuracy: 0.877

Precision:
-----  -----
False  0.301
True   0.985
-----  -----

Recall:
-----  -----
False  0.784
True   0.883
-----  -----

PR-AUC:
-----  -----
False  0.522
True   0.992
-----  -----

ROC-AUC:
-----  -----
False  0.937
True   0.932
-----  -----

Recall @ 0.1 false-positive rate:
label      threshold    recall    fpr
-------  -----------  --------  -----
False          0.554     0.744  0.099
True           0.729     0.807  0.096

Recall @ 0.98 precision:
label      threshold    recall    precision
-------  -----------  --------  -----------
False          0.959     0.053        1
True           0.396     0.916        0.98

Recall @ 0.9 precision:
label      threshold    recall    precision
-------  -----------  --------  -----------
False          0.95      0.091        0.969
True           0.034     1            0.941

Recall @ 0.45 precision:
label      threshold    recall    precision
-------  -----------  --------  -----------
False          0.749     0.577        0.457
True           0.034     1            0.941

Recall @ 0.15 precision:
label      threshold    recall    precision
-------  -----------  --------  -----------
False          0.065     0.976        0.157
True           0.034     1            0.941
Turkish Wikipedia (trwiki)
https://ores.wmflabs.org/v2/scores/trwiki/goodfaith/?model_info
ScikitLearnClassifier
 - type: GradientBoosting
 - params: n_estimators=700, min_samples_leaf=1, scale=true, center=true, learning_rate=0.01, init=null, subsample=1.0, max_depth=7, min_weight_fraction_leaf=0.0, balanced_sample_weight=true, max_features="log2", warm_start=false, max_leaf_nodes=null, loss="deviance", balanced_sample=false, random_state=null, verbose=0, presort="auto", min_samples_split=2
 - version: 0.3.0
 - trained: 2017-01-06T23:29:38.432498

Table:
         ~False    ~True
-----  --------  -------
False       714      191
True       2678    16148

Accuracy: 0.855

Precision:
-----  -----
False  0.21
True   0.988
-----  -----

Recall:
-----  -----
False  0.787
True   0.858
-----  -----

PR-AUC:
-----  -----
False  0.292
True   0.992
-----  -----

ROC-AUC:
-----  -----
False  0.914
True   0.908
-----  -----

Recall @ 0.1 false-positive rate:
label      threshold    recall    fpr
-------  -----------  --------  -----
False          0.659     0.656  0.099
True           0.764     0.794  0.095

Recall @ 0.98 precision:
label      threshold    recall    precision
-------  -----------  --------  -----------
False          0.923     0.021        1
True           0.315     0.91         0.98

Recall @ 0.9 precision:
label      threshold    recall    precision
-------  -----------  --------  -----------
False          0.923     0.021        1
True           0.08      1            0.955

Recall @ 0.45 precision:
label      threshold    recall    precision
-------  -----------  --------  -----------
False          0.883     0.111        0.491
True           0.08      1            0.955

Recall @ 0.15 precision:
label      threshold    recall    precision
-------  -----------  --------  -----------
False          0.128     0.936        0.156
True           0.08      1            0.955
Wikidata (wikidatawiki)
https://ores.wmflabs.org/v2/scores/wikidatawiki/goodfaith/?model_info
ScikitLearnClassifier
 - type: GradientBoosting
 - params: balanced_sample=false, center=true, verbose=0, presort="auto", scale=true, init=null, subsample=1.0, random_state=null, min_samples_leaf=1, max_depth=5, loss="deviance", min_weight_fraction_leaf=0.0, max_features="log2", learning_rate=0.1, n_estimators=300, warm_start=false, min_samples_split=2, max_leaf_nodes=null, balanced_sample_weight=true
 - version: 0.3.0
 - trained: 2017-01-07T00:57:21.651623

Table:
         ~False    ~True
-----  --------  -------
False      2091      155
True       1009    21177

Accuracy: 0.952

Precision:
-----  -----
False  0.675
True   0.993
-----  -----

Recall:
-----  -----
False  0.931
True   0.955
-----  -----

PR-AUC:
-----  -----
False  0.792
True   0.994
-----  -----

ROC-AUC:
-----  -----
False  0.987
True   0.979
-----  -----

Recall @ 0.1 false-positive rate:
label      threshold    recall    fpr
-------  -----------  --------  -----
False          0.093     0.986  0.096
True           0.277     0.965  0.096

Recall @ 0.98 precision:
label      threshold    recall    precision
-------  -----------  --------  -----------
False          0.993     0.034        1
True           0.077     0.974        0.98

Recall @ 0.9 precision:
label      threshold    recall    precision
-------  -----------  --------  -----------
False          0.99      0.087        0.934
True           0.006     1            0.909

Recall @ 0.45 precision:
label      threshold    recall    precision
-------  -----------  --------  -----------
False          0.054     0.992        0.471
True           0.006     1            0.909

Recall @ 0.15 precision:
label      threshold    recall    precision
-------  -----------  --------  -----------
False          0.006     1            0.245
True           0.006     1            0.909
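The model_info URLs listed for each context share one pattern, with only the wiki identifier changing. A small sketch that rebuilds them from the context names on this page follows. The helper name `model_info_url` is hypothetical, and the code only formats strings: it makes no request and does not check whether the endpoint is still served.

```python
# Base path and context identifiers as they appear on this page.
BASE_URL = "https://ores.wmflabs.org/v2/scores"
CONTEXTS = [
    "enwiki",
    "fawiki",
    "nlwiki",
    "plwiki",
    "ptwiki",
    "trwiki",
    "wikidatawiki",
]


def model_info_url(context, model="goodfaith"):
    """Build the model_info URL for one wiki context (hypothetical helper)."""
    return f"{BASE_URL}/{context}/{model}/?model_info"


for context in CONTEXTS:
    print(model_info_url(context))
```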
References
[1] See en:Wikipedia:Labels/Edit quality for the English Wikipedia manual labeling campaign.