User:KHarold (WMF)/Sandbox/Learning Patterns/Contest Scoring Systems

Learning Pattern Library

Contest Scoring Systems

Problem: You want to set up or improve an editing contest scoring system.

Solution: Scoring systems can be simple or complex, and can be adjusted to meet contest goals.

Creator: KHarold (WMF)

Status: in progress

What problem does this solve?

Contests should have a clear scoring system so that participants know what tasks to focus on and so that judges can determine the winners of the contest. Use the guidelines and examples below to choose a scoring system for your next contest.

What is the solution?

It is a good idea to agree on the scoring system for a contest before the competition begins. You can start this discussion on the contest event talk page.^[1]^[2] Scoring systems for contests can be very simple or very complex, with different values applied to different kinds of tasks or content. When starting a new competition, it is a good idea to use a simple system and add complexity as needed.^[3]

With point-based systems, contest participants are likely to look at work done by others and check that their scores are correct.^[4] If you use a bot or tool to count points, consider reviewing a sample of edits to make sure that the scoring is correct.
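If your bot or tool produces a list of scored edits, a random spot-check could look like the following minimal Python sketch. The function name `sample_for_review`, the `(user, page, points)` record layout, and the sample size are illustrative assumptions, not part of any existing contest tool.

```python
import random


def sample_for_review(scored_edits, sample_size, seed=None):
    """Pick a random subset of scored edits for a human to re-check.

    scored_edits: list of records produced by the scoring tool
                  (the layout used below is hypothetical).
    sample_size:  how many edits to review; capped at the list length.
    seed:         optional seed so the same sample can be reproduced.
    """
    rng = random.Random(seed)
    k = min(sample_size, len(scored_edits))
    return rng.sample(scored_edits, k)


# Hypothetical tool output: (user, page, points awarded)
scored_edits = [
    ("Ana", "Article A", 3),
    ("Ben", "Article B", 1),
    ("Ana", "Article C", 3),
    ("Caro", "Article D", 4),
    ("Ben", "Article E", 1),
]

for user, page, points in sample_for_review(scored_edits, sample_size=2, seed=1):
    print(f"Re-check {page} by {user}: tool awarded {points} points")
```

Publishing the seed along with the sample lets other participants reproduce the same selection, which may help when scores are disputed.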

Simple Point Based System

Simple Self-Score System
New Article	3 points
1,000 Bytes	1 point
Image Added	3 points
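
The simple table above can be turned into a small calculator that participants use to self-score. This is a minimal sketch: the function name `simple_score` is hypothetical, and it assumes that only complete blocks of 1,000 bytes earn a point, which your community may decide differently.

```python
# Point values taken from the simple self-score table.
POINTS_NEW_ARTICLE = 3
POINTS_PER_1000_BYTES = 1
POINTS_IMAGE_ADDED = 3


def simple_score(new_articles: int, bytes_added: int, images_added: int) -> int:
    """Return a participant's total under the simple system.

    Assumption: partial blocks of 1,000 bytes earn nothing.
    """
    return (
        new_articles * POINTS_NEW_ARTICLE
        + (bytes_added // 1000) * POINTS_PER_1000_BYTES
        + images_added * POINTS_IMAGE_ADDED
    )


# 2 new articles (6) + 4 full blocks of 1,000 bytes (4) + 1 image (3)
print(simple_score(new_articles=2, bytes_added=4500, images_added=1))  # 13
```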

Complex Point Based System

Different weights are applied to different tasks.
It is best to use a bot or tool to automate scoring.

Weighted scoring system
New article started	10 points (not including redirects)
Bytes added	0.001 points per byte, up to a maximum of 50 points
Words added	0.01 points per word, not including templates or other technical elements
References	5 points for every new source, 1 point for every reference to an existing source
Images	5 points for every picture, up to a limit of 15 points
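
A bot or tool for the weighted table above might compute a score along these lines. This is only a sketch under stated assumptions: the `Contribution` class and `weighted_score` function are hypothetical names, the byte and word rates are read as "per byte" and "per word", and the counts are assumed to have been cleaned already (redirects, templates, and similar content excluded before scoring).

```python
from dataclasses import dataclass


@dataclass
class Contribution:
    """One participant's counted work (all fields are assumed inputs)."""
    new_articles: int = 0    # redirects already excluded
    bytes_added: int = 0
    words_added: int = 0     # templates already excluded
    new_sources: int = 0
    reused_sources: int = 0  # references to an existing source
    images: int = 0


def weighted_score(c: Contribution) -> float:
    """Apply the weights and caps from the weighted scoring table."""
    article_points = 10 * c.new_articles
    byte_points = min(c.bytes_added / 1000, 50)  # 0.001 per byte, cap 50
    word_points = c.words_added / 100            # 0.01 per word
    reference_points = 5 * c.new_sources + 1 * c.reused_sources
    image_points = min(5 * c.images, 15)         # 5 per picture, cap 15
    return (article_points + byte_points + word_points
            + reference_points + image_points)


example = Contribution(new_articles=1, bytes_added=12000, words_added=850,
                       new_sources=3, reused_sources=2, images=4)
# 10 + 12.0 + 8.5 + (15 + 2) + 15 (image cap applied)
print(weighted_score(example))  # 62.5
```

Keeping each weight and cap on its own line makes it easier to adjust a single parameter if the community decides to tweak the system between contests.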

Qualitative

Judges give scores based on the quality of a submission, or the value that a contest participant has added to Wikipedia.^[5]
This kind of system can be problematic because judging is subjective. Contestants may complain about outcomes.
You might use this system to judge long-term contests where participants are working on a wide variety of tasks. Judges should be experienced Wikipedians who are respected or trusted by the community for maintaining neutrality.
You might use this system on content-gap contests, where you may not have experienced Wikipedians with significant knowledge of a subject. Consider asking partner organizations or content experts to act as judges.

Collaboration

Some contests give a prize for collaboration.
One way to do this is to score an article using whatever system the community has agreed on, then award the points to all participants who contributed to that article.
Another way to score collaboration is to watch the interaction between contest participants, using a watchlist or the education extension, to see who contributes the most to other participants' articles.^[6]
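
The first approach, where every contributor to an article receives that article's full score, can be sketched as follows. The function name `award_collaboration_points` and the two input dictionaries are hypothetical; how article scores and contributor lists are gathered is left to your contest's own tools.

```python
from collections import defaultdict


def award_collaboration_points(article_scores, article_contributors):
    """Give each article's score to every participant who contributed to it.

    article_scores:       {article title: points for the article}
    article_contributors: {article title: list of usernames}
    """
    totals = defaultdict(int)
    for article, score in article_scores.items():
        # set() ensures a user listed twice is only credited once per article
        for user in set(article_contributors.get(article, ())):
            totals[user] += score
    return dict(totals)


# Hypothetical contest data
article_scores = {"Article A": 12, "Article B": 5}
article_contributors = {
    "Article A": ["Ana", "Ben"],
    "Article B": ["Ben"],
}

totals = award_collaboration_points(article_scores, article_contributors)
print(sorted(totals.items()))  # [('Ana', 12), ('Ben', 17)]
```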

Quality

Your community must decide what quality means and how it will be judged.
Some competitions give points for the number of references and citations added, or for adding images to articles.

Add YOUR contest scoring tips!

General considerations

When to use

“There was a conversation in the community about scoring for a few weeks before the competition began. 1 point for any spelling typo and broader errors as well: eg the the beatles, then vs than. There was more weight to larger typos, eg 5 pts to vs too vs two. 1 pts for each unique typo on page.” - Tyop
"It has grown quite complex, because of people who gamed the system. We added in these rules to circumvent that. Even though it is quite complex, it is predefined and people know what it is." "If I started a competition on a different Wikipedia, I would start with a simple scoring system, and just add on complexity as needed." - Lars
“Marks given for quantity, how long is the article, how many articles were written, but marks also given for quality.” - WikiWomen

Related patterns

External links

References

[1] “The community is active on the talk page in the days before the contest begins, discussing what quality means and what will be measured. Normally, the community decides what will be measured.” - Alex
[2] “There was a conversation in the community about scoring for a few weeks before the competition began.” - Tyop
[3] “We have tweaked the scoring system, for example, the parameters. When we started out, a new article was worth 30 points, which meant that people were creating all of these stubs and gaming the system. So we changed things to conquer that. One major problem was with just one user, actually, who was machine translating things through Google Translate. The difficulty was that you don't want to discourage people from participation, but they are participating with poor content. We were stuck between a rock and a hard place. Someone else dealt with it, and it wasn't pretty. That user didn't really see the problem with creating poor quality translations. This affected the motivation of others because they saw that he was getting points, and I'm not getting points for all of the work I am putting into it.” - Lars
[4] "Scaling judging can be difficult, so we use the Keep It Simple Stupid (KISS) rule for most contests. It is up to people to count their own points. Usually 2nd and 3rd place winners count the other winners’ points, just to check that scores are correct. It is community led and community managed, so there are no judges at all." - Amical
[5] “The scoring system is very general. Did they accomplish their goals? Within these goals, what is the quality of the work done? Tell participants to break down what sort of articles they are creating: seed, good, featured. Did this person impact WP more than anyone else?” WHO SAID THIS!
[6] "I keep all of the pages in the competition on my watchlist. In the second competition, it was clear who was the most helpful [to others] so that person won a prize for collaboration." - Yoni PhysiWiki
