Web2Cit
Wikipedia's automatic citation generator (Citoid) greatly reduces the time, effort and knowledge needed to insert citations, one of Wikipedia's main pillars. However, these automatic citations don't always work as expected (see Research).
Web2Cit is a collaborative automatic citation generator for web sources, meant to complement the citation results returned by Citoid. It is controlled collaboratively by the community through a set of relatively simple tools that aim to lower the technical skills needed to help improve automatic citations.
Getting started
Install
To use Web2Cit automatic citations in Wikipedia, install our user script in your Wikipedia account[Notes 1] by pasting the following code into your Wikipedia Special:MyPage/common.js file:
// Web2Cit
mw.loader.load( '//en.wikipedia.org/w/index.php?title=User:Diegodlh/Web2Cit/script.js&action=raw&ctype=text/javascript' ); // Backlink: [[:en:User:Diegodlh/Web2Cit/script.js]]
Read our User script documentation for detailed instructions.
Use
Once installed, you should see a Web2Cit checkbox in the Visual Editor's "Cite" dialog, confirming installation.
Enter the URL you would like to cite and click on Generate. You should now get two citation results instead of one: the one at the top from Citoid, and the one at the bottom from Web2Cit.
Note that you can still use Web2Cit automatic citations without installing our user script (e.g., if you are not logged in, or if you are using ProveIt). Just prepend https://web2cit.toolforge.org/ to the URL pasted into the unmodified Cite dialog and click Generate as usual. The citation result shown will come from Web2Cit.
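The prefixing step described in the note can be sketched in a few lines of JavaScript. This is only an illustration: the helper name `toWeb2CitUrl` is hypothetical and not part of Web2Cit; only the base address comes from the text above.

```javascript
// Base address of the Web2Cit server, as given in the note above.
const WEB2CIT_BASE = "https://web2cit.toolforge.org/";

// Hypothetical helper: returns the string to paste into the Cite dialog.
function toWeb2CitUrl(targetUrl) {
  const trimmed = targetUrl.trim();
  // Avoid prefixing twice if the URL already points at Web2Cit.
  if (trimmed.startsWith(WEB2CIT_BASE)) {
    return trimmed;
  }
  return WEB2CIT_BASE + trimmed;
}

console.log(toWeb2CitUrl("https://example.com/news/article-1"));
// prints https://web2cit.toolforge.org/https://example.com/news/article-1
```

The target URL is appended unchanged, matching the "just prepend" instruction; the sketch assumes no extra encoding is needed.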
This is all you need to know if you just want to use Web2Cit automatic citations in Wikipedia.
However, sooner or later you will find a URL for which both Citoid and Web2Cit results are incorrect. Read on to find out how to use Web2Cit to fix this.
Editing
Web2Cit uses a set of three configuration files per website that determine how Web2Cit handles webpages (paths) from that website (domain). Read our Web2Cit basics page to find out more about how Web2Cit works and the parts that it is made of.
Because these configuration files are collaboratively defined, if you are not happy with Web2Cit's extraction (also known as translation) results, you can edit them, and the results will be updated for all Web2Cit users.
Following the Citoid/Zotero tradition, Web2Cit metadata extraction from web pages and automatic citation generation are referred to as translation.
Open the editor
To edit configuration files, first open the translation summary page for a target webpage on the Web2Cit server. To do so, click on the Web2Cit link at the lower-right corner of the citation results showing in the Cite dialog (see Getting started above). Alternatively, just go to the Web2Cit server homepage, type in the target URL, and click Extract.
Then, click on any of the "edit" links that show on the translation summary page to edit the corresponding domain configuration file using our JSON editor, as explained in the subsections below. Refresh the translation summary page after each configuration change to see how results change accordingly.
Read the JSON editor section of our Editing documentation and our Server documentation to find out more about them, including how to temporarily save configuration files to a personal sandbox space without affecting all Web2Cit users, how to instruct the Web2Cit server to use configuration files from this sandbox space, and how to include debugging information in the translation summary that may help diagnose unexpected outputs.
Add a translation test
Always define a translation test before anything else. Clearly stating what the expected output should be for each translation field for a given target webpage will help you and other Web2Cit collaborators maintain Web2Cit configuration files for a given domain.
First, define a translation test for your target webpage, indicating the expected output:
- Click on the "edit" link next to the "Expected output" header on the translation summary page to edit the domain's tests configuration file using our JSON editor.
- If no translation test for the target webpage has been created, add a new translation test and enter the webpage's path.
- Add one or more test fields and indicate the expected output. Read our Tests documentation to find out more.
- Save the configuration file back to the Web2Cit storage.
If defining a translation test is all you can or want to do, that's great already! Your test will help other contributors define translation templates (see below), and will be used to regularly check the health of the Web2Cit system (see Watch changes below).
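As a rough illustration of what a translation test captures, the sketch below checks that a test has a path and at least one field with an expected output. The property names (`path`, `fields`, `fieldname`, `goal`) and the helper `checkTest` are assumptions made for this example; the Tests documentation describes the actual file format.

```javascript
// Hypothetical translation test for one webpage. The property names
// ("path", "fields", "fieldname", "goal") are assumptions for illustration.
const exampleTest = {
  path: "/news/article-1",
  fields: [
    { fieldname: "title", goal: ["An example headline"] },
    { fieldname: "date", goal: ["2022-05-17"] },
  ],
};

// Returns a list of problems; an empty list means the test looks complete.
function checkTest(test) {
  const problems = [];
  if (typeof test.path !== "string" || !test.path.startsWith("/")) {
    problems.push("path must be a string starting with '/'");
  }
  if (!Array.isArray(test.fields) || test.fields.length === 0) {
    problems.push("at least one test field is needed");
    return problems;
  }
  for (const field of test.fields) {
    if (!Array.isArray(field.goal)) {
      problems.push(`field "${field.fieldname}" has no expected output`);
    }
  }
  return problems;
}

console.log(checkTest(exampleTest)); // prints []
```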
Add a translation template
So far you or somebody else will have indicated the expected output for your target webpage. But that doesn't include how to actually get that result!
Web2Cit uses translation templates to extract citation metadata from web sources. To define a translation template based on your target webpage:
- Click on the "edit" link next to the "Translation output" header on the translation summary page to edit the domain's templates configuration file using the JSON editor.
- If no translation template based on your target webpage has been created previously, add a new translation template and type in the target webpage's path.
- Add one or more template fields, each including at least one translation procedure. Procedures comprise a series of selection and transformation steps that specify how to retrieve and transform citation metadata. Read our Templates documentation to find out more.
- Save the configuration back to the Web2Cit storage.
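The idea of a procedure as selection steps followed by transformation steps can be pictured with the conceptual sketch below. It is not Web2Cit's implementation: the step functions, the `runProcedure` helper and the page data are invented for illustration, and the assumption that selection outputs are concatenated before the transformations run in order should be checked against the Templates documentation.

```javascript
// Conceptual sketch only: plain functions stand in for Web2Cit's real step types.
function runProcedure(procedure, page) {
  // Each selection step returns an array of strings; results are concatenated.
  let values = procedure.selections.flatMap((select) => select(page));
  // Transformation steps run in order, each receiving the previous output.
  for (const transform of procedure.transformations) {
    values = transform(values);
  }
  return values;
}

// Hypothetical page data and steps, invented for this example.
const page = { heading: "  Example headline | Example Site ", byline: "By Jane Doe" };

const titleProcedure = {
  selections: [(p) => [p.heading]],
  transformations: [
    (vals) => vals.map((v) => v.split("|")[0]), // drop the site-name suffix
    (vals) => vals.map((v) => v.trim()),        // remove stray whitespace
  ],
};

console.log(runProcedure(titleProcedure, page)); // prints [ 'Example headline' ]
```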
In some cases, multiple templates per website might be needed. These can be grouped into separate translation subgroups based on URL path patterns. Read our Templates and Patterns documentation to find out more.
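Grouping by URL path pattern can be sketched as follows. The glob syntax used here (`*` within one path segment, `**` across segments), the first-match rule and the `null` fallback are assumptions made for this example, not a statement of how Web2Cit matches patterns; see the Patterns documentation for the real behaviour.

```javascript
// Convert a simple glob into a RegExp. Assumed syntax for this sketch:
// "*" matches within one path segment, "**" matches across segments.
function globToRegExp(glob) {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  const pattern = escaped
    .replace(/\*\*/g, "\u0000") // protect "**" before handling "*"
    .replace(/\*/g, "[^/]*")
    .replace(/\u0000/g, ".*");
  return new RegExp("^" + pattern + "$");
}

// Return the first pattern that matches the path, or null if none does.
function findPattern(path, patterns) {
  for (const glob of patterns) {
    if (globToRegExp(glob).test(path)) {
      return glob;
    }
  }
  return null;
}

const patterns = ["/news/*", "/blog/**"]; // hypothetical patterns
console.log(findPattern("/news/article-1", patterns));    // prints /news/*
console.log(findPattern("/blog/2022/05/post", patterns)); // prints /blog/**
console.log(findPattern("/about", patterns));             // prints null
```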
Once you are happy with your configuration, go back to Wikipedia and retry generating a citation for your target URL.
Watch changes
Finally, the Web2Cit monitor regularly checks whether translation outputs from Web2Cit match the expected outputs defined in translation tests.
Test results are written to a series of per-domain result pages on-wiki which you can add to your watchlist to get a notification whenever test results change. The full list of test result pages can be checked on the overview page.
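The kind of check the monitor performs can be pictured with the sketch below, which compares expected and actual outputs field by field. The `compareOutputs` helper and its exact-match rule are simplifying assumptions for illustration; the Web2Cit monitor documentation describes how results are actually computed and reported.

```javascript
// Compare the translation output with the expected output, field by field.
// Both inputs are plain objects mapping a field name to an array of strings.
function compareOutputs(expected, actual) {
  const results = {};
  for (const [fieldname, goal] of Object.entries(expected)) {
    const output = actual[fieldname] || [];
    // Simplification: a field passes only on an exact, in-order match.
    const pass =
      output.length === goal.length &&
      goal.every((value, i) => value === output[i]);
    results[fieldname] = pass ? "pass" : "fail";
  }
  return results;
}

// Hypothetical expected and actual outputs for one webpage.
const expected = { title: ["Example headline"], date: ["2022-05-17"] };
const actual = { title: ["Example headline"], date: ["17 May 2022"] };

console.log(compareOutputs(expected, actual)); // prints { title: 'pass', date: 'fail' }
```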
Read the Web2Cit monitor documentation to find out more about it.
Need help?
Documentation
Documentation about how Web2Cit works includes:
- Basics: a quick overview of how Web2Cit works and the parts that make the Web2Cit ecosystem.
- Fields: translation field types and details.
- Templates: what translation templates are and how they work.
- Tests: what translation tests are and how they work.
- Patterns: what URL path patterns are and how they work.
- Editing: how to edit Web2Cit configuration.
User and developer documentation about the parts that make the Web2Cit ecosystem can be reached from the Basics documentation page.
Support
Understanding Web2Cit can be challenging at first. If you need further help, you can:
- Open a new thread on the discussion page of this or any other Web2Cit page.
- Open a new task on Phabricator, using the Web2Cit umbrella project tag.
Ask someone
Web2Cit is a collaborative effort to improve automatic citations in Wikipedia. If you have questions you may also reach other members of the Web2Cit community directly. Check the Web2Cit contributors category for users who have added this category to their profile page. And feel free to add that category to your profile page too if you are a Web2Cit contributor yourself!
Contribute
Configuration
The simplest way of contributing to Web2Cit is by helping collaboratively create and maintain domain configuration files as described above.
Show that you are a Web2Cit contributor by including this userbox on your user page:
Language translations
Web2Cit is collaboratively translated into different languages. Different parts of Web2Cit are translated differently:
Metawiki pages
To translate pages like this one, check whether there is a banner at the top with a "translate this page" link. If there is, click it to start translating. If not, the page is not ready for translation yet. You can raise this on the corresponding discussion page.
User interfaces
The Web2Cit server interface is available for collaborative translation on translatewiki.net.
This does not currently include the Web2Cit JSON editor. It is planned that its interface and contents be available for collaborative translation under the same translatewiki.net project. In the meantime, you may use automatic translation provided by some web browsers.
Web2Cit monitor
The Web2Cit monitor produces overview, log and result pages that are built using custom templates. Please help us translate those templates so that these pages are available in your language (see T321606).
Documentation
Like most Wikimedia tools, Web2Cit is collaboratively documented. We have done our best to provide some basic general and technical documentation, but we understand there is still lots of room for improvement. Feel free to improve what we currently have!
Development
All Web2Cit code is open source and free software. Please check the pages for our different software components to find out more about how to contribute:
Acknowledgements
Web2Cit was first developed with a grant from the Wikimedia Foundation, based on an idea proposed by Strainu.
The original team included:
- Diego de la Hera as project manager and lead developer.
- Evelin Heidel as communications and community manager.
- Gimena del Rio Riande, Nidia Hernández and Romina De León as research team members.
- Dennis Tobar as Web2Cit monitor developer.
Special thanks to our Advisory Board, who helped us from the beginning of the project with its development.
Web2Cit alternatives
Web2Cit may not always be the best choice. It may be worth considering the following alternatives:
- If you think it's unlikely that somebody else will benefit from the extra work of configuring a website in Web2Cit, simply fix the citation generated by Citoid manually. This may be trivial if you only need to add or fix a field, but may require extra effort if the citation template must be changed. This is the fastest way, but doesn't benefit others citing sources from the same website.
- Talk to the webmaster of the website you are trying to cite and convince them to embed structured metadata. Webpages that include structured metadata are generally understood seamlessly by Citoid. This is the best solution long term.
- If you know JavaScript (or can find someone who does), edit or create the specific Zotero translator for the website you are trying to cite. Note that it may take some time until your changes are merged into the Zotero repository, and then until updates are pulled into Wikimedia. This is the most advanced option, but also a more powerful one than Web2Cit.
See also
- List of problematic URLs. A collaborative list of URLs for which Citoid returns incorrect or incomplete results.
- List of Web2Cit tools on Wikimedia's Toolhub.
- Web2Cit documentation
Notes
- The Croatian and Romanian Wikipedias offer Web2Cit as a gadget, which can be easily enabled from your Wikipedia preferences.