Small Wikipedia Community Sustainability/Wikidata
Modern language communicative value
“Artificial life support of whatever unique cultural phenomenon and/or languages and worldviews relied on when it came into being or was later in active use, without proven utility of the latter for addressing current life's practical tasks, is no more valuable than worshipping cuneiform or any other historic technology.”
This work grew out of internal discussions among Wikimedia Languages of Russia Community volunteers taking part in both Wikimedia Russia-initiated and global Wikimedia Language Diversity projects. It is summarized and published by Farhad Fatkullin (Kazan, Russia), with special thanks to Renat Shigapov (Germany) for help with the WikibaseCirrusSearch queries used in data collection, and to Paul Kaganer (Saint Petersburg, Russia) for recommendations and support in choosing the topic, critical feedback along the way, and proposals for further development of the analysis. The first publication and presentation of the material is planned in Tatar at the 2nd Russia-wide "Language, Society and Information Technologies" Scientific and Practical Conference (17-18 Feb. 2023).
The following is a proposal to use a quantitative assessment method for evaluating the amount of work needed to sustainably support any of the languages of Russia at a hypothetical digitally human-stationary orbit. The state of each language is evaluated using absolute and relative figures for the labels and descriptions of Wikidata knowledge base items in that language, as well as the lexicographical data used to describe the relationships between them as depicted by existing Wikifunctions functions. The analytical tables below contain both the raw statistics and the shares calculated as of the last query, and will be updated periodically.
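The counts behind such an assessment could be gathered roughly as follows. This is a minimal sketch, not the queries actually used for the tables: it assumes the `haslabel:` and `hasdescription:` search keywords provided by WikibaseCirrusSearch and the standard MediaWiki search API on wikidata.org, and the helper names (`count_query_params`, `total_hits`, `fetch_count`) and the User-Agent string are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Wikidata endpoint of the MediaWiki action API.
API_URL = "https://www.wikidata.org/w/api.php"


def count_query_params(keyword: str, lang: str) -> dict:
    """Build search API parameters for one count query.

    keyword is a WikibaseCirrusSearch keyword such as "haslabel"
    or "hasdescription"; lang is a language code such as "tt".
    """
    return {
        "action": "query",
        "list": "search",
        "srsearch": f"{keyword}:{lang}",
        "srnamespace": "0",    # main namespace, where Wikidata items live
        "srlimit": "1",        # only the hit count matters, not the results
        "srinfo": "totalhits",
        "format": "json",
    }


def total_hits(response: dict) -> int:
    """Extract the total number of matching pages from a search response."""
    return int(response["query"]["searchinfo"]["totalhits"])


def fetch_count(keyword: str, lang: str) -> int:
    """Run one count query against the live API (requires network access)."""
    url = API_URL + "?" + urllib.parse.urlencode(count_query_params(keyword, lang))
    request = urllib.request.Request(
        url, headers={"User-Agent": "label-count-sketch/0.1 (hypothetical example)"}
    )
    with urllib.request.urlopen(request, timeout=30) as reply:
        return total_hits(json.load(reply))
```

Calling `fetch_count("haslabel", "tt")` would then return the current number of items with a Tatar label; the live figure will differ from the snapshot in the tables below.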
Thesis
A culture is a non-genetic means of transferring information,[2] a language is a communicative protocol used within the respective unique cultural environment,[3] language speakers (users) are nodes of information creation, operative storage and interchange, while the ongoing interaction of language users is distributed data processing that generates new knowledge, reorganizes the structure of society or transforms its essence (changing how it interacts with the surrounding physico-biological world).[4]
Within such a model, the instrumental function of a standardized working language is to assure communication speed and precision. In the knowledge-based economy era, any subject, object or information product simultaneously participates in a multitude of parallel processes. The long-term viability of an individual natural language (and the related cultural knowledge) depends on the level of its support and on how effectively it can be used as a tool within an ever-changing technological and social environment.[5]
An individual or a community of speakers will keep investing in preserving the respective linguistic competencies, and in the continuous development of the communicative protocol in question, for as long as it can be used effectively within the global ecosystem of added-value creation (contribution to global GDP).[6] Thus the long-term preservation of global multilingualism depends on humanity's transition towards technologies that allow language equality within systems of economic cooperation based on the global division of labour.[7]
This is exactly the founding pillar of the language-independent Semantic Web (Web 3.0) technology. These opportunities have already been successfully demonstrated:
- since 2012: the Wikidata knowledge base (and other semi-structured data repositories based on independent Wikibase and similar installations, together with the generation of knowledge blocks and their display in a target language),
- 2016-2018: the transition of language description to the Lexicographical data format (increasingly popular and likely to become a foundation of European language equality, including in state and municipal services, which is aimed to be reached by 2030), and
- 2020-2023: the launch of Wikifunctions, which relies on the achievements above (a recent limited pilot in 6 languages has been completed successfully, and a full production launch is now being prepared).
Unlike other Wikimedia projects, these are published and distributed under the license granting the highest degree of legal freedom (Creative Commons Zero, CC0, Public Domain). Everyone interested in contributing to the reliable long-term sustainability of their favourite languages is invited to join the volunteers developing the presence of those languages in the systems above!
Statistical Tables
The tables below (except for the comparative example) include languages that satisfy at least one of two conditions:
- Official Status somewhere within the Russian Federation;
- Majority or a significant share of speakers reside within the territory of Russia.
Note: * marks languages that have official status in some administrative-territorial entity of Russia but whose speakers mostly reside outside the Federation's boundaries.
Comparative Analysis Source Data & Example Table
- Total Wikidata items as of the last query: 101850730
- Lexemes and other lexicographical data by language
Language | Code | Labels number | Labels % | Descriptions number | Descriptions % | Lexemes number | Forms number | Senses number |
---|---|---|---|---|---|---|---|---|
English | en | 86534411 | 85 | 84304578 | 82.8 | 72561 | 132029 | 30160 |
Arabic | ar | 6361419 | 6.2 | 51285699 | 50.4 | 1384 | 310 | 136 |
Spanish | es | 20974374 | 20.6 | 43824405 | 43 | 30480 | 352459 | 10425 |
Chinese | zh | 6317996 | 6.2 | 34690543 | 34.1 | 4322 | 4458 | 3995 |
Russian | ru | 8775094 | 8.6 | 39399582 | 38.7 | 101554 | 1238078 | 12833 |
French | fr | 21394767 | 21 | 50751745 | 49.8 | 19271 | 325157 | 9482 |
German | de | 21147425 | 20.8 | 65515077 | 64.3 | 213398 | 550155 | 10269 |
Turkish | tr | 1980389 | 1.9 | 33405809 | 32.8 | 225 | 188 | 189 |
Hindi | hi | 1185205 | 1.2 | 8194601 | 8 | 1407 | 2430 | 1892 |
Farsi | fa | 2773491 | 2.7 | 15532782 | 15.3 | 6768 | 11417 | 7425 |
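The percentage columns can be reproduced from the counts with one division. The sketch below assumes that each share is the count divided by the total number of Wikidata items quoted above and rounded; the function name `share` is illustrative and not part of the original method description.

```python
# Total number of Wikidata items quoted above the table.
TOTAL_ITEMS = 101_850_730


def share(count: int, total: int = TOTAL_ITEMS, digits: int = 1) -> float:
    """Return count as a percentage of total, rounded to `digits` decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * count / total, digits)


# Counts copied from the English and Russian rows of the table.
rows = {
    "en": {"labels": 86_534_411, "descriptions": 84_304_578},
    "ru": {"labels": 8_775_094, "descriptions": 39_399_582},
}

for code, counts in rows.items():
    print(code, share(counts["labels"]), share(counts["descriptions"]))
```

For English this gives 85.0 and 82.8, and for Russian 8.6 and 38.7, matching the table. Passing `digits=2` yields the two-decimal precision used in the tables for the languages of Russia.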
Russian Federation languages with active Wikipedias (34)
Group | Language | Code | Labels number | Labels % | Descriptions number | Descriptions % | Lexemes number | Forms number | Senses number |
---|---|---|---|---|---|---|---|---|---|
East Slavic (2) | Russian | ru | 8775094 | 8.62 | 39399582 | 38.68 | 101554 | 1238078 | 12833 |
East Slavic (2) | Ukrainian* | uk | 6093531 | 5.98 | 59582944 | 58.5 | 16258 | 507956 | 283 |
Turkic (10) | Altai | alt | 2247 | 0 | 10 | 0 | 1 | 0 | 1 |
Turkic (10) | Azerbaijani* | az | 1041742 | 1.02 | 1007804 | 0.99 | 28 | 22 | 22 |
Turkic (10) | Bashkir | ba | 594414 | 0.58 | 6570985 | 6.45 | 16 | 3 | 14 |
Turkic (10) | Crimean Tatar | crh | 398457 | 0.39 | 530708 | 0.52 | 13 | 10 | 15 |
Turkic (10) | Chuvash | cv | 727163 | 0.71 | 786208 | 0.77 | 11 | 1 | 11 |
Turkic (10) | Kazakh* | kk | 936517 | 0.92 | 929021 | 0.91 | 20 | 2 | 18 |
Turkic (10) | Karachay-Balkar | krc | 456559 | 0.45 | 102963 | 0.1 | 2 | 0 | 2 |
Turkic (10) | Sakha (Yakut) | sah | 475916 | 0.47 | 240898 | 0.24 | 8 | 0 | 8 |
Turkic (10) | Tatar | tt | 980788 | 0.96 | 6509824 | 6.39 | 8 | 0 | 8 |
Turkic (10) | Tuvan | tyv | 455568 | 0.45 | 94971 | 0.09 | 5 | 0 | 5 |
Mongolic (2) | Buryat | bxr | 457138 | 0.45 | 142643 | 0.14 | 3 | 0 | 3 |
Mongolic (2) | Kalmyk | xal | 455456 | 0.45 | 144964 | 0.14 | 12 | 2 | 14 |
Indo-European (3) | Pontic Greek | pnt | 453735 | 0.45 | 144974 | 0.14 | 1 | 0 | 1 |
Indo-European (3) | Ossetian | os | 685107 | 0.67 | 808088 | 0.79 | 7 | 1 | 7 |
Indo-European (3) | Yiddish* | yi | 677620 | 0.67 | 2889421 | 2.84 | 276 | 704 | 330 |
Northeast Caucasian (5) | Avar | av | 458950 | 0.45 | 145011 | 0.14 | 6 | 0 | 6 |
Northeast Caucasian (5) | Chechen | ce | 990769 | 0.97 | 1186533 | 1.16 | 10 | 1 | 10 |
Northeast Caucasian (5) | Ingush | inh | 456006 | 0.45 | 6984 | 0.01 | 4 | 0 | 4 |
Northeast Caucasian (5) | Lak | lbe | 452758 | 0.44 | 2 | 0 | 5 | 0 | 5 |
Northeast Caucasian (5) | Lezgian | lez | 458967 | 0.45 | 103329 | 0.1 | 6 | 0 | 6 |
Northwest Caucasian (2) | Kabardian | kbd | 455580 | 0.45 | 28 | 0 | 7 | 1 | 7 |
Northwest Caucasian (2) | Adyghe | ady | 451459 | 0.44 | 13 | 0 | 7 | 10 | 7 |
Finno-Ugric (10) | Finnish* | fi | 7274103 | 7.14 | 34355848 | 33.73 | 636 | 8383 | 569 |
Finno-Ugric (10) | Permyak | koi | 458804 | 0.45 | 145570 | 0.14 | 6 | 0 | 6 |
Finno-Ugric (10) | Komi | kv | 461827 | 0.45 | 344 | 0 | 28 | 36 | 24 |
Finno-Ugric (10) | Moksha | mdf | 457818 | 0.45 | 181892 | 0.18 | 3 | 0 | 3 |
Finno-Ugric (10) | Meadow-Eastern Mari | mhr | 669640 | 0.66 | 679351 | 0.67 | 3 | 1 | 3 |
Finno-Ugric (10) | Hill Mari | mrj | 464752 | 0.46 | 7 | 0 | 4 | 0 | 4 |
Finno-Ugric (10) | Erzya | myv | 494156 | 0.49 | 31494 | 0.03 | 10 | 2 | 9 |
Finno-Ugric (10) | Livvi-Karelian | olo | 469880 | 0.46 | 20 | 0 | 2 | 0 | 2 |
Finno-Ugric (10) | Udmurt | udm | 459479 | 0.45 | 193 | 0 | 7 | 0 | 7 |
Finno-Ugric (10) | Veps | vep | 517757 | 0.51 | 31895 | 0.03 | 15 | 34 | 17 |
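One possible reading of "the amount of work necessary" is the number of Wikidata items that still lack a label or a description in a given language. The sketch below computes that gap for three rows of the table. It is an illustrative interpretation rather than the author's stated metric: it assumes the total of 101850730 items quoted earlier, and the function name `remaining` is hypothetical.

```python
# Total number of Wikidata items quoted in the source data section.
TOTAL_ITEMS = 101_850_730


def remaining(count: int, total: int = TOTAL_ITEMS) -> int:
    """Return how many items still lack an entry, given how many have one."""
    if not 0 <= count <= total:
        raise ValueError("count must lie between 0 and total")
    return total - count


# (labels, descriptions) copied from the Tatar, Bashkir and Altai rows.
table_rows = {
    "tt": (980_788, 6_509_824),
    "ba": (594_414, 6_570_985),
    "alt": (2_247, 10),
}

for code, (labels, descriptions) in table_rows.items():
    print(code, remaining(labels), remaining(descriptions))
```

A raw gap of this kind treats every item as equally important; weighting items by how often they are used would give a different picture, which this sketch does not attempt.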
Russian Federation languages with Wikipedias in incubator (45)
References
- [1] Farhad Fatkullin. Wikipedias in the languages of Russia today and tomorrow: why and how? Finno-Ugric Wikiseminar, Petrozavodsk, 6-9 May 2016.
- [2] Alexander V. Markov in http://amp.gs/j8d9b per Frans de Waal, 2007.
- [3] Gainulla F. Shaykhiev. The Language of Reason. We think in Tatar, Russian, English, ... simultaneously... [Jazyk Razuma. My dumaem i po-tatarski, i po-russki, i po-angliyski...]. Kazan, 2000 (in Russian).
- [4] Farhad Fatkullin. Mythic consciousness and the intangible cultural environment. Empirical analysis using multilingual Wikipedia materials readership structure and statistics. 1st Russia-wide "Language, Society and Information Technologies" Scientific and Practical Conference, 19-20 Feb. 2022 (in Tatar). Provisions (Russian), Information Letter (Russian), Program (Russian).
- [5] Farhad Fatkullin. Tatar Language and Culture Digital Sustainability Ecosystem. Hows and Whys. 8th Tatar Language and Literature Teachers Russia-wide Convention, roundtable of the "Multicultural Education as a Factor in Developing Child's Identity and Ethnic Self-Consciousness" section, 28 June 2022.
- [6] Farhad Fatkullin. Digital Ecosystem for Tatarstan's Linguistic and Cultural Diversity Sustainability. What, why and how? "Preservation and Development of Native Tongues within a Multiethnic State: Language Policy, Challenges and Prospects" Interregional Scientific-Practical Conference (co-organized by UNESCO Information for All Program Committee for Russia, Russia's Federal Agency for Ethnic Affairs, etc.), "Role of ICT in Language Preservation" section, Kazan, Republic of Tatarstan Academy of Sciences Sh.Mardjani Institute of History, 21 June 2022.
- [7] Farhad Fatkullin. Wikipedias as language speakers community cultural transformation catalyst. Wiki-Conference Russia 2022.