Analysis of the use of scientific sources on Wikipedia depending on the topic and language

The paper “Understanding the Use of Scientific References in Multilingual Wikipedia across Various Topics” was published in Elsevier’s Procedia Computer Science and is available on ScienceDirect. The study analyzed hundreds of millions of footnotes in Wikipedia articles from various language versions to identify scientific sources of information. In addition, the articles were divided into topics using information from WikiProjects and from two semantic knowledge bases, Wikidata and DBpedia.

Wikipedia articles must be based on reliable sources that readers can verify. However, assessing the reliability of a source is subjective, and the assessment may vary with the language version and the topic of the article. Some footnotes in Wikipedia articles point to scholarly sources, which are generally considered more reliable than websites because they are subject to a rigorous peer-review process and are published by reputable academic publishers. This means that the information presented in scientific sources has been evaluated by experts in the field, which gives it a higher degree of validity and credibility.
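
To make the idea of a "scientific footnote" concrete, here is a minimal Python sketch that extracts footnotes from raw wikitext and flags those that look scholarly. The markers it relies on (a DOI, an identifier parameter such as pmid or arxiv, or a cite journal template) are assumptions chosen for illustration, and the function names are hypothetical; the paper's actual identification procedure is not described in this summary and may differ.

```python
import re

# Matches <ref ...>...</ref> footnotes. Self-closing reuse tags such as
# <ref name="x" /> carry no content and are deliberately skipped.
REF_PATTERN = re.compile(r"<ref(?:\s[^>/]*)?>(.*?)</ref>", re.DOTALL | re.IGNORECASE)

# Heuristic markers of a scholarly source (an assumption of this sketch).
SCHOLARLY_PATTERNS = [
    re.compile(r"\b10\.\d{4,9}/\S+"),  # DOI-like string
    re.compile(r"\|\s*(?:pmid|pmc|arxiv|bibcode|jstor)\s*=\s*\S", re.IGNORECASE),
    re.compile(r"\{\{\s*cite\s+journal", re.IGNORECASE),
]


def extract_footnotes(wikitext):
    """Return the inner text of every non-empty <ref> footnote."""
    return [m.group(1).strip() for m in REF_PATTERN.finditer(wikitext)]


def is_scholarly(footnote):
    """True if the footnote contains at least one scholarly marker."""
    return any(p.search(footnote) for p in SCHOLARLY_PATTERNS)


def scholarly_share(wikitext):
    """Fraction of footnotes flagged as scholarly (0.0 if there are none)."""
    footnotes = extract_footnotes(wikitext)
    if not footnotes:
        return 0.0
    return sum(is_scholarly(f) for f in footnotes) / len(footnotes)


# Made-up wikitext used only to exercise the functions above.
sample = (
    'Text.<ref>{{cite journal |title=A |doi=10.1000/xyz123}}</ref> '
    'More.<ref name="web">{{cite web |url=https://example.org |title=B}}</ref> '
    'Again.<ref name="web" />'
)
print(scholarly_share(sample))
```

A heuristic like this will miss scholarly sources cited without identifiers and may vary in accuracy between language versions, since citation templates are named differently in each.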

To compare the use of scientific sources, the articles were divided into topics using three methods: semantic links between Wikipedia articles and DBpedia objects, links to Wikidata items, and the assignment of articles to WikiProjects. These open resources are briefly described here:

  • DBpedia is a project that extracts structured content from information created in Wikipedia and makes it accessible on the World Wide Web. DBpedia allows users to semantically query the relationships and properties of Wikipedia resources, including links to other related datasets.
  • Wikidata is a multilingual knowledge graph edited by volunteers from around the world. It is a publicly available source of open data that can be used by Wikimedia projects such as Wikipedia.
  • A WikiProject is a group of Wikipedia users who want to work together as a team to improve Wikipedia. These groups often focus on a specific topic area (for example, WikiProject Mathematics or WikiProject Poland), a specific part of the encyclopedia (for example, WikiProject Disambiguation), or a specific type of task (for example, checking newly created pages). The English version of Wikipedia currently has over 2,000 WikiProjects, approximately 1,000 of which are monitored by 30–2,000 editors, all with varying levels of activity.
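
Once each article carries a topic label from one or more of the three methods, the comparison reduces to grouping articles and computing the share of scholarly references per group. The sketch below shows one way to do that in Python. The record layout, the function name, and the toy values are all invented for illustration; they are not data or code from the paper.

```python
from collections import defaultdict

# Toy records, invented for illustration only; not data from the paper.
# "topics" maps each classification method to the label it assigned.
ARTICLES = [
    {"lang": "en", "refs": 10, "scholarly": 4,
     "topics": {"dbpedia": "Species", "wikidata": "taxon", "wikiproject": "Biology"}},
    {"lang": "en", "refs": 10, "scholarly": 2,
     "topics": {"dbpedia": "Species", "wikidata": "taxon", "wikiproject": "Biology"}},
    {"lang": "pl", "refs": 5, "scholarly": 0,
     "topics": {"dbpedia": "Person", "wikidata": "human"}},  # no WikiProject label
]


def scholarly_share_by_topic(articles):
    """Return {(method, language, topic): share of scholarly references}."""
    counts = defaultdict(lambda: [0, 0])  # key -> [total refs, scholarly refs]
    for article in articles:
        for method, topic in article["topics"].items():
            key = (method, article["lang"], topic)
            counts[key][0] += article["refs"]
            counts[key][1] += article["scholarly"]
    # Groups without any references are dropped to avoid dividing by zero.
    return {key: s / t for key, (t, s) in counts.items() if t > 0}


shares = scholarly_share_by_topic(ARTICLES)
for key in sorted(shares):
    print(key, round(shares[key], 3))
```

Keeping the method in the grouping key makes it possible to check whether the three classification approaches lead to similar conclusions for the same language version.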

This research is supported by the project “OpenFact – artificial intelligence tools for verification of the veracity of information sources and fake news detection” (INFOSTRATEG-I/0035/2021-00), granted within the INFOSTRATEG I program of the National Center for Research and Development, under the topic: Verifying information sources and detecting fake news.