Companies in Multilingual Wikipedia: Articles Quality and Important Sources of Information

Researchers from the Department of Economic Informatics at PUEB have published a study in the monograph “Information Technology for Management: Approaches to Improving Business and Society”, issued by Springer. The work focuses on the automatic assessment of the quality of Wikipedia articles and on the credibility of sources of information about companies in different languages.

As part of this research, more than 500,000 articles about companies from 310 different language versions of Wikipedia were analyzed. Data from DBpedia and Wikidata were used to identify such articles. The footnotes in these articles were then extracted and analyzed to identify important sources of information. Three different source evaluation models were used in this process, which made it possible to rank the most important sources of information about companies for each language version of Wikipedia.
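The footnote analysis described above can be illustrated with a minimal Python sketch. It assumes that articles are available as raw wikitext and ranks web domains by the number of articles that cite them. This simple count is an illustrative stand-in and is not claimed to be one of the three models used in the study; the function names `reference_domains` and `rank_sources` are hypothetical.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Matches <ref ...>...</ref> footnotes in wikitext.
# Self-closing reuse tags such as <ref name="x" /> are not matched.
REF_PATTERN = re.compile(r"<ref(?:\s[^>/]*)?>(.*?)</ref>", re.DOTALL | re.IGNORECASE)
# A URL ends at whitespace or at common wikitext delimiters.
URL_PATTERN = re.compile(r'https?://[^\s|\]}<>"]+')


def reference_domains(wikitext):
    """Return the set of domains cited in the footnotes of one article."""
    domains = set()
    for ref_body in REF_PATTERN.findall(wikitext):
        for url in URL_PATTERN.findall(ref_body):
            host = urlparse(url).netloc.lower()
            if host.startswith("www."):
                host = host[4:]
            if host:
                domains.add(host)
    return domains


def rank_sources(articles):
    """Rank domains by the number of articles citing them (assumed model)."""
    counts = Counter()
    for wikitext in articles:
        # Each domain is counted at most once per article.
        counts.update(reference_domains(wikitext))
    return counts.most_common()
```

Counting each domain once per article, rather than once per footnote, keeps a single heavily footnoted article from dominating the ranking. Running `rank_sources` separately on the articles of each language version would give one ranking per language.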

The study “Companies in Multilingual Wikipedia: Articles Quality and Important Sources of Information” is available on the Springer website. Authors of the publication: Włodzimierz Lewoniewski, Krzysztof Węcel, Witold Abramowicz.

Quality of information and reliability of sources in Wikipedia

Wikipedia, due to its open structure, allows anyone to add and modify content. This has benefits, such as the ability to update information quickly and to cover a wide range of topics that conventional encyclopedias may omit, but it also comes with risks. Because of this openness, some information may be untrue, biased or misleading. Furthermore, the quality of Wikipedia articles varies: some are well-written and well-sourced, while others may be incomplete, outdated, subjective, or unclear. Hence the need to assess the quality of articles and verify their sources of information.

Evaluating information and sources can not only help maintain a high standard of content on Wikipedia, but can also be essential for companies seeking to manage their image and public relations effectively. The methods presented in this publication can help Wikipedia editors identify articles that need improvement. In addition, the presented models for assessing the credibility of sources can indicate websites that are reliable sources of information about companies.

Wikidata and DBpedia

Semantic databases such as DBpedia and Wikidata offer many opportunities, especially for those working in data analysis, scientific research, artificial intelligence, or data-driven application development. They make it possible to combine data from various sources effectively, which allows new, rich data sets to be created across different fields. In addition, because they store explicit semantic relationships between data items, these databases are better able to interpret the context of the data. For example, if two objects are linked by a “parent” relationship, the database “understands” the kind and direction of the link: which object is the parent and which is the child.
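The idea of a typed relationship such as “parent” can be sketched with a small in-memory store of subject-predicate-object triples. This is only an illustration of the principle, not the data model or API of DBpedia or Wikidata; the class `TripleStore`, the predicate name `"parent"`, and the company names are hypothetical.

```python
from collections import defaultdict


class TripleStore:
    """A toy store of (subject, predicate, object) statements."""

    def __init__(self):
        # Maps each predicate to the set of (subject, object) pairs it links.
        self._by_predicate = defaultdict(set)

    def add(self, subject, predicate, obj):
        self._by_predicate[predicate].add((subject, obj))

    def objects(self, subject, predicate):
        """Return the objects directly linked to the subject by the predicate."""
        pairs = self._by_predicate.get(predicate, set())
        return {o for s, o in pairs if s == subject}

    def ancestors(self, subject, predicate):
        """Follow the predicate transitively, e.g. parent of a parent."""
        seen = set()
        stack = [subject]
        while stack:
            current = stack.pop()
            for parent in self.objects(current, predicate):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


store = TripleStore()
store.add("SubsidiaryCo", "parent", "HoldingCo")  # hypothetical companies
store.add("HoldingCo", "parent", "GroupCo")
store.add("SubsidiaryCo", "industry", "Software")

# Because the relationship is typed, the store can answer a question that was
# never stated directly: which organizations stand above SubsidiaryCo?
print(store.ancestors("SubsidiaryCo", "parent"))
```

Because each link carries a named predicate, the “industry” statement is ignored when following “parent” links, which is the sense in which such a database interprets the context of the data.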

DBpedia, Wikidata and other semantic databases are open to the public and cover many topics. They are therefore an invaluable source of information for scientists and application developers. Semantic databases are also very useful in the field of artificial intelligence, especially in machine learning and natural language processing. They can be used to train models, create recommendation systems, recognize proper names, answer questions, and even build chatbots.