This article presents an overview of scientific works on quality assessment of Wikipedia in different languages. Although Wikipedia is often criticized for its poor quality, it is still one of the most popular knowledge bases in the world. Currently, this online encyclopedia is in 5th place in the ranking of ...
Wikilinks are internal hyperlinks on Wikipedia, a popular Internet encyclopaedia. A unique article identifier is hidden behind a so-called surface form, which is a grammatical variant of a given term matched to the context in which it occurs. Therefore, a given term can have multiple surface forms.
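The relation between an article identifier and its surface forms can be illustrated with a minimal Python sketch. It assumes links are written in MediaWiki markup as `[[Target]]` or `[[Target|surface form]]`. The function name `surface_forms`, the regular expression, and the sample sentence are illustrative assumptions, not taken from the article.

```python
import re
from collections import defaultdict

# Matches [[Target]] and [[Target|surface form]].
# Assumption: no nested brackets, templates, or link suffixes ("[[cat]]s").
WIKILINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]|]*))?\]\]")


def surface_forms(wikitext):
    """Map each link target to the set of surface forms used for it."""
    forms = defaultdict(set)
    for target, label in WIKILINK.findall(wikitext):
        target = target.strip()
        # Without a pipe (or with an empty label), fall back to the target title.
        forms[target].add(label.strip() if label else target)
    return dict(forms)


# Hypothetical sample text: one target ("Poland") appears under two surface forms.
sample = "[[Poland]] is in Europe. The [[Poland|Polish]] capital is [[Warsaw]]."
result = surface_forms(sample)
```

Here `result["Poland"]` collects both `"Poland"` and `"Polish"`, showing how one identifier can stand behind several surface forms. Section anchors and namespace prefixes are left unprocessed in this sketch.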
This article presents and classifies features that can be extracted from Wikipedia articles for the purpose of automatic information quality assessment. Based on a state-of-the-art analysis and our own experiments, specific measures for various aspects of quality have been defined.
The use of logistic regression in assessing data quality may have a significant impact on data management in the era of big data, where we deal with many variables and large amounts of information describing a phenomenon or behaviour of interest.