Since Wikipedia was founded, and as its popularity has grown, an increasing number of scientific publications have examined the quality of its information. One of the first studies on the automatic assessment of Wikipedia quality was “Assessing information quality of a community-based encyclopedia” by Besiki Stvilia, Michael B. Twidale, Linda C. Smith, and Les Gasser.
This work showed that measuring the volume of content can help determine the degree of maturity of an article. The paper reports the methodology used to construct the metrics and the results of the tests, along with a number of statistical characterizations of Wikipedia articles, their content construction, process metadata, and social context.
The research was conducted on articles from the English-language Wikipedia. The authors investigated the following parameters:
- Article length (in # of characters)
- Num. of Internal Links
- Num. of Internal Broken Links
- Num. of External Links
- Num. of Images
- Information Noise
- Flesch index
- Kincaid index
- Total Num. of Edits
- Num. of Reverts
- Num. of Unique Editors
- Diversity (# of Unique Editors / Total # of Edits)
- Admin. Edit Share (Num. of Admin Edits / Total Num. of Edits)
- Num. of Anonymous User Edits
- Article Median Revert Time (in Minutes)
- Currency (the time between the dump date and the date of the last update of the article)
- Article Age (in days)
They also investigated the following measures:
- Authority/Reputation
- Completeness
- Complexity
- Informativeness
- Consistency
- Currency
- Volatility
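Several of the parameters listed above are simple ratios or date differences over an article's edit history, so they can be illustrated with a short sketch. The code below is not the authors' implementation. The `Edit` record layout and the function names are hypothetical, Article Age is assumed to run from the first edit to the dump date, and Currency is assumed to be expressed in days. The two readability functions assume that "Flesch index" and "Kincaid index" refer to the standard Flesch Reading Ease and Flesch-Kincaid Grade Level formulas, and they take word, sentence, and syllable counts as inputs rather than computing them from text.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Edit:
    """One revision of an article (hypothetical record layout)."""
    editor: str                 # user name, or IP address for anonymous edits
    timestamp: datetime
    is_admin: bool = False
    is_anonymous: bool = False


def process_metrics(edits, dump_date):
    """Compute a few edit-history parameters from the list above."""
    total = len(edits)
    if total == 0:
        raise ValueError("article has no edits")
    unique_editors = len({e.editor for e in edits})
    timestamps = [e.timestamp for e in edits]
    return {
        "total_edits": total,
        "unique_editors": unique_editors,
        # Diversity = number of unique editors / total number of edits
        "diversity": unique_editors / total,
        # Admin. Edit Share = number of admin edits / total number of edits
        "admin_edit_share": sum(e.is_admin for e in edits) / total,
        "anonymous_edits": sum(e.is_anonymous for e in edits),
        # Currency: time between the dump date and the last update (assumed days)
        "currency_days": (dump_date - max(timestamps)).days,
        # Article Age: assumed to be measured from the first edit to the dump date
        "age_days": (dump_date - min(timestamps)).days,
    }


def flesch_reading_ease(words, sentences, syllables):
    """Standard Flesch Reading Ease score from raw counts."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Standard Flesch-Kincaid Grade Level from raw counts."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    history = [
        Edit("alice", datetime(2005, 1, 1), is_admin=True),
        Edit("bob", datetime(2005, 1, 11)),
        Edit("alice", datetime(2005, 1, 21)),
        Edit("10.0.0.1", datetime(2005, 1, 31), is_anonymous=True),
    ]
    print(process_metrics(history, dump_date=datetime(2005, 2, 10)))
    print(flesch_reading_ease(words=100, sentences=5, syllables=150))
```

The composite measures (Authority/Reputation, Completeness, and so on) are deliberately left out of the sketch, because how they combine the individual parameters is specified in the paper and not on this page.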
The paper can be found here.
More recent research on the automatic quality assessment of information in Wikipedia in different languages is listed on the Publications page.