Application of Logistic Regression in Assessing the Quality of Information – Wikipedia Articles Case

The use of logistic regression in assessing data quality may have a significant impact on data management in the era of big data, where we deal with large numbers of variables and large amounts of information describing a phenomenon or behaviour of interest. Calculating the information value (IV) indicator makes it possible to eliminate variables that are irrelevant or that merely add to information overload. The article presents the use of logistic regression in assessing the variables that describe the quality of articles published in the English version of Wikipedia. A classification of the variables based on the results of the information value indicator is presented. The predictive capabilities of the variables are also evaluated.
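As a rough illustration of how an information value can be computed for a single variable, the sketch below uses the standard weight-of-evidence formulation, IV = sum over bins of (share of good - share of bad) * ln(share of good / share of bad). It is not taken from the article: the function name `information_value`, the label coding (1 = high-quality article, 0 = other), the pre-binned input, and the `eps` floor for empty bins are all assumptions made for this example.

```python
import math
from collections import Counter


def information_value(bins, labels, eps=0.5):
    """Information value of one binned variable against a binary label.

    bins   -- bin identifier of each observation (hypothetical pre-binned variable)
    labels -- 1 for the "good" class, 0 for the "bad" class (assumed coding)
    eps    -- floor used in place of a zero count to avoid log(0) (assumption)
    """
    good, bad = Counter(), Counter()
    for b, y in zip(bins, labels):
        if y == 1:
            good[b] += 1
        else:
            bad[b] += 1

    total_good = sum(good.values())
    total_bad = sum(bad.values())
    if total_good == 0 or total_bad == 0:
        raise ValueError("both classes must be present")

    iv = 0.0
    for b in set(good) | set(bad):
        # Share of each class that falls into this bin.
        g = max(good[b], eps) / total_good
        d = max(bad[b], eps) / total_bad
        woe = math.log(g / d)      # weight of evidence of the bin
        iv += (g - d) * woe        # the bin's contribution to IV
    return iv


# A variable distributed identically in both classes carries no information.
iv_flat = information_value(["a", "b", "a", "b"], [1, 1, 0, 0])

# A variable whose bins separate the classes yields a positive IV.
iv_split = information_value(
    ["a", "a", "a", "b", "a", "b", "b", "b"],
    [1, 1, 1, 1, 0, 0, 0, 0],
)
```

Variables with an IV close to zero are the candidates for elimination in the sense described above; the cut-off values that separate weak from strong predictors are a modelling choice and are not fixed by this sketch.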

