The article “OpenFact at CheckThat! 2024: Combining Multiple Attack Methods for Effective Adversarial Text Generation” has been published as an open-access paper. It describes the approach that won first place in an international competition on information credibility. Authors of the publication: Dr. Włodzimierz Lewoniewski, Dr. Piotr Stolarski, Dr. Milena Stróżyna, Dr. Elżbieta Lewańska, Aleksandra Wojewoda, Ewelina Księżniak, and Marcin Sawiński.
The competition was organized as part of the international conference CLEF 2024 (Conference and Labs of the Evaluation Forum). Its aim was to verify the robustness of popular text classification approaches used for credibility assessment. The participants' task was to create adversarial examples, i.e. to modify texts so that the classification algorithms reversed their decisions, while the meaning of the text stayed the same. The challenge was to change the decision of each of three different classifiers (based on the BERT, BiLSTM and RoBERTa models) on each of over 2,000 text examples, while keeping the number of character (word) changes minimal and preserving the semantic meaning of the texts as much as possible. In addition, the task covered texts from five different problem areas: assessing news bias, detecting propaganda, fact-checking, detecting rumors, and COVID-19-related disinformation.
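One common family of such attacks replaces individual words. The sketch below illustrates the general idea: it greedily substitutes words until a classifier's decision flips, while trying to change as few words as possible. The classifier, the candidate lists, and the decision threshold are hypothetical placeholders for illustration only; they are not the competition's victim models or the winning method.

```python
from typing import Callable, Dict, List

def toy_classifier(text: str) -> float:
    """Hypothetical stand-in for a credibility classifier:
    returns the probability that the text is 'non-credible'."""
    suspicious = {"shocking", "secret", "miracle", "exposed"}
    hits = sum(w in suspicious for w in text.lower().split())
    return min(1.0, 0.2 + 0.3 * hits)

# Hypothetical substitution candidates for a few words.
CANDIDATES: Dict[str, List[str]] = {
    "shocking": ["surprising", "notable"],
    "secret": ["internal", "private"],
    "exposed": ["reported", "described"],
}

def attack(text: str, classify: Callable[[str], float], threshold: float = 0.5) -> str:
    """Greedily replace words, one at a time, until the predicted label flips,
    trying to keep the number of changed words small."""
    words = text.split()
    for i, word in enumerate(words):
        current = classify(" ".join(words))
        if current < threshold:
            break  # decision already flipped, stop editing
        for cand in CANDIDATES.get(word.lower(), []):
            trial = words[:i] + [cand] + words[i + 1:]
            if classify(" ".join(trial)) < current:
                words = trial  # keep the replacement that lowers the score
                break
    return " ".join(words)

original = "shocking secret cure exposed by insiders"
print(attack(original, toy_classifier))  # modified text that flips the toy decision
```

In practice, stronger attacks pick substitutions with a language model and check semantic similarity at every step, but the trade-off is the same: flip the decision with as few, and as meaning-preserving, edits as possible.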
Among all the teams that reported their results, the OpenFact method obtained the highest score according to the BODEGA metric, which combines the effectiveness of the attacks (whether the classifier's decision changed), the semantic similarity between the original and modified texts, and the Levenshtein edit distance between them. This score placed the team first in the ranking, ahead of methods developed by, among others, the University of Zurich (UZH). The approach described in the article can help improve machine learning algorithms used by various websites (e.g. social media) to filter content, in particular to detect misleading, harmful or illegal texts.
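The following sketch shows how a per-example score in the spirit of BODEGA could be computed as a product of three components: whether the decision was flipped, semantic similarity, and a character-level similarity derived from the Levenshtein distance. The exact formula and the semantic similarity model (BLEURT in the original BODEGA framework) are assumptions here; the semantic score is simply passed in as a number.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

def bodega_like_score(original: str, modified: str,
                      decision_flipped: bool, semantic_sim: float) -> float:
    """Product of three components, each in [0, 1]: confusion (did the label
    flip), semantic similarity, and character-level similarity."""
    confusion = 1.0 if decision_flipped else 0.0
    char_sim = 1.0 - levenshtein(original, modified) / max(len(original), len(modified), 1)
    return confusion * semantic_sim * char_sim

print(bodega_like_score("the cure was exposed", "the cure was reported",
                        decision_flipped=True, semantic_sim=0.9))
```

Because the components are multiplied, an attack that flips the decision but destroys the meaning, or one that rewrites most of the characters, scores poorly even though it "succeeds".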
In the paper, various existing attack methods were tested and modifications were proposed that improved the results. For example, in one of the first stages of the experiments, a new approach to selecting important words for modification was proposed within the popular BERT-Attack method. A figure in the article shows the differences between word positions in the rankings produced by the basic version (DIR) and the modified version (NIR) for one of the texts. The new word ranking is generated based on each word's potential to change the target classifier's decision when it is replaced with specific candidate substitutions.
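The sketch below contrasts two such rankings: a baseline that scores a word by how much the target-class probability drops when the word is masked out (here simply deleted), roughly as in the original BERT-Attack importance ranking, and a replacement-aware ranking that scores a word by the best drop actually achievable with concrete candidate substitutions, in the spirit of the modification described above. The classifier, candidate lists, and function names are illustrative assumptions, not the paper's implementation.

```python
from typing import Callable, Dict, List, Tuple

def toy_classifier(text: str) -> float:
    """Hypothetical classifier: probability of the 'non-credible' class."""
    weights = {"shocking": 0.4, "secret": 0.3, "exposed": 0.2}
    return min(1.0, 0.1 + sum(weights.get(w, 0.0) for w in text.lower().split()))

# Hypothetical replacement candidates; note that "shocking" has none.
CANDIDATES: Dict[str, List[str]] = {"secret": ["internal"], "exposed": ["reported"]}

def rank_by_deletion(words: List[str],
                     classify: Callable[[str], float]) -> List[Tuple[str, float]]:
    """Baseline importance: drop in probability when a word is left out."""
    base = classify(" ".join(words))
    scores = [(w, base - classify(" ".join(words[:i] + words[i + 1:])))
              for i, w in enumerate(words)]
    return sorted(scores, key=lambda x: x[1], reverse=True)

def rank_by_best_replacement(words: List[str], classify: Callable[[str], float],
                             candidates: Dict[str, List[str]]) -> List[Tuple[str, float]]:
    """Replacement-aware importance: best drop achievable with real candidates."""
    base = classify(" ".join(words))
    scores = []
    for i, w in enumerate(words):
        drops = [base - classify(" ".join(words[:i] + [c] + words[i + 1:]))
                 for c in candidates.get(w.lower(), [])]
        scores.append((w, max(drops, default=0.0)))
    return sorted(scores, key=lambda x: x[1], reverse=True)

words = "shocking secret cure exposed".split()
print(rank_by_deletion(words, toy_classifier))                      # "shocking" ranks first
print(rank_by_best_replacement(words, toy_classifier, CANDIDATES))  # "secret" ranks first
```

In this toy example the two rankings disagree: the word that looks most important in isolation has no usable replacement, so a replacement-aware ranking directs the attack toward words where a substitution can actually change the classifier's decision.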
The Department of Information Systems is currently carrying out the OpenFact research project, headed by Prof. Witold Abramowicz. As part of this project, tools for the automatic detection of fake news in Polish are being developed. In July 2024, the results of the OpenFact project were assessed by the National Center for Research and Development as the best in Poland for the second year in a row. Winning prestigious competitions confirms that the team's achievements are significant on a global scale and that the methods developed by the OpenFact team are equally effective in other languages.
The OpenFact project is financed by the National Center for Research and Development under the INFOSTRATEG I program “Advanced information, telecommunications and mechatronic technologies”.
Sources: kie.ue.poznan.pl, ue.poznan.pl