

Please use this identifier to cite or link to this item: https://oldena.lpnu.ua/handle/ntb/45496
Title: Automated building and analysis of Ukrainian Twitter corpus for toxic text detection
Authors: Bobrovnyk, Kateryna
Affiliation: Taras Shevchenko National University of Kyiv
Bibliographic description (Ukraine): Bobrovnyk K. Automated building and analysis of Ukrainian Twitter corpus for toxic text detection / Kateryna Bobrovnyk // Computational Linguistics and Intelligent Systems. — Lviv : Lviv Politechnic Publishing House, 2019. — Vol 2 : Proceedings of the 3rd International conference, COLINS 2019. Workshop, Kharkiv, Ukraine, April 18-19, 2019. — P. 55–56. — (Student section).
Bibliographic description (International): Bobrovnyk K. Automated building and analysis of Ukrainian Twitter corpus for toxic text detection / Kateryna Bobrovnyk // Computational Linguistics and Intelligent Systems. — Lviv Politechnic Publishing House, 2019. — Vol 2 : Proceedings of the 3rd International conference, COLINS 2019. Workshop, Kharkiv, Ukraine, April 18-19, 2019. — P. 55–56. — (Student section).
Is part of: Computational Linguistics and Intelligent Systems (2), 2019
Journal/Collection: Computational Linguistics and Intelligent Systems
Volume: 2 : Proceedings of the 3rd International conference, COLINS 2019. Workshop, Kharkiv, Ukraine, April 18-19, 2019
Issue Date: 18-Apr-2019
Publisher: Lviv Politechnic Publishing House
Place of the edition/event: Lviv
Keywords: toxic text detection
text corpus
Twitter
Number of pages: 2
Page range: 55-56
Start page: 55
End page: 56
Abstract: Toxic text detection is an emerging area of study in Internet linguistics and corpus linguistics. The topic is relevant because publicly available Ukrainian social media text corpora are lacking. The research involves building a Ukrainian Twitter corpus by scraping; collective annotation of texts as 'toxic' or 'non-toxic'; constructing a dictionary of obscene words for future feature engineering; and training models for text classification (comparing Logistic Regression, Support Vector Machine, and Deep Neural Network).
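The abstract mentions an obscene-words dictionary intended for feature engineering. As a rough illustration only, the sketch below shows one way dictionary hits could be turned into numeric features for a classifier. The lexicon entries, the function names `tokenize` and `lexicon_features`, and the chosen feature set are all hypothetical; the record does not describe how the author actually built or used the dictionary.

```python
import re
from typing import Dict, List, Set

# Hypothetical placeholder entries; the real dictionary is not given in the record.
TOXIC_LEXICON: Set[str] = {"badword", "slur"}

# \w matches Unicode word characters, so Cyrillic tokens are handled as well.
TOKEN_RE = re.compile(r"\w+", re.UNICODE)


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens."""
    return TOKEN_RE.findall(text.lower())


def lexicon_features(text: str, lexicon: Set[str] = TOXIC_LEXICON) -> Dict[str, float]:
    """Count lexicon hits in a text and return simple numeric features."""
    tokens = tokenize(text)
    hits = sum(1 for token in tokens if token in lexicon)
    return {
        "n_tokens": float(len(tokens)),
        "n_lexicon_hits": float(hits),
        # Guard against empty texts to avoid division by zero.
        "hit_ratio": hits / len(tokens) if tokens else 0.0,
    }


if __name__ == "__main__":
    print(lexicon_features("You are a BADWORD, total badword!"))
```

Features of this kind could be concatenated with bag-of-words or embedding features before training any of the classifiers named in the abstract; whether the author did so is not stated.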
URI: https://ena.lpnu.ua/handle/ntb/45496
ISSN: 2523-4013
Copyright owner: © 2019 for the individual papers by the papers’ authors. Copying permitted only for private and academic purposes. This volume is published and copyrighted by its editors.
URL for reference material: https://ssrn.com/abstract=3123710
http://dx.doi.org/10.2139/ssrn.3123710
https://github.com/kennethreitz/twitter-scraper
https://fasttext.cc/docs/en/language-identi
References (International): 1. Pradheep, T., Sheeba, J.I., Yogeshwaran, T., Pradeep Devaneyan, S.: Automatic Multi Model Cyber Bullying Detection from Social Networks. In: Proceedings of the International Conference on Intelligent Computing, Salem, Tamilnadu, India (2017). Available at SSRN: https://ssrn.com/abstract=3123710 or http://dx.doi.org/10.2139/ssrn.3123710
2. Kennedy, G.W., McCollough, A.W., Dixon, E., Bastidas, A., Ryan, J., Loo, C., Sahay, S.: Hack Harassment: Technology Solutions to Combat Online Harassment. In: Proceedings of the First Workshop on Abusive Language Online, pp. 73–77, Vancouver, Canada (2017)
3. Rubtsova, Y.: Constructing a corpus for sentiment classification training. SOFTWARE SYSTEMS 1(109), 72–78 (2015)
4. Twitter Scraper, https://github.com/kennethreitz/twitter-scraper. Last accessed 13 April 2019
5. Language identification, https://fasttext.cc/docs/en/language-identification.html. Last accessed 13 April 2019
Content type: Article
Appears in Collections: Computational linguistics and intelligent systems. – 2019

Files in This Item:
File: 2019v2___Proceedings_of_the_3nd_International_conference_COLINS_2019_Workshop_Kharkiv_Ukraine_April_18-19_2019_Bobrovnyk_K-Automated_building_and_55-56.pdf; Size: 321.3 kB; Format: Adobe PDF
File: 2019v2___Proceedings_of_the_3nd_International_conference_COLINS_2019_Workshop_Kharkiv_Ukraine_April_18-19_2019_Bobrovnyk_K-Automated_building_and_55-56__COVER.png; Size: 291.21 kB; Format: image/png

