Title: Extracting and classification the semi-structured data of web-systems
Authors: Pelekh, Irina
Affiliation: Lviv Polytechnic National University, Lviv, Ukraine
Bibliographic description (Ukraine): Pelekh I. Extracting and classification the semi-structured data of web-systems / Irina Pelekh // Computational linguistics and intelligent systems, 25-27 June 2018. — Lviv : Lviv Polytechnic National University, 2018. — Vol 2 : Workshop. — P. 139–145. — (Section II. Intelligent Systems).
Is part of: Computational linguistics and intelligent systems (2), 2018
Issue Date: 25-Jun-2018
Publisher: Lviv Polytechnic National University
Place of the edition/event: Lviv
Temporal Coverage: 25-27 June 2018
Keywords: semi-structured data
Number of pages: 7
Page range: 139-145
Start page: 139
End page: 145
Abstract: The extraction and classification of semi-structured data from web systems are described. Semi-structured data are defined and their main characteristics are identified. The variety of text-processing tasks is grouped into eleven large classes related to the analysis of text data. Traditional models of knowledge representation are considered. An algorithm is proposed for creating an integrating ontological model of the web sources from which data are to be obtained. The process of extracting data by applying a query language to markup-language elements is characterized.
ISSN: 2523-4013
Copyright owner: © 2018 for the individual papers by the papers’ authors. Copying permitted only for private and academic purposes. This volume is published and copyrighted by its editors.
Content type: Conference Abstract
Appears in Collections: Computational linguistics and intelligent systems. – 2018

Files in This Item:
File: COLINS_2018_2018v2_Pelekh_I-Extracting_and_classification_139-145.pdf; Size: 1.98 MB; Format: Adobe PDF
File: COLINS_2018_2018v2_Pelekh_I-Extracting_and_classification_139-145__COVER.png; Size: 250.75 kB; Format: image/png
