Text Analysis and Retrieval
Lecturers in charge: Prof. dr. sc. Bojana Dalbelo-Bašić
Doc. dr. sc. Jan Šnajder
Course description: Most human knowledge is stored in unstructured, textual format. Due to the vast and rapidly growing amount of text data available, text analysis and retrieval systems have become an indispensable part of modern ICT infrastructure. Such systems address the diverse information needs of users and enable the extraction of information from large volumes of unstructured data. Because of the complexity and ambiguity of natural language, text analysis is a non-trivial task, which relies on natural language processing, computational linguistics, and machine learning. This course provides a systematic overview of both traditional and advanced methods for text analysis and retrieval. The first part of the course deals with document representation and methods for document retrieval, classification, and clustering. The second part deals with information extraction and text mining, with an emphasis on methods based on statistical natural language processing and machine learning.
