Text Analysis and Retrieval
Abbreviation:
Load: 30(L) + 0(E) + 0(LE) + 0(CE)
Lecturers in charge: Prof. dr. sc. Bojana Dalbelo-Bašić
Doc. dr. sc. Jan Šnajder
Lecturers:
Course description: Most human knowledge is stored in unstructured, textual format. Due to the vast and rapidly growing amount of text data available, text analysis and retrieval systems have become an indispensable part of modern ICT infrastructure. Such systems address the diverse information needs of users and enable the extraction of information from large volumes of unstructured data. Because of the complexity and ambiguity of natural language, text analysis is a non-trivial task, which relies on natural language processing, computational linguistics, and machine learning. This course provides a systematic overview of both traditional and advanced methods for text analysis and retrieval. The first part of the course deals with document representation and methods for document retrieval, classification, and clustering. The second part deals with information extraction and text mining, with an emphasis on methods based on statistical natural language processing and machine learning.
Lecture languages: - - -
Compulsory literature:
2. Information Retrieval: Implementing and Evaluating Search Engines; S. Buettcher, C. L. A. Clarke, G. V. Cormack; The MIT Press; 2010
4. Practical Text Mining and Statistical Analysis for Non-structured Text Data Applications; G. Miner, J. Elder IV, T. Hill, R. Nisbet, D. Delen, A. Fast; Academic Press; 2012
Recommended literature:
1. Introduction to Information Retrieval; C. D. Manning, P. Raghavan, H. Schütze; Cambridge University Press; 2008
3. Text Mining: Predictive Methods for Analyzing Unstructured Information; S. M. Weiss, N. Indurkhya, T. Zhang, F. Damerau; Springer; 2010
5. Foundations of Statistical Natural Language Processing; C. D. Manning, H. Schütze; The MIT Press; 1999
Legend
L - Lectures
E - Exercises
LE - Laboratory exercises
CE - Project laboratory
* - Not graded
Last modified: 2014-01-27