CS726 - Information Retrieval Techniques
Course Category: Computer Science/Information Technology
Course Level: Graduate
Credit Hours: 3
Pre-requisites: CS403

Course Synopsis

This course discusses the theory, design, and implementation of text-based information retrieval (IR) systems. The core topics include the statistical characteristics of text, the representation of information needs and documents, several important retrieval models (Boolean, vector space, probabilistic, inference net, language modeling, link analysis), clustering algorithms, automatic text categorization, recommender systems, search computing, search engine optimization, multimedia IR, the semantic web, and experimental evaluation. The software architecture components include the design and implementation of high-capacity text retrieval and text filtering systems. Queries related to the “deep web” are discussed under the topic of Search Computing. Lastly, PageRank computation, Latent Semantic Indexing, other advanced topics, and the latest research trends are also covered.

Course Learning Outcomes

Developing an understanding of the theory and practice of text retrieval techniques
  • You will be able to understand the theory of IR systems, their working mechanisms, and their practical applications to real-life problems

Course Contents

Introduction, Information Retrieval Models, Boolean Retrieval Model, Ranked Retrieval Model, Vector Space Retrieval Model, TF-IDF Weighting, Document Representation in Vector Space, Query Representation in Vector Space, Similarity Measures, Cosine Similarity Measure, Parsing Documents, Tokens, Numbers, Stop Words, Term Normalization, Lemmatization, Stemming, Compression, Index Construction, Merge Sort, Phrase Queries, Processing a Phrase Query, Proximity Queries, Wildcard Queries, B-Tree, Permuterm Index, k-gram Index, Spelling Correction, Performance Evaluation of Information Retrieval Systems, Benchmarks for the Evaluation of IR Systems, Precision and Recall, Mean Average Precision, Non-Binary Relevance, DCG, NDCG, Using User Clicks, Cosine Ranking, Sampling and Pre-grouping, Dimensionality Reduction, Web Search, Spidering, Web Crawler, Distributed Index, Link Analysis, Markov Chains, HITS, Search Computing, Top-k Query Processing, Clustering, Classification, Recommender Systems, Final Notes on Information Retrieval

Course Related Links

Introduction to Information Retrieval (Course Textbook)

Course Instructor

Dr. Adnan Abid, PhD.

Reference Books

Introduction to Information Retrieval by C. Manning, P. Raghavan, and H. Schuetze

Modern Information Retrieval: The Concepts and Technology behind Search by Ricardo Baeza-Yates, Berthier Ribeiro-Neto

Web Information Retrieval: Data Centric Systems and Applications by Stefano Ceri, Alessandro Bozzon, Marco Brambilla, Emanuele Della Valle, Piero Fraternali, Silvia

Managing Gigabytes: Compressing and Indexing Documents and Images by Ian H. Witten, Alistair Moffat, and Timothy C. Bell