TU/e

2ID25 Information retrieval


Last update: 7 Sep 2011; if you notice any outdated or wrong information on this webpage, please e-mail 2ID25.2011@gmail.com

Program: BIS, CSE, ES, INF

Course Info: OWInfo

Lecturer: Mykola Pechenizkiy
Contacting:
via e-mail: 2ID25.2011@gmail.com with a meaningful subject;
in person:
  • option 1: please do not hesitate to approach me during the lecture breaks on Fridays, 13:45 - 17:30, in Helix Col.2;
  • option 2: on Mondays and Tuesdays, 10:00 - 12:00, I hold office hours in HG7.82 dedicated to educational activities; a list of available time slots hangs by the door;
  • option 3: if you cannot make it during the lecture breaks or my office hours, please send a meeting request to 2ID25.2011@gmail.com indicating your availability for the corresponding period.

Course Materials: Handouts, reading and guidelines will be available via the Sakai Learning Management System. Please register using your TU/e login and join 2ID25.

Course Syllabus

Date, Time, and Room | Lecture Title and Contents | Introduction to IR book (draft available online)

9 Sep 2011
Friday
15:45–17:30
Helix Col.2
Lecture 1: Introduction to the course
  • Basic IR terminology, ideas, architecture
  • Course overview, practicalities
  • Overview of possible project assignments
Ch. 1
16 Sep 2011
Friday
15:45 - 17:30
Helix Col.2
Lecture 2: Boolean IR and Document indexing
  • Boolean information retrieval
  • Inverted, skip and positional index
Ch. 1, Ch. 2, Ch. 3, Ch. 4
23 Sep 2011
Friday
15:45 - 17:30
Helix Col.2
Lecture 3: Vector space retrieval
  • From Boolean to vector space retrieval
  • Latent-Concept Models
  • Relevance feedback and query expansion
Ch. 6,
Ch. 18,
Ch. 9
30 Sep 2011
Friday
15:45–17:30
Helix Col.2
Lecture 4: IR in CoDaK
  • Part 1: IR on an (evolving) collection of PDF documents: basic system architecture. Invited lecture by Wauter Bosma
  • Part 2: Examples of IR group project assignments
7 Oct 2011
Friday
15:45–17:30
Helix Col.2
Lecture 5: Guest lecture from IR industry
  • Information Retrieval at C-Content by Eric van Mullekom
13 Oct 2011
Thursday
13:45 – 15:30
Helix Col.2
Lecture 6: Intro to ML Module and Classification
  • Naive Bayes, Nearest Neighbour, Decision tree learning
  • Ensemble learning
  • Evaluation of classification
Ch. 13, Ch. 14, & Ch. 15
(or IDM: Ch. 4)
14 Oct 2011
Friday
15:45 – 17:30
Helix Col.2
Lecture 7: Clustering and data/dimensionality reduction
  • Partitioning (kMeans) vs. hierarchical (AHC) clustering
  • Density-based clustering (DBSCAN)
  • Sampling, feature selection and feature transformation approaches
Ch. 16 & Ch. 17
(or IDM: Ch. 5),
Ch. 18
20 Oct 2011
Thursday
13:45 – 15:30
Helix Col.2
Lecture 8: Peculiarities of Classification and Clustering in IR/AS
  • Availability and independence of labels
  • Semi-supervised learning
  • Labeling of clustering, evaluation of clustering in general vs. in IR
  • Drifting data
Links to reading material
21 Oct 2011
Friday
15:45 – 17:30
Helix Col.2
Lecture 9: Pattern mining
  • Association analysis: itemset mining, Apriori principle
  • Web mining and Log mining context
  • Application in IR and AS
IDM: Ch. 6
27 Oct 2011
Thursday
13:45 – 15:30
Helix Col.2
Lecture 10: Machine learning for user modelling
  • Basic ideas and Current state of the art
  • Adaptive news access as an example
  • Challenges being tackled in research community
Links to reading material
28 Oct 2011
Friday
15:45 – 17:30
Helix Col.2
No lecture this Friday because of the Education Day. There will be an opportunity (for those who are available) to come and discuss group project progress.
2XD25 and 2XD55
07 Nov 2011
14:00-15:30, Place tba
Online questionnaire for the ML module (it is necessary to bring a laptop that has wifi access to the TU/e network).
This partial exam is mandatory for students taking 2ID25 or 2ID55.
Registration for this exam is required. Please follow the schedule of partial exams.
A trial exam is available in Sakai.
17 Nov 2011
Thursday
17:00
Deadline: Discussion of project proposals in groups (if you have not done this in October)
  • Form groups
  • Discuss the motivation, goals and scope of your project
  • Sketch your project assignment description and send it to 2ID25.2011@gmail.com
  • Come to HG7.82 and pick a time slot to discuss your project
18 Nov 2011
Friday
15:45 – 17:30
Helix Col.2
Lecture 11: Probabilistic IR and IR Evaluation
  • Language models
  • Basic evaluation principles
Ch. 11, Ch. 12, Ch. 8
25 Nov 2011
Friday
15:45 – 17:30
Helix Col.2
Lecture 12: Information retrieval on the Web
  • Web spam, SEO
  • Google’s PageRank, hubs and authorities (HITS)
  • Trends in IR on the Web
Ch. 19,
Ch. 21
28 Nov 2011
Monday
17:00
Deadline: Submit your ML assignment
  • Instructions can be found in Sakai
  • 15% of your final grade
Guidelines will be available in Sakai
2 Dec 2011
Friday
15:45 – 17:30
Helix Col.2
Lecture 13: MultiMedia retrieval
  • Automatic content based analysis
  • GEMINI and time-series mining view
  • Semantic gap
Multimedia Retrieval book: Introduction
9 Dec 2011
Friday
15:45 – 17:30
Helix Col.2
Lecture 14: Past, Present and Future of IR: Closing Lecture
  • Brief summary of the course and of topics not covered
  • Advanced R&D issues in IR
  • Q&A session
Links to reading material
9 Jan 2012
Monday
17:00
Deadline: Submit your group project deliverables
  • Detailed instructions can be found in Sakai
  • Important: besides the overall architecture and achievements, the group report should contain an ML part and an IR part for each group member. (Clearly indicate whether you improved the ML part based on feedback.)
Guidelines will be available in Sakai
(tbc) 12 Jan 2012
Thursday
13:30 – 17:30
(tbc) Helix Col.2
Group project presentations
  • Demo plus poster presentation
  • Group project grade contributes 70% to your final course grade
Guidelines will be available in Sakai
(tbc) 13 Jan 2012
Friday
13:30 – 15:45
(tbc) Helix Col.2
Group project presentations
  • Demo plus poster presentation
  • Group project grade contributes 70% to your final course grade
Guidelines will be available in Sakai
2YD25
02 Feb 2012 15:30-17:00
Place tba
Online questionnaire for the IR module (it is necessary to bring a laptop that has access to the TU/e network). The exam will take 90 minutes.
Registration for this examination is required. Please follow the schedule of partial exams.
A trial exam is available in Sakai.
10 Feb 2012
Friday
14:00 – 17:00
Course grades are sent to the administration.
(tbc) 5 April 2012
Thursday
13:45 – 15:30
Location tbc
Delayed group project presentations
  • Demo plus poster presentation
  • Group project grade contributes 70% to your final course grade
Guidelines will be available in Sakai

Colour agenda:

Regular IR lectures

Lectures for both IR (2ID25) and AS (2ID55)

Partial exams (online questionnaire) and group projects

Modes of study and evaluation

  • 14 face-to-face lectures
  • self-study of the literature
  • 2 partial exams (online questionnaires, including multiple-choice and fill-in questions)
    • cover essential issues in lectures and reading material
  • Project assignment (includes group and individual work)
    • literature study, IR system development and evaluation
    • must include 2 individual assignments for each member of the project team
      • development/implementation/evaluation of elements of machine learning and information retrieval modules
    • Poster presentation + demo of the group project work.
    • Final report (main part about 10 pages + appendixes) - must be submitted at least 3 days before your presentation day
  • No final exam
  • Final grade (100 points max) = ML partial exam (15 points) + IR partial exam (15 points) + Group Project (70 points)
  • Group Project grade (70 points) = ML individual assignment (15 points) + IR individual assignment (15 points) + Group work, which covers the quality of the project output as a whole (20 points), the quality of the report (20 points), poster + demo (20 points) and evaluation of other projects (10 points).
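
As an illustration only, the final-grade formula above can be sketched in Python. The function name, the parameter names and the range checks are assumptions made for this example and are not part of the official grading procedure; the internal breakdown of the Group Project grade is not modelled here.

```python
def final_grade(ml_exam: float, ir_exam: float, group_project: float) -> float:
    """Add up the three top-level components (max 15 + 15 + 70 = 100 points)."""
    # Maximum points per component, taken from the formula above.
    limits = {"ml_exam": 15, "ir_exam": 15, "group_project": 70}
    scores = {"ml_exam": ml_exam, "ir_exam": ir_exam, "group_project": group_project}

    # Hypothetical sanity check: each score must lie within its allowed range.
    for name, score in scores.items():
        if not 0 <= score <= limits[name]:
            raise ValueError(f"{name} must be between 0 and {limits[name]}, got {score}")

    return sum(scores.values())


# Example: 12 (ML exam) + 13.5 (IR exam) + 58 (group project)
print(final_grade(12, 13.5, 58))  # 83.5
```
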

Handouts and course materials will be available via Sakai or another Learning Management System.

Remarks:

  • IIR: Introduction to Information Retrieval book (by Manning, Raghavan and Schütze), accessible online from here.
  • IDM: Introduction to Data Mining book (by Tan, Steinbach, Kumar), accessible online from here.
  • Please note that lectures 6-10 and the first partial exam are shared by the participants of 2ID25 and 2ID55 and belong to the Machine Learning/Data Mining module.
  • Please note that this schedule is indicative and some changes may still be possible.
  • Exact time slots and locations for group project presentations (in January before the examination period) will be confirmed by December 9th.