TU/e

2IID0. Web Analytics

(2015-2016, Semester A, Quartile 2)


Last update: 12 Oct 2015; if you notice any outdated or likely wrong information on this webpage, please e-mail 2IID0.Teachers@gmail.com

Announcements:
  • 1.10.15: This course is meant for the 3rd year bachelor Web Science (both major and Bachelor college), Software Science and Web Technology programs.
  • 1.10.15: Course and examination information and registration on OWInfo.
Lectures: Mykola Pechenizkiy
Instructions: Mykola Pechenizkiy and Joaquin Vanschoren
Contacting teachers:
via e-mail:
  • Send all correspondence to 2IID0.Teachers@gmail.com with a meaningful subject; it is fine to start with Hi, Hello or Dear FirstName.
  • Please do not send requests to our personal e-mail addresses. There is also no need to cc the teachers' personal e-mail addresses.
  • We will try to answer all your requests as soon as possible. However, if you have not received a reply within 3 working days please do not hesitate to resend your request.
in person:
  • option 1: please do not hesitate to approach the teachers during the lecture breaks and during the instructions in Flux 0.01;
  • option 2: on Mondays 10.00 - 12.00 we have office hours in MF7.099 dedicated to educational activities;
  • option 3: if you cannot make it during the lecture breaks or our dedicated office hours, please send a meeting request to 2IID0.Teachers@gmail.com indicating your availability for the corresponding period.
Modes of study and evaluation:
  • 8 weeks x 2 times per week lectures
  • 8 weeks x 1 time per week instructions (all students are in the same instructions group)
  • Self-study of the literature
  • 4 homeworks, done in groups of 2-3. For every assignment you need to form a new group.
  • Question answering sessions
  • Written exam.
  • Optional bonus assignment. More on this during Week1.

Final grade:

  • 50% homeworks (2IID2) and 50% written exam (2IID1).
  • You have to get at least 5.5 as a grand average to pass the course. An additional constraint imposed by the Bachelor college is that you need to score at least 5.0 for the exam and at least 5.0 for the homeworks to pass the course.
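The pass rule above can be sketched as a small Python check. This is only an illustration, not an official grading tool: the function name `final_grade` is hypothetical, grades are assumed to be on a 0-10 scale, and no rounding is applied because the rounding policy is not stated on this page.

```python
from typing import Tuple


def final_grade(homework_avg: float, exam: float) -> Tuple[float, bool]:
    """Return (grand average, passed) under the rules stated above.

    Assumptions (not from the course page): grades are on a 0-10 scale
    and the grand average is not rounded before the 5.5 check.
    """
    # 50% homeworks (2IID2) and 50% written exam (2IID1)
    grand_average = 0.5 * homework_avg + 0.5 * exam
    # Grand average of at least 5.5, plus at least 5.0 on each component
    passed = grand_average >= 5.5 and homework_avg >= 5.0 and exam >= 5.0
    return grand_average, passed


if __name__ == "__main__":
    # A high homework average does not compensate for an exam below 5.0.
    print(final_grade(7.0, 4.5))
    print(final_grade(6.0, 5.0))
```

Under these assumptions, a student with a 7.0 homework average and a 4.5 exam grade has a grand average above 5.5 but still fails because of the 5.0 minimum on the exam.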
Course Materials:
  • There is no single text book that covers the topics you will study in this course. However, there are several good book chapters that cover some of the topics. These chapters are available online:
    • MMDS: Mining of Massive Datasets (by Rajaraman, Ullman, Leskovec) accessible online from here.
    • IDM: Introduction to Data Mining book (by Tan, Steinbach, Kumar), selected chapters accessible online from here.
    • DMT: Data Mining: The Textbook (by Aggarwal), all chapters accessible online from here.
    • NCM: Networks, Crowds, and Markets: Reasoning About a Highly Connected World (by Easley and Kleinberg), accessible online from here.
    For each covered topic the corresponding book chapter(s) or other reading will be suggested.
  • Lecture slides, reading materials, homework descriptions and guidelines will be available via the Sakai Learning Management System. Please register using your TU/e login, and join 2IID0.
  • Submission of the homeworks will be done via the course e-mail.

Course Syllabus:

Please note that this schedule is indicative and changes may be possible as the course progresses.

Date, Time, and Room | Lecture Title and Contents
9 Nov 2015
Monday
13:45-15:30
Flux 0.01
Week1 Lecture: Introduction to the course
  • Motivation and historical perspective on the development of web analytics
  • Web analytics ecosystem(s)
  • Overview of the covered topics.
9 Nov 2015
Monday
15:45-17:30
Flux 0.01
Week1 Instructions: overview of the course practicalities
  • Brief overview of homeworks and final (written) exam
  • Bonus assignment
  • Grading policies
  • Overview of the covered topics (cont.)
11 Nov 2015
Wednesday
10:45-12:30
Flux 0.01
Week1 Lecture: Predictive modeling. User profiling
  • Common classification techniques and ideas for improvement
  • Variety of application settings and corresponding problem formulations
  • Use of WEKA and MOA for user profiling
16 Nov 2015
Monday
13:45-15:30
Flux 0.01
Week2 Guest Lecture by Peter Lem (Adversitement B.V.)
  • Web based user tracking
  • DimML
  • Beyond traditional WA: audience-based monitoring, sentiment analysis
16 Nov 2015
Monday
15:45-17:30
Flux 0.01
Week2 Guest Instructions by Peter Lem (Adversitement B.V.)
  • Introduction of the case for Homework 1.
  • Starting to work on DimML for the homework assignment.
18 Nov 2015
Wednesday
10:45-12:30
Flux 0.01
Week2 Guest Lecture by Thijs Putman (StudyPortals B.V.)
23 Nov 2015
Monday
13:45-15:30
Flux 0.01
Week3 Lecture: Predictive Analytics for Computational Advertisement
  • Display and paid search advertising
  • Conversion rate optimization and conversion attribution
  • User click prediction as classification and related problem formulations
  • Ad to content/context matching
  • Examples of complementary problems: Traffic volume prediction
23 Nov 2015
Monday
15:45-17:30
Flux 0.01
Week3 Instructions: Classification techniques
  • CTR prediction with WEKA classification techniques and OpenML environment used for Homework 2.
  • Experiencing class imbalance and cost-sensitive classifier learning.
25 Nov 2015
Wednesday
10:45-12:30
Flux 0.01
Week3 Lecture: Web content mining with classification techniques
  • 1st hour: Web content and spam classification
  • 2nd hour: Invited talk by Thijs Westerveld (Wizenoze) "Child friendly access to age specific content"
27 Nov 2015
Friday
16:00.
Deadline: submit your solution and report for Homework1 (peter.lem@o2mc.io with a copy to 2IID0.Teachers@gmail.com)
30 Nov 2015
Monday
13:45-15:30
Flux 0.01
Week4 Lecture: Utility of Web analytics
  • Predictive models vs. explanatory models
  • Causal discovery and targeted learning
  • Methodological issues of knowledge discovery
30 Nov 2015
    Monday
    15:45-17:30
    Flux 0.01
    Week4 Instructions: Distributed analytics
    • Brief introduction to Hadoop stack
    • Examples of writing map-reduce programs, e.g. estimating popularity of pages, queries
    • Introduction to Homework 3.
    2 Dec 2015
    Wednesday
    10:45-12:30
    Flux 0.01
    Week4 Lecture: Persuasion of users
    • 1st hour: Invited talk: "Persuasion Profiling in Data Streams" by Maurits Kaptein
    • 2nd hour: Predicting causal effect, mining data from A/B testing
    4 Dec 2015
    Friday
    16:00.
    Deadline: submit your solution and report for Homework2 (2IID0.Teachers@gmail.com)
    7 Dec 2015
    Monday
    13:45-15:30
    Flux 0.01
    Week5 Lecture: Pattern mining and clustering techniques
    • Mining association rules, subgroups and exceptional models
    • k-means, AHC, and DBSCAN clustering
    • Evaluation of patterns and clustering
    7 Dec 2015
    Monday
    15:45-17:30
    Flux 0.01
    Week5 Instructions: Continuation of Homework 3.
    • Feedback on Homework 1.
    • Exercises on clustering, classification, pattern mining
    9 Dec 2015
    Wednesday
    10:45-12:30
    Flux 0.01
    Week5 Lecture: Computing similarities
    • Similarity in metric spaces.
    • Similarity in high-dimensional and sparse data.
    • Matching sequential and time-series data
    • Finding similar nodes in a (labeled) graph
    11 Dec 2015
    Friday
    16:00.
    Deadline: submit your solution and report for Homework3 (2IID0.Teachers@gmail.com)
    14 Dec 2015
    Monday
    13:45-15:30
    Flux 0.01
    Week6 Guest Lecture by Michiel Hochstenbach: Linear algebra for Big Data
    • Efficient approaches for eigenvalue decomposition
    • Applications to dimensionality reduction, PageRank, LSI, and eigen-tastes
    14 Dec 2015
    Monday
    15:45-17:30
    Flux 0.01
    Week6 Instructions: Matrix decomposition
    • Practicing the use of random projections and eigenvalue decomposition
    • Feedback on Homework 2.
    16 Dec 2015
    Wednesday
    10:45-12:30
    Flux 0.01
    Week6 Lecture: Recommender systems
    • Content-based, collaborative-based, and hybrid approaches
    • Ads, recommendations, and persuasion
    • Exploration-exploitation principle
    • Recommenders on Netflix, LinkedIn, Booking.com
    4 Jan 2016
    Monday
    13:45-15:30
    Flux 0.01
    Week7 Lecture: Social network analytics: properties
    • Example of simple analytics on MSN messenger data
    • Properties of large-scale networks (degree, diameter, centrality, clustering)
    • Graph sampling
    4 Jan 2016
    Monday
    15:45-17:30
    Flux 0.01
    Week7 Instructions: SNA
    4 Jan 2016
    Monday
    21:00.
    Deadline: submit your solution and report for Homework4 (2IID0.Teachers@gmail.com)
    6 Jan 2016
    Wednesday
    10:45-12:30
    Flux 0.01
    Week7 Lecture: Social network analytics: dynamics
    • How networks form and grow: rich-gets-richer, community-guided attachment, Kronecker graphs
    • Influence propagation, viral marketing, acceptance behavior, general contagion model
    • PageRank and HITS, top influencing nodes, ambassadors, etc
    11 Jan 2016
    Monday
    13:45-15:30
    Flux 0.01
    Week8 Closing lecture
    • 1st hour: Heterogeneous network (re)construction, querying and clustering
    • 2nd hour: Summary of the covered topics
    11 Jan 2016
    Monday
    15:45-17:30
    Flux 0.01
    Week8 Instructions: Feedback and QA session
    • Feedback on the Homework4 and Bonus assignments
    • QA session (e-mail your questions in advance)
    11 Jan 2016
    Monday
    21:00.
    (Optional) Deadline: submit your solution and report for the Bonus assignment (2IID0.Teachers@gmail.com)
    13 Jan 2016
    Wednesday
    10:45-12:30
    Flux 0.01
    Week8 Closing lecture
    • Catching up with the remaining material if some delay has been accumulated
    • Future of Web analytics
    • QA session if needed (e-mail your questions in advance)
    29 Jan 2016
    Friday
    9:00-12:00
    Place t.b.a.
    FINAL EXAM
    • Do not forget to register for the exam.
    • The results will be available by Feb 11.
    • You can come and check your results Feb 12, 10.00-12.00
    • Second attempt: 6 Apr 2016, 18:00-21:00


    Handouts and course materials will be available via Sakai or another Learning Management System.

    Remarks:

    • Please note that this schedule is indicative and some changes may still be possible.
    • Last update: 17 Nov 2015; if you notice any outdated or wrong information on this webpage, please e-mail 2IID0.Teachers@gmail.com