The web has evolved continuously from a distributed hypertext system into a very large information processing machine. Fast as it is, this evolution is grounded in theoretical principles borrowed from several fields of computer science, such as programming languages, databases, structured documentation, logic and artificial intelligence. The smooth operation of the past and future web at a large scale relies on these foundations. The goal of this course is to present them, the problems that they solve as well as those that they uncover. It considers three milestones of this evolution: XML, the social web and the semantic web.
The first part introduces programming language foundations, algorithms and tools for processing tree-structured information, and for the analysis of queries and programs that manipulate trees. This part consists of an introduction to the relevant theoretical tools, with an application to NoSQL and XML technologies in particular. The theoretical part introduces tree grammars, finite tree automata, classical tree logics and a recent μ-calculus of finite trees, in connection with practical problems and technologies such as XPath/XQuery, DTD, schemas, etc. Applications are illustrated through scalable validation of document streams, efficient query evaluation, static analysis of expressive queries in the presence of constraints, and static type-checking of programs manipulating labeled trees. The course also presents challenges, important results, and open theoretical issues in the area of NoSQL programming.
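As an informal illustration of how a finite tree automaton can validate a labeled tree against a schema, here is a minimal Python sketch. It is not taken from the course material: the `Node` class, the toy schema (a `book` contains one `title` followed by one or more `author` elements) and the state names `T`, `A`, `B` are all assumptions made for the example. Each rule maps a label and a regular expression over the states of the children to a state, and the tree is processed bottom-up.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A labeled, unranked tree node (hypothetical helper type)."""
    label: str
    children: List["Node"] = field(default_factory=list)


# Hypothetical rules: (label, regex over comma-joined child states, resulting state).
# "title" and "author" must be leaves; "book" needs one title then 1+ authors.
RULES = [
    ("title", r"", "T"),
    ("author", r"", "A"),
    ("book", r"T(,A)+", "B"),
]
ACCEPTING = {"B"}


def run(node: Node) -> Optional[str]:
    """Compute the state reached at `node` bottom-up, or None if no rule applies."""
    child_states = []
    for child in node.children:
        state = run(child)
        if state is None:
            return None
        child_states.append(state)
    word = ",".join(child_states)
    for label, pattern, state in RULES:
        if label == node.label and re.fullmatch(pattern, word) is not None:
            return state
    return None


def accepts(tree: Node) -> bool:
    """A tree is valid when the state reached at its root is accepting."""
    return run(tree) in ACCEPTING


valid = Node("book", [Node("title"), Node("author"), Node("author")])
invalid = Node("book", [Node("author"), Node("title")])
print(accepts(valid), accepts(invalid))
```

This sketch has at most one rule per label, so it behaves deterministically; the automata studied in the course are more general.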
The second part summarizes the data models and algorithms required to extract, manage and access massive amounts of social content. The course examples are drawn from real-world applications such as URL search and recommendation on Delicious, group recommendation in MovieLens and the extraction of travel itineraries from Flickr photos. The course goals are to acquire knowledge of scalable algorithms for processing large volumes of social data and extracting value from that data, and to learn how to run and interpret large-scale user studies.
The third part introduces the semantics of knowledge representation on the web. The semantic web extends the web with richer and more precise information because it is expressed in a formal language using a vocabulary defined in an ontology (a structured vocabulary of concepts and properties defined in a logic). Ontologies are used for describing the content of web resources and for reasoning about these resources formally. We introduce the semantic web languages (RDF, RDFS, OWL) and show their relations with knowledge representation formalisms (conceptual graphs, description logics) and XML. This provides tools for reasoning with ontologies and, in particular, for evaluating queries. However, the distributed nature of the web leads to heterogeneous ontologies, which must be matched before they can be used together. We discuss ontology matching and explain how to semantically interpret the relations between ontologies. Finally, this is applied to networks of peers using knowledge together.
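To give a flavour of what reasoning with an ontology means, the following Python sketch computes a partial closure of a small RDF graph under two RDFS entailment rules: transitivity of `rdfs:subClassOf` (rule rdfs11) and propagation of `rdf:type` along `rdfs:subClassOf` (rule rdfs9). It is only an illustration under stated assumptions: triples are plain Python tuples of strings, the `ex:` resources are hypothetical, and the remaining RDFS rules are ignored.

```python
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def rdfs_closure(triples):
    """Saturate a set of (subject, predicate, object) triples with rdfs9 and rdfs11 only."""
    closure = set(triples)
    changed = True
    while changed:
        changed = False
        new = set()
        for (s, p, o) in closure:
            if p != SUBCLASS:
                continue
            for (s2, p2, o2) in closure:
                # rdfs11: s subClassOf o, o subClassOf o2 => s subClassOf o2
                if p2 == SUBCLASS and s2 == o:
                    new.add((s, SUBCLASS, o2))
                # rdfs9: s2 type s, s subClassOf o => s2 type o
                if p2 == TYPE and o2 == s:
                    new.add((s2, TYPE, o))
        if not new <= closure:
            closure |= new
            changed = True
    return closure


# Hypothetical example graph.
graph = {
    ("ex:Flipper", TYPE, "ex:Dolphin"),
    ("ex:Dolphin", SUBCLASS, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS, "ex:Animal"),
}

closed = rdfs_closure(graph)
# A query such as "is ex:Flipper an ex:Animal?" is answered against the closure.
print(("ex:Flipper", TYPE, "ex:Animal") in closed)
```

Evaluating the query against the closed graph rather than the original one is what lets the ontology contribute answers that were never asserted explicitly.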
To be announced
Title | Lecturer |
Core XML (XML, Schemas, Parsing) | PG |
Programming with XML (Streaming Validation, XPath, XQuery) | PG |
Foundations of XML Types (Tree Grammars, Tree Automata) | PG |
Tree Logics (FO, MSO) | PG |
Tree Logics continued (μ-calculus) | PG |
Introduction to the social web | SAY |
Search and recommendation in the social web | SAY |
Semantic web languages (Data: URI, RDF, closure, interpolation lemma) | MA |
Semantic web languages (Ontologies: RDFS and OWL) | MA |
Querying RDF (SPARQL) | MA |
Querying data through ontologies (NSPARQL, PSPARQL, DL-Lite) | MA |
Alignment semantics and networked ontologies | MA |
Final exam |
Lecturer: Pierre Genevès
Slides and relevant teaching material are available from: http://pierre.geneves.net/teaching.html.
Lecturer: Sihem Amer Yahia
Lecturer: Jérôme Euzenat and Manuel Atencia
This part of the course is now collected into a single volume of lecture notes. These notes are constantly evolving, so avoid printing them until shortly before the exams. It is easier to download (and update) the PDF and browse through it. It is divided into three parts corresponding to the main sessions.
In previous years, we had a 3h exam at the end of the course. Starting in 2010-2011, we have two exams. The aim is to make sure that students know what is expected of them. In addition, here are some past exams.
Here are some questions from an exam given at EPFL in 2009 and their corrections (in English) for the XML part only.
Here is the exam of 2008-2009 (in French) and its correction (in English) for the semantic web part only.
Here is the exam of 2009-2010 (in French or English) and its correction (in English) for the semantic web part only.
Here is the exam of 2010-2011 (in French or English) and its correction (in English) for the semantic web part only.
Here is the exam of 2012-2013 (in English) for the semantic web and social network part and its correction (in English) for the semantic web part only.
Here is the exam of 2013-2014 (in English) for the semantic web and social network part and its correction (in English) for the semantic web part only.
Here is the exam of 2014-2015 (in English) for the semantic web and social network part and its correction (in English) for the semantic web part only.
Here is the midterm exam of 2015-2016 and its correction for the semantic web part (in English).
Here is the midterm exam of 2016-2017 and its correction for the semantic web part (in English).