University of Grenoble Alpes, 2nd year Master of science in informatics (MoSIG), specialty Artificial intelligence and the web

Semantic web: from XML to OWL

Sihem Amer-Yahia (Sihem : Amer-Yahia # imag : fr)
Manuel Atencia (Manuel : Atencia # inria : fr)
Pierre Genevès (Pierre : Geneves # inria : fr)
36h, 6 ECTS
Marks are given after All documents allowed.
Official web site



The web has constantly evolved from a distributed hypertext system into a very large information processing machine. Fast as it is, this evolution is grounded on theoretical principles borrowed from several fields of computer science, such as programming languages, databases, structured documentation, logic and artificial intelligence. The smooth operation of the past and future web at a large scale relies on these foundations. The goal of this course is to present them, along with the problems that they solve and those that they uncover. It considers three milestones of this evolution: XML, the social web and the semantic web.

The first part introduces programming language foundations, algorithms and tools for processing tree-structured information, and for analysing queries and programs that manipulate trees. It consists of an introduction to the relevant theoretical tools, with an application to NoSQL and XML technologies in particular. The theoretical part introduces tree grammars, finite tree automata, classical tree logics and a recent mu-calculus of finite trees, in connection with practical problems and technologies such as XPath/XQuery, DTD, schemas, etc. Applications are illustrated through scalable validation of document streams, efficient query evaluation, static analysis of expressive queries in the presence of constraints, and static type-checking of programs manipulating labeled trees. The course also presents challenges, important results, and open theoretical issues in the area of NoSQL programming.
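As a rough illustration of the finite tree automata mentioned in this first part, here is a minimal sketch of a deterministic bottom-up tree automaton in Python. The toy alphabet (Boolean expression trees), the state names `q0`/`q1` and the helper names `run` and `accepts` are assumptions made for this example only; they are not taken from the course material.

```python
from typing import Dict, List, Optional, Tuple

# A tree is a pair (label, list of subtrees). Illustrative encoding only.
Tree = Tuple[str, List["Tree"]]

# Transition rules of a bottom-up automaton: (label, child states) -> state.
# State q1 means "evaluates to true", q0 means "evaluates to false".
RULES: Dict[Tuple[str, Tuple[str, ...]], str] = {
    ("true", ()): "q1",
    ("false", ()): "q0",
    ("not", ("q0",)): "q1",
    ("not", ("q1",)): "q0",
}
for a in ("q0", "q1"):
    for b in ("q0", "q1"):
        RULES[("and", (a, b))] = "q1" if a == b == "q1" else "q0"
        RULES[("or", (a, b))] = "q1" if "q1" in (a, b) else "q0"

FINAL_STATES = {"q1"}


def run(tree: Tree) -> Optional[str]:
    """Return the state reached at the root, or None if no rule applies."""
    label, children = tree
    child_states = []
    for child in children:
        state = run(child)
        if state is None:  # the automaton is stuck somewhere below
            return None
        child_states.append(state)
    return RULES.get((label, tuple(child_states)))


def accepts(tree: Tree) -> bool:
    """A tree is accepted when the root state is final."""
    return run(tree) in FINAL_STATES


if __name__ == "__main__":
    t = ("and", [("true", []), ("not", [("false", [])])])
    print(accepts(t))
```

States are computed from the leaves up to the root, and a tree is rejected either because the root state is not final or because some node has no applicable rule. Schema validation can be viewed in a similar way, with states playing the role of types.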

The second part summarizes the data models and algorithms required to extract, manage and access massive amounts of social content. The course examples are drawn from real-world applications such as URL search and recommendation on Delicious, group recommendation in MovieLens and extracting travel itineraries from Flickr photos. The course goals are to acquire knowledge of scalable algorithms for processing large volumes of social data and extracting value from that data, and to learn how to run and interpret large-scale user studies.
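To give a flavour of group recommendation, the sketch below aggregates individual ratings into a group score using two commonly discussed consensus strategies, average and least misery. The data, the function names `group_scores` and `recommend`, and the choice of strategies are assumptions for illustration; the course may present different or richer models.

```python
from statistics import mean
from typing import Dict, List

Ratings = Dict[str, Dict[str, float]]  # user -> item -> rating


def group_scores(ratings: Ratings, strategy: str = "average") -> Dict[str, float]:
    """Score the items rated by every group member.

    "average" uses the mean rating; "least_misery" uses the lowest rating,
    so that no member is strongly dissatisfied.
    """
    if strategy == "average":
        aggregate = mean
    elif strategy == "least_misery":
        aggregate = min
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    members = list(ratings.values())
    if not members:
        return {}
    # Keep only items that all members have rated.
    common = set(members[0]).intersection(*members[1:])
    return {item: aggregate([m[item] for m in members]) for item in common}


def recommend(ratings: Ratings, k: int = 1, strategy: str = "average") -> List[str]:
    """Return the k best items, breaking ties by item name."""
    scores = group_scores(ratings, strategy)
    return sorted(scores, key=lambda item: (-scores[item], item))[:k]


if __name__ == "__main__":
    # Hypothetical ratings, not MovieLens data.
    group = {
        "ann": {"m1": 5, "m2": 3},
        "bob": {"m1": 2, "m2": 3},
        "carl": {"m1": 5, "m2": 3},
    }
    print(recommend(group, strategy="average"))
    print(recommend(group, strategy="least_misery"))
```

With these hypothetical ratings the two strategies disagree: the average favours `m1`, while least misery favours `m2` because one member rated `m1` low.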

The third part introduces the semantics of knowledge representation on the web. The semantic web extends the web with richer and more precise information because it is expressed in a formal language using a vocabulary defined in an ontology (a structured vocabulary of concepts and properties defined in a logic). Ontologies are used for describing the content of web resources and for reasoning about these resources formally. We introduce the semantic web languages (RDF, RDFS, OWL) and show their relations with knowledge representation formalisms (conceptual graphs, description logics) and XML. This provides tools for reasoning with ontologies and, in particular, for evaluating queries. However, the distributed nature of the web leads to heterogeneous ontologies, which must be matched before they can be used together. We discuss ontology matching and explain how to semantically interpret the relations between ontologies. Finally, this is applied to networks of peers using knowledge together.
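The idea of reasoning with an ontology to answer queries can be sketched with a few lines of Python. The example below saturates a small set of triples with only two RDFS entailment rules (transitivity of `rdfs:subClassOf`, and propagation of `rdf:type` along `rdfs:subClassOf`) and then evaluates a single triple pattern. The `ex:` resources and the function names `rdfs_closure` and `match` are hypothetical, and this is far from a complete RDFS reasoner.

```python
from typing import Iterable, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def rdfs_closure(triples: Iterable[Triple]) -> Set[Triple]:
    """Saturate the graph with two RDFS rules until a fixpoint is reached."""
    closure = set(triples)
    while True:
        new = set()
        for (c, p, d) in closure:
            if p != SUBCLASS:
                continue
            for (s2, p2, o2) in closure:
                # c subClassOf d, d subClassOf e  =>  c subClassOf e
                if p2 == SUBCLASS and s2 == d:
                    new.add((c, SUBCLASS, o2))
                # x type c, c subClassOf d  =>  x type d
                if p2 == RDF_TYPE and o2 == c:
                    new.add((s2, RDF_TYPE, d))
        if new <= closure:  # nothing new was derived
            return closure
        closure |= new


def match(triples: Iterable[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> List[Triple]:
    """Evaluate one triple pattern; None acts as a variable."""
    return sorted(t for t in triples
                  if (s is None or t[0] == s)
                  and (p is None or t[1] == p)
                  and (o is None or t[2] == o))


if __name__ == "__main__":
    graph = {
        ("ex:Lecture", SUBCLASS, "ex:Event"),
        ("ex:Event", SUBCLASS, "ex:Thing"),
        ("ex:session1", RDF_TYPE, "ex:Lecture"),
    }
    print(match(graph, p=RDF_TYPE, o="ex:Event"))                # no answer
    print(match(rdfs_closure(graph), p=RDF_TYPE, o="ex:Event"))  # one answer
```

The query has no answer on the asserted triples alone, but it does on the saturated graph. This is the difference between querying data and querying data modulo an ontology.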

Place and time

To be announced

Planning (2017-2018)

This can be consulted on the official timetable in ADE

Core XML (XML, Schemas, Parsing) [PG]
Programming with XML (Streaming Validation, XPath, XQuery) [PG]
Foundations of XML Types (Tree Grammars, Tree Automata) [PG]
Tree Logics (FO, MSO) [PG]
Tree Logics continued (μ-calculus) [PG]
Introduction to the social web [SAY]
Search and recommendation in the social web [SAY]
Semantic web languages (Data: URI, RDF, closure, interpolation lemma) [MA]
Semantic web languages (Ontologies: RDFS and OWL) [MA]
Querying data through ontologies (NSPARQL, PSPARQL, DL-Lite) [MA]
Alignment semantics and networked ontologies [MA]
Final exam

Outline and documents

First part: Foundations for processing tree-structured information

Lecturer: Pierre Genevès

Slides and relevant teaching material are available from:

Second part: Social networks

Lecturer: Sihem Amer-Yahia

Third part: Semantic web

Lecturers: Jérôme Euzenat and Manuel Atencia

This part of the course is now collected into a single Lecture notes volume. These notes are continually evolving, so avoid printing them until just before the exams. It is easier to download (and update) the PDF and browse through it. It is divided into three parts corresponding to the main sessions.

  1. Graphs and ontologies
    1. Resource description framework
    2. Ontology languages
  2. Queries
    1. Querying RDF with SPARQL
    2. Extending SPARQL
    3. Querying modulo ontologies
  3. Networks of ontologies
    1. Networks of ontologies and alignments
    2. Alignment semantics
    3. Distributed query evaluation


  1. Serge Abiteboul, Ioana Manolescu, Philippe Rigaux, Marie-Christine Rousset, Pierre Senellart, Web Data Management, Cambridge university press (UK), 2011
  2. Philippe Adjiman, Philippe Chatalic, Francois Goasdoué, Marie-Christine Rousset, Laurent Simon, Distributed Reasoning in a Peer-to-Peer Setting : Application to the Semantic Web, Journal of Artificial Intelligence Research (JAIR) Volume 25. 2006.
  3. Philippe Adjiman, Francois Goasdoué, Marie-Christine Rousset, SomeRDFS in the Semantic Web. Journal of Data Semantics (JoDS), 2007
  4. Gediminas Adomavicius and Alexander Tuzhilin. Towards the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions. IEEE Transactions on Knowledge and Data Engineering 17(6), June 2005.
  5. Sihem Amer-Yahia et al. Efficient network aware search in collaborative tagging sites. PVLDB 1(1):710-721, 2008
  6. Sihem Amer-Yahia et al. Group Recommendation: Semantics and Efficiency. PVLDB 2(1):754-765, 2009
  7. Sihem Amer-Yahia et al. It takes variety to make a world: diversification in recommender systems. EDBT 2009.
  8. Sihem Amer-Yahia et al. Getting recommender systems to think outside the box. RecSys 2009.
  9. Sihem Amer-Yahia et al. Efficient Computation of Diverse Query Results. ICDE 2008.
  10. Grigoris Antoniou, Frank van Harmelen, A semantic web primer, The MIT press, 2004 (rev. 2008)
  11. Senjuti Basu Roy, Sihem Amer-Yahia, Ashish Chawla, Gautam Das, Cong Yu. Constructing and exploring composite items. SIGMOD Conference 2010: 843-854
  12. Senjuti Basu Roy, Sihem Amer-Yahia, Gautam Das, Cong Yu. Interactive Itinerary Planning. ICDE 2011.
  13. Hubert Comon, Max Dauchet, Rémi Gilleron, Christof Löding, Florent Jacquemard, Denis Lugiez, Sophie Tison, Marc Tommasi, Tree Automata Techniques and Applications, October 12th, 2007
  14. Mahashweta Das, Saravanan Thirumuruganathan, Sihem Amer-Yahia, Gautam Das, Cong Yu, Who Tags What? An Analysis Framework, PVLDB 5(11):1567-1578, 2012
  15. Mahashweta Das, Sihem Amer-Yahia, Gautam Das, Cong Yu, MRI: Meaningful Interpretations of Collaborative Ratings PVLDB 4(11):1063-1074, 2011
  16. Jérôme Euzenat, Pavel Shvaiko, Ontology matching, Springer Verlag, Heidelberg (DE), 2007; 2nd edition, 2013
  17. Ronald Fagin, Amnon Lotem, Moni Naor. Optimal Aggregation Algorithms for Middleware. PODS 2001
  18. Pierre Geneves, Nabil Layaida and Alan Schmitt, Efficient Static Analysis of XML Paths and Types, Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation (PLDI), San Diego (CA US), pp342-351, 2007.
  19. Pierre Geneves and Nabil Layaida, A System for the Static Analysis of XPath, ACM Transactions on Information Systems 24(4):475-502, 2006.
  20. Pascal Hitzler, Markus Krötzsch, Sebastian Rudolph, Foundations of semantic web technologies, Chapman & Hall/CRC, 2009
  21. Ian Horrocks, OWL 2: the next generation, 2011
  22. H. Hosoya, Foundations of XML Processing, April 2, 2007
  23. Modal μ-Calculi, Chapter of "Handbook of Modal Logic"
  24. Julia Stoyanovich, Sihem Amer-Yahia, Cameron Marlow, Cong Yu. Leveraging Tagging to Model User Interests in AAAI Social Information Processing workshop 2008.

Previous exams

In previous years, we had 3h exams at the end of the course. Since 2010-2011, we have had two exams. This is meant to make sure that students know what is expected of them. In addition, here are some past exams.

Here are some questions from an exam given at EPFL in 2009 and their solutions (in English) for the XML part only.

Here is the exam of 2008-2009 (in French) and its solutions (in English) for the semantic web part only.

Here is the exam of 2009-2010 (in French or English) and its solutions (in English) for the semantic web part only.

Here is the exam of 2010-2011 (in French or English) and its solutions (in English) for the semantic web part only.

Here is the exam of 2012-2013 (in English) for the semantic web and social network parts and its solutions (in English) for the semantic web part only.

Here is the exam of 2013-2014 (in English) for the semantic web and social network parts and its solutions (in English) for the semantic web part only.

Here is the exam of 2014-2015 (in English) for the semantic web and social network parts and its solutions (in English) for the semantic web part only.

Here is the midterm exam of 2015-2016 and its solutions for the semantic web part (in English).

Here is the midterm exam of 2016-2017 and its solutions for the semantic web part (in English).