University of Grenoble Alpes, 2nd year Master of science in informatics (MoSIG), specialty Artificial intelligence and the web

Semantic web: from XML to OWL

Lecturers

Sihem Amer-Yahia (Sihem : Amer-Yahia # imag : fr)
Manuel Atencia (Manuel : Atencia # inria : fr)
Pierre Genevès (Pierre : Geneves # inria : fr)

Language

English

Credits

36h, 6 ECTS

Evaluation

Marks are based on:

A 60-minute midterm exam counting for 2/5, and
A 120-minute final written exam counting for 3/5

All documents allowed.

Official web site

GBX9MO25

Teams

Objective

The web has been constantly evolving from a distributed hypertext system into a very large information processing machine. Fast as it is, this evolution is grounded on theoretical principles borrowed from several fields of computer science, such as programming languages, databases, structured documentation, logic and artificial intelligence. The smooth operation of the past and future web at a large scale relies on these foundations. The goal of this course is to present them, the problems that they solve, and those that they uncover. It considers three milestones of this evolution: XML, the social web and the semantic web.

The first part introduces programming language foundations, algorithms and tools for processing tree-structured information, and for analysing queries and programs that manipulate trees. It consists of an introduction to the relevant theoretical tools, with an application to NoSQL and XML technologies in particular. The theoretical part introduces tree grammars, finite tree automata, classical tree logics and a recent μ-calculus of finite trees, in connection with practical problems and technologies such as XPath/XQuery, DTD, schemas, etc. Applications are illustrated through scalable validation of document streams, efficient query evaluation, static analysis of expressive queries in the presence of constraints, and static type-checking of programs manipulating labeled trees. The course also presents challenges, important results, and open theoretical issues in the area of NoSQL programming.
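
As an illustration of the finite tree automata mentioned above, here is a minimal sketch of a deterministic bottom-up tree automaton. It is not taken from the course material: the alphabet (boolean expressions), the state names `q0`/`q1` and the helper names `run` and `accepts` are assumptions chosen for the example. A tree is accepted when the state computed at its root is accepting.

```python
from typing import Dict, Optional, Tuple

# A tree is a pair (label, children), where children is a tuple of trees.
# Hypothetical alphabet: true/0, false/0, not/1, and/2, or/2.

# Transition table: (label, states of the children) -> state of the node.
RULES: Dict[Tuple[str, Tuple[str, ...]], str] = {
    ("true", ()): "q1",
    ("false", ()): "q0",
    ("not", ("q0",)): "q1",
    ("not", ("q1",)): "q0",
}
for a in ("q0", "q1"):
    for b in ("q0", "q1"):
        RULES[("and", (a, b))] = "q1" if a == b == "q1" else "q0"
        RULES[("or", (a, b))] = "q1" if "q1" in (a, b) else "q0"

ACCEPTING = {"q1"}


def run(tree) -> Optional[str]:
    """Return the state reached at the root, or None if no rule applies."""
    label, children = tree
    # Bottom-up: the children are evaluated before their parent.
    child_states = tuple(run(child) for child in children)
    if None in child_states:
        return None
    return RULES.get((label, child_states))


def accepts(tree) -> bool:
    """A tree is accepted when its root state is an accepting state."""
    return run(tree) in ACCEPTING


TRUE = ("true", ())
FALSE = ("false", ())

if __name__ == "__main__":
    # and(true, not(false)) is accepted; or(false, false) is not.
    print(accepts(("and", (TRUE, ("not", (FALSE,))))))
    print(accepts(("or", (FALSE, FALSE))))
```

The same bottom-up evaluation scheme underlies schema validation: replacing the boolean rules by rules derived from a tree grammar yields a validator for the corresponding tree language.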

The second part summarizes the data models and algorithms required to extract, manage and access massive amounts of social content. The course examples are drawn from real-world applications such as URL search and recommendation on Delicious, group recommendation in MovieLens and extracting travel itineraries from Flickr photos. The goals are to acquire knowledge of scalable algorithms for processing large volumes of social data and extracting value from that data, and to learn how to run and interpret large-scale user studies.
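
To give a flavour of the recommendation algorithms covered in this part, here is a minimal sketch of user-based collaborative filtering with cosine similarity. It is one classical approach among those surveyed in the recommender systems literature, not necessarily the one presented in the lectures. The ratings are toy data invented for the example (not MovieLens data), and the function names `cosine`, `predict` and `recommend` are assumptions.

```python
from math import sqrt
from typing import Dict, List, Optional

Ratings = Dict[str, Dict[str, float]]  # user -> {item -> rating}


def cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    """Cosine similarity between two sparse rating vectors."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


def predict(ratings: Ratings, user: str, item: str) -> Optional[float]:
    """Similarity-weighted average of the ratings other users gave to item."""
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine(ratings[user], other_ratings)
        num += sim * other_ratings[item]
        den += sim
    return num / den if den > 0 else None


def recommend(ratings: Ratings, user: str, k: int = 3) -> List[str]:
    """Return the k unseen items with the highest predicted rating."""
    seen = set(ratings[user])
    candidates = {i for r in ratings.values() for i in r} - seen
    scored = [(predict(ratings, user, i), i) for i in candidates]
    scored = [(s, i) for s, i in scored if s is not None]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:k]]


# Toy data (hypothetical): ann's tastes are closer to bob's than to cat's.
TOY_RATINGS: Ratings = {
    "ann": {"A": 5, "B": 1},
    "bob": {"A": 5, "B": 1, "C": 5, "D": 1},
    "cat": {"A": 1, "B": 5, "C": 1, "D": 5},
}

if __name__ == "__main__":
    print(recommend(TOY_RATINGS, "ann", k=2))
```

This naive version compares the target user with every other user, which does not scale to large volumes of social data; it only illustrates the scoring principle.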

The third part introduces the semantics of knowledge representation on the web. The semantic web extends the web with richer and more precise information because it is expressed in a formal language using a vocabulary defined in an ontology (a structured vocabulary of concepts and properties defined in a logic). Ontologies are used for describing web resource content and reasoning about these resources formally. We introduce the semantic web languages (RDF, RDFS, OWL) and show their relations with knowledge representation formalisms (conceptual graphs, description logics) and XML. This provides tools for reasoning with ontologies and, in particular, for evaluating queries. However, the distributed nature of the web leads to heterogeneous ontologies, which must be matched before they can be used together. We discuss ontology matching and explain how to semantically interpret the relations between ontologies. Finally, this is applied to networks of peers using knowledge together.
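
As a small illustration of reasoning with an ontology in order to evaluate a query, here is a minimal sketch that saturates a set of RDF triples with two RDFS entailment rules only: transitivity of rdfs:subClassOf (rule rdfs11) and propagation of rdf:type along rdfs:subClassOf (rule rdfs9). It covers only a fragment of RDFS entailment, uses plain Python tuples rather than an RDF library, and the `ex:` resources and function names are assumptions made up for the example.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def rdfs_closure(triples: Set[Triple]) -> Set[Triple]:
    """Saturate the graph with rules rdfs9 and rdfs11 until a fixpoint."""
    closure = set(triples)
    changed = True
    while changed:
        inferred = set()
        for (c, p, d) in closure:
            if p != SUBCLASS:
                continue
            for (s2, p2, o2) in closure:
                # rdfs11: c subClassOf d, d subClassOf e => c subClassOf e
                if p2 == SUBCLASS and s2 == d:
                    inferred.add((c, SUBCLASS, o2))
                # rdfs9: c subClassOf d, x type c => x type d
                if p2 == TYPE and o2 == c:
                    inferred.add((s2, TYPE, d))
        changed = not inferred <= closure
        closure |= inferred
    return closure


def instances_of(graph: Set[Triple], cls: str) -> Set[str]:
    """Answer the query 'which resources have type cls?' on a graph."""
    return {s for (s, p, o) in graph if p == TYPE and o == cls}


# Hypothetical data: a tiny class hierarchy and one typed individual.
GRAPH: Set[Triple] = {
    ("ex:Lecturer", SUBCLASS, "ex:Person"),
    ("ex:Person", SUBCLASS, "ex:Agent"),
    ("ex:alice", TYPE, "ex:Lecturer"),
}

if __name__ == "__main__":
    # Without reasoning the query has no answer; with the closure it has one.
    print(instances_of(GRAPH, "ex:Agent"))
    print(instances_of(rdfs_closure(GRAPH), "ex:Agent"))
```

Evaluating the query against the saturated graph rather than the asserted one is what makes the answer depend on the ontology, and not only on the explicitly stated triples.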

Place and time

To be announced

Planning (2017-2018)

The schedule can be consulted on the official timetable in ADE.

Title	Lecturer
Core XML (XML, Schemas, Parsing)	PG
Programming with XML (Streaming Validation, XPath, XQuery)	PG
Foundations of XML Types (Tree Grammars, Tree Automata)	PG
Tree Logics (FO, MSO)	PG
Tree Logics continued (μ-calculus)	PG
Introduction to the social web	SAY
Search and recommendation in the social web	SAY
Semantic web languages (Data: URI, RDF, closure, interpolation lemma)	MA
Semantic web languages (Ontologies: RDFS and OWL)	MA
Querying RDF (SPARQL)	MA
Querying data through ontologies (NSPARQL, PSPARQL, DL-Lite)	MA
Alignment semantics and networked ontologies	MA
Final exam

Outline and documents

First part: Foundations for processing tree-structured information

Lecturer: Pierre Genevès

Slides and relevant teaching material are available from: http://pierre.geneves.net/teaching.html.

Course Introduction
Core XML: XML, DTD, XML Schema, XML Parsing
Excursion (streaming DTD validation with SAX)
XPath
XQuery
Foundations of XML Types: An Introduction
Tree Grammars
Finite Tree Automata
First-Order Logic and Monadic Second-Order Logic
Advanced Static Analysis for XML/XPath

Second part: Social networks

Lecturer: Sihem Amer-Yahia

Lecture 1: Recommendation Systems
- Overview of Recommendation Approaches (slides)
- Hotlist Recommendation in Del.icio.us (slides)
- Recommendation Evaluation (slides)
Lecture 2: Social Data Mining
- Social Data Mining (slides)
- Travel Itinerary Mining (slides)

Third part: Semantic web

Lecturers: Jérôme Euzenat and Manuel Atencia

This part of the course is now collected into a single volume of lecture notes. These notes are always evolving, so avoid printing them until just before the exams. It is easier to download (and update) them and browse through the PDF. They are divided into three parts corresponding to the main sessions.

Graphs and ontologies
1. Resource description framework
2. Ontology languages
Queries
1. Querying RDF with SPARQL
2. Extending SPARQL
3. Querying modulo ontologies
Networks of ontologies
1. Networks of ontologies and alignments
2. Alignment semantics
3. Distributed query evaluation

References

Serge Abiteboul, Ioana Manolescu, Philippe Rigaux, Marie-Christine Rousset, Pierre Senellart, Web Data Management, Cambridge university press (UK), 2011
Philippe Adjiman, Philippe Chatalic, Francois Goasdoué, Marie-Christine Rousset, Laurent Simon, Distributed Reasoning in a Peer-to-Peer Setting : Application to the Semantic Web, Journal of Artificial Intelligence Research (JAIR) Volume 25. 2006.
Philippe Adjiman, Francois Goasdoué, Marie-Christine Rousset, Some RDFS in the Semantic Web. Journal of Data Semantics (JoDS), 2007
Gediminas Adomavicius and Alexander Tuzhilin. Towards the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions. IEEE Transactions on Knowledge and Data Engineering 17(6), June 2005.
Sihem Amer-Yahia et al. Efficient network aware search in collaborative tagging sites. PVLDB 1(1):710-721, 2008
Sihem Amer-Yahia et al. Group Recommendation: Semantics and Efficiency. PVLDB 2(1):754-765, 2009
Sihem Amer-Yahia et al. It takes variety to make a world: diversification in recommender systems. EDBT 2009.
Sihem Amer-Yahia et al. Getting recommender systems to think outside the box. RecSys 2009.
Sihem Amer-Yahia et al. Efficient Computation of Diverse Query Results. ICDE 2008.
Grigoris Antoniou, Frank van Harmelen, A semantic web primer, The MIT press, 2004 (rev. 2008)
Senjuti Basu Roy, Sihem Amer-Yahia, Ashish Chawla, Gautam Das, Cong Yu. Constructing and exploring composite items. SIGMOD Conference 2010: 843-854
Senjuti Basu Roy, Sihem Amer-Yahia, Gautam Das, Cong Yu. Interactive Itinerary Planning. ICDE 2011.
Hubert Comon, Max Dauchet, Rémi Gilleron, Christof Löding, Florent Jacquemard, Denis Lugiez, Sophie Tison, Marc Tommasi, Tree Automata Techniques and Applications (http://tata.gforge.inria.fr/), October, 12th 2007
Mahashweta Das, Saravanan Thirumuruganathan, Sihem Amer-Yahia, Gautam Das, Cong Yu, Who Tags What? An Analysis Framework, PVLDB 5(11):1567-1578, 2012
Mahashweta Das, Sihem Amer-Yahia, Gautam Das, Cong Yu, MRI: Meaningful Interpretations of Collaborative Ratings PVLDB 4(11):1063-1074, 2011
Jérôme Euzenat, Pavel Shvaiko, Ontology matching, Springer Verlag, Heidelberg (DE), 2007; 2nd edition, 2013
Ronald Fagin, Amnon Lotem, Moni Naor. Optimal Aggregation Algorithms for Middleware. PODS 2001
Pierre Genevès, Nabil Layaïda and Alan Schmitt, Efficient Static Analysis of XML Paths and Types, Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation (PLDI), San Diego (CA US), pp. 342-351, 2007.
Pierre Geneves and Nabil Layaida, A System for the Static Analysis of XPath, ACM Transactions on Information Systems 24(4):475-502, 2006.
Pascal Hitzler, Markus Krötzsch, Sebastian Rudolph, Foundations of semantic web technologies, Chapman & Hall/CRC, 2009
Ian Horrocks, OWL 2: the next generation, 2011
H. Hosoya, Foundations of XML Processing, April 2, 2007
Modal μ-Calculi, Chapter of "Handbook of Modal Logic"
Julia Stoyanovich, Sihem Amer-Yahia, Cameron Marlow, Cong Yu. Leveraging Tagging to Model User Interests in del.icio.us. AAAI Social Information Processing workshop 2008.

Previous exams

In previous years, we had 3h exams at the end of the course. Since 2010-2011, we have had two exams. This is meant to ensure that students know what is expected of them. In addition, here are some past exams.

Here are some questions of an exam proposed at EPFL in 2009 and their corrections (in English) for the XML part only.

Here is the exam of 2008-2009 (in French) and its correction (in English) for the semantic web part only.

Here is the exam of 2009-2010 (in French or English) and its correction (in English) for the semantic web part only.

Here is the exam of 2010-2011 (in French or English) and its correction (in English) for the semantic web part only.

Here is the exam of 2012-2013 (in English) for the semantic web and social network part and its correction (in English) for the semantic web part only.

Here is the exam of 2013-2014 (in English) for the semantic web and social network part and its correction (in English) for the semantic web part only.

Here is the exam of 2014-2015 (in English) for the semantic web and social network part and its correction (in English) for the semantic web part only.

Here is the midterm exam of 2015-2016 and its correction for the semantic web part (in English).

Here is the midterm exam of 2016-2017 and its correction for the semantic web part (in English).

http://moex.inria.fr/teaching/sw/