MOSIG Master 2ND
Formal concept analysis, extracting concepts found in a data set, can be expressed as a constraint satisfaction problem. Relational concept analysis deals simultaneously with several interrelated data sets. We aim at answering if it, or which part of it, can be expressed with constraints.
Formal concept analysis (FCA, [Ganter 1999]) is a data mining approach that identifies concepts within a data set. More precisely, from a set of data items, e.g. Researchers, identified by attributes, e.g. topic, age, it finds out concepts which are characterized by the set of objects sharing a set of attributes, e.g. Researchers in artificial intelligence of less than 25 years old.
Relational concept analysis (RCA, [Rouane-Hacene 2013]) enables to find interdependent concepts within related data sets, i.e. that Researchers are also characterised by the Papers they wrote and the Labs they belong to which, in turns, depend on other Researchers. For that purpose, it splits the data in related datasets (here Researchers, Papers, Labs) which are processed individually by formal concept analysis. The resulting concepts are then injected back to the initial FCA problems with the help of 'scaling operators' and formal concept analysis is performed again on these problems, for instance new attributes would be: has-some-members-as-Reserachers-in-AI-of-less-than-25-years-old, or wrote-all-her-papers-with-AI-Researchers. This process is iterated until it reaches a fixed point, i.e. until no more concepts are added.
We use such techniques, in the ANR Elker project, to extract sameAs links in RDF data sets [Atencia 2020]
More interestingly, relational concept analysis is designed to work when there are cyclic dependencies between data sets and concepts, that a Researcher belongs to a Lab whose members are Researchers. This requires the computation of fixed points. We have recently provided an alternative fixed-point semantics to RCA accounting for all the stable solutions representing the input data [Euzenat 2021]. The classical RCA semantics is in fact the least fixed-point semantics.
A recent trend is to express data mining algorithms as constraint satisfaction problems (CSP) [Bessiere 2017, Guns 2017]. Such problems can then be solved automatically through techniques known as constraint programming. In that context, each solution of the constraint satisfaction problem corresponds to a concept for the corresponding data mining problem. This has been achieved for formal concept analysis whose principles may be expressed in a few constraints.
The goal of this work is to find out if it is possible to express relational concept analysis as a constraint satisfaction problem. For datasets without cycles, the answer should be positive as it simply consists in performing several FCA steps in a row. It however remains to express the mapping between both problems.
Hence, the contribution of this work would be to answer positively or negatively the raised question: is it possible or not to express relational concept analysis to constraint satisfaction. There is an option to answer this question partially, e.g. YES for acyclic RCA and NO for cyclic RCA, or to develop specific techniques to precise cases of RCA. There is also an option to to answer differently depending on the considered semantics. The question could be investigated theoretically, by designing a constraint satisfaction problem equivalent to RCA, or practically, by finding encodings for such problems to CSP.
Answering this question, beside providing the opportunity to use constraint programming to perform relational concept analysis, would lead to new questions such as 'Can relational concept analysis be then reduced to formal concept analysis?' and especially provide an encoding for that.
[Atencia 2020] Manuel Atencia, Jérôme David, Jérôme Euzenat, Amedeo Napoli, Jérémy Vizzini, Link key candidate extraction with relational concept analysis, Discrete applied mathematics 273:2-20, 2020
[Bessiere 2017] Christian Bessiere, Luc De Raedt, Tias Guns, Lars Kotthoff, Mirco Nanni, Siegfried Nijssen, Barry O'Sullivan, Anastasia Paparrizou, Dino Pedreschi, Helmut Simonis, The Inductive Constraint Programming Loop, IEEE Intelligent systems 32(5):44-52, 2017
[Euzenat 2021] Jérôme Euzenat, Fixed-point semantics for barebone relational concept analysis, in: Proc. 16th international conference on formal concept analysis (ICFCA), Strasbourg (FR), (Agnès Braud, Aleksey Buzmakov, Tom Hanika, Florence Le Ber (eds), Proc. 16th international conference on formal concept analysis (ICFCA), Lecture notes in computer science 12733, 2021), pp20-37, 2021 [Ganter 1999] Bernhard Ganter, Rudolf Wille, Formal concept analysis: mathematical foundations, Springer, Berlin (DE), 1999
[Guns 2017] Tias Guns, Anton Dries, Siegfried Nijssen, Guido Tack, Luc De Raedt, MiningZinc: A declarative framework for constraint-based mining, Artificial intelligence 244:6-29, 2017
[Hacene 2013a] Mohamed Rouane Hacene, Marianne Huchard, Amedeo Napoli, Petko Valtchev, Relational concept analysis: mining concept lattices from multi-relational data, Annals of Mathematics and Artificial Intelligence 67:81-108, 2013
Reference number: Proposal n°3018
Master profile: M2R MOSIG, M2R MSIAM, M2R Informatics.
Advisor: Jérôme Euzenat (Jerome:Euzenat#inria:fr) and Jérôme David (Jerome:David#inria:fr).
Team: The work will be carried out in the mOeX team common to INRIA & Université Grenoble Alpes. mOeX is dedicated to study knowledge evolution through adaptation. It gather permanent researchers from the Exmo team which has taken an active part these past 15 years in the development of the semantic web and more specifically ontology matching.
Procedure: Contact us and provide vitæ and possibly motivation letter and references.