As part of my university course Informative Systems and Semantic Web, I developed this project to test the potential of advanced NLP tools (spaCy, https://spacy.io/) in the field of data integration, especially for solving issues related to:
- Schema Matching (SM): the columns are treated as unified text strings, and the similarity between table schemas is measured as the similarity between these strings, computed with NLP techniques.
- Entity Matching (EM): similarly, the rows are treated as unified text strings, and the similarity between string-tuples is measured with an NLP similarity measure.
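
The idea behind both tasks can be illustrated with a minimal sketch. It is not the project's actual implementation: the helper names `serialize` and `cosine_similarity`, the sample schemas and rows, and the bag-of-words cosine measure are all assumptions made for illustration. The cosine measure is a standard-library stand-in for an NLP similarity measure, chosen so that the example runs without downloading a language model.

```python
import math
from collections import Counter


def serialize(values):
    """Join column names (or row values) into one lowercase string.

    Hypothetical helper: how the strings are actually built is an assumption.
    """
    return " ".join(str(v).lower() for v in values)


def cosine_similarity(text_a, text_b):
    """Bag-of-words cosine similarity between two whitespace-tokenized strings.

    Stand-in for an NLP similarity measure; not a spaCy function.
    """
    counts_a, counts_b = Counter(text_a.split()), Counter(text_b.split())
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Schema Matching: each schema becomes one string (made-up column names).
schema_a = ["name", "city", "phone"]
schema_b = ["name", "city", "email"]
schema_score = cosine_similarity(serialize(schema_a), serialize(schema_b))

# Entity Matching: each row becomes one string (made-up records).
row_a = ("Mario Rossi", "Modena", "059 123456")
row_b = ("Rossi Mario", "Modena", "059 123456")
row_c = ("Anna Bianchi", "Torino", "011 654321")
same_entity_score = cosine_similarity(serialize(row_a), serialize(row_b))
other_entity_score = cosine_similarity(serialize(row_a), serialize(row_c))

print(f"schema similarity:       {schema_score:.3f}")
print(f"row_a vs row_b (similar): {same_entity_score:.3f}")
print(f"row_a vs row_c (distinct): {other_entity_score:.3f}")
```

With spaCy, the stand-in could be replaced by something like `nlp(text_a).similarity(nlp(text_b))`, where `nlp` is a loaded pipeline. This assumes a model that ships word vectors (for example `en_core_web_md`), since the small models give less reliable similarity scores.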