Pay-as-you-go data integration using functional dependencies

Authors
Publication date 2012
Host editors
  • G. Quirchmayr
  • J. Basl
  • I. You
  • L. Xu
  • E. Weippl
Book title Multidisciplinary Research and Practice for Information Systems
Book subtitle IFIP WG 8.4, 8.9/TC 5 : international cross-domain conference and workshop on availability, reliability, and security, CD-ARES 2012, Prague, Czech Republic, August 20-24, 2012 : proceedings
ISBN
  • 9783642324970
ISBN (electronic)
  • 9783642324987
Series Lecture Notes in Computer Science
Event The 7th ARES conference (ARES 2012): international cross domain conference and workshop (CD-ARES 2012)
Pages (from-to) 375-389
Publisher Heidelberg: Springer
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Setting up a full data integration system for many application contexts, e.g. web and scientific data management, requires significant human effort, which prevents it from being truly scalable. In this paper, we propose IFD (Integration based on Functional Dependencies), a pay-as-you-go data integration system that integrates a given set of data sources and can incrementally integrate additional sources. IFD takes advantage of the background knowledge implied by functional dependencies to match the source schemas. Our system is built on a probabilistic data model that captures the uncertainty in data integration systems. Our performance evaluation shows significant gains in recall and precision compared to the baseline approaches. The results confirm the importance of functional dependencies and the contribution of a probabilistic data model to improving the quality of schema matching. The analytical study and experiments show that IFD scales well.
Document type Conference contribution
Language English
Published at https://doi.org/10.1007/978-3-642-32498-7_28