- Structural Properties as Proxy for Semantic Relevance in RDF Graph Sampling
- Lecture Notes in Computer Science
- Faculty of Science (FNWI)
Faculty of Law (FdR)
- Informatics Institute (IVI)
Leibniz Center for Law (FdR)
The Linked Data cloud has grown to become the largest knowledge base ever constructed. Its size is now turning into a major bottleneck for many applications. To facilitate access to this structured information, this paper proposes an automatic sampling method targeted at maximizing answer coverage for applications using SPARQL querying. The approach presented in this paper is novel: no similar RDF sampling approach exists. Additionally, the concept of creating a sample aimed at maximizing SPARQL answer coverage is unique. We empirically show that the relevance of triples for sampling (a semantic notion) is influenced by the topology of the graph (a purely structural notion), and can be determined without prior knowledge of the queries. Experiments show a significantly higher recall for topology-based sampling methods than for random and naive baseline approaches (e.g. up to 90% for Open-BioMed at a sample size of 6%).
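The idea of ranking triples by graph structure and then measuring how much of a query workload the sample still covers can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the topological measures used, so node degree stands in as one simple structural score, and the function names `degree_sample` and `recall`, the `ex:` identifiers, and the toy data are hypothetical.

```python
from collections import Counter
from math import ceil


def degree_sample(triples, fraction):
    """Keep the top `fraction` of triples, ranked by a structural score.

    Assumption: the score of a triple is the degree of its subject plus
    the degree of its object. This is an illustrative choice, not
    necessarily the measure used in the paper.
    """
    degree = Counter()
    for s, _, o in triples:
        degree[s] += 1
        degree[o] += 1
    # sorted() is stable, so equally scored triples keep their input order.
    ranked = sorted(triples, key=lambda t: degree[t[0]] + degree[t[2]], reverse=True)
    keep = ceil(fraction * len(triples))
    return ranked[:keep]


def recall(sample, needed):
    """Share of the triples in `needed` that are still present in `sample`."""
    if not needed:
        return 1.0
    sample_set = set(sample)
    return sum(1 for t in needed if t in sample_set) / len(needed)


# Hypothetical toy graph: three nodes point at a hub, plus one isolated edge.
triples = [
    ("ex:a", "ex:knows", "ex:hub"),
    ("ex:b", "ex:knows", "ex:hub"),
    ("ex:c", "ex:knows", "ex:hub"),
    ("ex:hub", "ex:label", '"Hub"'),
    ("ex:x", "ex:knows", "ex:y"),
]

sample = degree_sample(triples, 0.5)
# Triples that some (hypothetical) query workload needs to produce its answers.
needed = [triples[0], triples[4]]
print(len(sample), recall(sample, needed))
```

The ranking step never looks at the queries; they are used only afterwards, to score the sample. That mirrors the claim that relevance can be estimated without prior knowledge of the queries.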
- Proceedings title: The semantic web - ISWC 2014: 13th International Semantic Web Conference, Riva del Garda, Italy, October 19-23, 2014: proceedings. - Part 2
Place of publication: Cham
Editors: P. Mika, T. Tudorache, A. Bernstein, C. Welty, C. Knobloch, D. Vrandečić, P. Groth, N. Noy, K. Janowicz, C. Goble