Substructure counting graph kernels for machine learning from RDF data

G.K.D. de Vries; S. de Rooij

doi:https://doi.org/10.1016/j.websem.2015.08.002

Authors	G.K.D. de Vries; S. de Rooij
Publication date	12-2015
Journal	Journal of Web Semantics
Volume | Issue number	35 | 2
Pages (from-to)	71-84
Organisations	Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract	In this paper we introduce a framework for learning from RDF data using graph kernels that count substructures in RDF graphs. The framework systematically covers most previously defined kernels and provides a number of new variants. Our definitions include fast kernel variants that are computed directly on the RDF graph. To improve the performance of these kernels, we detail two strategies. The first is to ignore vertex labels that have a low frequency among the instances. The second is to remove hubs to simplify the RDF graphs. We test our kernels in a number of classification experiments with real-world RDF datasets. Overall, the kernels that count subtrees show the best performance. However, they are closely followed by simple bag-of-labels baseline kernels. The direct kernels substantially decrease computation time while keeping performance the same. For the walk-counting kernel, this decrease is large enough to make the kernel computationally viable. Ignoring low-frequency labels improves performance on all datasets. The hub removal algorithm increases performance on two out of three of our smaller datasets, but has little impact on our larger datasets.
Document type	Article
Language	English
Published at	https://doi.org/10.1016/j.websem.2015.08.002 (Final published version)