Tuesday, October 27, 2009

ISWC In Use Track - raw notes 1

Auer & Lehmann - Spatially Linked Geodata

Many real-world tasks use spatial data. Current LOD datasets only have large-scale geographic structures, not bakeries, recycling bins, etc. How to get geo data for small-scale objects? OpenStreetMap (openstreetmap.org) provides a crystallization point for spatial web data integration. Stats on current size of database; growth rates 7-11% monthly in various categories. Collaborative process; data stored in an RDB but available as periodic dumps or incremental update feeds. Arbitrary key-value pairs can be added to any element, which can be used to add semweb annotations.

Authors' project converts OSM models and properties to RDF/OWL. Result: 500 classes, 50 object properties, 15K data properties (which seems like a lot).

Use Triplify to generate RDF from relational data. Dump at linkedgeodata.org/Datasets, SPARQL endpoint hosted by OpenLink. Other REST interfaces: points within a circular radius of a given point (cool!), points within a radius belonging to a class, points within a radius with a given property value.
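To make the radius lookups concrete, here is a minimal sketch of what such a query has to compute. It is not the LinkedGeoData implementation: the function names, the dict layout of a point, and the use of the haversine formula on a spherical Earth are all my assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation (assumption)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/lon points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def points_within(points, lat, lon, radius_km, cls=None):
    """Return points inside the circle, optionally restricted to one class.

    `points` is a hypothetical list of dicts with 'lat', 'lon' and 'class' keys.
    """
    return [
        p for p in points
        if (cls is None or p["class"] == cls)
        and haversine_km(lat, lon, p["lat"], p["lon"]) <= radius_km
    ]

# Made-up sample data: two points near Leipzig, one in Berlin.
sample = [
    {"name": "bakery A", "class": "Bakery", "lat": 51.35, "lon": 12.38},
    {"name": "bin B", "class": "Recycling", "lat": 51.34, "lon": 12.36},
    {"name": "bakery C", "class": "Bakery", "lat": 52.52, "lon": 13.405},
]
nearby = points_within(sample, 51.34, 12.37, radius_km=5)
nearby_bakeries = points_within(sample, 51.34, 12.37, radius_km=5, cls="Bakery")
```

A linear scan like this is only for illustration; a real service would presumably use a spatial index to avoid touching every point.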

Want to link to other LOD datasets, e.g. DBpedia. Some owl:sameAs links in schema are obvious. Also use DL-Learner to match categories. For instance data, three matching criteria: name, location, type. Some problems matching locations, since there is no consensus on where to place location markers for large entities like cities. For large countries, e.g. Russia, centroids can be 1000km apart between OSM and Wikipedia. Needed some string matching metrics to get name matches, but set the threshold fairly high. Generated 50K matches to DBpedia objects, mostly cities.
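A rough sketch of how the three criteria (name, location, type) could combine into a single match test. The talk did not say which string metric or thresholds were used, so difflib's ratio, the 0.9 name threshold, the 10 km distance cutoff, and the record layout below are all placeholders of mine.

```python
import math
from difflib import SequenceMatcher

def name_similarity(a, b):
    """Case-insensitive similarity in [0, 1]; difflib is a stand-in metric."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def distance_km(lat1, lon1, lat2, lon2):
    """Haversine distance on a spherical Earth of radius 6371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def is_match(osm, dbp, name_threshold=0.9, max_km=10.0):
    """True if two records agree on type, name and location.

    Both thresholds are hypothetical. A single distance cutoff would miss
    large entities such as countries, whose centroids can differ widely.
    """
    if osm["type"] != dbp["type"]:
        return False
    if name_similarity(osm["name"], dbp["name"]) < name_threshold:
        return False
    return distance_km(osm["lat"], osm["lon"], dbp["lat"], dbp["lon"]) <= max_km

# Made-up records for illustration.
osm_city = {"name": "Leipzig", "type": "City", "lat": 51.34, "lon": 12.37}
dbp_city = {"name": "Leipzig", "type": "City", "lat": 51.3333, "lon": 12.3833}
dbp_other = {"name": "Dresden", "type": "City", "lat": 51.05, "lon": 13.74}
```

Checking type first is just the cheapest ordering; a high name threshold trades recall for precision, which fits the note that the threshold was set fairly high.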

Demo - very nice. Facet browsing can be used to narrow selections. Much effort went into indexing data for efficient facet lookup. Quadtile indexing - 2 bits per quad, recurse. 18 zoom levels, producing a discrete hypercube.
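My reading of the quadtile idea, as a sketch: at each zoom level the current tile splits into four quadrants, the quadrant is recorded in 2 bits, and the process recurses, so 18 levels give a 36-bit key. The linear lat/lon-to-grid mapping and the bit order (y bit, then x bit) are assumptions; the actual LinkedGeoData index may project and order things differently.

```python
ZOOM_LEVELS = 18  # from the talk; 2 bits per level gives a 36-bit key

def quadtile_key(lat, lon, zoom=ZOOM_LEVELS):
    """Interleave the x/y grid bits of a point into one integer key."""
    n = 1 << zoom  # grid cells per axis at this zoom
    # Linear mapping of degrees to grid cells (assumption), clamped to the grid.
    x = min(n - 1, int((lon + 180.0) / 360.0 * n))
    y = min(n - 1, int((90.0 - lat) / 180.0 * n))
    key = 0
    for level in range(zoom - 1, -1, -1):  # most significant level first
        xbit = (x >> level) & 1
        ybit = (y >> level) & 1
        key = (key << 2) | (ybit << 1) | xbit  # 2 bits select the quadrant
    return key

def prefix_at(key, zoom, full_zoom=ZOOM_LEVELS):
    """Truncate a full-resolution key to the tile containing it at `zoom`."""
    return key >> (2 * (full_zoom - zoom))

# Two nearby points (made-up coordinates) and one far away.
a = quadtile_key(51.340, 12.370)
b = quadtile_key(51.341, 12.371)
c = quadtile_key(-33.87, 151.21)
```

Because the key is built coarse-to-fine, all points in the same tile at a given zoom share a key prefix, so a tile lookup becomes a range scan over an ordinary sorted index.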

Future work: link to other datasets. Refine LGD schema. Refine browser. Apply best practices from other Geo projects.
