FAGI-gis: fusing geospatial RDF data

GeoKnow introduces the latest version of FAGI-gis, a framework for fusing Linked Data, that focuses on the geospatial properties of the linked entities. FAGI-gis receives as input two datasets (through SPARQL endpoints) and a set of links that interlink entities between the datasets and produces a new dataset where each pair of linked entities is fused into a single entity. Fusion is performed for each pair of matched properties between two linked entities, according to a selected fusion action, and considers both spatial and non-spatial properties.
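
To make the idea of per-property fusion concrete, here is a minimal sketch in Python. It is not the FAGI-gis implementation: the action names (`keep_left`, `keep_right`, `keep_longest`), the `fuse_pair` helper and the sample entities are all hypothetical, and entities are simplified to plain dictionaries rather than RDF triples.

```python
from typing import Any, Callable, Dict

# A fusion action takes the left and right value and returns the fused value.
# All action names here are illustrative assumptions, not FAGI-gis identifiers.
FusionAction = Callable[[Any, Any], Any]


def keep_left(left_value, right_value):
    return left_value


def keep_right(left_value, right_value):
    return right_value


def keep_longest(left_value, right_value):
    # Prefer the more detailed literal; on a tie keep the left value.
    if left_value is None:
        return right_value
    if right_value is None:
        return left_value
    if len(str(left_value)) >= len(str(right_value)):
        return left_value
    return right_value


def fuse_pair(left: Dict[str, Any], right: Dict[str, Any],
              matches: Dict[str, str],
              actions: Dict[str, FusionAction]) -> Dict[str, Any]:
    """Fuse two linked entities into one.

    `matches` maps a property of `left` to its matched property in `right`;
    `actions` maps a property of `left` to the fusion action to apply.
    Only matched properties are fused in this sketch.
    """
    fused = {}
    for left_prop, right_prop in matches.items():
        action = actions.get(left_prop, keep_left)
        fused[left_prop] = action(left.get(left_prop), right.get(right_prop))
    return fused


# Hypothetical pair of linked entities with one non-spatial and one spatial property.
left = {"name": "Acropolis", "geometry": "POINT(23.7257 37.9715)"}
right = {
    "label": "Acropolis of Athens",
    "geom": "POLYGON((23.72 37.97, 23.73 37.97, 23.73 37.98, 23.72 37.97))",
}
matches = {"name": "label", "geometry": "geom"}
actions = {"name": keep_longest, "geometry": keep_right}

fused = fuse_pair(left, right, matches, actions)
print(fused)
```

Modelling each fusion action as a small function keeps the choice of action independent for every matched property, which mirrors the description above.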

The tool supports an interactive interface, offering visualization of the data at every part of the process. Especially for spatial data, it provides map previewing and graphical manipulation of geometries. Further, it provides advanced fusion functionality through batch mode fusion, clustering of links, link discovery/creation, property matching, property creation, etc.

As the first step of the fusion workflow, the tool allows the user to select and filter the interlinked entities (using the Classes they belong to or SPARQL queries) to be loaded for further fusion. Then, at the schema matching step, a semi-automatic process facilitates the mapping of entity properties from one dataset to the other. Finally, the fusion panel allows the map-based manipulation of geometries and the selection from a set of fusion actions in order to produce a final entity, where each pair of matched properties is fused according to the most suitable action.
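
As an illustration of what a semi-automatic schema matching step could look like, the sketch below proposes candidate property matches by comparing the local names of property IRIs, leaving the final decision to the user. Name similarity is our assumption for this example; the text above does not say how FAGI-gis computes its suggestions. The helpers `local_name` and `suggest_matches`, the threshold and the example IRIs are hypothetical.

```python
from difflib import SequenceMatcher


def local_name(iri: str) -> str:
    # Take the fragment or the last path segment of an IRI, lower-cased.
    return iri.rstrip("/#").rsplit("#", 1)[-1].rsplit("/", 1)[-1].lower()


def suggest_matches(left_props, right_props, threshold=0.6):
    """Suggest, for each left property, the most similar right property.

    Only suggestions whose similarity ratio reaches `threshold` are
    returned; the user would then confirm or reject them.
    """
    suggestions = {}
    for left_prop in left_props:
        scored = [
            (SequenceMatcher(None, local_name(left_prop), local_name(rp)).ratio(), rp)
            for rp in right_props
        ]
        if not scored:
            continue
        score, best = max(scored)
        if score >= threshold:
            suggestions[left_prop] = best
    return suggestions


# Hypothetical property IRIs from two datasets.
left_props = [
    "http://example.org/a#name",
    "http://example.org/a#address",
    "http://example.org/a#phone",
]
right_props = [
    "http://example.org/b/fullName",
    "http://example.org/b/addr",
]

suggestions = suggest_matches(left_props, right_props)
for left_prop, right_prop in suggestions.items():
    print(left_prop, "->", right_prop)
```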

The above process can be enriched with several advanced fusion facilities. The user is able to cluster the linked entities according to the way they are interlinked, so that different clusters of linked entities can be handled with different fusion actions. Moreover, the user can load unlinked entities and receive recommendations of candidate entities to interlink. Finally, by training on past fusion actions and on OpenStreetMap data, FAGI-gis is able to recommend suitable fusion actions and OSM Categories (Classes), respectively, for pairs of fused entities.

FAGI-gis is provided as free software and its current version is available from GitHub. An introductory user guide is also available. More detailed information on FAGI-gis is provided in the following documents:


GeoKnow at Semantics 2015, Vienna

Several partners of GeoKnow were present this year at the Semantics conference 2015.
The day before the conference, we organised a workshop about the work done in GeoKnow over the last three years.
At the conference, three papers with GeoKnow acknowledgement were presented:

  • Integrating custom index extensions into Virtuoso RDF store for E-Commerce applications, presented by Matthias Wauer,
  • An Optimization Approach for Load Balancing in Parallel Link Discovery, presented by Mohamed Ahmed Sherif, and
  • Data Licensing on the Cloud – Empirical Insights and Implications for Linked Data, presented by Ivan Emilov

Two posters were also presented in the poster sessions:

  • The GeoKnow Generator Workbench – An Integrated Tool Supporting the Linked Data Lifecycle for Enterprise Usage, and
  • RDF Editing on the Web

Moreover, the GeoKnow team demonstrated tools and the Workbench at the booth reserved for us. It was a nice experience and a good opportunity to share our work and to see other people's projects.


The 2nd Geospatial Linked Data Workshop

This week, the 2nd GeoLD workshop took place ahead of the Semantics conference 2015 in Vienna. Our invited speaker was Franz Knibbe from Geodan in the Netherlands. Franz is currently contributing to the Spatial Data on the Web Working Group, where people from OGC and W3C are trying to define the best ways to integrate geospatial data into the web of data. His talk was very inspiring. For instance, he described some of the spatial aspects that matter to both working groups, with data ranging from observations of the galaxy to microscopic skin structures. You can find out a little more about his talk on SlideShare.
The workshop continued with the presentation of three software tools for exploring geospatial data on the web. Facete is a faceted browser for geospatial data in RDF format that also allows editing the data. The second tool was ESTA-LD, which can be used for exploring statistical data represented using the Data Cube Vocabulary. The third, DEER, is a data extraction and enrichment framework that allows creating pipelines for analysing unstructured data, finding interlinks with other datasets, and extracting knowledge from the linked datasets in order to enrich the data.
We also presented the GeoKnow Generator demo, which integrates the presented tools and offers enterprise-ready features to support the use of such tools in companies. The usability of the GeoKnow tools was demonstrated in two more presentations. The Supply Management presentation showed how spatial data was integrated to improve information and decision making in the supply chain. Finally, the Tourism e-Commerce presentation showed how the integration of geospatial data is used to improve recommendations for motive-based user requests.


2nd edition of GeoLD Workshop at Semantics Conference

We are preparing the second edition of the Geospatial Linked Data Workshop, which will be held before the Semantics conference on 15 September 2015 in Vienna.

For the GeoLD workshop we have invited an active member of the Spatial Data on the Web Working Group, who will present the work carried out so far by this WG. The WG was created a year ago and brings together two major standards bodies, the Open Geospatial Consortium (OGC) and the W3C, with the objective of improving interoperability and integration of spatial data on the Web.

The rest of the presentations at the workshop are about useful tools for exploring geospatial data on the web and enriching data with geospatial features. These tools and complete use case scenarios will demonstrate the importance of integrating geospatial data to solve business questions.

A detailed agenda is available on the workshop website. You can register for free on the conference website.

DEER at ESWC 2015

GeoKnow was present at the Extended Semantic Web Conference (which took place in Portoroz, Slovenia) in many ways. In particular, we presented a novel approach for automating RDF dataset transformation and enrichment in the machine learning track of the main conference. The presentation was followed by a constructive discussion of results and insights, plus hints about further possible applications of the approach.

GeoKnow presented a semantic search approach with geospatial context at the KNOW@LOD workshop at ESWC

One of the key questions in the use of linked data is search. Search-driven applications are widespread, for example in the e-commerce industry or in business information systems. Hence, GeoKnow also aims to improve semantic search components, particularly for geospatial data. Within the GeoKnow consortium, the partner Unister, which is active as a B2C service provider, has been focusing on this topic in the last year of the project.
At the 12th instance of the Extended Semantic Web Conference (ESWC 2015), one result of this research was presented in the well-organized 4th Workshop on Knowledge Discovery and Data Mining meets Linked Open Data (KNOW@LOD). The presented paper is named “Computing Geo-Spatial Motives from Linked Data for Search-driven Applications”. In it, we consider geo-spatial motives within search queries (e.g., “winter holiday”, “cultural”, “sports activities”) that cannot be answered by a single data instance but require interpreting the available data to discover relevant regions (e.g., populated places having a sufficient number of cultural hotspots nearby).
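
The "populated places with enough cultural hotspots nearby" example can be sketched in a few lines of Python. This is only an illustration of the idea, not the method from the paper: the helper names, the radius, the minimum count and the coordinates are all assumptions, and great-circle distance is used as a simple stand-in for "nearby".

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    # Great-circle distance between two WGS84 points, in kilometres.
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def places_matching_motive(places, pois, motive, radius_km=10.0, min_count=3):
    """Return the places that have at least `min_count` points of interest
    tagged with `motive` within `radius_km` (both thresholds are assumed)."""
    matching = []
    for name, (lat, lon) in places.items():
        nearby = sum(
            1 for poi in pois
            if poi["motive"] == motive
            and haversine_km(lat, lon, poi["lat"], poi["lon"]) <= radius_km
        )
        if nearby >= min_count:
            matching.append(name)
    return matching


# Hypothetical sample data: two populated places and a few tagged hotspots.
places = {"Vienna": (48.2082, 16.3738), "Portoroz": (45.5144, 13.5906)}
pois = [
    {"motive": "cultural", "lat": 48.2038, "lon": 16.3618},
    {"motive": "cultural", "lat": 48.2102, "lon": 16.3660},
    {"motive": "cultural", "lat": 48.2030, "lon": 16.3690},
    {"motive": "sports", "lat": 48.2070, "lon": 16.4200},
    {"motive": "cultural", "lat": 45.5140, "lon": 13.5900},
]

cultural_places = places_matching_motive(places, pois, "cultural")
print(cultural_places)
```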

Know@LOD 2015 is Over!

The 4th Edition of Know@LOD was held at ESWC 2015 in Portoroz, Slovenia. The workshop was a success, featuring a highly inspiring keynote by Marko Grobelnik, a number of interesting research paper presentations, a very competitive Linked Data Mining Challenge, and a social dinner.

The best paper award goes to Mayank Kejriwal and Daniel P. Miranker: Sorted Neighborhood for Schema-free RDF Data.

The Linked Data Mining Challenge award goes to Suad Aldarra and Emir Muñoz: A Linked Data-Based Decision Tree Classifier to Review Movies.

Congratulations to the winners!

GeoKnow presented a paper at the WASABI workshop at ESWC

Data-driven processes are the focus of the International Workshop on Semantic Web Enterprise Adoption and Best Practice (WASABI 2015) of the 12th Extended Semantic Web Conference (ESWC 2015). At the 3rd instance of the workshop, Andreas Both presented, on behalf of the GeoKnow consortium, the GeoKnow Generator Workbench, which is a key outcome of our project.
The GeoKnow Generator is a stack of components that covers several steps in the Linked Data Lifecycle. These components have been integrated in the Generator Workbench. To demonstrate its usability, we also presented four different real-world use cases where the Generator and Workbench are used, dedicated to the verticals e-commerce, logistics, e-government and automotive industry.

GeoKnow at Belgrade Fair

The GeoKnow project was presented at the recently finished international science and technology fair in Belgrade, Serbia, the largest event of its kind in the region, held from May 11 through 15. The Institute Mihajlo Pupin team used the opportunity to present the goals and accomplishments of the project, as well as PUPIN’s own results, such as GEM, the mobile semantic geospatial browser, and ESTA-LD, the exploratory spatiotemporal analytics tool for Linked Data, both outcomes of the GeoKnow Work Package 4 efforts, to a wider audience.


OSMRec – A tool for automatic annotation of spatial entities in OpenStreetMap

GeoKnow has recently introduced OSMRec, a JOSM plugin for automatic annotation of spatial features (entities) in OpenStreetMap. OSMRec trains on existing OSM data and is able to recommend OSM categories to users, in order to annotate newly inserted spatial entities. This is important for two reasons. First, users may not be familiar with the OSM categories; thus searching and browsing the OSM category hierarchy to find appropriate categories for the entity they wish to insert may often be a time-consuming and frustrating process, to the point of users neglecting to add this information. Second, if an already existing category that matches the new entity cannot be found quickly and easily (although it exists), the user may resort instead to using his/her own term, resulting in synonyms that later need to be identified and dealt with.

The category recommendation process takes into account the similarity of the new spatial entities to already existing ones (annotated with categories) at several levels: spatial similarity, e.g. the number of nodes of the feature’s geometry; textual similarity, e.g. common important keywords in the names of the features; and semantic similarity, i.e. similarities in the categories that characterize already annotated entities. For each level (spatial, textual, semantic), we define and implement a series of training features that represent spatial entities in a multidimensional space. By training models on these features, we are able to correlate the values of the training features with the categories of the spatial entities and, consequently, recommend categories for new features. To this end, we apply multiclass SVM classification, using LIBLINEAR.
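
The following Python sketch shows how an entity could be turned into a feature vector that covers the three levels. It is a simplified illustration, not the OSMRec feature set: the node-count buckets, the keyword list, the category list and the `build_feature_vector` helper are all assumptions, and the classifier itself (multiclass SVM with LIBLINEAR) is not shown.

```python
def build_feature_vector(entity, keywords, known_categories,
                         node_buckets=(1, 5, 20, 100)):
    """Encode an entity as a binary vector: spatial + textual + semantic.

    `node_buckets`, `keywords` and `known_categories` are illustrative
    assumptions chosen for this sketch.
    """
    # Spatial level: one-hot encoding of the number of geometry nodes.
    node_count = len(entity["nodes"])
    spatial = [0] * (len(node_buckets) + 1)
    bucket_index = sum(1 for bound in node_buckets if node_count > bound)
    spatial[bucket_index] = 1

    # Textual level: which important keywords appear in the entity name.
    name_tokens = set(entity.get("name", "").lower().split())
    textual = [1 if keyword in name_tokens else 0 for keyword in keywords]

    # Semantic level: which known categories already annotate the entity.
    categories = set(entity.get("categories", ()))
    semantic = [1 if category in categories else 0
                for category in known_categories]

    return spatial + textual + semantic


# Hypothetical entity: a four-node polygon with a name and one category.
entity = {
    "nodes": [(23.72, 37.97), (23.73, 37.97), (23.73, 37.98), (23.72, 37.98)],
    "name": "Central Park Cafe",
    "categories": ["amenity"],
}
vector = build_feature_vector(
    entity,
    keywords=["cafe", "school"],
    known_categories=["amenity", "building"],
)
print(vector)
```

Vectors of this kind, paired with the categories of already annotated entities, would form the training set for the classifier.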

The following figure represents a screen of OSMRec within JOSM. The user can select an entity or draw a new entity on the map and ask for recommendations by clicking the “Add Recommendation” button. The recommendation panel opens and the plugin automatically loads the appropriate recommendation model that has previously been trained offline.

6. Recommendation panel

The recommendation panel provides a list with the top-10 recommended categories, and the user can select from this list and click “Add and continue”. As a result, the selected category is added to the OSM tags. Whenever the user adds a new tag to the selected object, a new vector is computed for that OSM instance in order to recalculate the predictions and display an updated list of recommendations (taking into account the previously selected categories/tags as extra training information). Further, OSMRec allows the user to combine several recommendation models, based on (a) a selected geographic area, (b) the user’s past editing history on OSM and (c) a combination of (a) and (b). This way, personalized category recommendations can be provided that take into account the user’s editing history and/or the specific characteristics of a geographic area of OSM.
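
One simple way to picture option (c) is a weighted average of the scores produced by the two models, followed by a top-10 selection. The sketch below assumes this weighting scheme purely for illustration; the text above does not specify how OSMRec combines models, and `combine_scores`, `top_k`, the weight and the scores are hypothetical. Excluding already selected categories stands in for the full recalculation described above, which recomputes the feature vector.

```python
import heapq


def combine_scores(area_scores, history_scores, area_weight=0.5):
    # Weighted average of two per-category score dictionaries (assumed scheme).
    categories = set(area_scores) | set(history_scores)
    return {
        category: area_weight * area_scores.get(category, 0.0)
        + (1.0 - area_weight) * history_scores.get(category, 0.0)
        for category in categories
    }


def top_k(scores, k=10, exclude=()):
    # Highest-scoring categories first, skipping those already selected.
    candidates = (
        (score, category)
        for category, score in scores.items()
        if category not in exclude
    )
    return [category for _, category in heapq.nlargest(k, candidates)]


# Hypothetical scores from an area-based and a history-based model.
area_scores = {"cafe": 0.8, "restaurant": 0.6, "school": 0.1}
history_scores = {"restaurant": 0.9, "bar": 0.4}

combined = combine_scores(area_scores, history_scores)
recommendations = top_k(combined, k=2)
print(recommendations)

# After the user adds "restaurant", it is left out of the next list.
updated = top_k(combined, k=2, exclude={"restaurant"})
print(updated)
```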

The OSMRec plugin can be downloaded and installed in JOSM following the standard procedure. Detailed implementation information can be found in the following documents: