
EDF2015 and Linked Data Europe: Big Geospatial Data Workshop

In 2015, the European Data Forum took place in Luxembourg on 16 and 17 November. The GeoKnow team had the pleasure of being present at the event with a booth showing GeoKnow results. The conference welcomed over 700 participants from industry, research, policy-making and community initiatives from all over Europe.

Our representatives at EDF2015

The day after the conference, we participated in the Linked Data Europe Workshop, which was organized by the IQmulus, GeoKnow, LEO and MELODIES teams. Jens Lehmann of the University of Leipzig and Jonas Schulz from Ontos AG demonstrated our GeoKnow workbench, talked about the tools in our Linked Data Stack and gained insights into other projects working on Linked Geo Data and Big Data. In total, 10 projects were presented, and the workshop ended with an informative discussion about Linked Geo Data tools replacing or extending existing GIS solutions.

Thanks to everyone who organized the EDF and the workshop.

Third GeoKnow Plenary Meeting in Leipzig

Unister and Brox hosted the 3rd GeoKnow Plenary Meeting in the beautiful city of Leipzig. On 30th June and 1st July, the project partners gathered to discuss the progress in the work packages and the status of the demonstrators. The project partners are developing tools to help Web users, companies and organisations find and exploit geospatial data. The primary use cases are tourism e-commerce and supply chain management, but the tools can be applied in many more scenarios.

Travel back in time to some warm summer weather…
[Group photo]

GeoKnow Plenary Meeting Belgrade

The GeoKnow team meets in Belgrade for the plenary meeting. During the two days, the team discusses the achievements since the Year 1 review meeting. Besides the ongoing improvement of the various tools, the team discusses benchmarking and quality assessment. Benchmarking focuses on the Virtuoso store, Facete and Mappify, LIMES, FAGI, GeoLift and TripleGeo. Results of the benchmarks will be published on https://github.com/GeoKnow/GeoBenchLab.

On the second day, in the break-out sessions, each work package was thoroughly discussed and next steps were defined. Some of the findings are:
- Dashboard requirements and batch processing
- Parallelisation of LIMES process
- Notification and subscription service
- Mobile version for smart phones and tablets
- More free datasets that can be used for the use cases


GeoKnow plenary meeting (29.01.2014) in Luxembourg after the Year 1 Review

The GeoKnow project just went through the Year 1 Review, and straight after that meeting the consortium met for the plenary meeting on 29 January 2014 to plan the activities within the work packages for the second year. Below is a high-level overview of the tasks that we will tackle in the upcoming months.

Work Package 1: One of the tasks will be to work on an authentication system that will connect the workbench with the underlying components. The initial version will be graph-based and tightly connected to the Virtuoso Graphs Security function. A first version should be ready by the end of May 2014. In an extended version we will add multiple login options such as G+, WebID etc. Another important issue is scalability, which means working on benchmarks such as the slippy benchmark and the BSBM BI load and harmonising them with the LDBC. Within the GeoKnow Workbench we will develop a dashboard and a simple workflow system that will allow the specification of interlinked processes that are executed in the background, and will provide monitoring of those processes.

Work Package 2: The team will work on improving algorithms related to geospatial dimensions that will support queries in relation to the different use cases. Most of the improvement will be within Task 2.6. InfAI and Athena will elaborate on enhancements for the TripleGeo module, including topics such as schema mappings and better support for the transformation process. Those activities are mostly a part of Task 2.7 and will also help to improve the scalability of the GeoKnow system.

Work Package 3: A high workload is related to task 3.2 and the integration of tools such as FAGI, Facete and LIMES inside the GeoKnow Workbench. The integration work consists of two parts. The first mainly concerns tight technology integration and the second takes care of user interactions. Within task 3.3 major work will be devoted to improving the GeoLift module and adding new datasets to the spatial mapping process. This will help to enrich the RDF datasets and create a larger knowledge base. Precision and comparison of geospatial features, such as metrics for polygons, are the core challenges within task 3.4 that have to be solved and implemented. Based on those new implementations, the link discovery process within LIMES has to be adapted and evaluated.

Work Package 4: Within WP4 the focus is on visualisation and interaction with the users. IMP will add components that support mobile devices, and a major step is to fully integrate the Facete and Mappify modules inside the workbench. An improvement in filtering and selection will help users to navigate faster to the content. Furthermore, we have to solve the integration challenge of CubeViz and Facete. The above modules have to satisfy the requirements of knowledge engineers but also of end users, who will use simplified UI versions to navigate and explore geospatial data content in the browser or on a mobile device.

Work Package 5: The main objective is to present, at the next plenary meeting, a supply chain scenario that already uses large portions of the GeoKnow workbench, and to obtain feedback from a real use case.

Work Package 6: The real use case from the travel industry will run a full benchmark, including the use of the workbench for the transformation process with external and internal datasets. This task will also include a test in which visualisation components are deployed to the end-user portals. Such a benchmark will provide feedback on scalability and handling, and therefore deep insight into functional requirements, architectural issues and performance.

Work Package 7: Part of the dissemination work is to launch the GeoKnow workbench at two major events in March 2014 and attract a community that will start to use and test the system. Based on this we will be able to collect additional information that will help to tune the system. We will also provide documentation and tutorials on the Web page and continue to closely work with other funded projects.

Work Package 8: InfAI will continue the excellent work on managing the project, collecting and submitting the reports on time and hosting an advisory board telco. The next plenary meeting is set to take place in June 2014 and will most likely be organised by IMP in Belgrade.

Summary and Outlook
There are many small challenges that have to be solved but the major objective is to improve the overall integration of tools, run benchmarks with real use cases and work on scalability.

GeoKnow First Year Benchmark Results

A GeoKnow Benchmarking laboratory has been set up for comparison of benchmark results over the duration of the GeoKnow project. In its current form, it consists of the original LOD2 project GeoBench program, which has been taken over and adapted for use in the GeoKnow project and is available from the GeoBenchLab Git repository.

The current improvements in the GeoBench program are primarily related to the expansion of the benchmark, in order to make it employable not only to RDF data, but to relational data as well. This will open opportunities for a performance comparison between RDF and relational spatial data management systems.

Below are some comparison results using the planet-wide OpenStreetMap (OSM) datasets hosted in both Virtuoso and PostGIS.

[Screenshot: benchmark comparison results, 2014-01-20]

In summary, the results demonstrate that Virtuoso, in both SQL and SPARQL, outperformed PostGIS by a significant factor. Specifically, all the queries in the power run were executed 33 times slower in PostGIS than in Virtuoso SQL (single server). If we compare PostGIS with Virtuoso SPARQL, the factor is even greater: 131 for low zoom level queries, 23 for high zoom level queries, and 113 in total. If we compare Virtuoso SPARQL and SQL (single server), we find that the relational version is almost 4 times slower on low zoom level queries, while it is 23% faster on high zoom level queries. In total, the SQL version is more than 3 times slower.
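
As a quick consistency check on the totals quoted above: if PostGIS is 33 times slower than Virtuoso SQL and 113 times slower than Virtuoso SPARQL on the same workload, dividing one factor by the other gives the implied slowdown of Virtuoso SQL relative to Virtuoso SPARQL. The sketch below only redoes that arithmetic with the rounded figures from the text. It assumes both factors refer to the same total run, and the function and variable names are illustrative, not part of GeoBench.

```python
# Slowdown factors of PostGIS quoted in the text (rounded totals).
POSTGIS_VS_VIRTUOSO_SQL = 33      # PostGIS is 33x slower than Virtuoso SQL
POSTGIS_VS_VIRTUOSO_SPARQL = 113  # PostGIS is 113x slower than Virtuoso SPARQL


def implied_slowdown(baseline_vs_a: float, baseline_vs_b: float) -> float:
    """How many times slower system A is than system B.

    If the baseline takes baseline_vs_a * t_a and also baseline_vs_b * t_b,
    then t_a / t_b = baseline_vs_b / baseline_vs_a.
    """
    return baseline_vs_b / baseline_vs_a


sql_vs_sparql = implied_slowdown(POSTGIS_VS_VIRTUOSO_SQL, POSTGIS_VS_VIRTUOSO_SPARQL)
print(round(sql_vs_sparql, 2))  # 3.42
```

The result of roughly 3.4 agrees with the statement that the SQL version is more than 3 times slower in total.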

The full descriptions and results of these benchmarks can be found in the GeoKnow D1.3.2 - Continuous Report on Performance Evaluation deliverable.

GeoKnow at ISWC 2013: Geospatial Data Integration, Enrichment, Quality, Federated Querying and Winning the Big Data Prize


Members of InfAI's GeoKnow team attended ISWC 2013 in Sydney, Australia, where they presented several papers pertaining to the project. The first talk centered on ORCHID, a scalable link discovery approach for geospatial resources that deals with WKT data. ORCHID was shown to outperform the state of the art w.r.t. scalability and is now an integral part of the link discovery framework LIMES. Furthermore, an approach to semi-automatically improve the schema of knowledge bases was integrated into DL-Learner and shown at the conference. This helps knowledge engineers structure their data more easily, a problem which is often perceived as a bottleneck in achieving the Semantic Web vision. A paper on another timely topic, data quality assurance via crowdsourcing, was also accepted and demonstrated at the conference. In the fourth talk, the GeoKnow team presented DAW, a duplicate-aware solution for federated SPARQL queries. DAW can be used in combination with any federated SPARQL query engine to optimize the number of sources it selects, thus reducing the overall network traffic as well as the query execution time of existing engines. More information on DAW can be found here. We are also very happy that GeoKnow won the Big Data Prize at the Semantic Web Challenge. The paper can be found here and the demo is here.
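
To give a feel for what duplicate-aware source selection means, here is a minimal sketch of the general idea. It is not DAW's actual algorithm. It assumes the result set of each source is already known exactly, which a real federated engine would have to estimate before querying, and it greedily skips sources that would contribute nothing new. The function name, endpoint names and result identifiers are all hypothetical.

```python
def select_sources(source_results, min_new_results=1):
    """Greedily pick sources that still contribute unseen results.

    source_results maps a source name to the set of results it would return.
    A source is skipped when it adds fewer than min_new_results new results.
    """
    selected = []
    covered = set()
    remaining = dict(source_results)
    while remaining:
        # Pick the source with the largest number of not-yet-covered results;
        # ties are broken alphabetically so the outcome is deterministic.
        best = min(remaining, key=lambda name: (-len(remaining[name] - covered), name))
        gain = remaining.pop(best) - covered
        if len(gain) < min_new_results:
            break  # every remaining source adds even less, so stop here
        selected.append(best)
        covered |= gain
    return selected


# Hypothetical endpoints with overlapping results.
sources = {
    "endpoint_a": {"r1", "r2", "r3", "r4"},
    "endpoint_b": {"r3", "r4"},
    "endpoint_c": {"r4", "r5"},
}
print(select_sources(sources))  # ['endpoint_a', 'endpoint_c']
```

In this toy example endpoint_b is never contacted because everything it holds is already covered, which is the kind of saving in network traffic the paragraph above describes.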

How can GeoKnow help you?

At this stage of the GeoKnow project, we are shaping our dissemination and exploitation strategy. In order to stay as close as possible to the needs of our potential users, we have created a survey to find out how we can help your business most.

If you work with geospatial data, your participation would be greatly appreciated. The survey will take no longer than 10 minutes to complete. There is a little incentive as well: If you leave your email address, you will be automatically entered to win one of three Amazon vouchers worth 50 Euro.

Please click on this link to take part: GeoKnow exploitation plan survey

A Survey for Geospatial Data Users

Many different applications we deal with on a daily basis have some kind of geographic dimension. This geospatial information is normally required for decision making at different levels. However, this information is dispersed among a multiplicity of sources. At GeoKnow we aim to make information seeking easier by allowing exploration, editing and interlinking of heterogeneous information sources with a spatial dimension.

Now we are interested in getting to know the people that face these kinds of issues in their everyday work. We have created a survey to help us to understand and to hear more about their experience with geospatial data. This survey targets geospatial data consumers and providers, and GIS users interested in having an integrated web of geospatial data.

If you use geospatial data in your work, your contribution in this survey will be highly appreciated. The outcome of this survey will impact the use cases and requirements for the GeoKnow project, which aims to create a versatile software framework to rapidly generate spatial semantic web applications.

We are offering a 20 euro Amazon voucher for each of the first 50 completed surveys. Willing to participate? Please go right away to:

The GeoKnow Survey

We value your participation!

GeoKnow consortium works on making the Web an exploratory place for geospatial data

EU, research and industry partners are working together to add spatial dimensions to the Web in order to improve the search, reuse and interlinking of data.

Leipzig, December 17, 2012: GeoKnow, an EU FP7 funded project, recently started according to schedule in December 2012. Its goal is to research geospatial data, in particular, the integration and linking of such data from different domains, scalable reasoning over billions of geographic features within the Linked Data Web and the efficient crowd-sourcing and collaborative authoring of geographic information.

Nowadays, many applications have a geographic dimension. Map services such as Google Maps, Yahoo Maps and Microsoft Live Maps display locations of shops and reviews of customers. Yet it is not possible to link geographic locations to data sets with semantic information such as the types of products offered in the shops. Hence, the type of queries in those services is very limited: it is, for example, not possible to ask for nearby shops offering a certain type of product with opening hours after working time. The information required to answer such queries is available but dispersed among a multiplicity of information sources like isolated Geographic Information Systems, enterprise warehouses, proprietary data formats such as Excel sheets or simple web pages.
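
Once location, product and opening-hours information sit in one place, the example query above becomes straightforward. The following minimal sketch illustrates this on in-memory data. It is not a GeoKnow tool: the `Shop` record, the sample shops and the `nearby_shops` function are hypothetical, and distance is computed with the standard haversine formula.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Shop:
    name: str
    lat: float
    lon: float
    products: frozenset
    closes_at: int  # closing hour on a 24h clock


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two WGS84 points."""
    earth_radius_km = 6371.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_km * asin(sqrt(a))


def nearby_shops(shops, lat, lon, product, max_km, open_after_hour):
    """Shops within max_km that sell product and close later than open_after_hour."""
    hits = [
        s for s in shops
        if product in s.products
        and s.closes_at > open_after_hour
        and haversine_km(lat, lon, s.lat, s.lon) <= max_km
    ]
    # Nearest shop first.
    return sorted(hits, key=lambda s: haversine_km(lat, lon, s.lat, s.lon))


# Made-up sample data around Leipzig city centre.
shops = [
    Shop("Shop A", 51.3400, 12.3750, frozenset({"bike tubes"}), 20),
    Shop("Shop B", 51.3450, 12.3800, frozenset({"bike tubes"}), 17),  # closes too early
    Shop("Shop C", 51.0500, 13.7400, frozenset({"bike tubes"}), 22),  # too far away
    Shop("Shop D", 51.3390, 12.3720, frozenset({"books"}), 21),       # wrong product
]
result = nearby_shops(shops, 51.3397, 12.3731, "bike tubes", max_km=5, open_after_hour=18)
print([s.name for s in result])  # ['Shop A']
```

The hard part in practice is not the filter itself but getting the three kinds of information out of separate sources and linked to the same place, which is the gap the project addresses.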

GeoKnow’s aim is to simplify information seeking by allowing exploration, editing and interlinking of heterogeneous sources with spatial dimensions. The research project will develop open source tools which will help users, companies and government organizations to expose and utilize structured geospatial information on the Web. Thus, the project addresses various stakeholders who will benefit from the research project. Use cases from supply chain management and e-commerce (travel industry) will supplement the research. One exemplary aim within the supply chain use case is to provide a unified spatial view on parts of a logistic process. To achieve this target, information with geographical reference points will be connected to the Data Web, allowing for a better observation of the information flow to provide better analytics and to improve decision making processes. In the e-commerce use case, travel industry users will benefit from more background information, enriched content and sophisticated spatial search functionalities.

The GeoKnow consortium is going to meet for the kick-off meeting in Leipzig on January 16th and 17th and is planning to inform the community about the results using current channels such as LinkedIn, Google+, Facebook, Twitter and the GeoKnow web page.

The GeoKnow consortium. Six partners from four countries are working towards the ambitious project goal. The consortium brings together leaders from industry and research who will work on the EU-funded project over the next three years. The project coordinator is InfAI (Institut für Angewandte Informatik) at the University of Leipzig (Germany); the other partners are Athena Research and Innovation Center (Greece), OpenLink Software (United Kingdom), Unister (Germany), Brox (Germany) and Ontos (Switzerland).

More information

About the project: http://geoknow.eu

About InfAI: http://infai.org

What is GeoKnow about?

Spatial dimensions of information have high relevance for everyday problems. A typical example is knowing the locations of the closest stores which have a specific product in stock and are currently open. This geographic information is normally available, but dispersed among a multiplicity of information sources such as isolated Geographic Information Systems, enterprise warehouses, proprietary data formats such as Excel sheets or simple web pages.

The aim of the GeoKnow project is to make information seeking easier by allowing exploration, editing and interlinking of heterogeneous information sources with a spatial dimension. Complex scenarios such as the logistical status of a product within a supply chain and data warehouses of e-commerce systems are also dealt with in the GeoKnow project.

GeoKnow aims to contribute to the following areas concerned with geospatial data:

  • Creation and maintenance of high-quality geospatial information from existing unstructured data such as OpenStreetMap, GeoNames and Wikipedia. GeoKnow will develop quality assessment methods which anticipate a geospatial search and the acquisition and aggregation of information resources.
  • Reuse and exploitation of unforeseen discoveries found in geospatial data. GeoKnow will provide methods to acquire, analyse and categorise data that is rapidly evolving, immense, incomplete and potentially conflicting. This will be achieved with:
    • Tools and methodologies for mapping and exposing existing structured geospatial information on the web of data, considering comprehensive and high-quality ontologies and efficient spatial indexing.
    • Automatic fusing and aggregation of geospatial data by developing algorithms and services based on machine learning, pattern recognition and heuristics.
    • Tools for exploring, searching, authoring and curating the Spatial Data Web by using Web 2.0 and machine learning techniques based on scalable spatial knowledge stores.

All these contributions are integrated in the open source GeoKnow Generator framework developed by the consortium. This framework allows the rapid creation of spatial semantic web applications by:

  • integrating geospatial reasoning tools,
  • processing billions of geospatial information sets,
  • supporting spatial-semantic browsing,
  • enabling the combination of datasets from the LOD cloud and private data with a geographic dimension.

The GeoKnow Generator will provide a comprehensive toolset of easy-to-use applications covering a range of possible usage scenarios (e.g. mobility/traffic, energy/water, culture, etc.). It will be used by Unister as a spatial-semantic travel e-commerce data management tool, and by Brox as a spatial-semantic collaboration and data integration tool along value chains in supplier and customer networks. The relevance of these scenarios will be described in another post.