Challenge 4: SemaGrow

The opportunities to create added value by combining and cross-indexing heterogeneous data at a large scale are increasing. SemaGrow is developing the scalable, efficient, and robust data services needed to take full advantage of these opportunities and boost the real-time performance of global agricultural data infrastructures. With support from experts from the SemaGrow project, you’ll be putting SemaGrow’s tools and technologies to the test in a realistic usage scenario.

The problem to be addressed

In recent years the trend to open up data and provide them freely on the Internet has intensified, creating opportunities to generate added value by combining and cross-indexing heterogeneous data at a large scale. But most of the low-hanging fruit has been picked, and it is time to move on to the next step: combining, cross-indexing and, in general, making the most of all public data, regardless of their schema, size, and update rate. In other words, a new kind of infrastructure is needed that, besides being efficient, real-time responsive and scalable, is also flexible and robust enough to allow data providers to publish in the manner and form that best suits their processes and purposes, and data consumers to query in the manner and form that best suits theirs. SemaGrow addresses these challenges by developing a distributed infrastructure layer on top of existing data repositories and networks that will support the interoperable and transparent application of data-intensive techniques over heterogeneous data sources.

This track will offer both hands-on hacking and a focused technical workshop, showcasing and demonstrating the technology developed in the SemaGrow project for software developers and for researchers working on similar technologies and agri-food data sets. You can experiment with the components and tools of SemaGrow and show what can actually be done with big open agricultural data that impacts end users (for example: add your big open agricultural data to SemaGrow, set up your own instance of the SemaGrow stack, align vocabularies, and add data to a central SemaGrow stack).

You will work in an interdisciplinary team, with support from experts from the SemaGrow project, putting SemaGrow’s tools and technologies to the test in a realistic usage scenario.

Preferred: Working knowledge of data and of the use of REST-based APIs and/or the RDF query language SPARQL.
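As a rough orientation for participants new to SPARQL over HTTP, the sketch below builds a standard SPARQL 1.1 Protocol GET request in Python. The endpoint URL and the generic query are assumptions made for illustration only; they are not taken from the SemaGrow documentation, so substitute the endpoint and vocabulary of the instance you are actually working with.

```python
from urllib.parse import urlencode, urlsplit, parse_qs
from urllib.request import Request

# Hypothetical endpoint URL, used only for illustration.
ENDPOINT = "http://localhost:8080/sparql"

# A generic query that works on any RDF data: fetch ten arbitrary triples.
QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10
"""


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build a SPARQL 1.1 Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, QUERY)

# To run the query against a live endpoint (not executed here):
#   import json
#   from urllib.request import urlopen
#   with urlopen(request) as response:
#       results = json.load(response)
#   for row in results["results"]["bindings"]:
#       print(row)
```

The request is only constructed, not sent, so the snippet can be tried without network access. Uncomment the last lines once you have a reachable endpoint.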

Organizers:

  • Alterra, Wageningen UR
  • National Center For Scientific Research, Athens
  • Agro-Know, Athens
  • Semantic Web Company, Vienna

Data sources (open government data, description in Dutch)

The SemaGrow stack integrates:

  • Novel indexing algorithms that support efficient storage and retrieval under the SemaGrow infrastructure.
  • An extension of state-of-the-art query decomposition and rewriting methods.
  • A variety of state-of-the-art schema alignment methods, integrated under a novel architecture.
  • A toolkit and best-practice guides for both data providers and consumers using the distributed infrastructure layer.

Inspiration and relevant links

Project web-site: http://semagrow.eu

SemaGrow is an FP7-funded project that envisions the next level of distributed querying over linked data. SemaGrow is developing scalable, efficient and robust data services based on linked data, poised to reshape the way that data analysis techniques are applied to the heterogeneous data cloud.

SemaGrow is organised around a number of real-world agricultural resource management use cases. This area is a good example of a situation where data-intensive analysis needs to combine information from different, large-scale sources that are actively maintained in incompatible schemata. The agricultural domain covers a wide range of subjects, from plant science and horticulture, to agricultural engineering, to agricultural economics. These subjects are extensively researched by scientists all over the world, who consume as well as produce an enormous volume of data. Agricultural scientists, just like their colleagues in other disciplines, are inundated with data and reported results relevant to their research.

Contact challenger (name, email, Skype, mobile)

Rob Knapen <rob.knapen@wur.nl>
Sr. Research Software Engineer
Alterra, Wageningen UR
Team Earth Informatics
+31 (0)317-481634

Christian Blaschke, Project Manager, c.blaschke@semantic-web.at
Semantic Web Company GmbH, http://www.semantic-web.at/
Mariahilferstrasse 70 / Neubaugasse 1, Top 8
A – 1070 Wien, Austria
Tel +43.1.402 12 35 – 32

Charalampos Thanopoulos (Agro-Know / SemaGrow)

NCSR – Giannis Mouchakis