Connecting Hundreds of Oceanographic Data Sources from 35 Countries in and around Europe into a Big Data Network

Monday, 15 December 2014: 5:45 PM
Taco De Bruin, Royal Netherlands Institute for Sea Research, Den Burg, 1790, Netherlands
Until some years ago, oceanographic data in Europe were scattered across dozens of data centres and hundreds of government agencies, research institutes and university groups. As a consequence, it was impossible to get an overview of the available data and very difficult to get access to them. Even if one managed to get access, the data were difficult to use because they came in different formats and were of varying quality.

All these issues were successfully addressed in a series of projects culminating in the current SeaDataNet project. The resulting operational SeaDataNet infrastructure now connects hundreds of data sources from 35 countries in and around Europe. It was designed to provide a central overview of available data as well as direct access to a distributed system of online data sources. When data are transferred to the user, they are converted into a standard format chosen by the user. Through a joint activity with the MyOcean project, all data within the SeaDataNet infrastructure have been quality controlled.

The SeaDataNet infrastructure now offers many possibilities in support of Europe’s ‘blue economy’. It forms the backbone for the projects under the umbrella of the European Marine Observation and Data Network, or EMODNet. The EMODNet projects produce a series of data products on all aspects of the European marine environment. The data for these data products are provided through the SeaDataNet infrastructure.

This presentation describes the principles of the SeaDataNet infrastructure and how it connects those hundreds of data sources in Europe. It will explain the strong link with the MyOcean project, as well as the differences from it, and will go on to introduce the EMODNet programme and its projects as an example of what can be done once uniform access to a distributed system of hundreds of data sources has been achieved. Finally, the presentation will address the long-term sustainability of the chosen approach.