The SDI for a Drought Observatory
A coherent workflow, from data acquisition to information delivery
The Spatial Data Infrastructure (SDI) built to support the Drought Observatory (DO) is based on two paradigms: Open Innovation and FAIR.
The Open Innovation concept rests on three pillars:
- Open Source
- Open Data
- Open Access
FAIR is the acronym describing how information should be: Findable, Accessible, Interoperable, and Reusable.
Moreover, the SDI responds to some fundamental requirements: research data openness, interoperability, flexibility, scalability, responsiveness, and attention to specific user needs and skills.
The technological components of the DO SDI are organized in a typical client-server architecture. They interact along the whole chain, from the download of data from the providers to the presentation of results to end users, following general OGC guidelines.
Data Cube
The geospatial Data Cube multi-dimensional approach allows the ingestion, storage, access, analysis, and use of large amounts of data elements inherently ordered according to shared attributes, one of which has to be their geospatial location (Strobl et al., 2017).
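As a rough illustration of the Data Cube idea, the sketch below keeps small rasters in a structure indexed by variable, date, row, and column, and then reads it along the time axis or along the spatial axes. The variable name, dates, and values are invented for the example, and the helper functions are hypothetical; a production cube would live in a database rather than in memory.

```python
from datetime import date

# Toy cube: variable -> date -> 2D grid (rows = latitude, columns = longitude).
# All values are invented for illustration.
cube = {
    "NDVI": {
        date(2020, 1, 1): [[0.21, 0.35], [0.40, 0.18]],
        date(2020, 1, 17): [[0.25, 0.33], [0.42, 0.20]],
        date(2020, 2, 2): [[0.30, 0.31], [0.47, 0.22]],
    },
}


def time_series(cube, variable, row, col):
    """Return the (date, value) series of one pixel, ordered by date."""
    layers = cube[variable]
    return [(d, layers[d][row][col]) for d in sorted(layers)]


def spatial_slice(cube, variable, day):
    """Return the full 2D grid of one variable on one date."""
    return cube[variable][day]


series = time_series(cube, "NDVI", row=1, col=0)
print([value for _, value in series])
```

The same data can be read as a pixel history or as a map simply by fixing different dimensions, which is what makes the shared ordering by location and time useful.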
OGC Standards
OGC (Open Geospatial Consortium) standards are used in several elements of the DO SDI, from the data model, designed using the Unified Modeling Language (UML) (ISO/TC 211), to the PostGIS open source software.
Data Model
The data model is developed following a participatory approach among the researchers involved in data collection and analysis, in order to implement the application schema.
Platform Interoperability
To ensure platform interoperability between geospatial data and services, the general SDI architecture considers three main services: catalog service, data service, and processing service.
The design of the DO SDI
Open source software and a layered architecture to optimize the DO monitoring framework
- Providers Layer (A): retrieving input data
- Drought Framework Layer (B): managing metadata and processing stored data
- Client-side Layer (C): results dissemination
All three layers communicate through specific Representational State Transfer (REST) web services, following the Service-Oriented Architecture (SOA) paradigm.
The REST paradigm, even if only marginally considered in the implementation of OGC standards (e.g. for WMTS), is preferred to the Simple Object Access Protocol (SOAP) because it is lightweight and less complex for users to manage on the client side.
Furthermore, RESTful Web Services provide data extraction and download functions in an effective and highly flexible way.
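To give an idea of how lightweight such a call can be on the client side, the sketch below composes a plain HTTP GET URL that asks for a dataset clipped to an area of interest. The base URL, the path layout, and the parameter names (geom, from, to, format) are assumptions made for the example and are not the documented DO SDI API; the area is passed as a WKT polygon.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical base URL and parameter names; the real DO SDI API may differ.
BASE_URL = "https://example.org/do-sdi/api"


def build_extract_url(dataset, wkt_geometry, start, end, fmt="geotiff"):
    """Compose a GET URL asking for `dataset` clipped to a WKT geometry."""
    query = urlencode({
        "geom": wkt_geometry,
        "from": start,
        "to": end,
        "format": fmt,
    })
    return f"{BASE_URL}/{dataset}/extract?{query}"


area = "POLYGON((10 43, 12 43, 12 44, 10 44, 10 43))"
url = build_extract_url("spi", area, "2020-01-01", "2020-03-31")
print(url)
```

Because the whole request is a single URL, it can be issued from a browser, a script, or a desktop GIS without any SOAP envelope handling.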
The Providers Layer is in charge of managing input data coming from different sources (CHIRPS rainfall, and MODIS LST, NDVI, EVI) and storing them in the Geodatabase (GeoDB), implemented in the Framework Layer. The OGC data formats currently supported by the DO SDI are NetCDF for the input CHIRPS rainfall dataset and Well-Known Text (WKT) for the vectors used by the extraction and processing functions.
MODIS data are in the Hierarchical Data Format - Earth Observing System (HDF-EOS) format, an approved standard recommended for use in NASA Earth Science Data Systems.
Specific Bash scripts have been developed to download and prepare input data before saving them into the GeoDB.
Reprojection, tiling, and storing functions of the Geospatial Data Abstraction Library (GDAL) and PostGIS are used to improve GeoDB performance and to harmonize the datasets with the data model.
All datasets are reprojected into a common and widely used reference system, EPSG:4326 (i.e. lat/long, WGS84). The same Bash scripts call the RESTful Web Services supplied by the Framework Layer to store the datasets into the GeoDB.
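In the DO SDI the reprojection is delegated to GDAL and PostGIS. Purely to show what such a step involves, the sketch below converts coordinates from the sinusoidal grid in which MODIS land products are commonly distributed to longitude and latitude in degrees. It assumes the input is in that sinusoidal grid (the text above does not state the source projection) and treats coordinates on the MODIS sphere directly as WGS84 values, which is a simplification.

```python
import math

# Radius of the sphere used by the MODIS sinusoidal grid, in metres.
MODIS_SPHERE_RADIUS = 6371007.181


def sinusoidal_to_lonlat(x, y, radius=MODIS_SPHERE_RADIUS):
    """Convert sinusoidal (x, y) in metres to (lon, lat) in degrees."""
    lat = y / radius
    lon = x / (radius * math.cos(lat))
    return math.degrees(lon), math.degrees(lat)


def lonlat_to_sinusoidal(lon_deg, lat_deg, radius=MODIS_SPHERE_RADIUS):
    """Forward projection, used here only to check the round trip."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    return radius * lon * math.cos(lat), radius * lat


x, y = lonlat_to_sinusoidal(11.25, 43.77)
lon, lat = sinusoidal_to_lonlat(x, y)
print(round(lon, 6), round(lat, 6))
```

A real pipeline would also resample the raster onto a regular grid in the target system, which is the part GDAL and PostGIS take care of.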
Continuous updating of the input data is ensured by the cron daemon (through a crontab file), which launches the scripts automatically.
The Drought Framework Layer (B) is the main component of the DO SDI architecture, in which the PostgreSQL Data Cube represents the only environment for data storage and geoprocessing.
At this implementation stage, geoprocessing queries do not completely follow the OGC WPS specifications, since they work with REST Web Services instead of SOAP. All the services, which allow the storage of new data as well as its retrieval and processing, are developed locally using the REST paradigm and are called through simple HTTP GET and POST requests. The PostgreSQL Data Cube is used to store input data (rainfall, LST, NDVI, EVI), to perform all geoprocessing procedures (queries, index calculations, statistical operations, etc.), and to generate intermediate data (LSTmin, LSTmax, NDVImin, NDVImax, EVImin, EVImax) and output images (SPI, TCI, VCI, E-VCI, VHI, E-VHI) in different formats, i.e. GeoTIFF, PNG, and ASCII Grid.
Although all the indices are calculated inside PostgreSQL, the different computational complexity of the vegetation and rainfall indices has made it necessary to use different libraries. TCI and VCI, in fact, result from simple arithmetic operations that can be done directly in PL/pgSQL using the PostGIS library, taking advantage of its features.
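The arithmetic behind these indices can be sketched in a few lines. The example below uses the commonly cited definitions of VCI and TCI, plus VHI as their weighted mean with an assumed weight of 0.5; it is written in Python on small lists of lists for readability and is not the PL/pgSQL code used in the DO SDI. Function names and sample values are hypothetical.

```python
def vci(ndvi, ndvi_min, ndvi_max):
    """Vegetation Condition Index, in percent."""
    return 100.0 * (ndvi - ndvi_min) / (ndvi_max - ndvi_min)


def tci(lst, lst_min, lst_max):
    """Temperature Condition Index, in percent (hotter means lower)."""
    return 100.0 * (lst_max - lst) / (lst_max - lst_min)


def vhi(vci_value, tci_value, weight=0.5):
    """Vegetation Health Index as a weighted mean of VCI and TCI."""
    return weight * vci_value + (1.0 - weight) * tci_value


def apply_to_grid(func, *grids):
    """Apply a per-pixel function to same-shaped 2D grids (lists of lists)."""
    return [[func(*cells) for cells in zip(*rows)] for rows in zip(*grids)]


# Invented 1 x 2 rasters: current value, historical minimum, historical maximum.
ndvi, ndvi_min, ndvi_max = [[0.625, 0.5]], [[0.25, 0.25]], [[0.75, 0.75]]
lst, lst_min, lst_max = [[305.0, 300.0]], [[290.0, 290.0]], [[310.0, 310.0]]

vci_grid = apply_to_grid(vci, ndvi, ndvi_min, ndvi_max)
tci_grid = apply_to_grid(tci, lst, lst_min, lst_max)
vhi_grid = apply_to_grid(vhi, vci_grid, tci_grid)
print(vci_grid, tci_grid, vhi_grid)
```

Pixels whose historical minimum and maximum coincide would cause a division by zero and need to be masked beforehand; the sketch leaves that out.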
The SPI, instead, is obtained with more complex statistical functions (fitting a Gamma probability distribution, whose cumulative probability is then transformed into a standard Gaussian variable). For this reason, SPI calculation has been implemented by integrating a specific R library with the PostGIS library. The integration between the R engine and PostGIS is made possible by PL/R, the R procedural language for PostgreSQL.
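The two statistical steps can be illustrated with a minimal standalone sketch: fit a Gamma distribution to a rainfall series, then map each cumulative probability to a standard normal quantile. This is not the R library used in the DO SDI. It assumes strictly positive rainfall totals that are not all equal, estimates the parameters with Thom's approximation to maximum likelihood, and skips the usual correction for zero-rainfall periods; the sample values are invented.

```python
import math
from statistics import NormalDist


def fit_gamma(values):
    """Estimate gamma shape and scale with Thom's approximation."""
    n = len(values)
    mean = sum(values) / n
    a = math.log(mean) - sum(math.log(v) for v in values) / n
    shape = (1.0 + math.sqrt(1.0 + 4.0 * a / 3.0)) / (4.0 * a)
    scale = mean / shape
    return shape, scale


def gamma_cdf(shape, scale, x, max_terms=1000, tol=1e-12):
    """Regularized lower incomplete gamma P(shape, x / scale), series form."""
    if x <= 0.0:
        return 0.0
    z = x / scale
    term = 1.0 / shape
    total = term
    for n in range(1, max_terms):
        term *= z / (shape + n)
        total += term
        if abs(term) < tol * abs(total):
            break
    return total * math.exp(-z + shape * math.log(z) - math.lgamma(shape))


def spi(values):
    """Map strictly positive rainfall totals to standardized values."""
    shape, scale = fit_gamma(values)
    normal = NormalDist()
    result = []
    for v in values:
        p = gamma_cdf(shape, scale, v)
        p = min(max(p, 1e-6), 1.0 - 1e-6)  # keep inv_cdf inside (0, 1)
        result.append(normal.inv_cdf(p))
    return result


rainfall = [12.0, 45.0, 30.0, 80.0, 22.0, 60.0, 15.0, 38.0, 51.0, 27.0]
print([round(s, 2) for s in spi(rainfall)])
```

Negative values indicate drier-than-usual periods and positive values wetter ones, relative to the fitted distribution of the series itself.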
A process-based SDI should stress information communication in order to reach a wide range of users and facilitate effective decision planning.
The web services implemented in the Client-side Layer (C) support the development of custom applications for disseminating results and handling services.
Custom client applications are developed according to the needs of specific users (researchers, practitioners, public authorities, and the community), taking advantage of the interoperable services supplied by the Drought Framework Layer.
The RESTful API (Application Programming Interface) functions of the Drought Framework Layer make it possible to:
- create WebGIS applications, customized websites, and services that require DO SDI data;
- develop plug-ins for desktop GIS applications such as QGIS or ArcGIS;
- share and integrate DO data with other interoperable SDIs.
The Framework Layer Services
The open source Comprehensive Knowledge Archive Network (CKAN) data-management platform and the GeoServer data-publishing web server are used, respectively, to harvest the catalog and to publish data and metadata. CKAN supports the ISO 19139 (Geographic information – Metadata – XML schema implementation) encoding for metadata description and is also able to manage the OGC CSW and WMS standards.
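As an example of how a client could consume data published through WMS, the sketch below builds a WMS 1.3.0 GetMap request URL. The server address and the layer name are hypothetical placeholders, not the actual DO SDI endpoints; the request parameters themselves are those defined by the WMS standard.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical GeoServer endpoint and layer name, used only for illustration.
GEOSERVER_WMS = "https://example.org/geoserver/wms"


def build_getmap_url(layer, min_lon, min_lat, max_lon, max_lat,
                     width=512, height=512, image_format="image/png"):
    """Compose a WMS 1.3.0 GetMap URL for a layer in EPSG:4326."""
    # WMS 1.3.0 uses latitude,longitude axis order for EPSG:4326.
    bbox = f"{min_lat},{min_lon},{max_lat},{max_lon}"
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",
        "CRS": "EPSG:4326",
        "BBOX": bbox,
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return f"{GEOSERVER_WMS}?{urlencode(params)}"


url = build_getmap_url("do:vhi", 6.0, 36.0, 19.0, 47.5)
print(url)
```

The bounding box is written latitude first because WMS 1.3.0 follows the axis order of the coordinate reference system, a frequent source of blank maps when requests are ported from WMS 1.1.1.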