Realising spatial data infrastructure by making it usable

Cloud native spatial data infrastructure offers a practical and cost-effective way to enable a “queryable” Earth and the open science movement, which is critical to tackling global problems.

At this year’s FOSS4G North America conference in Baltimore, Chris Holmes presented his vision for a reimagined spatial data infrastructure (SDI) that is simple and usable. Cloud native spatial data infrastructure, he explained, could offer a practical and cost-effective way to enable a “queryable” Earth and the open science movement, which is critical to tackling global problems.

“I believe prioritising the ease and cost of publishing data can fundamentally change the dynamics for SDIs to become useful. The core problem is that we've continued to let spatial data infrastructure be this abstract task, not something to be solved with real urgency,” says Chris Holmes, a developer of software including GeoServer and GeoNode, and the product architect at satellite firm Planet.

“The major problems that face our planet – climate change, the biodiversity crisis, health care, peace and security, feeding everyone – are all spatial in nature. But spatial data is rarely used by those making the major decisions that impact everybody.”

According to Holmes, the core architecture for implementing the traditional SDI concept has been too complicated. SDI implementations are difficult to use, and it is even more difficult for data producers to publish data to them.

He uses Werner Kuhn’s definition of an SDI as “a coordinated series of agreements on technology, standards, institutional arrangements, and policies that enable the discovery and use of geospatial information by users and for purposes other than those for which it was created”.

For a small municipality, publishing even a few layers of vector data requires standing up a Web Map Service (WMS), a Web Feature Service (WFS) and a Catalogue Service for the Web (CSW), and configuring an XML mapping schema and GML profiles. Together, these requirements make publishing unrealistic. “Lowering the barriers to participation will enable a set of data publishers who haven't had the resources or capabilities to stand up servers.”

“SDIs have missed their moment. If there was lots of truly valuable geospatial information, truly available and accessible, large language models (LLMs) would be slurping it up with ease and letting people query it. Instead, all geospatial knowledge in LLMs is based on what was written online, and usually falls short for questions that are easy to answer if you have the spatial data.” – Chris Holmes

Cloud native data infrastructure could be a simpler SDI, Holmes says. Query interfaces (e.g. LLMs) and standards are already established, but growing the ecosystem with extensions and data schemas requires more work.

Making the SDI the database

Recent innovations in mainstream database technologies allow the separation of data query and data storage, which has “profound implications for spatial data infrastructure” as the data no longer needs to be loaded into the database to be queried. “It flips the traditional notion of SDI on its head. Instead of distributing data from a database over APIs for desktop or server-based analysis, the SDI is the database,” Holmes explains.
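
To make the idea of querying data where it is stored more concrete, here is a minimal Python sketch. It is a toy illustration, not a description of any tool Holmes mentions: the file layout, the function names (write_dataset, query) and the sample places are all hypothetical. The file is written once as plain storage, with a small footer recording where each group of points sits and what bounding box it covers. A query then reads the footer and fetches only the groups whose bounding box overlaps the area of interest, so nothing is loaded into a database first. Real cloud native formats are far more sophisticated, but many rely on a similar combination of embedded metadata and partial reads.

```python
import json
import os
import struct
import tempfile

# Hypothetical toy layout (an assumption for illustration, not a real format):
#   [group 0 bytes][group 1 bytes]...[JSON footer][8-byte footer length]


def write_dataset(path, groups):
    """Write groups of (x, y, name) points followed by a footer index."""
    index = []
    with open(path, "wb") as f:
        for points in groups:
            blob = json.dumps(points).encode("utf-8")
            xs = [p[0] for p in points]
            ys = [p[1] for p in points]
            index.append({
                "offset": f.tell(),
                "length": len(blob),
                "bbox": [min(xs), min(ys), max(xs), max(ys)],
            })
            f.write(blob)
        footer = json.dumps(index).encode("utf-8")
        f.write(footer)
        f.write(struct.pack("<Q", len(footer)))


def intersects(a, b):
    """True if bounding boxes a and b (minx, miny, maxx, maxy) overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def query(path, bbox):
    """Return (names inside bbox, number of bytes actually read)."""
    results = []
    with open(path, "rb") as f:
        # Read the footer first: it says where each group lives.
        f.seek(-8, os.SEEK_END)
        (footer_len,) = struct.unpack("<Q", f.read(8))
        f.seek(-(8 + footer_len), os.SEEK_END)
        index = json.loads(f.read(footer_len))
        bytes_read = 8 + footer_len
        for entry in index:
            if not intersects(entry["bbox"], bbox):
                continue  # skip groups that cannot contain a match
            f.seek(entry["offset"])
            points = json.loads(f.read(entry["length"]))
            bytes_read += entry["length"]
            for x, y, name in points:
                if bbox[0] <= x <= bbox[2] and bbox[1] <= y <= bbox[3]:
                    results.append(name)
    return results, bytes_read


# Sample data: approximate coordinates, chosen only for the example.
groups = [
    [(-76.61, 39.29, "Baltimore"), (-77.04, 38.91, "Washington")],
    [(2.35, 48.86, "Paris"), (13.40, 52.52, "Berlin")],
]
path = os.path.join(tempfile.mkdtemp(), "places.bin")
write_dataset(path, groups)

# Query the eastern United States; the European group is never read.
names, bytes_read = query(path, (-80.0, 35.0, -70.0, 45.0))
print(names, bytes_read, os.path.getsize(path))
```

In this sketch the query touches fewer bytes than the file contains, because the second group is skipped on the strength of the footer alone. Over a network, the same pattern would translate into fetching byte ranges from a file in object storage rather than standing up a server in front of it.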