This is the blog version of my talk "From Data Silos to Data Sharing-A Geodata Perspective" at State of the Map Asia 2024, Cox’s Bazar, Bangladesh. Here I touch upon standards and why we need them. This article was first published on arkives.in by the same author.
License: This article is licensed under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.
Preface
Two years ago I wrote Role of Open Data in the Geospatial World, an article on open data and why companies need to contribute to it. That was the starting point for this article as well. Though this had been pending on my list for a long time, I found the motivation to shape it only after hearing the podcasts of Nadine Alameh and Scott Simmons, folks from OGC. Thank you to Daniel for hosting the MapScaping podcast as well. I was so influenced by those episodes that you may even find examples from them used here.
Geodata
ISO/TC 211, the ISO technical committee tasked with covering the area of geographic information in the digital world, states that data and information having an implicit or explicit association with a location relative to Earth are called geodata. In simple terms, geodata refers to any information that includes a spatial or location-based component.
Features on the earth’s surface such as roads, buildings, forests, and rivers, or even abstract concepts like population density or weather patterns, can be represented as geodata. Geodata often includes attributes, such as a road’s name or width, along with the spatial coordinates that locate the feature on a map. These attributes can be thought of as adjectives describing the feature. In day-to-day life, we use geodata in navigation apps, weather updates, emergency services, food delivery apps, ride-sharing services, social media check-ins, and much more.
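To make the split between location and attributes concrete, here is a minimal sketch of a road written as a GeoJSON Feature, one widely used open format for geodata. The road name, width, and coordinates are made-up values for illustration, and `describe` is a hypothetical helper, not part of any library.

```python
import json

# A hypothetical road segment: the geometry says where it is,
# the properties (attributes) say what it is.
road = {
    "type": "Feature",
    "geometry": {
        "type": "LineString",
        # GeoJSON coordinate order is [longitude, latitude]
        "coordinates": [[91.97, 21.43], [91.98, 21.44]],
    },
    "properties": {"name": "Example Road", "width_m": 7.5},
}


def describe(feature):
    """Return a one-line summary of a line-shaped GeoJSON Feature."""
    props = feature["properties"]
    n_points = len(feature["geometry"]["coordinates"])
    return f"{props['name']} ({props['width_m']} m wide, {n_points} points)"


# Serialising to text is all it takes to share the feature with another system.
geojson_text = json.dumps(road)
print(describe(json.loads(geojson_text)))
```

Any tool that understands GeoJSON can read `geojson_text` without knowing anything about the program that produced it.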
OGC Standards
The Open Geospatial Consortium (OGC) is an international voluntary consensus standards organization that develops and maintains international standards for geospatial content. These are commonly known as OGC Standards and are widely accepted as the geodata standards.
Geodata standards are structured rules, formats and protocols that ensure geographic data is stored, shared and used in a consistent, interoperable and reliable manner. These standards enable diverse systems, applications, and organizations to work together efficiently by ensuring geodata is understood and processed uniformly.
Standards are like the rules of the road for drivers.
Imagine everyone driving on the same side of the road and following the same signs and signals. That is how standards ensure different systems “speak the same language” and work together seamlessly. Without these rules there would be chaos, with each system needing its own special rules!
Like all rules, standards are boring. They are fixed and plain, and there is nothing amusing in them. But remember: boring is not unimportant!
Why do Standards Matter?
Our world is always changing. Standards are becoming more and more crucial as the geospatial environment evolves at an accelerated rate. They help us stay up to date and provide a common language that guarantees interoperability, enabling various platforms and applications to use geographic data easily.
A HEAVY Launch Story
Let’s go back to Feb 6, 2018, when SpaceX launched Falcon Heavy, carrying Elon Musk’s Tesla Roadster into space. The launch not only opened a new era in the space industry but also showed us once again the need for standards and their interoperability. According to the Federal Aviation Administration (FAA), as many as 563 flights were delayed that day, and together they had to fly an extra 34,841 nautical miles. With the increase in commercial space launches, the older practice of informing the FAA of the launch time a week in advance was no longer feasible, and flights along the east coast, and the aviation industry there, kept suffering from more frequent and longer airspace closures.
At the time there was no system through which the space systems and air traffic control could communicate, because there were no interoperable, standard protocols of information exchange to automate that communication. Later the FAA came up with the Space Data Integrator (SDI), which reduced the closure time and optimised the closed airspace according to the launches and re-entries.
FAIR Principles
The FAIR principles are regarded as the fundamental rule of thumb when discussing data standards. Several national and international policies emphasise adopting the FAIR principles to increase the usage of geodata.
FAIR stands for Findable, Accessible, Interoperable, and Reusable.
It’s not fair, there is no FAIR
Imagine you arrived in a city where public transport is run by multiple agencies. Bus services run as per pre-decided schedules, which are available only at each agency’s office. Each agency has its own system of recording the schedule, so the schedules come in various formats.
Since you’re new to the area, you want to make travel plans but are unsure of the schedules and where to get them. You manage to locate one agency and obtain its timetables. However, they cover barely half of your route.
The schedule for the other agency is formatted differently. You spend hours interpreting both schedules and figuring out how to travel.
The next person who arrives in the city must go through all of these hassles again to prepare their trip itinerary.
Many of us might have faced similar situations earlier. These days, however, technology has solved this problem, and that was possible because of the adoption of FAIR principles. Let us try to apply the FAIR principles to the above problem.
FAIRification
All transit agencies document their schedules in an open standard like GTFS and publish them on a portal under an Open Data License. This enables the data to be mapped to mapping platforms like OpenStreetMap.
As an end user, you simply open a navigation app, give the start and end points, select the mode, and your travel plan is ready. The same applies to any number of people arriving in the city.
The same bus route data can be reused by city administrators to plan new routes, optimise existing routes and solve many other complex problems.
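As a rough illustration of why a shared format helps, the sketch below reads stop times from two agencies that both publish the standard GTFS `stop_times.txt` columns and answers a simple question across both feeds. The agencies, trips, and stops are invented, the helper names are hypothetical, and times are assumed to be zero-padded `HH:MM:SS` so that plain string comparison orders them correctly.

```python
import csv
import io

# Two hypothetical agencies publishing the same GTFS stop_times.txt columns.
AGENCY_A = """trip_id,arrival_time,departure_time,stop_id,stop_sequence
A1,08:00:00,08:00:00,CENTRAL,1
A1,08:20:00,08:21:00,MARKET,2
"""
AGENCY_B = """trip_id,arrival_time,departure_time,stop_id,stop_sequence
B7,08:30:00,08:32:00,MARKET,1
B7,09:05:00,09:05:00,BEACH,2
"""


def load_stop_times(*feeds):
    """Parse any number of stop_times.txt texts into one list of row dicts."""
    rows = []
    for text in feeds:
        rows.extend(csv.DictReader(io.StringIO(text)))
    return rows


def next_departure(rows, stop_id, after):
    """Return the earliest row departing stop_id at or after `after`, or None."""
    candidates = [
        r for r in rows
        if r["stop_id"] == stop_id and r["departure_time"] >= after
    ]
    return min(candidates, key=lambda r: r["departure_time"], default=None)


rows = load_stop_times(AGENCY_A, AGENCY_B)
print(next_departure(rows, "MARKET", "08:25:00"))
```

Because both feeds use the same column names, the same two functions work for every agency, which is exactly what the newcomer in the story was missing.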
Findable Data
Finding data is the first step in (re)using it. Both machines and people should be able to easily locate data and metadata. Automatic discovery of datasets and services requires machine-readable metadata. There are four principles for making data findable, as mentioned in GoFAIR.org:
F1: (Meta)data are assigned a globally unique and persistent identifier
F2: Data are described with rich metadata (defined by R1 below)
F3: Metadata clearly and explicitly include the identifier of the data they describe
F4: (Meta)data are registered or indexed in a searchable resource
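The four findability principles can be sketched as a tiny in-memory catalogue. This is only an illustration under stated assumptions: the `urn:example:` identifiers, the field names, and the `Catalogue` class are hypothetical and do not follow any particular metadata standard.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataRecord:
    identifier: str        # F1: unique, persistent identifier (hypothetical URN)
    data_identifier: str   # F3: identifier of the data this record describes
    title: str             # F2: descriptive metadata
    keywords: list = field(default_factory=list)


class Catalogue:
    """F4: a searchable resource in which metadata records are registered."""

    def __init__(self):
        self._records = {}

    def register(self, record):
        # Reject duplicates so that every identifier stays unique.
        if record.identifier in self._records:
            raise ValueError(f"identifier already registered: {record.identifier}")
        self._records[record.identifier] = record

    def search(self, term):
        """Return identifiers whose title or keywords mention the term."""
        term = term.lower()
        return [
            r.identifier
            for r in self._records.values()
            if term in r.title.lower() or term in (k.lower() for k in r.keywords)
        ]


catalogue = Catalogue()
catalogue.register(MetadataRecord(
    "urn:example:meta:bus-routes", "urn:example:data:bus-routes",
    "City bus routes 2024", ["transit", "GTFS"],
))
catalogue.register(MetadataRecord(
    "urn:example:meta:rivers", "urn:example:data:rivers",
    "River network", ["hydrology"],
))
print(catalogue.search("gtfs"))
```

A real catalogue would sit behind a public search service, but the idea is the same: a machine can discover the dataset from its metadata alone.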
Accessible Data
After locating the necessary data, the user must understand how to access it, including any authentication and authorisation that may be required. There are two main principles for making data accessible, as mentioned in GoFAIR.org:
A1: (Meta)data are retrievable by their identifier using a standardised communications protocol
A1.1: The protocol is open, free, and universally implementable
A1.2: The protocol allows for an authentication and authorisation procedure, where necessary
A2: Metadata are accessible, even when the data are no longer available
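The sketch below imitates these accessibility rules with an in-memory repository rather than a real network protocol. Everything in it is hypothetical (the `Repository` class, the token check, the `urn:example:` identifier); it only shows retrieval by identifier (A1), an optional authorisation step (A1.2), and metadata outliving the data (A2).

```python
class DataUnavailable(Exception):
    """Raised when the data were withdrawn but the metadata remain (A2)."""


class Repository:
    def __init__(self, valid_tokens=()):
        self._metadata = {}
        self._data = {}
        self._restricted = set()
        self._valid_tokens = set(valid_tokens)

    def publish(self, identifier, metadata, data, restricted=False):
        self._metadata[identifier] = metadata
        self._data[identifier] = data
        if restricted:
            self._restricted.add(identifier)

    def withdraw_data(self, identifier):
        # Only the data go away; the metadata record is kept on purpose.
        self._data.pop(identifier, None)

    def get_metadata(self, identifier):
        # Metadata are always retrievable by identifier.
        return self._metadata[identifier]

    def get_data(self, identifier, token=None):
        if identifier not in self._metadata:
            raise KeyError(identifier)
        if identifier not in self._data:
            raise DataUnavailable(identifier)
        # A1.2: ask for authorisation only where the dataset requires it.
        if identifier in self._restricted and token not in self._valid_tokens:
            raise PermissionError(identifier)
        return self._data[identifier]


repo = Repository(valid_tokens={"demo-token"})
repo.publish("urn:example:data:stops", {"title": "Bus stops"}, b"stop data",
             restricted=True)
print(repo.get_data("urn:example:data:stops", token="demo-token"))
```

In practice the same roles are usually played by an open protocol such as HTTPS and a persistent landing page for each dataset.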
Interoperable Data
Data usually need to be combined with other data, and they must also work with the applications or workflows used for processing, storing, and analysing them. There are three principles for making data interoperable, as mentioned in GoFAIR.org:
I1: (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
I2: (Meta)data use vocabularies that follow FAIR principles
I3: (Meta)data include qualified references to other (meta)data
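Returning to the bus story, interoperability in practice often means translating each agency's private format into one shared vocabulary. The following sketch assumes two invented source formats and maps both onto two GTFS-style field names, `stop_name` and `departure_time`; the source field names and converter functions are hypothetical.

```python
def from_agency_a(row):
    """Agency A (hypothetical) writes times as 'H.MM' under the key 'leaves'."""
    hours, minutes = row["leaves"].split(".")
    return {
        "stop_name": row["halt"],
        "departure_time": f"{int(hours):02d}:{int(minutes):02d}:00",
    }


def from_agency_b(row):
    """Agency B (hypothetical) writes times as 'HH:MM' under the key 'Dep'."""
    return {"stop_name": row["Stop"], "departure_time": row["Dep"] + ":00"}


# Once both sources share one vocabulary they can be merged and sorted together.
timetable = sorted(
    [
        from_agency_a({"halt": "Market", "leaves": "8.05"}),
        from_agency_b({"Stop": "Market", "Dep": "07:45"}),
    ],
    key=lambda r: r["departure_time"],
)
print(timetable)
```

If both agencies published in the shared vocabulary to begin with, the two converter functions would not be needed at all.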
Reusable Data
FAIR’s ultimate objective is to optimise data reuse. This is accomplished by having well-described metadata and data that can be integrated and/or replicated in many contexts. The principles for making data reusable, as mentioned in GoFAIR.org, are:
R1: (Meta)data are richly described with a plurality of accurate and relevant attributes
R1.1: (Meta)data are released with a clear and accessible data usage license
R1.2: (Meta)data are associated with detailed provenance
R1.3: (Meta)data meet domain-relevant community standards
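A simple way to act on the reusability principles is to check a metadata record for the pieces a future user will need. The sketch below is an assumption-laden example: the field names `license`, `provenance`, and `conforms_to` are hypothetical and would be replaced by whatever the chosen metadata standard calls them.

```python
# Hypothetical metadata field names mapped to the principle they support.
REUSE_FIELDS = {
    "license": "R1.1",
    "provenance": "R1.2",
    "conforms_to": "R1.3",
}


def missing_for_reuse(metadata):
    """List the reuse-related fields that are absent or empty in a record."""
    return [
        f"{principle}: {key}"
        for key, principle in REUSE_FIELDS.items()
        if not metadata.get(key)
    ]


record = {"title": "City bus routes 2024", "license": "CC BY-SA 4.0"}
print(missing_for_reuse(record))
```

An empty list means the record states its licence, its origin, and the community standard it follows.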
Open Standards and Benefits
An open standard can be defined as an industry-standard system, schema, or format that makes it easier for open devices and systems to communicate with each other consistently. A standard can be considered open when it satisfies several measures that ensure accessibility, transparency, and fairness in its development and implementation. The key factors that define this are:
- Open Accessibility: The standard must be freely available for anyone to read, implement, and use without restrictions or fees. This includes being accessible from a stable website free from copyright restrictions.
- Transparent Development Process: The creation of the standard should involve collaboration among all interested parties, not just a select group of suppliers. It must follow an open process that ensures public review and feedback mechanisms.
- No Hidden Patents: Any patents essential to the implementation of the standard must be licensed under royalty-free terms or covered by a promise of non-assertion when practised by open-source software. This ensures that there are no barriers to implementing the standard due to intellectual property issues.
- No Discriminatory Practices: The standard and its governing body should not favour any particular company/group or implementation over others, ensuring equal opportunity for all parties to get involved in its use and development.
- Implementation Flexibility: The standard should allow for multiple implementations across different platforms and technologies, promoting diversity and innovation in its application.
- Clear Specification: The standard must be comprehensive and must provide all details necessary for interoperable implementation, including processes for addressing flaws identified during use.
- Independence from Single Vendor Control: The ongoing management and development of the standard should not be controlled by any single vendor. It should allow for equal participation, even from competitors and third parties.
The benefits of using open standards are:
- We don’t need to reinvent the wheel, as the problems might have already been solved by someone else.
- We will have an increased user base and use cases, which implies lots of people to help or guide.
- Nothing is hidden, hence trust can be increased.
- Many people from the community collaborate to solve a common problem.
- As many things are already available, we can reduce the development cost.
Challenges
Technology is ever-changing and growing, and so are the problems to solve. As in the Falcon Heavy launch story above, there was no need for the air traffic system to talk to the space system until the emergence of commercial launches. We haven’t solved all the problems in the world yet, and hence the challenges in drafting standards cannot be limited to a few points. The major ones I could see are:
- Lack of awareness and understanding among the stakeholders
- All the legacy systems and historical data may not be standardised
- Resistance from the stakeholders to change
- The change comes with a cost
- Lack of enforcement or incentives from authorities
- Cultural and political barriers
- Complexity of certain standards
- Confusion over which standard to adopt, as multiple standards may be available for one particular kind of data.
Imagine if there were no standards available. Don’t we need to share data? Do we have all the data to solve every complex problem? How do we connect devices and become smart? How will we communicate with all the IoT devices and sensors? Standards are not just for compliance, they make things interoperable!