Role of Open Data in the Geospatial World

ark Arjun
6 min read · Apr 17, 2024


This is a modified version of my article published on the SatSure blog.

Why do I need to contribute to open data?
It is a valid question: why should I share data that I paid for, spent my time building, and generated with my own resources?
The answer is simple: to create value for your own business!
Here are my thoughts on open data in today's world and its role in the geospatial industry.

What is Open Data?

Open data is often described as data that can be used free of cost. But is all zero-cost data open data?
No. Open data is data that is available for anyone to use under an open license.

What is an open license? The concepts of open data and open licenses are byproducts of open movements such as open source, open knowledge, and open hardware that have happened, or are still happening, around the world. They are similar to the free software philosophy. Just as GNU states four freedoms for software, data should have similar freedoms:

  1. The freedom to use the data as you wish for any purpose (freedom 0).
  2. The freedom to study the data and modify it as you wish (freedom 1).
  3. The freedom to redistribute copies so you can help others (freedom 2).
  4. The freedom to distribute copies of your modified versions to others (freedom 3).

By doing this, you can give the whole community a chance to benefit from your changes.

Thus, open data generally refers to the liberty given to the user, not the cost. Data that carries all these freedoms technically qualifies as open data, but that alone does not make it ethical. For data to be open in every sense, it should be ethical too. Unlike software, open data is not only about freedom of use; the privacy of the people in the data is also a concern. All user-specific data should be anonymized before publishing, respecting the privacy of the people it describes.
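As a rough illustration of the anonymization step, here is a minimal Python sketch. It assumes the data is a GeoJSON-style point feature; the property names in `PERSONAL_KEYS`, the sample record, and the choice of two decimal places are all hypothetical. Dropping fields and coarsening coordinates is only one simple approach and may not be enough for every dataset.

```python
import copy

# Hypothetical property names treated as personal; adjust for your own schema.
PERSONAL_KEYS = {"name", "phone", "email", "owner_id"}


def anonymize_feature(feature, personal_keys=PERSONAL_KEYS, decimals=2):
    """Return a copy of a GeoJSON Point feature with personal properties removed
    and coordinates rounded to `decimals` places (roughly 1 km at 2 decimals)."""
    cleaned = copy.deepcopy(feature)  # never modify the private original
    props = cleaned.get("properties") or {}
    cleaned["properties"] = {
        key: value for key, value in props.items() if key not in personal_keys
    }
    geometry = cleaned.get("geometry") or {}
    if geometry.get("type") == "Point":
        geometry["coordinates"] = [
            round(coord, decimals) for coord in geometry["coordinates"]
        ]
    return cleaned


# Made-up sample record, for illustration only.
survey_point = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [77.594566, 12.971599]},
    "properties": {"name": "A. Farmer", "phone": "000-000", "crop": "rice"},
}

public_point = anonymize_feature(survey_point)
print(public_point)
```

The function works on a copy, so the private record stays intact while only the cleaned version is published.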

How does open data help my business?

1. Reduction in storage

Once you have released the data under an open license, you can reduce the storage space it takes on your own drives. Many community-managed portals host and archive openly licensed datasets, and there are also domain-specific portals that host open datasets. You can use those spaces and free up your own storage. The Internet Archive is one service where people archive open data and other documents. No data person will miss the name Kaggle when talking about data; Kaggle is also one such platform for sharing data.

2. Opportunity, not threat

Releasing the data is an opportunity to learn about other perspectives on, and uses of, your data that might not have struck you.

3. Learning for free

When you release the data free and open, you will get comments and suggestions on every aspect of the datasets, such as what more could have been done or what should not have been done. This learning experience from a vast group of people is priceless.

4. Business Growth and Value Creation

Making your data available as open data can bring you business growth. A company cannot think through every angle of a problem, offer all possible solutions, and build businesses out of each of them. Whatever you do, there is still a space you might have missed, and trying to solve every problem will make you lose focus and stumble. Instead, leave the data open and let interested people try it while you focus on what you know best; if new opportunities are discovered, you can readily step into them. Or maybe someone needs that data regularly in a specific format and is ready to pay for it. Why lose business that comes to you without any special effort? This is similar to revenue from byproducts.

For example, what if YouTube released a detailed analytics dataset of its content, such as what kinds of videos are in most demand among a specific age group in a particular geographic region, which content is most watched, and so on?

This would help many content creators make content based on it, which in turn would increase the views and revenue of both the creator and the platform, creating a win-win situation for all while creating value for their customers.

5. Open innovation / Reduce your R&D

You focus on building what you are best at and continue excelling in it. Let others who can explore the limits of a different domain do so. Who knows, their innovation may bring or save revenue for you in the future. Moreover, companies, especially startups, cannot rely only on their own resources for innovation, as it is expensive to keep a large team of experts on the expectation that something "MAY HAPPEN". This is where open innovation can reduce your R&D cost. It also creates synergy between internal and external innovation.

6. Social good, and my own as well

Data is the new oil, and it does not become useless no matter how old it gets. When you release data under an open license, researchers, students, or other companies can use it to solve different problems facing the world. Those solutions may also help your company solve the issues it is facing. Data scientists are always looking for new datasets to learn from and to build models that solve real-world problems, and your data may help them. In research, data is often the biggest hurdle: with insufficient data, studies produce poor or biased results that are not usable, and buying data is a budgetary concern in many studies. The availability of open data is a boon to these researchers, and you may also get solutions to your long-standing problems from their research.

7. Reduction in Cost

All the points discussed above directly or indirectly cut down your costs.

Releasing Open Data

What are the things that need to be considered while releasing data as open data?

  • The data should be FREE.
  • The data needs to be published in open formats.
  • The data should be documented properly.
  • The data should be reliable.
  • The data should be published under one of the open data licenses.
  • The data should be easily accessible.
A visualization of factors affecting the publishing of Open Data and Open Data Portals. (Source: data.europa.eu)
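The checklist above can be partly automated before a release. The following Python sketch is only an illustration: the `check_release` function, the file names, the example URL, and the two allow-lists are assumptions made for this example, not complete or official lists. The license identifiers follow the SPDX naming style. Reliability still needs a human review, so it is not covered here.

```python
from pathlib import Path

# Assumed examples of open formats and open data licenses; not exhaustive.
OPEN_FORMATS = {".csv", ".json", ".geojson", ".gpkg", ".tif"}
OPEN_LICENSES = {"CC0-1.0", "CC-BY-4.0", "ODbL-1.0", "PDDL-1.0"}


def check_release(data_file, license_id, has_readme, public_url):
    """Return a list of problems; an empty list means the basic checks pass."""
    problems = []
    if Path(data_file).suffix.lower() not in OPEN_FORMATS:
        problems.append("file format is not in the open-format list")
    if license_id not in OPEN_LICENSES:
        problems.append("license is not in the open-license list")
    if not has_readme:
        problems.append("documentation is missing")
    if not public_url:
        problems.append("no public download location")
    return problems


# Hypothetical releases: the first passes, the second fails every check.
good = check_release("landcover_2023.gpkg", "CC-BY-4.0", True, "https://example.org/landcover")
bad = check_release("landcover_2023.xyzbin", "proprietary", False, "")
print(good)
print(bad)
```

Keeping the allow-lists as plain sets makes it easy for a team to extend them with the formats and licenses it actually uses.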

What data can one release as open data?

  • The data you think is of great use to society can be released as open data.
  • All data which is non-personal can be released as open data.
  • Data that your competitors cannot use against your business.
  • Old data that is no longer of much importance to the organization.
  • Data that is no longer of interest to you.
  • Noncontroversial datasets.

Still doubtful about the impact of open data in today’s world?

I am quoting examples from the geospatial industry. Corporations and governments also release a lot of open data in other sectors, which has a great impact on the world as well.

Though ESA is the world's largest provider of open space data, it is not the only one; NASA started releasing open space data even earlier through its various missions. The European Space Agency's Copernicus mission releases 16 terabytes of satellite data daily from 10 satellites as open data, which is a key reason many solutions are affordable and many companies in the geospatial industry can exist at all. Copernicus also supports a wide range of non-space applications that may influence the regular operations of businesses and organizations.

It is not only government or semi-governmental agencies that release data under open licenses; corporations do too. Esri, with its partners Impact Observatory and Microsoft, released a worldwide Land Use Land Cover map produced by a machine learning algorithm. Microsoft released building footprints for the entire world as open data, detected by its deep learning algorithm. Google AI released building footprints across Africa generated through its AI. Facebook, through mapwith.ai and its RapiD tool, contributes to OpenStreetMap, the most significant open map data platform. Ride-hailing company Lyft open-sourced its autonomous driving dataset from its self-driving fleet in 2019.


ark Arjun

Geospatial Engineer | #Opendata | #OpenStreetMap | Blogs @ arkives.in