Towards an improved data model for OpenStreetMap

We all know and love the OpenStreetMap data model, with its nodes, ways, and relations, and the open tagging that has allowed OpenStreetMap to be so innovative. But the data model is also showing its age, and some improvements might be possible. There is a lot we don’t want to change. The open tagging model in particular has proven itself: we might think of some small improvements, but the core idea of allowing any number of key-value (string) tags has worked amazingly well.

But there are some pain points due to the way we organize our data. The biggest problem is that geographic location is stored only on nodes and not on higher-level geographic objects like ways and relations. This means that finding the location of, say, a way always requires following its references to the member nodes of that way. This makes processing OSM data extremely cumbersome and resource-intensive.
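
To make the indirection concrete, here is a minimal sketch in Python. The `Node` and `Way` classes and the `way_geometry` function are hypothetical names invented for this illustration, not the API of any OSM library. The point is only that a way carries node IDs, so its coordinates cannot be read without a lookup table of nodes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Illustrative node: the only element type that stores a location."""
    id: int
    lat: float
    lon: float
    tags: dict = field(default_factory=dict)


@dataclass
class Way:
    """Illustrative way: an ordered list of node IDs, with no coordinates."""
    id: int
    node_refs: list
    tags: dict = field(default_factory=dict)


def way_geometry(way, nodes_by_id):
    """Resolve a way to (lat, lon) pairs by following its node references.

    Raises KeyError if a referenced node is not in the lookup table,
    which is what happens when an extract does not contain every node.
    """
    return [(nodes_by_id[ref].lat, nodes_by_id[ref].lon) for ref in way.node_refs]


# Made-up example data: three nodes and one way that references them.
nodes = {
    1: Node(1, 45.46, 9.19),
    2: Node(2, 45.47, 9.20),
    3: Node(3, 45.48, 9.21),
}
street = Way(100, [1, 2, 3], {"highway": "residential"})
geometry = way_geometry(street, nodes)
```

In this toy example the lookup table is a small dictionary. For planet-scale data it has to hold the locations of every node, which is where much of the processing cost comes from.
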
The other major pain point often talked about is the missing “area” datatype. We use workarounds like closed ways and multipolygon relations, but that has always been problematic, because we can’t be sure that those objects are actually valid polygons.
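
As an illustration of why a closed way is not automatically a valid polygon, the following Python sketch checks one necessary condition: the ring must be closed and its segments must not cross each other. The function names are made up for this example, coordinates are treated as plain planar (x, y) pairs, and the test only detects proper crossings, not touching or overlapping segments. It is a simplification and not the validity check used by any particular OSM tool.

```python
def _orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a): 1, -1, or 0 if collinear."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)


def _segments_cross(p1, p2, p3, p4):
    """True if segments p1-p2 and p3-p4 cross at a single interior point."""
    return (_orient(p1, p2, p3) * _orient(p1, p2, p4) < 0
            and _orient(p3, p4, p1) * _orient(p3, p4, p2) < 0)


def is_simple_ring(coords):
    """Check that coords form a closed ring without self-crossings.

    coords is a list of (x, y) tuples. A ring needs at least four entries
    (three distinct corners plus the repeated first point).
    """
    if len(coords) < 4 or coords[0] != coords[-1]:
        return False
    segments = list(zip(coords, coords[1:]))
    n = len(segments)
    for i in range(n):
        # Start at i + 2 so neighbouring segments are never compared.
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue  # first and last segment are neighbours via the closure
            if _segments_cross(*segments[i], *segments[j]):
                return False
    return True


# Made-up test shapes: a square, a self-crossing "bowtie", and an open line.
square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
bowtie = [(0, 0), (1, 1), (1, 0), (0, 1), (0, 0)]
open_line = [(0, 0), (1, 0), (1, 1), (0, 1)]
```

Both `square` and `bowtie` are closed ways in the data model, but only the first describes a usable area. Multipolygon relations add further failure modes (unclosed rings, inner rings outside the outer ring) that this sketch does not cover.
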

The OSMF Engineering Working Group (EWG) has commissioned me (Jochen Topf) to write a study over the coming months outlining the problems with our current data model, possible improvements, and their impact on our systems. Any changes to our data model will, of course, have a large impact on our mappers, the data users, our software, and the whole OSM ecosystem. So the study will also suggest ways to move forward and implement those changes step by step.

You are welcome to contact me by email at jochen@topf.org if you want to discuss any of this. After my talk at State of the Map 2018 in Milano, in which I outlined some of the issues with the data model, I created the osm-data-model repository. Feel free to comment there. After my preliminary study, I expect there will be a more formal discussion process in which we can decide as a community which changes (if any) we want and how we are going to implement them.

The OpenStreetMap Foundation is a not-for-profit organisation, formed to support the OpenStreetMap Project. It is dedicated to encouraging the growth, development and distribution of free geospatial data for anyone to use and share. The OpenStreetMap Foundation owns and maintains the infrastructure of the OpenStreetMap project, is financially supported by membership fees and donations, and organises the annual, international State of the Map conference. Our volunteer Working Groups and small core staff work to support the OpenStreetMap project. Join the OpenStreetMap Foundation for just £15 a year or for free if you are an active OpenStreetMap contributor.

8 thoughts on “Towards an improved data model for OpenStreetMap”

  1. Chris

    Can you please remove the hard line breaks from this post? Makes it quite awkward to read on mobile. Thank you!

  2. Tobias Knerr Post author

    Chris, thanks for pointing this out! I’ve updated the text to remove the hard line breaks.

  3. Richard Welty

    Are you interested in talking about the OpenHistoricalMap experience of adapting the data model for historical mapping? We are largely wedded to the OSM model as we try to work out representations that suit our needs.

  4. Ilya Zverev

    Too bad Jochen is focusing on bringing the OSM data model closer to the mainstream model, and not on strengthening the stuff that makes OSM distinct and better. Nice of OSMF to finally fund some strategic work, after 18 years of funding only ten-year-old projects and mapping events, but we could have done better.

    1. RicoElectrico

      Our current model brings many problems that would not demonstrate themselves in a standard GIS model. For example, broken multipolygons or otherwise invalid geometries are endemic to OSM.

      Areas are the bare minimum I expect to come from this effort; but I’m also rooting for the “only nodes convey locations” to be solved. Not having to assemble geometries would make self-hosting and processing OSM on a global scale more accessible and democratized.

      1. Ilya Zverev

        It is already pretty democratic, with dozens of tools to support it, and docker containers for common tasks like tiles, geocoding, and routing. Geometry issues are rare and easily fixable, as Jochen’s own initiative for retagging multipolygons has shown.

