Where the ODbL Ends and the Community Guidelines Begin
In the beginning…
OpenStreetMap (OSM) is, at its core, a global database of geographic information and has a license, the Open Database License (ODbL), which is designed from the ground up to ensure freedom for publicly released databases. In spirit, it is very similar to the Creative Commons “Attribution Share-Alike” (CC-BY-SA) license, which is designed for creative works, or the GNU General Public License (GPL), which is designed to cover computer source code.
Both the CC-BY-SA and the GPL have existed for many years and are built on “copyright” laws, which allow the author or authors of a work to control under what conditions it may be duplicated. These laws have been around, in some form or another, for over 300 years and, because of their long history, have been scrutinised by legislators, lawyers, judges and juries many times. This process of scrutiny results in legislative or judicial rulings, and each of these decisions helps build up a body of “case law” and precedents that can be used later on to form an opinion on whether a particular use is likely to be challenged or not. It is important to know that decisions are only made when judges or juries give verdicts, which means that it is often impossible to make any definitive determinations without prior case law and precedents.
The extent and powers of copyright have been tried in court many times and it would seem sensible to base OSM’s license on it. However, it is far from clear that copyright would apply to a database of geographic information and so our license is based on copyright, contract law and the “database right”, which was first enshrined in law in 1996 as part of the European Database Directive. Sadly for us, open data, as distinct from the more established fields of open computer source and open highly creative works, has a set of distinct challenges, especially when share-alike licenses are involved. The “young” nature of the database right also means there is very little history or case law and there are few precedents, which leads to uncertainty about the implications of the ODbL. This uncertainty translates into risk for the users of OSM data, which can prevent OSM being more widely used and hinders one of the project’s primary goals: allowing the data to be used in “creative, productive, or unexpected ways”.
Until case law and precedents are established by court cases and judicial decisions, we can reduce the uncertainty by clarifying the intentions of those who released the data (i.e. the OSM community). Our consensus opinion carries a great deal of weight and can help shape the direction of any future decisions regarding the use of OSM, and possibly other open data.
The new guidelines
The Licensing Working Group (LWG) has been working hard to ensure that uncertainty is reduced for data users while the intent of the community is protected. After much discussion, in June this year the first set of guidelines was approved.
Just as copyright has “Fair Use” exceptions when the sample is not substantial, so does the database right. Whether an extract is substantial or not according to copyright depends on the relationship of the extract to the original work, as it does in database right. Unfortunately, this creates uncertainty for data users as to whether their use is substantial or not. This guideline tries to define the term “substantial” more precisely in the context of OSM. For more details, see the “Substantial Guideline”.
“Produced Work” is a term used by the ODbL to describe, broadly, something created from a database that is not itself a database. Because the share-alike provision of the ODbL applies only to databases and not to “produced works”, it is clearly important to make the distinction between the two as unambiguous as possible. For more details, see the “Produced Work Guideline”.
There are situations where OSM data can be manipulated or “transformed”, but in such a way that the manipulation does not actually add to or enhance the core contributions made by the OSM community. Therefore, there is no common good to be served by forcing the publication of the result of those manipulations. An example of this might be loading OSM data into a PostGIS rendering database with osm2pgsql: no value has been added by this transformation, so we call it “trivial”. For more details, see the “Trivial Transformations Guideline”.
There are many places in the world where OSM data is the best available map data, and some where it isn’t. In regions where it isn’t, many users would like to use an alternative source instead, but are unsure whether this would trigger share-alike requirements on the whole dataset. This uncertainty prevents, in some cases, any use of OSM data, even in regions where it is superior. This guideline adopts and formalises the established principle that OSM data may be used for some regions and not others, as long as certain conditions are met. For more details, see the “Regional Cuts Guideline”.
Just as there are many regions of the world for which OSM data is the best available, there are also many thematic “layers”, for example restaurants, for which OSM data is superior. However, the question of whether the use of additional layers from other sources is acceptable is preventing some uses of OSM data. This guideline adopts and formalises another long-established principle: that isolated layers in a map may come from OSM or not, as long as certain conditions are met. For more details, see the “Horizontal Map Layers Guideline”.
Where do we go from here?
These are just the first guidelines and there is still much work to be done in clarifying the grey areas surrounding proper use of OSM data. Specifically, work is needed to help make coherent guidelines on:
- Metadata Layers – If a layer of externally collected (non-OSM) metadata is made and kept completely separate, but is matched to the numbers the database generates to identify individual elements in OSM, when is it derivative, so that it must be shared?
- Indexing – If OSM data is indexed, for example by a search engine, is that a derivative database which would need to be shared?
- Geocoding – If locations are found for addresses, or descriptions generated for locations, in a process of “geocoding”, would that trigger the “share-alike” clause of the license and require the sharing of the data being geocoded?
- “Fall Back” – If a service first attempts to find an answer by looking at OSM data and, if an answer cannot be found, “falls back” to searching another database, are these databases separate, or does the process mean that they have been combined and that the combination must therefore be shared?
- Dynamic Data – If providers of dynamic data, such as car-park occupancy, use OSM data as an underlying reference source, does that require the sharing of the dynamic data?
- Offering alteration files – When sharing a database, the ODbL says one can offer “a file containing all of the alterations…”, but is not specific. What form should this file be in?
The LWG will continue to work hard and discuss these issues with the community and data users. If you would like to contribute, please contact the LWG, join the OSM Foundation and the discussion there, or join the general legal discussion mailing list. We would love to welcome your voice and views to the conversation.