Category Archives: Geodata

Turk meets GIS

There’s an absolutely fascinating use of Amazon’s Mechanical Turk (?) right now. There’s a HIT (a small task you get paid almost nothing to complete) that involves GIS:

Geospatial Vision are paying people to do image recognition on sequential video stills from a car, which they are apparently then recombining into videos. These are on their (Flash-only, sigh) site.

You are paid 5 cents to tag 50 images with yellow lines, manholes, drains, bollards and pedestrian crossings. They are also, from looking at the videos, using these locations to then magically classify the sign type (one way, no entry, speed limit etc). Most images have only one feature, if any. There were about 2,000 HITs last night, and at 50 images per HIT and 25 frames a second that puts it at about an hour of footage for $100. That is insane.
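The back-of-envelope sum behind that figure can be checked in a few lines. This is only a sketch using the numbers quoted above; it assumes every image is a distinct frame, that no frame is sent out in more than one HIT, and it ignores any fee Amazon adds on top. The variable and function names are illustrative.

```python
# Figures quoted in the post; names are illustrative only.
HITS = 2_000              # approximate number of HITs listed
IMAGES_PER_HIT = 50       # stills tagged per HIT
PAY_PER_HIT_USD = 0.05    # 5 cents per HIT
FRAMES_PER_SECOND = 25    # frame rate assumed in the post


def footage_minutes(hits, images_per_hit, fps):
    """Minutes of video covered, assuming each image is a distinct frame."""
    return hits * images_per_hit / fps / 60


def total_cost_usd(hits, pay_per_hit):
    """Total paid to workers, excluding any Amazon fees (an assumption)."""
    return hits * pay_per_hit


minutes = footage_minutes(HITS, IMAGES_PER_HIT, FRAMES_PER_SECOND)
cost = total_cost_usd(HITS, PAY_PER_HIT_USD)
print(f"{minutes:.0f} minutes of footage for ${cost:.2f}")
```

That works out at a little over an hour of footage, which is where the "about an hour for $100" comes from.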

If you wanted to get data out of it, the video stills themselves could be captured from your screen like the above screen shot and put back into a movie. People and number plates can be seen in the images… and street signs, so you could figure out where they are. You could add bad data – bollards in the sky or whatever. Amazon have various methods to combat these attacks. But it’s all academic as they’re putting at least some of the work on their site anyway.

It strikes me that this is just scratching the surface of the potential of this class of problem; Mechanical Turk is still only known to a small subset of tech people, really. People with big data sets would want entire teams of lawyers to look at this and have Snow Crash-esque schemes to keep people from ‘stealing’ their precious data. Could you imagine the OS ever touching this with a barge pole?

The barrier to entry is a little high in that you have to create a Flash app or similar if you want to do more interesting HITs, but simpler ones are done automagically with forms by Amazon. The other barrier is that, like many other companies, they think the entire world ends at the edge of CONUS, so you can’t really make use of it unless you have a US bank account.

There’s another HIT which just asks for an idea from you – what small program would you like to see that doesn’t exist? I imagine the person who did that one just sitting back and browsing ideas for things to work on. How meta can you get?

So we now live in a world where you can effectively treat data storage (Amazon S3), processing (Amazon EC2) and mass non-linear human intelligence (Mechanical Turk) as infinitely cheap and available. You can get programming, design, legal advice and more from rentacoder, elance and the like.

Given this, I can’t think of much that you don’t have covered in Phase 1 of your average business plan. Or to look at it another way, the Google kids have been living in this world for maybe 4-5 years.

So readers, if you had a big dataset, what would it be and how would you get it processed using the above? Best idea gets $10 worth of HITs.

OSM Maplexed

Jerome Parkin at Lovell Johns has been playing with MapInfo, Maplex and OSM data to produce pretty maps and get them into Illustrator. This open mapping stuff could really take off.

The process involved the following:

  • Downloading data from OSM
  • Loading the file into PostGIS
  • Exporting to shapefile format
  • Re-projecting the data from geographic to British National Grid
  • Re-coding (in ArcMap) to match the Lovell Johns MapVU range
  • Maplexing (text placement) the re-coded data
  • Exporting to Illustrator
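
The first few steps could be approximated with open source command-line tools. The sketch below is not Jerome's workflow: it swaps in osm2pgsql for the PostGIS load and ogr2ogr for the shapefile export and re-projection, and it only builds and prints the commands rather than running them. The file, database and output names are hypothetical placeholders; EPSG:4326 (WGS84 lat/lon) and EPSG:27700 (British National Grid) are the standard codes for the two coordinate systems.

```python
import shlex

# Hypothetical placeholder names, not taken from the original post.
OSM_FILE = "area.osm"
DATABASE = "osm"
TABLE = "planet_osm_line"    # default line table created by osm2pgsql
OUTPUT_SHP = "roads_bng.shp"


def import_command(osm_file, database):
    """osm2pgsql import that keeps coordinates in lat/lon (--latlong)."""
    return ["osm2pgsql", "--latlong", "-d", database, osm_file]


def export_command(database, table, output_shp):
    """ogr2ogr export from PostGIS to a shapefile, re-projected from
    WGS84 lat/lon (EPSG:4326) to British National Grid (EPSG:27700)."""
    return [
        "ogr2ogr",
        "-f", "ESRI Shapefile",
        "-s_srs", "EPSG:4326",
        "-t_srs", "EPSG:27700",
        output_shp,
        f"PG:dbname={database}",
        table,
    ]


commands = [
    import_command(OSM_FILE, DATABASE),
    export_command(DATABASE, TABLE, OUTPUT_SHP),
]

# Print the commands for inspection instead of executing them.
for command in commands:
    print(shlex.join(command))
```

The re-coding, Maplex text placement and Illustrator export are specific to the proprietary tools named in the list and are not covered here.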

As Jerome says, ‘[It] looks promising as an alternative to expensive OS data.’

Multimap sponsors OSM work

Multimap have started to sponsor me to work on OpenStreetMap.

This was very much inspired by the month of OSM (and is not a replacement), which has seen a soft launch over the last three days. I hadn’t intended to spend all this week on OSM; I wanted to launch with a camera so you could see me working and stuff, but hey. I’ll count it as one day, and if you’ve been following the slippy map progress then you know it’s making an impact. More on the month of OSM in later posts; back to multimap.

I and others have been looking for a while for sponsors who wouldn’t compromise the soul of openstreetmap. The Nestoria sponsorship of the related mapstraction was a great start but wasn’t going to go on forever. I met the founders of multimap and others at their offices to talk about openstreetmap and they immediately ‘got it’, to the extent of sponsorship. This makes them pretty far ahead of the pack and, I think, visionary.

What’s in it for them? Well, that’s not entirely clear right now. Obviously there is publicity. It’s early days and there are things to be explored, but I think 80n has something on his user page when he says ‘My dream is of the day that Google Maps starts to use OSM data (under our license terms).’ Because multimap, Google and a lot of others aren’t really in the geodata creation business, they have to license it for lots of money, whether it’s maps, postcodes or whatever. If using OSM data is cheaper, then great. If the technology behind OSM can be used on existing proprietary data then that’s good too; there are many uses that can come as OSM matures. This might all be disturbing to some people, so I need to state some things plainly:

  • Will the OSM license change? NO!
  • Will OSM close off the data? NO!
  • Will OSM be re-branded as ‘multimap OSM’? NO! But, given that they sponsor us along with UCL and Bytemark, who host us, I think it’s entirely fair that we acknowledge that, like we do for the others already on the front page.
  • Are there any strings attached to the money? NO!

Those are real ‘NOs’. What’s telling is that multimap haven’t even asked me for these things. They ‘get it’. If they had, then sponsorship simply wouldn’t happen, as OSM will and must remain Open. OSM is really you and the data and code you’re putting in – there won’t be a CDDB-like debacle.

So what do you get out of it? Well, thanks to you the month of OSM is getting started and now further, deeper work can occur thanks to multimap. This means more bugs fixed, more hours spent rendering places, more coding and ultimately a better OSM.

So, a big thank you to multimap and to all of you for continuing to build OSM’s data and code.

Feel free to discuss all this on the mailing list and I’ll try to answer any questions that arise. Hopefully we should have some more good announcements soon.

Complete UK Maps from 1950s online

npemap is a nuke-from-orbit-quality browser of out-of-copyright UK maps. They’ve bootstrapped postcode data from freethepostcode and you can submit postcode data using whizzy ajax click-on-map goodness. Map data comes from scans of Richard’s New Popular collection and the code (so far as I know) comes from the Charlbury-based code ninjas – the UK’s highest concentration of mappers. It’s very pretty; try finding the forest your house was built on top of.

OS Shows Off Open Spaces

That’s three OSs, count ’em. OS OS is their gmaps-like API. It’s in beta, non-commercial and is OSGB projected. Slippy map, markers, bubbles… it’s all there. Someone in the audience pointed out that the data quality in the countryside is much better than what’s available now (e.g. Google). No link as yet.

Podcast: BCS

I spoke at the BCS a few weeks ago about OSM and open data. Ian made the good point afterward that you can get around the database directive by requiring your license contractually in order to access the data in the first place. More investigation needed. Anyway, here’s the pdf and mp3. The questions at the end are hard to hear; if you have time to process it and make them audible then I’ll happily update the mp3. Update: links fixed.

This week on the OSM lists…

So we are back again with another round-up of the fabulous OSM mailing lists. First up, the redesign of the OSM front page was kicked off by OJW with this entry. Top marks for including one of the only photos of a girl taking part in OSM data collection. Richard Fairhurst’s design included some sleek rounded borders – a challenge to replicate using only CSS. You can follow, or better yet, participate in improving the interface design of OSM here.

Next up, a question was raised about how you can find out about edits that are happening in your area. Because of OSM’s privacy policy, it’s not possible to query the API to find out who inserted or edited a node/segment etc. The ever wise and helpful Andy Robinson pointed out that it is possible to subscribe to an RSS feed centred on a specific location or of a specific user – more details here.

One of the longest threads recently has discussed entering street name details into OSM. It turns out that the situation is not as simple as tagging a node or way with a postcode or address; several issues complicate it. First off, as Andy R points out, tagging map data with address details will lead to a large increase in file size and complexity. So TM and others suggest using address nodes that are tagged with a postcode and address. The main problem with this approach is that it doesn’t take into account streets that have an irregular numbering system (what will we do when OSM expands to Japan, where buildings are numbered based on the date they were built?). Postcodes largely remove the need for a geo-database to even bother with house numbers – but then a lot of countries either don’t have postcodes or have systems that are not as accurate as the UK system. Tom Chance hits the nail on the head when he points out that recording every single address as a tagged POI is totally impractical. What this all boils down to is the very worthy goal of using OSM for navigation – the defining factor must be whether people think that having addresses coded to house number level is worth it.

Tireless database warrior Nick Hill has been running tests on the DB, as well as upgrading to MySQL 5.1. As a result of Nick’s tests, the DB now stores lat and lon as integer values, sacrificing millimetre accuracy, but leading to substantially improved query times. The main bottleneck now exists within the rendering system – perhaps time to consider client-side SVG rendering, Nick suggests.
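
To illustrate the integer trick, here is a minimal sketch of fixed-point coordinate storage. The scale factor of 10,000,000 (one integer step per 1e-7 of a degree, roughly a centimetre on the ground) is an assumption chosen to match the "sacrificing millimetre accuracy" remark; the post does not say which factor Nick used, and the function names are made up for the example.

```python
SCALE = 10_000_000  # assumed: one integer step = 1e-7 degrees


def to_fixed(degrees):
    """Convert a floating-point coordinate to a scaled integer."""
    return round(degrees * SCALE)


def to_degrees(fixed):
    """Convert a scaled integer back to floating-point degrees."""
    return fixed / SCALE


lat = 51.50735091234          # more decimal places than the integer can keep
stored = to_fixed(lat)        # what would go in the integer column
restored = to_degrees(stored)

print(stored, restored)
# The round trip loses at most half a step, i.e. 0.5e-7 degrees.
print(abs(restored - lat) <= 0.5 / SCALE)
```

Integer columns compare and index more cheaply than floating-point ones, which is presumably where the improved query times come from.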

Finally, there has been some discussion about representing areas. Chris Morley’s Chester map uses areas to colour parks – this is achieved by creating a way that is a closed loop and tagging it leisure=park, giving a green area. Or alternatively, tag natural=water to get – you guessed it – a blue area.
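
As a rough sketch of that rule, the snippet below treats a way as an ordered list of node ids and picks a fill colour from its tags. The two tag/colour pairs come from the description above; the list-of-node-ids representation, the function names and the sample data are assumptions for illustration and are not taken from Chris Morley's renderer.

```python
# Tag -> fill colour pairs described above.
AREA_COLOURS = {
    ("leisure", "park"): "green",
    ("natural", "water"): "blue",
}


def is_closed(node_ids):
    """A closed loop ends on the node it started from; the smallest
    possible area (a triangle) therefore has four entries."""
    return len(node_ids) >= 4 and node_ids[0] == node_ids[-1]


def fill_colour(node_ids, tags):
    """Return the fill colour for a closed, tagged way, or None."""
    if not is_closed(node_ids):
        return None
    for (key, value), colour in AREA_COLOURS.items():
        if tags.get(key) == value:
            return colour
    return None


# Hypothetical sample ways.
park = fill_colour([1, 2, 3, 4, 1], {"leisure": "park", "name": "Town Park"})
lake = fill_colour([10, 11, 12, 10], {"natural": "water"})
path = fill_colour([1, 2, 3], {"leisure": "park"})  # not closed, no fill

print(park, lake, path)
```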

That’s all from the list this week. Keep on mapping and hacking…

Nickb

FOSS4G – First Report

The FOSS4G conference in Lausanne, Switzerland, is almost over. There has been a lot of talk about Free, Open geo-data and quite a lot of interest in OpenStreetMap. Much of the interest is generated by the restrictive geo-data licensing that we are all too familiar with – it seems that people across the world share the same problems with accessing data. There is also an interesting case of “grass is always greener” between people in the USA (where the government provide basic geo-data for free) and most of the rest of the world, where we pay for the map data. The US representatives point out how crude their geo-data is, and also that private geo-data vendors supply data under licenses that are just as restrictive as those of the Ordnance Survey or other European mapping agencies.

So there is a lot of discontent within the Geo and wider community, which catalyses projects like OpenStreetMap and is also drawing the attention of larger organisations. OSGeo have had a massive presence at the conference. They are an organisation that has recently been set up to support open source geospatial software – products like GRASS, Mapserver and GDAL are all benefiting from their association with OSGeo. In a discussion session about open geodata, OSGeo expressed a lot of interest in helping grass roots projects, possibly by providing legal advice, by providing contact with a wider community, and by providing representation to the governments and administrators who pull the strings. They are definitely worth taking a look at.

OSGeo themselves are partially supported by Autodesk, who have just made the move into Open Source with their MapGuide Open Source product. For a proprietary software house like Autodesk to release an open source project may have been unimaginable a few years ago, and it demonstrates the turning of the open source software tide. How long will it be until the open geo-data tide turns the same way?

Nick