Category Archives: Operations

Hardware and system administration related posts. Anything related to the Operations Working Group

Powering OpenStreetMap’s Future: A year of improvements from OpenStreetMap Foundation’s Site Reliability Engineer

Just over one year ago, I joined the OpenStreetMap Foundation (OSMF) with the goal of enhancing the reliability and security of the technology and infrastructure that underpins OpenStreetMap. Throughout the past year, I have worked closely with the Operations Working Group, a dedicated team of volunteers. Together, we have made significant progress in improving our processes and documentation, ultimately strengthening our collective effectiveness. I am immensely grateful for the support and collaboration within this group, and I am delighted to witness the remarkable strides we have taken in building a solid foundation for the future of OpenStreetMap.

I’ll go into a little detail below about what’s transpired. At a high level, I made it easier to manage deployment of the software running on our servers; hardened our network infrastructure through better redundancy, monitoring, access, and documentation; grew our use of cloud services for tile rendering, leveraging a generous AWS sponsorship; improved our security practices; refreshed our developer environments; and last but definitely not least, finalised migration of 16 years of content from our old forums to our new forums.

If you want to hear more from me over the course of the work last year, check out my talk at State of the Map 2022 and my interview on the GeoMob podcast. I’d also love to hear from you: email me at osmfuture@firefishy.com.

2022-2023 Site Reliability Details

Managing software on our servers

Containerised small infrastructure components (GitHub Actions for building)

I have containerised many of our small sites, which were previously built using bespoke methods in our chef codebase as part of the “configuration as code” setup, and moved their build steps to GitHub Actions. I also set up a base for any future container (“docker”) based projects. These are our first container / docker based projects hosted on OSMF infrastructure.

Our chef based code is now simpler, more secure and deploys faster.

Improved chef testing (ops onboarding documentation)

We use chef.io for infrastructure (configuration) management of all our servers and the software used on them. Over the last year the chef test kitchen tests have been extended and now also work on modern Apple Silicon machines. The tests now reliably run as part of our CI / PR processes. The tests add quality control and assurance to the changes we make. Adding ARM support was easier because we could use test kitchen before moving onto ARM server hardware.

Having reliable tests should help onboard new chef contributors.

Hardened our network infrastructure

Network Upgrades @ AMS (New Switches, Dual Redundant Links, Dublin soon)

Our network setup in Amsterdam was not as redundant as it should have been. We had outgrown the Cisco Small Business equipment we were using, we had suffered unexpected power outages due to hardware issues, and the equipment was limiting future upgrades. The ops group decided to replace the hardware with Juniper equipment, which we had standardised on at the Dublin data centre. I replaced the equipment in a live environment with minimal downtime (<15 mins).

Both Dublin and Amsterdam data centres now use a standardised and more secure configuration. Each server now has fully bonded links for improved redundancy and performance. The switches have improved power and network redundancy. We are awaiting the installation of the fully resilient uplinks (order submitted) in the next month.

Out of Band access to both data centres (4G based)

I built and installed out-of-band access devices at each site. The devices are hard wired to networking and power management equipment using serial consoles. Each device has a resilient 4G link to a private 4G network (1NCE). The devices are custom built Raspberry Pis with redundant power supplies and 4x serial connectors.

Documentation of Infrastructure to ease maintenance (Racks / Power)

I documented each rack unit, power port (Power Distribution Unit), network connection and cable at the data centres. This makes it easier to manage the servers, reduces errors and allows us to properly instruct remote hands (external support provider) to make any changes on our behalf.

Oxidized (Visibility of Network Equipment)

Our network and power distribution configuration is now stored in git and changes are reported. This improves visibility of any changes, which in turn improves security.

Config is continuously monitored and any config drift between our sites is now much easier to resolve.
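
To illustrate what “config drift” means in practice, here is a minimal sketch that compares two saved configuration snapshots and returns a unified diff. It is not how Oxidized works internally, and the device names and config lines are invented for the example.

```python
import difflib
from typing import List


def config_drift(site_a: str, site_b: str,
                 name_a: str = "site-a", name_b: str = "site-b") -> List[str]:
    """Return unified-diff lines describing how two config snapshots differ."""
    return list(
        difflib.unified_diff(
            site_a.splitlines(),
            site_b.splitlines(),
            fromfile=name_a,
            tofile=name_b,
            lineterm="",
        )
    )


# Hypothetical snapshots; real switch configs are much longer.
AMS_CONFIG = "set system ntp server 192.0.2.1\nset system services ssh\n"
DUB_CONFIG = (
    "set system ntp server 192.0.2.1\n"
    "set system services ssh\n"
    "set system services telnet\n"
)

if __name__ == "__main__":
    # Print every line that differs between the two example sites.
    for line in config_drift(AMS_CONFIG, DUB_CONFIG, "ams-switch", "dub-switch"):
        print(line)
```

An empty result means the two snapshots are identical; any line starting with `+` or `-` is a candidate for review.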

Terraform Infrastructure as Code (improve management / repeatability)

Terraform is an infrastructure-as-code tool. We now use it for managing our remote monitoring service (StatusCake) and I am in the process of implementing it to manage our AWS and Fastly infrastructure.

Previously these components were all managed manually using the respective web UIs. Infrastructure-as-code allows the Ops team to work collaboratively on changes, and improves the visibility, repeatability and rollback of any changes.

We manage DNS for all our domains using dnscontrol code. Incremental improvements have been made over the last year, including adding CI tests to improve outside collaboration.

Grew our use of cloud services

AWS in use for rendering infrastructure. Optimised AWS costs. Improved security. Improved Backup. More in pipeline

The Ops team has slowly been increasing our usage of AWS over a few years. I have built out multiple usage-specific AWS accounts using an AWS organisation model to improve billing and security, as per AWS best practice guidelines. We received generous AWS sponsorship for expanding our rendering infrastructure. We built the experimental new rendering infrastructure on the ARM architecture, using AWS Graviton2 EC2 instances.
We haven’t previously used ARM based servers. As part of improvements to our chef (configuration as code) we had added local testing support for Apple Silicon (ARM), so only small additions were needed to make chef compatible with ARM servers.

We were impressed by the performance of AWS Graviton2 EC2 instances for running the OSM tile rendering stack. We also tested on-demand scaling and instance snapshotting for potential further rendering stack improvements on AWS.
We have increased our usage of AWS for data backup.

Improved our security

Over the last year a number of general security improvements have been made. For example, server access is now via ssh key only (password access is now disabled). We’ve also moved from a bespoke gpg based password manager for the ops team to gopass (a feature rich version of https://www.passwordstore.org/ ), which improves key management and sharing of the password store.

Additionally, we have enhanced the lockdown of our WordPress instances by reducing installed components, disabling inline updates and disabling XML-RPC access. We are also working to reduce the number of users with access and to remove unused access permissions.

Documented key areas of vulnerability requiring improvement (Redundancy, Security, etc)

Documentation on technical vulnerability: I am producing a report on key areas of vulnerability requiring improvement (Redundancy, Security, etc). The document can be used to focus future investment on further reducing our exposure to risk.

Refreshed our developer environments

New Dev Server

We migrated all dev users to a new dev server in the last year. The old server was end of life (~10 years old) and was reaching capacity limits (CPU and storage). The new server was delivered directly to the Amsterdam data centre and physically installed by remote hands. I then communicated the migration and moved all users and projects across.

Retired Subversion

I retired our old svn.openstreetmap.org code repository in the last year. The repository had been in use since the inception of the project and contains a rich history of the project’s code development. I converted the svn repository to git using a custom reposurgeon config, taking care to maintain the full contribution history and to correctly link previous contributors (350+) to their respective GitHub and related accounts. The old svn links have been maintained and now point to the archive on GitHub: https://github.com/openstreetmap/svn-archive

Forum Migration

For the old forum migration, we moved 1 million posts spanning 16 years to Discourse. All posts were converted from FluxBB markup to Discourse’s flavour of markdown. All accounts were merged and auth converted to OpenStreetMap.org “single sign-on” based auth.
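
As a rough idea of what such a conversion involves, the sketch below turns a handful of BBCode-style tags (the kind of markup FluxBB posts are written in) into markdown. It is a simplified illustration, not the converter used for the migration; the rule list is an assumption and covers only bold, italic and links.

```python
import re

# Ordered (pattern, replacement) pairs for a few common BBCode tags.
# A real importer handles many more tags, plus nesting and edge cases.
_RULES = [
    (re.compile(r"\[b\](.*?)\[/b\]", re.I | re.S), r"**\1**"),
    (re.compile(r"\[i\](.*?)\[/i\]", re.I | re.S), r"*\1*"),
    (re.compile(r"\[url=(.+?)\](.*?)\[/url\]", re.I | re.S), r"[\2](\1)"),
    (re.compile(r"\[url\](.*?)\[/url\]", re.I | re.S), r"<\1>"),
]


def bbcode_to_markdown(text: str) -> str:
    """Apply each rewrite rule in turn and return the converted text."""
    for pattern, replacement in _RULES:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    print(bbcode_to_markdown("[b]Hi[/b], see [url=https://example.org]this map[/url]"))
```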

All the old forum links redirect to the corresponding imported content: users, categories (countries etc), thread topics, and individual posts.
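
One way to picture the redirects is as a lookup from old forum URLs to new paths. The sketch below assumes FluxBB-style URLs (`viewtopic.php?id=…`, `viewtopic.php?pid=…`, `viewforum.php?id=…`, `profile.php?id=…`) and uses invented ID tables and a placeholder host. The real redirect rules and target paths may well differ.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Hypothetical lookup tables built during an import: old ID -> new path.
TOPIC_MAP = {"1234": "/t/5678"}
POST_MAP = {"4321": "/t/5678/3"}
FORUM_MAP = {"12": "/c/help"}
USER_MAP = {"99": "/u/example-user"}

NEW_BASE = "https://community.example.org"  # placeholder host, not the real one


def redirect_target(old_url: str) -> Optional[str]:
    """Return the new URL for an old forum link, or None if it is unknown."""
    parsed = urlparse(old_url)
    query = parse_qs(parsed.query)
    page = parsed.path.rsplit("/", 1)[-1]

    def first(key: str) -> Optional[str]:
        values = query.get(key)
        return values[0] if values else None

    path = None
    if page == "viewtopic.php":
        # A post ID is more specific than a topic ID, so check it first.
        if first("pid") is not None:
            path = POST_MAP.get(first("pid"))
        elif first("id") is not None:
            path = TOPIC_MAP.get(first("id"))
    elif page == "viewforum.php":
        path = FORUM_MAP.get(first("id"))
    elif page == "profile.php":
        path = USER_MAP.get(first("id"))
    return NEW_BASE + path if path else None
```

Returning `None` for unknown IDs lets the caller fall back to a generic landing page instead of guessing a target.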

Meet Grant Slater, the OpenStreetMap Foundation’s new Senior Site Reliability Engineer

Thanks to the support of corporate donors, the OpenStreetMap Foundation has been able to hire its first employee, who is starting on 1 May 2022. Grant Slater and Guillaume Rischard, the Foundation’s chairman, sat down for a virtual chat.

Hi! Tell us about you?

Hi! I’m Grant Slater, and I’m the new Senior Site Reliability Engineer (SRE) working for the OpenStreetMap Foundation. I’m originally from South Africa, and now live in London (UK) with my wife Ingrida and our son Richard.

What do you do in OSM? Where do you like to map?

I’ve been mapping since 2006, mostly in Southern Africa and in the United Kingdom. I have a strong interest in mapping the rail network of South Africa; holidays “back home” often involve booking railway trips across the country, with a GPS in hand.

My latest toy is an RTK GPS base station and rover. I’ll soon be mapping my neighbourhood with centimetre-level accuracy.

For the last 15 years, I’ve been part of the volunteer OpenStreetMap Operations Team who install and maintain the servers and infrastructure which runs the OpenStreetMap.org website and many other related services.

What are your plans for the new SRE job?

My main objective will be helping improve the reliability and security of the project’s technology and infrastructure.

One of my goals will be to improve the project’s long-term stability as we grow. OWG can’t work without volunteers, and I will be improving the Operations Team’s bus factor by improving our processes and documentation, and by smoothing the path to onboarding new team members.

I will be helping to drive forward modernising the project’s infrastructure by reducing complexity, paying-down technical debt, and reducing our need to maintain undifferentiated heavy lifting, by tactically using Cloud and SaaS services, where suitable.

Is there anything else you’d like to say?

With time, I would like to see OpenStreetMap introduce new tools and services to improve our mappers’ access to opted-in passively collected data to improve the mappers’ ability to map and detect change.

Gamification! OpenStreetMap should always remain a fun and gratifying experience for all. We’re building an invaluable and unique dataset with far-reaching consequences for which we should be incredibly proud. Happy Mapping!

I would like to hear your feedback and suggestions. Please email me at osmfuture@firefishy.com

Grant gave a talk at State of the Map US (2013) – OSM Core Architecture and DevOps and is hoping to give an updated talk at State of the Map 2022 in Florence, Italy 19–21 August 2022.

https://www.openstreetmap.org/user/Firefishy
https://twitter.com/firefishy1
https://github.com/firefishy

OpenStreetMap RSS/Torrent functionality for planet files

OpenStreetMap is open data, available to all for free on https://planet.openstreetmap.org. We release six large files, totaling 428 GB every week. These files contain the complete OpenStreetMap data, including files with the full history of OSM.

To make it easier to share the load of downloading these files, we also supply BitTorrent files, which allow spreading the load across multiple web servers as well as using peer-to-peer file transfers.

mnalis has recently implemented RSS feeds which announce when the torrents come out. This allows users to automatically subscribe to sharing new planet files as they are produced, thus reducing the load on the planet.openstreetmap.org servers, which are bandwidth-limited.
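
For readers who want to automate this, here is a minimal sketch of pulling torrent links out of an RSS document. The sample feed, its URLs and the use of `<enclosure>` elements are assumptions made for illustration; check the real feed (see the wiki link in this post) for its actual structure.

```python
import xml.etree.ElementTree as ET
from typing import List

# Invented sample feed; the real planet feed may be laid out differently.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example planet torrents</title>
    <item>
      <title>planet-example.osm.pbf</title>
      <enclosure url="https://example.org/planet-example.osm.pbf.torrent"
                 type="application/x-bittorrent" length="0"/>
    </item>
  </channel>
</rss>
"""


def torrent_urls(feed_xml: str) -> List[str]:
    """Collect the URL of every BitTorrent enclosure in an RSS document."""
    root = ET.fromstring(feed_xml)
    urls = []
    for enclosure in root.iter("enclosure"):
        if enclosure.get("type") == "application/x-bittorrent":
            urls.append(enclosure.get("url"))
    return urls


if __name__ == "__main__":
    for url in torrent_urls(SAMPLE_FEED):
        print(url)
```

Many BitTorrent clients can subscribe to RSS feeds directly, so a script like this is only needed for custom workflows.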

More information is available at:
https://wiki.openstreetmap.org/wiki/Planet.osm#BitTorrent_RSS.2FAtom_feed

Operations Working Group, Mnalis


Do you want to translate this and other blog posts into your language? Please email communication@osmfoundation.org with subject: Helping with translations in [your language]

The OpenStreetMap Foundation is a not-for-profit organisation, formed to support the OpenStreetMap Project. It is dedicated to encouraging the growth, development and distribution of free geospatial data for anyone to use and share. The OpenStreetMap Foundation owns and maintains the infrastructure of the OpenStreetMap project, is financially supported by membership fees and donations, and organises the annual, international State of the Map conference. The OSMF supports the OpenStreetMap project through the work of our volunteer Working Groups, such as the Operations Working Group. Please consider becoming a member of the Foundation.

Thanks to our new tile cache and domain name sponsors

Content Delivery Network of tile delivery caching servers.

The OpenStreetMap Foundation Operations Working Group would like to thank everyone who recently donated nodes for our tile cache CDN.

Tile cache nodes allow us to serve all users by answering map tile requests closer to the user, giving a faster response time, reducing rendering server load, and saving international bandwidth.
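
To show why proximity matters, the sketch below picks the closest cache to a user by great-circle distance. This is only an illustration of the idea: the cache names and coordinates are invented, and it does not describe how requests are actually routed to our caches.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0

# Hypothetical cache sites as (latitude, longitude); not the real node list.
CACHES = {
    "cache-eu": (52.37, 4.90),     # near Amsterdam
    "cache-sa": (-23.55, -46.63),  # near Sao Paulo
    "cache-oc": (-33.87, 151.21),  # near Sydney
}


def great_circle_km(a, b):
    """Haversine distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def nearest_cache(user):
    """Return the name of the cache with the smallest distance to the user."""
    return min(CACHES, key=lambda name: great_circle_km(user, CACHES[name]))


if __name__ == "__main__":
    print(nearest_cache((-22.9, -43.2)))  # a user near Rio de Janeiro
```

Geographic distance is only a proxy for network latency, which also depends on peering and routing.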

Caches added in 2019

In 2019, thanks to sponsors, caches have been hosted in the following countries:

Australia

Brazil

France

Germany

New Zealand

Sweden

Switzerland

Ukraine

United States


Chinese Dragon by Nyo, public domain

OpenStreetMap has an internal server naming theme based on fictional dragons, as in “here be dragons”.

Full list of tile caches here and on a map.

Would you like to host a tile cache?

If you operate an internet exchange or hosting company, or otherwise have a site with good internet connectivity and high regional bandwidth, you can look at the tile CDN node requirements. We welcome hosting of tile caches elsewhere, and are particularly looking for tile caches in Africa and Asia. If you are interested, please contact us.

Domain name sponsoring by Gandi

Gandi, in addition to hosting the new tile cache server Gackelchen in Bissen, Luxembourg and supporting OpenStreetMap France (an OSM Foundation Local Chapter), is now very generously sponsoring many of our domain names. We would like to thank them for their support.

Operations Working Group

The OpenStreetMap Foundation is a not-for-profit organisation, formed to support the OpenStreetMap Project. It is dedicated to encouraging the growth, development and distribution of free geospatial data for anyone to use and share. The OpenStreetMap Foundation owns and maintains the infrastructure of the OpenStreetMap project, is financially supported by membership fees and donations, and organises the annual, international State of the Map conference. It has no full-time employees and it is supporting the OpenStreetMap project through the work of our volunteer Working Groups. Please consider becoming a member of the Foundation.

OpenStreetMap was founded in 2004 and is an international project to create a free map of the world. To do so, we, thousands of volunteers, collect data about roads, railways, rivers, forests, buildings and a lot more worldwide. Our map data can be downloaded for free by everyone and used for any purpose – including commercial usage. It is possible to produce your own maps which highlight certain features, to calculate routes etc. OpenStreetMap is increasingly used when one needs maps which can be very quickly, or easily, updated.

Can you help the Operations Working Group?

Image by OSM Communication Working Group, CC-BY-SA 3.0
OSM logo by Ken Vermette, CC-BY-SA 3.0 & trademarks apply.

The OSM Operations Working Group is a volunteer group, responsible for running the servers owned by the OpenStreetMap Foundation.
We are always keen to find new members and we are particularly looking for people who:

  • can analyse our server infrastructure
  • make plans
  • forecast future hardware needs
  • draw up budgets

This does involve a certain level of technical expertise but it’s not writing code, for example, and OWG membership doesn’t grant access to any of the servers – that’s for our Sysadmins. If you would like to join us, have a read of our membership policy, and please get in touch!

Some additional information:

  • OWG’s main communication channels are GitHub and email. We rarely have meetings.
  • Estimate of hours per week: 1-3

Email us at operations@osmfoundation.org
We are also on Twitter @OSM_Tech

If you have the technical expertise and experience to be a sysadmin, read our sysadmin membership policy and get in touch.

OSM Operations Working Group

The OpenStreetMap Foundation is a not-for-profit organisation, formed in the UK to support the OpenStreetMap Project. It is dedicated to encouraging the growth, development and distribution of free geospatial data for anyone to use and share. The OpenStreetMap Foundation owns and maintains the infrastructure of the OpenStreetMap project, is financially supported by membership fees and donations, and organises the annual, international State of the Map conference. It has no full-time employees and it is supporting the OpenStreetMap project through the work of our volunteer Working Groups. Please consider becoming a member of the OSM Foundation.

Can you help make OpenStreetMap.org faster in Brazil, or Australia/New Zealand?

CDN of tile delivery caching servers.

A big use of the OpenStreetMap data is the web map on OpenStreetMap.org. Along with our hard working team of volunteer sysadmins who keep it going, we are helped by many donated tile cache servers around the world, which speed up the map in various regions.
We are always open to more servers, but the Operations Working Group is currently looking for servers in Brazil, and Australia/New Zealand. If you or your organisation would like to donate a cache server and hosting, we’re ideally looking for a physical server or powerful VM with 8GB+ RAM and at least 146GB of storage. Read more details.

Please email operations@osmfoundation.org if you are interested.

Our peak Brazil traffic is currently around 65 Mbps. Our peak Australian and New Zealand traffic is currently around 20 Mbps. See the full country breakdown in bits per second.

Some more information:

  • Brazil has the 10th highest traffic and is the largest country without a cache in it or nearby.
  • Worldwide peak traffic is 2300 Mbps.
  • Antarctica and Australia are the two continents we do not have caches on.

We fully manage the software and operating system. All config is managed via our chef recipes. We also run a local firewall on each server. On physical hardware, we monitor it using SMART, hp-health, etc and report any hardware issues back to the hosting organisation.
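
As a rough sketch of what a SMART health check can look like, the function below reads a JSON health report and flags a failing disk. It assumes a report shaped like the JSON output of recent smartmontools releases (`smartctl --json -H <device>`), with a `smart_status.passed` field. The sample report is invented, and this is not the monitoring code we run.

```python
import json
from typing import Optional

# Trimmed, invented example of a JSON health report for one disk.
SAMPLE_REPORT = '{"device": {"name": "/dev/sda"}, "smart_status": {"passed": false}}'


def failing_device(report_json: str) -> Optional[str]:
    """Return the device name if the report says the health check failed."""
    report = json.loads(report_json)
    passed = report.get("smart_status", {}).get("passed")
    if passed is False:
        return report.get("device", {}).get("name", "unknown")
    # Either the check passed or the report carries no health verdict.
    return None


if __name__ == "__main__":
    device = failing_device(SAMPLE_REPORT)
    if device:
        print(f"SMART health check failed on {device}")
```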

Will you help us and join the people and organisations that support OpenStreetMap? Thank you!

The OpenStreetMap Foundation is a not-for-profit organisation, formed in the UK to support the OpenStreetMap Project. It is dedicated to encouraging the growth, development and distribution of free geospatial data for anyone to use and share. The OpenStreetMap Foundation owns and maintains the infrastructure of the OpenStreetMap project. Volunteers, like the indefatigable team of server administrators, keep all of this hardware working. 

OpenStreetMap tiles are free for everyone to use, but should be used with moderation. If you are a high traffic site you should look at switch2osm.org to find out how to use the data and keep the tiles available for everyone.

If you can’t donate server hosting, you can always make a financial donation to the OSMF.

New Tile Render Server in the USA

We have a new Tile Render server in the United States! The hardware has been kindly provided by OpenStreetMap US and hosted by the Oregon State University Open Source Lab. Big thanks to them, and to Ian Dees who coordinated this response to the Operations Working Group’s request.

Our distributed tile serving infrastructure brings the “standard” map tiles to your browser wherever you are in the world in a reasonably fast fashion, resulting in a pleasant map viewing experience on the OpenStreetMap.org front page, and with new map edits reflected a few minutes after they are made. It should always be noted that this is far from the only way of using our maps, and we encourage developers to take our data, render it, and otherwise make it available to users in new ways. However, we do like the front page map to work well. We have a set of “rendering” servers doing the hard work of creating and refreshing raster map tiles, and a larger set of caching servers. With the introduction of a new rendering server in the United States (the first outside of Europe) tiles will load faster. The server itself is fast, and for users in the United States we expect to remove about 100 milliseconds of latency for people viewing the map.

Network latency for requests to the new tile server from various locations in the USA

Details of this new server (which we’ve named “Pyrene”) can be found on the hardware.openstreetmap.org site.

We’re still building our tile serving infrastructure, with a lot of help from people and organisations donating resources. If you are in a position to help with this sort of thing, a caching server – or better yet a rendering server – in India would make a huge performance improvement for people there. Learn more about the kind of servers we need at our wiki page and contact the Operations Working Group.

Server moves: Goodbye Imperial. Hello Equinix Amsterdam

Servers de-racked and ready to move

Some of our servers are moving to a new home. Quite a few of our important servers have been housed at Imperial College in London for the past few years, but it’s time to move on from there as they look to reclaim some space for offices. We’d like to thank Imperial for our time together!

We continue to be thankful to University College London, and Bytemark who are still generously providing hosting for some other key servers, not to mention our many Tile Cache hosts around the world. If you’re interested in server details you can see the full list on our hardware page.

That list is set to change very soon, as Imperial machines are powered down and moved. The move is being carried out this week by volunteers from the OWG/OSMF.

Where are we moving these servers to? We sought proposals for a new home (thanks to all those who replied), and Equinix Amsterdam has been selected as our new data centre provider. This brings a little more diversity of locations for our servers (many of the others being in the UK), but it’s still not a million miles away, in case our operations team need to visit. Equinix Amsterdam provide excellent “smart hands”, removing the need for physical visits on a regular basis. That being said, the Operations Working Group are seeking someone to help in Amsterdam who can visit the data centre if we need. To quote the OWG folks this volunteer would “need to be trusted, competent and did I say trusted”!

As ever, we owe a big thanks to OWG volunteers for all the hard work going into managing these server moves.

OSMF Request for Proposals: Data Centre 2018

Photo by cosheahan on flickr. Licence: Attribution 2.0 Generic (CC BY 2.0)

Statement of Purpose

The OpenStreetMap Foundation (OSMF) Operations Working Group (OWG) is looking for proposals to provision space in a data centre to continue to run the OpenStreetMap (OSM) project’s infrastructure.

Background Information

OSMF is a nonprofit organisation dedicated to supporting, but not controlling, the OpenStreetMap project. OSMF created the OWG in order to support OSM’s technical infrastructure, including the main website, API, data distribution, community sites, and manage them for the benefit of the project.

The map data created by OpenStreetMap and distributed through OSMF is the best free global map available. It powers services all over the world, including for companies such as Apple, Foursquare, Craigslist and Mapbox. Please see https://www.openstreetmap.org/about for more background information.

Scope of Work

The data centre provider must meet the requirements set out below.

Requirements

Primary

●  The data centre must be in the EU.
●  One rack (at least 40U) of space, at industry standard rack dimensions.
●  Power capacity at least 3kW w/ dual redundant supplies.
●  Cooling to keep the servers suitably cool, e.g. under 35 degrees Celsius.
●  Secure cages, so that only access authorised by the data centre or OSMF is possible.
●  On-site “remote hands” to be able to receive and replace HDDs and press power buttons during weekday business hours, with at least some service at weekends and on holidays.
●  Network connection capable of 1Gbit/s peak traffic and 500Mbit/s sustained.
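
For a sense of scale, the sustained figure above can be turned into an approximate monthly transfer volume. This is a back-of-the-envelope sketch, not part of the requirements: it assumes a constant rate, a 30-day month and decimal units.

```python
SUSTAINED_MBIT_PER_S = 500  # sustained rate from the requirement above
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_MONTH = 30  # assumption, chosen for a round figure


def transfer_terabytes(mbit_per_s: float, days: float) -> float:
    """Decimal terabytes moved at a constant rate over the given number of days."""
    bytes_per_second = mbit_per_s * 1_000_000 / 8
    return bytes_per_second * SECONDS_PER_DAY * days / 1e12


if __name__ == "__main__":
    monthly = transfer_terabytes(SUSTAINED_MBIT_PER_S, DAYS_PER_MONTH)
    print(f"Roughly {monthly:.0f} TB per month at {SUSTAINED_MBIT_PER_S} Mbit/s")
```

At 500 Mbit/s this works out to about 5.4 TB per day, or roughly 162 TB over 30 days.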

Secondary

●  Control over the configuration of any upstream firewalls for the purposes of ensuring necessary ports are open.
●  Good peering connection to major European backbone network. Ideally within 20ms of our existing sites on JANET.

Additional questions

Please provide detailed information on:
●  The procedure for shipping parts to the data centre, and
●  The procedure for raising a ticket for “remote hands” work, and
●  Whether “remote hands” would be available outside of business hours, and
●  The site’s uptime and network reachability over the past year, and
●  The procedure for an OSMF representative to visit and access the data centre.

Term of Agreement

The agreement would start on or before 1st April 2018 and run for a minimum of 3 years (at OSMF’s option), preferably renewable annually or on a longer basis after that. Any renewal or cancellation on either side would need a minimum notice period of 3 months.

Terms and Conditions

If you have Terms and conditions or Acceptable Use Policies then you should submit them in editable form for legal review, where possible. T&C/AUP changes should be expected to ensure we meet the privacy and security commitments required for our users.

Schedule, Evaluation and Award Process

This RFP is expected to be open until 28th February 2018. Only applications received prior to this date can be considered for this RFP. All proposals will be received in confidence and will be kept private.

After the date above, all proposals will be evaluated by the OWG against the requirements set out above, after which OWG may contact candidate sites with follow-up questions or to arrange site visits. The final agreement will require legal review and approval by the OSMF board.

OSMF particularly welcomes responses from anyone willing to support the work of the foundation at minimal cost.

The successful candidate will be publicly thanked on the main OSM project website as well as OWG websites in accordance with OWG’s Hosting Provider Credit Policy.

Points of Contact

Many thanks for your interest. If you have any questions or proposals, please send them to operations@osmfoundation.org.

Server maintenance May 13-15

Some secondary services like the wiki may be affected by maintenance this weekend. Following this weekend, OpenStreetMap users may experience some slow uploads due to continued tuning after our recent database move.

On Monday May 9, our Operations Team smoothly moved the master OSM database server from Imperial College to Bytemark hosting. Setting up redundancy across multiple data centres avoided further downtime due to planned power testing and maintenance at the Imperial data centre.

Over the weekend, we are switching some additional services to Bytemark, due to that power maintenance at Imperial. Primary OSM services will be operational, but the Wiki will be in read-only mode during this time. The slow uploads are a known issue, and are being addressed following the maintenance this weekend. More details on the talk mailing list.