Real World Computing
Bowled a Googly
Costly data
Why would a company the size of Google have such a problem? After some further investigation, it began to look like a cost issue. I discovered an open-source file of basic UK postcodes that's free but gives only rough locations, since it decodes just the first group of two to four characters. There are some mathematical tricks you can employ to "guesstimate" an offset from the house number, but these work only for towns built on a block grid. So what's the problem with getting more accurate data? Strictly speaking, there's no problem: you just need to pay several thousand pounds per year to the Royal Mail for the use of its 28-million-address Postcode Address File (PAF) - and that's just for a single-user licence. The Royal Mail obviously maintains the PAF data for its own purposes, but also sells it on to direct mail companies, which are obliged to use accurate postcodes to qualify for the bulk discounts on their mailings under the Post Office system called "Mailsort".
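To make the "rough location" idea concrete, here's a minimal Python sketch of how a lookup on the first group of a postcode might work. The table name, the function names and the two coordinate pairs are my own illustrative assumptions, not the contents of the free file mentioned above, and the only rule it relies on is that the second group of a UK postcode is always three characters long.

```python
from typing import Optional, Tuple

# Hypothetical lookup table: first postcode group -> rough (latitude, longitude).
# The values are illustrative approximations, not taken from any real dataset.
OUTWARD_CENTROIDS = {
    "SW1A": (51.50, -0.14),
    "M1": (53.48, -2.24),
}


def outward_code(postcode: str) -> str:
    """Return the first group of a UK postcode, e.g. 'SW1A' for 'sw1a 1aa'."""
    compact = "".join(postcode.split()).upper()
    # The second group is always three characters, so the first group
    # is whatever precedes it.
    if not 5 <= len(compact) <= 7:
        raise ValueError(f"unexpected postcode length: {postcode!r}")
    return compact[:-3]


def rough_location(postcode: str) -> Optional[Tuple[float, float]]:
    """Look up a rough position for the postcode, or None if it is unknown."""
    return OUTWARD_CENTROIDS.get(outward_code(postcode))
```

Every address sharing a first group gets the same point, which is exactly why this kind of data can't place an individual house.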
When you start to look closely at who pays to maintain this postcode data, you'll find that local councils have to provide the Royal Mail with information about all new addresses created, and the Ordnance Survey then provides the location of these new addresses to the Post Office. The Post Office claims to have made a profit of only £1.6 million on a total PAF revenue stream of £18.4 million in 2005-06 (a lot of it from selling the data back to local councils and other government departments). The "Free Our Data" campaign suggests that, as this data is so critical to the running of national services, it should be maintained by central government and made available free online. At the moment, we're indirectly paying twice for the data: once for the input from councils and the Ordnance Survey, and again to use it on our websites.
Many countries already make such information open source, and until the UK government makes the data truly public we'll remain a backwater in map-based applications, at a time when the technology is growing explosively. Far too often, I sit through presentations of geographically based web applications, amazed and excited at what can now be achieved, only to find it prohibitively expensive to implement for my clients. For example, wouldn't it be cool to be able to show all places of interest within a given radius of a given postcode? Without the PAF data that's practically impossible in this country, whereas in the US it's no problem. Another boat we're in danger of missing...
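The radius search itself is the easy part once you have a latitude and longitude for the postcode. Here's a minimal Python sketch, assuming the postcode has already been resolved to coordinates by some means; the function names and the mean Earth radius of 6,371km are my own choices for illustration, and it uses the standard haversine great-circle formula rather than anything specific to a particular mapping service.

```python
import math
from typing import Iterable, List, Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two lat/long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def places_within(
    centre: Tuple[float, float],
    places: Iterable[Tuple[str, float, float]],
    radius_km: float,
) -> List[Tuple[str, float]]:
    """Return (name, distance_km) for places inside the radius, nearest first."""
    hits = []
    for name, lat, lon in places:
        distance = haversine_km(centre[0], centre[1], lat, lon)
        if distance <= radius_km:
            hits.append((name, distance))
    return sorted(hits, key=lambda hit: hit[1])
```

The catch is the input: the result is only as good as the coordinates you start from, and without accurate postcode data the centre point could be a long way out.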
One or two websites offer their own lists of postcodes with latitude/longitude, but how accurate they are I don't know. Linuxbox (http://linuxbox.co.uk) used to sell a list for only £90, but the owner tells me the data is now out of date due to new house building, which highlights that the main problem with geographical data is maintenance. Another provider of postcode data can be found at www.freethepostcode.org, which asks users to submit known postcodes and GPS location data with the aim of building up a totally open-source database. It still has some way to go, however: it currently lists just 3,547 postcodes out of the more than 1.5 million in the UK. And take a look at www.npemap.org.uk, which uses 1940-vintage maps (copyright-free, I guess) on which you can find your house's location and click on the map to submit your postcode to its database. None of these efforts is ideal or 100% accurate (not that the PAF is 100% accurate, either, but it's the best we have). What we need is some way to update our websites with the new data via an online resource.





