Tuesday, 16 July 2019

River Till

The plan today was to stick the bike on the back of the car, and try part of NCN68 a bit further north than the section that I rode a few weeks ago. However, as I drove north the sky went dark and there was a heavy shower of rain. Instead of stopping near Ford, and getting soaked as I worked north, I thought it better to carry on driving until the rain stopped, then work south.

The rain had pretty much stopped by the time I reached Norham. From there I followed NCN68 south, to Etal then back. It was a nice ride, with a particularly lovely section following the bank of the River Till.

In a few days' time I'll complete my first two months back on the bike. Fourteen miles today takes the total distance past 200 miles. That doesn't sound much, but the air was humid and the landscape crinkly enough to make it feel like more. And the extended journey meant that I got home later than I planned, but as content as I hoped (and I didn't get soaked).

Sunday, 14 July 2019

Littlemill limekilns

A pleasant ride today, passing this massive structure on the way to Embleton. This is one of the largest 19th-century limekilns in the country. They built it next to the railway, so that lime could be loaded directly from the kiln onto wagons, and it remained in service for about a century - until shortly after World War II.

As I settle into this loop as a fairly regular ride it seems to be getting easier. Is that because it's becoming more familiar, or because I'm getting stronger? I have no idea, but it was a pleasant afternoon, and another seventeen miles bumped my Eddington number for 2019 up by one.

Sunday, 30 June 2019

Breamish Valley

This beautiful valley in the Northumberland National Park is crossed by NCN68.
I'm out of practice, so the gentle climb up the valley against a steady headwind was hard work.
But the return felt like flying.

Thursday, 27 June 2019

Friday, 25 September 2015

Lidar data

I'm not sure what it is going to be used for, but the Lidar data recently opened up by the Environment Agency is remarkable. Here's central Alnwick, hillshaded in QGIS, using 1m DSM elevations downloaded from here.

Edited: 26/9/2015

Thanks to Chris for the comment and pointers.

There's more.

If sea levels continue rising at 3.2mm p.a., Alnwick will look like this by 20,000 a.d. Thankfully our local pub will still be above the waves.
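
For anyone wanting to check the arithmetic, here is a minimal sketch. It assumes a constant rate and counts from 2015 (the year of this post); the post does not state the flood level actually used for the image, so the figure is only indicative.

```python
RISE_MM_PER_YEAR = 3.2   # rate quoted in the post
START_YEAR = 2015        # assumption: counting from the year of the post
END_YEAR = 20_000

years = END_YEAR - START_YEAR
rise_m = RISE_MM_PER_YEAR * years / 1000  # convert millimetres to metres

print(f"{years} years at {RISE_MM_PER_YEAR} mm/yr is about {rise_m:.1f} m")
```

On those assumptions the rise comes out at a little under 58 metres.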

And this is what I reckon the duke can see from the top of Alnwick Castle.

Sunday, 23 August 2015

OSM retail survey: Conclusions-2

This picks up from previous posts to consider more specifically what tools might help contributors.

The examples are rudimentary – stuff I've assembled for my own use, rather than robust tools for the wider community. If they have any value I hope it will be as prototypes for something more polished.

Missing data

There are about 385,000 retail properties in England that are missing from OSM, and the obvious way to help contributors is to point out where they are.

To help achieve the most rapid improvement across the whole country I have tried to find dense retail concentrations that haven't been thoroughly mapped yet.

These are the biggest concentrations of unmapped retail property in England and Wales: about 1,000 of them, each with an average of 100 missing retail outlets across an area of under 2 sq. km.

I've used a mix of Food Hygiene Data, Non-Domestic Rates, population data, and various other statistics to identify concentrations of retail outlets at a local level. I've done this for England and Wales. The same basic technique should work in Scotland because similar data is available, but the structure of the census geography and the data on non-domestic rates for Scotland are quite different, so the process needs tweaking, and I haven't got round to that yet.

My formula for estimating the number of retail premises at a local level can probably be improved, but it will never be perfect. At this stage I don't think it is good enough to reliably identify areas that are almost complete, because that needs more precision. But I think it is good enough to flag up areas that are far from complete. Contributors who are looking for significant concentrations of missing retail outlets should be able to do a quick check on the area. If it still looks empty on the map, they can head there with a reasonable expectation of adding enough new retail outlets to make the trip worthwhile.

Feedback based on local knowledge would be welcome, to help refine this a bit more.

Helping contributors to find nearby concentrations of missing retail outlets is one way to quickly increase the overall volume of data. A different starting point is to assume that thorough retail coverage in some areas has a higher value to data users than adding missing shops elsewhere. On that basis we may want to encourage contributors to concentrate first on mapping areas which we think have the highest potential value.

This example picks out a limited number of smaller towns and cities where OSM data might have high value (e.g. to students or visitors).

Areas coloured:
  • blue already contain more than 75% of my estimated number of retail outlets
  • green contain 50-75% of my estimated number of retail outlets
  • orange contain 25-50% of my estimated number of retail outlets
  • red contain less than 25% of my estimated number of retail outlets 
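
The colour bands above amount to a simple classification of the mapped share of the estimate. A minimal sketch follows; the function name is hypothetical, and how the exact boundary values (25%, 50%, 75%) are assigned is an assumption, since the list does not say.

```python
def coverage_band(mapped, estimated):
    """Return the map colour for an area.

    mapped:    number of retail outlets already in OSM
    estimated: estimated number of retail outlets in the area
    Boundary handling (e.g. exactly 50% counts as green) is an assumption.
    """
    if estimated <= 0:
        return None  # no estimate, so no band can be assigned
    share = mapped / estimated
    if share > 0.75:
        return "blue"
    if share >= 0.50:
        return "green"
    if share >= 0.25:
        return "orange"
    return "red"


# Example: 40 shops mapped against an estimate of 100
print(coverage_band(40, 100))
```

With 40 of an estimated 100 outlets mapped, the area falls in the orange band.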

Each area is intended to cover a manageable size: one where a few contributors should quickly be able to bring retail content up to an impressive level. Larger cities are excluded on the basis that they justify a more systematic approach. My list is a bit arbitrary – it is intended to cover a mix of different towns of roughly similar size, distributed across the country. Are these really the areas where OSM retail data is likely to have most value? I doubt it, but that might be a useful discussion point in its own right. For each suggestion of a settlement that should be added, please feel free to suggest one that should be removed.

I can only assess how useful these estimates might be in areas that I know fairly well. Feedback on any unexpected results would be useful: to better understand where the technique can be improved.

Feedback to contributors

All contributors deserve to see the results of their work. But not all retail information is rendered on the standard map. And in my view it never can (and shouldn't) be. So to encourage contributors I would like to see a decent alternative to the standard map which shows more complete retail information. When I want to check specific content of the database I use either a data extract, Overpass, or the “Map Data” overlay on the standard map view. I'm happy to do this, but for many contributors (and particularly for novices) none of these techniques is particularly user-friendly. I suspect that building such an alternative map is beyond my own technical capabilities, but there are examples (based on various data extracts) that illustrate the kind of thing that can be done.

Data collection

When mapping retail areas there will normally be some shops already recorded in OSM, which need checking. Alongside other existing features such as road junctions, these also provide reference points for adding new data. When surveying retail premises, it's handy to have a crib sheet to hand, on which to collect notes of any changes, which shows the current state of the data. This needs to show every relevant feature in the database, including some which won't be rendered on the standard map.

Below is an example generated (automagically, with some rather clunky SQL) from OSM data for Winchester High Street (the pedestrian part). It starts at the western (top) end.

I've set this up to collect any shops, amenities and offices within 25 metres of the highway centre line, and display them in order. This simplistic approach only exhibits the most basic information, and includes more features than I would really want: including shops beyond each end of the central line, up side streets, and occasionally from a nearby street running parallel. But it's easy enough to cross out any unnecessary entries. To allow for some additions there's an additional spacer inserted every 20 metres (roughly twice the width of a conventional shop front). I find sheets like this speed up the data collection process and make it easier to add notes.
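
The original sheet was produced with SQL that isn't shown here. As a rough illustration of the same idea only, the sketch below orders features along a street: it assumes coordinates are already projected to a planar grid in metres, and all names (`project_onto_line`, `crib_sheet`, the sample features) are hypothetical.

```python
import math


def project_onto_line(point, line):
    """Return (chainage, offset) of a point relative to a polyline.

    chainage: distance along the line to the nearest point on it
    offset:   distance from the point to that nearest point
    Coordinates are assumed to be planar and in metres.
    """
    px, py = point
    best_offset, best_chainage = math.inf, 0.0
    walked = 0.0  # total length of the segments already passed
    for (ax, ay), (bx, by) in zip(line, line[1:]):
        dx, dy = bx - ax, by - ay
        seg_len = math.hypot(dx, dy)
        if seg_len == 0:
            continue  # skip duplicate vertices
        # Fraction along the segment of the closest point, clamped to its ends
        t = ((px - ax) * dx + (py - ay) * dy) / seg_len ** 2
        t = max(0.0, min(1.0, t))
        offset = math.hypot(px - (ax + t * dx), py - (ay + t * dy))
        if offset < best_offset:
            best_offset, best_chainage = offset, walked + t * seg_len
        walked += seg_len
    return best_chainage, best_offset


def crib_sheet(features, line, max_offset=25.0, spacer_every=20.0):
    """List feature names in order along the line, with blank spacer rows."""
    rows = []
    for name, point in features:
        chainage, offset = project_onto_line(point, line)
        if offset <= max_offset:
            rows.append((chainage, name))
    rows.sort()

    sheet, next_spacer = [], spacer_every
    for chainage, name in rows:
        # Add one blank row for each spacer mark passed since the last entry
        while chainage >= next_spacer:
            sheet.append("")
            next_spacer += spacer_every
        sheet.append(name)
    return sheet


street = [(0, 0), (100, 0)]  # made-up centre line, 100 m long
shops = [("Bakery", (5, 3)), ("Bank", (45, -10)),
         ("Far pub", (50, 40)), ("Chemist", (25, 5))]
for row in crib_sheet(shops, street):
    print(row or "________")
```

Because the nearest point is clamped to the ends of the line, anything within 25 metres of either end is still listed, which matches the over-inclusion described above.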

Consistency checks

I hope I've made a clear case that across most of the UK adding missing retail data is a higher priority than cleaning up tagging inconsistencies. However, this isn't true everywhere, and pointers to inconsistencies could help contributors to clean local data.

Some basic consistency checks are easily carried out on Overpass:
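
As one illustration (the choice of check is an assumption, not necessarily one of those in the original post), a query for shops that have no name tag inside a bounding box could be assembled as below. The helper name and the coordinates are hypothetical; the query text itself is Overpass QL.

```python
def unnamed_shops_query(south, west, north, east, timeout=25):
    """Build an Overpass QL query for shops that lack a name tag.

    Hypothetical helper: the bounding box is given in decimal degrees,
    in the (south, west, north, east) order that Overpass expects.
    """
    bbox = f"{south},{west},{north},{east}"
    return (
        f"[out:json][timeout:{timeout}];\n"
        f'nwr["shop"][!"name"]({bbox});\n'
        "out center;"
    )


# Rough, illustrative box; the printed text can be pasted into Overpass Turbo
print(unnamed_shops_query(51.05, -1.33, 51.07, -1.30))
```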

But this isn't ideal for finding all quirky data within a local area, and finding more complex inconsistencies sometimes involves extensive processing that isn't really practical interactively. Overpass isn't the ideal solution here, but it is possible to do more crunching on a data extract. Here are some examples. Unlike Overpass, anything here that is fixed won't be quickly updated in the overlay (some of these quirks are already fixed, which could get annoying). Note that, for the sake of simplicity, this overlay only contains some of the features in the UK that exhibit these quirks.

Wednesday, 12 August 2015

OSM Retail Survey: Conclusions-1

OSM has thrived by bringing together a community with diverse interests, and aligning their efforts behind a common purpose. In thinking how best to improve retail coverage it seems useful to consider how different groups with different interests and different skills will be able to contribute.

The most obvious question for the community is how the existing tools might be improved. But I am not going to start there. Instead I will begin with how contributors might view the priorities - because that will determine which tools will be of greatest help.

My starting point is based on findings from the survey:
  • In some localised areas retail data in OSM is the most comprehensive retail data that is generally available. Because OSM data has a degree of structure it should be capable of supporting certain types of structured search that are extremely difficult to achieve in any other way. These are the areas where the data will be of greatest value to end users and hence of greatest interest to application providers.
  • We are still a long way from being able to offer comprehensive retail data across the whole of the UK. In the foreseeable future this means that most viable applications based on OSM data are likely to have a local focus, rather than aiming for national coverage. So far only a few areas have been really thoroughly mapped. One priority is to increase the number of thoroughly mapped areas.
  • Elsewhere, whatever issues data users find with the consistency and accuracy of UK retail data in OSM, the impact of those issues is small in comparison to the amount of retail data that is missing from OSM. Another priority is to reduce the volume of missing retail data.
To address missing data, I assume the community needs to expand the number of contributors, as well as encouraging existing contributors to add more basic retail data. We need to ensure that the process of collecting and contributing data is both satisfying and productive.

Most contributors have only a limited choice of where to map. The question we need to help them with is how to make the biggest impact in their local area. Some contributors have more choice of where they map. The question we need to help them with is where they can make the biggest impact.

If OSM is going to provide a decent platform for viable applications based on retail data, then the priority is to bring more areas of the UK up to a standard that compares with the best. OSM data doesn't have to be complete in order to be the best available source of retail data in a well-defined area: but it should be getting near complete. In towns and smaller cities individual contributors can quickly make an impact, by bringing retail data up to a good standard across a well-defined area. In an ideal world they might choose a location that would most interest potential application providers – perhaps a university town, or a city that attracts a large number of visitors.

I'd like to think that contributors who want to improve retail data will start by assessing how retail coverage currently stands in their chosen area. For a rough idea, they can examine the standard map, or for more precision they can compare the number of shops in the OSM database with an estimate of how many there should be. There are various ways to get that estimate, but that's a separate question which I'll defer for now.
  • If local coverage is currently under 25%, then this part of the map is still close to being a blank canvas. OSM data is far from providing the best source of retail data, and there will still be gaps in some of the most commonly mapped features, such as post-offices and pubs. The first priority is to make a start, develop technique, and demonstrate progress to encourage others. For a contributor's own motivation, they should begin with whatever interests them personally. This probably includes retail outlets that they are familiar with (i.e. ones that their family, friends and neighbours use regularly). Beyond that, major retail premises, such as supermarkets, banks and larger high street stores are relatively easy to tag, and are all properly rendered on the standard map. Their relatively large scale helps to build visibility. To help raise awareness add any retailers with a high public profile. This could include any well-known local specialists, those who advertise heavily, those who regularly feature in the local paper, or take an active part in the local chamber of trade.
  • If local coverage is around 25-50% then a fair number of shops will appear on the standard map, but there will still be plenty that are missing. Quite often some retail categories will have been well covered (pubs often seem to appear first), while others still have to be added. The priority now is to build momentum. The quickest results will be achieved in densely occupied retail zones such as the central shopping area and larger retail parks. Complete coverage is still some way off, and trying to include everything at this stage will slow things down. It is more important to include major outlets, and a representative sample of outlets that are of high public utility, and widespread interest. It seems to me that this should include retailers that cater for a broad section of the population – both their daily needs (convenience stores, post offices, pharmacies, take-aways, cafés, pubs), and more significant purchases (electrical goods, clothing, furniture, etc.). Others will have better insight into those catering for specific groups of customer (visitors, students, etc.), and in some towns these groups will be particularly important.
  • Once local coverage reaches around 50-75% then OSM data is providing some of the most complete retail data that is generally available. The standard map will contain a good number of shops – particularly in the town centre. But anyone familiar with the area will still find it fairly easy to spot shops that are missing. Particularly outside the main shopping areas there will be shops scattered across residential and commercial areas that haven't been added.  Now is the time for contributors to work towards something approaching complete coverage. Missing shops are likely to be the more specialised, smaller and more quirky independent retailers, shops outside the central retail core, in suburban shopping parades, and corner shops in residential areas.
  • Over 75% coverage means that locally OSM is capable of providing some of the most comprehensive retail data that is generally available. Contributors will find it increasingly time-consuming to deal with the remaining gaps, and the more difficult categories are the most likely to have been left aside. Now is the time to include them. This is also the time to verify that existing data is up to date and consistent. There will be opportunities to add value with information that will be of use to different types of data user. This might include features such as wheelchair access, ATMs, non-standard opening times, specialist services, etc. Beyond this, contributors have a choice. They can continue to add unreasonable levels of detail that will never be used. Or, better, they can broaden the scope of their survey into neighbouring towns and villages.

If this is broadly how things work, then I'd suggest the following priorities to help contributors:

  • Firstly, contributors are encouraged by seeing the results of their work. Currently the standard map is the main source of such feedback, but it doesn't show all retail outlets, and it doesn't render all retail characteristics that contributors add. It is unrealistic to expect the standard map to render everything, so I'd like to see a different way of showing contributors the results of their work: one that doesn't depend on adding increasing detail to the standard map.
  • Secondly, for areas where retail data is relatively thin, contributors who have some choice of where to map may benefit from guidance on where and how they can make the biggest impact. Tools that highlight areas where there is a substantial amount of missing retail data could save them time. Suggesting areas where retail data could be of high utility may influence their choice of where to map.
  • Thirdly, retail data generally has to be collected by survey, and there is a lot missing. Tools that help contributors collect data in the field (i.e. on the high street) will make the process more satisfying and help to speed it up.
  • And finally, once retail data in an area is relatively complete, the emphasis will change from improving coverage to improving consistency and adding value. Bulk edits won't help much, but tools that highlight inconsistencies and quirks in retail data will help contributors identify issues and improve quality.