Tuesday, 21 July 2015

OSM Retail Survey: Part-2

With 528,000 retail premises in England for a population of 53 million, there is roughly one retail property for every 100 people across the whole country. It would be handy if we could use this ratio to examine coverage at a more detailed level than local authority.

The Office for National Statistics provides boundary data and population figures for Lower Layer Super Output Areas (LSOA) and Middle Layer Super Output Areas (MSOA). An LSOA has a population of 1,000 – 3,000 and an MSOA has a population of 5,000 – 15,000.

So we would expect to find 10-30 shops in an LSOA and 50-150 shops in an MSOA. However, when we use these ratios to measure actual coverage in OSM we find wide divergence. It is particularly noticeable that rural areas seem to be exceptionally well mapped, while suburban / residential areas appear under-mapped.
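As a rough sketch of the arithmetic behind those expectations, the snippet below scales the national totals quoted above down to an area's population. The function name `expected_shops` is my own label for illustration. Note that the exact ratio is slightly under one shop per 100 people, so the results come out a touch below the rounded figures.

```python
# National figures for England, as quoted in this post.
RETAIL_PREMISES = 528_000
POPULATION = 53_000_000


def expected_shops(population: int) -> float:
    """Retail premises we would expect if shops simply followed population."""
    return population * RETAIL_PREMISES / POPULATION


# Population ranges for the two kinds of output area.
for label, low, high in [("LSOA", 1_000, 3_000), ("MSOA", 5_000, 15_000)]:
    print(f"{label}: {expected_shops(low):.1f} to {expected_shops(high):.1f} shops")
```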



The underlying problem is that the number of retail premises is not proportional to the population at this level of detail. Suburban areas are well served by city centres, so have fewer shops than we expect. Rural areas with a dispersed population tend to have relatively large numbers of small shops - i.e. more than we expect. This diversity is demonstrated by examining how OSM coverage compares to the national average for different types of area. Sparse areas look well-mapped, even when they aren't. Urban areas don't look well mapped even when they are.

For what it's worth, at this level of detail, the correlation between the number of shops in OSM and the number of residents employed in the retail sector is even worse than the correlation between the number of shops and total population. So retail employment is likely to prove even less useful as an indicator of how many shops to expect, and I haven't pursued this further.


ONS Rural / Urban classification                    OSM retail units as % of expectation (based on national average)
Rural town and fringe                               30%
Rural town and fringe in a sparse setting           83%
Rural village and dispersed                         33%
Rural village and dispersed in a sparse setting     55%
Urban city and town                                 31%
Urban city and town in a sparse setting             61%
Urban major conurbation                             32%
Urban minor conurbation                             51%

As a result of these variations this approach is of limited use to us.

There are some variants that might be more useful. This example from North Tyneside highlights several Middle Layer Super Output Areas where there is no post office recorded in OSM. Some of these might genuinely have no post office, but it's a fair bet that others do contain one, alongside other retail outlets that haven't been mapped yet.


So there may be some useful ways of using data from output areas based on population, but it turns out that it is probably more useful to examine the coverage of retail outlets across built-up areas.

Of the retail properties that I have found in OSM, 85% fall within a built-up area, so it makes sense to look for numbers of retail properties within settlements. Again the Office for National Statistics provides us with handy geography and population data to work with. Here I'm using their data on population and boundaries of built-up areas to compare the volume of OSM retail data in larger towns and smaller cities. This gives a less reliable but more granular view than we can extract from the VOA statistics on retail properties, which are available at local authority level.
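A figure like that 85% comes down to testing whether each shop's location falls inside a built-up area boundary. The sketch below shows one generic way to do such a test (a ray-casting point-in-polygon check) in plain Python. It is an illustration only, not a description of the tooling used for this survey, and the square boundary and shop coordinates are invented for the example. Real boundaries have holes and multiple parts, which a GIS library or spatial database would handle properly.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside a simple polygon.

    `polygon` is a list of (x, y) vertices, without holes.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical data: one square "built-up area" and four shop locations.
built_up_area = [(0, 0), (10, 0), (10, 10), (0, 10)]
shops = [(5, 5), (15, 5), (2, 8), (-1, -1)]

in_area = sum(point_in_polygon(x, y, built_up_area) for x, y in shops)
share = in_area / len(shops)
print(f"{share:.0%} of shops fall within the built-up area")
```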


This approach seems to work particularly well for mid-sized towns, and it can be adapted to make it useful for smaller towns and larger villages. We would expect most larger towns to be the main retail centre for the local population, so they should have roughly the average number of retail premises in proportion to the population. In practice, we find that well-mapped towns come close to this ratio. Where a town falls well short of the expected ratio it suggests that there is scope for improvement, and visual examination tends to confirm this impression.
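To make the comparison concrete, here is a minimal sketch that turns a town's population and its OSM shop count into a coverage ratio against the national average, then lists the towns that fall short. The town names and counts are placeholders, not survey results, and the 0.75 cut-off is an arbitrary assumption chosen for the example.

```python
# National ratio from the figures quoted earlier in this post.
SHOPS_PER_PERSON = 528_000 / 53_000_000


def coverage(osm_shops: int, population: int) -> float:
    """OSM shop count as a fraction of the count the national ratio predicts."""
    return osm_shops / (population * SHOPS_PER_PERSON)


def under_mapped(towns: dict, threshold: float = 0.75) -> list:
    """Return (name, coverage) pairs below the threshold, worst first."""
    rows = [(name, coverage(shops, pop)) for name, (pop, shops) in towns.items()]
    return sorted((r for r in rows if r[1] < threshold), key=lambda r: r[1])


# Hypothetical towns: name -> (population, shops found in OSM).
towns = {
    "Town A": (100_000, 950),
    "Town B": (120_000, 600),
    "Town C": (90_000, 300),
}

for name, ratio in under_mapped(towns):
    print(f"{name}: {ratio:.0%} of expected retail premises")
```

A low ratio is only a prompt to go and look. As noted above, it is visual examination that confirms whether a town really is under-mapped.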

We can see that Exeter and Chesterfield have roughly half the number of retail premises in OSM that we would expect to find on the ground. In Hull and Lincoln perhaps two-thirds of the retail premises are missing from the data.

Searches for retail locations will have higher utility to some people in some places than in others. If an application provider wants to use OSM retail data to support views of individual towns, then they might choose to begin with towns and cities of a manageable size, with large numbers of visitors, relatively high turnover in population, good technology infrastructure, etc.

University towns, for example, can be expected to have a high turnover of technologically adept students.



Cathedral cities are likely to attract large numbers of visitors.



And so are seaside towns.



Some towns fit into more than one of these categories, and quite a few of these look well-mapped (Bangor, Cambridge, Canterbury, Durham, Ely, Norwich, Oxford, Salisbury, Scarborough). Others probably wouldn't take a huge effort to bring up to a similar level of coverage (Exeter, Worcester, York).

Measuring the ratio between retail premises and population begins to break down for smaller settlements. Retail is not evenly distributed, and we expect things to average out across a larger settlement, but not across a smaller settlement. People in many smaller towns expect to travel elsewhere for some of their shopping. Some smaller settlements are predominantly residential. Others serve as retail centres for a wider area, so these have more than their fair share of shops and services. Similar variations apply in towns and villages that are popular visitor destinations. Nevertheless, it's unlikely that a settlement of 1,000 people would have only a couple of shops. It's not impossible that a town of more than 5,000 people will have no post office, or no pharmacy, but it seems unlikely. We should be able to use these assumptions (and others) to identify smaller settlements where retail is suspiciously under-represented in OSM.
  • There are 26 built-up areas with a population of more than 10,000, and 83 with a population of more than 5,000, where no post office is recorded in OSM. The largest are Kirkby, Haverhill, Witham, and Formby.
  • There are 80 built-up areas with a population of more than 10,000, and more than 200 with a population of more than 5,000, where no pharmacy is recorded in OSM. The largest are Braintree and Grantham.
  • There are 30 built-up areas with a population of more than 10,000, and 120 with a population of more than 5,000, where no food shop seems to be recorded in OSM.
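Counts like the ones in the list above boil down to a simple filter: keep the built-up areas over a population threshold where the count for a given type of shop is zero. The sketch below shows the idea. The `Settlement` class, the category labels and every settlement in the sample data are made up for illustration. In practice the counts would come from matching OSM shops against the ONS built-up area boundaries.

```python
from dataclasses import dataclass, field


@dataclass
class Settlement:
    name: str
    population: int
    # Number of OSM objects per category, e.g. {"post_office": 1}.
    shop_counts: dict = field(default_factory=dict)


def missing(settlements, category, min_population):
    """Settlements above the population threshold with none of `category`,
    largest first."""
    hits = [
        s for s in settlements
        if s.population > min_population and s.shop_counts.get(category, 0) == 0
    ]
    return sorted(hits, key=lambda s: s.population, reverse=True)


# Hypothetical sample data.
settlements = [
    Settlement("Alphaton", 12_500, {"pharmacy": 2}),
    Settlement("Betaford", 7_200, {"post_office": 1, "pharmacy": 1}),
    Settlement("Gammaby", 6_100, {}),
    Settlement("Deltaham", 900, {}),
]

for s in missing(settlements, "post_office", 5_000):
    print(f"{s.name} ({s.population:,}): no post office in OSM")
```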
I've used a mixture of these assumptions to identify smaller settlements close to home where the number of shops in OSM is implausibly low. It pointed me to a couple of places locally that needed attention, and I have started to add shops. However, my home area is not a good example to illustrate the principle. Here, the under-mapped towns turn out to be quite widely dispersed, and don't show up well on a map. I'm less familiar with the locations in the more densely populated rural area around Durham, but it is a better example to illustrate the principle. The city itself is exceptionally well-mapped, but some of the surrounding small towns and villages look as though they might benefit from attention.



This approach of measuring the content across built-up areas seems more promising than using Output Areas, but it still has limitations. Individual settlements lying within a more rural landscape can easily be highlighted in this way, but it is more difficult to identify areas of incomplete retail mapping within a large conurbation.

To be continued....
