Thursday, 26 January 2012

Shops that aren't

Sometimes OSM contributors mark a high street business as an "office" rather than a "shop". There is logic in this. Accountants, employment agencies, travel agents, and solicitors are delivering a service rather than offering goods for sale, and they are normally sat at desks rather than behind a counter or a till. But the boundaries between the two can become blurred.

For a number of professions, both keys ("shop" and "office") are fairly widely used in the UK.

The chart shows the mix of "shop" and "office" for some of the more common professions on the high street. Each of the labels shows common variants (after the most widespread term).


There seems to be some consensus among contributors that most accountants, employment agents and travel agents work in offices, while most estate agents work in shops. Opinion is more evenly divided when it comes to solicitors and insurance brokers.

The mix of singular and plural forms (solicitor / solicitors) is a bit of a nuisance, but apart from that I've only got a couple of quibbles. To me, saying that a funeral director operates from a shop feels even more odd than saying that they operate from an office (I seem to be in a minority here). And in my understanding "financial services" covers a very broad scope. Describing a financial adviser as offering financial services doesn't convey their role very precisely. Unfortunately that's how many of them characterise themselves. So dissent here is likely to prove fruitless.


Personally, I feel pretty comfortable with the other different combinations. For members of the legal profession who deal directly with the public I like to see the term "solicitor" rather than "lawyer" as it is in line with normal UK terminology. And using a mix of "shop" and "office" keys seems both reasonable and manageable.

Sunday, 22 January 2012

New GPS

I first got a GPS for the bike back in 2008. Astonishingly that was nearly four years ago. It was a Garmin Edge, and I used it to collect traces for Open Street Map, time my rides, and for occasional route finding. A couple of posts here about putting the OSM Cycle Map on a Garmin Edge have generated as much traffic as almost anything else I've written, and I had a bit of an exchange with Wired Magazine when they illustrated an article on OSM cycle maps with one of my pictures, despite the license conditions.

So we have had some interesting journeys together, but my old Garmin Edge packed up towards the end of last year. I've been using various apps on a Smartphone since, but now I've been treated to a proper replacement - a new Garmin Edge 800.

Things have clearly moved on in the last four years, and this new GPS is lighter, easier to read, with a better mount. The touch screen is a big advantage, and unlike the smartphone, I can use that while wearing gloves.

It didn't come with anything other than a minimal base map, and I obviously wanted to take the OSM cycle map with me. I had generated a copy of this a year ago for the old Garmin. To get started quickly I transferred the memory card across, and found that it still worked up to a point, but minor roads were invisible.

I couldn't find an up-to-date, ready-made version of the OSM cycle map for a Garmin, so again I set out to build my own. The tools in this area have also moved on since I last did this. It has taken quite a bit of fiddling around to make a new version of the cycle map, and so far I have only partly succeeded. I'm still struggling with minor roads, and the sea is missing (which matters more round here than it did in Berkshire).

If I forget about special styling for the cycle stuff, and just opt for the standard map format, then things are a bit easier. I can generate a nice basic map with a considerable amount of detail, and lots of POIs. As best I can tell, route finding has improved, though it's difficult to know whether to credit the map or the device for that. With the standard format I get to see the sea, as well as minor roads.



That will do me for the time being, but this is an itch that needs to be scratched. I can see that I'm in for a bit of fiddling with mkgmap styles over the next few days (unless anyone has a better suggestion).

Friday, 20 January 2012

Impersonating a police officer



Today's destination on the bike was this odd structure, which is called Ratcheugh Observatory. It isn't an observatory in the same sense as the Royal Observatory at Greenwich, or Jodrell Bank.

It was commissioned by the 1st Duke of Northumberland, and designed by Robert Adam towards the end of the 18th century. The room on pillars allowed the Duke to observe his lands (or at least part of them). It was one of a number of monuments that he built in memory of his wife (who died in 1776) at their favourite picnic spots. That seems rather sweet, but their favourite picnic spots mostly seem to have been in places where they could admire the extent of the properties they owned.

This was also my first chance to try out my new GPS, but more of this later.

It was a nice ride, and good to get out, but it was a bit of a grey day. So I was wearing a bright yellow, high visibility jacket. On my way home, I passed a police station, and a toddler in a buggy yelled out "look mummy, there's my daddy". It was all a bit like Jenny Agutter in the Railway Children. I don't think I've ever been mistaken for a policeman before. Thankfully, the mother had better eyesight.

Tagging of shops

One of the challenges in adding various different types of retail outlet to the map is that retailers have always had a habit of finding gaps in the market where they can expand their range, in order to reach more customers. So whatever boxes we try to put shops into, somebody is trying to break out of them. We happily buy sweets from a newsagent, and kitchenware from a hardware shop.

It is difficult to measure how contributors tag individual shops, because variations in the data reflect variations in the real world. But because big chains tend to operate a lot of similar shops, we can look at the different ways these are tagged to get some idea of how contributors handle difficult categories.

To keep this in proportion, there are several large chains where there isn't much of an issue. In these cases there is a high level of consensus, and almost all branches in a chain carry similar tags. This suggests that most contributors see the same type of shop. Examples where more than 80% of branches carry the same value for the "shop" tag include Tesco, Sainsbury, Lidl, Morrisons, Asda, Aldi, Waitrose, Iceland, Somerfield and Tesco Metro = supermarkets; Londis, Premier, and One Stop = convenience stores; B and Q and Homebase = doityourself; Greggs = bakery; and Next = clothes.
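The kind of measurement described above can be sketched roughly as follows. This is only an illustration: the function name and the sample branches are made up, and real input would come from an OSM extract rather than a hand-typed list. For each chain it reports the most common "shop" value and the share of branches that carry it.

```python
from collections import Counter, defaultdict


def consensus_by_chain(branches):
    """Return {chain: (most_common_value, share)} from (chain, shop_value) pairs."""
    values = defaultdict(Counter)
    for chain, shop_value in branches:
        values[chain][shop_value] += 1

    result = {}
    for chain, counts in values.items():
        top_value, top_count = counts.most_common(1)[0]
        result[chain] = (top_value, top_count / sum(counts.values()))
    return result


# Hypothetical sample, not real OSM data.
sample = [
    ("Greggs", "bakery"), ("Greggs", "bakery"), ("Greggs", "bakery"),
    ("Greggs", "bakery"), ("Greggs", "fast_food"),
    ("Spar", "convenience"), ("Spar", "convenience"), ("Spar", "supermarket"),
]

for chain, (value, share) in sorted(consensus_by_chain(sample).items()):
    print(f"{chain}: shop={value} on {share:.0%} of branches")
```

A chain would count as "high consensus" in the sense used here if its share came out above 0.8.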

Other chains seem to be operating across a genuine boundary between two similar categories. As a result a limited number of different tags have been used across the whole of the chain. Among the big chains this mainly applies to certain brands of supermarkets / convenience stores, typically with 50-75% of branches in one category, and the rest in another: examples include Co-op, Spar, Tesco Express, Costcutter, Cooperative Food, Sainsburys Local and Budgens.

Other chains have business models that are proving more difficult for contributors to categorise. This often seems to depend on how much emphasis the retailer places on offering a wide range of goods, or specialising in one type (or at least on how different contributors perceive this balance). In these cases different contributors normally choose between just a couple of options. For example, both shop=department_store and shop=clothes are commonly applied across chains such as Matalan, Debenhams, TK Maxx.

There are a few cases where there is a wide variety of different categories within the same chain, and little consensus between contributors. These seem to be chains that do not just offer a wide range of goods, but also compete with others that are much more specialised. In some of them there is a huge mixture of tags. Contributors have used ten or more different values for "shop" to describe each of the following examples: Halfords (with 34 different values for "shop"; "bicycle" as the most common); Argos (25 different values for shop); Marks and Spencer; W H Smith; and Wilkinson.

So the challenge facing contributors is often that a shop offers a range of goods (or a shop format) that cannot easily be pinned down. But there are also some areas where there seems to be consensus among contributors on the type of shop they are dealing with, but not necessarily on the best tag to use. The most common examples use different terminology which means much the same thing - such as shop=betting or shop=bookmaker (for William Hill, Ladbrokes, etc). Either "shop=alcohol" or "shop=beverage" is commonly used for chains such as Bargain Booze, Oddbins, and Thresher.

Supermarkets and convenience stores are an example where one term is better recognised for smaller shops, and a different term among larger chains, with both covering similar scope. There are similar examples, including hardware shops and do-it-yourself stores, with a certain amount of overlap between the two.

Finally, there is the issue of spelling mistakes: abbreviations, plural / possessive forms, and alternative approaches to capitalisation and spacing ("newsagent", "news agent", "News Agent", "news_agent", or even "newspaper", or just "news"). It is not difficult to find examples like this, but in practice they don't seem to account for a significant proportion of the total. It is mainly a problem where the normal term for the shop is long, or uses more than a single word, but even there it's not a particularly serious problem. Out of the list of different terms for shops selling newspapers, "newsagent" accounts for 98% of occurrences in the database. In general, around 99% of shops with spelling variations are tagged with the most common value, while the rarer variants account for only 1%. As far as I can see the widest variations in the UK are among DIY shops (variously tagged "doityourself", "diy", "DIY", etc.) and estate agents (tagged "estate agency", "Estate Agent", "estate agents", "estate_agency", etc.). Even in these, the most common option has been applied across more than 90% of the sample.
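For what it's worth, grouping the simplest of these variants is easy to sketch. The snippet below is an illustration under my own assumptions (the function names and the sample values are invented): it collapses case, spaces, hyphens and underscores into a comparison key, then reports the most common raw spelling in each group and its share.

```python
import re
from collections import Counter, defaultdict


def variant_key(value):
    """Collapse case, spaces, hyphens and underscores so simple variants compare equal."""
    return re.sub(r"[\s_\-]+", "", value.strip().lower())


def dominant_share(values):
    """Group raw tag values by variant_key; return {key: (top raw value, share)}."""
    groups = defaultdict(Counter)
    for value in values:
        groups[variant_key(value)][value] += 1

    result = {}
    for key, counts in groups.items():
        top_value, top_count = counts.most_common(1)[0]
        result[key] = (top_value, top_count / sum(counts.values()))
    return result


# Hypothetical sample, not real OSM data.
tags = ["newsagent"] * 8 + ["news agent", "News Agent", "newspaper"]
print(dominant_share(tags))
```

This deliberately does nothing clever: plurals, possessives and genuinely different words ("newspaper", "estate agency" against "estate agent") stay in separate groups, and would need a hand-built lookup or a fuzzier match.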

The bottom line is that while it is easy to envisage a tool to fix spelling variants, this isn't the real challenge. There are a few cases where more consensus on terminology would make things a bit tidier. But the real challenge is to find better ways of handling nuances between the different business formats that we find in the real world - such as where we use different terminology for different sizes of shop, or for different levels of specialisation. It's hard to imagine a tagging scheme that is going to directly solve the problem of where to buy a spanner, a cycling magazine, or a bottle of balsamic vinegar. Some problems are probably best left to a fuzzy search process. In the meantime the best approach for contributors seems to be to stick to the common values when adding a shop, and to pick the value that best describes what we see on the ground.

Nothing new there.

Thursday, 19 January 2012

Spring cleaning

I've just been weeding out dead links in the blog roll. I should have done it a couple of months ago, because then I would have discovered earlier that Rob Ainsley's excellent Real Cycling burst back into life last November.

Wednesday, 18 January 2012

Schools

In the same spirit as my earlier posts on OSM coverage of shops and healthcare facilities, I've now had a crack at estimating the number of schools that can be found on the map. I've used Department for Education figures to get an estimate of the number of schools we should find in each local authority area in England. These are pretty comprehensive. They include independent schools as well as state funded primary and secondary schools. They also include some nursery schools (basically those receiving state funding), special schools (for children with specific educational needs) and pupil referral units (for children excluded from other schools).

By comparing these figures with the number of features in OSM that are tagged as a school I get an overall figure for coverage of schools in England of around 75% - which looks pretty good. Local authorities that have particularly thorough coverage include Cambridgeshire, York, Sheffield, Hartlepool, and Bath. There's a patch around Barnsley, Doncaster, and Rotherham where coverage is low.
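The comparison itself is simple arithmetic, and can be sketched as below. The area names and counts are invented for illustration, and the function name is my own; the real inputs would be the official count per local authority and a count of OSM features tagged as schools in the same area.

```python
def coverage(expected, mapped):
    """Return {area: mapped / expected} for areas with a non-zero expected count."""
    return {
        area: mapped.get(area, 0) / count
        for area, count in expected.items()
        if count > 0
    }


# Hypothetical counts, not real figures.
expected = {"Area A": 200, "Area B": 120, "Area C": 80}
mapped = {"Area A": 150, "Area B": 30, "Area C": 88}

ratios = coverage(expected, mapped)
overall = sum(mapped.get(area, 0) for area in expected) / sum(expected.values())

# Ratios above 1 cannot be right, so they are worth a second look.
suspicious = sorted(area for area, ratio in ratios.items() if ratio > 1.0)

print(f"overall coverage: {overall:.0%}")
print("check for over-counting:", suspicious)
```

Note that the overall figure is total mapped over total expected, rather than an average of the per-area ratios, so large authorities carry more weight.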

Some of these figures look too good to be true. I suspect that my collection of OSM features is too crude, and that this results in some double counting - perhaps as a result of the way campuses have been tagged. That's going to take a while for me to investigate, but meanwhile, this is what the first iteration looks like.






Tuesday, 17 January 2012

Primary healthcare

Our ability to access public services has been used as one indicator of social inclusion and quality of life.

To help with transport planning and other aspects of policy, governments have measured the ease with which we can access education, and healthcare, using public transport, walking or cycling. They also measure our ability to reach food shops, and employment. It occurred to me that it would be interesting to know how well OSM content covers similar ground.

Since policy makers see access to such facilities as being important, perhaps there are creative ways in which OSM data could be used to make it easier for people to reach key public services?

At the moment I can’t see how to replicate the official measures exactly. I can't see an easy way of measuring completeness of food shops, but I can get somewhere near by trying to measure how well all shops are covered. In healthcare I can compare lists of NHS GPs, Pharmacies, Opticians and Dentists in England with those I can find in the OSM database. I'm not sure where to find similar data for Scotland and Wales. At some point it should be possible to produce a similar measure for schools, but I haven't got round to it yet.



Breaking the lists down by local authority, I reckon that Halton, Wokingham, Cambridgeshire, Islington, and Derby score particularly well on mapping local healthcare, with around half of their facilities mapped. Luton, Blackburn, and Doncaster have the most healthcare facilities to add. In those authorities I can find less than 2% of the number of facilities I expected to see in the database.

Looking at both healthcare and retail together, most places show similar levels of coverage for both. Derby and Islington seem to have particularly good coverage of retail, alongside some of the highest coverage of healthcare facilities. Other areas that rank highly on both counts include Bedford, Birmingham, Cambridgeshire, Camden, Halton, Southwark, and Wokingham.

It's worth noting that I'm using government data as a way of measuring OSM data, not the other way round, and this is not to suggest that OSM should be deliberately working towards a comprehensive directory of this stuff. It is interesting, though, to speculate what kind of new applications might start to be viable as coverage develops.