Sunday, 2 August 2015

OSM Retail Survey: Part 12a

Retail outlets in OSM are represented in different ways.

Very few shops have been added as a relation. Around two-thirds are a node, and one-third an area (almost always a closed way, occasionally as a relation between multiple ways).
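A minimal sketch of how this split could be tallied, assuming the retail features are available as a list of Overpass-style JSON elements (dicts with a "type", optional "nodes" and "tags"). The function names and the sample data are illustrative only, not the method used for the survey.

```python
from collections import Counter


def geometry_kind(element):
    """Classify an OSM element as 'node', 'area' or 'other'."""
    kind = element["type"]
    if kind == "node":
        return "node"
    if kind == "way":
        refs = element.get("nodes", [])
        # A closed way starts and ends on the same node.
        return "area" if len(refs) >= 4 and refs[0] == refs[-1] else "other"
    if kind == "relation":
        # Assumption: only multipolygon relations are treated as areas.
        is_multipolygon = element.get("tags", {}).get("type") == "multipolygon"
        return "area" if is_multipolygon else "other"
    return "other"


def geometry_shares(elements):
    """Return the share of each geometry kind among the given elements."""
    counts = Counter(geometry_kind(e) for e in elements)
    total = sum(counts.values())
    return {kind: count / total for kind, count in counts.items()}


# Made-up sample data, for illustration only.
sample = [
    {"type": "node", "id": 1, "tags": {"shop": "bakery"}},
    {"type": "node", "id": 2, "tags": {"amenity": "pharmacy"}},
    {"type": "way", "id": 3, "nodes": [10, 11, 12, 10],
     "tags": {"shop": "supermarket", "building": "yes"}},
]
shares = geometry_shares(sample)
```
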

Larger types of outlet (supermarkets, motor trade outlets, petrol stations, furniture showrooms, etc.) are more likely to be recorded as an area (around 50% in that form), while smaller outlets such as post offices and pharmacies are more likely to be recorded as a node (around 85% in that form).

In effect, about a third of retail features are represented only by their location. In around two-thirds of cases there is more information on the geometry. The most common ways of representing the geometry of a retail outlet are:

  • as an area which represents both a shop and a building
  • as a node or area that represents a shop, and lies within an area that represents a building

These two cases are equally common in the data. In the real world, some shops are always going to be closely associated with a specific building, while others are always going to be perceived as a facility that happens to be located within a particular building. So it is reasonable to expect data users to find both of these approaches acceptable.

Tagging indicates that the large majority (87%) of areas that have been marked as a retail outlet are equivalent to a building footprint - i.e. they hold both a "shop" tag and a "building" tag. Among the remainder, some areas are tagged to represent landuse, and a few represent road surface, but most others carry no indication of what physical feature the area represents (i.e. no relevant tag added alongside “shop” or some type of “amenity”). The proportion of contributors who have used an area to represent landuse varies by the type of retail. In the case of petrol stations, for example, it is 3%, and in the case of garden centres it is almost 25%.

When a retail outlet is drawn as a closed area, with a “shop” tag, but no other indication of what that area represents, it is most likely that the area is intended to represent a building. A random check suggests that this is what contributors normally intended, but renderers cannot be certain. The most likely outcome for retail features mapped as an area without a “building” or “landuse” tag is therefore that they will not be rendered at all.
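The tag-based distinction described above can be sketched as a small classifier. This is an illustration rather than the procedure used for the survey: the order of precedence between tags and the use of a "highway" tag to detect road surface are assumptions, and the function name is hypothetical.

```python
def area_meaning(tags):
    """Guess what physical feature a retail area represents, from its tags."""
    if "building" in tags:
        return "building"
    if "landuse" in tags:
        return "landuse"
    # Assumption: a highway tag on the area indicates road surface.
    if "highway" in tags:
        return "road surface"
    # Only "shop" or "amenity" is present: the data user has to guess.
    return "unspecified"


examples = [
    {"shop": "supermarket", "building": "retail"},
    {"shop": "garden_centre", "landuse": "retail"},
    {"shop": "convenience"},
]
meanings = [area_meaning(tags) for tags in examples]
```

An area that falls into the "unspecified" group is the case a renderer is likely to skip.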

Where landuse is specified on a retail area it is normally “retail” (76% of cases). Of the other generic urban landuse terms, “commercial” represents 16%. The remainder are mostly more specific terms such as “landuse=plant_nursery”.

Of the retail outlets that do not have their own area defined, and are represented only by a node, just under half are contained within a (separately defined) building. In most cases (75%) the type of building is not defined further (“building=yes”), and in 10% the building is described as a retail building.
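One way to test whether a shop node lies within a separately mapped building is a ray-casting point-in-polygon check. The sketch below assumes each building is a dict holding an "id", its "tags" and an outer "ring" of (lon, lat) pairs; that structure and the function names are hypothetical, and coordinates are treated as planar, which is a reasonable approximation for building-sized footprints.

```python
def point_in_ring(point, ring):
    """Ray casting: count how many edges a ray going east from the point crosses."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def containing_building(node, buildings):
    """Return the first building whose outline contains the node, or None."""
    point = (node["lon"], node["lat"])
    for building in buildings:
        if point_in_ring(point, building["ring"]):
            return building
    return None


# Made-up sample data, for illustration only.
buildings = [{
    "id": 100,
    "tags": {"building": "retail"},
    "ring": [(-3.19, 55.95), (-3.18, 55.95), (-3.18, 55.96), (-3.19, 55.96)],
}]
shop_inside = {"lon": -3.185, "lat": 55.955, "tags": {"shop": "clothes"}}
shop_outside = {"lon": -3.17, "lat": 55.955, "tags": {"shop": "bakery"}}

match = containing_building(shop_inside, buildings)
no_match = containing_building(shop_outside, buildings)
```

The "building" value of the matched outline then gives the type of building the shop sits in.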

There are a couple of thousand retail buildings containing at least one shop. The biggest clusters of retail nodes within such buildings represent individual outlets within large shopping centres (e.g. the St James Centre in Edinburgh). However, these only account for a small proportion of the total. Most buildings that contain shops only show a single shop node, and it is common for a single building to contain only a few retail outlets.

In around 20% of cases, the retail node within a building is the only retail feature within that building. It might be assumed in this case that a single shop occupies the whole building, but renderers cannot be certain whether there are other, unmapped shops within the same building. Their only safe option is to render both building and shop, and place the node in the position marked.

Although we can safely assume that almost all retail outlets should exist within a building, something over a third of all retail outlets in the database have no representation of a building associated with them. This is an indication of areas where building data is likely to be incomplete, but the information is of little value otherwise.

Some large retail outlets are made up of numerous different components: petrol stations and garden centres are common examples. For petrol stations there is relatively clear guidance on how the various components should be mapped. Guidance is less comprehensive when it comes to garden centres.

I've found 3,772 examples of “amenity=fuel” in the UK data, of which 70% are mapped as nodes, and 30% are areas. To map a service station as a node is simply to indicate the location. To map it as an area suggests that the contributor is at least aiming to provide more detailed visual information for rendering. Adding further detail on the products and services available, and detailed mapping of routes through the forecourt suggest that the contributor is aiming to support more sophisticated applications for more demanding users. To function properly, viable applications that can handle such complexity will require some consistency in the way that petrol stations are described.

My interpretation of the mapping guidance for petrol stations is that:

  • the building in the forecourt should be tagged as “amenity=fuel”: this guidance is generally followed, and is the approach in around 75% of cases where the petrol station has been mapped as an area. In around 3% of cases the area marked as “amenity=fuel” is also tagged to indicate retail landuse, which suggests it covers the whole site. In around 2% of cases it is also tagged as a shop, which suggests that it is intended to represent a building. However, renderers and applications cannot be certain that either is what the contributor intended. In around 20% of cases there is no indication from the tagging what the “amenity=fuel” area represents. Inspection suggests that in most cases it is the paved area around the pumps
  • the area around the pumps should be mapped as an area of highway: in practice this approach is only used in around 2% of cases, although there are a few more cases where the forecourt area is tagged as “amenity=parking”
  • use the shop tag alongside “amenity=fuel” to indicate other retail formats within the petrol station, such as a kiosk, or convenience store: this guidance is not generally followed – a shop tag is only used in 10% of petrol stations marked as an area, and only 5% of those marked as a node. Some of these petrol stations may truly have no other retail facilities, of course, but experience suggests that there are not many fuel outlets these days that only offer fuel
  • the routes through the forecourt should be mapped as one-way service roads: (this I've not measured)
  • any canopy should be mapped as “building=roof”: (this I've not measured)
  • add a node for toilets as an amenity: because fuel is treated as an amenity in the tagging, there is little problem in tagging coexistence of a petrol station with retail formats that are tagged as a shop, but there are potential issues around how best to tag coexistence with other common amenities. This guidance helps with adding toilets, but there is no guidance yet for other common amenities, such as a café
  • there is no guidance yet on how to map the wider extent of the site – which may include customer parking, children's play areas, etc.
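As a rough illustration of how the breakdown in the first and third points above could be reproduced, the sketch below sorts an “amenity=fuel” element into the same groups. It assumes Overpass-style JSON elements and treats every non-node element as an area, which is a simplification; the function name and category labels are hypothetical.

```python
def audit_fuel_element(element):
    """Summarise how a petrol station has been mapped, based on its tags."""
    tags = element.get("tags", {})
    if tags.get("amenity") != "fuel":
        raise ValueError("element is not tagged amenity=fuel")

    # Assumption: anything that is not a node is an area.
    is_area = element["type"] != "node"
    if not is_area:
        represents = "location only"
    elif "building" in tags:
        represents = "building"
    elif tags.get("landuse") == "retail":
        represents = "whole site"
    elif "shop" in tags:
        represents = "probably a building"
    else:
        represents = "unspecified"

    return {
        "is_area": is_area,
        "represents": represents,
        "has_shop_tag": "shop" in tags,
    }


# Made-up sample data, for illustration only.
forecourt_building = {"type": "way",
                      "tags": {"amenity": "fuel", "building": "yes",
                               "shop": "convenience"}}
bare_area = {"type": "way", "tags": {"amenity": "fuel"}}
bare_node = {"type": "node", "tags": {"amenity": "fuel"}}

results = [audit_fuel_element(e)
           for e in (forecourt_building, bare_area, bare_node)]
```
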

The result is that rendering for some petrol stations presents a reasonable interpretation of the different components on the ground, but this is too unusual, and the underlying data is too inconsistent to be of further use.

Representing the perimeter of a complex retail outlet would provide geometry information that would be particularly useful for data users. This would offer a mechanism for aggregating different components within the same facility that have been mapped separately (such as identifying a petrol station with toilets and a café). However, for contributors there is confusion over how best to do this. The community is probably nearest to consensus on "landuse=retail", or "landuse=something more specific". However, this approach isn't widely adopted. In any case, it is virtually useless for anything more than rendering, because it loads the "landuse" tag with more than one meaning. Data users are not able to distinguish between cases where the "landuse" tag defines the outline of a specific outlet, and cases where it encompasses a wider area and more retail outlets.

In summary, the largest gap in the information on the geometry of retail premises is a lack of any information on the footprint for around two-thirds of retail features. In around 4% of cases there is some basic information on footprint, but data users will face considerable difficulty in interpreting what it means.

Where there is information on retail geometry it can help to identify gaps in data, particularly for building footprints.

The database also contains information on some more complex retail footprints, but this is not presented in a consistent way, and the various components are not sufficiently well-integrated for applications to make use of the information (other than basic rendering). There is some guidance for contributors on mapping more complex retail features, but this is not widely followed, and I have found no feedback mechanisms to encourage more consistent tagging of complex cases.


Marc Gemis said...

I often map a shop as a point. This is usually the case when the shop occupies some part of the ground floor and the rest of the building is used for e.g. apartments. In those cases it is pretty difficult to get more detailed information on the actual size of the shop. So for me it is not surprising that this information is lacking in many cases.

Or did I misunderstand your text ?

gom1 said...

No, you didn't misunderstand. I sometimes use a node, and sometimes an area. The data is a mix of both, in (roughly) equal numbers.