The basic OSM data scheme is the XML exchange format. It’s simple, plain and includes all data, but it’s not optimized for any purpose.
To use the data in applications, several database schemes (and file formats) are common.
api db (osmosis)
used by the rails core of OSM. Optimized for editing capabilities. possible to include history
pgsql simple (osmosis)
pgsql simple history (osmosis + history plugin)
The osmosis-history plugin provides additional fields for the simple schema to add history capabilities.
pg snapshot (osmosis)
used by Nominatim and optimized for geocoding. It’s highly optimized and perhaps not very well capable for other purposes like using the original data as they are. One table (place or placex) connects the gazetteer database to the original osm data.
Decision and Rationale
The gazetteer format is useful and needed for the geocoding process, but in the first step we can use the MapQuest Nominatim instance and the webservice availlable there to avoid more load then necessary while testing.
Using the Postgres Schema concept, where a schema is kind of a sub database, I will use several database users and schemas for the portal. The main application uses the pg simple schema with history additions (but without deep history import currently).
Rationales for each single schema used can be found below.
- main schema for OSM data. It uses the pgsql simple schema provided by osmosis and the additions provided by the osmosis history plugin. The deep history of OSM is not required to be imported. The history features are necessary to allow references to old versions of ways. A prone utility could delete old osm element versions that are not referenced by the application any more.
- Routing should be done with the pgrouting library. This library is pure postgresql code, and does not require much preprocessing. The routing module therefore will get an own, dedicated schema, where the routing tables are stored and routing functions are defined.
- The import process of the main database will be time and memory consuming. Additionally it’s not clear, if it will be possible to import into the live database directly. For now, the import user uses the basic public schema. Later it will probably be better to use two separate import schemes, where one is in the import process, the other one is used by the application.
- search and geocoding
- At a later stage, nominatim should be used for geocoding and search purposes. It has a highly specialized database schema not useful for any other application type. Additionally it requires expensive precalculation efforts. The only connection between nominatim and the osm database are the OSM ids, that can be used across schemas.