MiroJanosik's Diary

Whitebikes update of OSM data from CSV export

Posted by MiroJanosik on 1 October 2018 in English. Last updated on 3 October 2018.

So I updated the OSM data from the CSV export that we get from the WhiteBikes database. I had to deal with a few problems: the export is not exactly CSV and the data order differs, but at least the data are consistent. I don't think this can be automated, and changes should be reviewed; I did it by hand because I wanted to see which data had been modified. There are around 50 bike stands in the city.

So, let's see what I had to do:

Prepare a list of stands from the database, in a format similar to the previously retrieved data

  • Download the new data from WhiteBikes at * censored *. If the website is saved as HTML and not as plain text, try view source, select all, copy, and paste into a file.
  • It is in a CSV-like (spreadsheet) format that can be loaded into an office suite if needed.
  • But the format is not correct CSV, so it has to be adapted beforehand.
  • Replace all commas with dots. Then replace all semicolons with commas.
  • After this change it is CSV (comma-separated values) and can be opened in Excel.
  • There may still be some issues, though: semicolons are sometimes also used in the text, not only as separators.
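The two replacements can also be scripted. Below is a minimal Python sketch, not the way I did it: the function name convert_export is made up, and it assumes the export has 8 fields per line (the column list given in this entry). It flags lines whose field count is off, which is how a stray semicolon inside the text shows up.

```python
import csv
import io

EXPECTED_COLUMNS = 8  # assumption: standId ... latitude, as listed in this entry


def convert_export(raw_text, expected_columns=EXPECTED_COLUMNS):
    """Turn the semicolon-separated export into real CSV text.

    Returns (csv_text, suspicious_line_numbers).
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    suspicious = []
    for line_no, line in enumerate(raw_text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        # Same two steps as by hand: commas become dots,
        # then semicolons are treated as the field separator.
        fields = line.replace(",", ".").split(";")
        if len(fields) != expected_columns:
            # Probably a semicolon inside a text field: review by hand.
            suspicious.append(line_no)
        writer.writerow(fields)
    return out.getvalue(), suspicious
```

The suspicious line numbers still need a manual look; the script only tells you where.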

  • Import it into an office spreadsheet editor (Excel, Calc) and align it by station number:
  • add a new first column and fill it with the numbers 1-100
  • select the other columns and sort them by number (the database export may have reordered them)
  • align them with the numbers in the first column, leaving empty lines for numbers that are not present
  • export to CSV again (see sample file export-2018-09-26-ordered-with-gaps.csv)

  • Now you can compare the old and new CSV with your favourite compare tool (Meld, WinDiff), see them nicely aligned, and spot the differences.
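The spreadsheet alignment and the comparison can be approximated in Python as well. This is only a sketch under assumptions: align_rows and changed_lines are made-up names, and the station number is assumed to be the integer in the first field of each row. A gap becomes a line holding only its number, so old and new files always have the same number of lines.

```python
import csv
import difflib
import io


def align_rows(rows, slots=100, number_of=lambda row: int(row[0])):
    """Return `slots` CSV lines; line n holds stand n, or just "n" for a gap."""
    by_number = {number_of(row): row for row in rows}
    lines = []
    for n in range(1, slots + 1):
        buf = io.StringIO()
        # New first column with the running number, then the original row.
        csv.writer(buf, lineterminator="\n").writerow([n] + by_number.get(n, []))
        lines.append(buf.getvalue().rstrip("\n"))
    return lines


def changed_lines(old_lines, new_lines):
    """List only the removed (-) and added (+) lines of a unified diff."""
    diff = difflib.unified_diff(old_lines, new_lines, lineterm="", n=0)
    return [d for d in diff
            if d.startswith(("+", "-")) and not d.startswith(("+++", "---"))]
```

A graphical tool such as Meld is still nicer for the actual review; the diff here is just a quick list of what to look at.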

You may have to modify the files to even out some general differences (for example, image URLs changed from http to https, and such). As an example, see sql-2017-03-26.csv and sql-2017-03-26-ordered-with-gaps-look-like-new.csv.

Note: the CSV column names are “standId,standName,standDescription,standPhoto,serviceTag,placeName,longitude,latitude”. The important column is ‘serviceTag’: if it is 1, the stand is not a public sharing stand but a service one, and it should either not be imported or be marked as disabled:amenity.
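To separate public stands from service stands before the review, something like the following Python sketch could be used. The function name split_by_service_tag is made up, and the sketch assumes the converted CSV has no header row and uses exactly the column order quoted above.

```python
import csv
import io

COLUMNS = ["standId", "standName", "standDescription", "standPhoto",
           "serviceTag", "placeName", "longitude", "latitude"]


def split_by_service_tag(csv_text):
    """Return (public, service) lists of row dicts.

    Only serviceTag == "0" counts as public; anything else is held back.
    """
    public, service = [], []
    reader = csv.DictReader(io.StringIO(csv_text), fieldnames=COLUMNS)
    for row in reader:
        tag = (row["serviceTag"] or "").strip()
        (public if tag == "0" else service).append(row)
    return public, service
```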

Prepare the data as an OSM file to view it in JOSM

Install Python to run https://raw.githubusercontent.com/OSMBrasil/csv2osm/master/csv2osm.py, which converts a CSV file into an OSM file. If that doesn't work for you, you can use your favourite editor that supports search-and-replace with regular expressions (Notepad++ on Windows, Kate on Linux).

  1. Take the file (see export-2018-09-26-ordered-with-gaps.csv) and remove the empty lines and those with coordinates “,0,0” (see export-2018-09-26-ordered-with-gaps-to-osm-cleaned-0-0.csv).

  2. Replace:

^([^,]+),([^,]+),([^,]+),([^,]*),([^,]*),([^,]+),([^,]+),([^,]+),(.*)$

with:

<node id='-\1' action='modify' lat='\9' lon='\8'><tag k='name' v='\3' /><tag k='description' v='\4' /><tag k='number' v='\2' /><tag k='amenity' v='bar' /></node>

This converts the CSV lines into nodes with lat, lon, description, and name. The nodes are tagged as a bar so that they get a big, visible icon.

Add these two lines before the first line:

<?xml version='1.0' encoding='UTF-8'?>
<osm version='0.6' generator='JOSM'>

Add this line after the last line:

</osm>

Then save it with the extension .osm (see export-2018-09-26-ordered-with-gaps-to-osm.osm) and you can load it into JOSM.

Prepare existing stands in JOSM
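If the regular expression gives trouble (for example when a name contains an apostrophe or an ampersand, which breaks the hand-made XML), the same conversion can be sketched in Python. This is an illustration, not the csv2osm.py script mentioned above: rows_to_osm is a made-up name, and the column positions are assumed to be those of the numbered file (number, standId, standName, standDescription, standPhoto, serviceTag, placeName, longitude, latitude).

```python
import csv
import io
import xml.etree.ElementTree as ET


def rows_to_osm(csv_text):
    """Build the .osm document text from the numbered CSV."""
    root = ET.Element("osm", version="0.6", generator="JOSM")
    for row in csv.reader(io.StringIO(csv_text)):
        # Skip gap lines and stands with coordinates 0,0.
        if len(row) < 9 or (row[7], row[8]) == ("0", "0"):
            continue
        number, stand_id, name, description = row[0], row[1], row[2], row[3]
        # Negative id + action='modify' marks the node as new for JOSM.
        node = ET.SubElement(root, "node", id="-" + number, action="modify",
                             lat=row[8], lon=row[7])
        for key, value in (("name", name), ("description", description),
                           ("number", stand_id), ("amenity", "bar")):
            ET.SubElement(node, "tag", k=key, v=value)
    header = "<?xml version='1.0' encoding='UTF-8'?>\n"
    return header + ET.tostring(root, encoding="unicode")
```

ElementTree escapes special characters in attribute values, so the result stays loadable whatever the stand names contain. Write the returned text to a file with the extension .osm, encoded as UTF-8.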

Run JOSM, open the data download dialog, and switch to the tab “Download from Overpass API” (in my JOSM 13756).

Fill in this query to get WhiteBike stands:

node
  [operator=WhiteBikes]
  ({{bbox}});
out;

Now let's do the change
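Outside JOSM the {{bbox}} shortcut is not available, so the bounding box has to be written out as south,west,north,east. A small Python sketch of the same query follows; the function names are made up, the public overpass-api.de endpoint is an assumption (any Overpass instance should do), and the coordinates in the usage note are only a rough box around Bratislava.

```python
import urllib.parse
import urllib.request

# Assumption: the public Overpass instance; swap in another one if needed.
OVERPASS_URL = "https://overpass-api.de/api/interpreter"


def build_query(south, west, north, east, operator="WhiteBikes"):
    """Overpass QL for all nodes of one operator inside a bounding box."""
    bbox = f"{south},{west},{north},{east}"
    # 'out meta' adds version info, which an editor needs to modify the nodes.
    return f'[out:xml];node["operator"="{operator}"]({bbox});out meta;'


def fetch_stands(south, west, north, east):
    """Send the query as a POST request and return the OSM XML as text."""
    query = build_query(south, west, north, east)
    data = urllib.parse.urlencode({"data": query}).encode("utf-8")
    with urllib.request.urlopen(OVERPASS_URL, data=data, timeout=60) as resp:
        return resp.read().decode("utf-8")
```

For example, fetch_stands(48.10, 17.03, 48.20, 17.20) would ask for the stands in that rough box; the returned text can be saved as an .osm file and opened in JOSM.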

You have:

  • a layer with the existing stands data
  • a layer with the stands as they are in the WhiteBikes database (export-2018-09-26-ordered-with-gaps-to-osm.osm)
  • an open comparison of the old and new data (export-2018-09-26-ordered-with-gaps-to-osm.osm and sql-2017-03-26-ordered-with-gaps-look-like-new.csv)
  • additional data: satellite imagery (from Bing), Mapillary data

You do:

  • Go through the list in the comparison tool and see if there is any difference.
  • If there is, either delete the stand (ZRUSENY means deleted) or modify it.
  • Do not insert stands with the value ‘1’ in the 6th column; that means it is a service stand and it won’t show on WhiteBikes maps. Only ‘0’ is OK.
  • If a stand is not at the same place in the existing data and in the database, look into export-stands-20170613-popisky.csv, which describes how many stands are placed incorrectly in the WhiteBikes map (sometimes up to 50 meters away from the position in the description or on the photo).
  • Mark stand types according to the rules on the wiki: https://wiki.openstreetmap.org/wiki/Sk:bicycle_parking_Cyklokoalicia_import
  • If a stand is temporarily disabled, mark it as disused:amenity=bicycle_rental and keep the other properties, so that it is easy to switch it back to the working state.
Notes - helper, for copying attributes:

Parking stands:

  • amenity=bicycle_parking
  • bicycle_parking=stands
  • ref:cyklokoalicia=108 NOVEMESTO

Changeset is 62997154.

Location: Kalmárka, Nivy, Ružinov, District of Bratislava II, Bratislava, Region of Bratislava, 821 09, Slovakia

This is not a new diary entry; it was originally posted as the Slovak translation of the English diary entry from 6 months ago: http://www.openstreetmap.org/user/MiroJanosik/diary/39355

Why and how

In Slovakia we have an electronic cadastre available. It provides vector data of buildings and the buildings' conscription numbers (in Slovakia we have conscription numbers, Sk:Key:addr:conscriptionnumber ). These data are publicly available, and some smart heads from the OSM community created a JOSM plugin that can import them into OSM maps, Sk:KaporSKAddress#Import_s.C3.BApisn.C3.BDch_.C4.8Disiel .

I decided to use these available data and the import tool and assign them to the OSM map of my village. Then I would check whether the data are true. My main goals were:

  1. to make it possible to find a house in the OSM map by its number. According to media reports, this can already help ambulances in an emergency.
  2. for my village to have up-to-date data in OSM, so that OSM reflects reality, which is the goal of OSM
  3. curiosity about how accurate the data in the electronic cadastre are

My village has approximately 1600 inhabitants and 632 houses, plus a few garages, cottages, and sheds. The village has approximately 10 streets.

The fast part

The building vectors of my village have been in OSM for some time, as they have for most of Slovakia. However, the import of conscription numbers cannot be fully automated, and there are catches. Jose helped me with the import process (thanks!) and I had it done in less than half an hour. This was my first set of data uploaded to OSM.

The slow part

The import showed that in about 15 cases different houses have the same number. I decided to check all the houses in the village to find out the real house numbers (and also for reasons 1-3 above).

So I printed out the OSM map, zoomed in enough that I could see the house numbers on it. I walked around the village and marked correct and incorrect house numbers on the paper (this took me 4 walks around the village). After each walk I started JOSM and edited the house numbers that needed changing, removed the ‘import notes’, and added the “source:conscriptionnumber” tag.

My findings

  • I made 4 walks around the village; each took about 1.5 hours, and editing the data in JOSM took another 30-45 minutes. Added up, that is 8-9 hours just for my small village!
  • 73/632 houses had no visible house number
  • 10/632 houses could not be reached to check whether they have a house number - they were inside locked yards, and there was no number on the gate
  • 26 houses looked like ordinary inhabited family houses but had no number either on the house or in the cadastre
  • ~5% of houses had some error in the cadastral data - a wrong number, or two houses with the same number (either two joined buildings where one of them used to be a garage, or two completely different buildings)
  • it is not easy to map with a paper map in cold weather, your fingers freeze :)
  • when you look at houses and write something on paper, people look at you strangely :-)

After the fourth walk I found that I was still missing some data, so I had to make a fifth walk. Yesterday I checked the data in JOSM and found that about 30 buildings still lack the “source:conscriptionnumber” tag, so I will have to make a sixth walk as well.

What I would do differently next time

  • I would do the walks in summer, while fingers don't freeze; when winter is coming it doesn't work and you have to wait for the warm season
  • I would print “walking papers” in higher detail and reserve enough empty space on them for notes
  • I would write detailed notes on the paper. I wrote abbreviations, and a few days later I could not remember what they meant

Why and how

In Slovakia we have an electronic cadastre available, and it contains buildings in vector format with building conscription numbers (in our country it is the conscription number, Key:addr:conscriptionnumber ). This information is public, and some smart OSM users have created a way to convert it into OSM maps, KaporSKAddress#Conscription_numbers_import .

I decided to use the available method, which takes cadastre house numbers and assigns them to buildings in the OSM map of my town, and then to verify how correct these data are. The goals were:

  1. to have houses searchable by number in OSM. A searchable map was found to be helpful for ambulances in an emergency, too!
  2. to keep the OSM map up to date, reflecting reality, which is the goal of OSM
  3. curiosity about how precise the cadastre data are

My town has approx. 1600 inhabitants and 632 houses, plus some garages, cottages, and huts. The town has approx. 10 streets.

Fast part

Building shapes have already been in OSM for most of Slovakia for some time. Importing conscription numbers, however, can’t be fully automated and can be tricky. Jose helped me with the import process (thank you!) and it took less than half an hour. This was my first commit.

Slow part

The import process showed that in about 15 cases different houses had the same house number. I decided to resurvey the whole village to verify all the house numbers (for reasons 1-3 above).

So I printed out a zoomed-in OSM map with visible building numbers. I walked around the town and noted down anything that was not correct (I did this in 4 separate sessions). After each session I started JOSM, corrected the houses that needed correction, removed the ‘import notes’ from the houses, and added source:conscriptionnumber.

My findings

  • I had 4 walking sessions around the town; each took me about 1.5 hours, and fixing the data in JOSM took another 30-45 minutes. That adds up to 8-9 hours for my small town!
  • 73/632 houses did not have a visible number on the house
  • 10/632 houses were not reachable to verify their number - they were inside locked yards, and the number was not on the gate either
  • 26 houses looked like real houses with families living in them for some time, but they had no number in the cadastre or on the house
  • ~5% of houses in the cadastre data were wrong in some way - an incorrect number, or two houses with the same number (either completely different houses or two joined buildings, where sometimes one of them is a garage)
  • it is not good to do paper-walk mapping in cold weather, your fingers freeze off :)
  • people stare at you if you are looking at buildings and writing something on a sheet of paper :-)

After 4 walking sessions I realised that I was still missing some data and had to make a 5th. Yesterday I checked whether all the data have a proper source:conscriptionnumber and realised that I’m still missing it on 30 buildings, so I need to make a 6th walk.

What I would do differently next time
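Both checks (houses sharing a number, and buildings still without source:conscriptionnumber) can be run on a saved .osm file. The following Python sketch is not what I used; audit_numbers is a made-up name, and it only assumes the tag keys named in this entry.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

NUMBER_KEY = "addr:conscriptionnumber"
SOURCE_KEY = "source:conscriptionnumber"


def audit_numbers(osm_xml):
    """Return (duplicates, missing_source) for objects with a conscription number.

    duplicates maps a number to the objects sharing it;
    missing_source lists objects that have a number but no source tag.
    """
    by_number = defaultdict(list)
    missing_source = []
    root = ET.fromstring(osm_xml)
    for element in root:
        if element.tag not in ("node", "way", "relation"):
            continue
        tags = {t.get("k"): t.get("v") for t in element.findall("tag")}
        if NUMBER_KEY not in tags:
            continue
        ref = f"{element.tag}/{element.get('id')}"
        by_number[tags[NUMBER_KEY]].append(ref)
        if SOURCE_KEY not in tags:
            missing_source.append(ref)
    duplicates = {n: refs for n, refs in by_number.items() if len(refs) > 1}
    return duplicates, missing_source
```

A shared number is not always an error (two joined buildings can belong to one address), so the duplicates are a list of places to visit, not a list of mistakes.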

  • do the walking sessions in summer, while you can feel your fingers; otherwise you will need to wait for the next warm season
  • print out walking papers in good detail, and make sure there will be enough space on them to write comments
  • write real comments on the walking papers, not just abbreviations; you forget what the abbreviations mean if you upload the data a few days later

Location: Sološnica, District of Malacky, Region of Bratislava, 906 37, Slovakia

I’m starting an album of unofficial bicycle routes around Malacky. It can be found at http://rudava.mypage.sk ; it is of course based on OSM data (and my paper notes) and drawn by hand in uMap.

I expect to slowly add new tracks as I find them and as I get contributions from other cyclists in the region. I found a local cyclist group, “SCK Záhorák Malacky”, which will have knowledge about local paths.

Location: Malacky, District of Malacky, Region of Bratislava, Slovakia