OpenStreetMap

pnorman's Diary Comments

Diary Comments added by pnorman

Post When Comment
Future deprecation of HTTP Basic Auth and OAuth 1.0a

mmd’s example does this in 11 lines of bash, including endpoint discovery. I think the right environment variables could make all subsequent curl calls add that header, but you might need to write to a .curlrc file and change $CURL_HOME. Please don’t do this, as if you aren’t careful you’ll leak the token everywhere. This is better than the HTTP Basic equivalent of leaking username/password everywhere, but still best avoided.

I haven’t had to implement an OAuth 2.0 CLI app before, so I wanted to see how long it would take me. It took me about half an hour using a library I had never used or read the documentation of. I could have used Ilya’s script which does all of this and handles saving the token.

I implemented what mmd did, except without endpoint discovery, in 13 lines, taking under half an hour starting from scratch. The code and the output from running it are at https://gist.github.com/pnorman/19c103add9fcc6b9ee8a5792d2598ef4. I’ve deleted the OAuth 2 application, so you’ll need to make your own.

From this point in my code, writing API calls by hand would just be like normal, as the interface is the same as the requests library.
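For anyone wanting to try the same exercise, here is a rough standard-library-only sketch of the token dance for a CLI app. The authorize and token endpoints are the ones osm.org advertises for OAuth 2.0; the client ID, client secret, redirect URI, and scope are placeholders for your own registered application's values, and the error handling is exactly the "bail out on error" you shouldn't ship.

```python
import json
import urllib.parse
import urllib.request

AUTH_URL = "https://www.openstreetmap.org/oauth2/authorize"
TOKEN_URL = "https://www.openstreetmap.org/oauth2/token"


def build_authorize_url(client_id, redirect_uri, scope):
    """Return the URL the user must open in a browser to grant access."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",
        "scope": scope,
    }
    return AUTH_URL + "?" + urllib.parse.urlencode(params)


def exchange_code(client_id, client_secret, redirect_uri, code):
    """Trade the one-time authorization code for a bearer token."""
    data = urllib.parse.urlencode({
        "grant_type": "authorization_code",
        "client_id": client_id,
        "client_secret": client_secret,
        "redirect_uri": redirect_uri,
        "code": code,
    }).encode()
    with urllib.request.urlopen(TOKEN_URL, data=data) as resp:
        return json.load(resp)["access_token"]
```

After that, every API call is just a normal request with an `Authorization: Bearer <token>` header added.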

If you’re doing this for real, best practice would be to have error handling better than bailing out on error.

uMap: fine-grained permissions and more

It would be good to make it clearer that visibility set to “anyone with link” is essentially viewable by everyone, as you can trivially increment the map ID and view all possible maps.

🌂 The Past, The Present, The Future

@RobJN, [banners that interfere with the close button are] a sign of OSM changing.

No, releasing a banner which conflicts with the close button is a sign of OSM staying the same. It’s happened with SOTM banners in the past, and probably will in the future.

친절인가요, 배타성인가요?(Is that kindness or exclusivity?)

Why isn’t there an “OSM UK”?

There is.

Analyzing OSM's Tile Logs

The other columns are request rates with documentation at https://github.com/openstreetmap/tilelog#format-documentation

A critical analysis of Bing Map Builder part 2 - an update to 'OpenStreetMap is in Trouble'

So, what is the behaviour now? I drew a new building, clicked save and… the building disappeared from my screen. When I opened the network console, this network call confirmed my suspicions. The created data is now sent to https://bing.com/mapbuilder/changeset/submit and contains the changeset data (and a bit of extra information).

It’s always been sent to Bing directly. What appears to have changed is Bing isn’t taking those edits and submitting them to the OSM API.

I would watch for the building you mapped.

There are a few possibilities:

- the data goes nowhere right now,
- the data eventually appears on Bing Maps, or
- the data eventually appears in OSM.

If it appears on Bing, then we can get the data from Bing under the ODbL and if we want, bring it into OSM.

The 20% drop in new contributors (preliminary analysis)

In 2022 we had 813883 account creation events in Matomo, while in 2021 there were 953865. This is fairly close to the percentage drop in edits.

Unfortunately for finding out more, about 70% of people creating accounts come to the site from direct entry or search engines, which don’t tell us how they arrived at OSM. Another 20% come from website links, and the remaining 10% is social media and rounding errors.

The website category can be further broken down. For reference, there was a drop of 15% overall. The following are the top inbound sites for account signups in 2021, with the number of signups in each year, and the percentage difference.

| Site | Conversions (2021) | Conversions (2022) | Difference |
|------|-------------------:|-------------------:|-----------:|
| tasks.hotosm.org | 37064 | 25367 | -32% |
| accounts.google.com | 17032 | 13993 | -18% |
| umap.openstreetmap.fr | 9177 | 8041 | -12% |
| www.openstreetmap.de | 8019 | 7928 | -1% |
| export.hotosm.org | 6193 | 4788 | -23% |
| wiki.openstreetmap.org | 5552 | 4445 | -20% |
| openstreetmap.jp | 2980 | 2189 | -27% |
| www.orangefox.com | 2149 | #N/A | -100% |
| url-opener.com | 1876 | 2573 | 37% |
| www.so.com | 1626 | 1492 | -8% |
| www.openstreetmap.fr | 1610 | 1439 | -11% |
| www.onosm.org | 1602 | 1638 | 2% |
| israelhiking.osm.org.il | 1530 | 1227 | -20% |
| link.zhihu.com | 1337 | 2123 | 59% |
| link.csdn.net | 1287 | 1039 | -19% |
| leafletjs.com | 1173 | 753 | -36% |
| mapcarta.com | 1138 | 1677 | 47% |
| admin.booking.com | 1101 | 1269 | 15% |
| geohack.toolforge.org | 1068 | 790 | -26% |
| framacarte.org | 1059 | 1125 | 6% |
| www.missingmaps.org | 945 | #N/A | -100% |
| www.10bestseo.com | 896 | 1057 | 18% |
| www.osmhydrant.org | 838 | 934 | 11% |
| osm-boundaries.com | 832 | 717 | -14% |
| nominatim.openstreetmap.org | 799 | 680 | -15% |
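The "Difference" column is just the relative change, (new − old) / old, rounded to whole percent. A quick sketch with numbers taken from the figures above:

```python
def pct_change(old, new):
    """Relative change from old to new, as a rounded whole percentage."""
    return round(100 * (new - old) / old)

# tasks.hotosm.org conversions, 2021 vs 2022
print(pct_change(37064, 25367))  # → -32

# overall account-creation events, 2021 vs 2022
print(pct_change(953865, 813883))  # → -15
```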

Not all signups are equal, of course. I suspect a signup driven from a local chapter website is more likely to be useful than one coming from a SEO site.

Cleaning up cuisines in Canada with JOSM

As a Canadian, I’m not quite sure what Canadian cuisine is, but I know it’s a recognized category. See for example Yelp Canadian (New), “modern Canadian cuisine” at Brix & Mortar, or OpenTable Contemporary Canadian.

Testing planet import

Does overpass not read the history PBFs for historical data?

Discord Ban Appeal

The Discord server is run by the Discord admins, so you can contact them through Discord to appeal.

As for the ban, you were told what you were doing is disruptive, told to stop doing it, and you continued. That leaves admins with no option except a ban. You may not agree that it was disruptive, but it’s up to the admins to decide that.

If you disagreed, you should have still stopped your conduct, and discussed the matter, not ignored admins.

OpenStreetMap Carto release v5.6.0

The reason that I’m asking is that I (or someone) will need to update Manually building a tile server (Ubuntu 22.04 LTS) et al and it’d be good to know whether the same caveats are needed as are around get-external-data.py, which offers no user feedback and does not “fail safe”, hence the warnings such as “not much will appear on the screen…” on the switch2osm pages.

get-external-data.py does offer the user feedback on what it’s doing. If there’s a case where it’s not fail safe - i.e. it exits with a 0 status on an error - please open a bug.

One other question - what operating systems are supported (especially with regard to the fonts change) and which have explicitly been tested?

The requirements are shell and curl. It might take a bit of work to get it running on WSL or cygwin, but getting style design software running on them is more involved too.

Google Summer of Code 2022: Phase 1

I’ve been using tiles2image to turn map tile lists into images. Because you only need 1 px per tile at a fixed zoom, the resulting images are small.

To get grayscale I’ve been using something hacked together, diff below:

diff --git a/tiles2image.py b/tiles2image.py
index d65cc15..4455f16 100755
--- a/tiles2image.py
+++ b/tiles2image.py
@@ -4,6 +4,7 @@

 import sys
 import argparse
+from math import log
 from PIL import Image

 parser = argparse.ArgumentParser()
@@ -11,16 +12,24 @@ parser.add_argument("zoom", type=int, help = "Zoom level of tiles")
 parser.add_argument("filename", help = "Name of PNG to write to")
 args = parser.parse_args()

-img = Image.new('1', (2**args.zoom, 2**args.zoom), "black")
+img = Image.new('L', (2**args.zoom, 2**args.zoom), "black")
 pixels = img.load()

+max_hits = 0
 for line in sys.stdin.readlines():
+    splitline=line.split(' ',2)
     # Standard z/x/y format
-    splitline=line.split('/',4)
-    if (splitline[0] != str(args.zoom)):
+    tile=splitline[0].split('/',4)
+    if (tile[0] != str(args.zoom)):
         raise ValueError("Line {} does not have zoom {}".format(line, args.zoom))
-    x = int(splitline[1])
-    y = int(splitline[2])
-    pixels[x,y] = 1
+    x = int(tile[1])
+    y = int(tile[2])
+    hits = int(splitline[1])
+    max_hits = max(max_hits, hits)
+    loghits = (log(hits-9,10))/(log(500000,10))
+
+    pixels[x,y] = min(255, int((loghits ** 1.8)*300))
+
+print(max_hits)

 img.save(args.filename)
\ No newline at end of file

For zoom 12, this results in a 612K PNG image, for zoom 15 it is a 3.7 MB PNG.

Have you decided which zoom level’s tiles will correspond to 1 px in your image? Looking at the TIFF, it is 262144 px wide, which corresponds to z18 tiles. This seems like an unnecessarily high resolution, as z18 tiles are only a few houses wide.
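The arithmetic behind that zoom guess: at zoom z there are 2^z tiles across the world, so an image with one pixel per tile is 2^z px wide, and you can read the zoom back off an image width with a base-2 log.

```python
from math import log2


def zoom_for_width(width_px):
    """Zoom level whose tile grid is width_px tiles across (1 px per tile)."""
    return int(log2(width_px))


print(zoom_for_width(262144))  # → 18
print(2 ** 12)                 # image width at z12, in px → 4096
```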

Does the data need to be in the database? If it’s being processed in the PHP application, storing it in a PNG is an option to consider, since getting a pixel value for a <1MB PNG is pretty fast.

I have conducted performance tests twice on each of the two tile sizes of both ends of the recommended range to understand the time it takes to load the GeoTIFF file into the database, the space the raster data takes, and the number of rows of the newly created table. The table below is the performance test results:

I would focus on time to get the value for a given coordinate, not on loading time. This is likely to be correlated with table size more than loading time, so sizes larger than 100x100 might be better.

Monitoring Tile Servers with Fastly healthcheck status

Yes - the improved monitoring of StatusCake might have caught it, but if Pyrene had been working (e.g. if it had been the most recent to be upgraded) then we’d have still had the problem among the European servers which would have been just as difficult to diagnose, so I think both monitoring changes are important.

I like the presentation on the Tile CDN dashboard.

Publishing sites using tile.openstreetmap.org

Somehow I’m missing the bigger picture here. Should site owners check this information to evaluate their own resource usage and find out how they compare to others?

No. The goal is to improve the ability of the community to see how the service is being used and reduce the number of times admins need to run ad-hoc queries to see usage. We’ve done this in the past for user-agents.

In particular, historical usage was exceptionally difficult to query because of the large amount of data (~130GB/day compressed) that needed to be queried.

Publishing sites using tile.openstreetmap.org

I’ve gone with CSV, quoting the string field. On the technical side, it’s using Python’s csv.writer with unix_dialect and QUOTE_NONNUMERIC.

SQLite would be overkill for what is 15 kB of data in text form in a single table.

If I were truly reporting website domains, I wouldn’t worry about quoting, since domain names have a restricted set of characters. Instead, what I’m reporting is extracted domains from the referer header, and there’s nothing that stops an app from sending https://www.foo,bar.com/ as a referer header. They’d eventually get blocked for sending a fake header, but it doesn’t rule out them appearing in the logs.
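A sketch of why the quoting matters: with the unix dialect and QUOTE_NONNUMERIC (the combination mentioned above), a comma inside a bogus extracted “domain” stays inside one quoted field instead of splitting the row. The example rows here are invented.

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf, dialect="unix", quoting=csv.QUOTE_NONNUMERIC)

# A well-behaved domain, and a bogus "domain" extracted from a fake
# referer header containing a comma.
writer.writerow(["umap.openstreetmap.fr", 12345])
writer.writerow(["www.foo,bar.com", 67])

print(buf.getvalue())
# → "umap.openstreetmap.fr",12345
#   "www.foo,bar.com",67
```

Strings get quoted, counts stay bare numbers, and any CSV parser reads the comma-bearing domain back as a single field.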

That is related to the origin of the “” entries in the hosts log. They come from referers that don’t parse as a valid URI.

In practice, for the type of log processing I’m doing, I’m likely to ignore that. If I run a command like fgrep '"umap.openstreetmap.fr"' hosts* | sed 's/hosts-//' | sed 's/.csv:"umap.openstreetmap.fr"//' > umap.csv I get a CSV I can open in a spreadsheet and do further stuff with.

The first file is up at https://planet.openstreetmap.org/tile_logs/hosts-2022-06-26.csv. I’m preparing the files to backfill into 2021.

Inferring Default Speed Limits

Actually, I think the best approach to properly get to current (non-legislative but) de-facto speed limits is to not look at OSM data but have a fleet of cars constantly on the roads and source the data for the current traffic situation from that. E.g. like Google does it (with data sourced from users of Google Maps). But whoever else may be collecting this kind of data (big taxi companies? navigation system manufacturers?), it is proprietary data and there is no good reason for them to give that away for free other than maybe to combine forces against a common market leader (Google).

You need this information for good routing anyway, which is the main use case for speed limit data. Typically you want to develop routing cost functions based on actual speed, not max speed.

Google Summer of Code 2022

No, the tile logs are from the CDN and are an accurate count of successful requests for tiles. Prior to 2021-04-13 the logs were only from the second layer of the old CDN.

The logs include successful requests on tiles where there were at least 10 requests, and the requests came from at least 3 distinct IPs. Most of them represent real views, but there are some artifacts.
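A sketch of the filtering rule just described: keep a tile only if it saw at least 10 successful requests from at least 3 distinct IPs. The input format here (tile, IP) pairs is invented for illustration; the real pipeline works on CDN logs.

```python
from collections import defaultdict


def filter_tiles(requests, min_requests=10, min_ips=3):
    """Return the set of tiles meeting the request and distinct-IP thresholds."""
    counts = defaultdict(int)
    ips = defaultdict(set)
    for tile, ip in requests:
        counts[tile] += 1
        ips[tile].add(ip)
    return {t for t in counts
            if counts[t] >= min_requests and len(ips[t]) >= min_ips}
```

A tile hammered by a single IP never makes the published logs, which is why most entries represent real views rather than one client misbehaving.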

Please, stop to devastate Mogadishu

Hi Alessandro, Immaculate Mwanja has recently posted about the project on the HOT mailing list: https://lists.openstreetmap.org/pipermail/hot/2022-February/015767.html

Unfortunately, when I tried contacting them back in January asking for the information required by the OEG, I got no reply.

Lot of frustration due to bad H.O.T. tasks

Immaculate Mwanja, do you have a link to the documentation on the wiki that the OEG calls for? In particular, I’m interested in the “plans for a ‘post-event clean up’ to validate edits”.

GitHub's backward blocking causes conflict aggravation

With the large number of people blocking you, have you considered modifying your behavior so that they no longer feel the need to do so?