
I wrote one of the comments quoted in that chain of tweets, heh. Here's my specific example. This was years ago, so I don't remember much anymore and things may have changed. But I just gave it a basic attempt and Wikipedia still seems easier than Wikidata. (I put more effort into using Wikidata when I tried years ago, but all I really remember is that it wasn't as fruitful as just fetching Wikipedia.)

My goal: a list of every airport on Wikipedia with an IATA code and the city it is attached to. There is a perfect Wikipedia page to start from, while as far as I can tell, Wikidata does not have any of the data from the table on that page?

https://en.wikipedia.org/wiki/List_of_airports_by_IATA_airpo...

https://www.wikidata.org/w/api.php?action=wbgetentities&form...
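
For what it's worth, pulling rows out of an HTML table needs nothing beyond the standard library. This is only a rough sketch: the `SAMPLE` snippet and its column names are made up to resemble such a table, and the real page's markup may well differ (nested tables, footnotes, and so on).

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a cell (link text included).
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_rows(html):
    """Return all table rows found in an HTML string."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical snippet; the column headers are an assumption for illustration.
SAMPLE = """
<table>
<tr><th>IATA</th><th>Airport name</th><th>Location served</th></tr>
<tr><td>AKG</td><td><a href="#">Anguganak Airport</a></td><td>Anguganak</td></tr>
</table>
"""

rows = parse_rows(SAMPLE)
print(rows[1])  # ['AKG', 'Anguganak Airport', 'Anguganak']
```

Fetching the page itself is left out on purpose, so the sketch runs offline.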



I like that geospatial join you have there. Really it should be two query tabs and an interactive map.

I have often wanted a geofilter around my Wikipedia search, especially when I am on vacation. Basically, give me every Wikipedia page that ever talked about anything within 50km of here. And then one could filter down or have a personal recommendation system boost stuff you like.
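
The "within 50km of here" part is easy to sketch once you have coordinates per page. A minimal version, assuming you already hold a list of pages with `title`, `lat`, and `lon` fields (that record shape and the page titles below are hypothetical), using the haversine great-circle distance:

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, good enough for a rough filter


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def pages_within(pages, here, radius_km=50.0):
    """Return (distance_km, title) pairs inside the radius, nearest first."""
    lat, lon = here
    hits = [(haversine_km(lat, lon, p["lat"], p["lon"]), p["title"]) for p in pages]
    return sorted((d, t) for d, t in hits if d <= radius_km)


# Hypothetical pages: one about 11 km away, one about 111 km away.
PAGES = [
    {"title": "Nearby place", "lat": 0.0, "lon": 0.1},
    {"title": "Faraway place", "lat": 0.0, "lon": 1.0},
]

nearby = pages_within(PAGES, here=(0.0, 0.0))
```

The personal-recommendation boost would then just be a re-sort of `nearby` by whatever score you like.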


I hope this helps with getting started: https://w.wiki/3x3n

And here's a visualization on a map, using geocoordinates: https://w.wiki/3x3g


Thanks, the queries are very powerful, but it still seems like this data is not as usable as the data in the HTML table. Any airports that don't have Wikipedia links for the airport or city don't get picked up, and the Wikidata results contain disagreeing duplicates that the HTML table does not have.

For example, Anguganak Airport (AKG) and the city of Anguganak don't have articles, so they don't appear in the Wikidata results. ALZ doesn't appear in the data because Lazy Bay does not have an article page. There are some duplicate entries with different cities or airport names, like AAL, AAU, ABC. ABQ has 4 different entries. The data is also out of date in some instances. "Opa-locka Airport" was renamed to "Miami-Opa Locka Executive Airport" in 2014, for example. In the HTML table all these issues are solved.
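
The kind of check I'm describing can be sketched like this: take the codes from the HTML table, take the rows from the query result, and report codes that are missing or that come back with more than one distinct entry. The `(code, airport, city)` row shape and the placeholder labels are assumptions for illustration, not actual query output.

```python
from collections import defaultdict


def compare(table_codes, query_rows):
    """Return (codes missing from the query, codes with disagreeing entries)."""
    seen = defaultdict(set)
    for code, airport, city in query_rows:
        seen[code].add((airport, city))
    # Codes listed in the table that the query never returned.
    missing = sorted(set(table_codes) - set(seen))
    # Codes the query returned with more than one distinct (airport, city) pair.
    conflicting = {code: sorted(entries) for code, entries in seen.items() if len(entries) > 1}
    return missing, conflicting


# Placeholder rows; the labels are invented stand-ins.
TABLE_CODES = ["ABQ", "AKG", "ALZ"]
QUERY_ROWS = [
    ("ABQ", "airport label 1", "city label"),
    ("ABQ", "airport label 2", "city label"),
]

missing, conflicting = compare(TABLE_CODES, QUERY_ROWS)
print(missing)  # ['AKG', 'ALZ']
```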


Thanks for the answer!

I got the query wrong (reason: https://twitter.com/vrandezo/status/1430206988177219593 )

Here's the corrected query: https://w.wiki/3x8u

This includes a few more thousand results.

AKG does show up (but has indeed no connection to Anguganak), ALZ shows up (again, without a connection to a city). Article pages are not a requirement for the data to be in Wikidata.
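
To illustrate how an airport can show up without a city: in SPARQL, putting the city pattern inside `OPTIONAL` keeps airports that have an IATA code but no linked city. The sketch below is not the query behind the short link above. `P238` is the IATA airport code property; using `P931` (place served by transport hub) for the city is my assumption here, and other properties could be used instead. The snippet only builds the request URL and does not contact the endpoint.

```python
from urllib.parse import urlencode

ENDPOINT = "https://query.wikidata.org/sparql"

QUERY = """
SELECT ?airport ?airportLabel ?iata ?cityLabel WHERE {
  ?airport wdt:P238 ?iata .               # P238: IATA airport code
  OPTIONAL { ?airport wdt:P931 ?city . }  # city link is optional (P931 is an assumed choice)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?iata
"""


def query_url(query=QUERY, endpoint=ENDPOINT):
    """Build a GET URL asking the endpoint for JSON results."""
    return endpoint + "?" + urlencode({"query": query, "format": "json"})


url = query_url()
```

Without the `OPTIONAL`, every airport lacking a city statement would silently drop out of the results.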

I see your point. The duplicate entries can often be explained (e.g. ABQ is indeed the IATA code both for Albuquerque Sunport and the Kirtland AF Base, which are adjacent to each other), but that's already a lot of detail.

If a single table provides the form of clean data one is looking for, that's great and should be used (and slightly different than the original question that triggered this, where we had to go through many different pages and fuse data from thousands of pages together). Different tasks benefit from different inputs!



