We do something similar for limited geospatial search using Elasticsearch. We generate a set of H3 indexes for each of the hundreds of millions of GPS recordings on our service and store them in Elasticsearch. Geospatial queries become full-text search queries: a point is on the line if the set of H3 indexes contains the point's cell. You can query how many cells overlap, which lets you match geospatial tracks on the same paths, and with ES coverage queries you can tune how much overlap you want.
Instead of using integer IDs for the hexes, we created an encoded version of the ID with the property that removing a character gets you the containing parent of the cell. This means we can do basic containment queries by using a low-resolution hex (a short string) as a prefix query. If a GPS track goes through this larger parent cell, the track will have hexes with the same prefix. You don't get perfect control of distances because hexes have varying diameters (or rather approximate diameters, since they're hexagons, not circles), but in practice, and at scale, for a product that doesn't require high precision, it's very effective.
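To illustrate the prefix trick (this is a toy encoding, not their actual scheme or the real H3 bit layout -- `encode_cell` and the digit format are made up):

```python
# Toy sketch: represent a cell as a base cell number plus one child digit
# per resolution level, so truncating the string yields the parent cell.

def encode_cell(base_cell: int, child_digits: list[int]) -> str:
    """Hypothetical encoding: 2-char base cell, then one char (0-6) per level."""
    return f"{base_cell:02d}" + "".join(str(d) for d in child_digits)

def parent(cell_id: str) -> str:
    """Dropping the last character gives the containing parent cell."""
    return cell_id[:-1]

def contains(coarse: str, fine: str) -> bool:
    """A coarse cell contains a fine cell iff its ID is a prefix of the
    fine cell's ID -- exactly what an ES prefix query checks."""
    return fine.startswith(coarse)

# A track's hexes all share the prefix of any parent cell they pass through.
track = [encode_cell(17, [3, 5, 2, 6]), encode_cell(17, [3, 5, 4, 1])]
query = encode_cell(17, [3, 5])  # low-resolution "parent" hex
assert all(contains(query, c) for c in track)
```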
I think by the end of this year we'll have about 6 TB of these hex sets in a four-node, eight-process ES cluster. Performance is pretty good. It also acts as our full-text search: half the time we want a geo search, we also want keywords / filtering / etc. on the metadata of these trips.
Pretty fun system to build, and the concept works with a wide variety of data stores. Felt like a total hack job but it has stood the test of time.
Elasticsearch and OpenSearch have a built-in geo_shape type that is a bit more optimal for queries like this.
Before that existed (pre-1.0, actually), I did something similar with geohashes, which are similar to H3 but based on simple string-encoded quadtrees. I indexed all the street segments in OpenStreetMap that way (~800 million at the time) and implemented a simple reverse geocoder. Worked shockingly well.
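For the curious, a geohash encoder is small enough to sketch from scratch. This is the classic public algorithm (not the commenter's indexing code): alternately bisect longitude and latitude, emit one bit per bisection, and pack 5 bits into each base-32 character. Truncating the string widens the cell, which gives the same prefix-containment trick described above.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)

def geohash(lat: float, lon: float, precision: int = 9) -> str:
    """Encode (lat, lon) as a geohash string of the given length."""
    lat_rng, lon_rng = [-90.0, 90.0], [-180.0, 180.0]
    out, bits, ch, even = [], 0, 0, True
    while len(out) < precision:
        # Even bit positions refine longitude, odd ones refine latitude.
        rng, val = (lon_rng, lon) if even else (lat_rng, lat)
        mid = (rng[0] + rng[1]) / 2
        ch <<= 1
        if val >= mid:
            ch |= 1
            rng[0] = mid
        else:
            rng[1] = mid
        even = not even
        bits += 1
        if bits == 5:  # 5 bits -> one base-32 character
            out.append(BASE32[ch])
            bits, ch = 0, 0
    return "".join(out)

# Prefix property: a shorter geohash is the containing cell of a longer one.
assert geohash(57.64911, 10.40744, 11).startswith(geohash(57.64911, 10.40744, 5))
```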
The geo_shape type uses a BKD tree in binary format. It's heavily optimized for this type of intersects/overlaps query at scale. It basically does the same thing but uses a lot less disk space and memory, similar to what you'd find in proper GIS databases. Elasticsearch/OpenSearch also support H3 and geohash grid aggregations on top of geo_shape or geo_point types.
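For reference, here's roughly what such a query looks like in the Elasticsearch query DSL, sketched as a Python dict. The index and field names (`geom`) are hypothetical; the `geo_shape`/`relation` structure and the `envelope` coordinate order are from the ES docs.

```python
# Elasticsearch geo_shape query body: find documents whose indexed shape
# intersects a bounding box. "envelope" coordinates are given as
# [top-left, bottom-right] in [lon, lat] order.
intersects_query = {
    "query": {
        "geo_shape": {
            "geom": {  # hypothetical field name of geo_shape type
                "shape": {
                    "type": "envelope",
                    "coordinates": [[-122.7, 45.6], [-122.5, 45.4]],
                },
                "relation": "intersects",  # also: within, contains, disjoint
            }
        }
    }
}
```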
I'm guessing the author is using something like PostgreSQL, which of course has similar geospatial indexing support via PostGIS.
Doesn't meet all our product requirements, unfortunately. We use the returned hexes in certain queries, and we also hacked directionality of the line into the least significant 12 bits of the hex (we didn't need that level of hex precision), and we do direction-oriented matching and counting. For simpler use cases it's definitely a better option. Thanks for reminding me and other people reading my comment!
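A hypothetical sketch of the low-bit trick -- the actual bit layout and bucketing are theirs; `pack_direction` and the 12-bit heading bucket here are assumptions for illustration:

```python
DIR_BITS = 12
DIR_MASK = (1 << DIR_BITS) - 1

def pack_direction(cell_id: int, heading_deg: float) -> int:
    """Overwrite the 12 least significant bits of a cell ID (precision we
    don't need) with a bucketed compass heading."""
    bucket = int(heading_deg % 360 / 360 * (1 << DIR_BITS)) & DIR_MASK
    return (cell_id & ~DIR_MASK) | bucket

def unpack_direction(packed: int) -> float:
    """Recover the approximate heading (degrees) from the low bits."""
    return (packed & DIR_MASK) / (1 << DIR_BITS) * 360.0
```

Two tracks through the same cell in opposite directions then get different IDs, which makes direction-oriented matching a plain term comparison.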
No idea if they are doing this, but you can use Gosper islands (https://en.wikipedia.org/wiki/Gosper_curve) which are close to hexagons, but can be exactly decomposed into 7 smaller copies.
Yes! A Gosper Island in H3 is just the outline of all the descendants of a cell at some resolution. The H3 cells at that resolution tile the sphere, and the Gosper Islands are non-overlapping subsets of those cells, which means they tile the sphere too.
Not quite - you need 12 pentagons in a mostly hexagonal tiling of the sphere (and if you're keeping them similar sizes, Gosper-islands force hexagon-like adjacency). I don't think it's possible to tile the sphere using more than 20 exactly identical pieces.
You could get a Gosper-island like tiling starting from H3 by saying that each "Hex" is defined recursively to be the union of its 6/7 parts (stopping at some small enough hexagons/pentagons if you really want). Away from the pentagons, these tiles would be very close to Gosper islands.
> I don't think it's possible to tile the sphere using more than 20 exactly identical pieces.
I was wrong about this (e.g. https://en.wikipedia.org/wiki/Rhombic_triacontahedron). It still seems possible to me that there's a limit to the smallest tile that can tile a unit sphere on its own. (Smallest by diameter as a set of points in R^3).
Awesome comment, thanks for sharing the details. I love this kind of pragmatic optimization. Also, one dev's "total hack* job" [e.g. yourself, in the past] is another's stroke of genius.
* I'd frame it as "kludge", reserving "hack" for the positive HN sense. :)
I do this all the time in a dumb but effective way: add logging statements to code paths that drop timing info. Another dumb but effective trick, instead of using a step-through debugger, is to drop in "here, value is {val}". Telling Claude to do this is trivial, it's quick, and it can read its own output and self-solve the problem with just the code itself.
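Both tricks fit in a few lines of Python (names here are illustrative, not from any particular codebase):

```python
import functools
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger(__name__)

def timed(fn):
    """Log elapsed time and return value for a code path under suspicion."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        t0 = time.perf_counter()
        result = fn(*args, **kwargs)
        log.info("%s took %.1f ms, returned %r",
                 fn.__name__, (time.perf_counter() - t0) * 1e3, result)
        return result
    return wrapper

@timed
def slow_path(n: int) -> int:
    total = sum(i * i for i in range(n))
    log.info("here, total is %s", total)  # poor man's breakpoint
    return total

slow_path(10_000)
```

The log output is plain text, so an agent (or you) can read it back and narrow down the problem without a debugger session.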
IMHO git bisect is slower, especially depending on the reload/hot-reload/compile/whatever process your actual app is using.
Flameshot is the best! I've been using it for 10+ years. I have it wired up to some hot keys in my window manager, and have it dump to s3 so I can paste around links to screenshots everywhere for work.
> and have it dump to s3 so I can paste around links to screenshots everywhere for work.
I wanted something like this too but I modified Flameshot so I don't need a bash script in-between.
Flameshot already has a feature to upload to Imgur so I modified that and also added some small things (like randomized file names, some new config options).
> Flameshot already has a feature to upload to Imgur
Didn't they remove it, though? Because someone complained about "privacy" or something? The devs promised to bring it back as a plugin, but I haven't been following progress on it, so I don't know if that happened yet.
it all depends on your philosophy on dependencies. if you maintain a small set of core dependencies that are there for good reasons and are actively maintained, then rails upgrades are pretty easy. if you have a Gemfile with a bunch of third party gems brought in for small problems here and there, you have to occasionally pay down that debt on version upgrades. we have an 18 year old rails codebase currently on 7.1 that hasn't proven to be a big pain to upgrade. the hardest upgrade we did was because a core dependency that had been dead for 5 years broke with a new version of rails. but that was a story of letting technical debt ride for too long and having to pay it back.
this is a common problem in any complex codebase with a culture of using third party dependencies to solve small problems. you see this conversation all the time with modern frontend development and the resulting dependency tree you get with npm, etc.
MiniDSP Flex HT or HTX paired with a Buckeye 6-channel amp. About as cheap as you can get premium sound quality. Not cheap, but you get the software control you actually want via the MiniDSP.
Preschool is just daycare with structure, so it costs more. Optional, privately owned. Nice to do 2-3 days a week for young kids to give them more social and learning opportunities. But it's not public school; it's usually just a small locally owned business.
And this was a co-op preschool, which is a special variety of private preschool (usually non-profit) where the parents are usually involved in classes with the kids and much of the maintenance of the school itself is handled through volunteerism of member families.
My wife served as treasurer for the penultimate year, saw the writing on the wall, and then turned the position over to someone else to actually wind down the school. The model just doesn't work where we live: it requires a large number of single-income families so that one parent can be full-time involved in the kids' upbringing, and housing prices are such that single-income families cannot afford homes in the area. As a result, their market just evaporated. People just can't do it anymore.
We've been running a production Ceph cluster for 11 years now, with only one full scheduled downtime for a major upgrade in all those years, across three different hardware generations. I wouldn't call it easy, but I also wouldn't call it hard. I used to run it with SSDs for radosgw indexes as well as a fast pool for some VMs, and hard drives for bulk object storage. Since I was only running 5 nodes with 10 drives each, I was tired of occasional IOPS issues under heavy recovery, so on the last upgrade I just migrated to 100% NVMe drives. To mitigate the price I bought used enterprise Micron drives off eBay whenever I saw a good deal pop up. Haven't had any performance issues since then, no matter what we've tossed at it. I'd recommend it, though I don't have experience with the other options; on paper I think it's still the best option. Stay away from CephFS though: performance is truly atrocious and you'll footgun yourself in any production use.
We've been using CephFS for a couple of years, with some PBs of data on it (HDDs).
What performance issues and footguns do you have in mind?
I also like that CephFS has a performance benefit that doesn't seem to exist anywhere else: automatic, transparent Linux buffer caching, so that writes are extremely fast and local until you fsync() or another client wants to read, and repeat reads or read-after-write are served from local RAM.
We are the world's largest library of bike routes, and we enable cyclists to go on better rides, more often. We have a website and mobile apps that let people discover the best riding in their area and get turn-by-turn navigation using either our mobile apps or the bike computer of their choosing. Come join us in taking Ride with GPS to the next level! We have two openings right now, and are starting to build out the hiring plan for a third:
Senior Software Engineer - API & Product Development: We are looking for an experienced backend engineer to join our small and effective engineering team with a focus on supporting web and mobile app development using our APIs. The right candidate for this role brings extensive experience supporting modern product development, in collaboration with frontend and mobile developers, product management, and design. This requires excellent communication and collaboration skills, both on the engineering side, and from a product perspective. We use rails, but prior rails experience is not required.
Senior Software Engineer - API Development: We are looking for an experienced backend engineer to join our small and effective team with a focus on our APIs and supporting our platform at scale. This doesn't mean you are isolated from product development: everything we do serves our users in some way, and being a small team we regularly share responsibilities. However, this role will spend more time on efficiency and system design rather than delivering this quarter's new features. The right candidate should have a depth of experience supporting a large API surface area with efficient, well organized code, and should be excited about maintaining and improving performance over time. Experience with developer tooling, database design, query optimization, and DevOps workflows will serve you well in this role. We use rails, but prior rails experience is not required.
Senior Software Engineer - iOS Development: In mid-July, we will officially start the hiring process for an iOS developer, and potentially another Android engineer. We are reviewing applications from qualified candidates at this time, and will officially post the job by July 15th. If you think you are an excellent fit, please apply now; however, there might be some delays in screening, interviewing, etc. while we finalize our hiring plan. We have a technically interesting, battery-efficient set of mobile apps that act as a companion to our website, and need another iOS or Android engineer to help us take our apps to the next level.
Thanks Uber, H3 is a great library!