Idk about the grammatical correctness of the punctuation, but I really enjoyed reading his writing. I'd never read anything by him before; it was genuinely refreshing, especially given it was a glorified ad.
Can you elaborate a bit more on the challenges faced in making Postgres shard-able?
I remember that adding sharding to Postgres natively was an uphill battle. There were a few companies who had proprietary solutions for it. What you've been able to achieve is nothing less than a miracle.
1. People don't design schemas to be sharded, although many gravitate towards a common key, e.g. user_id or country_id or tenant_id or customer_id. Once that happens, sharding becomes easier.
2. Postgres provides a lot of guarantees that are tricky to maintain when sharded: atomic changes, referential integrity, check constraints, unique indexes (and constraints), to name a few. Those have to be built separately by a sharding layer (like PgDog) and have trade-offs, usually around performance. It's a lot more expensive to check a globally enforced constraint than a local one (network hops aren't free).
3. Online migrations from unsharded to sharded can be tricky: you have to redistribute terabytes of data while the DB continues to serve writes. You can't lose a single row - Postgres is used as a store of record and this can be a serious issue with business impact.
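The common-key pattern in point 1 can be sketched as hash-based routing on a shared key. A minimal illustration (all names here are illustrative, not PgDog's actual implementation):

```python
import hashlib

NUM_SHARDS = 4

def shard_for(tenant_id: int) -> int:
    """Map a tenant key to a shard with a stable hash.

    Illustrative only: a real sharding layer (PgDog, Citus, etc.)
    uses Postgres-compatible hash functions so rows land on the
    same shard the database's own partitioning would pick.
    """
    digest = hashlib.sha256(str(tenant_id).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

# All rows for one tenant land on one shard, so single-tenant
# queries stay local; cross-tenant queries still fan out.
```

Because routing is a pure function of the key, single-tenant queries can be answered by one shard without any coordination, which is exactly why schemas that already gravitate toward such a key shard well.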
We're taking increasingly bigger bites at this apple. We started with basic query routing and are now doing query rewrites as well. We didn't handle data movements previously and now have almost fully automatic resharding. It takes time, elbow grease and most importantly, willing and courageous early adopters to whom we owe a huge debt of gratitude.
That was my second question: how on earth can you replicate real-world Postgres workloads that benefit the most from sharding?
Are there some specific standard Postgres test suites you run PgDog through to ensure it's compliant with Postgres standards?
You've mentioned NoSQL quite a bit. What sort of techniques do shard-able NoSQL databases employ that make sharding inherently easier? Do you attempt to emulate some of those techniques in PgDog?
Lastly, how do you solve the problem of Postgres constraints? From what I've understood, PgDog runs standard Postgres instances as the shards. If, let's say, one table in shard 1 has a foreign key to a record in shard 2, how do you prevent Postgres from rejecting that record, since it technically doesn't exist on its current shard?
> Are there some specific standard Postgres test suites you run PgDog through to ensure it's compliant with Postgres standards?
That's right. We have many levels of testing: unit, integration, and acceptance, where we run the same query against an unsharded Postgres database and PgDog, and compare the result.
> what sort of techniques do shard-able NoSQL databases employ that make sharding inherently easier?
They remove features. For example, most of them don't support joins, so each table can be stored anywhere in the cluster with no data locality restrictions. There are no foreign key constraints either, or even transaction support. The list goes on. Ultimately, NoSQL databases are just K/V stores, with a fancy API. Scaling K/V is a solved problem.
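"Scaling K/V is a solved problem" usually means something like consistent hashing. A toy sketch of the idea (not any particular database's implementation):

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Toy consistent-hash ring: each key maps to the nearest node
    clockwise on the ring, so adding or removing a node only moves
    a fraction of the keys (unlike modulo hashing, which reshuffles
    almost everything)."""

    def __init__(self, nodes, vnodes=64):
        self._ring = []  # sorted list of (hash, node) virtual points
        for node in nodes:
            for i in range(vnodes):
                self._ring.append((self._hash(f"{node}:{i}"), node))
        self._ring.sort()

    @staticmethod
    def _hash(key: str) -> int:
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def node_for(self, key: str) -> str:
        h = self._hash(key)
        idx = bisect.bisect(self._ring, (h, "")) % len(self._ring)
        return self._ring[idx][1]

ring = ConsistentHashRing(["node-a", "node-b", "node-c"])
# With no joins or FK constraints, any key can live anywhere:
# placement is purely a hashing decision, never a locality one.
```

The point being: once every row is just a key, placement is trivial, which is exactly the feature set these databases removed to get there.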
> one table in shard 1 has a foreign key to a record in shard 2 how do you prevent Postgres from rejecting that
We don't, at least not yet. We can and will build a more sophisticated query engine that will validate constraints, but it may not always be completely atomic or performant. Cross-shard queries are expensive, because of the laws of physics. For example, if a query is executed outside of a transaction, validating the constraint could introduce a race condition, while in non-sharded Postgres, all queries run inside implicit transactions.
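The race described above can be shown with a toy check-then-insert across two in-memory "shards" (plain dicts as stand-ins, nothing like PgDog's actual engine):

```python
# Two toy shards: shard 2 holds users, shard 1 holds orders with a
# logical foreign key to users on shard 2.
shard2_users = {101: "alice"}
shard1_orders = {}

def insert_order(order_id: int, user_id: int) -> bool:
    """Validate the cross-shard FK, then insert.

    The gap between the check and the insert is the race: another
    client could delete user_id on shard 2 in between, and the
    order would still be accepted. Closing that gap needs a
    cross-shard (distributed) transaction, which costs extra
    network round trips.
    """
    if user_id not in shard2_users:    # network hop to shard 2
        return False                   # FK violation, reject
    # <-- a concurrent DELETE of user_id can land right here
    shard1_orders[order_id] = user_id  # insert on shard 1
    return True

assert insert_order(1, 101)      # accepted: user exists on shard 2
assert not insert_order(2, 999)  # rejected: no such user
```

Single-node Postgres never exposes this window because the check and the write happen inside one transaction on one machine.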
Aaah, you've got me excited and thinking about all sorts of ways this can fix the issue. I really appreciate you taking the time to answer my questions; it's all very interesting.
Can't PgDog pull in the query planning and execution parts from Postgres, maintain a cache of the indexes available on the different Postgres shards, and then follow through on the execution? This way PgDog could technically scale up to as many instances as needed, keeping the Postgres instances themselves as just a persistence backend.
However, I understand that at that point you're basically building an entirely new database, not really a sharding support service on top of Postgres; you'd need to attempt to maintain feature parity with Postgres, which can turn into a maintenance pain.
Do you have any insights into how platforms like PlanetScale or CockroachDB are doing some of this stuff?
idk man, it's rare to fight the compiler once you've used Rust for long enough, unless you're doing something that's the slightest bit complex with async.
You get so good at schmoozing the compiler that you start to create actual logical bugs faster.
That goes for almost every language. I recall my first couple of weeks with various compiled languages, and they all had their 'wtf?' moments when a tiny mistake in the input generated reams of output. But once you get past that point you simply don't make those mistakes anymore. Try missing a '.' in a COBOL program and see what happens. Make sure there is enough paper in the box under LPT1...
Can someone provide a true engineer's perspective on the ADCs in ESP SoCs?
I've heard a lot of people trashing it, and most experienced engineers admit that it's finicky. However, if you have the knowledge, you can make it work as well as any STM chip.
ESP32s are so interesting: they're the only major chip to have their own newish ISA (before transitioning to RISC-V) and still be so successful.
If you need more accurate analog measurements, it is better to use an external ADC (with e.g. an SPI interface). This will cost quite a bit more but will save the hassle of calibrating each individual device. Mostly it comes down to how much dev time you want to invest vs. hardware cost vs. TTM.
I'm not too familiar with the ESP32 ADCs except I remember they're unusually lightly specified even for microcontroller ADCs. If "the knowledge" involves things you couldn't do - or rely on - in production like careful calibration and characterization, that would answer your question.
To be clear, and toward the OP's comment about the ESP32 ISA: Xtensa isn't really a self-contained architecture; it's designed to be customized (extended) by the vendor, and the ESP32 is one such customization.
The ADCs on the ESP32 are similar to other embedded MCUs in that they are not intended for audiophile-level audio capture, as some people seem to think they should be capable of.
The main value proposition for these ADCs is to hook them up to a simple potentiometer to allow physical input controls, and even for that purpose you need to average multiple samples to get a somewhat steady value. Of course the ADCs can be used for various other tasks, but "ADC" does not mean they can do anything any ADC can do; there's a wide variety of quality and purpose in the field of ADCs, and the ESP32's ADCs are a cheap and easy way to add a simple ADC function to the chip.
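That multi-sample averaging is simple to sketch. Here `read_adc_noisy` is a made-up stand-in for a real ADC read (e.g. `machine.ADC(...).read()` in MicroPython on an ESP32); the noise figure is invented for illustration:

```python
import random

def read_adc_noisy() -> int:
    """Stand-in for a real ADC read; returns a 12-bit-ish value
    around a true level of 2048, plus simulated noise."""
    return 2048 + random.randint(-40, 40)

def oversample(read_fn, n: int = 64) -> int:
    """Average n raw readings to tame ADC noise: the usual trick
    for getting a steady value from a potentiometer."""
    return sum(read_fn() for _ in range(n)) // n

value = oversample(read_adc_noisy)
# Single reads can be off by the full noise amplitude; the average
# of 64 samples sits much closer to the true level.
```

On a real ESP32 the same pattern applies, just with the hardware read in place of the fake one (and attention to the attenuation setting, since the raw transfer curve is nonlinear near the rails).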
I have been able to use the ADCs quite easily for input controls and monitoring slow-changing voltages, in applications where absolute precision wasn't the goal; it works perfectly fine for that.
Interesting. I was a mildly heavy cannabis user in my very early teens but stopped right before starting my junior year of high school. My use was mostly motivated by teen angst and peer pressure; I never really enjoyed it, and I always felt uncomfortably anxious and hypersensitive.
This study shows that anxiety was identified in many of the participants, which is pretty close to how I feel now. I am in general an anxious individual, and I wonder if this is because of the marijuana use. Then again, anxiety is pretty rampant in my family, so it could just be in my genes.
Whose messenger? You didn't point us to anyone's research.
I just don't see how sampling tokens constrained to a grammar can be worse than rejection-sampling whole answers against the same grammar. The latter needs to follow the same constraints naturally to not get rejected, and both can iterate in natural language before starting their structured answer.
Under a fair comparison, I'd expect the former to provide answers at least just as good while being more efficient. Possibly better if top-whatever selection happened after the grammar constraint.
I will die on this hill, and I have a bunch of other arXiv links from better peer-reviewed sources than yours to back my claim up (i.e. NeurIPS-caliber papers, with more citations than yours, claiming it does harm the outputs).
Any actual impact of structured/constrained generation on the outputs is a SAMPLER problem, and you can fix what little impact may exist with things like https://arxiv.org/abs/2410.01103
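To make the "sampler problem" framing concrete: constrained generation is just masking invalid tokens before sampling, leaving the model untouched. A toy sketch with a hypothetical five-token "grammar" and a uniform stand-in model (no real LLM involved):

```python
import math
import random

VOCAB = ["{", "}", '"key"', ":", "42", "hello"]

def allowed_tokens(prefix: list) -> set:
    """Toy 'grammar' accepting exactly the object {"key": 42}.
    A real implementation compiles the grammar to an automaton and
    asks it which tokens can legally extend the current prefix."""
    stages = [{"{"}, {'"key"'}, {":"}, {"42"}, {"}"}]
    return stages[len(prefix)] if len(prefix) < len(stages) else set()

def sample_constrained(logits: dict) -> list:
    out = []
    while (allowed := allowed_tokens(out)):
        # Mask: zero probability for tokens the grammar forbids,
        # then sample from what remains. A pure sampler-level change.
        weights = [math.exp(logits[t]) if t in allowed else 0.0
                   for t in VOCAB]
        out.append(random.choices(VOCAB, weights=weights)[0])
    return out

logits = {t: 0.0 for t in VOCAB}  # uniform toy "model"
result = sample_constrained(logits)  # always a grammar-valid sequence
```

Rejection sampling would instead generate unconstrained sequences and throw away every one the automaton rejects, which is strictly more work for the same accepted distribution only under specific conditions; the masking approach makes the constraint free per-token.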
This is really nice, especially the PDF report generation.
I feel very moronic making a dashboard for any product now. Enterprise customers prefer you integrate into their ERPs anyway.
I think we lost the plot as an industry. I've always advocated for having a read only database connection to be available for your customers to make their own visualisations. This should've been the standard 10 years ago, and its case is only stronger in this age of LLMs.
We get so involved with our products that we forget our customers are humans too. Nobody wants another account to manage or remember. Analytics and alerts should be push-based: configurable reports should get auto-generated and sent to your inbox, alerts should be pushed via notifications or emails, and customers should have the option to build their own dashboards with something like this.
Sane defaults make sense but location matters just as much.
> I've always advocated for having a read only database connection to be available for your customers to make their own visualisations.
Roughly three decades ago, that *was* the norm. One of the more popular tools for achieving that was Crystal Reports[1].
In the late 90s, it was almost routine for software vendors to bundle Crystal Reports with their software (very similar to how the MSSQL installer gets invoked by products), then configure an ODBC data source which connected to the appropriate database.
In my opinion, the primary stumbling block of this approach was the lack of a shared SQL query repository. So if you weren't intimately familiar with the data model you wanted to work with, you'd lose hours trying to figure it out on your own or rely on your colleagues sharing it via sneakernet or email.
Crystal Reports has since been acquired by SAP, and I haven’t touched it since the early ‘00s so I don’t know what it looks or functions like today.
My best friend from early uni days did a co-op with Crystal Services, and he's been with them for their entire history through Seagate Software, Crystal Decisions, BusinessObjects (and relocating from Canada to France) and then SAP. I myself have had 2 temporary retirements, at least 4 different careers and countless jobs in that time, and it's wild to know someone who has the same internal drive but has satisfied it with a much more linear path (though you could definitely argue he's seen just as much change as me). From employee ~50 to ~100,050!
This brings me back! My first job was at the Norwegian ERP Agresso, now part of Unit4. I started as a support technician, which was quite an experience since around that time, '97-'98, everyone was moving from Sybase/Ingres/Informix etc. to either MSSQL or Oracle. I got to interact with those older database systems and install and export/import data to systems running on e.g. Oracle across parallel Solaris servers at Saab Aerospace and Windows NT running on DEC Alpha at Ericsson, among other more vanilla deployments.
I was a developer, albeit not professionally, and my boss gave me the opportunity to develop the integration between Agresso and Crystal Reports, my first professional development project, for which I am still grateful. It was a DLL written in C++, and I imagine they shipped it for quite a while after I left for greener pastures.
I was already a free software and Linux enthusiast, so I did a vain skunkworks attempt at getting Agresso to run with MySQL, which failed, but my Linux server in the office came in handy when I needed some extra software in the field--I asked a colleague to put a CD in the server so I could download it to the client site some 500 km away, and deliver on the migration.
100% agreed regarding shipping a read-replica, for any sufficiently complex enterprise app (ERP, CRM, accounting, etc.).
Customers need it to build custom reports, archive data into a warehouse, drive downstream systems (notifications, audits, compliance), and answer edge-case questions you didn’t anticipate.
Because of that, I generally prefer these patterns over a half-baked built-in analytics UI or an opinionated REST API:
- Provide a read replica or CDC stream. Let sophisticated customers handle authz, modelling, and queries themselves. (This gets harder with multi-tenant DBs.)
- Optionally offer a hosted Data API, using something like PostgREST, Hasura, or Microsoft DAB. You handle permissions and safety, but stay largely unopinionated about access patterns.
Any built-in metrics or analytics layer will always miss edge cases.
With AI agents becoming first-class consumers of enterprise data, direct read access is going to be non-negotiable.
Also, I predict the days of charging customers to access their own goddamn data behind rate-limited, metered REST APIs are behind us.
I fully agree in spirit, but in practice, read replicas have some edge cases that are hard to control for. Namely, the incentives aren't fully aligned between the database host and the consumer, and that dynamic can lead to some difficult resourcing decisions for the DB host. An API, by contrast, can be rate limited, and its underlying queries can be optimized (however frustrating that might be for consumers).
The CDC stream option you flagged is more viable in my (admittedly biased) opinion. At my company (Prequel) our entire pitch is basically "you should give your customers a live replica of their data in whatever data platform they want it in" (and let us handle the cross-platform compatibility and multi-tenant DB challenges).
I think this problem could also be a killer use case for Open Table Formats, where the read-replica architecture can be mirrored but the cost of reader compute can be assumed by the data consumer.
To your point, this is only going to be more important with what will likely be a dramatic increase in AI agent data consumption.
In 1999-2000, the company I worked for gave a smallish number of key users full read rights to SAP (minus HR), shortly after introducing SAP to the company's global supply chain. The key users came from all orgs using SAP; basically every department had one or two.
I was part of this and "saw the light". We had such great visibility into all the processes, it was unreal. It tremendously sped up cross-org initiatives.
hi, dev building Shaper here. I agree re sending reports vs dashboards.
Many users use Shaper mostly as a UI to filter data and then download a PDF, PNG or CSV file to use elsewhere.
We are also currently working on functionality to send out those files directly as messages using Shaper's task feature.
I get your point, but generally with most enterprise-scale apps you really don’t want your transactional DB doubling as your data warehouse. The “push-based” operation should be limited to moving data from your tx environment to your analytical one.
Of course, if the “analytics” are limited to simple static reports, then a data warehouse is overkill.
Customers don’t want to learn your schema or deal with your clever optimizations either. If you expose a DB make sure you abstract everything away in a view and treat it like a versioned API.
The best example of this is IoT devices that share their data. Instead of reinventing the wheel with a dashboard for each customer, just give them some docs and restricted access via a replica.
> I've always advocated for having a read only database connection to be available for your customers to make their own visualisations.
A layer on top of the database to account for auth/etc. would be necessary anyways. Could be achieved to some degree with views, but I'd prefer an approach where you choose the publicly available data explicitly.
GraphQL almost delivered on that dream. Something more opinionated would've been much better, though.
That's exactly what I meant. It's a specific replica instance with its own security etc., but not necessarily a separate API you have to integrate with. APIs can stay for writes, but for reads you have the DB.
I loved the Blades dashboard. Something about idly pressing the shoulder buttons to flip through the blades while talking to my friend with that goofy wireless "Xbox communicator" on my ear.
Best Xbox console. It had pretty good games. Sad they were unable to keep that momentum going and are basically nope’ing from the console business altogether now.
I was able to pull together a Halo 3 LAN party last year, although the "consoles" were Linux PCs and the game was the MCC edition (60fps instead of 30). Split-screen was resurrected via mods. I bought a Microsoft gamepad receiver to bring original Xbox 360 controllers under Linux. Some people insisted they get to play on the original gamepad (otherwise it was a mixed bag of PlayStation and newer Xbox/PC controllers).
I also realized that Halo 3 itself would have been old enough to drink with us!
I don't know if the Altair 8800 would count as my first home computer, as I was too young to really understand what it was and mostly just liked to play with the paper tape feed on the Teletype attached to it. By the time we got the PET 2001, I was old enough to actually use it as intended.
Most systemd components do rely on some core systemd components like systemd (the service manager) and journald. I would say that a core thesis of systemd is that Linux needs/needed a set of higher-level abstractions, and that systemd-the-service-manager has provided those abstractions. The fact that other parts of systemd-the-project rely on those abstractions does not imply that the project is monolithic.
>Try running any part of the systemd software suite on an openrc system and see how that works out?
Well, from this POV it's kinda openrc's problem if it doesn't. What about trying to run any part of the OpenRC software suite on an Upstart system? The question of why anyone sane would want to is rhetorical, though...
Why obsess over whether systemd is monolithic, and to what degree, anyway? There certainly ARE optional systemd parts, so it's correct to say it's not entirely monolithic.
openrc-init can be used on an upstart system; the daemon manager itself can't, but that's because you'd have two different daemon managers. Beyond that, there aren't any other OpenRC software components, because it was designed to be a modular init system that just handles what it was intended to handle.
The rest of the system for example chrony, sysklogd, cron, etc run fine on upstart systems, because they aren't tied to systemd and are fully modular.
It's okay to be a monolith, that doesn't make it inherently bad or anything, but we should be honest about it, and it does come with some tradeoffs.
Explain the existence of "elogind" and "eudev" then?
It might be the case that one can disable some components of systemd, on a systemd system. It is certainly not the case that they are "loosely coupled", or there would be no incentive to maintain forks of core systemd components with the sole and explicit purpose of decoupling from systemd.
In theory. In practice, systemd is a mess of different components that have subtle dependencies on each other. And while the core of systemd is solid enough, everything around it is not.
It's a collection of tightly-coupled components that are functionally a monolith because large distros tend to rely on the various components rather than allowing modularity.