More

marcusbuffett · 2025-11-02T20:49:11 1762116551

Do you find the bots on chess.com play human-like chess? I'm not sure what they're using under the hood but I assume they've also put some effort into this sort of stuff.

marcusbuffett · 2025-11-02T14:52:14 1762095134

Yeah the intention is to make them available on my chess site eventually (Chessbook). It's in very early alpha though, so it's breaking constantly as I tweak things and try playing against it with different configurations.

marcusbuffett · 2025-05-18T15:38:41 1747582721

I used this for spaced repetition for opening training on Chessbook, can vouch that the Rust package is excellent. Easy to use and immediately lowered the training load on our users while keeping retention high. FSRS is awesome.

Oreb · 2025-05-18T16:29:36 1747585776

I use Chessbook and mostly like it, but I wish there was a way to configure the spaced repetition parameters. In particular, I find the frequency of reviews too low to retain well, especially for recently added moves.

marcusbuffett · 2025-02-25T01:26:53 1740446813

I strongly recommend "A Philosophy of Software Design". It basically boils down to measuring the quality of an abstraction by the ratio of the complexity it contains vs the complexity of the interface. Or at least, that's the rule of thumb I came away with, and it's incredible how far that heuristic takes you. I'm constantly thinking about my software design in these terms now, and it's hugely helpful.

I didn't feel like my code became better or easier to maintain, after reading other programming advice books, including "Clean Code".

A distant second recommendation is Programming Pearls, which had some gems in it.

narnarpapadaddy · 2025-02-25T03:20:15 1740453615

Implicitly, IIRC, the optimal ratio is 5-20:1. Your interface must cover 5-20 cases for it have value. Any fewer, the additional abstraction is unneeded complexity. Any more, and your abstraction is likely too broad to be useful/understandable. The example he gives specifically was considering the number of subclasses in a hierarchy.

It’s like a secret unlock code for domain modeling. Or deciding how long functions should be (5-20 lines, with exceptions).

I agree, hugely usual principle.

abhis3798 · 2025-02-25T08:47:38 1740473258

This is a good rule of thumb, but what would be a good response to have interfaces because, "what if a new scenario comes up in the future"?

Copenjin · 2025-02-25T09:06:29 1740474389

The scenario NEVER comes up in the future as it was originally expected. You'll end up having to remove and refactor a lot of code. Abstractions are useful only used sparingly and when they don't account for handling something that doesn't even exist yet.

narnarpapadaddy · 2025-02-25T15:46:49 1740498409

When doing the initial design start in the middle of the complexity to abstraction budget. If you have 100 “units of complexity” (lines of code, conditions, states, classes, use cases, whatever) try to find 10 subdivisions of 10 units each. Rarely, you’ll have a one-off. Sometimes, you’ll end up with more than 20 in a group. Mostly, you should have 5-20 groups of 5-20 units.

If you start there, you have room for your abstraction to bend before it becomes too brittle and you need to refactor.

Almost never is an interface worth it for 1 implementation, sometimes for 3, often for 5-20, sometimes for >20.

The trick is recognizing both a “unit of complexity” and how many “units” a given abstraction covers. And, of course, different units might be in tension and you have to make a judgement call. It’s not a silver bullet. Just a useful (for me at least) framing for thinking about how to manage complexity.

d0mine · 2025-02-26T18:44:05 1740595445

Even one use case may be enough e.g., if one class accepts another then a protocol (using Python parlance) SupportsSomething could be used to decouple two classes, to carve out the exact boundary. The protocol may be used for creating a test double (a fake) too.

kragen · 2025-02-25T11:34:40 1740483280

If you own the code base, refactor. It's true that, if you're offering a stable interface to users whose code you can't edit, you need to plan carefully for backward compatibility.

lmm · 2025-02-25T08:59:42 1740473982

"We'll extract interfaces as and when we need them - and when we know what the requirements are we'll be more able to design interfaces that fit them. Extracting them now is premature, unless we really don't have any other feature work to be doing?"

kragen · 2025-02-25T11:32:31 1740483151

Maybe some examples would clarify your intent, because all the candidate interpretations I can think of are absurd.

The sin() function in the C standard library covers 2⁶⁴ cases, because it takes one argument which is, on most platforms, 64 bits. Are you suggesting that it should be separated into 2⁶⁰ separate functions?

If you're saying you should pass in boolean and enum parameters to tell a subroutine or class which of your 5–20 use cases the caller needs? I couldn't disagree more. Make them separate subroutines or classes.

If you have 5–20 lines of code in a subroutine, but no conditionals or possibly-zero-iteration loops, those lines of code are all the same case. The subroutine doesn't run some of them in some cases and others in other cases.

tremon · 2025-02-25T15:22:18 1740496938

That function covers 2⁶⁴ inputs, not cases. It handles only one case: converting an angular value to (half of) a cartesian coordinate.

kragen · 2025-02-25T17:11:55 1740503515

Sounds like you haven't ever tried to implement it. But if the "case" you're thinking of is the "case" narnarpapadaddy was referring to, that takes us to their clause, "Any fewer [cases], the additional abstraction is unneeded complexity." This is obviously absurd when we're talking about the sin() function. Therefore, that can't possibly have been their intended meaning.

tremon · 2025-02-25T20:40:52 1740516052

The alternative and more charitable interpretation, of course, is that a single function like sin() is not what said GP meant when using the word "interface". But hey, don't let me interrupt your tilting at straw men, you're doing a great job.

narnarpapadaddy · 2025-02-26T15:00:04 1740582004

Appreciate the charitable interpretation. Both “complexity“ and “abstraction” take many different forms in software, and exceptions to the rule-of-thumb abound so it’s easy to come up with counter examples. Regardless, thinking in terms of complexity ratios has been a useful perspective for me. :)

IMO, a function _can_ be an interface in the broadest sense of that term. You’re just giving a name to some set of code you’d like to reuse or hide.

narnarpapadaddy · 2025-02-25T14:15:30 1740492930

Think of it more like a “complexity distribution.”

Rarely, a function with a single line or an interface with a single element or a class hierarchy with a single parent and child is useful. Mostly, that abstraction is overhead.

Often, a function with 5-20 lines or an interface 5-20 members or a class hierarchy with 5-20 children is a useful abstraction. That’s the sweet spot between too broad (function “doStuff”) and too narrow (function “callMomOnTheLandLine”).

Sometimes, any of the above with the >20:1 complexity ratio are useful.

It’s not a hard and fast rule. If your complexity ratio falls outside that range, think twice about your abstraction.

narnarpapadaddy · 2025-02-25T14:21:29 1740493289

And with respect to function behavior, I’d view it through the lens of cyclomatic complexity.

Do I need 5-20 non-trivial test cases to cover the range of inputs this function accepts?

If yes, function is probably about the right level of behavioral complexity to add value and not overhead.

If I need only 1 test or if I need 200 tests it’s probably doing too much or too little.

kragen · 2025-02-25T15:31:13 1740497473

That's not what cyclomatic complexity is, and if you think 5–20 test cases is enough for sin(), open(), or Lisp EVAL, you need your head examined.

narnarpapadaddy · 2025-02-25T15:57:15 1740499035

You’re right, I suggested two different dimensions of complexity there as a lens into how much complexity a function contains. But I think the principle holds for either dimension.

I don’t think you need only 20 test cases for open(). Sometimes, more than 20 is valid because you’re saving across some other dimension of complexity. That happens and I don’t dispute it.

But the fact that you need >20 raises the question: is open() a good API?

I’m not making any particular judgment about open(), but what constitutes a good file API is hotly contested. So, for me, that example is validation of the principle: here’s an API that’s behaviorally complex and disputed. That’s exactly what I’m suggesting would happen.

Does that help clarify?

kragen · 2025-02-25T16:55:21 1740502521

Yes, open() is a good API. I can't believe you're asking that question! It's close to the Platonic ideal of a good API; not that it couldn't have been designed better, but almost no interface in the software world comes close to providing as much functionality with as little interface complexity, or serving so many different callers or so many different callees. Maybe TCP/IP, HTTP, JSON, and SQL compete along some of these axes, but not much else.

No, 20 test cases is not enough for open(). It's not even close. There are 36 error cases for open() listed in the Linux man page for it.

What constitutes a good file API is not hotly contested. It was hotly contested 50 years ago; for example, the FCB-based record I/O in CP/M and MS-DOS 1.0, TOPS-20's JFN-based interface, and OS/370's various access methods for datasets were all quite different from open() and from each other. Since about 35 years ago, every new system just copies the Unix API with minor variations. Sometimes they don't use bitwise flags, for example, or their open() reports errors via additional return values or exceptions instead of an invalid file descriptor. Sometimes they have opaque file descriptor objects instead of using integers. Sometimes the filename syntax permits drive letters, stream identifiers, or variables. But nothing looks like the I/O API of Guardian, CP/M, Multics, or VAX/VMS RMS, and for good reason.

marcusbuffett · 2025-02-10T03:55:51 1739159751

Yeah as someone working on educational software, anything hand-rolling their own SRS is a pretty big red flag. Beating FSRS is going to be next to impossible, especially FSRS with parameters optimized from your users’ review history.

williamsss · 2025-02-10T14:03:43 1739196223

The app is using the same Anki FSRS algo its open source https://github.com/open-spaced-repetition/fsrs4anki

Agreed would be swimming against the tide rolling a custom one. Anki's is awesome! This app functions much the same just makes card generation MUCH faster than Anki.

jamager · 2025-02-10T09:32:26 1739179946

FSRS is so cargo-culted. It's just an algorithm, claiming that any algorithm can't be improved is ridiculous.

jarrett-ye · 2025-02-10T12:17:06 1739189826

Actually, we have found some algorithms which outperform FSRS[1]. Unfortunately, it's hard to deploy them in user's local device.

[1] https://github.com/open-spaced-repetition/srs-benchmark?tab=...

marcusbuffett · on Dec 5, 2024

Yeah this is a good shot at using existing verbiage, better than the candidates I came up with at least. Still not entirely self-descriptive, and has some overlap with usage in other parts of the codebase, like in data processing and user onboarding, but maybe that's a fine trade-off to make in order to use a normal word. I'd be equally fine with them being called "steps", but now I'm attached to my Keps :D

marcusbuffett · on Dec 4, 2024

Oh god yeah I’ve been in this situation too. Working at Apple it was like “is everyone in this meeting cleared for Tigris?” And you’d be like “I don’t know, what the hell is Tigris”, and then it turns out after you check your clearances it’s just iOS 13, which is obviously no mystery that after 12 there will be a 13. Just call it that!

Nothing was named according to what it did either, I think our deploy tool was Carnival? Just codenames everywhere

euoia · on Dec 4, 2024

In general I completely agree that code name overload is dreadful. But I think in this case, where this pair is so fundamental and used so frequently in the codebase, I think I would probably permit it. Beware codename creep.

boredtofears · on Dec 4, 2024

I dunno carnival seems like an apt name for a deploy tool at some of my past jobs.

marcusbuffett · on Dec 4, 2024

It’s one made-up name in a 30,000 line codebase. I think if anyone else ever works with the project they’ll have a lot harder things to catch up on than this convention

marcusbuffett · on Nov 8, 2024

I gave this a spin, this is the best iteration I've seen of a CLI agent, or just best agent period actually. Extremely impressed with how well it did making some modifications to my fairly complex 10,000 LOC codebase, with minimal instruction. Will gladly pay $99/mo when I run out of credits if it keeps up this level.

marcusbuffett · on Oct 28, 2024

Yeah I'm also sick of seeing articles cite this mythical trade-off, where any increase in programming output must be correlated with being a bad team member, churning out bad code, and generally being a pain to work with.

Anyone that's worked with engineers can tell you that there are simply some people getting more done than others, in the same amount of time. Are there people producing bad code? Yes. But I don't think output is inversely correlated with code quality. In fact the people I've worked with that have the most output, also had some of the highest quality code. I've never experienced this mythological 10x rockstar figure that works alone creating impossible to maintain systems, and I've worked closely with dozens of engineers. He probably exists somewhere, but not with the sort of prevalence that justifies every programming productivity article ripping on his archetype.

kgeist · on Oct 28, 2024

In our team's current project, the engineers who can be described as "10x engineers" are the slowest when it comes to delivery of features. They have been transferred to a legacy project with lacking tests, messy spaghetti code full of bugs. They spend a lot of time adding tests, refactoring the code to be more modular, removing various cruft. It looks like they are much slower than the previous mediocre engineers, and they produce a lot of code, but it pays off: while the userbase is increasing, our bug reports rate per month has decreased by 2x in a matter of 1.5 years. So I think the amount of code output is only part of the equation.

earnesti · on Oct 28, 2024

If a system is impossible to maintain, that system is nothing of value, and therefore doesn't fit to any sensible definition of 10x.

I have always assumed that the 10x means value creation, not some kind of "lines of code" output or other nonsense. And for sure there are 10x programmers, maybe even 100x. I have been mostly working at startups and you see these early stage decisions and designs that create huge costs when the company starts to scale, similarly you have some decisions that might save huge amount of costs during the company life time, or allow better revenue generation etc.

ownagefool · on Oct 28, 2024

> I've never experienced this mythological 10x rockstar figure that works alone creating impossible to maintain systems, and I've worked closely with dozens of engineers.

I've seen this plenty of times.

Recently I was trying to explain the pros and cons of micro-services, and more importantly, microrepos, with regards to automated testing. The lead engineer that said he'd quit if I colocated the tests with the app.

Same place, they replaced all deploy system with argo, but every environment is pinned against main, which means you can no longer test a change without it going to all environments at the same time.

In both cases, the engineers are actually much higher skilled than average and churn out / lead change, but they'd rather be chasing a fad and just don't care if their changes shit on other parts of the SDLC.

earnesti · on Oct 28, 2024

That case sounds more like you being 0.1x developer about obsessing over test automation, where the 1x guy just didn't seem them to be worthwile.

Personally I have had clashes during my career with people who obsess about unit testing way too much in places where it just doesn't add any value (in fact destroys it by requiring lots of work and additional mainteinance). Value in the end is quite a subjective thing so it doesn't make sense to argue that much about it.

ownagefool · on Oct 28, 2024

So have I, and I'm pretty impressed you could decipher that from a single comment, and it doesn't reflect poorly on you at all that you immediately drew such a conclusion from someone refusing to discuss pros and cons of solutions.