I was hoping we'd moved beyond this "clean code"-ish nonsense of functions having to be short.
Quality of a software design, maintainability, etc., have virtually no relation to function length, and some of the most respected software out there contains functions hundreds if not thousands of lines long without collapsing under its own weight.
To add to this, there are fundamental information-theoretic principles that support inlining code and components. It's about reducing code entropy, reducing length and referential distances. https://benoitessiambre.com/entropy.html
The good thing is that LLMs optimize for information-theoretic measures of language, so they naturally generate better-scoped, more inline code. LLMs might help us win this battle :-)
Eh, hard disagree. (Though I didn't downvote you, since it's silly to downvote out of disagreement...)
Readability, while subjective, plays a large role in software maintenance. Developers will be more reluctant to change code they don't understand, and more likely to introduce bugs. "Long" functions require following large blocks of code that work within the same context, and relying on comments to explain functionality, which could be wrong or outdated, rather than descriptive function names.
Also, "long" functions are typically difficult to test. They either require complex setup and mocking, or are only tested under very specific conditions in end-to-end tests, if at all. Chances are that only users are actually testing them, which is not a good place to be.
We can argue whether "AI" tools help with this or not, but while humans are still reading and writing code, following the standard conventions of keeping functions relatively "short", "clean" (whatever those terms mean for you and your team), and focused on a single purpose makes maintenance and testing easier, and, in turn, produces more robust, higher-quality software.
You break up things when it makes sense, not for the sake of it.
Having to jump out of the code you're reading comes with its own downsides, and it tends to compromise maintainability when it increases the shallowness of your code (same functionality, but a larger API surface).
You break things up when there are benefits to breaking them up, and Unix provides a very sensible reference point here: plenty of syscall implementations run generously into the thousands, or even tens of thousands, of lines.
Stanford professor John Ousterhout (author of, among other things, the Tcl language, the Tk framework, and the Raft consensus algorithm) has an entire paragraph in his book "A Philosophy of Software Design"[1] on why the argument "functions should be short" is short-sighted and should never be taken at face value.
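To be clear about what "shallow" means here, a small hypothetical Python sketch (names made up): a helper that adds one more name to chase without hiding any real complexity, which is exactly the kind of decomposition Ousterhout warns about.

    # Hypothetical "shallow" helper: one more name in the API, almost
    # nothing hidden behind it.

    def read_config_lines(path):
        with open(path) as f:
            return f.read().splitlines()

    def load_settings(path):
        lines = read_config_lines(path)  # jump here to learn... one line
        return dict(line.split("=", 1) for line in lines if "=" in line)

    # Inlined, the same logic reads top to bottom with no extra API surface:
    def load_settings_inline(path):
        with open(path) as f:
            return dict(
                line.split("=", 1)
                for line in f.read().splitlines()
                if "=" in line
            )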
> You break up things when it makes sense, not for the sake of it.
I never claimed otherwise.
> Having to jump out of the code you're reading comes with its own downsides, and it tends to compromise maintainability when it increases the shallowness of your code (a larger API surface)
I don't buy this argument. The code you're reading should do one thing according to what it says on the tin (the function name). When the code does something else, you navigate to that other place (easily done in most IDEs), and change contexts. This context change is important, since humans struggle with keeping track of a lot of it at once. When you have to follow a single long function, the context is polluted with previous functionality, comments, variables, and so on, not unlike the scope of the program at that point. If you're changing the code, it becomes easier to shadow a previous variable, or to change something that subsequent code depends on. Decomposing the large function into smaller ones avoids all of this.
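To illustrate the shadowing point, a hypothetical Python sketch (names made up): in one long function, names from an earlier step bleed into later ones, and an edit can silently reuse them; decomposed, each step owns its own scope.

    # Hypothetical: in a long function, a later step can accidentally
    # reuse an accumulator from an earlier one.

    def report_long(orders):
        total = 0
        for o in orders:            # step 1: revenue
            total += o.price * o.qty
        # ...dozens of lines later someone adds step 2 and reuses `total`:
        for o in orders:            # step 2: item count
            total += o.qty          # bug: revenue and count are now mixed
        return total

    # Decomposed, each step has its own scope and the mistake can't happen:
    def revenue(orders):
        return sum(o.price * o.qty for o in orders)

    def item_count(orders):
        return sum(o.qty for o in orders)

    def report(orders):
        return revenue(orders), item_count(orders)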
As well as aiding testability, which you conveniently ignored in my previous comment.
The criteria for determining what is "short" and "long" are subjective, of course, and should be determined by whatever the team collectively agrees on. But there should be some accepted definition of these.
> Stanford professor John Ousterhout
Eh, I'm not swayed by arguments from authority. John's opinion is as valid as mine or yours.
If you are breaking something up because of "long" and "short", you're optimizing for the wrong thing. You don't care about code being short for its own sake, or long for its own sake, right?
Ultimately, you're going to revisit this code to make the change after some time passes. Is it easy to follow the code and make the change without making mistakes? Is it easy for someone else on the team to do the same?
Sometimes optimizing for "easy to understand and change" means breaking something apart. Sometimes it means combining things. I've read that John Carmack would frequently inline functions because jumping between many small ones made the code too hard to follow.
So, rather than whether something is big or too small, I would ask whether it would be easy to understand/change when coming back to it after a few months.
Put another way: why not optimize for the actual thing you care about rather than an intermediate metric like LOC?
> If you are breaking something up because of "long" and "short", you're optimizing for the wrong thing. You don't care about code being short for its own sake, or long for its own sake, right?
You're misunderstanding. Code is not broken up because it's "long". It's broken up because it is difficult to comprehend and maintain, and its length is one criterion that might signal that to be the case. Another sign is cyclomatic complexity, which is another arbitrary number left for teams to decide how to use best.
The crux of this, and why it's so widely argued, is that readability and maintainability are entirely subjective concepts that are impossible to quantify. This is why we need some specific guidelines that can point us in certain directions.
This doesn't mean that these guidelines should be strictly enforced. I've often decided to silence linters that warn me about long functions or high cyclomatic complexity, if to me the function is readable enough, and breaking it up would be more problematic. This is open to interpretation and debate during code reviews, but it doesn't mean that these are useless signals that developers should ignore altogether.
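As a concrete example of the kind of opt-out I mean (thresholds and names picked arbitrarily): flake8's mccabe plugin reports C901 when a function exceeds the configured complexity limit, and a per-function suppression records the team's decision that the warning is noise in that particular case.

    # Hypothetical example of silencing a complexity warning for one
    # function the team agreed reads better kept in one place.
    # "C901" is the code flake8's mccabe plugin emits when a function
    # exceeds the configured max-complexity threshold.

    def dispatch_event(event):  # noqa: C901
        if event.kind == "click":
            print("click at", event.x, event.y)
        elif event.kind == "key":
            print("key", event.code)
        elif event.kind == "scroll":
            print("scroll by", event.delta)
        else:
            print("unhandled", event.kind)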
> and its length is one criterion that might signal that to be the case
You seem to be the one misunderstanding it.
It's just not. Function length is not a useful metric, at all. The probability of some problems increases with length, but even then it's not the length that will tell you whether your code has a problem or not.
If you have length guidelines, your guidelines are bad.
And, yeah, cyclomatic complexity is almost as useless as function length. If you have warnings bothering people about those, you are reducing your code quality.
John has had a meaningful impact on computing across algorithms, operating systems, programming languages, frameworks, etc.
Fowler has written *nothing* anybody ever cared for. Nothing.
Ousterhout has taught software design at Stanford, where he's had students implement a wide variety of software, probing the limits of his own theories and having them questioned rigorously year after year.