
There's quite a bit of overhead in execing and piping data from ffmpeg. If performance is a concern, then using the C ffmpeg API is a better bet (see pyav for a good example of this). Cool project though!


I strongly agree with your first two points. Good naming supplants the need for commenting. I find diagrams a far terser way to communicate architecture than comments.


The book has an entire chapter called "Choosing Names" and another titled "Comments Should Describe Things That Aren't Obvious from the Code". Yes, good naming supplants the need for many comments, but the author goes into much greater detail about when and why comments are helpful even when you have given deep consideration to naming things.


I wish we had better tooling for showing the git blame inline like comments. I'd rather put a one-line comment, "read the commit log for this line", which some editor could inline or pull up quickly, than litter the source with prose (but either is better than nothing).


Some code is intended to last, potentially a very long time, and has to be separable from history metadata, if for no other reason than that git histories can get really big. I had a large monorepo at an old job with some code dating back to the mid 80s and it took 10 minutes to clone because of all the history. When we finally gave up on Kiln, years after its own developers abandoned it, we migrated only the code and not the history. After that, you could clone the entire repo in 10 seconds instead of 10 minutes.

The key is you want to be able to do something like that without losing crucial information. So anything that absolutely has to be there to understand what the code is doing should be directly embedded in the code; that is, it needs to be a comment, not a commit message.


Except the same belief in "self-documenting code" that makes people ignore the need for comments also makes them ignore the need for writing proper commit messages. Git blame won't save you if every commit message is just a one-liner like "fix foo in bar".

Viewed from the other end: commit messages are essentially comments over changesets. If you write those well, you can use the same approach to write good comments for your types, functions and modules.

See also https://news.ycombinator.com/item?id=27009308 on what I consider to be good style of commit messages (scale up or down, depending on the size of your commits).


I bought this book off the back of the referenced discussion trashing Clean Code. Whilst Clean Code has its problems, well articulated in that previous discussion, I am loath to recommend this book. A large part of it is dedicated to commenting practices and seems a bit out of touch with the way software is developed today. There were some rather dubious claims on TDD as well, suggesting that it aims to 'get features working, rather than finding the best design', which seems to completely ignore the refactoring phase practised in a TDD cycle. A choice quote about comments that I strongly disagreed with: "without comments, you cannot hide complexity". The book also strongly advocates for large classes and considers small ones an anti-pattern called 'classitis'.

I'd say half the book contained good advice, the other half was mediocre or dubious at best.

I'm curious to hear what others think who've read both books.


I’ve read both books. I consider Ousterhout’s to be one of the better recent books on software development, though as mentioned in my other comment, this is more because of the earlier content than the later chapters. I have been critical of Clean Code since before it was cool and I actively recommend against junior developers reading it.

I would have liked to see Ousterhout make a more thorough argument if he was going to criticise TDD. His central criticism — that TDD results in what he calls tactical programming, prioritising the implementation of specific features over good design — is certainly defensible. However, I think he was too superficial in what he actually wrote in the section on TDD, and consequently I don’t think he made a particularly convincing connection with the ideas developed earlier in the book.

I think you’re slightly unfairly misrepresenting his position on large or small classes. He makes a solid case that what he calls deep modules are better for managing complexity than shallow ones. He also identifies a correlation with size because small modules tend to be shallow. That’s not the same as arguing for large classes or against small classes just because of their size, though.


I think you're referring to the part in the book where Ousterhout says he doesn't like to start with a test harness when writing a new abstraction or feature, like some advocates of TDD would. That was some of the best advice in the book, in my opinion.

Instead, Ousterhout recommends designing the interface for the abstraction you're building before you start writing a test harness for it, and I can't agree enough with that statement.

If you write a good interface, testing it will be easy. If testing the interface isn't easy, then you have a bad interface. (This is relative to the complexity of the code involved, obviously.)


I am just in the middle of the book but so far I also have mixed feelings:

+ thoughts on complexity and how constant addition of new features adds to complexity

+ deep vs. shallow modules

but at the same time...

- "..classitis": author criticizes the use of many small classes and functions/methods while I think form my experience SOLID principles are there for a reason- every method/class should have one purpose only.

- which leads me straight to my second point of critique so far: nomenclature. Established names already exist for several of the ideas, but the book does not use them.


>-"..classitis": author criticizes the use of many small classes and functions/methods while I think form my experience SOLID principles are there for a reason- every method/class should have one purpose only.

I don't understand how you can be so confident that SOLID means you should have many small classes and functions/methods. The question of when we should carve off a piece of reality (natural or artificial) and call it one thing, or say that it does one thing, is an ancient philosophical question with no single right or easy answer.


The book's design approach suits TDD perfectly. Narrow, deep modules let you freely refactor internals and keep tests green.


I didn't say that the book's approach contradicted TDD; I'm merely quoting from the book and refuting one of its claims (that TDD doesn't lead to good design). I agree that narrow and deep modules support refactoring internals if their unit tests are written to treat them as black boxes.


TDD in its classical form (as understood by most devs), aka "one test per function/method", does indeed lead to poor design. There is little training out there about why this approach couples code with tests and what to do instead.


Yes, that would make for both terrible design and terrible tests.

I think sometimes people refuse to go past the words naming a practice, or past the TL;DR, and it causes problems.


> To use people's random images for training, they would have to be manually annotated by a human (e.g. facial boxes, eyes, nose, mouth, ears drawn in).

That's not true. There is a large and growing body of research on semi-supervised, self-supervised, and unsupervised learning that can take advantage of these unlabelled images.


Different learning techniques have different applications. I do not believe those techniques are applicable to the hypothetical use-cases of this dataset.

Perhaps semi-supervised learning could be utilized, which reduces the required annotation by some factor k, but that still leaves the annotation effort growing with the size of the dataset.

Self-supervised basically replaces human annotation with machine annotation, making it only applicable to a small subset of tasks in which this is possible (e.g. you could train "guess time from picture" using EXIF timestamp).

Unsupervised is only applicable to very specific tasks.



That's a really nice price point.


The Atavist magazine version of this is a fantastic (but long) read and I highly recommend it: https://magazine.atavist.com/the-mastermind


Could you provide an equivalent Makefile to the one written in the article to demonstrate best practices?


I am by no means a make expert but this is how I would write it:

  CC = clang
  CFLAGS += -Wall -Wextra
  
  libs = -lm
  objs = hellomake.o
  
  hellomake: $(objs)
  	$(CC) $(CFLAGS) -o $@ $(objs) $(libs)
  
  .PHONY: clean
  
  clean:
  	rm -f hellomake *.o *.d
  
  deps := $(objs:.o=.d)
  
  %.d: %.c
  	$(CC) $(CFLAGS) -MM -MF $@ $<
  
  -include $(deps)
The first two lines tell us which compiler we're using and set up some extra compiler warnings. Then we specify which libraries we use (it's a matter of taste whether this is a variable or just spelled out later) and the names of our object files.

The first target, hellomake, is our executable. This is the default target, so it's the one that'll be built if we run make with no arguments. It runs the C compiler with the flags we specified earlier. Its dependencies are the object files, which means they'll be built automatically and rebuilt when their dependencies change. As noted, there is a built-in rule to build the object files from the C source using our CC and CFLAGS. We don't need to tell make how to do that.
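
For reference, that built-in rule is roughly equivalent to writing the following yourself (GNU make's actual definition also folds in CPPFLAGS and TARGET_ARCH):

  # implicit rule: build foo.o from foo.c using CC and CFLAGS
  %.o: %.c
  	$(CC) $(CFLAGS) -c -o $@ $<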

The next target, clean, is a "phony" target—it doesn't refer to a file. Normally a make target tells make how to build a file. A phony target, in contrast, is really just a shell script that doesn't produce any file output. This one deletes all the build products.

The remaining lines set up the dependency tracking for headers. This makes it so that a change in a header file will trigger a rebuild of any source that includes it. We begin by making a list of files by copying the list of object files and swapping in ".d" for the ".o" file extension. Next is a rule that tells how to build a .d file from a .c file. This rule runs the C compiler and tells it to output a dependency tracking file. The last line says to include each of the .d files in the Makefile. (The .d files contain Makefile syntax to specify the header dependencies.)
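
To make that concrete: assuming hellomake.c includes hellomake.h, the generated hellomake.d will contain a one-line rule along these lines, which make then folds into its dependency graph:

  hellomake.o: hellomake.c hellomake.h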

I omitted the IDIR and ODIR and such because in a project of this size I wouldn't bother putting includes and object files in separate directories from the source.

I hope that helps. I know make can seem outdated and weird but I think it really can be simple once you learn a few things about how it works. Hopefully this will get you started in that direction!


Add -MMD to CFLAGS and you can drop the rules that build the dependency files; the compiler will then generate the .d files as a side effect of compiling.

Don't use wildcards in the clean target. Use $(RM) $(OBJS) $(DEPS) instead.
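
Putting both of those suggestions together, an untested sketch of the Makefile above (keeping its lowercase variable names):

  CC = clang
  # -MMD makes the compiler emit a .d file as a side effect of compiling each .c
  CFLAGS += -Wall -Wextra -MMD

  libs = -lm
  objs = hellomake.o
  deps = $(objs:.o=.d)

  hellomake: $(objs)
  	$(CC) $(CFLAGS) -o $@ $(objs) $(libs)

  .PHONY: clean
  clean:
  	$(RM) hellomake $(objs) $(deps)

  -include $(deps)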


> $(CC) $(CFLAGS) -o $@ $(objs) $(libs)

Please honour LDFLAGS and LDLIBS here.

> $(CC) $(CFLAGS) -MM -MF $@ $<

Please honour CPPFLAGS here.
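
For anyone who hasn't run into those conventions: LDFLAGS, LDLIBS and CPPFLAGS are the traditional make variables for linker flags, libraries to link against, and preprocessor flags (-I, -D and friends). With them honoured, the two recipes would look something like:

  hellomake: $(objs)
  	$(CC) $(CFLAGS) $(LDFLAGS) -o $@ $(objs) $(libs) $(LDLIBS)

  %.d: %.c
  	$(CC) $(CPPFLAGS) $(CFLAGS) -MM -MF $@ $<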



I don't think that's particularly fair on Julia. I have found her posts very useful, even the ones written in this style. Documenting her journey from newbie to proficient is very useful to others wanting to learn the topic in question: they can relate to the questions in posts like these and see whether she found answers later on in her archive.

