Notation as a Tool of Thought (1979) (jsoftware.com)
188 points by memorable on July 21, 2022 | 62 comments


Does the notion of suggestibility extend to the notion of "creative breakage"? The latter is an important feature of a mathematical notation.

For example, the dy/dx notation in calculus naturally leads to inquisitive thinking. Can you multiply by dx? It looks like a division but it isn't, really, except that as soon as you get past Calc 101 you are spraying dys and dxs all over. The notation is just incredibly suggestive.

Another example is exponents, where m^n is introduced naively for natural n but instantly prompts the question of non-integer n.

I think that having a notation that is suggestive and can be creatively broken for use in new ways is very important.


Iverson usually expresses the derivative as an operator, which ties into Heaviside's operational calculus[0] but doesn't have multiplication rules like dy/dx (I'd probably be more on your side here, in favor of dx and dy). I don't think he was too comfortable with mathematical concepts that don't resolve into specific calculations. Which may be why his notation ended up being so easy to turn into a programming language. To me, Iverson's suggestivity would be more about unifying or making analogies between established concepts. For example, APL uses * for exponentiation and ⍣ for repeated function application.

[0] https://en.wikipedia.org/wiki/Operational_calculus
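A rough Python rendering of that analogy (my sketch; power and repeated are my names, not APL's): just as exponentiation iterates multiplication, ⍣ iterates function application.

    # Sketch of APL's power operator ⍣ in Python, not actual APL:
    # power(f, n) applies f to its argument n times.
    def power(f, n):
        def repeated(x):
            for _ in range(n):
                x = f(x)
            return x
        return repeated

    double = lambda x: 2 * x
    print(power(double, 3)(1))  # 8, analogous to (double ⍣ 3) 1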


Leibniz's notation for calculus didn't start out as creative breakage; it meant what it said on the tin: dy/dx is an infinitesimal fraction. By comparison, the modern notion of a derivative is that of a higher-order function of type (R -> R) -> (R -> R), something that was only made clear with the advent of types at the turn of the last century.

Creative breakage is a bug which tells you to think harder about what you're doing.


> Can you multiply by dx?

Maybe. But can you multiply by ∂x?


"dx" is a term rigorously defined in infinitesimal calculus (IMO a much easier to understand approach to differentiation.) So, yes, you can multiply by dx (even if x is a more complicated function.)

∂x would have a problematic definition, as it would require selecting from its components based on information not supplied. E.g., let x = y+z. Then dx = dy + dz, and by extension ∂x = ∂y + ∂z, but at least one of the terms on the right is identically zero, depending on whether y or z is held constant. So ∂x doesn't have a meaning.
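Restating that argument as a total differential (my formatting):

    % x = y + z: the total differential counts every source of variation
    dx = \frac{\partial x}{\partial y}\,dy + \frac{\partial x}{\partial z}\,dz = dy + dz
    % A bare \partial x would have to say which other variable is held
    % constant -- information the symbol alone does not carry.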


This is one book on infinitesimal calculus: https://www.amazon.com/Infinitesimal-Calculus-Dover-Books-Ma... .

However, and this is very amusing to me, it turns out that the process of automatic differentiation (see https://en.wikipedia.org/wiki/Automatic_differentiation, the section on dual numbers) works in exactly the same way. Just replace all of the primed symbols (u', v', etc.) with du, dv, etc., and dual numbers are isomorphic to infinitesimals (if I'm using that term correctly).
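A minimal dual-number sketch in Python (the class and all names are mine), where the eps slot behaves like the commenter's du, an infinitesimal whose square vanishes:

    from dataclasses import dataclass

    @dataclass
    class Dual:
        val: float  # u
        eps: float  # du, with eps^2 = 0

        def __add__(self, other):
            return Dual(self.val + other.val, self.eps + other.eps)

        def __mul__(self, other):
            # (u + du)(v + dv) = uv + u dv + v du; the du*dv term vanishes
            return Dual(self.val * other.val,
                        self.val * other.eps + other.val * self.eps)

    # d/dx (x^2 + x) at x = 3: seed dx = 1, read the derivative off eps.
    x = Dual(3.0, 1.0)
    y = x * x + x
    print(y.val, y.eps)  # 12.0 7.0, matching 2x + 1 at x = 3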


Actually, “dx” is rigorously defined in the standard calculus, too, and yes, you can multiply (and divide) by it.


For the love of anything sacred, can you please point me to a resource that explains all possible operations with "dx" and their conceptual meaning?

Like, I get "dx", but I cannot put my finger on it! This might be because the Precise Definition of the Limit phrases it as "x approaches a"; it is as though we are "sent" to the land of dx but not told what it is as an atomic concept!


I am not entirely sure what the above commenter means by dx being rigorously defined in "infinitesimal calculus", because I don't know what they mean by "infinitesimal calculus". As far as I am aware, there is standard calculus, non-standard analysis by Robinson, and smooth infinitesimal analysis, which uses intuitionistic logic. The three are very different. dx has no meaning in standard calculus; it is simply there for notation. It is given meaning by the theory of smooth manifolds and differential forms. In that setting, differentials such as dx are given explicit meaning: they are functions that operate on tangent vectors. For example, apply dx to the tangent vector d/dx + d/dy to get dx(d/dx + d/dy) = d/dx(x) + d/dy(x) = 1 + 0 = 1.


> dx has no meaning in standard calculus

Sure it does. There is no need to know about smooth manifolds or differential forms to understand the differential of a function of one variable at a point and the meaning of dy = f’(x)dx.


What is the meaning then?

dy = f'(x)dx is just a definition for notational convenience, primarily employed when doing u- or u-v-substitution. My point is that dx in single-variable calculus is notation; it is not an intrinsic object. dx is an intrinsic object as a differential form on a smooth manifold. Of course, the real line R is a 1-manifold, so dx does have that meaning, but you need to understand what a differential form is to know that.

One doesn't necessarily need the full generality of smooth manifolds though. Harold Edwards' Advanced Calculus: A Differential Forms Approach and Advanced Calculus: A Geometric View teach differential forms for Euclidean manifolds.
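For concreteness, here is the bookkeeping that dy = f'(x)dx licenses in a u-substitution (my example):

    % u = x^2, so du = 2x\,dx:
    \int 2x \cos(x^2)\,dx = \int \cos u\,du = \sin u + C = \sin(x^2) + C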


I think I get your point, but at the same time I disagree.

1) The differential of a function (at a point), dy, is not notation, it is a concept.

2) The differential of the function y = x, dx, is not, then, a notation, either; and, since the derivative is 1, dx = 1·Δx = Δx = x - x0.

3) You can argue, of course, that using dx instead of Δx in dy = f'(x)dx is "notation," but I think the above shows that it is more than that.


The second book is by James Callahan. I accidentally left that off.


Can you please point me to a resource that explains all possible operations with "dx" and their conceptual meaning?


Any introductory calculus book worth the paper it’s printed on would gladly tell you that the differential of the function y = x at a point x0 is nothing more than x - x0 and that you do not have to think about it as something that is “infinitely small” or anything equally mysterious. (Some would even go as far as saying that “the differential of a function of one variable is a linear map of the increment of the argument.”) So, with dx = x - x0, you can do with it anything you want, even divide by it (assuming that dx stays non-zero).
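A worked instance of that definition (my example), taking y = x^2 at the point x0:

    % dx = x - x_0, and dy is the linear map applied to that increment:
    dy = f'(x_0)\,dx = 2x_0\,(x - x_0),
    \qquad \frac{dy}{dx} = 2x_0 \quad (dx \neq 0)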


Thank you for asking this. This plagued me for years as an undergrad physics student.


Yesterday, I posted this 1974 interview with Iverson and the group who developed APL.

https://news.ycombinator.com/item?id=32173840

I started down the APL/J path last week after watching this APL study group with Jeremy Howard of fast.ai:

https://youtu.be/CGpR2ILao5M

https://forums.fast.ai/t/apl-array-programming/97188


If you haven't seen it, I wrote an intro to modern APL you might enjoy if you're starting out: https://xpqz.github.io/learnapl


This was the blog post that got me introduced to APL. It's a good intro!


If you enjoy this, you might find this meta list on "notation and thought" interesting: https://github.com/k-qy/notation


I found this by Knuth on Iverson's notation interesting: https://www.maa.org/sites/default/files/images/images/upload...


Very interesting. I never really looked at the "Concrete Mathematics" book, so I missed this take by Knuth on turning a formula F into a term [F] by defining it as 1 if F is true and 0 if F is false. Note that this cannot be done in first-order logic, as a formula cannot be part of a term. But it is not a problem in simply-typed higher-order logic, for example, and it is not a problem in abstraction logic, either.
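A small Python illustration of the bracket (my sketch; the function name is mine): turning a formula into a 0-or-1 term lets sums do the counting.

    # Knuth's Iverson bracket [F]: 1 if the formula F holds, 0 otherwise.
    def bracket(F: bool) -> int:
        return 1 if F else 0

    # Count the multiples of 3 below 20 by summing brackets:
    print(sum(bracket(n % 3 == 0) for n in range(20)))  # 7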


Thank you so much for this!!

The bit on "avoid ambiguity, or introduce useful ambiguity" is especially fascinating.


I loved APL for decades but realized a while back that the thing I appreciated most about the language wasn't how it deals with arrays but rather how well it supported the easy composition of functions -- and that primed me well for languages that do that sort of thing even better.

And yet I miss many things about APL when coding in modern functional languages. Specifically, having the shapes of arrays be a concept that's distinct from both their types and their values is something that I can't reproduce in Haskell.


Can you elaborate a bit more on what you can do with shapes that's not easily replicated in other languages?


I love the idea that notation matters for thought. I remain unconvinced that APL is a particularly good one, but I have definitely adopted "tool of thought" as a core productivity concept. For example, a plain text file, Google doc, or Trello board are all decent tools of thought, while JIRA isn't.


Perceptual & cognitive ergonomics


Compactness of notation is important, and verbosity of code (which I think became widespread with the advent of Java) only serves to hurt understanding. Given the right context, mathematical notation is easy to understand (and it often "computes itself"). Mathematical texts use the "literate" style, which unfortunately has not found adoption in the software industry, even though Prof. D. Knuth has been advocating it since the 1980s…


I find that terseness has a real downside when debugging code. If you need to get down to the level of what is actually executing, having to unpack all that compact code involves many more things than I can keep in my short-term memory.

Compactness is great for things that are true and work, but when there's a bug in there somewhere, terse code requires a lot of scribbling on paper.


Mathematicians use very terse notation in formulas, but accompanied by a lot of natural language text. The equivalent in programming would be terse code with long comments and documentation.

Many programmers instead see self-documenting code as the ideal outcome: maybe not very compact, but virtually free of comments (and with documentation at least partially autogenerated).

In reality, successful open-source projects tend to have many comments in the source code. Often not one-liners, but detailed descriptions of functions, their arguments and algorithms, motivation for the choice of the implementation and so on.


I used to believe this but I don't anymore.

From https://github.com/tlack/b-decoded

Arthur is famous for his very dense programming style. Most C programmers would scream when seeing this code.

In his view (and that of others in the terse scene), it is much better to have everything in your application readable on the screen at once than to have great names for things or a lot of white space to comfort the first-time reader.

To them, once you've sufficiently studied that screen or two of code, you can understand all of it at the same time. If it's spread out over thousands of files, it's very difficult to understand all of it, which leads to bugs, unnecessary abstraction, and the need for advanced tooling just to work with your own project's code.

He wants to see the code "all at once" so he can understand all of its behavior without paging around and shifting his focus to another tab, window, etc. To get there he makes a lot of tradeoffs in terms of the code formatting and naming conventions. He also, in b, creates a dense set of interlocking macros and abstractions that can make the code very hard to follow.

Critics and the uninitiated say that his code is like old school modem line noise: random punctuation intermixed with bits of understandable code. I would suggest that he's actually quite careful with the abstractions he chooses and they are actually not always the most dense, highly compressed code structures available to him. He chooses wisely and his code rewards deep study.

Interview with Arthur Whitney: https://queue.acm.org/detail.cfm?id=1531242


I looked at the two-line code sample at the link, and it is extreme.


Yeah, Arthur Whitney's code is extreme (he is one of a kind) but the sentiment is something I wholeheartedly subscribe to.

The key point is this: "once you've sufficiently studied that screen or two of code, you can understand all of it at the same time. If it's spread out over thousands of files, it's very difficult to understand all of it."

Because there are so many interlocking concepts in code, you have to keep as much as possible in your head to build up the entire picture. This is where concise, terse, direct-to-the-point code shines; nothing gets in the way of putting all the pieces of the jigsaw puzzle in front of you so you can "get" everything at a glance. A good example is the K&R C style espoused in their book, which I used to find difficult in the beginning but now understand: always put as much relevant code as possible into one screenful.


This is a case for more powerful debuggers that can unroll code for you.


Underwhelming article. The title is so powerful, though, that this article makes the front page of HN quite often.

The thing is, notation is always a tool of thought. No exceptions.

Take two matrices A and B. The notation AB for the matrix product is a great thinking tool. The matrix product is so strange when first seen, but the fact that it is introduced as a product, with the same notation as the product notation for scalars, makes it so much easier to grok. Imagine if we had used A@B, as in Python (numpy).
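For reference, the numpy notation the comment alludes to:

    # Matrix product in numpy: PEP 465's @ operator vs. the math notation AB.
    import numpy as np

    A = np.array([[1, 2],
                  [3, 4]])
    B = np.array([[0, 1],
                  [1, 0]])
    print(A @ B)  # [[2 1]
                  #  [4 3]]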


There is GNU APL, which is a decent, 100% free APL, for anyone who wants to try it out. The Emacs mode for it allows you to pop up a keyboard with the symbols; as well, there is a "." mode, where for instance .i becomes the iota operator.


Eh? This line of the appendix:

> Boolean: ∨ ⍱ ~ (and, or, not-and, not-or, not)

In some places in the pdf the notation looks sketchy. But this line demonstrates beyond any doubt that the pdf is missing chars.


There is an "Errata" section at the bottom of the document. One of the entries:

> In Appendix A, the list of boolean functions should be ∧ ∨ ⍲ ⍱ ~ instead of ∨ ⍱ ~


Thanks. I've asked someone to fix it.


So obviously this gave birth to the hugely popular kdb. And now shakti is around the corner.


Among all languages, APL has the highest ratio of times its founding document is posted to Hacker News to lines of code written.

I don't mean this to be terribly dismissive: I've always been "tangentially fascinated", like I think a lot of people are, by APL and Forth. But I've never properly used it because ultimately it's in conflict with how I think programs should be written: with types, abstraction, a focus on readability, etc.


To be fair, "lines of code written" is a particularly poor choice of denominator when discussing APL. But I think you're also unaware of historical usage. There are some very large (I believe close to a million lines) financial applications still in use, and far more applications for ordinary tasks like scheduling, payroll, and other administration made for companies and universities in the 80s.

Now that the ACM has made them free to access, papers from the APL conference are a good place to look to get a sense for this; APL79 was the first really huge one: https://aplwiki.com/wiki/APL_conference#1979


I'm aware of kdb+. It has an SQL interface and clients in multiple languages. How many of their users use the APL interface? And how many financial companies are planning on building new products on top of a technological dead end from the 1960s?

There was a time when APL was taught in universities, and some computers came with APL-specialized keyboards. That time came and went because the bulk of mainstream software development went in a wildly different direction.


For clarity, I'm talking mainly about SimCorp[0], which is a user of Dyalog APL, not kdb. The core of Dimension uses APL (with many other parts in C#), and there's a separate APL product bought with APL Italiana as well. They employ hundreds of APL programmers, preferring to hire math and physics majors out of college because they're easier to teach. Yes, SimCorp probably wouldn't start with APL today, and I wouldn't give it a second thought either if I were trying to start a big business. I pointed to the large codebases because they're easier to quantify, but I'm sure there are millions of lines written by hobbyist developers.

I know you're not trying to flame, but what you've given us so far is a false claim that nobody writes code in APL, and a bare assertion that APL is a bad way to program. On that note, you're trying to say a language you've never used doesn't allow for "abstraction" or "a focus on readability", which I think is dead wrong. All of which is at best tangentially relevant to Iverson's claim in the paper (not APL's founding document, by the way; that would be A Programming Language) that APL's notation is a way to enhance your thinking. APL has good and bad aspects, and there's a lot to discuss, but can it please be a discussion instead of these careless remarks?

[0] https://en.wikipedia.org/wiki/SimCorp


Plenty. The sql-like interface (in Q) is mostly syntactic sugar; the functional style takes you much further. That said KX is making a huge push on ease of use and front end where you don’t necessarily need to use Q.


FWIW, APL was pretty influential on Matlab, which was the inspiration for Numpy, etc. I think the trouble is that the authors were right about the importance of notation, but they were also making the first pass, and they made some choices that proved to be off in practice.


If you're familiar with both modern APL and the "Iverson ghost" NumPy, you'll likely find the latter a frustratingly crippled experience.


Fair. I don’t think anyone transitions from Matlab or Julia to Numpy and feels good about it, but the broader ecosystem advantages are decisive there.


I suspect a lot of the casual fascination with APL these days comes from frustration with Numpy. In the same way that nobody had anything good to say about Ada until C++ and Java were firmly entrenched.


I am an APL fan specifically because it's like a parallel universe where evolution took a different path: what is often considered an anti-pattern in mainstream languages is best practice in APL. It's a welcome relief from the straitjacket of "the Zen of Python". No libraries? It's a feature. Single-letter variable names? Of course. Terseness as a virtue? Oh yes. Right-to-left flow? Why not? Precedence levels? Who needs them? Tacit? Bring it on. APL is easily the most productive tool in my chest.


What kinds of tasks have you used APL for?


I use APL for any task I previously used Python for, involving grabbing some data, typically via a JSON-over-HTTP API, then massaging, aggregating, and otherwise combining it to produce summaries or reports. I've gradually rewritten the bunch of scripts and code I use daily from Python to APL, and seen a 10-100x reduction in code size, and usually a significant speed-up (admittedly, Python is a low bar here).


It would be trivial to have a typed Forth; the reason no one bothers is that at the microcontroller level all data is the same datatype: bits.

What helped me understand and become productive in Forth was realizing that you had the normal commands that manipulate values, which exist in every language, and a meta-language for manipulating the stacks. Once I made the stacks dance, Forth became just another language.


Does anyone else have a long list of dense material like this that you have an intention of going through someday? Do you ever end up actually doing it?


In the aftermath of a long, protracted war with tabs upon tabs in browsers (and bookmarks, which I know are where links go to die, often literally so), and figuring I'd rather work analog than digital, I came to a system for debriefing and saving references.

I always save the HN thread when applicable, rather than the link, because it gives me a shorthand notation and normalization for writing with a pen. Now, in a paper notebook, I have a few pages where each line is a theme, a word, etc. Next to it I write the HN ids (one can just as well save a comment thread; superbly useful) to reference under that theme. Done. If a link is particularly useful, I can write a few words in small print over the id detailing its subject.

This is, however, done for a set purpose: as reference material for writing speculative and science fiction (HN is great for ideas and research) and also articles. When and whether I finally visit a link is staked on a piece and theme ever getting picked by me for writing; I find this a great compromise, and more realistic than simply hoarding.


So long. Bookmarks, pocket articles, org-roam notes, things jotted in notebooks, I even have an org capture template that quickly populates a list of "recommended media" (books, movies, etc) from friends and relations. Maintaining my list of things to read or watch has become a hobby unto itself.

Every once in a while I go through some[1] of it and ask myself, "Will I ever actually look at this?" Sometimes there's a clear answer, but more often it stays in the list because it still looks interesting, and I only consume maybe 5-10% of what I save. There's got to be some term for this. Digital hoarding? It makes me anxious to have it and anxious to just delete it. There are plenty of times I do remember something I saw and wish I could find it again, but I lean too hard into "maybe I'll think about this again and want it".

[1]: It's so long I generally lose the will to even evaluate the items after a while.


Funny you mention that, because I just added this article to a longer-ish list of long-form posts I want to read in depth.

I don't make my way to that list as often as I'd like, but I have found plane trips and other similar times are great for when I want to do something like that. Sitting down with a longer blog post and taking notes on a 5 hour flight is oddly relaxing. So yes, to answer your question, I do get around to it eventually when the post is worthwhile.


I keep such a list in Roam, and usually neglect to pluck things from it. But it's more that when I go to add something and it's already there or has some relation to something already there (e.g. same author), I'll often give it more attention immediately. It's not the best system, but I suspect it's better than never writing it down :)

(This is perhaps a bit of an ADHD-specific tactic.)


"Pluck" is a wonderful verb for this action of picking from a curated list. Thanks, I will use it from now on.


I've been experimenting with a search engine/personal assistant to index the long list of material for me, and feed back snippets/articles as I want. I guess the analogy is having a person "read" the material for me and use it to answer general questions I have/point me to the article.

If you want to try it for yourself, I'd be happy to give you beta access. I'm still experimenting to find the best UX.


One way to go is through a text-to-speech reader, so it's the kind of thing you can listen to while standing in line at the bank. Speechify is a good service for bookmarking into a listening list.

PS: YMMV a lot based on figures, code snippets, and anything illustrated, so it's not for all content, including the featured article.


Tried Speechify on my phone and found it terrible. Hated the navigation and found the pricing model ridiculous. Deleted it and stuck with Voice Dream Reader. It’s the first app I install on every iPhone, and where I keep all my reading - articles, ebooks, and even physical books which I scan just so I can read them inside VDR. I can read with my eyes. I can listen to the text while doing other things. And I can take highlights and notes and export them.

This also is my answer to OP. I save these articles to VDR. And when I have time to read or listen to something, I open the app. It helps that VDR shows me the length of each document in terms of reading time. When you ask me how long a particular book was, I’ll say “It’s a 12 hour book”. Really helps to put things in perspective.



