NNs have complex non-convex loss functions that don't admit a closed-form solution. Even for small models, training can be shown to be an NP-complete problem. In fact, even for linear regression (least squares), which does have a closed-form solution, it can be computationally cheaper to run gradient descent, since the closed-form solution requires you to compute and invert a large matrix (X^T X).
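A minimal sketch of both routes in Haskell, for a 1-D fit y ≈ w·x (the learning rate and iteration count are made-up constants, just for illustration):

    -- Closed form: w = (X^T X)^-1 X^T y. With a single feature the
    -- "matrix" collapses to a ratio of sums, but with many features
    -- this becomes the large solve/inversion mentioned above.
    closedForm :: [Double] -> [Double] -> Double
    closedForm xs ys = sum (zipWith (*) xs ys) / sum (map (^ 2) xs)

    -- One gradient-descent step on the mean squared error.
    step :: Double -> [Double] -> [Double] -> Double -> Double
    step lr xs ys w = w - lr * grad
      where
        n    = fromIntegral (length xs)
        grad = (2 / n) * sum (zipWith (\x y -> (w * x - y) * x) xs ys)

    -- Iterate from w = 0; this converges toward closedForm xs ys
    -- without ever forming or inverting X^T X.
    descend :: Int -> [Double] -> [Double] -> Double
    descend iters xs ys = iterate (step 0.01 xs ys) 0 !! iters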


Which in some sense is intuitive: any closed form that can model general computation to any significant degree should be hard to evaluate. If it weren't, you could encode your NP-complete problem into it, solve it efficiently in closed form, and collect your Fields Medal for proving P = NP.


Intuition is often wrong, even for high-IQ people, like your average HN user. lol.

For a long time it was intuitive that you couldn't find the area under arbitrary functions, but then calculus was invented, showing us a new "trick" that was previously unfathomable and indistinguishable from magic.

I'm just not sure mankind's understanding of Mathematics is out of new "tricks" to be learned. I think there are algorithms today that look like they require N iterations to reach X precision, when in reality we might be able to divide N by some factor, for some algorithms, and still end up with X precision.


> I'm just not sure mankind's understanding of Mathematics is out of new "tricks" to be learned.

This is my opinion also as it relates to AI/ANNs. Things I read about how scientists see the brain shifting due to learning (minimum-energy-of-network type stuff) suggest the brain has some functions figured out that we haven't identified yet.

Maybe it's math that's already fully understood, just not applied well to ANNs, but maybe there's some secret sauce in there.


One reason to believe there's still new low-hanging fruit (that doesn't even require new math) is how simple and trivial the "attention heads" structure of the Transformer architecture really is. It's not advanced at all. It was just a great idea that panned out, one that pretty much any creative AI researcher could've thought up after smokin' a joint. lol. I mean, someone could do trivial experiments with different perceptron network structures and end up revolutionizing the world.

I think things are gonna get interesting real quick once LLMs themselves start "self-experimenting" with writing code for different architectures.


Thanks for that great clarification. I had seen all those words before, but just not in that particular order. haha.

Maybe our only hope of doing LLM training runs in a tiny amount of time will come from quantum computing or even photonic (wave-based) computing.


As a high schooler interested in things that are probably out of my league, I love reading this kind of stuff. The author mentions their unconventional style at the beginning: hyperlinks, color, italics, exclamations, parentheses, side notes, etc. They address your thoughts and craft their explanations around them instead of being terse and optimizing for length.

The author's blog [1] has more of this style and has really helped me get deeper into category theory. Without it, I wouldn't have even begun to try and understand things like the Yoneda lemma (which still remains a mystery to me).

Does anyone know any other papers or resources with this style? Maybe something like "explorable explanations" [2]?

[1] https://www.math3ma.com/

[2] https://explorabl.es/


Not addressing your question, but the Yoneda Lemma is kind of a charlatan.

On first reading, it seems magical and deep, but once you grok the proof, it feels like a relatively trivial observation. The whole thing is just about arrow composition!
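A sketch of that observation in Haskell (this is essentially the standard Yoneda encoding, as in the kan-extensions package; the isomorphism is witnessed by nothing more than fmap and id):

    {-# LANGUAGE RankNTypes #-}

    -- "Arrows out of a, pushed through a functor f" ...
    type Yoneda f a = forall x. (a -> x) -> f x

    -- ... is isomorphic to f a itself:
    to :: Functor f => f a -> Yoneda f a
    to fa = \k -> fmap k fa   -- all we ever do is compose with k

    from :: Yoneda f a -> f a
    from y = y id             -- probe with the identity arrow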

In a way, once you're on the other side, the Yoneda Lemma feels a bit like a checkpoint during the acclimatization period where your brain gets used to thinking in categories and commutative diagrams.

Not quite sure it's exactly the style you're talking about, but Bartosz Milewski has a nice series of video lectures going all the way from nothing up to coend stuff. IIRC it gets to the Yoneda Lemma somewhere in the second series:

https://invidious.snopyta.org/channel/UC8BtBl8PNgd3vWKtm2yJ7...


> The whole thing is just about arrow composition!

There's much, much more to it.

For example, a version of the Yoneda lemma also holds for metric spaces (instead of a set of arrows between two things, you simply have a number indicating a distance between two things).

Here's how I like to think about the Yoneda lemma:

If you have some kind of objects you want to talk about, one way to do this is by relating these objects to each other. Once you have established such a "method of discourse" (i.e. a way to talk about how your objects relate to each other), the Yoneda lemma tells you:

1. You can forget about the inner structure of your objects; everything is already contained in your "method of discourse".

2. Conversely: the choice of a "method of discourse" severely limits what you can say about your objects.

3. There is no spoon: To understand how you can escape these limitations you need a "method of discourse" for "methods of discourse".


I like your example and perspective. I would just add that doing this for metric spaces is not merely an idle exercise: it leads directly to constructive solid geometry [0], where we render images of complex solid objects by exchanging the object for a signed distance function [1], a function which indicates how far the object is from any point in the space.

[0] https://en.wikipedia.org/wiki/Constructive_solid_geometry

[1] https://en.wikipedia.org/wiki/Signed_distance_function


> a number indicating a distance between two things

That *is* an arrow. Arrow composition is adding up the distances along a path.


The correct term would be "composition in an enriched category". The morphism objects in this context are usually not called arrows.

I think the other poster meant "elements in the set of morphisms" when they said "arrows".

The difference is the following: Metric spaces "are" categories enriched over the real numbers, ordinary categories are categories enriched over the category of sets. So in one case the morphism objects are sets while in the other case they are real numbers. "Arrows" then refers to something internal to the morphism object. The distance (a real number) between two points in a metric space does not have any internal structure.
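A toy sketch of that contrast in Haskell (the names here are mine, just for illustration; the "real numbers" in question are Lawvere's [0,∞]):

    -- Ordinary category: hom-objects are sets, and composition
    -- pairs up arrows:  compose :: Hom b c -> Hom a b -> Hom a c

    -- Enriched over ([0,∞], +): a hom-object is a bare number with
    -- no internal structure, and "composition" is the triangle
    -- inequality  d(a,b) + d(b,c) >= d(a,c).
    type Dist = Double

    composeDist :: Dist -> Dist -> Dist
    composeDist dbc dab = dab + dbc  -- an upper bound on d(a,c)

    -- The enriched identity is the zero distance: d(a,a) = 0.
    identityDist :: Dist
    identityDist = 0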


That almost describes discourse about politics and social issues, too.


"On first reading, it seems magical and deep, but once you grok the proof, it feels like a relatively trivial observation. The whole thing is just about ____ composition!"

I think if you replace ____ with the right word, almost every result I've seen in my (albeit somewhat limited) exposure to category theory can be described this way.

(not that that detracts from your answer!)


Well yeah, because category theory studies composition. That's what it's for. That's why a category is defined the way it is: a bunch of objects, arrows between them, an identity arrow from each object to itself, and an associative composition operator on the arrows.
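That definition transcribes almost verbatim into Haskell's Control.Category class (the identity and associativity laws are left as obligations for each instance):

    class Category cat where
      id  :: cat a a                        -- identity arrow per object
      (.) :: cat b c -> cat a b -> cat a c  -- associative composition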


Yoneda just says that we may exchange an object for all of the arrows which point to (dually, from) it. This is extremely deep; it is rather surprising that objects and arrows would have such a duality or exchange!


+1. The lemma is trivial not because the result isn't deep but because we have the right definitions.


This 'triviality' issue has reminded me of Grothendieck's "two styles in mathematics":

http://www.landsburg.com/grothendieck/mclarty1.pdf


Not exactly unexpected conceptually though, given that objects have no intrinsic structure. What may be surprising is that this fact has useful consequences in applications to concrete mathematical objects that do have structure.


I don't have any resources to share on this, but I was also interested in this type of thing at your age (still am). If it interests you, never assume something is out of your league.

You have time on your side. Sustained effort compounded over time is unreasonably powerful. Keep pushing!


> Sustained effort compounded over time is unreasonably powerful.

Thank you, this hits close to home and is great advice, I appreciate it.


90s-ish videogame lore, “Time is an ally, not an enemy. Patience can sharpen even the smallest of efforts into a weapon that can strike the heart of an empire.”

On my cube walls, whenever I get back to my cube: “magic is just cleverly concealed patience.” The approximate η-expansion: you can generally figure out what professional magicians do, you just dismiss it. “I mean, of course she coulda known what card I was going to pull if she bought 52 decks of cards and assembled an entire deck made out of the Jack of Spades... but that’s expensive and difficult and who would do that?” Penn Jillette puts the same point a different way, saying that he is going to have to perform the same trick over and over again, so it cannot be merely 99% likely not to injure him; that is far, far too low. He has to be 100% certain his tricks are safe. If he is firing a nail gun at Teller’s neck, it must surely be incapable of firing a nail at Teller, and if he waxes on about how much he has memorized the pattern of the nails in the gun, then he must be cleverly concealing whatever patience went into making that nail gun and the illusion that it is loaded.

But the key is that once you see this one place, you see it everywhere. A restaurant works by the same magic. You say what you want and it magically appears before you. How did that happen? Prep work for hours in advance of the meal, so that when the time comes I just need to put tab A in slot B. Like a beef wellington: pre-cooked steak wrapped into a nice little bundle that just needs to be finished in the oven for 20 minutes and sliced. Something to think about next date night—you can impress your date immensely if you make the wellingtons the night before, so that your focus is on your lover and not on your cutting board.

You don’t usually think about the prep work that went into the Mother of All Demos, but you should. Sometimes when Alan Kay shows a clip from there, he mentions something like, “you might notice that they get sub-second latency on all these operations, but the terminal was here and the computer was over there so I like to ask students how they got sub-second latency on all these operations, and only one time have I got the right answer, one student said ‘because they wanted sub-second latency?’ and that is absolutely correct, they-goddamn-wanted-subsecond-latency, these things are there for you if you really want them.”


https://betterexplained.com if you haven’t been through it yet.

Already often quoted on HN, and some discussions are interesting with the author bringing feedback. https://hn.algolia.com/?q=https%3A%2F%2Fbetterexplained.com


You may like the books Conceptual Mathematics: A First Introduction to Categories, Creative Mathematics, Advanced Calculus: A Differential Forms Approach, and Geometrical Vectors.

https://www.amazon.com/dp/052171916X/

https://www.amazon.com/dp/0883857502/

https://www.amazon.com/dp/0817637079/

https://www.amazon.com/dp/0226890481/


Not exactly what you asked for, but this explanation of the Yoneda Lemma is amazing: https://youtu.be/h64yZs8ThtQ


If you’re interested in machine learning I can recommend Christopher Olah’s blog: https://colah.github.io/


Although they might be very out of your league depending on what your interests are, Anthony Zee's books have a similar writing style (in print).


I think you might be describing refinement type systems [1]?

[1] https://en.wikipedia.org/wiki/Refinement_type


I've been reading about that, but I can't seem to wrap my head around the difference between refinement and dependent types.


I think dependent types are stronger than refinement types because they can encode properties that don't necessarily have to be decidable. Refinement types are used for carving subsets out of an existing type (typically with predicates a solver can check automatically), while dependent types can be used for computing arbitrary types from values. As a result, dependent types are more powerful, but they might be too much if all you need are some constraints on an existing type.
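A rough sketch of the "types computed from values" side in Haskell, with GADTs/DataKinds standing in for full dependent types; the refinement style is shown in a comment for contrast (LiquidHaskell syntax):

    {-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

    data Nat = Z | S Nat

    -- Dependent-ish: the length of the vector lives in its type.
    data Vec (n :: Nat) a where
      VNil  :: Vec 'Z a
      VCons :: a -> Vec n a -> Vec ('S n) a

    -- A total head: empty vectors are ruled out by the type checker.
    vhead :: Vec ('S n) a -> a
    vhead (VCons x _) = x

    -- A refinement type instead carves a subset out of Int:
    --   {-@ type Pos = {v:Int | v > 0} @-}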


`|` is bitwise-or in many C-style languages.


AFAIK React itself is performant enough; the bottleneck is client code within components. I suppose the change is mostly for full enterprise web apps at the Facebook level that likely have thousands of components running at the same time. Concurrent mode lets React split up work and give some time back to the browser (e.g. for mouse movements or rendering) to prevent large freezes on rerenders, essentially allowing latency-sensitive interactions to be prioritized. On top of that, I think it also helps remove some of the bottleneck that comes from I/O.


I wrote an article with the same name! [1] The problem-solving-based approach seems to be the most relatable and easiest way for people to get an idea of what monads are. Nice work.

[1] https://blog.kabir.sh/inventing-monads


Great work Kabir! : )


They pioneered the "battle pass" model for cosmetic items, and their constant release of skins, dances, etc. can create a lot of recurring revenue, I imagine.


Weird, on the one hand I'm thankful to hear it's not "pay to win" stuff, but on the other, wow that's a lot of skins and dances!

Edit: Insert joke about Quantitative Easing here


Yup, I think they definitely found the sweet spot for making revenue from games — offering players sweet returns if they keep playing (you buy the battle pass for 1000 V-Bucks, but you can earn back 1500 just from playing casually) while still keeping it relatively competitive and not "pay to win". They did host a $30 million World Cup event, after all :)


Hey, I wrote a post with the same title [1]! I feel like everyone writes a blog post once monads finally click, and this one is great — I like how you introduced them through solving real-world problems. :)

[1] https://blog.kabir.sh/inventing-monads


How does it become any more object-oriented?

I saw it more as "purely functional", where components are the functions and styling outside of the component is a side-effect. Avoiding side-effects like margin or align-self makes components more like pure functions than objects IMO.


Margins and align-self aren’t side-effects. Their semantics are consistent and reproducible and depend only on the context a component is placed in, just as a property like width does.

They’re awkward because they have bigger knock-on effects on the overall layout of components within the parent.

In an OO mindset, I see this as being all about what interfaces your object exposes. For something like a button or slider control, it obviously exposes an action callback of some kind. But it also exposes a generic “child component” interface, which is the only thing a generic parent component cares about.

What are useful things and what are obnoxious things for a child component to do? Flexibly adapting itself to a range of sizes is useful. Demanding that it should be centered is obnoxious. In fact it would be hard to express at all in most OO widget toolkits. So I think this problem is very closely related to good OO design.


Their semantics might be consistent, but they sort of leak into the parent. Height and width are local to what the component renders, and they don't depend on the parent (as long as they aren't relative sizes). But things like margin and align-self change how the parent renders all of its children.

In other words, the view rendered by the component is a function of properties like height and width. But when you throw in margin or align-self, the component now affects its siblings.

The OO perspective is interesting — I agree that it effectively makes these things hard to express. I now think that it's more of a problem relating to encapsulation and isolation. OO would isolate it through a child component interface while FP would have a function that can't affect siblings.


Yeah, I think we’re mostly agreeing! Encapsulation and isolation are the key bits, and both FP and OO suggest ways to achieve them.


Looking at their GitHub [1], it's been actively developed and open source since 2013.

[1] https://github.com/unisonweb/unison

