Apple M2 Die Shot and Architecture Analysis – Big Cost Increase and A15 Based IP (semianalysis.substack.com)
237 points by yurisagalov on June 11, 2022 | 199 comments


Thanks to being based on their phone chips, Apple came out of the gate with the M1 and cleaned everyone’s clock on performance-per-watt while putting up good-to-great numbers in general (as a CPU).

But their rate of improvement on the A series has been slowing on general tasks. They’re on the same process node, and only increased frequency a bit.

Is it really that surprising that performance didn’t take a massive jump? You can’t keep up a 20% increase in normal stuff every release for long.

You can use accelerators like they do for video and ML to help some tasks. You can improve your GPU some and make it a little bigger.

It seems like in some places people are trying to push a “the M2 is a failure because it’s not a huge leap above the M1” narrative. But no one expects that from Intel or AMD every year anymore. Or from Apple’s A-series.

So why here?


> They’re on the same process node, and only increased frequency a bit.

It's going from N5 to N5P, chosen by Apple over N4.

> But no one expects that from Intel or AMD every year anymore.

That's not accurate; a minor performance upgrade after almost 2 years is exactly the thing Intel has gotten a lot of flak for in recent years. The fact that people are willing to defend it is really exclusive to Apple and their unbeatable marketing.

Zen 4, on an almost identical timeline, will bring roughly 30-40% more performance, and people were widely disappointed by the announced ">15%" single-thread gain, which is very close to Apple's +18% multi-thread. Intel will have gone from Rocket Lake to (almost) Raptor Lake, doubling performance.


Intel got criticized for minor improvements year after year. It's important to look at these on 5+ year timescales, since the improvements aren't evenly spread out. Especially since gains often depend on manufacturing process improvements.


2015 Skylake i5-6600K single core 1092 [1]

2017 Kaby Lake i5-7600K single core 1157 [2]

2017 Coffee Lake i5-8600K single core 1206 [3]

2018 Coffee Lake i5-9600K single core 1233 [4]

2020 Comet Lake i5-10600K single core 1307 [5]

~20% improvement over 5 years, an average of roughly 4.6% per release, with individual steps ranging from 2.2% to 6%. Didn't realize that they released 2 Coffee Lakes. (Quick check of the arithmetic below, after the links.)

[1] https://browser.geekbench.com/processors/intel-core-i5-6600k

[2] https://browser.geekbench.com/processors/intel-core-i5-7600k

[3] https://browser.geekbench.com/processors/intel-core-i5-8600k

[4] https://browser.geekbench.com/processors/intel-core-i5-9600k

[5] https://browser.geekbench.com/processors/intel-core-i5-10600...
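
A quick sanity check of those numbers (scores hard-coded from the links above; just a sketch, nothing rigorous):

    #include <math.h>
    #include <stdio.h>

    int main(void) {
      /* Geekbench single-core scores quoted above, 6600K through 10600K. */
      double scores[] = {1092, 1157, 1206, 1233, 1307};
      int steps = 4;                                    /* four steps between five parts */
      double total = scores[4] / scores[0];             /* ~1.20x over 5 years */
      double per_step = pow(total, 1.0 / steps) - 1.0;  /* geometric mean per release */
      printf("total: +%.1f%%, per release: +%.1f%%\n",
             100.0 * (total - 1.0), 100.0 * per_step);
      return 0;  /* prints: total: +19.7%, per release: +4.6% */
    }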


Those are all Skylake CPUs with very minor tweaks. The biggest change was clock frequencies steadily increasing as 14 nm evolved into the ultra-mature 14+++ nm process.


Actually, the reason Intel got criticized is not that minor improvements were the best Intel could do.

When studying the evolution of Intel CPUs over many years, it is obvious that most of the time they could have delivered greater improvements. But as long as their competition was weak, they spread the improvements they could have made in a single year over 2 or 3 yearly CPU generations, in order to minimize their manufacturing costs and therefore maximize their profits.

Only during the many years between Skylake and Alder Lake was Intel no longer able to implement all the improvements they would have wanted, due to the failures in the development of their new CMOS processes. They were forced to make assorted minor improvements because greater improvements were impossible, and they did not have a good Plan B as an alternative to the erroneous Plan A, which was to assume every year that the next year would be the one when the Intel "10 nm" CMOS process became competitive.


Yeah, I'm not personally surprised by there not being a massive jump in an 18-month span.

It looks as though Apple is gearing up for ARMv9 and a smaller process node for the next round of chips, which would be more of the "large jump" people are expecting. I think as long as Apple alternates the big jumps with the small jumps, they're not doing anything different from anyone else.

They needed to deliver M2 to show they're not resting on their laurels. If M3 is a similar kind of improvement then that's when to be worried.


I would agree. Third time's the charm and all that. Looking at AMD: Zen 1 was a huge leap, but their second generation, Zen+, was quite small in comparison. Zen 2 showed the path forward and Zen 3 showed they could continue to deliver performance with their methods. I would hold Apple Silicon to the same test (as well as Intel's dedicated GPUs); M3, or whatever the third iteration is, will be the true test of Apple's vision.


Intel didn't get flak because of minor improvements; they got flak because they couldn't release 10nm or any other major improvement for the better part of a decade.


>It's going from N5 to N5P, chosen by Apple over N4.

Any info as to why?


N4 is still (despite the name) part of TSMC’s 5nm process family and offers little to no performance/efficiency improvement over N5P.

N4 increases the number of EUV layers, so the main improvements should be in cost and yield, which would have been interesting to Apple; but N5P hit volume manufacturing earlier, allowing Apple to ship the M2 sooner and with more capacity.

Waiting for N3 would have offered a considerable performance and efficiency boost but that’d realistically have delayed M2 to the first half of 2023.


Tech journalism needs a dramatic story. So every product is either world changing or a complete failure.


Someone will probably mention that "it's not tech journalism, but all journalism", and I would probably agree. However please keep in mind that there is a significant selection bias here itself -- non-dramatic stories will be less frequently featured here on HN, and even if they are, attract less comments.


SemiAnalysis may be burying the lede and underhyping the story of the decade here: head of Apple arch leaves and takes 100 process engineers to start a stealth RISC startup, and it's creating a delay effect in consumer tech innovation. Bombshell!


To their credit, with many billions on the line, products do have a bit of a tendency to be a failure if all they can do is tread water.


Does anyone seriously think the performance of the M2 processor will have any meaningful impact on Apple's success?

They made a big splash with the M1 MacBook Air, which was at the time an incredible value, and hands down the best laptop on the market in terms of price/performance. Apple was able to get splashy headlines and assert their silicon was not just competitive with, but better than, Intel and AMD. That's the critical goal they had to reach to establish Apple Silicon as a serious contender in the market.

This year, they're iterating on the design, and getting the market to accept a 20% price increase on the macbook air, which is their mass-market product.

Does anything they do from here on out actually depend on them continuing to win in the semiconductor space? It's not as if these chips are competing for server slots, where winning comes down to raw numbers in terms of performance/Watt.

These MacBooks are going to be absolutely fine for the foreseeable future for everything anyone needs a Mac to do: video editing, coding, content consumption, etc. all run great on these devices, which have excellent battery life and a great user experience.


Also Apple has lots of knobs to turn due to the high degree of vertical integration. There is a lot of slack they can pick up in OS performance (e.g. process scheduler, memory management). So their overall benchmarks can continue to trend up even if the locus of improvement varies.


Not if the rest of the market is the same. Most products get yearly updates that are completely unremarkable. No one gets hyped up for the next year edition of a car.


M1 to M1+ would be like the release of the next year edition of a car. M1 to M2 would be like the release of the next generation of a car. A lot of people get hyped up for the release of the next generation of a car.


Major car models are introduced every ~6 years with refreshes at 3 years.


Different types of products have different product cycle times, but that doesn't make the comparison between stages of the product refresh inaccurate.


The MacBook Pro M1+ Pro is just too much.


Get that performance from where though?

The big M1 numbers came from getting ahead of the rest of the market on TSMC 5nm, and critically from packing everything into the SOC so physical distances were reduced by multiple orders of magnitude (which had already been the case for the A series). That's been done now, so the low-hanging fruit is gone there.

Performance gains from here should be expected to be identical to AMD as they’ll be moving on TSMC’s cadence (it’s AMD who might actually see similar jumps on the low end if they go the Apple route and move everything to the package).

I wouldn't be surprised if Apple has already started looking to stand up its own fab. They have large and very predictable needs now, and could likely get ahead of ASML's queue by throwing money and scale at the problem; not least because it would further help them muddy the waters as to what Apple Silicon actually is, which fits the marketing better.


> and critically from packing everything into the SOC so physical distances were reduced by multiple orders of magnitude

I don't think that reduces chip power so much as it reduces latency. Apple's "power" here comes entirely from using the 5nm node and refining a stupid-high IPC.

> it’s AMD who might actually see similar jumps on the low end if they go the Apple route and move everything to the package

No? Again, making everything an SOC has advantages/disadvantages, but your raw performance metrics are almost never significantly influenced by distance of the components (unless the distance is significant enough). AMD's real advantage will be jumping ahead 1.5 generations at TSMC, and then later it will be an architectural change (eg. big.LITTLE). I think Apple is the only one interested in shipping computers with SOCs.


> You can’t keep up a 20% increase in normal stuff every release for long.

But this chip is literally 18% faster “in normal stuff”


When M1 came out, I remember a lot of excitement around the past progress of performance in the A* core line; people extrapolated future performance and thought they were going to get +20% (ST) every 18 months on top of M1.

I think this expectation of sustained performance gains was a part of some of the more glowing reviews, rather than a narrow evaluation of M1 itself.

Anyway, it wasn’t a reasonable expectation. But I think people expected it anyway.


And yet, the M2 has 18% performance improvement in about 18 months. Presumably also single-threaded, since the CPU core configuration is the same as the M1. 18% is close enough to 20%.


>So why here?

99.999999% of internet comments on the M2, or anything hardware related, are pretty much junk. Anandtech used to do some explanation of these sorts of things, but as it turns out people aren't interested in in-depth analysis; they just want benchmarks. They end up drifting towards mass/mainstream media or LinusTechTips-type content. RealWorldTech doesn't do any of these anymore, partly because there is very little money to be made on the consumer side of things. There are other sites that cover semiconductor engineering and business content, but every time they were posted on HN, no one showed any interest, or people complained that the content looked too "enterprisey" or "corporate" because it was intended for B2B settings.

It also isn't just consumers or enthusiasts. Viewed from the outside, most people would expect programmers, or what are now called software engineers, to have some sort of high-level understanding of hardware. But most developers, especially web developers, are so abstracted from hardware that they don't know or don't care about it.

Broadly speaking, this isn't just true of hardware, but of every other subject as well.


> But no one expects that from Intel or AMD every year anymore.

I've been hearing that since Skylake, at every single processor generation that the gains are too modest. That's more than a decade at this point.


Skylake (6th) came out in 2015. It's just that due to the 10nm delay we then had Kaby Lake, Coffee Lake and Comet Lake, all refreshes of Skylake on the same process, until we got to Rocket Lake (11th), a backport of a 10nm uarch to 14nm with the somewhat predictable power issues. Only with Alder Lake (12th) did we actually get the first real new uarch on the process it was designed for since Skylake.


Ya, but you have to look at it in terms of the consumer. Not everyone is buying a new iPhone every year. Most wait two years, and two years of 18%+ improvements leads to a phone that feels like a substantial upgrade, especially for mobile games, 4K video and computational photography.


AMD releases CPUs every 2 years, so it's still impressive that in just 1 year Apple can have these gains.


Apple announced the M1 in November 2020, and the new M2 MacBook Air still isn’t available to order as of June 2022. It’s not accurate to say M-series iterations are annual.

https://en.wikipedia.org/wiki/Apple_silicon


From Anandtech's deep dive into the performance and efficiency cores in the A15, which are reused here in the M2.

Performance Cores:

>Apple A15 performance cores are extremely impressive here – usually increases in performance always come with some sort of deficit in efficiency, or at least flat efficiency. Apple here instead has managed to reduce power whilst increasing performance, meaning energy efficiency is improved by 17% on the peak performance states versus the A14. If we had been able to measure both SoCs at the same performance level, this efficiency advantage of the A15 would grow even larger. In our initial coverage of Apple’s announcement, we theorised that the company might have possibly invested into energy efficiency rather than performance increases this year, and I’m glad to see that seemingly this is exactly what has happened, explaining some of the more conservative (at least for Apple) performance improvements.

Efficiency Cores:

>The A15’s efficiency cores are also massively impressive – at peak performance, efficiency is flat, but they’re also +28% faster.

>The comparison against the little Cortex-A55 cores is more absurd though, as the A15’s E-core is 3.5x faster on average, yet only consuming 32% more power, so energy efficiency is 60% better.

Conclusions:

>In our extensive testing, we’re elated to see that it was actually mostly an efficiency focus this year, with the new performance cores showcasing adequate performance improvements, while at the same time reducing power consumption, as well as significantly improving energy efficiency.

>The efficiency cores of the A15 have also seen massive gains, this time around with Apple mostly investing them back into performance, with the new cores showcasing +23-28% absolute performance improvements, something that isn’t easily identified by popular benchmarking. This large performance increase further helps the SoC improve energy efficiency, and our initial battery life figures of the new 13 series showcase that the chip has a very large part into the vastly longer longevity of the new devices.

https://www.anandtech.com/show/16983/the-apple-a15-soc-perfo...
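
For intuition on how "reduce power whilst increasing performance" compounds into a 17% figure, treat energy efficiency as performance divided by power. With made-up numbers (not Anandtech's measurements), +10% performance at 6% lower peak power works out to roughly +17%:

    #include <stdio.h>

    int main(void) {
      /* Hypothetical illustration only: +10% performance, -6% peak power. */
      double perf = 1.10, power = 0.94;
      /* Energy efficiency ~ work per joule ~ performance / power. */
      printf("efficiency gain: +%.0f%%\n", 100.0 * (perf / power - 1.0));
      return 0;  /* prints: efficiency gain: +17% */
    }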


This is a really great article from Anandtech with benchmarks and analysis; it explained a lot of things.

> The overall performance gains are quite disappointing when you factor in the raw cost increase that comes with this new M2 and the fact that it has been nearly 2 years since the M1’s introduction.

Also, the logic of the article in the title is a little weird to me. The M1 was introduced in the same year as the A14 and they use the same core, while the M2 uses the same core as the A15, which was introduced one year after the M1. So technically the M2 increased performance by 18% in one year, not two years.

Though I'm curious why Apple didn't use A16's core in M2.


> Though I'm curious why Apple didn't use A16's core in M2.

Probably the smaller process node. There's low capacity and low yields for the first year or two of a smaller node. It might not be an issue for the base-level M2s, but they'll be expected to update the Pro/Max/Ultra lineup as well in the next 8 months, and those have much larger die sizes, so they'd end up throwing away most of the wafer. (Rough yield arithmetic below.)
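
To put rough numbers on that: with the classic Poisson yield model, yield = exp(-area * defect_density), and a purely made-up defect density for a young node, a Max-class die hurts far more than a base M2-class die (illustrative sketch only, not TSMC data):

    #include <math.h>
    #include <stdio.h>

    int main(void) {
      /* Hypothetical defect density for an immature node, defects per cm^2. */
      double d0 = 0.5;
      /* Rough, made-up die areas in cm^2: ~120 mm^2 vs ~450 mm^2. */
      double small_die = 1.2, big_die = 4.5;
      /* Poisson yield model: fraction of dies with zero defects. */
      printf("small die yield: %.0f%%\n", 100.0 * exp(-d0 * small_die));
      printf("big die yield:   %.0f%%\n", 100.0 * exp(-d0 * big_die));
      return 0;  /* roughly 55% vs 11% with these made-up numbers */
    }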


Phones are the flagship product. They will always get the latest and greatest first, including cores, die shrinks, etc etc.


Which is weird, because most iPhone users use all that power for stuff like WhatsApp. iPhones are plenty fast already (honestly, probably mostly due to a well cared-for UI).


I assume Apple wants the phones to have the most efficient CPUs, since battery life is much more critical than laptops & desktops.


It’s my opinion that Apple sees putting the best processors in iPhones as a way to extend the life of the product. With that compute overhead available today, the phone will feel “faster” for longer and can take advantage of software features they develop 1-3 years down the line.


Power efficiency is more important in phones.


It is looking like A16 will be on a smaller node. Likely manufacturing it on the current node would be too expensive due to increased transistor count.

Available volume on the new node will be much smaller, so they had to prioritize. This is likely why only the iPhone pro will get the A16.


>Though I'm curious why Apple didn't use A16's core in M2.

We will know soon enough. My guess is that the A16 is designed with TSMC 3nm in mind; that is why (rumour) only the new iPhone Pro will get the A16, and the iPhone 14 will stick to the A15.


Given that M2 is available in a week or two, and the A16 isn't, I would say it's a scheduling thing. In order to have the M1 to M2 cadence not blow out, they have to make decisions about what can go into the product.


M1 2020, M2 2022?


There are a lot of claims of poached talent in the article, basically claiming [paraphrasing] "Apple, maintaining their stressful work environment and not paying to shore that up, lost some rockstars."

How true is this? If they're on the money, it's an excellent example of a talent-retention miss leading to demonstrable mediocrity in delivery.


Seems silly to me. CPU designs are important, but these companies have more than enough engineers to make competent designs even with some people leaving. There's another factor that completely dominates. It's all about the fabs. Intel lost the performance lead, was it because of their designs? No, it's because they lost the lead in fabs. AMD passed Intel, was it because of their designs? No, it's because they use TSMC's fabs and TSMC passed Intel. Apple blew everyone away with M1, was it because of their designs? No, it's because they paid TSMC boatloads of money for exclusivity on their latest fabs. Apple M2 disappoints on CPU performance, is it because of their designs? No, it's because TSMC's next fab isn't ready yet so they're still using the same fabs as M1.

These days I care more about which TSMC process node my chips came from than which company designed them. I need a new computer but I'm waiting until next year because there will be a wave of new CPUs and GPUs coming out with much better performance. Better designs? Maybe a little, but it's really because they're all moving to TSMC N4.

I really hope Pat Gelsinger can save Intel's fab business because we really need another company that can compete in fabs and Samsung isn't doing too hot either.


> No, it's because they lost the lead in fabs. AMD passed Intel, was it because of their designs? No, it's because they use TSMC's fabs and TSMC passed Intel. Apple blew everyone away with M1, was it because of their designs? No, it's because they paid TSMC boatloads of money for exclusivity on their latest fabs.

The fixation on the fab process is bewildering. Yes, it does help, but it is also an optimisation step that is decoupled from, and bears no relevance to, the chip design. Yes, a smaller node brings increased density along and lets more things be whacked into the same-sized piece of silicon, but it will not magically improve overall system performance or result in linear architecture scalability.

The article is specifically calling out a potentially decreased ROB size in M2 cores, and ARMv9 also potentially not arriving until M3, both of which are crucial to software performance. There is absolutely nothing the fab process can do to make SVE2 and matrix instructions automagically appear in lithographic chip designs – those are «silicon» design-time decisions. As we have recently been seeing more and more practical, mainstream uses of advanced SIMD instructions at the C/C++/Rust runtime level that bring order-of-magnitude performance gains, having an SVE2 implementation at the ISA level is becoming somewhat critical.


Stuff like adding SVE2 can be great for specific applications but it's really marginal when looking at whole system performance. What's not marginal are the improvements in power efficiency and room for more cache that come with new process nodes. These chips are power constrained in almost everything they do, because of heat dissipation or battery life or both. Less power and more cache benefits everything automatically, not just the very few things that actually start using new SIMD instructions or other new hardware blocks each year.


> Stuff like SVE2 is really marginal when looking at whole system performance.

It is not. A recent paper (https://arxiv.org/pdf/2205.05982.pdf) from Google engineering has compared the performance of vectorised (SIMD) vs non-vectorised implementations of quicksort in the Highway library, as well as the performance difference between the AVX-512 and NEON/SVE1 implementations. By switching to SIMD processing alone, a 9-19x speedup has been reported, depending on the SIMD unit size (32/64/128-bit numbers have been sampled and measured). Even the smaller end of that range, the 9x performance gain factor, is far from being marginal.

On the SIMD unit size of things, the throughput difference between the AVX-512 implementation (an average of 1120 MB/sec was measured) and the NEON implementation (478 MB/sec on average) is 2.4x in AVX-512's favour, largely due to the smaller width of the NEON/SVE1 units of processing. Again, a 2.4x factor is not in marginal territory.

> What's not marginal is the improvements in power efficiency that come with new process nodes.

And that is an optimisation step, albeit a very important one. However, it alone will not make a quicksort implementation run 2.4x faster.


You completely ignored the "whole system performance" part of my statement. What percentage of your CPU time is spent running SIMD-optimized implementations of Quicksort? Now apply Amdahl's law.
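
To put rough numbers on it, Amdahl's law with deliberately made-up figures: even a 9x kernel speedup applied to 5% of total CPU time nets under 5% overall.

    #include <stdio.h>

    /* Amdahl's law: overall speedup when a fraction p of the runtime
       is accelerated by a factor s and the rest is untouched. */
    static double amdahl(double p, double s) {
      return 1.0 / ((1.0 - p) + p / s);
    }

    int main(void) {
      /* Hypothetical: 5% of CPU time in a kernel that SIMD makes 9x faster. */
      printf("overall speedup: %.3fx\n", amdahl(0.05, 9.0));
      return 0;  /* prints: overall speedup: 1.047x */
    }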


«Whole system performance» is a meaningless term as it is a function of many, usually poorly controlled, input variables, and your whole system is different from my whole system. If my VPN tunnel allows me to have faster transfer speeds simply by virtue of having ISA assisted optimisations in the cryptographic library it uses, the net result will be very noticeable to me but perhaps not for you unless you also have to use the same VPN client.

Even the web browser you are using right now to comment on HN likely makes use of the very same Highway library (Chrome and Firefox certainly do; unsure about Safari) that the speedup gains have been reported for. The «overall» browser performance will also improve as a result, since the browser receives the gains transparently, simply by an optimised implementation being dropped into the browser build.


The idea that my overall browser performance might improve in a noticeable way because it is switching from NEON optimized Quicksort to SVE2 optimized Quicksort is simply laughable. On the other hand, switching to a processor fabbed on a better process node could easily have a noticeable impact on overall browser performance, or battery life while browsing, or both.


It is certainly not laughable to me. You have singled out quicksort as the sole example of performance gains, whereas I used it as a single isolated example of the very large performance gains that can be had. SIMD instructions have seen a lot of other mainstream use cases recently, which include memory copying (memcpy(3)) optimisations amongst others. Your browser has a JavaScript engine, and since it is a JavaScript engine, it has a garbage collector. Garbage collectors move memory blocks around all the time, and a SIMD-optimised memcpy will yield substantial performance gains. Or faster JSON processing. Therefore, SIMD + an improved fab process will result in much larger overall performance gains for you and me as browser users than an improved fab process alone. It is also a realistic example of «whole system performance» improvements if the browser is treated as a «whole system».

And an optimised quicksort can also come in handy if one pokes around a large browser history or uses it as a knowledge base, which I do on a regular basis. My browser keeps an uninterrupted record of all visited websites over the last 15+ years, and being able to zoom in on a particular time span to find something within that temporal range quickly is important to me. I am almost certain that a sorting of sorts is involved somewhere behind the scenes.


Forget Quicksort. What percentage of your CPU time is spent running SIMD-optimized implementations of anything? And what percent of those are upgraded to new SIMD instructions each year? And what real world percentage gains are they getting considering other constraints like memory bandwidth, power, etc? The answers to these questions, multiplied together, put a very small upper bound on the overall benefit of new instructions per year.


If you're watching videos, quite a lot. Also, of the five image codecs mentioned here to succeed PNG/JPG, all have SIMD-enabled implementations: https://cloudinary.com/blog/time_for_next_gen_codecs_to_deth...

I share your concern about new SIMD instructions not being used. It seems to me we're at an inflection point, though. ISAs such as RISC-V and SVE will enable (properly written) software to benefit from future wider vectors without even recompiling. github.com/google/highway (disclosure: I am the main author) lets you write your code only once and target newer instructions whenever they are available, with transparent fallback to other codepaths for CPUs that lack them.

Given the various physical realities including power efficiency, I believe there will be considerably more SIMD usage within the next few years.


Soo... basically a 2x speedup in going from 4x128b to 2x512b ALUs, after discounting the frequency difference. But realistically, Intel's client configurations are 3x256b, which is only 25-40% faster in that paper.

(I suspect any application doing enough quicksort that the 2x speedup is significant, would be even happier going slightly off-core to a coprocessor more specialized in vector processing, like Hwacha. There's plenty of space between "tightly-coupled CPU SIMD" and "GPU" that I think makes more sense than needing to implement 512-bit registers in little cores.)


> Soo... basically a 2x speedup in going from 4x128b to 2x512b ALUs, after discounting the frequency difference. But realistically, Intel's client configurations are 3x256b, which is only 25-40% faster in that paper.

A 2.4x difference was, in fact, reported; however, I still find it somewhat difficult to interpret the reported results. The processing unit size difference alone, and the number of LUs, can't account for such a big difference in transfer speeds, as the M1 Max that was used in the assessment has a very wide memory bus (256 bits wide for a performance core cluster, or 512 bits wide for the entire SoC) as well as an unusually large L1-D cache and a large L2 cache, with both caches having deep TLBs. The test set they used could also fully fit into the L2 cache. I have asked the Google engineer a question in a separate thread about what else could influence the observed performance difference but have not received a satisfactory explanation.


I am that engineer; happy to clarify/go into more detail. First, just to ensure we're on the same page: the speeds we quote are sort goodput, i.e. amount of sorted data produced per time. Memory bus is only incidental to keeping the vector units fed.

The key bottleneck is partitioning. AVX-512 does really well there because it has dedicated compressstore instructions, and it's actually even faster to partition a vector via vperm* (because we only need to do that once, whereas two compress-stores are required to partition). So AVX-512 reaches >25 GB/s partition throughput per core; it's instead limited by the memory bandwidth each core can access (around 11 GB/s if a single core is active, less when all are competing for the total "128 GB/s").
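
For a rough idea of the shape of it, one partition pass with compress-stores looks something like this - a simplified sketch of the idea, not the actual vqsort code; it assumes AVX-512F, n a multiple of 16, and output buffers sized for the worst case:

    #include <immintrin.h>  /* compile with -mavx512f -mpopcnt */
    #include <stddef.h>

    static void partition_step(const int *in, size_t n, int pivot,
                               int *left, int *right) {
      __m512i vpivot = _mm512_set1_epi32(pivot);
      for (size_t i = 0; i < n; i += 16) {
        __m512i v = _mm512_loadu_si512((const void *)(in + i));
        __mmask16 lt = _mm512_cmplt_epi32_mask(v, vpivot);  /* lanes < pivot */
        /* Two compress-stores per vector: one for each side of the pivot. */
        _mm512_mask_compressstoreu_epi32(left, lt, v);
        _mm512_mask_compressstoreu_epi32(right, (__mmask16)~lt, v);
        int cnt = _mm_popcnt_u32((unsigned)lt);
        left += cnt;
        right += 16 - cnt;
      }
    }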

By contrast, NEON for example in the M1 has 128-bit vectors. Its "4 vector units" (even if they can actually execute all instructions concurrently, which is not clear to me and unlikely - Intel can also only execute some instructions on certain ports) are definitely not as good as actual 512 bit vectors, because partitioning only has a left and right side, and we don't have enough ILP for each of those to keep 2 vector units busy. Hence NEON reaches 11 GB/s partition throughput. It would seem like this matches Skylake, but no: once a subarray fits into cache, Skylake is freed from the memory bottleneck and is at least twice as fast there (which is a sizable fraction of the total sort time).

Does this help explain the results?

> The test set they used could also fully fit into the L2 cache.

This seems unlikely because we're sorting 8 MB and my understanding is that cores (unless L2==LLC) generally have private, partitioned L2 caches, so 3 MB in the case of M1. Is that incorrect?


> (even if they can actually execute all instructions concurrently, which is not clear to me and unlikely - Intel can also only execute some instructions on certain ports)

It's pretty symmetrical, moreso than Cortex-X2's 4 pipelines; there's analysis that on M1 only some floating point and crypto instructions can't execute on all 4 pipelines. [1] (TP in that table is inverse throughput)

Which means that, for example, byte permutes from tables 256b or less can actually achieve the same throughput on M1 as with Intel's AVX-512, since M1 can sustain 4x 2-register TBL per cycle. And doing the exact equivalent of a 512b vpermb (3 cycle latency, 1/cycle throughput) can be done with 5 cycles latency and 0.33/cycle throughput on M1, via 4x 4-register TBL.

Well, a vpermd in NEON would need an extra MLA to convert indexes, and vpermi2* equivalents fall off a cliff. And Intel still has p01 free, and COMPACT is SVE. But in general, a lot of the parallelism that enables AVX-512 implementations will convert directly into ILP across 128b vectors.

> This seems unlikely because we're sorting 8 MB and my understanding is that cores (unless L2==LLC) generally have private, partitioned L2 caches, so 3 MB in the case of M1. Is that incorrect?

Anandtech [2] measured the same L2 latency up to about 8MB single-core, so regardless of the details, 8MB is a pretty significant cliff on M1. In any case, RAM bandwidth is ~60GB/s and, unlike on Intel, can be just about saturated by a single core.

[1] https://dougallj.github.io/applecpu/firestorm-simd.html

[2] https://www.anandtech.com/show/16252/mac-mini-apple-m1-teste...


Thanks for the instruction table, hadn't seen that yet. That is indeed remarkably symmetrical! It does reinforce the "half of AVX-512" result - we sustain ipc=2 on Skylake (with 512-bit) and it looks like M1 would sustain 4 (x128 bit).

huh, that's surprising, that plot indeed looks like a core might be grabbing more than 'its share' of L2, though not all. The 'full random' curve starts creeping up after ~3MB as expected, so the situation seems to be even more complex than "use up to 8MB".

For completeness I'll also measure for 100M elements single core, though on M1 that wouldn't make a difference because as you say, a single core can drive a lot of memory bandwidth, enough that NEON becomes the bottleneck.


If something is bound by memory/cache bandwidth, then increasing ALU width wouldn't help in the first place.


This is an interesting question. Has there been much uptake of such accelerators/coprocessors? One concern is that by the time the HW is ready, SW wants to do something different, perhaps fusing some other step with the sort. Another is deployability: everyone has SIMD/vectors on board, but even GPUs aren't quite everywhere nor so easy to scale out.

Also, there are now several RISC-V CPUs with 512-bit vectors, and it seems fair to call them little cores especially compared to x86 and M1/M2. Perhaps 512-bit is more feasible (and sensible) than is widely believed?


> adding SVE2 can be great for specific applications but it's really marginal when looking at whole system performance. What's not marginal are the improvements in power efficiency

Depends on the applications, I suppose. But did you know that (at least on OoO x86), the energy cost of scheduling an instruction dwarfs that of the actual computation? That is why SIMD, including SVE2, can be so important - it amortizes that cost over several elements. Let's spend (more of) our energy budget on actual work.

Is it really just "very few things that actually start using new SIMD"? I'm not a huge fan of autovectorization, but even that is able to vectorize some fraction of STL algorithms. And there are several widely used libraries, including image/video codecs and encryption, that use SIMD and wouldn't be feasible otherwise.


Android flagships are shipping with SVE2 as of this year, which I actually didn't realize until like two weeks ago because there's been nearly zero buzz about it. What's SVE2 being used for over NEON as of now?


Low-level runtime optimisations that yield substantial performance gains in user-facing or system-level software, ranging from cryptography through data-processing algorithms to very-high-throughput JSON parsing.

Take OpenSSL as an isolated example. By simply fiddling with the C compiler flags to allow it to use NEON on the M1, the SHA-256 calculation speed-up is 4x for 128 and 256 block sizes, with performance gains quickly tapering off for larger block sizes, resulting in only a modest 10% increase. And that performance increase happens without the hash functions having been manually optimised for NEON/SVE1.

SVE2, with its variable vector size support, could improve performance for larger unit sizes. Perhaps it is time to spin up a Graviton3 instance and poke around with clang/gcc to see how good, or how much faster, SVE2 actually is.


Yeah, that's NEON. And there are instructions that literally calculate SHA-256, so generalizing from that is moot. My point was, first: what real benchmarks are there of SVE2's benefits over NEON on the mainstream CPUs the M2 would compete against? Unlike AVX-512, NEON was already pretty rich, so the new instructions have rather specialized usefulness.

Because outside of servers where little cores don't exist, 256b ALUs in big cores mean 256b registers in little cores, and Cortex-A510 is way smaller than Gracemont. And then you're giving Samsung another opportunity to screw up big.LITTLE...

And even the server CPUs with SVE are 2x256b except A64FX which is HPC exclusive, so no better than 4x128b.


SVE2 does not increase the maximum speed. That depends only on the width and number of the ALUs, on the number of cores and on the clock frequency.

The purpose of SVE2 is to simplify the writing of the software that exploits the data parallelism, both when that is done manually and when that is done automatically by an autovectorizing compiler.

With SVE2 it should become much easier to deal with data structures whose sizes and alignments are not multiples of the ALU width, and it will also no longer be necessary to write many alternative code paths to take advantage of future, better CPUs, as when optimizing for Intel SSE/AVX/AVX2/AVX-512.

A majority of programs still do not utilize the existing SIMD units as often as they could. With SVE2, their number should diminish.
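
As a sketch of that vector-length-agnostic style, using the standard SVE ACLE intrinsics (illustrative only, untuned; the same source works unchanged for any SVE/SVE2 vector width):

    #include <arm_sve.h>  /* compile with e.g. -march=armv8-a+sve2 */
    #include <stdint.h>

    /* dst[i] = a[i] + b[i]; handles any n and any hardware vector length. */
    void add_arrays(float *dst, const float *a, const float *b, int64_t n) {
      for (int64_t i = 0; i < n; i += svcntw()) {
        svbool_t pg = svwhilelt_b32(i, n);  /* predicate covers the tail */
        svfloat32_t va = svld1(pg, a + i);
        svfloat32_t vb = svld1(pg, b + i);
        svst1(pg, dst + i, svadd_x(pg, va, vb));
      }
    }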


The fab process is very important. A fast design is nothing if you can't build it.


Key talent is incredibly important. If you lose enough senior engineers, it doesn't matter how talented the rest are. You've lost so much institutional knowledge that is either extremely difficult or impossible to regain. And Apple is notoriously under-staffed for a lot of their projects. With the staffing losses to Nuvia, I wouldn't be surprised if they lost enough key talent that it's going to take them a long time to recover and be able to deliver significant performance improvements again. That's what happens when you treat software developers/engineers like commodities.


I thought AMD succeeded because they managed to get Jim Keller and other great engineers? Unsure why you're placing your hope on a CEO.


No one person deserves the credit for a processor and if Intel had delivered on their roadmap (Ice Lake in 2017 and Alder Lake in 2019) AMD would be dead today.


AMD survived Bulldozer. They survived not because of the quality of their chips, but because there exists a sizable x86 market for literally anything except Intel.


Why does that market exist?


I would expect a large part of it to come from a personal dislike of dealing with Intel among the people in a position to pick parts and make the sourcing decisions.


Because, before the mobile explosion, we had the PC industry explosion, which created a huge demand for x86 PCs and Macs in every home.

And even after the mobile revolution shrank the demand for x86 PCs, the cloud revolution further entrenched x86 in the cloud.


The author has been pushing this conjecture for the past year or so, and has been repeatedly called out on the hardware subreddit.

I would recommend taking their business conjecture with a giant pinch of salt. Just today they were claiming Apple has lost hundreds of engineers in the chip division. The idea that a single division somehow lost hundreds without the industry noticing is ridiculous.


I wondered about this too, and I like your advice about salt, but apparently Apple is suing Rivos over this very thing: https://www.reuters.com/legal/litigation/apple-lawsuit-says-...


The article says 40 employees, not hundreds, but I imagine losing 40 of Apple's top chip talent is going to hurt; that is a lot of brain power to lose!

Seems some employees took more than just themselves to Rivos: "at least two former Apple engineers took gigabytes of confidential information with them to Rivos."


Remember Apple cancelling their contract with Imagination (GPU) and hiring their employees to work on Apple's GPU?


Hurts donut.


I wonder if that’s subject to criminal prosecution or merely civil remedies.


I never said or implied that they didn't take many, just that it wasn't hundreds. In the tens? I believe that. Up to a hundred? That's a tall order. Multiple hundreds? That's catastrophic to any org, including one as large as Apple.


I agree; reading this, it just seems like this author has found a market of people who want to read news about how AAPL is going to drop tomorrow.


I remember reading ESR's blog a decade (or more) ago, where every single technical advancement was going to lead to Apple's doom. Every new competitor that popped up was going to lead to Apple's doom. Every legislative initiative was going to lead to Apple's doom. After a while, I stopped reading his blog because, despite a lot of good insight in some areas, his cheerleading for Apple's Doom had clearly created too much bias in his judgement for me to take anything he said seriously.

Apple will eventually be overtaken by another company at some point, but there's a world of journalists and pundits who continue to cry wolf every day.


Why do you think the industry hasn't noticed? If it's not hundreds, how many Apple employees have moved to Nuvia and Rivos?


It has noticed. Look at Apple architects, validation, layout, etc engineers moving to Nuvia + Rivos + Google + Amazon + Microsoft + Meta + Intel + Nvidia + AMD + Apple + Qualcomm.

It's there.


In multiples of tens, I would believe it. Up to a hundred over a couple of years? That's a stretch, but possible if you count a very wide range of roles. Multiple hundreds, as they imply on Reddit? That would be catastrophic to any company, even one as large as Apple. You would certainly see it reflected in their job postings after even a few, let alone a hundred-plus.


Looking at another article from the same author[0], we'll have a pretty solid answer to the impact in September with the release of the A16. Apparently the A15 had very minimal CPU gains clock-for-clock over the A14.

To quote from that article:

"SemiAnalysis believes that the next generation core was delayed out of 2021 into 2022 due to CPU engineer resource problems. In 2019, Nuvia was founded and later acquired by Qualcomm for $1.4B. Apple’s Chief CPU Architect, Gerard Williams, as well as over a 100 other Apple engineers left to join this firm. More recently, SemiAnalysis broke the news about Rivos Inc, a new high performance RISC V startup which includes many senior Apple engineers. The brain drain continues and impacts will be more apparent as time moves on. As Apple once drained resources out of Intel and others through the industry, the reverse seems to be happening now."

I was very optimistic on Apple on the CPU front until I read this today. Now I'm waiting to see how the A16 pans out for them to see if it's a two generation loss of progress, or just a single generation stumble.

0: https://semianalysis.substack.com/p/apple-cpu-gains-grind-to...


I don’t know Apple’s turnaround time, but processors are released in products only long after their design is completed. Think at least several months, likely even a year+ between design and release.

Nuvia started early enough to be a factor here. But Rivos wasn’t even founded until June 2021. To release now, the M2 would already have been finished with design by then.


Chip manufacturing is difficult; this reminds me of the Japanese entry into the semiconductor market.

There is an excellent video on this for anyone interested in Japanese culture and the semiconductor war against the USA:

https://youtu.be/bwhU9goCiaI


Reads like FUD to me


Likely it's a bit of hyperbole to get views.

I think there's always a desire to work at a startup in SV, and in a low/zero interest rate environment VCs could probably fund something in the chip design space.

But now that interest rates are going up, I think that will be a lot tougher, and Apple will be in a better position due to their direct access to free cashflow - to either compete with them or acquire them at a later date.

It's also an observation that, w.r.t. chip design and consumer electronics, the pay is generally lower than at, say, Google, Facebook, Salesforce, Web 2.0-based startups (e.g. Airbnb, Uber, DoorDash), etc.

My presumption is that this is because, as a chip designer or embedded software/hardware engineer, the capital costs to do anything interesting on your own as a startup (e.g. tape out a chip, mass production in Asia, etc.) are very, very high, very fixed and very up-front. Even fabless semiconductor companies and factory-less product design companies that outsource manufacturing to Asia would need to go find outside capital for IC masks or HW prototypes. You also need a cadre of supply chain, biz dev, marketing, ad spend, and channel sales distribution people.

Compare that to Airbnb or Dropbox, where you need a good idea, a handful of 10x SW engineers, an AWS account that can scale as you grow, and a free tier for onboarding customers. Therefore, Google/FB etc. need to pay more to prevent these folks from going off and starting their disruptor (e.g. Insta, WhatsApp, Salesforce).


++, totally. I think a lot of people are linking CPU engineering with SW engineering because they both work with the same product at the end of the day, but the industries are radically different from both a business and a culture standpoint. The "go fast and break things" mentality that pervades the SV software startup scene is, in my experience, nowhere to be found in hardware, because it's both incredibly costly to make any mistakes and because most CPU divisions are led by people with decades of experience (rather than the mishmash that is startups).

The author's argument here about talent leaving after having "gotten Apple off x64" is such an odd take. It's not as if Apple started designing these chips after the M1 launched—the pipeline for even small SoCs is often five or more years. The bit about Rivos is especially bizarre because that company was founded in 2021, well after this chip must have been taped out.


Yep - also was going to add but it was getting kinda long...

With respect to Rivos, reading the about page - it seems an interesting take on RISC-V.

My take is that this will be rolled back into either Apple or Google at a later date - mostly as a hedge against someone (like Nvidia) acquiring the ARM IP now that it's in play - or to provide some realistic alternative that can be used as a counter-bid in licensing discussions with ARM.

Two of the founders of Rivos were involved in PA Semi, which was acquired by Apple, and Agnilux, which was acquired by Google's Chromebook team.


I suspect we're looking at Apple implementing tick/tock, whether because they're forced to or because they want to - they've already been doing something similar on the iPhones, and supply constraint may make them do it on the chips, too.

Few people are going to upgrade from the M1 to the M2 anyway, so it makes sense to keep powder dry for the M3.


Intel's tick-tock alternated a microarchitecture change with a process shrink every other year.

It looks like the M2 is neither of those, and it's already been 2 years.


Work culture can mean a couple of things though. Building and delivering the M1 was probably a great experience. Maybe like the hardware equivalent of greenfield development. The M1 is out, and now it's about continual refinements. The people who love going from 0->1 are not always the same people who enjoy going from 1->100.

And while Apple isn't the top payer in SV, I'm sure they pay fine compared to other big tech. The issue is, chips are big right now, and no existing big tech can compete compensation-wise with shares in a growing chip startup. With VC funding drying up, I expect this to change back in Apple's (and other big techs') favor.


The youngest, strongest RTL engineer I know jumped from Apple to Rivos.



I'm still amazed at how jumping to a rival firm like this is possible in California; those of us in pretty much the rest of the country are locked behind non-competes.


That’s quite an exaggeration - many states disallow or severely limit non-competes, and in many of the states that allow them, they are often unenforced, or easy to get around. So yeah - some people in some parts of the country are locked behind non-competes (if they aren’t willing to move), but it’s hardly everyone.


That is a gross exaggeration of the situation.

California has a total ban on non-competes.

A very small handful of other states put restrictions on non-competes, but even those generally allow non-competition agreements if they are time-limited and the employee makes over ~$100k.

It’s widely accepted that prohibiting non-competes has been a significant factor in the tech industry’s success in California.

As one example, it is well known that Amazon aggressively enforces non-competes, even against line engineers.


Non-competes without a monetary attachment are hard to enforce from the employer side. Judges don't look kindly on preventing someone from making a living. Of course, companies hope the threat of going to court makes people back down - like Amazon, which is a known bully in this area.

But yes, I wish all states would just ban them outright. Or at least make them require compensation. If an employee is important enough to require a non-compete, then they are important enough to pay during the non-compete time period.

Does CA also ban them as part of an acquisition? I've seen them as part of the sale so everyone doesn't quit the day after the acquisition and start a competitor.


There are no non-competes of any kind in California, at all, ever.


Horace Greeley had the answer 150 years ago.


I am curious. Would it be possible to provide a more direct reference?



Thanks! Apparently, I knew the quote, but not the author of it.


My thought is that it doesn’t even matter that much. Apple is not selling me a CPU; they’re selling me a laptop. Yeah, it’s a critical piece, but Apple is all about your overall outcome.

If they were a CPU “arms dealer” like Intel or AMD it’d matter more I think.


A few comments here about how Apple is losing a lot of top talent to rival Rivos, a stealth startup.

What would Rivos’ business model be? I’m genuinely curious; it seems interesting to me.

Would they be positioning themselves as the next Qualcomm?

Or perhaps sell a superior chip to Apple at some point?


>What would Rivos business model be? Would they be positioning themselves as the next Qualcomm?

Realistically speaking, they are most likely to be an ARM-, IMG- or CEVA-type IP company.

But since it is RISC-V, and we are on HN, I won't be surprised if some people expect them to give out their design for free.


They could open up documentation and invite others to build an ecosystem around a RISC-V socket and a variety of motherboards. I don't know what their plans might be, but this is the kind of RISC-V ecosystem I'd like to see, like early x86.


Why does the biggest and richest company in the world ever have to suffer from talent leaving because they don’t get paid enough? It just doesn’t make any sense.


Also their push to try and make people work from the office again went down like a lead balloon.


Steve was notoriously and sometimes arbitrarily cheap. Apple retains some of that.


Shame that the market is slow to punish and inertia continues to reward.


Yet they’ll buy their own quarry or glass factory…


Because it's cheaper...


Raises are usually cheaper than recruiting, also raises are cheaper than quarries...


Things don't become the "est" anything by being wasteful.


Not even the poorest?


That’s such a contradictory statement given the facts I don’t even know where to begin.


My dad used to tell a story from when, as a kid, he worked at a gas station. A guy shows up in a Ferrari, asks for a full tank, pays with a large note, and my dad asks if he can keep the change. The guy replied:

- Boy, I've got a lot of money. Really, a lot. You know how I got it? I never gave anything away for free. Hand that change over.


That just illustrates that a lot of rich guys are entitled assholes. The correlation between assholery and driving an expensive car in particular has been studied.

https://youtu.be/1EHhFwGeQLc


I worked in the service industry in the UK for a few months, serving the richest and most privileged people out there. I got the chance to work events where members of the royal family were present, or where the most famous and rich people in the world were the guests.

My observation is that the assholes are everywhere, but so are the nice and polite people. I can't really generalize it for rich or poor; I did not see that simple a pattern.

At that time my hourly wage was about 8 pounds, and a lady at an extravagant event gave me 5 pounds and told me to take extra good care of the table. She somehow expected to have a private waiter for the night for 5 pounds sterling, but I took extra good care of it for about 45 minutes, and when she asked me why I wasn't working for her specifically any longer, I explained that 5 pounds will only do that much, and she agreed.

I recall once a very rich person screaming at the waiter because he did not like the foam on the coffee, and a few other instances of rudeness, but overall these were rarities.

If anything, the managers were much, much bigger arseholes towards the employees, because they could afford to be (the employees were mostly students, or immigrants like me who needed the money to sustain life until they found a proper job). Employees with higher status were big assholes towards the more junior ones.

Most social interactions with the rich or famous that I had or have seen were very positive and polite.

In some instances I was at fault and they were very understanding and tolerant. Once I failed to deliver the coffee of a famous F1 racer at breakfast and he didn't make a big deal of it (if I were him, I would probably have been much more rude). Victoria's Secret models were just fine too when they received flat champagne.

I'm not convinced that rich people being assholes in social interactions is a real thing. IMHO the pattern is that people who are privileged within their own social group are the assholes.


> I can't really generalize it for rich or poor, I did not see that simple pattern.

My SO works as a consultant in a bank here in Rome, Italy.

She moved from a bank in the periphery to a very central one in the Parioli neighborhood.

There was a night and day difference between her old and new clients in wealth (with the Parioli ones being largely millionaires).

Old clients would treat her with the utmost respect, call her doctor ("dottoressa"), and always listen to what she had to say. New ones were on average much more rude, demanding and overall uneducated. She would have to explain to them that she couldn't activate some service for them because she needed their signatures, and they would go all mad and call her director or some friend in the bank.

They are, on average, much worse people, and they're also much more money-conscious.

Another anecdote she told me was how some rich woman wanted to set up a bank account for a non-profit to send money to some African country. Not only was there no way to explain to her that it was not that easy to do such operations, especially for large sums, because they automatically trigger money-laundering controls (she would just not listen and blame her), but the client was also MAD that she had to pay an 8-euro commission on a 60k+ euro wire transfer, expecting it to be free because it was a "non-profit".

Yes, there are good and bad people in each wealth tier, but rich people on average are much worse assholes. There's no comparison.


When working with money, the rich are also more likely to hit the countless rules and limitations the banking regulations impose on us "for our own good".

Just like, as a programmer, I go mental when encountering absurd and ineffective account password rules, let's say (one special char, one upper case, one non-letter, etc.), while a lay person would just sigh and comply.


Exactly. On the other side of this, the rich person should have learned how to get better banking service that doesn't encumber them with these fund-movement limitations.

Most "anti money laundering" or "security" stuff is actually just that one bank's poor and inaccurate implementation of a law. Most of it is just company policy and nothing related to the law.

With electronic funds, the entire banking system relies on assuming that the prior and next banks have already done the necessary checks,

because the law only creates a firewall of reporting at the deposit and withdrawal of physical notes (it's the same across Europe, across the US, and elsewhere).


This has nothing to do with "better banking services".

There are laws in Italy, and there are very specific amounts you can move per month before controls have to be triggered.

A 60k transaction abroad is 12 times what you can transfer without declaring exactly what the money is from and where it is from, especially when sending money to and receiving money from African countries.

Tax evasion and laundering are high in Italy, and banks readily deny you their services if they smell something.


Some banks waive fees for non-profits.

Your bank did not.

One of my biggest pet peeves is how low-level employees can't tell that their organization isn't doing the normal thing.


Wouldn't surprise me if the lady did not have a tax ID associated with a nonprofit.


Bingo, she wanted to send 60k to that fresh account and then move it to Africa, which raised multiple suspicions.


People in privileged positions often have the support of others like themselves. "Enablement" is probably a more accurate term.

There's an old family in my town that came from the kind of wealth that, for a few generations, saw each of their children married into important or powerful families across the state. Today, the main family has no income other than what they inherited, but they maintain their position and membership in society by being horrible to deal with. The center of the family is a vile gossip who has nothing but time to hear about everything that happens and think up ways to use it to her advantage.

They're notorious for showing up to functions uninvited, sitting at your table and ordering, and leaving before the bill comes. They hire the best local artisans and builders, complain to everyone about how shoddy the work is until they get extras for free, and then never pay, threatening to sue over imagined problems. When the grandchildren were in school, the family would try to walk into functions without tickets because "their child was performing", as if no one else's were.

When their daughter married a pro athlete, no one in town would build them a house, so they had to hire from other parts of the state. Their reasoning? No one in town was skilled enough to build them what they wanted.

They wrote a letter of complaint to the White House about a cavalcade driving through town during a family member's wedding reception and were sent an apology and a bottle of champagne by the POTUS. The family apparently sent back a letter letting him know that they hadn't voted for him.

No one here even needs TV. Just hold a dinner party at a place they like and they'll show up and entertain for the cost of a few drinks and a meal.


I think we can point out horrible people from all kind of backgrounds. Wealth can definitely amplify their impact on others.


This could be self-selecting: entitled assholes with money buy Ferraris while non-asshole rich people drive normal cars.

How would you know that the person in the normal family car is rich?


anybody would say something snarky if the clerk asks whether they really have to give you the rest of your money back

summer child labor conscript: your total is $15 and your change is $85, lemme keep that

you: ….. uhhhh you kidding me?

audience: rich people are assholes!


The more we automate the better. Computers don't ask to keep the change.


Instead, computers will keep the change without asking and point you to an unreachable (or unhelpful) customer support number. You’ll either give up on trying to recover your change or you’ll regret not having done it sooner, after spending more of your time and sanity than the money was worth.


I doubt that. I don't tip on online purchases. Do you?


Do you buy gas online? The thread is about asking to keep the change at a gas station. Changing it to online purchases is moving the goalposts.


Google doesn't ask whether you agree to account suspension either.


You can get kicked out of a restaurant/hotel/casino as well


When you are kicked out of an establishment, more often than not they will first ask you to leave and give you an explanation. It’s in their best interest to handle the situation as calmly as possible so you don’t cause a scene and make them lose business. You have leeway to contest the decision. Even if you’re banned, it doesn’t affect your life much.

Getting kicked out of your Google account can mean losing access to all your other online accounts. That’s disruptive to your life and you might waste weeks or months dealing with a situation where you have no recourse because you can’t reason with a human.

It’s routine that people get their Google accounts banned without understanding why, and thus can’t fix it. When you’re kicked out of a physical location, you’ll know why.


[flagged]


The assholery part here is not about who keeps the change. It's about telling a gas station boy "that's how you got rich". What makes you rich is first of all being born rich, which most rich people tend to ignore. In the few cases where we are really talking about people with humble beginnings, it's all about making your time count. If the guy was really rich, the time it would take the boy to go inside, take 2 dollars from the cash register, and hand them over was just not worth it. It was all self-aggrandising bravado.


I doubt it's even a real story. I've heard variations of it. The basic idea is "just because I'm rich doesn't mean I'm giving stuff away; how do you think I got (and stay) rich?". The Ferrari, the $50 note, and other details are just embellishments in this version of the tale.



I think the problem here is that family members of living rich people aren't considered "rich people" themselves, but may still have the ability to spend the money.


The majority of wealthy Americans were not born rich. Almost half of Americans own stock and a significant portion of retired Americans are millionaires.


A retired American with a $1M+ in their retirement account but who needs to stretch that over 20+ years isn’t what I’d consider rich.


> I know! People shouldn’t feel entitled to their own change! It’s such a dick move.

Back when I was driving cabs I just assumed whatever they handed me was mine and if they wanted the change they had to ask for it. Mostly worked but plenty of times I gave them their change back and got no tip.

My absolute favorite was this guy taking a girl out to the fancy part of town and making a big deal about giving me a bunch of money then as soon as she was out of earshot asking for the change. Guess you gotta pay for those overpriced drinks somehow.

Some people just don't ever tip, and when several non-tipping profiles collide in a single person who nonetheless leaves a tip, it messes with your mind a bunch.


If someone asked to keep my change I would say no too.

Why the fsck? Is it normal to beg during work where you’re from?


> Is it normal to beg during work where you’re from?

Yes - but they call it tipping 'round these parts. They even have prominently displayed tip jars and everything.


Wait staff at restaurant asking "would you like change?" is normal.

Gas station clerk asking "can I keep your change?" is so random and unexpected it makes the story confusing.


Oh I tip plenty; but I’ve never been asked for a tip…


Reminds me of the Simpsons episode where Bill Gates "buys out" Homer's internet business.

"Well I didn't get rich by writing a lot of cheques!"

https://m.youtube.com/watch?v=H27rfr59RiE


Interesting. Was it common in the past, or in your region, for the service worker to actually ask for the change? I've never heard of that, or experienced it in my own life. Usually the customer plays the only active part in the "keep the change" interaction.


Why would compensation be the decisive factor for top talent in a highly compensated field? They have probably already made enough money that they don't have to work for living, and they are probably genuinely interested in their work because they made it to the top. Apple processors are already in the market, so the most interesting work is done, and it may be time to start looking for new challenges.


The article specifically lists culture and money as the reasons people are leaving. It also adds that there is a need for a challenge amongst a successful and accomplished group.

“ The bleeding hasn’t stopped in recent years as Apple’s work culture simply isn’t the best and other firms, namely the hyperscalers such as Google, Microsoft, Amazon, and Meta, are paying more than Apple was to poach talent.”


If you consider Maslow's hierarchy of needs, "esteem" sits near the top. Money becomes a proxy for it.

Top management negotiate higher pay all the time not because there is any real difference between earning 30 or 40 million, but because they need to feel their own "value" go up.

This is discussed at length in "Thinking, Fast and Slow", IIRC.


Spending goes up with compensation in general. Then it becomes the norm and you don't want to lose it. Sure, you can retire, but probably not at a level where you can fly first class everywhere, stay in five-star hotels, and vacation on private islands.

That aside, once people have kids the sky is the limit for giving them the "best life possible": a nanny, private tutors, private schools (and/or a house in Cupertino, since it's Apple), college funds, a house with good amenities nearby, etc. I probably missed some costs in there.


Private schools aren't the norm in the Bay Area; people just fund their local public schools more and then keep out anyone they don't like via NIMBYism.

They're more common in SF because of the school lottery system, which can assign you to a school whether or not you can actually get there on time every day.


New companies offer an absence of technical and cultural debt.

Nuvia was purchased for $1.4 billion by Qualcomm, a couple of years after being started.


Jony Ive would call it courage.


They lack a culture of advancing technology per se, as opposed to making use of it.


So you claim that 10 well-paid engineers are better than 15 averagely paid engineers?


Apple has enough money that they don’t have to reduce headcount to be able to afford higher salaries.


Well they aren't the richest and certainly aren't the biggest.


Depending on how you define richest, Apple could very well be the richest. It has something like 200 billion dollars in cash on hand. I would be surprised if any company could match that.


Often unreported, Apple also has $100B in debt. Google and Microsoft have a better net cash and cash equivalents position than Apple.

https://www.macrotrends.net/stocks/charts/AAPL/apple/long-te...


Taking on debt was the only way they could effectively tap that cash at home while avoiding the repatriation tax.


Can you explain the process? I can't see how it avoids tax. I assume Apple's debt is credit lines to fund inventory.


Repatriating cash earned outside the U.S. involves a hefty tax (afraid I don't recall the number). Occasionally the government offers a tax holiday to encourage companies to do so.

So, rather than bring the money home at the high rate, Apple has been taking on debt for U.S. operations while waiting for (and perhaps lobbying for) another tax holiday.


And I hope you know that the phrase "repatriating" implies something false. The cash is not stuck in a box overseas. Even though Apple (and Google and others) would transfer their profits to a company they set up overseas where there is a lower tax rate, that money ends up in banks and holdings in the US. The hitch, for Apple or Google, is that those monies have restrictions on their use, e.g., they can't use that pile of cash to build a factory or pay dividends. But what they can do is use that money as collateral for a loan of nearly equivalent size that is not so encumbered. It just becomes a game for them to avoid paying most taxes until they can get an administration/Congress that gives them a tax holiday, so they can convert all that deferred money back into normal cash holdings at a much lower tax rate.


Just an example: with a 30% tax rate and a 0.7% corporate bond rate, it's a no-brainer choice for them.
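
To make the comparison concrete, here's a rough back-of-the-envelope calculation using those two rates; the $10B principal is purely hypothetical, not Apple's actual figure:

    # Illustrative only: rates from the comment above, principal made up.
    overseas_cash = 10_000_000_000      # hypothetical amount wanted in the US
    repatriation_tax_rate = 0.30        # one-time tax if the cash is brought home
    bond_rate = 0.007                   # annual interest if they borrow instead

    tax_cost = overseas_cash * repatriation_tax_rate    # $3.0B, paid once
    annual_interest = overseas_cash * bond_rate         # $70M, paid per year

    # Borrowing stays cheaper until cumulative interest exceeds the tax hit.
    breakeven_years = tax_cost / annual_interest         # ~43 years at these rates

    print(f"repatriation tax: ${tax_cost / 1e9:.1f}B")
    print(f"annual interest:  ${annual_interest / 1e6:.0f}M")
    print(f"break-even after: {breakeven_years:.0f} years of borrowing")

So even before any tax holiday, they'd have to carry the debt for roughly four decades before the interest cost as much as paying the tax up front.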


There are downsides to paying a lot:

* you attract/retain more people that are interested in money/status.

* the employees become entitled.

Also, just like Apple's customers are OK with paying a premium price because it's Apple, employees are OK with paying a premium price to be an employee of Apple (by accepting lower salaries).


Economists call it 'psychic income'; it's often cited as a reason why salaries for teachers are not higher.


John Kenneth Galbraith called this phenomenon "convenient social virtue".


> The bleeding hasn’t stopped in recent years as Apple’s work culture simply isn’t the best and other firms, namely the hyperscalers such as Google, Microsoft, Amazon, and Meta, are paying more than Apple was to poach talent.

I really hate that word “poach”. Using “attract” works much better in that sentence.

I find it appalling how entering into a free contract with someone to give them more money for their work is called “poaching”.

Words matter, and how we describe something has an impact in how it is viewed (“piracy” is another example).


Poaching expresses a specific viewpoint that was challenged by the U.S. Department of Justice, https://arstechnica.com/tech-policy/2014/06/should-tech-work...

> A group of big tech companies, including Apple, Google, Adobe, and Intel, recently settled a lawsuit over their "no poach" agreement for $324 million. The CEOs of those companies had agreed not to do "cold call" recruiting of each others' engineers until they were busted by the Department of Justice, which saw the deal as an antitrust violation. The government action was followed up by a class-action lawsuit from the affected workers, who claimed the deal suppressed their wages.


It’s corporate America. “Performance improvement plans” and “at will states” are also part of the same lexicon. (Judge that as you will.)

On poaching though, Apple could choose to respond to the market signal by improving their work culture and policies to retain talent. I do wonder about the supply for these highly skilled hardware engineers though.


So what is it? Work culture or pay?

How many really move from Apple to Amazon for the work culture?


Seems like quibbling over words that everybody understands the meaning and intention of


Words matter. Apple failed to retain these employees with cash; it's not some hunting game where you catch some unsuspecting engineer.


... who is owned by some other firm.


Yeah, they matter, and everybody* understands what "poaching" means when used this way

* - including you


It’s dated, and the subject has shifted to processor design, but

The Apple Product Cycle (https://misterbg.org/AppleProductCycle.html)

still sums it all up pretty well.


Apple generally pays very poorly relative to the quality of its talent.

I would love to live in a world where 10x engineers are rewarded 10x. Right now it's 25% better pay than the median.


Most 10xers aren't constantly 10x. You might have a 10x year and then a 1.1x year.

With tech it's a bit of a catch-22: most engineers only really become effective after 18 to 24 months. At that point you know how to really get things done in your org.

But after 2 years you can job hop and make significantly more, so your interest might not align 100% with the company's


This. I'll make a product for someone in a few weekends, working 18 hours a day for a few weeks straight.

Then I won't have the energy for a few months.

Probably evens out for an employer, but IMO you do better when the snowball keeps rolling longer without breaks.

That's the secret to 10x. ADHD and don't stop. Stopping is the enemy, you can't see the forest.

I hear regulation is nice, but tell my mind that.

Been at my employer for 10 years making cool shit if that matters.


You think it's rough being 10x. Try being 10,000x.


Sleep deprivation is the limit. Whatever multiplier you are, you're capped by that.


If you want 10x pay you need to start your own company. There's no other way to get 10x.


Excellent read on the realities of making an SoC.

With every SoC release I keep my eyes open for MTE being used in a mainstream ARMv8.5 processor... If we're to believe that the M3 is slated to use ARMv9 as well, maybe 2024 is the year?


What does the M2 mean for Linux support? Any porting needed?


If the author had taken the time to define some acronyms, this would have been way more accessible to a layman. Had to give up halfway through.


As a casual reader, I notice the "die shot bleeding" and wonder if that was intended. I hope it wasn't, but I also hope not to have to wonder this about the posts here.


What do you mean? "die shot bleeding"?


Author here. What do you mean die shot bleeding?



