In my first proper job as a software engineer I wrote a bunch of Forth for "fruit machines". I don't know what the US equivalent would be but they are low stakes gambling machines which are quite common in UK pubs. The core processor was a 6809 and Forth was chosen because the interpreter was super small and easy to implement. I really appreciated the quick interactive way you could update and tweak code as you tested it. I did get slightly weary of having to keep the state of the stack in my head as I DUP and SWAP stuff around, but that was probably due to my inexperience and not decomposing things enough.
They continued to use Forth as the basis for their 68000-based video gaming machines, although when it came to the hand classifier for video poker we ended up using C - mostly because we wanted to run a lot of simulations on one of these newfangled "Pentium" processors to make sure we got the prize distribution right to meet the target repayment rate of ~98%.
The talks have also now gone up on the YouTube channel: https://www.youtube.com/@kvmforum6546/videos including my keynote based on the top QEMU stories on Hacker News for the year ;-)
The upstream now has TCG plugins (https://qemu.readthedocs.io/en/latest/devel/tcg-plugins.html) which allow for a degree of instrumentation. The implementation is architecture agnostic and also tested within the code base. There are still features missing but it does provide a base for dynamic analysis of guest code.
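For a flavour of what a plugin looks like, here is a minimal instruction-counting sketch in the style of the examples shipped in contrib/plugins. Treat it as illustrative rather than definitive - the exact signatures live in qemu-plugin.h and have evolved between releases:

    #include <stdio.h>
    #include <inttypes.h>
    #include <qemu-plugin.h>

    QEMU_PLUGIN_EXPORT int qemu_plugin_version = QEMU_PLUGIN_VERSION;

    static uint64_t insn_count;

    /* Called whenever a translation block is translated: attach an
       inline counter increment to every guest instruction in it. */
    static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
    {
        size_t n = qemu_plugin_tb_n_insns(tb);
        for (size_t i = 0; i < n; i++) {
            struct qemu_plugin_insn *insn = qemu_plugin_tb_get_insn(tb, i);
            qemu_plugin_register_vcpu_insn_exec_inline(
                insn, QEMU_PLUGIN_INLINE_ADD_U64, &insn_count, 1);
        }
    }

    static void plugin_exit(qemu_plugin_id_t id, void *p)
    {
        char buf[64];
        snprintf(buf, sizeof(buf), "insns: %" PRIu64 "\n", insn_count);
        qemu_plugin_outs(buf);
    }

    QEMU_PLUGIN_EXPORT int qemu_plugin_install(qemu_plugin_id_t id,
                                               const qemu_info_t *info,
                                               int argc, char **argv)
    {
        qemu_plugin_register_vcpu_tb_trans_cb(id, vcpu_tb_trans);
        qemu_plugin_register_atexit_cb(id, plugin_exit, NULL);
        return 0;
    }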
The plugins have access to the instruction stream to make architecture specific decisions. What I meant by architecture independent is that it doesn't require per-guest annotations in the frontends - any guest using the common translator loop (which is all of them now) can be instrumented by plugins.
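If you want to try one, build the plugin as a shared object and load it on the command line, something like (libinsncount.so being the hypothetical plugin built from the sketch above):

    qemu-aarch64 -plugin ./libinsncount.so -d plugin ./mytest

where -d plugin routes the plugin's output into the normal QEMU log.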
However I absolutely agree it's not currently as full-featured as we would like. The next step, when I get time, is refactoring the handling of register values in the core QEMU code so we can expose them to the plugins via a clean API.
The category "fish" isn't a clade - it's possible to evolve to no longer be a fish - so it's more comparable to a specific generation of ARM chips, like perhaps ARM32, than to the ARM line in general. It would be weird to say "64-bit ARMv5" in the same way that it would be weird to say "lactating fish". But it is not weird to say "64-bit ARM" for the same reason it isn't weird to say "lactating euteleostome."
I gather that it's true that ARM hasn't been as good about backwards compatibility as some of its competitors, but was ARMv8 really so much of a jump from ARMv7 that one can't count it as part of the same line of processors anymore?
They weren't horrible either: AArch64 is incompatible with AArch32, but you can still implement both on the same chip with shared internals.
AMD didn't have to extend x86 the way they did, but without buy-in from Intel there was no way forward unless they went the route they did. Unless both had agreed to shift to UEFI at the same time and agreed on an ISA, it wasn't going to happen. This is why even a modern x86-64 processor has to boot up in real mode... there was no guarantee that the x64 extensions were going to take off, so AMD had to maintain that strict compatibility to be competitive.
AArch64 had no such constraint, because there is no universal boot protocol for ARM. As long as the UEFI implementation or boot loader leaves the CPU in a state the OS can use, it's fine. The fact that there is one IP holder helped as well.
That said, could AMD make an x86-64 processor without real mode or compatibility mode support? Yes, they could. In fact I would hope that the processors they ship to console manufacturers fit that bill. There is a lot they could strip out if they only intended to support x86-64.
Short answer is yes. To give just one significant example: all instructions are 32 bits long and there is no Thumb.
If you read Patterson and Hennessy (Arm edition) there is, I think, a slightly wistful throwaway comment that AArch64 has more in common with their vision of MIPS than with the original Arm approach.
Elsewhere you've commented that it's more similar to x86 -> x64 than x86 -> Itanium - which may be true, but Itanium was a huge change. However, AArch64 is philosophically different to 32-bit Arm, so it's not really like x86 -> x64 at all, which was basically about extending a 32-bit architecture to be 64-bit.
There's a sort of category problem underlying what you're saying though, perhaps fueled by the fact that ARM has more of a mix-and-match thing going on than Intel chips do.
aarch64 isn't really an equivalent category to x64, because it describes only one portion of the whole ARMv8 spec. ARMv8 still includes the 32-bit instructions and Thumb. I realize you did mention Thumb, but you incorrectly indicated that it doesn't appear at all in ARMv8. As a counterexample, Apple's first 64-bit chip, the A7, supports all three instruction sets. This was how the iPhone 5S, which had an ARMv8 CPU, was able to natively run software that had been compiled for the ARMv7-based iPhone 5.
A better analogue to aarch64 would be just the long mode portion of x64. The tricky thing is that ARM chips are allowed to drop support for the 32-bit portions of the ISA, as Apple did a few years later with the A11. Like leeter said in the sibling post, though, x64 chip manufacturers don't necessarily have the option to drop support for legacy mode or real mode.
I think that's a fairly important distinction to make for the purposes of this discussion. I wasn't ever really talking about just aarch64; I was talking about all of ARM.
> Not only is it an incumbent switching to another architecture; it's an incumbent switching to another incumbent architecture. ARM is older than PowerPC and almost as old as the Macintosh itself; it came out in 1985.
> I gather that it's true that ARM hasn't been as good about backwards compatibility as some of its competitors, but was ARMv8 really so much of a jump from ARMv7 that one can't count it as part of the same line of processors anymore?
> I wasn't ever really talking about just aarch64; I was talking about all of ARM.
M1 is AArch64 only. You incorrectly brought ARMv8 into the discussion. AArch32 is irrelevant in the context of the M1.
It's fair to highlight the worse backwards compatibility, but then you can't bring back AArch32 - which Apple dropped years ago - to try to claim that the M1 somehow uses an old architecture.
Is it? It's not like Apple moving MacBooks to M1 happened in a vacuum. M1 is only the latest in a whole series of Apple ARM chips, about half of which were non-aarch64.
That context actually seems extremely relevant to me; it demonstrates that Apple is not just jumping wholesale to a brand new architecture. They migrated the way large companies usually do: slowly, incrementally, testing the waters as they go. And aarch64 was absolutely not involved in the formative stages (which are arguably the most important bits) of that process. It hadn't even come into existence yet when Apple released their first product based on Apple Silicon. Heck, you can make a case that the process's roots go way back before Apple Silicon, all the way back to ~1990, when Apple co-founded ARM to get a chip for the Newton.
Note, too, that the person I was originally replying to didn't say "M1", they said "Apple Silicon." In the interest of leaving the goalpost in one place, I followed that precedent.
AIUI that is mainly achieved by running the native library functions. Given how much time you spend in library functions, I'm not surprised it has an edge over the full translation of the application.
We do use the host FPU for a subset of floating point operations now. However, it only really works for clean 32/64-bit IEEE FP, so anything that goes through the x87 still needs software emulation.
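The trick - "hardfloat" in QEMU's terminology - is roughly: if the inputs are plain normal numbers (and the emulated inexact flag is already sticky, so you never have to detect rounding on the host), do the operation on the host FPU; otherwise fall back to softfloat. A very rough sketch of the idea, with made-up names rather than QEMU's actual internals:

    #include <math.h>
    #include <stdbool.h>

    /* Full bit-exact software implementation, assumed to exist elsewhere. */
    double soft_float64_add(double a, double b);

    /* NaNs, infinities and denormals all carry flag and NaN-propagation
       semantics the host FPU won't reproduce exactly, so only hand
       normals and zeros to the host. */
    static inline bool host_safe(double x)
    {
        int c = fpclassify(x);
        return c == FP_NORMAL || c == FP_ZERO;
    }

    double emulated_float64_add(double a, double b)
    {
        if (host_safe(a) && host_safe(b)) {
            double r = a + b;      /* let the host FPU do the work */
            if (host_safe(r)) {    /* overflow/underflow needs the fallback */
                return r;
            }
        }
        return soft_float64_add(a, b);
    }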
Since v4.0.0 (see https://git.qemu.org/?p=qemu.git;a=commitdiff;h=a94b783952cc...). We are always up for improving the code generation quality, but of course there is a trade-off with JITs given we are not compilers. I suspect there are still big wins if we can come up with a reasonable solution for re-generating a hot path of basic blocks with much better optimisation.
Thank you. A tiered JIT would indeed make many more optimisations possible.
I made a QEMU backend targeting LLVM before, but that turned out to be way too heavy to be usable. I wonder if it's worth revisiting that idea nowadays...
I should point out that the awesome part of this work is that, unlike a lot of academic exercises, it was done in collaboration with the community. As a result much of the work has been merged upstream and QEMU gets to benefit.