Actually, in this case it would. Consider the layout (byte-by-byte): BE: t8 t7 t...

weinzierl · on Sept 16, 2024

I think we agree, but it nags me that I still can't follow your line of thought.

Do you want to:

- Compare just the timestamp, so

    1970-01-01 00:00 0x01
    1970-01-01 00:00 0x00
    1970-01-01 00:00 0x01
    1970-01-01 00:01 0x01
    1970-01-01 00:01 0x00
    1970-01-01 00:01 0x01

could be a valid ordering, with the first three and last three in arbitrary ordering, because the checksum doesn't play a role.

- Compare timestamp and checksum, in the sense of ordering all files with the same checksum by timestamp, like this

    1970-01-01 00:00 0x00
    1970-01-01 00:01 0x00
    1970-01-01 00:02 0x00
    1970-01-01 00:00 0x01
    1970-01-01 00:01 0x01
    1970-01-01 00:02 0x01

- Compare timestamp and checksum, in the sense that files with the same timestamp are ordered by checksum, in effect grouping equal checksum files together under their respective date.

    1970-01-01 00:00 0x00
    1970-01-01 00:00 0x01
    1970-01-01 00:00 0x02
    1970-01-01 00:01 0x01
    1970-01-01 00:01 0x02
    1970-01-01 00:01 0x02
    1970-01-01 00:01 0x03

In the first case you could just compare the first 64 bits, so I don't think that's it. The second case would be an advantage for little-endian, so it doesn't support your argument. The third case supports the argument for BE, but is an unusual thing to want.

In other words: Is the checksum crucial for your line of argumentation, or could you make your point with just a timestamp? If it isn't crucial, why not compare just the first 64 bits? If it is, I don't follow why BE is better in this case.

kstenerud · on Sept 16, 2024

Basically, (and this is getting really esoteric at this point), if you use big endian byte ordering in your data structures when saving to disk, then you can place items in order of descending "sorting order" importance at the beginning of your file. Anyone wishing to sort such files wouldn't need to know anything about the actual structure of the file, or what is stored where. They could simply choose an arbitrary number of bits to read (say, 512 bits), do a big endian sort based on that, and it will always come out right (even though they're technically reading more than they have to).

    struct myfile {
        uint32_t year;
        uint8_t month; // Assuming packed structs here
        uint8_t day;
        uint32_t seconds;
        uint16_t my_custom_ordering;
        uint8_t some_flags;
        uint64_t a_checksum_or_something;
        char name[100];
        ...
    };

Reading the first 64 bytes from this file will give year, then month, then day, then seconds, then my_custom_ordering, then some_flags, then a_checksum_or_something, then the first few bytes of name (assuming we used big endian byte ordering). The extra bytes won't hurt anything because they're lower order when we compare.
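A minimal sketch of that idea, assuming the header fields are written to disk most-significant byte first in the order the struct declares them. The helper names (`put_be32`, `put_be16`, `encode_sort_prefix`, `compare_prefix`) and the 16-byte prefix length are hypothetical, picked only for illustration:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: store an integer most-significant byte first. */
static void put_be32(uint8_t *p, uint32_t v) {
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

static void put_be16(uint8_t *p, uint16_t v) {
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)v;
}

/* Arbitrary prefix length: 12 sortable bytes plus 4 trailing bytes. */
enum { PREFIX_LEN = 16 };

/* Write the leading fields in declaration order, big endian. */
static void encode_sort_prefix(uint8_t out[PREFIX_LEN],
                               uint32_t year, uint8_t month, uint8_t day,
                               uint32_t seconds, uint16_t my_custom_ordering,
                               uint8_t some_flags) {
    assert(out != NULL);
    memset(out, 0, PREFIX_LEN);
    put_be32(out + 0, year);
    out[4] = month;
    out[5] = day;
    put_be32(out + 6, seconds);
    put_be16(out + 10, my_custom_ordering);
    out[12] = some_flags; /* bytes 13..15 stand in for whatever follows */
}

/* The sorter knows nothing about the layout: it compares raw bytes. */
static int compare_prefix(const uint8_t *a, const uint8_t *b, size_t n) {
    return memcmp(a, b, n);
}
```

Calling `compare_prefix` with any length of 12 or more gives the same ordering whenever two records differ in a sortable field, because the trailing bytes are only consulted when all 12 sortable bytes are equal.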

To do this with little endian ordered data, you would have to:

1) Reverse the ordering of the "sortable" fields to: my_custom_ordering, seconds, day, month, year

2) Know in advance that you have to read exactly 12 bytes (no more, no less) from any file using this structure. If you read any more, you'll get random ordering based on the reverse of what's in the "name", "a_checksum_or_something", and "some_flags" fields (because they comprise the "higher order" bytes when reading little endian).

3) If you were to add another field "my_extra_custom_ordering", you'd have to adjust the number of bytes you read. With big endian ordering, you can still read 64 bytes and not care. You'd only care once your "sortable fields" exceed 64 bytes, at which point you'd read, say, 100 bytes to be completely arbitrary... It doesn't matter because with BE everything just sorts itself out.

The comparator function is also much simpler with BE: Just do a byte-by-byte compare until you find a difference. With LE, you have to start at a specific offset (in the above case, 11) and decrement towards 0.
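As a rough sketch of the two comparators, assuming each record has already been read into a byte buffer (the function names `compare_be` and `compare_le` are made up for this example, and the LE variant assumes the reversed field order from step 1):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Big endian: scan forward from byte 0. Any length n works as long as
   it covers the sortable fields; extra bytes only break ties. */
static int compare_be(const uint8_t *a, const uint8_t *b, size_t n) {
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++) {
        if (a[i] != b[i]) return a[i] < b[i] ? -1 : 1;
    }
    return 0;
}

/* Little endian: start at the last sortable byte (offset sortable_len - 1)
   and walk down to 0. sortable_len must be exact, e.g. 12 for the struct
   above; a larger value makes unrelated fields the most significant. */
static int compare_le(const uint8_t *a, const uint8_t *b, size_t sortable_len) {
    assert(a != NULL && b != NULL);
    for (size_t i = sortable_len; i-- > 0; ) {
        if (a[i] != b[i]) return a[i] < b[i] ? -1 : 1;
    }
    return 0;
}
```

With `compare_le`, passing 13 instead of 12 would let the byte at offset 12 (outside the sortable fields) decide the result first, which is the "random ordering" problem from step 2.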

weinzierl · on Sept 16, 2024

That made it click. Thanks a lot for your patience and the detailed explanation.