
I'm wondering if anybody has tried adding PNG filters[0] on top of QOI[1]. I have an idea of how to do it, and it should be fairly easy to build on top of a QOI encoder/decoder, but I haven't gotten around to trying it out myself.

Per the QOI spec, it already has the "SUB" filter built in, in the form of QOI_OP_DIFF. That's great, because it leaves exactly four PNG filters instead of five (meaning a filter choice fits in two bits): NONE, UP, AVERAGE and PAETH.

Here's how to add the filters: on a per-line basis, create a transformed line for each filter (the original pixels, plus the UP, AVERAGE and PAETH transformations). Then see how each of those candidate lines would compress, treating each as if it were the original pixel input, and pick the best one. In a separate array we keep track of which filter we picked for each line.
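
A minimal sketch of that per-line search, assuming RGBA byte rows. The filter definitions follow the PNG spec; the cost function is a stand-in of my own (summed residual magnitude), where a real encoder would feed each candidate through QOI and compare actual output sizes:

```python
def paeth(a, b, c):
    # PNG Paeth predictor: a = left, b = up, c = upper-left.
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    if pa <= pb and pa <= pc:
        return a
    return b if pb <= pc else c

def filter_line(line, prev, mode, bpp=4):
    # line/prev: flat lists of channel bytes; prev is all zeros for row 0.
    out = []
    for i, x in enumerate(line):
        up = prev[i]
        left = line[i - bpp] if i >= bpp else 0
        ul = prev[i - bpp] if i >= bpp else 0
        if mode == "NONE":
            pred = 0
        elif mode == "UP":
            pred = up
        elif mode == "AVERAGE":
            pred = (left + up) // 2
        else:  # "PAETH"
            pred = paeth(left, up, ul)
        out.append((x - pred) & 0xFF)
    return out

def pick_filter(line, prev, bpp=4):
    # Brute-force all four candidates and keep the cheapest one.
    def cost(res):
        # Signed magnitude of each residual byte: a crude proxy for
        # how well QOI_OP_DIFF-style coding would compress it.
        return sum(min(v, 256 - v) for v in res)
    candidates = {m: filter_line(line, prev, m, bpp)
                  for m in ("NONE", "UP", "AVERAGE", "PAETH")}
    best = min(candidates, key=lambda m: cost(candidates[m]))
    return best, candidates[best]
```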

When we've done this for the entire image, we basically have a plain QOI image, except that decompressing it yields each line in its transformed form. To recover the original image we just have to reverse these transformations bottom-to-top.
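
The bottom-to-top in-place undo might look something like this sketch. Note the assumption baked in: it only works as a single in-place pass if the encoder also filtered in place top-to-bottom, so that each line's cross-line predictors read the already-filtered line above it (names and the `picks` list are hypothetical):

```python
def paeth(a, b, c):
    # PNG Paeth predictor: a = left, b = up, c = upper-left.
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    if pa <= pb and pa <= pc:
        return a
    return b if pb <= pc else c

def unfilter_in_place(rows, picks, bpp=4):
    # rows: mutable per-line lists of channel bytes, as produced by a
    # plain QOI decoder; picks[k] is the filter applied to line k.
    for k in range(len(rows) - 1, -1, -1):
        line = rows[k]
        prev = rows[k - 1] if k > 0 else [0] * len(line)
        mode = picks[k]
        for i in range(len(line)):
            up = prev[i]  # still filtered: we haven't touched row k-1 yet
            left = line[i - bpp] if i >= bpp else 0  # already undone
            ul = prev[i - bpp] if i >= bpp else 0
            if mode == "NONE":
                pred = 0
            elif mode == "UP":
                pred = up
            elif mode == "AVERAGE":
                pred = (left + up) // 2
            else:  # "PAETH"
                pred = paeth(left, up, ul)
            line[i] = (line[i] + pred) & 0xFF
```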

So we just have to append that array where we kept track of the filters we picked for each line, four picks per byte. Since we know the height of the image we don't need to mark the end of this stream. The added size overhead is effectively one byte per four lines. I expect speed overhead would be roughly 4x for encoding (since we brute-force try four transformations), and something very tiny for decoding (since all we need to do in the end is one single in-place pass over all pixels to undo the transformations).
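
Packing and unpacking that trailing array is trivial; the 2-bit code assignment below (NONE=0, UP=1, AVERAGE=2, PAETH=3) is my own assumption, not part of any spec:

```python
CODES = {"NONE": 0, "UP": 1, "AVERAGE": 2, "PAETH": 3}
NAMES = ["NONE", "UP", "AVERAGE", "PAETH"]

def pack_picks(picks):
    # One 2-bit filter code per line, four lines per byte,
    # appended after the regular QOI stream.
    out = bytearray()
    for k in range(0, len(picks), 4):
        b = 0
        for j, name in enumerate(picks[k:k + 4]):
            b |= CODES[name] << (2 * j)
        out.append(b)
    return bytes(out)

def unpack_picks(data, height):
    # The image height comes from the QOI header, so no
    # terminator is needed for this stream.
    return [NAMES[(data[k // 4] >> (2 * (k % 4))) & 3]
            for k in range(height)]
```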

Also, because SUB is covered by QOI_OP_DIFF, this approach would implicitly support mixing each of these per-line filters with SUB, so it would be possible to have SUB+UP, SUB+AVERAGE and SUB+PAETH, and it would be on a per-pixel basis. Might be beneficial in some cases.

[0] https://www.w3.org/TR/PNG-Filters.html

[1] https://qoiformat.org/qoi-specification.pdf



I'm not sure why your whole scanline would need a predictor that is top pixel, left pixel, or average. Shouldn't this be for 16 or 32 pixels max?

The most popular QOI modification has been to take the average of top and left decoded pixels as predictor.
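
For a pixel-format-agnostic sketch of that predictor (function name is mine, per-channel integer average as commonly proposed):

```python
def predict_avg(left, up):
    # Per-channel average of the decoded left and top neighbours,
    # as used by many QOI-inspired codecs.
    return tuple((l + u) // 2 for l, u in zip(left, up))
```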

I've tried to put more bits into selecting 4 possible pixels as predictors (for a single pixel), but it doesn't work that well because the previously encoded colors are already in the table. Predictor+diff encoding wasn't as successful as one would expect, probably because pixels are processed one by one there. One design principle of QOI is to read the input once to encode, so I'm not sure if people would want to try the scanline 4x.


> Predictor+diff encoding wasn't as successful as one would expect.

Interesting, I'd love to see more details about your experiments! Are they online somewhere? And can you link the other QOI modification you mentioned?

> I'm not sure why your whole scanline would need a predictor that is top pixel, left pixel, or average.

Well, it's all just a thought experiment at this stage anyway, right? My reasoning was that when filters get more sophisticated, it becomes a hard problem to figure out when to apply which one. So an approach that lets us brute-force try all of them seemed simplest. It worked well enough for PNG to be an OK-ish lossless compression format, after all.

Actually, you just reminded me of another reason I wanted to try the scanline approach: it makes it easy to give each filter its own thread, so we can try out all four options in parallel. In that case the bottleneck is the slowest of the four options, so encoding would still be slower than plain QOI, but hopefully much less than a 4x slowdown.

Another reason for the scan-line approach was that it seemed like it would be a way to add them on top of QOI without messing too much with the latter's specs. As mentioned, the output would be a regular QOI image where each line just happens to optionally have one of three transformations applied to it, followed by an array that tells you which transformation was applied to each line.

If I understand correctly, QOI images have an end marker (seven 0x00 bytes followed by a single 0x01 byte), meaning adding such an extra array of bytes should be ignored by properly implemented QOI decoders.
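
Splitting the file at that end marker could then be as simple as this sketch (with the caveat, flagged in the comment, that a robust implementation wouldn't just search for the byte pattern):

```python
# Per the QOI spec: seven 0x00 bytes followed by a single 0x01.
QOI_END = b"\x00" * 7 + b"\x01"

def split_trailer(data):
    # Split a file into the part a stock QOI decoder consumes and the
    # trailing per-line filter array proposed above. Searching from the
    # end assumes the marker pattern doesn't also occur in the trailer;
    # a robust implementation would walk the chunk stream instead.
    idx = data.rfind(QOI_END)
    if idx < 0:
        raise ValueError("not a QOI stream: end marker missing")
    cut = idx + len(QOI_END)
    return data[:cut], data[cut:]
```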

If so, that would mean that a decoder for this filter approach could be a minimal program that you run on the output of any other QOI decoder and that undoes the transformations. That would make it trivial to build a decoder on top of other people's hard work implementing an efficient decoder for "regular" QOI images.

Think of it like how tar and gz work together. I suppose this trick wouldn't be limited to scanlines, though; I'm sure others have thought of it too.

> One design principle of QOI is to read input once to encode, so not sure if people would want to try 4x the scanline.

True, but my reasoning was that this would be for a particular niche: when the decoding speed matters most (which would be minimally impacted by this) but encoding speed can be a little slower if the compression is significantly better. No idea how much better that would be though, if at all.


I did not keep track of every experiment; I'm just saying that if you make a new opcode in QOI that encodes a predictor with a few bits plus a few bits of diff, it's hard to make an improvement at all. It's easier if you remove the 8-bit boundary constraint.

> And can you link the other QOI modification you mentioned?

https://github.com/nigeltao/qoi2-bikeshed/issues/34, since adopted by virtually all QOI-inspired codecs.



