
This doesn't work.


There is just so much awesome stuff in this article. Finite State Entropy and Asymmetric Numeral Systems are completely new concepts to me (I've got 7 open tabs just from references FB supplied in the article), as is repcode modeling. I love that they've already built in granular control over the compression tradeoffs you can make, and I can't wait to look into Huff0. If anyone outside of Facebook has started playing with it or is planning to put it into production right away, I'd love to hear about it.


Indeed, ANS is difficult: it is the biggest innovation in compression in the last 20 years. Its author has some nice but dense slides about it.

https://dl.dropboxusercontent.com/u/12405967/ANSsem.pdf

Not sure exactly when repcodes were invented; Igor Pavlov was already using them in 7zip.


The best TL;DR of ANS is something like this (without being too wrong). It's still too long:

Huffman requires at least one bit to represent any symbol, because it finds unique prefix codes for every symbol by varying the leading bits.
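
To make that one-bit floor concrete, here is a tiny sketch (my own illustration, in Python) that computes Huffman code lengths. Even a symbol with 99% probability still costs a full bit per occurrence:

    import heapq

    def huffman_code_lengths(freqs):
        # Each merge in the Huffman tree adds one bit to every symbol
        # beneath it, so every symbol ends up with a code of >= 1 bit.
        heap = [(f, [s]) for s, f in freqs.items()]
        heapq.heapify(heap)
        lengths = {s: 0 for s in freqs}
        while len(heap) > 1:
            f1, syms1 = heapq.heappop(heap)
            f2, syms2 = heapq.heappop(heap)
            for s in syms1 + syms2:
                lengths[s] += 1
            heapq.heappush(heap, (f1 + f2, syms1 + syms2))
        return lengths

    print(huffman_code_lengths({'a': 99, 'b': 1}))  # {'a': 1, 'b': 1}

The ideal cost of 'a' here is -log2(0.99) ≈ 0.0145 bits, but Huffman has to spend a whole bit on it; that gap is what the schemes below recover.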

Arithmetic coding encodes symbols in fractional numbers of bits by using binary fractions. It divides up the range to make this work. In the end, you get one fraction per "message" that can be decoded back into the message (i.e. a fraction like 0.53817213781271231...).

Range coding is similar; it just uses integers instead of floating point. You get one number that can be decoded back into the message (i.e. a number like 12312381219129123123).
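
A minimal sketch of the shared interval-narrowing idea (my own toy example, using exact fractions to sidestep floating-point issues; a real range coder uses the integer version plus renormalization):

    from fractions import Fraction

    # Toy model: P(a) = 0.8, P(b) = 0.2, as subranges of [0, 1)
    ranges = {'a': (Fraction(0), Fraction(4, 5)),
              'b': (Fraction(4, 5), Fraction(1))}

    def encode(msg):
        low, high = Fraction(0), Fraction(1)
        for s in msg:
            lo, hi = ranges[s]
            span = high - low
            low, high = low + span * lo, low + span * hi
        return low  # any number in [low, high) identifies the whole message

    def decode(x, n):
        out = []
        for _ in range(n):
            for s, (lo, hi) in ranges.items():
                if lo <= x < hi:
                    out.append(s)
                    x = (x - lo) / (hi - lo)  # rescale and repeat
                    break
        return ''.join(out)

    x = encode('aaab')          # 256/625, i.e. 0.4096
    assert decode(x, 4) == 'aaab'

Note that the decoder can only make sense of x because it also knows the subrange table, which is exactly the point made below about having to describe the range.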

Note that in both of these, you have not changed the number system at all: even numbers and odd numbers still occur with the same density.

Another way to look at the above is based on what a bit selects.

In Huffman, a single bit is generally not enough to select something; the prefixes are long, so you need to walk a bunch of bits to figure out what you've got.

In arithmetic or range coding, a single bit selects a very large range. They change the proportions of those ranges, but at some point something has to describe that range, because it's a range: knowing you have gotten to the 8 in 0.538 doesn't tell you anything on its own. You need to know "the subrange for symbol a is 0.530 ... 0.539", so it's a. So you have to transmit that range.

ANS is a different trick. Instead of encoding things using the existing number system, it changes the number system. That is, it redefines the number system so that our even and odd numbers are still uniformly distributed but have different densities. If you do this in the right way, you end up with a number system that lets you require only one number to determine the state, instead of two (like the endpoints of a range).
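
Here is a toy rANS coder (my own sketch, with frequencies summing to M = 4 and no renormalization) that shows the one-number property: the entire coder state is the single integer x.

    freqs  = {'a': 3, 'b': 1}      # P(a) = 3/4, P(b) = 1/4
    M      = 4                     # sum of frequencies
    starts = {'a': 0, 'b': 3}      # cumulative frequencies

    def encode(msg):
        x = 1                      # the whole coder state is this one integer
        for s in reversed(msg):    # rANS encodes in reverse order
            f, c = freqs[s], starts[s]
            x = (x // f) * M + c + (x % f)
        return x

    def decode(x, n):
        out = []
        for _ in range(n):
            r = x % M              # the low "digit" of x selects the symbol
            s = 'a' if r < 3 else 'b'
            f, c = freqs[s], starts[s]
            x = f * (x // M) + (r - c)
            out.append(s)
        return ''.join(out)

    x = encode('aabab')            # x = 69
    assert decode(x, 5) == 'aabab'

Encoding an 'a' multiplies x by roughly 4/3 (about 0.415 bits of growth) while a 'b' multiplies it by roughly 4 (2 bits), which is the redefined-density intuition above in action.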


With regular arithmetic coding and Huffman coding you don't need to send over a dictionary. Instead, you can have an adaptive model that learns to compress the data as it goes (e.g. keeps a running count of symbol frequencies), and it will still be reversible.

I thought this wasn't possible with ANS. Or has this changed?


It depends on whether you have a static or an adaptive coder.

Static is much cheaper: it uses the same probabilities for an entire data block (e.g. 30 kB), and those probabilities are stored in the header. Practically all Huffman and tANS compressors work this way (though there are exceptions: https://en.wikipedia.org/wiki/Adaptive_Huffman_coding ).

Adaptive can start with e.g. a uniform distribution (no need to store anything in the header) and learn on the way. It is more costly but gives better compression, and is used with arithmetic coding or rANS. See https://fgiesen.wordpress.com/2015/05/26/models-for-adaptive... and https://fgiesen.wordpress.com/2015/12/21/rans-in-practice/
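
As a rough sketch of what "learns on the way" means (my own illustration, not taken from the linked posts), both sides keep a model like this in sync, so no probability table ever needs to be transmitted:

    class AdaptiveModel:
        # Order-0 adaptive model: start from uniform counts and update
        # after every symbol. Encoder and decoder apply identical
        # updates, so they stay in sync without a header.
        def __init__(self, symbols):
            self.counts = {s: 1 for s in symbols}

        def freq(self, s):         # fed to the entropy coder at each step
            return self.counts[s], sum(self.counts.values())

        def update(self, s):
            self.counts[s] += 1

The wrinkle with rANS (covered in the "rANS in practice" post) is that it encodes in reverse: the encoder first runs the model forward over the data, buffering the per-symbol frequencies, then does the actual coding pass backwards, while the decoder simply runs forward and updates as it goes.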


I know that's what adaptive coding means; I thought that was impossible with rANS. All the descriptions of rANS I could find (back when I looked at it) described it in terms of a static model, and I ran into problems trying to generalize it to an adaptive one.

Thanks for the resources.


I think you should look at the author's blog [0]. There is a lot of in-depth explanation of not just what he is doing but how he got there: the failed steps, the intermediate steps.

[0] http://fastcompression.blogspot.in/


Agreed. How did they not call it "Pied Piper"?


I was waiting for that reference.


Another way to put it might be to say that because neighborhoods near the center are more valuable and attract more people, they also tend to attract other desirables, and those other desirables tend to further drive up their value. When you rent in New York, you pay for location. A big part of the location equation is your commute; another huge chunk of it is your neighborhood. And because the two are heavily correlated, commute time by itself tends to track the total value of a location quite accurately, to the point where, on average, every minute of reduced commute corresponds to a $56/month increase in rent.


I think you can reframe this debate pragmatically and widen its applicability significantly: at what point is "bad" code more effective than the alternatives? If you get down into a debate about "best practices", you'll have to concede that anyone writing the code the author is talking about might be using "best practices" in some explicit way, but isn't "following best practices", which are designed to avoid precisely the difficulties he outlines. On the other hand, it's true that most code out there is bad code, and that heavily architecting a system with bad code can be even more of a nightmare than more straightforward bad code. The real question is: when should scientists favor bad code? I'm a huge fan of best practices and of thoughtful and elegant coding, but I could see an argument being made that in most circumstances scientific code is better off being bad code, as long as you keep it isolated. I'd love to see someone make that argument.


My vote is for Electron, although I'm following how React Native develops really closely. From what I've seen, the best new "desktop" apps tend to be built with Electron. We've been using it in production and, despite having to make pull requests for framework-level bugs fairly regularly, we've had a pretty good experience overall, and far better than anything else we've tried - although, as I said, we have yet to seriously look into React Native. Also, by cross-platform I'm assuming you mean OS X, Windows, and Linux. If you also want to include mobile, then there are so few options there's no longer any real reason to debate.


> They may have arrived at the guidelines using ML, but it's possible that their guidelines wouldn't be right for the types of emails you are sending out.

This is a great point, and it's something that users ought to consider with nearly every application of machine learning that ends with a definite recommendation to the user. Machine learning can be used to solve many different types of problems. When it comes to solving problems related to human interaction, its insights will tend to function more like the rules for running an effective business-focused popularity contest than like the rules for crafting meaningful emails to every possible audience. That said, if you happen to be sending a business email and want nothing more than to improve the likelihood of a response, this seems like a great tool for the job.


Sort of true, in the sense that the product isn't separately trained for sales emails vs personal emails vs internal business emails, etc.

But the calculations we chose don't provide a lot of constraints, and the variances were not as high as you'd likely expect. So I'd be comfortable saying that the recommendations generalize well to the vast majority of situations.


I agree completely, but I guess I also don't mind it. Then again, I also wouldn't mind "The Art of Woodworking: Tables You Can Build Yourself" - I suppose it depends on your tolerance for buzzwords. I'm bombarded with them all day long, so perhaps my tolerance is growing.


To be frank, the way I submit conference talks is normally:

Understanding {buzzword pop culture / news topic} using {conference language} + {pick 3+ of [real-time, cloud, at scale, data science, docker, sentiment analysis, polyglot persistence]}

It works shamefully well.


Yeah, I'm surprised the Neo4j team hasn't made more of an effort on this. I've run into lots of memory issues with it as well, and although there are reliable, fairly straightforward solutions to most of these problems, the team doesn't seem particularly interested in making sure that the defaults are robust enough to handle a reasonable workload. When your database falls over on a reasonable query under a light workload, you can't help but feel troubled. There's a lot to love about Neo4j, but they've got a lot of work to do if they want to win over the developer community as a whole. There may be enterprises that are reassured by a huge price tag and a whole bunch of salespeople at their beck and call, but I don't know any of them. Every engineer I know who is willing to pay for software is either expecting a completely new kind of product or expecting to have an awesome experience with a free version of the tool before being willing to commit even a few bucks a month.
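
For what it's worth, the straightforward fixes I mean are mostly memory settings. As an illustration (the values are made up; tune them to your machine), in Neo4j 3.x the relevant knobs live in neo4j.conf:

    # neo4j.conf - illustrative values only
    dbms.memory.heap.initial_size=4g
    dbms.memory.heap.max_size=4g
    dbms.memory.pagecache.size=8g

The point is less the specific numbers than that the out-of-the-box values are conservative enough to fall over under fairly ordinary queries.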


Kind of shocked nobody mentioned this, even if it is a bit of an aside, but, umm, I've been dying for anything at all from Gary Bernhardt. I don't even know what to say, except that it makes me hope, however unrealistically, that we will one day get something like the magnum opus that is "Destroy All Software" from him again.


You know he just started to post new stuff, right?

https://www.destroyallsoftware.com/screencasts/catalog


I know. Mind blown.


I haven't read/seen it. I'm guessing you know more than I'd see in an abstract. So, what's the significance of that work?

