
It's a tough problem.

There are various schemes for estimating mutual information from samples. If you do that and the estimate comes out very close to zero, then I guess you can claim the two random variables are independent. But these estimators are pretty noisy and often computationally frustrating (the ones I'm familiar with require doing a bunch of nearest-neighbor searches between all the points).
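To make the idea concrete, here is a minimal sketch of a plug-in mutual information estimate from samples, using a 2D histogram instead of the nearest-neighbor approach mentioned above (the names `mi_histogram` and the bin count are my own choices, not anything from the thread; k-NN estimators like Kraskov's are less biased but more involved):

```python
import numpy as np

def mi_histogram(x, y, bins=16):
    """Crude plug-in mutual information estimate (in nats) from samples.
    Bins both variables, then sums p(x,y) * log(p(x,y) / (p(x)p(y)))."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()                       # joint probabilities
    px = pxy.sum(axis=1, keepdims=True)    # marginal of x
    py = pxy.sum(axis=0, keepdims=True)    # marginal of y
    nz = pxy > 0                           # avoid log(0)
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(0)
x = rng.normal(size=20_000)
indep = mi_histogram(x, rng.normal(size=20_000))       # near zero (plus bias)
dep = mi_histogram(x, x + 0.1 * rng.normal(size=20_000))  # clearly positive
```

Note the noisiness complaint in practice: even for truly independent samples the estimate is biased upward, so "very close to zero" needs a threshold that depends on sample size and bin count.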

I agree with the OP that it's better to say "non-independence" to avoid confusion. At the same time, I disagree that linear correlation is actually the standard definition: in many fields, especially those where nobody ever expects linear relationships, it is not, and everybody uses "correlated" to mean "not independent".



Yeah. It would be simpler to talk about causal graphs if the nodes represented only events instead of arbitrary variables, because independence between events is much simpler to determine: X and Y are independent iff P(X) * P(Y) = P(X and Y). For events there also exists a measure of dependence: the so-called odds ratio. It is not influenced by the marginal probabilities, unlike Pearson correlation (called the "phi coefficient" for events) or pointwise mutual information. Of course, in practice restricting everything to events is usually not a viable simplification.



