This isn't a correction to your post, but a clarification for other readers: correlation implies dependence, but dependence does not imply correlation. By contrast, two variables share non-zero mutual information if and only if they are dependent.
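To make that concrete, here's a minimal sketch (assuming numpy; the sample size, noise level, and bin count are arbitrary choices, and the histogram estimator is just a crude plug-in) where y depends on x, the Pearson correlation is essentially zero, but the estimated mutual information is clearly positive:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = x**2 + 0.1 * rng.normal(size=x.size)  # depends on x, but symmetrically

# Pearson correlation is ~0 because the relationship is symmetric around x = 0.
print("correlation:", np.corrcoef(x, y)[0, 1])

# Crude plug-in estimate of mutual information from a 2-D histogram.
counts, _, _ = np.histogram2d(x, y, bins=30)
p_xy = counts / counts.sum()
p_x = p_xy.sum(axis=1, keepdims=True)
p_y = p_xy.sum(axis=0, keepdims=True)
nz = p_xy > 0
mi = np.sum(p_xy[nz] * np.log(p_xy[nz] / (p_x @ p_y)[nz]))
print("mutual information (nats):", mi)  # clearly > 0
```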
Isn't it possible to contrive an example where a test of pairwise dependence causes the statistician to err by excluding relevant variables from tests of more complex relations?
Trying to remember which of these factors both P(A|B) and P(B|A) into the test.
I think you're using the word "insignificant" in a possibly misleading or confusing way.
I think in this context, the issue with the spurious correlations from that site is that they're all time series for overlapping periods. Of course, the people who collected these understood that time was an important causal factor in all these phenomena. In the graphical language of this post:
T --> X_i
T --> X_j
Since T is a common cause of both, we should expect to see non-zero mutual information between X_i and X_j. In the paradigm here, we could try to control for T and see if a relationship persists (e.g. within the same month, collect observations of X_i and X_j across a large number of locales), and get a signal on whether the shared dependence on time is the only link.
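A small simulation along those lines (a sketch, assuming numpy; the coefficients, month count, and locale count are made up for illustration): X_i and X_j are each driven by T but not by each other, so the pooled correlation is strong, while conditioning on T makes it vanish.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.repeat(np.arange(24), 200)          # e.g. 24 months, 200 locales each
x_i = 2.0 * t + rng.normal(size=t.size)    # both series are driven by time...
x_j = -1.5 * t + rng.normal(size=t.size)   # ...but share no other mechanism

# Pooled over all months, the common cause T induces a strong (spurious) correlation.
print("pooled corr:", np.corrcoef(x_i, x_j)[0, 1])

# "Controlling for T": within any single month, the relationship disappears.
within = [np.corrcoef(x_i[t == m], x_j[t == m])[0, 1] for m in range(24)]
print("mean within-month corr:", np.mean(within))
```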
If a test of dependence shows no significant result, that's not conclusive, because the underlying 'function' relating the variables could be complex, nonlinear, or even quantum.
How are lag and lead effects expressed in this notation for causal graphs?
Should we always assume that t is a monotonically increasing series, or is that just how we typically sample observations? Can traditional causal inference describe time crystals?
What is the quantum logical statistical analog of mutual information?
Are there pathological cases where mutual information and quantum information will not discover a relationship?
Does Quantum Mutual Information account for Quantum Discord if it only uses the von Neumann definition of entropy?
> A sailor is sailing her boat across the lake on a windy day. As the wind blows, she counters by turning the rudder in such a way so as to exactly offset the force of the wind. Back and forth she moves the rudder, yet the boat follows a straight line across the lake. A kindhearted yet naive person with no knowledge of wind or boats might look at this woman and say, “Someone get this sailor a new rudder! Hers is broken!” He thinks this because he cannot see any relationship between the movement of the rudder and the direction of the boat.
They have clear dependence; if you imagine fixing ("conditioning") x at a particular value and looking at the distribution of y at that value, it's different from the overall distribution of y (and vice versa). But the familiar linear correlation coefficient wouldn't indicate anything about this relationship.
Imagine your data points look like a U. There's no (linear) correlation between x and y: you are equally likely to have a high value of y when x is high or low. But low values of y are associated with medium values of x, and a high value of y means x will be very high or very low.
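A quick numerical version of that picture (a rough sketch, assuming numpy and scipy; the noise level and the "x near 0" window are arbitrary): the linear correlation comes out near zero, but the conditional distribution of y given x near 0 is visibly different from the overall distribution of y, which is exactly the dependence described above.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, size=50_000)
y = x**2 + 0.05 * rng.normal(size=x.size)   # a U-shaped cloud of points

# Essentially zero linear correlation...
print("Pearson r:", np.corrcoef(x, y)[0, 1])

# ...yet conditioning on x changes the distribution of y:
# near x = 0, y sits at its low end, unlike the overall distribution of y.
y_mid = y[np.abs(x) < 0.1]
print("overall mean of y:", y.mean(), "| mean of y given x near 0:", y_mid.mean())
print("KS test, y|x near 0 vs. all y:", ks_2samp(y_mid, y))
```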