Tuesday, 2 July 2013

Us And Them: Anscombe's Quartet

 

Wikipedia gives an interesting example of multidimensional data with peculiar properties: Anscombe's quartet, four data sets of eleven (x, y) points each. The data are listed in full further down this post.

The four x variables share the same mean and variance, as do the four y variables (to about two decimal places), and every (x, y) pair has nearly the same correlation coefficient and the same fitted linear regression line. The quartet is used to demonstrate the inadequacy of basic summary statistics for characterizing data.
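The claim about matching statistics is easy to check. The following is a minimal sketch in plain Python, not part of our technology; the helper name summarize and the dictionary layout are our own choices for illustration. It computes the mean, sample variance, Pearson correlation, and least-squares line for each of the four data sets.

```python
from math import sqrt
from statistics import mean, variance

# Anscombe's quartet: data sets 1-3 share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "1": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "2": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "3": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "4": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
          [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return basic summary statistics for one (x, y) data set (hypothetical helper)."""
    mx, my = mean(xs), mean(ys)
    # Sums of squared deviations and cross-deviations about the means.
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx  # ordinary least-squares slope of y on x
    return {
        "mean_x": mx,
        "var_x": variance(xs),  # sample variance (n - 1 denominator)
        "mean_y": my,
        "var_y": variance(ys),
        "corr": sxy / sqrt(sxx * syy),  # Pearson correlation coefficient
        "slope": slope,
        "intercept": my - slope * mx,
    }


summaries = {name: summarize(xs, ys) for name, (xs, ys) in quartet.items()}
for name, stats in summaries.items():
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

All four rows should agree closely: a mean of x of 9, a variance of x of 11, a mean of y of about 7.50, a correlation of about 0.816, and a fitted line of roughly y = 3.00 + 0.500x, even though scatter plots of the four sets look nothing alike.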

Our model-free technology, which does not resort to regressions or conventional models, is nevertheless able to distinguish between the four data sets. You may verify this online by copying (Ctrl-C) the data (including variable labels):




x1 10 8 13 9 11 14 6 4 12 7 5
y1 8.04 6.95 7.58 8.81 8.33 9.96 7.24 4.26 10.84 4.82 5.68
x2 10 8 13 9 11 14 6 4 12 7 5
y2 9.14 8.14 8.74 8.77 9.26 8.1 6.13 3.1 9.13 7.26 4.74
x3 10 8 13 9 11 14 6 4 12 7 5
y3 7.46 6.77 12.74 7.11 7.81 8.84 6.08 5.39 8.15 6.42 5.73
x4 8 8 8 8 8 8 8 19 8 8 8
y4 6.58 5.76 7.71 8.84 8.47 7.04 5.25 12.5 5.56 7.91 6.89


and pasting it (Ctrl-V) into the spreadsheet here. Then press "proceed". You may navigate the resulting Complexity map by moving the mouse over its nodes and links and observing the generalized correlations between the different pairs of variables.



www.ontonix.com