Correlation is a measure of the linear relationship between two random variables. I find that it is helpful to understand correlation as a sort of angle between two vectors of data.
Notice that the sample covariance is, up to a factor of $1/(n-1)$, the dot product of the centered data vectors. By the same analogy, the sample standard deviation is the norm of the centered sample, scaled by $1/\sqrt{n-1}$.
$$\mbox{Let }\{X_i\}, \{Y_i\}\mbox{ be two samples of } n \mbox{ observations, and let } \vec{X}=\langle X_1-\bar{X}, \dots, X_n-\bar{X}\rangle, \ \vec{Y}=\langle Y_1-\bar{Y}, \dots, Y_n-\bar{Y}\rangle$$
$$Cov(X, Y)=\frac{1}{n-1}\sum_{i=1}^n (X_i-\bar{X})(Y_i-\bar{Y}) = \frac{\vec{X}\cdot\vec{Y}}{n-1}$$
$$Var(X)=\frac{1}{n-1}\sum_{i=1}^n (X_i-\bar{X})^2 = \frac{\|\vec{X}\|^2}{n-1}\mbox{ and }\sigma_X=\frac{\|\vec{X}\|}{\sqrt{n-1}}$$
$$Corr(X,Y)=\frac{Cov(X,Y)}{\sigma_X\sigma_Y}=\frac{\vec{X}\cdot\vec{Y}}{\|\vec{X}\|\cdot\|\vec{Y}\|}=\frac{\|\vec{X}\|\cdot\|\vec{Y}\|\cdot\cos\theta}{\|\vec{X}\|\cdot\|\vec{Y}\|}=\cos\theta$$
Here θ is the angle between the two data vectors, each taken with its sample mean subtracted.
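The angle interpretation can be checked numerically. The following is a minimal sketch in plain Python, not part of the demo's own code: the helper names (`centered`, `dot`, `correlation_as_cosine`) and the five data points are made up for illustration. It subtracts the mean from each sample, then divides the dot product by the product of the norms.

```python
import math

def centered(values):
    """Subtract the sample mean from every observation."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def correlation_as_cosine(xs, ys):
    """Sample correlation computed as the cosine of the angle
    between the centered data vectors."""
    x_c, y_c = centered(xs), centered(ys)
    return dot(x_c, y_c) / (math.sqrt(dot(x_c, x_c)) * math.sqrt(dot(y_c, y_c)))

# Hypothetical toy data, chosen only to keep the arithmetic easy to follow.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 5.0]

r = correlation_as_cosine(xs, ys)
theta_degrees = math.degrees(math.acos(r))
print(r, theta_degrees)
```

For this toy data the centered vectors have dot product 8 and squared norms 10 each, so the correlation is about 0.8 and the angle is roughly 37 degrees. The normalizing factor $1/(n-1)$ never appears in the code because it cancels between numerator and denominator.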
The demo is divided into three interactive boxes so that the parameters can be changed without resampling. The interactive boxes are recognizable by their gray backgrounds, in contrast to the white backgrounds of the cells that hold the input commands.
First box: The explanatory variable
This is where x, the explanatory variable, is set up. You may choose its distribution. Koop suggests a uniform distribution, but the normal is included here as well. Changing the distribution automatically refreshes the data, but it is also possible to refresh it by pressing the X. Note: Any changes made in boxes one or two will not be reflected until the graph is redrawn. Changing the sample size requires that the noise be refreshed as well.
Second box: The noise, or error term
This button resamples the variable ep (short for epsilon). This variable is standard normal, but is scaled in the third box, below.
Third box: The plot thickens!
This is where the dynamic experimentation begins. Keeping the sample as it is, you are able to manipulate the slope of the dependence ($\beta$), its intercept ($\alpha$), and the standard deviation of the distribution of the noise. The square of the latter, the noise variance, is the reciprocal of what Koop terms "precision".
$$y=\alpha + \beta x +(noise)\epsilon$$
$$\epsilon \sim N(0,1)$$
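The three boxes can be imitated outside the notebook. The sketch below is an illustration in plain Python, not the demo's actual code: the function names are hypothetical, the uniform distribution is assumed to be on [0, 1), and the seed and parameter values are arbitrary. It draws x and ep once, then rebuilds y for several noise levels without resampling. For comparison it also evaluates the population correlation $\beta\sigma_x/\sqrt{\beta^2\sigma_x^2+\mbox{noise}^2}$, which follows from the model when x and $\epsilon$ are independent.

```python
import math
import random

def sample_x(n, distribution, rng):
    """First box: draw the explanatory variable.
    Assumes Uniform[0, 1) or standard normal; the demo's ranges may differ."""
    if distribution == "uniform":
        return [rng.random() for _ in range(n)]
    if distribution == "normal":
        return [rng.gauss(0.0, 1.0) for _ in range(n)]
    raise ValueError("distribution must be 'uniform' or 'normal'")

def sample_ep(n, rng):
    """Second box: draw standard normal noise, one value per observation."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

def make_y(x, ep, alpha, beta, noise):
    """Third box: y = alpha + beta * x + noise * ep, reusing the existing draws."""
    return [alpha + beta * xi + noise * ei for xi, ei in zip(x, ep)]

def sample_corr(u, v):
    """Sample correlation as the cosine between the centered vectors."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    u_c = [a - mean_u for a in u]
    v_c = [b - mean_v for b in v]
    numerator = sum(a * b for a, b in zip(u_c, v_c))
    return numerator / math.sqrt(sum(a * a for a in u_c) * sum(b * b for b in v_c))

def population_corr(beta, sd_x, noise):
    """Correlation implied by the model when x and epsilon are independent."""
    return beta * sd_x / math.sqrt(beta ** 2 * sd_x ** 2 + noise ** 2)

rng = random.Random(0)  # arbitrary fixed seed so the run is repeatable
n = 1000
x = sample_x(n, "uniform", rng)
ep = sample_ep(n, rng)

sd_x = math.sqrt(1.0 / 12.0)  # standard deviation of Uniform[0, 1)
results = []
for noise in (0.1, 1.0, 10.0):
    y = make_y(x, ep, alpha=1.0, beta=2.0, noise=noise)
    results.append((noise, sample_corr(x, y), population_corr(2.0, sd_x, noise)))

for noise, r_sample, r_pop in results:
    print(f"noise={noise:5.1f}  sample r={r_sample:.3f}  population r={r_pop:.3f}")
```

With these settings the correlation falls as the noise scale grows, while changing `alpha` leaves it untouched, since the intercept disappears when the data are centered.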
Some things to notice:
NOTE: The code can be edited interactively beyond the parameters exposed by the interactive controls. The code may be re-evaluated by pressing shift+enter.