Simple linear regression
In statistics, linear regression is a method of estimating the conditional expected value of one variable y given the values of some other variable or variables x.
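A linear regression model is typically stated in the form
y = α + βx + ε,
where α is the intercept, β is the slope, and ε is a random error, independent of x, with mean 0 and variance σ².
Usually, we assume x is deterministic. Conditionally on x,
var(y|x) = var(ε) = σ².
However, if x is treated as random, the unconditional variance of y is larger:
var(y) = β² var(x) + σ².
This can be obtained using the following formula:
var(y)=var[E(y|x)] + E[var(y|x)].
E(y|x) = α + βx, thus
var[E(y|x)] = β² var(x).
var(y|x) = σ², thus
E[var(y|x)] = σ².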
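The variance decomposition can be checked numerically. The following is a minimal sketch, assuming the model y = alpha + beta*x + e with a normally distributed random x and an independent normal error; the parameter values, the seed, and the helper names pvar and simulate are made up for illustration only.

```python
import random

# Hypothetical parameter values, chosen only for illustration.
ALPHA, BETA, SIGMA = 1.0, 2.0, 3.0
SD_X = 1.5  # assumed standard deviation of a random x


def pvar(values):
    """Population variance: mean squared deviation from the mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def simulate(n, seed=0):
    """Draw n pairs (x, y) from y = ALPHA + BETA*x + e with random x."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, SD_X) for _ in range(n)]
    ys = [ALPHA + BETA * x + rng.gauss(0.0, SIGMA) for x in xs]
    return xs, ys


xs, ys = simulate(50_000)
var_y = pvar(ys)                  # left-hand side: var(y)
explained = BETA ** 2 * pvar(xs)  # var[E(y|x)] = beta^2 * var(x)
unexplained = SIGMA ** 2          # E[var(y|x)] = sigma^2
r_square = explained / (explained + unexplained)

print("var(y):", var_y)
print("var[E(y|x)] + E[var(y|x)]:", explained + unexplained)
print("share of variance explained by x:", r_square)
```

The two printed variances should agree up to sampling noise, and the last number is the share of var(y) that comes from var[E(y|x)].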
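R square, which represents how much of the variance in y can be explained by x, is equal to
R² = var[E(y|x)] / var(y) = β² var(x) / (β² var(x) + σ²),
with β the slope of the regression line and σ² the error variance.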
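Adjusted R square =
1 - (1 - R²)(n - 1)/(n - p - 1),
where n is the sample size and p is the number of predictors (p = 1 for simple regression).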
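As a rough illustration of how these two quantities are computed from data, here is a small sketch that fits a simple regression by ordinary least squares and then evaluates R square and adjusted R square. The function names and the five data points are hypothetical, not taken from any real data set.

```python
def fit_simple_ols(xs, ys):
    """Return (intercept, slope) of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(xs, ys):
    """R square = 1 - SS_residual / SS_total for the fitted line."""
    intercept, slope = fit_simple_ols(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n, p=1):
    """Adjusted R square with n observations and p predictors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Made-up data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

r2 = r_squared(xs, ys)
adj_r2 = adjusted_r_squared(r2, n=len(xs))
print("R square:", r2)
print("Adjusted R square:", adj_r2)
```

Whenever R square is below 1, the adjusted value is smaller, because the factor (n - 1)/(n - p - 1) is greater than 1.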
R square is sometimes used to judge how well x can predict y. A large R square means that x is a good predictor of y. A small R square means we may need other variables to predict y well.
R square says nothing about the model fit. For simple regression, the F-test is the same as the t-test of
H0: β = 0, where β is the slope. If this test is significant, there is evidence of a linear relationship between y and x. Whether the F/t-test is significant is not determined by the magnitude of R square alone: since F = (n - 2) R² / (1 - R²), a small R square can still be significant when the sample size n is large. However, if R square is very small, it usually means x is not a good predictor of y.
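The equivalence of the two tests in simple regression can be seen numerically: the F statistic equals the square of the t statistic for the slope. The sketch below uses the textbook formulas t = b / se(b) and F = MS_regression / MS_residual; the function name and the eight data points are made up for illustration.

```python
import math


def slope_t_and_f(xs, ys):
    """Return (t, F) for testing a zero slope in simple linear regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    fitted = [intercept + slope * x for x in xs]
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_reg = sum((f - mean_y) ** 2 for f in fitted)

    mse = ss_res / (n - 2)                # residual mean square, n - 2 d.f.
    t_stat = slope / math.sqrt(mse / sxx) # slope divided by its standard error
    f_stat = (ss_reg / 1) / mse           # regression mean square has 1 d.f.
    return t_stat, f_stat


# Made-up data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.2, 2.9, 2.7, 4.8, 4.1, 6.5, 6.0, 8.3]

t_stat, f_stat = slope_t_and_f(xs, ys)
print("t statistic:", t_stat)
print("t squared:  ", t_stat ** 2)
print("F statistic:", f_stat)
```

Because the statistics satisfy F = t², the two tests give the same p-value, so either one can be reported.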
A related discussion of R square can be found at http://www.statisticalexperts.com/jianxu/2006/10/08/r2-confusion/.
Any comments are welcome.