Question
How do you write the equation of the regression line for the following set of data and find the correlation coefficient?
The table shows the number of turtles hatched at a zoo each year since 2002.
Year | 2003 | 2004 | 2005 | 2006 | 2007 |
---|---|---|---|---|---|
Turtles hatched | 21 | 17 | 16 | 16 | 14 |
Solution
Here we will find the equation of the regression line by computing the slope and intercept from the regression formulas and substituting them into the equation of the line. We will then find the correlation coefficient of the given data by using the correlation coefficient formula.
Formula used:
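For the regression line, the slope is $m = \dfrac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2}$ and the regression equation is $\hat{y} = mx + b$, where
$b = \bar{y} - m\bar{x}$, $\bar{y} = \dfrac{\sum y}{n}$ and $\bar{x} = \dfrac{\sum x}{n}$.
For the correlation coefficient, $r = \dfrac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}}$.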
Complete step by step answer:
$x$ | $y$ | $xy$ | $x^2$ | $y^2$ |
---|---|---|---|---|
2003 | 21 | 42063 | 4012009 | 441 |
2004 | 17 | 34068 | 4016016 | 289 |
2005 | 16 | 32080 | 4020025 | 256 |
2006 | 16 | 32096 | 4024036 | 256 |
2007 | 14 | 28098 | 4028049 | 196 |
$\sum x_i = 10025$ | $\sum y_i = 84$ | $\sum x_i y_i = 168405$ | $\sum x_i^2 = 20100135$ | $\sum y_i^2 = 1438$ |
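The column sums in the table can be double-checked with a short script. This is only a sketch; the variable names (`years`, `turtles`, `sum_xy`, and so on) are chosen here for illustration and do not come from the original solution.

```python
# Data from the question: year (x) and turtles hatched (y).
years = [2003, 2004, 2005, 2006, 2007]
turtles = [21, 17, 16, 16, 14]

n = len(years)                                        # number of data points
sum_x = sum(years)                                    # sum of x
sum_y = sum(turtles)                                  # sum of y
sum_xy = sum(x * y for x, y in zip(years, turtles))   # sum of x*y
sum_x2 = sum(x * x for x in years)                    # sum of x^2
sum_y2 = sum(y * y for y in turtles)                  # sum of y^2

print(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2)
# prints: 5 10025 84 168405 20100135 1438
```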
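The slope of the regression line is $m = \dfrac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2}$.
Now substitute the values from the table, with $n = 5$ data points, into the formula:
$m = \dfrac{5(168405) - (10025)(84)}{5(20100135) - (10025)^2}$
$m = \dfrac{842025 - 842100}{100500675 - 100500625} = \dfrac{-75}{50} = -1.5$
The slope of the regression line is $m = -1.5$.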
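Next we find the intercept. The regression equation is $\hat{y} = mx + b$, where
$b = \bar{y} - m\bar{x}$, $\bar{y} = \dfrac{\sum y}{n}$, $\bar{x} = \dfrac{\sum x}{n}$, and the slope is $m = -1.5$.
$b = \bar{y} - m\bar{x}$
$\Rightarrow b = \dfrac{\sum y}{n} - m\,\dfrac{\sum x}{n}$
$\Rightarrow b = \dfrac{84}{5} - (-1.5)\,\dfrac{10025}{5} = 16.8 + 3007.5 = 3024.3$
$b = 3024.3$
Now we have $m = -1.5$ and $b = 3024.3$. Substituting these values into $\hat{y} = mx + b$ gives the regression line.
Therefore the regression equation is $\hat{y} = -1.5x + 3024.3$.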
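The slope and intercept calculation can be reproduced in a few lines of Python. This is a minimal sketch of the same sum-based formulas used in this solution; the function name `regression_line` is hypothetical and introduced here only for illustration.

```python
def regression_line(xs, ys):
    """Return (m, b) for the least-squares line y = m*x + b."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Slope: (n*Sxy - Sx*Sy) / (n*Sxx - Sx^2)
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: mean(y) - m * mean(x)
    b = sum_y / n - m * (sum_x / n)
    return m, b


years = [2003, 2004, 2005, 2006, 2007]
turtles = [21, 17, 16, 16, 14]

m, b = regression_line(years, turtles)
print(round(m, 4), round(b, 4))  # prints: -1.5 3024.3
```

As a sanity check, a least-squares line always passes through the point of means, so plugging in $\bar{x} = 2005$ should return $\bar{y} = 16.8$ (up to floating-point rounding).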
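Now we find the correlation coefficient, using
$r = \dfrac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}}$
From the table we have the values of $\sum xy$, $\sum x$, $\sum y$, $\sum x^2$ and $\sum y^2$:
$r = \dfrac{5(168405) - (10025)(84)}{\sqrt{5(20100135) - (10025)^2}\,\sqrt{5(1438) - (84)^2}}$
$r = \dfrac{-75}{\sqrt{50}\,\sqrt{134}}$
$r \approx -0.916271$
Hence the correlation coefficient is $r \approx -0.916271$.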
There is a very strong negative (downhill) linear relationship between Year ($x$) and Turtles hatched ($y$).
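The correlation coefficient can be checked the same way. The sketch below applies the sum-based formula for $r$ directly; the function name `correlation` is an assumption made for this example and is not part of the original solution.

```python
import math


def correlation(xs, ys):
    """Return the sample correlation coefficient r of xs and ys."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    # Product of the two square-root terms in the denominator of r.
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


years = [2003, 2004, 2005, 2006, 2007]
turtles = [21, 17, 16, 16, 14]

r = correlation(years, turtles)
print(round(r, 6))  # prints: -0.916271
```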
Note: The purpose of regression is to estimate, explain, predict and evaluate the relationship between variables.
The correlation coefficient is a measure of the strength and the direction of a linear relationship between two variables.
The symbol r represents the sample correlation coefficient.
The range of the correlation coefficient is −1 to 1.
If x and y have a strong positive linear correlation, r is close to 1. If x and y have a strong negative linear correlation, r is close to −1. If there is no correlation or a weak linear correlation, r is close to 0.