7.4 Method of Least Squares

 

Regression Analysis - process of fitting an elementary function to a set of points using the method of least squares

 

Linear Regression

 

Consider the set of points:   {(1,20),   (2,14),    (3,11),   (4,3)}

          being fit to a line  y = ax + b

 

Create and minimize the function that is the sum of the squares of the residuals.

 

 

Residual - the difference between the y-value of a point and the y-value predicted by the equation of the line.
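
For the four points above, the sum of squared residuals and its minimization can be written out as follows. This is a worked sketch of the standard partial-derivative approach; the symbol S and the index i are notation introduced here, not taken from the notes.

```latex
\begin{align*}
S(a,b) &= \sum_{i=1}^{4} \bigl(y_i - (a x_i + b)\bigr)^2 \\
       &= (20 - a - b)^2 + (14 - 2a - b)^2 + (11 - 3a - b)^2 + (3 - 4a - b)^2 \\[4pt]
\frac{\partial S}{\partial a}
       &= -2(20 - a - b) - 4(14 - 2a - b) - 6(11 - 3a - b) - 8(3 - 4a - b) \\
       &= 60a + 20b - 186 = 0 \\[4pt]
\frac{\partial S}{\partial b}
       &= -2(20 - a - b) - 2(14 - 2a - b) - 2(11 - 3a - b) - 2(3 - 4a - b) \\
       &= 20a + 8b - 96 = 0 \\[4pt]
\text{so}\quad 30a + 10b &= 93, \qquad 10a + 4b = 48
  \quad\Longrightarrow\quad a = -5.4,\; b = 25.5 \\[4pt]
y &= -5.4x + 25.5
\end{align*}
```

At these values the residuals are -0.1, -0.7, 1.7, and -0.9, so the minimized sum of squares is 4.2.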


Generalized formulae:  page 571
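
The page 571 formulae are not reproduced in these notes. The sketch below assumes they are the usual closed-form solution of the two equations obtained by setting both partial derivatives to zero; the function name `least_squares_line` is made up for illustration.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line y = a*x + b.

    Assumes the standard closed-form solution of the normal equations:
        a = (n*sum(xy) - sum(x)*sum(y)) / (n*sum(x^2) - (sum(x))^2)
        b = (sum(y) - a*sum(x)) / n
    The function name is illustrative, not from the textbook.
    """
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("all x values are equal; the slope is undefined")
    a = (n * sxy - sx * sy) / denom
    b = (sy - a * sx) / n
    return a, b


# The four points from the linear regression example in these notes.
a, b = least_squares_line([1, 2, 3, 4], [20, 14, 11, 3])
print(f"y = {a:.1f}x + {b:.1f}")  # y = -5.4x + 25.5
```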

 


#20   For each of eight different years, the accompanying table gives the percentage of high school students who had used cocaine at least once in their lives up to that year:

Year                                    1991   1993   1995   1997   1999   2001   2003   2005
% who had used cocaine at least once    6.0    4.9    7.0    8.2    9.5    9.4    8.7    7.6

a)     Plot these data on a graph, with the number of years after 1991 on the x axis and the percentage of cocaine users on the y axis.

b)     Find the equation of the least-squares line for the data.

c)     Use the least-squares line to predict the percentage of high school students who used cocaine at least once by the year 2009.
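
Parts (b) and (c) can be checked numerically with a short script. This is a sketch that assumes the standard closed-form least-squares formulas and takes x as the number of years after 1991, as part (a) specifies; the variable names are illustrative.

```python
# Table data from #20; x is the number of years after 1991.
years = [1991, 1993, 1995, 1997, 1999, 2001, 2003, 2005]
percent = [6.0, 4.9, 7.0, 8.2, 9.5, 9.4, 8.7, 7.6]
xs = [year - 1991 for year in years]

# Sums needed by the closed-form least-squares formulas.
n = len(xs)
sx = sum(xs)
sy = sum(percent)
sxx = sum(x * x for x in xs)
sxy = sum(x * y for x, y in zip(xs, percent))

# (b) slope and intercept of the least-squares line y = a*x + b
a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
b = (sy - a * sx) / n
print(f"y = {a:.4f}x + {b:.2f}")  # roughly y = 0.2304x + 6.05

# (c) 2009 is 18 years after 1991
prediction = a * (2009 - 1991) + b
print(f"predicted percentage for 2009: {prediction:.1f}")  # roughly 10.2
```

The 2009 value is an extrapolation well past the last data point, and the table shows the percentage falling after 1999, so the straight-line prediction should be read with caution.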


#24  In a study of five industrial areas, a researcher obtained these data relating the average number of units of a certain pollutant in the air and the incidence (per 100,000 people) of a certain disease:

Units of pollutant      3.4    4.6    5.2    8.0    10.7
Incidence of disease    48     52     58     76     96

a)     Plot these data on a graph, using the Units of pollutant as the x-variable.

b)     Find the equation of the least-squares line for the data.

c)  Use the least-squares line to estimate the incidence of the disease in an area with an average pollution level of 7.3 units. 
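
One way to check parts (b) and (c) is the deviation-from-the-mean form of the least-squares formulas, which is algebraically equivalent to the sum-based form. This is a sketch under that assumption; the variable names are illustrative.

```python
# Table data from #24.
pollutant = [3.4, 4.6, 5.2, 8.0, 10.7]   # x: units of pollutant
incidence = [48, 52, 58, 76, 96]          # y: incidence per 100,000 people

n = len(pollutant)
x_bar = sum(pollutant) / n
y_bar = sum(incidence) / n

# Slope = sum of (x - x_bar)(y - y_bar) divided by sum of (x - x_bar)^2.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(pollutant, incidence))
s_xx = sum((x - x_bar) ** 2 for x in pollutant)

# (b) the least-squares line y = a*x + b passes through (x_bar, y_bar)
a = s_xy / s_xx
b = y_bar - a * x_bar
print(f"y = {a:.2f}x + {b:.2f}")  # roughly y = 6.73x + 23.05

# (c) estimated incidence at a pollution level of 7.3 units
estimate = a * 7.3 + b
print(f"estimated incidence at 7.3 units: {estimate:.1f}")  # roughly 72.2
```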


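Justification for y = b e^(mx)

         Take the natural log of both sides:  ln y = ln b + ln e^(mx) = m x + ln b

Then the linear best fit, for ln y against x, becomes  ln y = m x + ln b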

         

 

 


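Justification for y = b x^m

         Take the natural log of both sides:  ln y = ln b + ln x^m = m ln x + ln b

Then the linear best fit, for ln y against ln x, becomes  ln y = m ln x + ln b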

         

 

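Justification for y = A + B ln x

         y is already linear in ln x, so substitute X = ln x; the slope is m = B and the intercept is b = A.

Then the line of best fit, for y against ln x, is y = m ln x + b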
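
The three linearizations above can be sketched with one straight-line helper applied to transformed data. This assumes the two unnamed models are the exponential y = b e^(mx) and the power function y = b x^m, which is what the transformed equations imply. All function names are made up for illustration, and the check data is generated from a known curve rather than taken from the notes.

```python
import math


def fit_line(xs, ys):
    """Least-squares slope m and intercept c for ys ~ m*xs + c."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    m = s_xy / s_xx
    return m, y_bar - m * x_bar


def fit_exponential(xs, ys):
    """y = b*e^(m*x): fit ln y against x; the intercept is ln b."""
    m, ln_b = fit_line(xs, [math.log(y) for y in ys])
    return m, math.exp(ln_b)


def fit_power(xs, ys):
    """y = b*x^m: fit ln y against ln x; the intercept is ln b."""
    m, ln_b = fit_line([math.log(x) for x in xs], [math.log(y) for y in ys])
    return m, math.exp(ln_b)


def fit_logarithmic(xs, ys):
    """y = A + B*ln x: fit y against ln x; slope is B, intercept is A."""
    B, A = fit_line([math.log(x) for x in xs], ys)
    return A, B


# Made-up check data generated from y = 3*e^(0.5x); not from the notes.
xs = [1, 2, 3, 4]
ys = [3 * math.exp(0.5 * x) for x in xs]
m, b = fit_exponential(xs, ys)
print(round(m, 6), round(b, 6))  # 0.5 3.0
```

Note that the exponential and power fits minimize the squared residuals of ln y, not of y itself, so they can differ slightly from a direct least-squares fit of the original curve. Both also require y > 0, and the power and logarithmic fits require x > 0.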

 

A similar process of residuals could be used for a quadratic of the form y = ax^2 + bx + c, or for a cubic or quartic, but the process of partial derivatives and the resulting equations becomes very messy.

 

Find the coefficients of the parabola y = ax^2 + bx + c that is the "best" (least-squares) fit for the points (-1,-2), (0,1), (1,2), and (2,0).
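
A sketch of one way to carry this out: write the parabola as y = ax^2 + bx + c, set the three partial derivatives of the sum of squared residuals to zero, and solve the resulting 3x3 linear system. Exact fractions are used so the answer is not blurred by rounding; the helper names `power_sum` and `moment` are made up for illustration.

```python
from fractions import Fraction

points = [(-1, -2), (0, 1), (1, 2), (2, 0)]
n = Fraction(len(points))


def power_sum(k):
    """Sum of x^k over the data points (k >= 1)."""
    return sum(Fraction(x) ** k for x, _ in points)


def moment(k):
    """Sum of x^k * y over the data points (k >= 1)."""
    return sum(Fraction(x) ** k * y for x, y in points)


sum_y = sum(Fraction(y) for _, y in points)

# Normal equations for y = a*x^2 + b*x + c, as augmented rows [A | rhs]:
#   (sum x^4) a + (sum x^3) b + (sum x^2) c = sum x^2 y
#   (sum x^3) a + (sum x^2) b + (sum x)   c = sum x y
#   (sum x^2) a + (sum x)   b +  n        c = sum y
rows = [
    [power_sum(4), power_sum(3), power_sum(2), moment(2)],
    [power_sum(3), power_sum(2), power_sum(1), moment(1)],
    [power_sum(2), power_sum(1), n, sum_y],
]

# Gauss-Jordan elimination on the augmented matrix.
for i in range(3):
    # Swap in a row with a nonzero pivot in column i.
    pivot_row = next(r for r in range(i, 3) if rows[r][i] != 0)
    rows[i], rows[pivot_row] = rows[pivot_row], rows[i]
    pivot = rows[i][i]
    rows[i] = [v / pivot for v in rows[i]]
    for r in range(3):
        if r != i:
            factor = rows[r][i]
            rows[r] = [vr - factor * vi for vr, vi in zip(rows[r], rows[i])]

a, b, c = (row[3] for row in rows)
print(a, b, c)  # -5/4 39/20 23/20
```

In decimal form this is y = -1.25x^2 + 1.95x + 1.15, with residuals 0.05, -0.15, 0.15, and -0.05 at the four points.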