Difference between revisions of "Manuals/calci/REGRESSION"

From ZCubes Wiki
Jump to navigation Jump to search
 
(35 intermediate revisions by 3 users not shown)
Line 1: Line 1:
 
<div style="font-size:30px">'''REGRESSIONANALYSIS(y,x)'''</div><br/>
 
<div style="font-size:30px">'''REGRESSIONANALYSIS(y,x)'''</div><br/>
 +
 +
Regression analysis is a form of predictive modelling technique which investigates the relationship between a dependent (target) and independent variable (s) (predictor). This technique is used for forecasting, time series modelling and finding the causal effect relationship between the variables.
 +
 +
 
*<math>y </math> is the set of dependent variables .
 
*<math>y </math> is the set of dependent variables .
 
*<math>x </math> is the set of independent variables.
 
*<math>x </math> is the set of independent variables.
 
  
 
==Description==
 
==Description==
 
*This function is calculating the Regression analysis of the given data.
 
*This function is calculating the Regression analysis of the given data.
*This analysis is very useful for the analyzation of large amounts of data and making predictions.  
+
*This analysis is very useful for the analyzing the large amounts of data and making predictions.  
 
*This analysis give the result in three table values.
 
*This analysis give the result in three table values.
 
# Regression statistics table.
 
# Regression statistics table.
 
# ANOVA table.  
 
# ANOVA table.  
 
# Residual output.
 
# Residual output.
*1.Regression statistics :   
+
1.'''Regression statistics''' :   
*It contains multiple R, R Square, Adjusted R Square, Standard Error and observations.
+
*It contains multiple R, R Square, Adjusted R Square, Standard Error and observations.
*R square gives the fittness of the data with the regression line.  
+
*R square gives the fitness of the data with the regression line.  
 
*That value is closer to 1 is  the better the regression line  fits the data.  
 
*That value is closer to 1 is  the better the regression line  fits the data.  
 
*Standard Error refers to the estimated standard deviation of the error term. It is called the standard error of the regression.  
 
*Standard Error refers to the estimated standard deviation of the error term. It is called the standard error of the regression.  
*2.ANOVA table:   
+
2.'''ANOVA table''':   
 
*ANOVA is the analysis of variance.  
 
*ANOVA is the analysis of variance.  
 
*This table splits in to two components which is Residual and Regression.   
 
*This table splits in to two components which is Residual and Regression.   
*Total sum of squares= Residual (error) sum of squares+ Regression (explained) sum of squares.  
+
Total sum of squares = Residual (error) sum of squares + Regression (explained) sum of squares.  
 
*Also this table gives the probability, T stat, significance of F and P.  
 
*Also this table gives the probability, T stat, significance of F and P.  
 
*When the significance of F is < 0.05, then the result for the given data is statistically significant.  
 
*When the significance of F is < 0.05, then the result for the given data is statistically significant.  
 
*When  the significance of F is > 0.05, then better to stop using this set of independent variables.  
 
*When  the significance of F is > 0.05, then better to stop using this set of independent variables.  
*Then remove a variable with a high P-value  and returnun the regression until Significance F drops below 0.05.  
+
*Then remove a variable with a high P-value  and return the regression until Significance F drops below 0.05.  
 
*So the Significance of P value should be <0.05.  
 
*So the Significance of P value should be <0.05.  
 
*This table containing the regression coefficient values also.   
 
*This table containing the regression coefficient values also.   
*3.Residual output:  
+
3.'''Residual output''':  
*The residuals show you how far away the actual data points are fom the predicted data points.
+
*The residuals show you how far away the actual data points are from the predicted data points.
 
 
  
 
==Examples==
 
==Examples==
#Temperature -     Drying time(hours)
+
1.
54                               8
+
{| class="wikitable"
63                                6
+
|+Spreadsheet
75                                3
+
|-
82                                1
+
!  !! A !! B
=REGRESSIONANALYSIS(A1:A4,B1:B4)=
+
  |-
Unit sales    Ads      population
+
! 1
4000            12000    300000
+
| '''Temperature''' || '''Drying Time(Hrs)'''
5200            13150    411000
+
|-
6800            14090    500000
+
! 2
8000            11900    650000
+
| 54 || 8  
10000          15000    800000
+
|-
 +
! 3  
 +
| 63  || 6
 +
|-
 +
! 4
 +
| 75 || 3 
 +
|-
 +
! 5
 +
| 82 || 1
 +
|}
  
 +
=REGRESSIONANALYSIS(A2:A5,B2:B5)
 +
'''REGRESSION ANALYSIS OUTPUT'''
  
where  and x
+
{| class="wikitable"
'''REGRESSIONANALYSIS'''(XRange, YRange, ConfidenceLevel, NewTableFlag)
+
|+Summary Output
 +
|-
 +
! Regression Statistics !!
 +
|-
 +
| Multiple R || -0.9989241524588298
 +
|-
 +
| R Square || 0.9978494623655915
 +
|-
 +
| v14193 || 0.9967741935483871
 +
|-
 +
| v15308 || 0.7071067811865362
 +
|}
  
where,
+
{| class="wikitable"
 +
|+ANOVA
 +
|-
 +
! Source of Variation!! Sum Of Squares !! Degree Of Freedom !! Mean Of Squares !! F!! Significance F
 +
|-
 +
! Regression:
 +
| 464 || 1 || 464 ||928 || 0.0010758475411702228
 +
|-
 +
! Residual:
 +
| 1 || 2 || 0.5 ||  || 
 +
|-
 +
! Total:
 +
| 465  || 3  ||    ||  || 
 +
|}
  
'''XRange '''- Input range should be one block.
+
{| class="wikitable"
 +
|+ANOVA
 +
|-
 +
!  !!Coefficients!! Standard Error !! T Statistics !! Probability !! Lower 95%!! Upper 95%
 +
|-
 +
! Intercept:
 +
| 86.5 || 0.6885767430246738 || 125.62143708199632|| 0.00006336233990811291 || 83.53729339698289 || 89.46270660301711
 +
|-
 +
! X Variable
 +
| -4 || 0.13130643285972046 || -30.463092423456118 || 0.0010758475411701829 || -4.564965981777541  || -3.435034018222459 
 +
|}
  
'''YRange '''- Input range should be one block.
+
{| class="wikitable"
 +
|+Residual Output
 +
|-
 +
! Observation !! Predicted Y !! Residuals !! Standard Residuals
 +
|-
 +
| 1 || 54.5 || -0.5 ||-0.8660254037844387
 +
|-
 +
| 2 || 62.5 || 0.5 || 0.8660254037844387 
 +
|-
 +
| 3  || 74.5  || 0.5  || 0.8660254037844387 
 +
|-
 +
| 4  || 82.5  || -0.5  || -0.8660254037844387 
 +
|}
  
'''ConfidenceLevel '''- represents the confidence level for percentage nad value should in between 0 and 100.the default percentage is 95.
+
2.
 +
{| class="wikitable"
 +
|+Spreadsheet
 +
|-
 +
!  !! A !! B !! C
 +
|-
 +
! 1
 +
| '''Unit sales''' || ''' Ads ''' || ''' population'''
 +
|-
 +
! 2
 +
| 4000|| 12000 ||    300000
 +
|-
 +
! 3
 +
| 5200  || 13150 ||    411000
 +
|-
 +
! 4
 +
| 6800  ||  14090 ||    500000 
 +
|-
 +
! 5
 +
| 8000 || 11900 ||    650000
 +
|-
 +
!6
 +
|10000 || 15000 ||    800000
 +
|}
 +
     
 +
#REGRESSIONANALYSIS(A2:A6,B2:C6)
 +
'''REGRESSION ANALYSIS OUTPUT'''
 +
{| class="wikitable"
 +
|+SUMMARY OUTPUT
 +
|-
 +
!  Regression Statistics !!
 +
|-
 +
! Multiple R
 +
| 0.9973790019059987
 +
|-
 +
! R Square
 +
| 0.9947648734430062
 +
|-
 +
! Adjusted R Square
 +
| 0.9895297468860125
 +
|-
 +
! Standard Error
 +
| 240.4075647503864
 +
|-
 +
! Observations
 +
| 5
 +
|}
 +
{| class="wikitable"
 +
|+ANOVA
 +
|-
 +
! !! df !!SS !! MS !! F!! Significance F
 +
|-
 +
! Regression:
 +
| 2 ||  21964408.405621577 || 10982204.202810789 || 190.0173496501376 || 0.00523512655699377
 +
|-
 +
! Residual:
 +
| 2 ||115591.59437842245|| 57795.797189211225||  || 
 +
|-
 +
! Total:
 +
| 4  ||  22080000  ||    ||  || 
 +
|}
  
'''NewTableFlag''' - is the TRUE or FALSE.If set as TRUE,the result in new sheet. If NewTableFlag is omitted, it assumed to be FALSE.
+
{| class="wikitable"
 
+
|-
</div>
+
!  !!Coefficients!! Standard Error !! t Stat !! P-value !! Lower 95%!! Upper 95% ||Lower 95.0% || Upper 95.0%
----
+
|-
<div id="1SpaceContent" class="zcontent" align="left">Regression analysis is a technique used for the modeling and analysis of numerical data consisting of values of a dependent variable(response variable) and of one or more independent variables(explanatory variables).</div>
+
! Intercept:
----
+
| -1096.09242|| 1259.21057|| -0.87046||  0.47583 || -6514.03824|| 4321.85339  || -6514.03824|| 4321.8533
<div id="7SpaceContent" class="zcontent" align="left">
+
|-
 
+
! X Variable1
Lets see an example in (Column3Row1)
+
| 0.14076 ||0.10798|| 1.30359 || 0.32223|| -0.32384  || 0.60538 ||-0.32384 || 0.60538
 
+
|-
UNIQad0c61c6a857529e-nowiki-00000004-QINU
+
!X Variable2
 
+
| 0.01133 || 0.00073|| 15.45951 || 0.00415 || 0.00818 || 0.01449 || 0.00818 || 0.01449
REGRESSIONANALYSIS returns the result in new sheet(5Space).
+
|}
 
+
{| class="wikitable"
UNIQad0c61c6a857529e-nowiki-00000005-QINU
+
|+Residual Output
 
+
|-
RANKANDPERCENTILE returns the #ERROR(ConfidenceLevel=-5).
+
! Observation !! Predicted Y !! Residuals !! Standard Residuals 
 
+
|-
</div>
+
| 1 || 593.1069112686723 || 3406.8930887313277 || 1.5209125615152896
----
+
|-
<div id="12SpaceContent" class="zcontent" align="left"><div class="ZEditBox" align="left">
+
| 2 || 754.9885142857306 || 4445.011485714269 || 1.9843516155712606 
 
+
|-
REGRESSION
+
| 3 || 887.3091289257611  || 5912.690871074239  || 2.6395562126436793 
 
+
|-
</div></div>
+
| 4  || 579.0302501367541  || 7420.969749863246  ||3.312885323147887
----
+
|-
<div id="10SpaceContent" class="zcontent" align="left"><div class="ZEditBox" align="justify">Syntax </div><div class="ZEditBox"><center></center></div></div>
+
| 5 || 1015.4067452262161 || 8984.593254773783 ||4.010921501026477
----
 
<div id="4SpaceContent" class="zcontent" align="left"><div class="ZEditBox" align="justify">Remarks </div></div>
 
----
 
<div id="3SpaceContent" class="zcontent" align="left"><div class="ZEditBox" align="justify">Examples </div></div>
 
----
 
<div id="11SpaceContent" class="zcontent" align="left"><div class="ZEditBox" align="justify">Description </div></div>
 
----
 
<div id="8SpaceContent" class="zcontent" align="left">
 
 
 
If ConfidenceLevel &lt; 0 or ConfidenceLevel &gt;100, REGRESSIONANALYSIS returns the #ERROR.
 
 
 
If Lengthof XRange != Lengthof YRange, it returns the #ERROR.
 
 
 
</div>
 
----
 
<div id="2SpaceContent" class="zcontent" align="left">
 
 
 
{| id="TABLE3" class="SpreadSheet blue"
 
|- class="even"
 
| class=" " |
 
| Column1
 
| Column2
 
| class="  " | Column3
 
| Column4
 
|- class="odd"
 
| class=" " | Row1
 
| class=" " | 1
 
| class=" " | 3
 
| class="sshl_f" | 5Space
 
| class=" " |
 
|- class="even"
 
| class="  " | Row2
 
| class=" " | 7
 
| class=" " | 8
 
| class="    " |
 
| class=" " |
 
|- class="odd"
 
| Row3
 
| class=" " | 12
 
| class=" " | 10
 
| class=" " |
 
| class=" " |
 
|- class="even"
 
| Row4
 
| class=" " | 17
 
| class=" " | 18
 
| class="sshl_f" | #ERROR
 
| class=" " |
 
|- class="odd"
 
| class=" " | Row5
 
| class=" " | 37
 
| class=" " | 36
 
| class=" SelectTD1 ChangeBGColor SelectTD1" |
 
<div id="2Space_Handle" class="zhandles" title="Click and Drag to resize CALCI Column/Row/Cell. It is EZ!"></div><div id="2Space_Copy" class="zhandles" title="Click and Drag over to AutoFill other cells."></div><div id="2Space_Drag" class="zhandles" title="Click and Drag to Move/Copy Area.">[[Image:copy-cube.gif]]  </div>
 
| class=" " |
 
|- class="even"
 
| Row6
 
| class=" " | 6
 
| class=" " | 5
 
| class=" " |
 
| class=" " |
 
 
|}
 
|}
  
<div align="left">[[Image:calci1.gif]]</div></div>
+
==Related Videos==
----
 
<div id="5SpaceContent" class="zcontent" align="left">
 
  
{| class="SpreadSheet blue"
+
{{#ev:youtube|OlxiOJ26r_k|280|center|REGRESSION}}
|+ Regression Analysis<br />SUMMARY OUTPUT
 
|- class="even"
 
! Regression
 
! Statistics
 
|- class="odd"
 
| Multiple R
 
| 0.9933541399262806
 
|- class="even"
 
| R Square
 
| 0.9867524473086807
 
|- class="odd"
 
| AdjustedRSquare
 
| 0.9834405591358509
 
|- class="even"
 
| StandardError
 
| 1.6492920155683643
 
|- class="odd"
 
| Observations
 
| 6
 
|}
 
  
{| class="SpreadSheet blue"
+
==See Also==
|+ <br />ANOVA
+
*[[Manuals/calci/SLOPE| SLOPE]]
|- class="even"
+
*[[Manuals/calci/STEYX| STEYX]]
! Source of Variation
 
! Sum of Squares
 
! Degree of Freedom
 
! Mean of Squares
 
! F
 
! Significance F
 
|- class="odd"
 
| Regression:
 
| 810.452676722863
 
| 1
 
| 810.452676722863
 
| 297.9425620115529
 
| 0.0000661044188125954
 
|- class="even"
 
| Residual:
 
| 10.880656610470271
 
| 4
 
| 2.720164152617568
 
|- class="odd"
 
| Total:
 
| 821.3333333333333
 
| 5
 
|}
 
  
{| class="SpreadSheet blue"
 
|+ <br />
 
|- class="even"
 
!
 
! Coefficients
 
! Standard Error
 
! T Statistics
 
! Probability
 
! Lower 95%
 
! Upper 95%
 
! Lower95%
 
! Upper95%
 
|- class="odd"
 
| Intercept
 
| -0.5146406388642397
 
| 1.0473758171565741
 
| -0.4913619642865071
 
| 0.6489114785470887
 
| -3.4226220244213423
 
| 2.393340746692863
 
| -3.422622024421344
 
| 2.3933407466928646
 
|- class="even"
 
| X Variable
 
| 1.038598047914818
 
| 0.06017016854489271
 
| 17.26101277479262
 
| 0.00006610441881260698
 
| 0.8715388823047108
 
| 1.2056572135249252
 
| 0.8715388823047107
 
| 1.2056572135249252
 
|}
 
 
{| class="SpreadSheet blue"
 
|+ <br />RESIDUAL OUTPUT
 
|- class="even"
 
! Observation
 
! Predicted Y
 
! Residuals
 
! Standard Residuals
 
|- class="odd"
 
| 1
 
| 2.601153504880214
 
| -1.6011535048802141
 
| -1.0854015072917171
 
|- class="even"
 
| 2
 
| 7.7941437444543045
 
| -0.7941437444543045
 
| -0.5383398997096632
 
|- class="odd"
 
| 3
 
| 9.871339840283941
 
| 2.128660159716059
 
| 1.4429915300597537
 
|- class="even"
 
| 4
 
| 18.180124223602483
 
| -1.1801242236024833
 
| -0.7999911358813972
 
|- class="odd"
 
| 5
 
| 36.8748890860692
 
| 0.1251109139307971
 
| 0.08481109034532615
 
|- class="even"
 
| 6
 
| 4.678349600709851
 
| 1.3216503992901493
 
| 0.8959299224777
 
|}
 
  
</div>
+
==References==
----
+
*[http://en.wikipedia.org/wiki/Regression_analysis Regression]

Latest revision as of 07:02, 9 May 2016

REGRESSIONANALYSIS(y,x)


Regression analysis is a form of predictive modelling technique which investigates the relationship between a dependent (target) and independent variable (s) (predictor). This technique is used for forecasting, time series modelling and finding the causal effect relationship between the variables.


  • is the set of dependent variables .
  • is the set of independent variables.

Description

  • This function is calculating the Regression analysis of the given data.
  • This analysis is very useful for the analyzing the large amounts of data and making predictions.
  • This analysis give the result in three table values.
  1. Regression statistics table.
  2. ANOVA table.
  3. Residual output.

1.Regression statistics :

  • It contains multiple R, R Square, Adjusted R Square, Standard Error and observations.
  • R square gives the fitness of the data with the regression line.
  • That value is closer to 1 is the better the regression line fits the data.
  • Standard Error refers to the estimated standard deviation of the error term. It is called the standard error of the regression.

2.ANOVA table:

  • ANOVA is the analysis of variance.
  • This table splits in to two components which is Residual and Regression.
Total sum of squares = Residual (error) sum of squares + Regression (explained) sum of squares. 
  • Also this table gives the probability, T stat, significance of F and P.
  • When the significance of F is < 0.05, then the result for the given data is statistically significant.
  • When the significance of F is > 0.05, then better to stop using this set of independent variables.
  • Then remove a variable with a high P-value and return the regression until Significance F drops below 0.05.
  • So the Significance of P value should be <0.05.
  • This table containing the regression coefficient values also.

3.Residual output:

  • The residuals show you how far away the actual data points are from the predicted data points.

Examples

1.

Spreadsheet
A B
1 Temperature Drying Time(Hrs)
2 54 8
3 63 6
4 75 3
5 82 1
=REGRESSIONANALYSIS(A2:A5,B2:B5)

REGRESSION ANALYSIS OUTPUT

Summary Output
Regression Statistics
Multiple R -0.9989241524588298
R Square 0.9978494623655915
v14193 0.9967741935483871
v15308 0.7071067811865362
ANOVA
Source of Variation Sum Of Squares Degree Of Freedom Mean Of Squares F Significance F
Regression: 464 1 464 928 0.0010758475411702228
Residual: 1 2 0.5
Total: 465 3
ANOVA
Coefficients Standard Error T Statistics Probability Lower 95% Upper 95%
Intercept: 86.5 0.6885767430246738 125.62143708199632 0.00006336233990811291 83.53729339698289 89.46270660301711
X Variable -4 0.13130643285972046 -30.463092423456118 0.0010758475411701829 -4.564965981777541 -3.435034018222459
Residual Output
Observation Predicted Y Residuals Standard Residuals
1 54.5 -0.5 -0.8660254037844387
2 62.5 0.5 0.8660254037844387
3 74.5 0.5 0.8660254037844387
4 82.5 -0.5 -0.8660254037844387

2.

Spreadsheet
A B C
1 Unit sales Ads population
2 4000 12000 300000
3 5200 13150 411000
4 6800 14090 500000
5 8000 11900 650000
6 10000 15000 800000
  1. REGRESSIONANALYSIS(A2:A6,B2:C6)

REGRESSION ANALYSIS OUTPUT

SUMMARY OUTPUT
Regression Statistics
Multiple R 0.9973790019059987
R Square 0.9947648734430062
Adjusted R Square 0.9895297468860125
Standard Error 240.4075647503864
Observations 5
ANOVA
df SS MS F Significance F
Regression: 2 21964408.405621577 10982204.202810789 190.0173496501376 0.00523512655699377
Residual: 2 115591.59437842245 57795.797189211225
Total: 4 22080000
Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
Intercept: -1096.09242 1259.21057 -0.87046 0.47583 -6514.03824 4321.85339 -6514.03824 4321.8533
X Variable1 0.14076 0.10798 1.30359 0.32223 -0.32384 0.60538 -0.32384 0.60538
X Variable2 0.01133 0.00073 15.45951 0.00415 0.00818 0.01449 0.00818 0.01449
Residual Output
Observation Predicted Y Residuals Standard Residuals
1 593.1069112686723 3406.8930887313277 1.5209125615152896
2 754.9885142857306 4445.011485714269 1.9843516155712606
3 887.3091289257611 5912.690871074239 2.6395562126436793
4 579.0302501367541 7420.969749863246 3.312885323147887
5 1015.4067452262161 8984.593254773783 4.010921501026477

Related Videos

REGRESSION

See Also


References