Python statsmodels output does not match Excel / Google Sheets output

That Developer

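I have a small dataset, and for some reason the statsmodels output does not match Excel's.

Here is what I did. I have two columns:

Miles Traveled  Travel Time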
89 7.0
66 5.4
78 6.6
111 7.4
44 4.8
77 6.4
80 7.0
66 5.6
109 7.3
76 6.4

This is the output I get in Google Sheets:

                               Slope            Intercept
Coefficient                    0.04025678079    3.185560249
Standard error                 0.005706415564   0.4669507938
R-squared / std. error         0.8615153295     0.3423088398
F-statistic / residual df      49.76812677      8
Regression SS / residual SS    5.831597265      0.9374027345

This output also matches the Excel output.

However, when I do the following in statsmodels:

import statsmodels.api as sm

milesTraveled = [89.0, 66.0, 78.0, 111.0, 44.0, 77.0, 80.0, 66.0, 109.0, 76.0]
travelTime = [7.0, 5.4, 6.6, 7.4, 4.8, 6.4, 7.0, 5.6, 7.3, 6.4]

model = sm.OLS(travelTime, milesTraveled).fit()
print(model.summary())

I get the following:

                                 OLS Regression Results                                
=======================================================================================
Dep. Variable:            Travel Time   R-squared (uncentered):                   0.985
Model:                            OLS   Adj. R-squared (uncentered):              0.983
Method:                 Least Squares   F-statistic:                              575.6
Date:                Mon, 01 Feb 2021   Prob (F-statistic):                    1.82e-09
Time:                        10:18:44   Log-Likelihood:                         -11.951
No. Observations:                  10   AIC:                                      25.90
Df Residuals:                       9   BIC:                                      26.20
Df Model:                           1                                                  
Covariance Type:            nonrobust                                                  
==================================================================================
                     coef    std err          t      P>|t|      [0.025      0.975]
----------------------------------------------------------------------------------
Miles Traveled     0.0781      0.003     23.991      0.000       0.071       0.085
==============================================================================
Omnibus:                        2.179   Durbin-Watson:                   2.654
Prob(Omnibus):                  0.336   Jarque-Bera (JB):                1.033
Skew:                          -0.777   Prob(JB):                        0.597
Kurtosis:                       2.741   Cond. No.                         1.00
==============================================================================

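As you can see, the values for the standard error, R-squared, and so on do not match Google Sheets / Excel at all. What am I doing wrong? How can I get exactly the same results summary as Google Sheets / Excel?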

Warren Weckesser

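By default, the OLS class does not include a constant term (intercept) in the linear model. You can use sm.add_constant to create the appropriate exog argument for OLS: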

In [36]: milesTraveled = [89.0, 66.0, 78.0, 111.0, 44.0, 77.0, 80.0, 66.0, 109.0, 76.0]

In [37]: travelTime = [7.0, 5.4, 6.6, 7.4, 4.8, 6.4, 7.0, 5.6, 7.3, 6.4]

In [38]: X = sm.add_constant(milesTraveled)

In [39]: model = sm.OLS(travelTime, X).fit()

In [40]: print(model.summary())
/Users/warren/a2020.11/lib/python3.8/site-packages/scipy/stats/stats.py:1603: UserWarning: kurtosistest only valid for n>=20 ... continuing anyway, n=10
  warnings.warn("kurtosistest only valid for n>=20 ... continuing "
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                      y   R-squared:                       0.862
Model:                            OLS   Adj. R-squared:                  0.844
Method:                 Least Squares   F-statistic:                     49.77
Date:                Mon, 01 Feb 2021   Prob (F-statistic):           0.000107
Time:                        13:04:53   Log-Likelihood:                -2.3532
No. Observations:                  10   AIC:                             8.706
Df Residuals:                       8   BIC:                             9.312
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          3.1856      0.467      6.822      0.000       2.109       4.262
x1             0.0403      0.006      7.055      0.000       0.027       0.053
==============================================================================
Omnibus:                        0.542   Durbin-Watson:                   2.608
Prob(Omnibus):                  0.763   Jarque-Bera (JB):                0.554
Skew:                           0.370   Prob(JB):                        0.758
Kurtosis:                       2.115   Cond. No.                         353.
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
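
As a cross-check that does not depend on statsmodels, the two fits can be reproduced by hand with the textbook least-squares formulas. The sketch below is only an illustration: the helper names fit_with_intercept and fit_through_origin are made up for this example and are not part of any library. It assumes that a model without a constant is scored against the uncentered total sum of squares (the sum of y squared), which is what the "R-squared (uncentered)" label in the question's summary indicates.

```python
import math

miles_traveled = [89.0, 66.0, 78.0, 111.0, 44.0, 77.0, 80.0, 66.0, 109.0, 76.0]
travel_time = [7.0, 5.4, 6.6, 7.4, 4.8, 6.4, 7.0, 5.6, 7.3, 6.4]


def fit_with_intercept(x, y):
    """Least-squares fit of y = intercept + slope * x (hypothetical helper)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_resid = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    ss_total = sum((yi - mean_y) ** 2 for yi in y)  # centered total SS
    df_resid = n - 2  # two estimated parameters: slope and intercept
    return {
        "slope": slope,
        "intercept": intercept,
        "slope_se": math.sqrt(ss_resid / df_resid / sxx),
        "r_squared": 1 - ss_resid / ss_total,
        "df_resid": df_resid,
    }


def fit_through_origin(x, y):
    """Least-squares fit of y = slope * x, no constant (hypothetical helper)."""
    n = len(x)
    sum_xx = sum(xi * xi for xi in x)
    slope = sum(xi * yi for xi, yi in zip(x, y)) / sum_xx
    ss_resid = sum((yi - slope * xi) ** 2 for xi, yi in zip(x, y))
    ss_total = sum(yi * yi for yi in y)  # uncentered total SS
    df_resid = n - 1  # one estimated parameter: slope
    return {
        "slope": slope,
        "slope_se": math.sqrt(ss_resid / df_resid / sum_xx),
        "r_squared": 1 - ss_resid / ss_total,
        "df_resid": df_resid,
    }


with_const = fit_with_intercept(miles_traveled, travel_time)
no_const = fit_through_origin(miles_traveled, travel_time)

for label, fit in (("with constant", with_const), ("through origin", no_const)):
    print(label, {name: round(value, 4) for name, value in fit.items()})
```

With the data from the question, the fit with a constant gives a slope of about 0.0403, an intercept of about 3.1856, and an R-squared of about 0.862 on 8 residual degrees of freedom, which agrees with the Google Sheets figures. The fit through the origin gives a slope of about 0.0781 and an uncentered R-squared of about 0.985 on 9 residual degrees of freedom, which agrees with the first statsmodels summary. The two outputs differ because they describe two different models, not because either tool computes anything incorrectly.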
