Print 'std err' value from statsmodels OLS results
(Sorry to ask but http://statsmodels.sourceforge.net/ is currently down and I can't access the docs)
I'm doing a linear regression using statsmodels, basically:
import statsmodels.api as sm
model = sm.OLS(y,x)
results = model.fit()
I know that I can print out the full set of results with:
print results.summary()
which outputs something like:
                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.952
Model:                            OLS   Adj. R-squared:                  0.951
Method:                 Least Squares   F-statistic:                     972.9
Date:                Mon, 20 Jul 2015   Prob (F-statistic):           5.55e-34
Time:                        15:35:22   Log-Likelihood:                -78.843
No. Observations:                  50   AIC:                             159.7
Df Residuals:                      49   BIC:                             161.6
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1             1.0250      0.033     31.191      0.000         0.959     1.091
==============================================================================
Omnibus:                       16.396   Durbin-Watson:                   2.166
Prob(Omnibus):                  0.000   Jarque-Bera (JB):                3.480
Skew:                          -0.082   Prob(JB):                        0.175
Kurtosis:                       1.718   Cond. No.                         1.00
==============================================================================
Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
I need a way to print out only the values of coef and std err.
I can access coef with:
print results.params
but I've found no way to print out std err.
How can I do this?
Solution 1:[1]
Applying the answer given here, I used dir() to print all the attributes of the results object. After that I searched for the one that contained the std err value, and it turned out to be:
print results.bse
(Not sure what the b stands for in bse, but I guess the se stands for "standard error".)
Solution 2:[2]
results.bse provides standard errors for the coefficients, identical to those listed in results.summary().
The standard error of the regression is obtained using results.scale**.5. It is also identical to np.sqrt(np.sum(results.resid**2)/results.df_resid), where results is your fitted model.
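A small sketch to verify that claim, assuming a fitted results object like the one in the question (the data here is synthetic and only for illustration):

import numpy as np
import statsmodels.api as sm

x = np.arange(1, 51, dtype=float)
y = x + np.random.normal(size=50)
results = sm.OLS(y, x).fit()

print(results.bse)  # per-coefficient standard errors, as in the summary table

se_regression = results.scale ** 0.5                             # standard error of the regression
manual = np.sqrt(np.sum(results.resid ** 2) / results.df_resid)  # the same quantity computed by hand
print(np.isclose(se_regression, manual))                         # True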
Solution 3:[3]
The following function can be used to get an overview of the regression result. The parameter ols_model is the fitted regression model generated by statsmodels.formula.api. The output is a pandas DataFrame containing the regression coefficients, standard errors, p-values, number of observations, AIC, and adjusted R-squared. The standard errors are shown in parentheses; ***, **, and * mark the 0.01, 0.05, and 0.1 significance levels, respectively:
import numpy as np
import pandas as pd


def output_regres_result(ols_model, variable_list: list):
    """
    Create a pandas dataframe saving the regression analysis result

    :param ols_model: a fitted linear model containing the regression result.
        type: statsmodels.regression.linear_model.RegressionResultsWrapper
    :param variable_list: a list of the variable names of interest
    :return: a pandas dataframe saving the regression coefficients, p-values, standard errors,
        AIC, number of observations, and adjusted R-squared
    """
    coef_dict = ols_model.params.to_dict()          # coefficient dictionary
    pval_dict = ols_model.pvalues.to_dict()         # p-value dictionary
    std_error_dict = ols_model.bse.to_dict()        # standard error dictionary
    num_observs = int(ols_model.nobs)               # number of observations
    aic_val = round(ols_model.aic, 2)               # AIC value
    adj_rsqured = round(ols_model.rsquared_adj, 3)  # adjusted R-squared

    info_index = ['Num', 'AIC', 'Adjusted R2']
    index_list = variable_list + info_index

    for variable in variable_list:
        assert variable in coef_dict, 'Something wrong with variable name!'

    coef_vals = []
    for variable in variable_list:
        std_val = std_error_dict[variable]
        coef_val = coef_dict[variable]
        p_val = pval_dict[variable]
        if p_val <= 0.01:
            coef_vals.append('{}***({})'.format(round(coef_val, 4), round(std_val, 3)))
        elif 0.01 < p_val <= 0.05:
            coef_vals.append('{}**({})'.format(round(coef_val, 4), round(std_val, 3)))
        elif 0.05 < p_val <= 0.1:
            coef_vals.append('{}*({})'.format(round(coef_val, 4), round(std_val, 3)))
        else:
            coef_vals.append('{}({})'.format(round(coef_val, 4), round(std_val, 3)))
    coef_vals.extend([num_observs, aic_val, adj_rsqured])

    result_data = pd.DataFrame()
    result_data['coef'] = coef_vals
    result_data_reindex = result_data.set_index(pd.Index(index_list))
    return result_data_reindex
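A hypothetical usage sketch, assuming the imports above; the DataFrame and column names are made up purely for illustration:

import statsmodels.formula.api as smf

df = pd.DataFrame({'x1': np.arange(1, 51, dtype=float)})
df['y'] = 2.0 * df['x1'] + np.random.normal(size=50)

ols_model = smf.ols('y ~ x1', data=df).fit()
print(output_regres_result(ols_model, ['x1']))
# index: x1, Num, AIC, Adjusted R2; the x1 row holds a string like "coef***(std err)"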
Solution 4:[4]
Statistically, the standard error of the estimate is always equal to the square root of the residual mean square error. It can be obtained from results using np.sqrt(results.mse_resid).
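A quick check, assuming the fitted results object from the question: for an OLS fit this is the same quantity that Solution 2 computes as results.scale**.5.

import numpy as np
print(np.sqrt(results.mse_resid))  # standard error of the regression
print(results.scale ** 0.5)        # same value for an OLS fit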
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Community |
| Solution 2 | Topchi |
| Solution 3 | Bright Chang |
| Solution 4 | ah bon |