Unbalanced panel error in PMG Analysis in R
I am trying to run a Fama-MacBeth analysis in R, using the pmg function with the following code:
Fpmg1 <- pmg(ret ~ HML_OBS + SMB + Mktrf + HML, Analysis4_Weighted, index = c("permno"))
summary(Fpmg1)
I currently have 1,354,623 entries and 11 columns in total. I get the output below, in which the estimates for my coefficients are NA.
Mean Groups model
Call:
pmg(formula = ret ~ HML_OBS + SMB + Mktrf + HML, data = Analysis4_Weighted,
index = c("date", "permno"))
Unbalanced Panel: n = 295, T = 3567-6287, N = 1349058
Residuals:
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-1.065356 -0.077703 -0.008573  0.000000  0.060437 19.741368
Coefficients:
             Estimate Std. Error z-value Pr(>|z|)
(Intercept) 0.0110395  0.0034105   3.237 0.001208 **
HML_OBS            NA         NA      NA       NA
SMB                NA         NA      NA       NA
Mktrf              NA         NA      NA       NA
HML                NA         NA      NA       NA
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Total Sum of Squares: 50764
Residual Sum of Squares: 45906
Multiple R-squared: 0.0957
I filtered the data as follows before running the model:
Analysis4_Weighted <-
  Analysis4_Weighted %>%
  dplyr::filter(!is.na(HML_OBS))
Analysis4_Weighted <-
  Analysis4_Weighted %>%
  dplyr::filter(!is.na(ret))
Analysis4_Weighted <-
  Analysis4_Weighted %>%
  group_by(date) %>%
  dplyr::filter(n() > 10)
Do you know why I do not get any estimates?
My data consists of returns on different stocks over a long time period, and I am trying to test the coefficients' ability to predict stock returns over that period across the various stocks.
Thank you!
Solution 1:[1]
This may be because pmg runs a separate regression for each permno and needs enough observations to do so: with n factors, you need at least n+1 time-series observations per permno. Some of your permnos may have fewer than that. You would need to either fill in the missing observations or drop the permnos that do not have enough observations for estimation.
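As a rough sketch of the second option (dropping permnos with too few observations), something like the following could be used. The toy data frame, its values, and the names `toy`, `n_factors`, `min_obs`, `obs_per_permno` and `toy_filtered` are made up for illustration; only `permno`, `date` and `ret` come from the question, and the n+1 threshold is the one suggested above rather than a documented limit of pmg.

```r
library(dplyr)

# Toy stand-in for Analysis4_Weighted (values are random, for illustration only)
set.seed(1)
toy <- data.frame(
  permno = rep(c(10001, 10002, 10003), times = c(12, 3, 8)),
  date   = c(1:12, 1:3, 1:8),
  ret    = rnorm(23)
)

n_factors <- 4              # HML_OBS, SMB, Mktrf, HML
min_obs   <- n_factors + 1  # assumed threshold: n + 1 observations per permno

# Inspect how many observations each permno has
obs_per_permno <- toy %>% count(permno, name = "n_obs")
print(obs_per_permno)

# Keep only permnos with at least min_obs observations
toy_filtered <- toy %>%
  group_by(permno) %>%
  filter(n() >= min_obs) %>%
  ungroup()
```

Here permno 10002 has only 3 observations and is dropped, while the other two are kept. On the real data the same pattern would be applied to Analysis4_Weighted before calling pmg.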
Solution 2:[2]
From this line in the output
pmg(formula = ret ~ HML_OBS + SMB + Mktrf + HML, data = Analysis4_Weighted,
index = c("date", "permno"))
we can see that you (implicitly or explicitly) defined a variable called date as the first index variable. The first index variable is meant to be the unit of observation (often called the individual index), while the second index variable is supposed to be the time period. Very likely your variable date holds the time periods and should go into the second slot, with permno in the first slot of the index argument.
Try to specify your pdata.frame explicitly beforehand and use it in the estimation with pmg, i.e., something along these lines:
pdat <- pdata.frame(Analysis4_Weighted, index = c("permno", "date"))
pmg(formula = ret ~ HML_OBS + SMB + Mktrf + HML, data = pdat)
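To check that the index ended up the right way round before estimating, the panel dimensions can be inspected with pdim. The following is a minimal sketch on simulated data; the data frame `toy`, the regressors `x1` and `x2`, and the coefficient values are hypothetical and only stand in for the real variables.

```r
library(plm)

# Simulated balanced panel: 5 stocks (permno) observed over 30 periods (date)
set.seed(42)
toy <- expand.grid(permno = 1:5, date = 1:30)
toy$x1  <- rnorm(nrow(toy))
toy$x2  <- rnorm(nrow(toy))
toy$ret <- 0.5 * toy$x1 - 0.2 * toy$x2 + rnorm(nrow(toy))

# Individual index first, time index second
pdat <- pdata.frame(toy, index = c("permno", "date"))

# pdim() reports n (individuals) and T (time periods);
# here it should show 5 individuals and 30 periods
dims <- pdim(pdat)
print(dims)

# Mean groups estimation on the explicitly indexed data
fit <- pmg(ret ~ x1 + x2, data = pdat)
summary(fit)
```

If pdim shows the number of dates as n and the number of stocks as T, the two index variables are swapped.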
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Richard Gregory |
Solution 2 | Helix123 |