Is there a way of translating the importance of each feature into the observations in PCA?
I am doing PCA with some 140 countries (observations) and 20 features. I have already run the model, and the results suggest keeping the first three components.
I am confused because I don't know whether there is a way to translate those PC values into the observations. I am asking because someone who ran the same model in Stata sent me a table listing the observations (not the features) with their values for each PC we kept. Is this usually done? If so, is there a way to do it in Python?
Solution 1:[1]
I went back to the basics and did everything from scratch using only numpy to better understand what .fit(x) and .fit_transform(x) actually do. I ended up getting the values for each of the countries with .fit_transform(x).
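As a rough illustration of that from-scratch exercise, the sketch below mimics the two steps with NumPy alone. It is not the code from the original answer: the random matrix `x` is a hypothetical stand-in for the 140 x 20 country data, and the variable names are made up for the example. It assumes the usual formulation in which the features are centered (not scaled) and the principal axes come from an SVD; the scores it produces should agree with scikit-learn's `fit_transform` output up to the sign of each component.

```python
import numpy as np

# Hypothetical stand-in for the real data: 140 observations x 20 features.
rng = np.random.default_rng(0)
x = rng.normal(size=(140, 20))

# "fit": center the features and find the principal axes via SVD.
mean = x.mean(axis=0)
x_centered = x - mean
U, S, Vt = np.linalg.svd(x_centered, full_matrices=False)
components = Vt[:3]  # shape (3, 20): feature loadings for each component

# "transform": project every observation onto those axes.
scores = x_centered @ components.T  # shape (140, 3): one row per observation

# Share of the total variance captured by each retained component.
explained_variance_ratio = (S[:3] ** 2) / np.sum(S ** 2)
```

Each row of `scores` belongs to one observation (one country), which is the kind of per-observation table the question describes, while `components` holds the per-feature weights.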
Here is the chunk of code that did it for me.
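# Imports needed for this snippet
import pandas as pd
from sklearn.decomposition import PCA

# Create a PCA object and fit it to the data; x holds one row per country
pca = PCA(n_components=3)
principalComponents = pca.fit_transform(x)
# Put the component scores into a DataFrame
principalDf = pd.DataFrame(data=principalComponents,
                           columns=['PC1', 'PC2', 'PC3'])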
I then used pd.concat() to add the country names and other information I needed.
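The answer does not show this last step, so here is a minimal sketch of how it might look. The country names and score values are invented for illustration, and `countries`, `principal_components`, and `result` are hypothetical names; in practice the score matrix would be whatever `pca.fit_transform(x)` returned.

```python
import numpy as np
import pandas as pd

# Hypothetical inputs: country labels and the (n_countries, 3) score matrix
# that pca.fit_transform(x) would return.
countries = pd.Series(["Argentina", "Brazil", "Chile"], name="country")
principal_components = np.array([[1.2, -0.3, 0.1],
                                 [-0.7, 0.9, -0.4],
                                 [-0.5, -0.6, 0.3]])

principal_df = pd.DataFrame(principal_components, columns=["PC1", "PC2", "PC3"])

# axis=1 places the columns side by side; rows are matched on the index,
# so both objects need the same index (here the default 0..n-1).
result = pd.concat([countries, principal_df], axis=1)
```

Because `pd.concat` aligns rows on the index, a source DataFrame that was filtered or sorted beforehand probably needs `reset_index(drop=True)` first; otherwise the names and scores can end up on different rows, with NaN filling the gaps.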
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Liz |