PLS-DA Loading Plot in Python
How can I make a loading plot with Matplotlib for a PLS-DA model, similar to the loading plot of a PCA?
This answer explains how it can be done with PCA: Plot PCA loadings and loading in biplot in sklearn (like R's autoplot)
However, there are some significant differences between the two methods, which make the implementation different as well. (Some of the relevant differences are explained here: https://learnche.org/pid/latent-variable-modelling/projection-to-latent-structures/interpreting-pls-scores-and-loadings )
To make the PLS-DA plot I use the following code:
import numpy as np
import pandas as pd
from sklearn.cross_decomposition import PLSRegression
from sklearn.preprocessing import StandardScaler

# df: samples x features DataFrame; sample_description: class label per sample
targets = [0, 1]
x_vals = StandardScaler().fit_transform(df.values)
# Encode class membership as a binary response (1 = first target class)
y = [g == targets[0] for g in sample_description]
y = np.array(y, dtype=int)
plsr = PLSRegression(n_components=2, scale=False)  # data is already scaled
plsr.fit(x_vals, y)
colormap = {
    targets[0]: '#ff0000',  # Red
    targets[1]: '#0000ff',  # Blue
}
colorlist = [colormap[c] for c in sample_description]
scores = pd.DataFrame(plsr.x_scores_)
scores.index = df.index
x_loadings = plsr.x_loadings_
y_loadings = plsr.y_loadings_
# get_default_fig_ax is a helper that is not shown here; it returns a figure and axes
fig1, ax = get_default_fig_ax('Scores on LV 1', 'Scores on LV 2', title)
ax = scores.plot(x=0, y=1, kind='scatter', s=50, alpha=0.7,
                 c=colorlist, ax=ax)
Solution 1:[1]
I took your code and extended it. The biplot is obtained by simply overlaying the score plot and the loading plot. Other, more rigorous plots could be made with truly shared axes, as described in https://blogs.sas.com/content/iml/2019/11/06/what-are-biplots.html#:~:text=A%20biplot%20is%20an%20overlay,them%20on%20a%20single%20plot.
The code below generates the biplot. For a dataset with ~200 features it draws ~200 red arrows, one per feature:
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.cross_decomposition import PLSRegression

# X_train, Y_train: pandas DataFrames (features and class labels)
pls2 = PLSRegression(n_components=2)
pls2.fit(X_train, Y_train)
x_loadings = pls2.x_loadings_  # shape (n_features, n_components)

# Score plot
fig, ax = plt.subplots(constrained_layout=True)
scores = pd.DataFrame(pls2.x_scores_)
scores.plot(x=0, y=1, kind='scatter', s=50, alpha=0.7,
            c=Y_train.values[:, 0], ax=ax)

# Loading plot on a transparent second axes laid over the score plot
newax = fig.add_axes(ax.get_position(), frameon=False)
comp_1_idx = 0
comp_2_idx = 1
feature_n = x_loadings.shape[0]
print(x_loadings.shape)
for feature_i in range(feature_n):
    newax.arrow(0, 0, x_loadings[feature_i, comp_1_idx],
                x_loadings[feature_i, comp_2_idx], color='r', alpha=0.5)
newax.get_xaxis().set_visible(False)
newax.get_yaxis().set_visible(False)
plt.show()
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |