Quick way to visualise multiple columns in Altair with regression lines
So far I have been visualising multiple columns quickly in Altair with repeat. This works until I want to add regression lines using transform_regression or text using mark_text, because I cannot add layers when using repeat.
Pandas makes it very easy to get the correlations for a full DataFrame using df.corr; it would be great to have a similarly quick way to visualise all, or several, columns at once.
Example code:
import altair as alt
from vega_datasets import data
from altair.expr import datum

iris = data.iris()

chart = alt.Chart(iris).mark_circle().encode(
    alt.X(alt.repeat("column"), type='quantitative'),
    alt.Y(alt.repeat("row"), type='quantitative'),
    color='species:N'
).properties(
    width=100,
    height=100
).repeat(
    row=['sepalLength', 'sepalWidth', 'petalLength', 'petalWidth'],
    column=['petalWidth', 'petalLength', 'sepalWidth', 'sepalLength']
)

chart
Here is the output from the code: [figure: 4×4 grid of scatter plots, one per pair of columns, coloured by species]
So my question: is there a way to quickly add extras, e.g. a regression line, when using repeat? If not, what is the best route to quickly visualise multiple columns of data in one go while adding extras?
Solution 1:[1]
In place of the repeat, you can use two fold transforms and a row/column facet, and then the regression transform can be applied directly. Here is an example:
import altair as alt
from vega_datasets import data

# Same dataset as in the question
iris = data.iris()

# Fold the four columns twice: once for the x axis, once for the y axis.
base = alt.Chart(iris).transform_fold(
    ['sepalLength', 'sepalWidth', 'petalLength', 'petalWidth'],
    as_=['key_x', 'value_x']
).transform_fold(
    ['sepalLength', 'sepalWidth', 'petalLength', 'petalWidth'],
    as_=['key_y', 'value_y']
).encode(
    x=alt.X('value_x:Q', title=None),
    y=alt.Y('value_y:Q', title=None),
).properties(
    width=100,
    height=100
)

# Layer the points and one regression line per species, then facet by the fold keys.
alt.layer(
    base.mark_circle().encode(color='species:N'),
    base.transform_regression(
        'value_x', 'value_y',
        groupby=['key_x', 'key_y', 'species']
    ).mark_line(
        color='black'
    ).encode(
        detail='species:N'
    )
).facet(
    column=alt.Column('key_x:N', title=None),
    row=alt.Row('key_y:N', sort='descending')
).resolve_scale(
    x='independent',
    y='independent'
)
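To see why two fold transforms can stand in for repeat, it may help to look at what they do to the data. The following is only a plain-Python sketch of the idea, not Altair's or Vega-Lite's implementation: the fold helper and the two sample records are made up for illustration. Each fold emits one output row per input row and folded field, so folding twice gives every record once per (key_x, key_y) pair, and those pairs are what the facet splits into panels.

```python
def fold(rows, fields, as_):
    """Illustrative stand-in for a fold transform (hypothetical helper).

    Emits one output row per (input row, folded field), keeping the
    original fields and adding a key column and a value column.
    """
    key_name, value_name = as_
    return [
        {**row, key_name: field, value_name: row[field]}
        for row in rows
        for field in fields
    ]


# Two made-up records standing in for the iris data.
rows = [
    {"sepalLength": 5.1, "petalLength": 1.4, "species": "setosa"},
    {"sepalLength": 6.3, "petalLength": 6.0, "species": "virginica"},
]
fields = ["sepalLength", "petalLength"]

# Fold once for the x axis, then again for the y axis.
folded = fold(
    fold(rows, fields, ("key_x", "value_x")),
    fields,
    ("key_y", "value_y"),
)

# Every (key_x, key_y) combination becomes one facet panel.
panels = sorted({(r["key_x"], r["key_y"]) for r in folded})

print(len(folded))  # number of rows after both folds
print(panels)       # the panel keys the facet would split on
```

With 2 records and 2 folded fields this yields 2 × 2 × 2 = 8 rows spread over 4 panels; with the four iris columns each record appears 16 times. Because key_x and key_y are ordinary fields after the fold, they can be used in groupby for transform_regression, which is not possible with the repeat references.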
Solution 2:[2]
I copied and pasted the above code, but I get a single graph with all the points and lines on it, not the multiple graphs shown above.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | jakevdp |
Solution 2 | Sarah Abdelazim |