Quick way to visualise multiple columns in Altair with regression lines

I have been using repeat to visualise multiple columns quickly in Altair. This works until I want to add regression lines with transform_regression or text with mark_text, because I cannot add layers when using repeat.

Pandas makes it very easy to get the correlations for a full DataFrame using df.corr(). It would be great to have a similarly quick way to visualise all (or several) columns at once.

Example code:

import altair as alt
from vega_datasets import data
from altair.expr import datum

iris = data.iris()

chart = alt.Chart(iris).mark_circle().encode(
    alt.X(alt.repeat("column"), type='quantitative'),
    alt.Y(alt.repeat("row"), type='quantitative'),
    color='species:N'
).properties(
    width=100,
    height=100
).repeat(
    row=['sepalLength', 'sepalWidth', 'petalLength','petalWidth'],
    column=['petalWidth','petalLength', 'sepalWidth', 'sepalLength']
)

chart

Here is the output from the code: [image]

So my question: is there a way to quickly add extras, e.g. a regression line, when using repeat? If not, what is the best way to visualise multiple columns of data in one go while adding such extras?



Solution 1:[1]

In place of the repeat, you can use two fold transforms and a row/column facet, and then the regression transform can be applied directly. Here is an example:

import altair as alt
from vega_datasets import data

# Load the same iris dataset used in the question
iris = data.iris()

base = alt.Chart(iris).transform_fold(
    ['sepalLength', 'sepalWidth', 'petalLength','petalWidth'],
    as_=['key_x', 'value_x']
).transform_fold(
    ['sepalLength', 'sepalWidth', 'petalLength','petalWidth'],
    as_=['key_y', 'value_y']
).encode(
    x=alt.X('value_x:Q', title=None),
    y=alt.Y('value_y:Q', title=None),
).properties(
    width=100,
    height=100
)

alt.layer(
    base.mark_circle().encode(color='species:N'),
    base.transform_regression(
        'value_x', 'value_y',
        groupby=['key_x', 'key_y', 'species']
    ).mark_line(
        color='black'
    ).encode(
        detail='species:N'
    )
).facet(
    column=alt.Column('key_x:N', title=None),
    row=alt.Row('key_y:N', sort='descending')
).resolve_scale(
    x='independent',
    y='independent'
)

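To see why the two fold transforms in Solution 1 give one facet panel per pair of columns, it may help to mimic them in plain Python. This is only a rough sketch of the idea, not how Altair or Vega-Lite implements it: the fold helper is hypothetical, and the two sample rows (with just two measurement columns) are illustrative values rather than the full dataset.

```python
def fold(rows, fields, as_):
    """Hypothetical stand-in for a fold transform.

    Emits one output row per (input row, field) pair, keeping the
    original fields and adding a key column and a value column.
    """
    key_name, value_name = as_
    out = []
    for row in rows:
        for field in fields:
            new_row = dict(row)            # original fields are kept
            new_row[key_name] = field      # which column was folded
            new_row[value_name] = row[field]
            out.append(new_row)
    return out


# Illustrative sample rows (assumed values, two columns only)
rows = [
    {"sepalLength": 5.1, "sepalWidth": 3.5, "species": "setosa"},
    {"sepalLength": 7.0, "sepalWidth": 3.2, "species": "versicolor"},
]
fields = ["sepalLength", "sepalWidth"]

# Folding twice yields every (x column, y column) combination per row
folded = fold(
    fold(rows, fields, ("key_x", "value_x")),
    fields,
    ("key_y", "value_y"),
)

# Each distinct (key_x, key_y) pair corresponds to one facet panel
panels = sorted({(r["key_x"], r["key_y"]) for r in folded})

print(len(folded))   # 8 rows: 2 input rows x 2 x-fields x 2 y-fields
print(len(panels))   # 4 panels: a 2 x 2 grid
```

Because key_x and key_y are ordinary fields after folding, they can be named in the groupby of transform_regression and in the facet, which is what the repeat-based chart cannot offer.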

Solution 2:[2]

I copied and pasted the code from Solution 1, but I get a single chart with all the points and lines on it, not the grid of separate charts that the answer shows.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution    Source
Solution 1  jakevdp
Solution 2  Sarah Abdelazim