Change order of categorical bars in Plotly parallel categories

I am trying to visualize changes in gene expression as categorical variables (up, down, no change) over various timepoints.

I have a dataframe describing differential expression data that looks like this:

import pandas as pd

data = {'gene': ['Svm3G0018840', 'Svm5G0011050', 'Svm9G0059770'],
        '01h': ['nc', 'up', 'down'], '04h': ['up', 'down', 'nc'], '08h': ['nc', 'down', 'up']}
df = pd.DataFrame.from_dict(data)
df = df.set_index('gene')

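I can use a dataframe like this (herbdf is my full dataset, which also has 24h and 48h columns) to create the parallel categories plot with the following code:

import plotly.express as px

fig = px.parallel_categories(herbdf, dimensions=['01h', '04h', '08h', '24h', '48h'],
                labels={'01h': '', '04h': '', '08h': '', '24h': '', '48h': ''})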

fig.show()

However, the categories (up, down, nc) are not in the same order at every time point, which makes the figure very difficult to read. I can reorder them by hand in the interactive figure in a notebook, but from there I can only export the corrected figure as a low-quality PNG. I need the image in SVG format, which means I need to use the line:

fig.write_image("/figs/herb_de_pp.svg")

But when I add this line to the code block to save the figure, I have no control over the order in which the category boxes end up:

I have tried adding fig.update_* calls to solve this problem, such as:

fig.update_layout(xaxis={'categoryorder':'total descending'})

but this doesn't seem to change the output at all.

I could be missing something simple; any help would be much appreciated!



Solution 1:[1]

Not a great answer here, but something that I think will work in a pinch...

It looks like the order of the categories in each column comes from the order in which they appear in the original dataset. That is, in your first column, nc is the first unique item, then down is the second unique item, and up is the third.

So, if you can rearrange/sort your data so that the data shows up in the order you want it displayed, that should work.

Have your first row be nc | nc | nc | nc | nc, second row down | down | down | down | down, and third row up | up | up | up | up (assuming you actually have records like that). That should do it, but isn't very elegant...
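
To see why the boxes differ between columns, here is a minimal sketch in plain Python that lists the first-appearance order per column, using the small sample data from the question. It assumes, as described above, that categories are laid out in the order they first occur in the data; first_appearance_order is a hypothetical helper name, not part of Plotly.

```python
# Sample data from the question (gene index omitted for brevity).
data = {
    '01h': ['nc', 'up', 'down'],
    '04h': ['up', 'down', 'nc'],
    '08h': ['nc', 'down', 'up'],
}


def first_appearance_order(values):
    """Return the unique values in the order they first occur."""
    # A dict preserves insertion order, so repeats are dropped but order is kept.
    return list(dict.fromkeys(values))


orders = {col: first_appearance_order(vals) for col, vals in data.items()}
for col, order in orders.items():
    print(col, order)
```

Each column here starts with a different category, which matches the mismatched boxes in the saved figure.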

Solution 2:[2]

Given the above solution, this is the line needed to sort the dataframe and produce the figure with ordered categories:

sorteddf = df.sort_values(by=['01h','04h','08h'], axis=0, ascending=False)
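
Whether sorting alone is enough probably depends on the data, since only the first sort column is guaranteed to come out in order. A rough check, written in plain Python rather than pandas and assuming the first-appearance behaviour described in Solution 1, is to sort the rows and look at the order in which categories first occur in each column. The helper name column_orders is hypothetical.

```python
# Rows of the sample data from the question: ('01h', '04h', '08h').
rows = [
    ('nc', 'up', 'nc'),
    ('up', 'down', 'down'),
    ('down', 'nc', 'up'),
]

# Roughly what sort_values(by=['01h', '04h', '08h'], ascending=False) does:
# tuples compare column by column, and strings compare alphabetically.
sorted_rows = sorted(rows, reverse=True)


def column_orders(rows):
    """For each column, list the categories in order of first appearance."""
    return [list(dict.fromkeys(col)) for col in zip(*rows)]


orders = column_orders(sorted_rows)
for name, order in zip(['01h', '04h', '08h'], orders):
    print(name, order)
```

On this three-row sample the columns still disagree after sorting, which suggests the sort works best when, as Solution 1 notes, the full dataset contains rows such as up | up | up that end up at the top.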

Solution 3:[3]

Parallel categories diagrams don't have xaxis/yaxis properties, so you need to update the traces to change the category order within each dimension:

dimensions = ['01h', '04h', '08h','24h','48h']
...
fig.update_traces(dimensions=[{"categoryorder": "category descending"} for _ in dimensions])
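
If alphabetical order is not what you want, the parcats dimension also accepts categoryorder set to "array" together with a categoryarray listing the categories explicitly. A minimal sketch that builds the update dictionaries in plain Python follows; build_dimension_updates is a hypothetical helper name, and the order up, nc, down is an assumed preference, not something stated in the question.

```python
def build_dimension_updates(columns, category_order):
    """Build one update dict per dimension, all sharing the same explicit order."""
    return [
        {"categoryorder": "array", "categoryarray": list(category_order)}
        for _ in columns
    ]


dimensions = ['01h', '04h', '08h', '24h', '48h']
updates = build_dimension_updates(dimensions, ['up', 'nc', 'down'])
print(updates[0])
```

The resulting list would be passed in the same way as in the line above, i.e. fig.update_traces(dimensions=updates), before calling fig.write_image.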

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 scotscotmcc
Solution 2 Daniel Al Mouiee
Solution 3 Miguel Vieira