How to plot sequential data, changing the color according to cluster

I have a dataframe containing dates and the cluster each date belongs to (the clustering was done beforehand, based on the temperatures collected for each day). I want to plot this data in sequence, like a stacked bar chart, changing the color of each element according to its assigned cluster. Here is my table (the data goes up to 100 days):

Date order ClusterNo2 constant
2020-08-07 1 3.0 1
2020-08-08 2 0.0 1
2020-08-09 3 1.0 1
2020-08-10 4 3.0 1
2020-08-11 5 1.0 1
2020-08-12 6 1.0 1
2020-08-13 7 3.0 1
2020-08-14 8 2.0 1
2020-08-15 9 2.0 1
2020-08-16 10 2.0 1
2020-08-17 11 2.0 1
2020-08-18 12 1.0 1
2020-08-19 13 1.0 1
2020-08-20 14 0.0 1
2020-08-21 15 0.0 1
2020-08-22 16 1.0 1

Note: I can't simply group the data by cluster, because the plot has to be sequential. I thought about writing code to count the number of consecutive elements in each cluster, but then I would face the same problem when plotting. Does anyone know how to solve this?

The expected result should look something like this, with the number inside each bar giving the cluster, the x-axis giving the time in days, and the bar width giving the number of consecutive days observed in the same cluster (image: example_graph).



Solution 1:[1]

You could just plot a regular bar chart with one bar per day. If you also set the bar width to 1, adjacent bars touch, so consecutive days in the same cluster look like one contiguous patch.

import numpy as np
import matplotlib.pyplot as plt

from matplotlib.colors import BoundaryNorm

# simulate data
total_datapoints = 16
total_clusters = 4
order = np.arange(total_datapoints)
clusters = np.random.randint(0, total_clusters, size=total_datapoints)

# map clusters to colors
cmap = plt.cm.tab10
bounds = np.arange(total_clusters + 1)
norm = BoundaryNorm(bounds, cmap.N)
colors = [cmap(norm(cluster)) for cluster in clusters]

# plot
fig, ax = plt.subplots()
ax.bar(order, np.ones_like(order), width=1, color=colors, align='edge')

# xticks
change_points = np.where(np.diff(clusters) != 0)[0] + 1
change_points = np.unique([0] + change_points.tolist() + [total_datapoints])
ax.set_xticks(change_points)

# annotate clusters
for ii, dx in enumerate(np.diff(change_points)):
    xx = change_points[ii] + dx/2
    ax.text(xx, 0.5, str(clusters[int(xx)]), ha='center', va='center')

ax.set_xlabel('Time (days)')
plt.show()
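
The question also mentions counting the consecutive days in each cluster. As a rough sketch of that alternative (not part of the original answer), the runs can be computed with `itertools.groupby` and drawn as one bar per run, with the bar width equal to the run length. The `clusters` list below is copied from the first 16 rows of the question's table; the helper names `run_lengths` and `plot_runs` are made up for illustration.

```python
from itertools import groupby

# ClusterNo2 values copied from the question's table (first 16 days)
clusters = [3, 0, 1, 3, 1, 1, 3, 2, 2, 2, 2, 1, 1, 0, 0, 1]


def run_lengths(labels):
    """Return a (start, width, label) tuple for each run of equal consecutive labels."""
    runs = []
    start = 0
    for label, group in groupby(labels):
        width = sum(1 for _ in group)  # number of consecutive days in this run
        runs.append((start, width, label))
        start += width
    return runs


def plot_runs(runs):
    """Draw one bar per run; hypothetical helper, assumes matplotlib is installed."""
    # imported here so that run_lengths() can be used without matplotlib
    import matplotlib.pyplot as plt

    cmap = plt.cm.tab10
    fig, ax = plt.subplots()
    for start, width, label in runs:
        # an integer argument selects that entry of the colormap
        ax.bar(start, 1, width=width, color=cmap(label), align='edge')
        ax.text(start + width / 2, 0.5, str(label), ha='center', va='center')
    # ticks at the start of every run plus the right edge of the last one
    last_start, last_width, _ = runs[-1]
    ax.set_xticks([start for start, _, _ in runs] + [last_start + last_width])
    ax.set_xlabel('Time (days)')
    return fig, ax


runs = run_lengths(clusters)
```

Calling `plot_runs(runs)` and then `matplotlib.pyplot.show()` should give a picture similar to the one-bar-per-day version, but with a single patch per run instead of several adjacent bars of the same color.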

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
