I have a dataframe generated from Python’s Pandas package. How can I generate heatmap using DataFrame from pandas package.
import numpy as np
from pandas import *
Index= ['aaa','bbb','ccc','ddd','eee']
Cols = ['A', 'B', 'C','D']
df = DataFrame(abs(np.random.randn(5, 4)), index= Index, columns=Cols)
>>> df
A B C D
aaa 2.431645 1.248688 0.267648 0.613826
bbb 0.809296 1.671020 1.564420 0.347662
ccc 1.501939 1.126518 0.702019 1.596048
ddd 0.137160 0.147368 1.504663 0.202822
eee 0.134540 3.708104 0.309097 1.641090
>>>
Answers:
Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.
Method 1
For people looking at this today, I would recommend the Seaborn heatmap() as documented here.
The example above would be done as follows:
import numpy as np from pandas import DataFrame import seaborn as sns %matplotlib inline Index= ['aaa', 'bbb', 'ccc', 'ddd', 'eee'] Cols = ['A', 'B', 'C', 'D'] df = DataFrame(abs(np.random.randn(5, 4)), index=Index, columns=Cols) sns.heatmap(df, annot=True)

Where %matplotlib is an IPython magic function for those unfamiliar.
Method 2
If you don’t need a plot per say, and you’re simply interested in adding color to represent the values in a table format, you can use the style.background_gradient() method of the pandas data frame. This method colorizes the HTML table that is displayed when viewing pandas data frames in e.g. the JupyterLab Notebook and the result is similar to using “conditional formatting” in spreadsheet software:
import numpy as np import pandas as pd index= ['aaa', 'bbb', 'ccc', 'ddd', 'eee'] cols = ['A', 'B', 'C', 'D'] df = pd.DataFrame(abs(np.random.randn(5, 4)), index=index, columns=cols) df.style.background_gradient(cmap='Blues')
For detailed usage, please see the more elaborate answer I provided on the same topic previously and the styling section of the pandas documentation.
Method 3
You want matplotlib.pcolor:
import numpy as np from pandas import DataFrame import matplotlib.pyplot as plt index = ['aaa', 'bbb', 'ccc', 'ddd', 'eee'] columns = ['A', 'B', 'C', 'D'] df = DataFrame(abs(np.random.randn(5, 4)), index=index, columns=columns) plt.pcolor(df) plt.yticks(np.arange(0.5, len(df.index), 1), df.index) plt.xticks(np.arange(0.5, len(df.columns), 1), df.columns) plt.show()
This gives:
Method 4
Useful sns.heatmap api is here. Check out the parameters, there are a good number of them. Example:
import seaborn as sns
%matplotlib inline
idx= ['aaa','bbb','ccc','ddd','eee']
cols = list('ABCD')
df = DataFrame(abs(np.random.randn(5,4)), index=idx, columns=cols)
# _r reverses the normal order of the color map 'RdYlGn'
sns.heatmap(df, cmap='RdYlGn_r', linewidths=0.5, annot=True)
Method 5
If you want an interactive heatmap from a Pandas DataFrame and you are running a Jupyter notebook, you can try the interactive Widget Clustergrammer-Widget, see interactive notebook on NBViewer here, documentation here
And for larger datasets you can try the in-development Clustergrammer2 WebGL widget (example notebook here)
Method 6
Please note that the authors of seaborn only want seaborn.heatmap to work with categorical dataframes. It’s not general.
If your index and columns are numeric and/or datetime values, this code will serve you well.
Matplotlib heat-mapping function pcolormesh requires bins instead of indices, so there is some fancy code to build bins from your dataframe indices (even if your index isn’t evenly spaced!).
The rest is simply np.meshgrid and plt.pcolormesh.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
def conv_index_to_bins(index):
"""Calculate bins to contain the index values.
The start and end bin boundaries are linearly extrapolated from
the two first and last values. The middle bin boundaries are
midpoints.
Example 1: [0, 1] -> [-0.5, 0.5, 1.5]
Example 2: [0, 1, 4] -> [-0.5, 0.5, 2.5, 5.5]
Example 3: [4, 1, 0] -> [5.5, 2.5, 0.5, -0.5]"""
assert index.is_monotonic_increasing or index.is_monotonic_decreasing
# the beginning and end values are guessed from first and last two
start = index[0] - (index[1]-index[0])/2
end = index[-1] + (index[-1]-index[-2])/2
# the middle values are the midpoints
middle = pd.DataFrame({'m1': index[:-1], 'p1': index[1:]})
middle = middle['m1'] + (middle['p1']-middle['m1'])/2
if isinstance(index, pd.DatetimeIndex):
idx = pd.DatetimeIndex(middle).union([start,end])
elif isinstance(index, (pd.Float64Index,pd.RangeIndex,pd.Int64Index)):
idx = pd.Float64Index(middle).union([start,end])
else:
print('Warning: guessing what to do with index type %s' %
type(index))
idx = pd.Float64Index(middle).union([start,end])
return idx.sort_values(ascending=index.is_monotonic_increasing)
def calc_df_mesh(df):
"""Calculate the two-dimensional bins to hold the index and
column values."""
return np.meshgrid(conv_index_to_bins(df.index),
conv_index_to_bins(df.columns))
def heatmap(df):
"""Plot a heatmap of the dataframe values using the index and
columns"""
X,Y = calc_df_mesh(df)
c = plt.pcolormesh(X, Y, df.values.T)
plt.colorbar(c)
Call it using heatmap(df), and see it using plt.show().
Method 7
Surprised to see no one mentioned more capable, interactive and easier to use alternatives.
A) You can use plotly:
- Just two lines and you get:
- interactivity,
- smooth scale,
- colors based on whole dataframe instead of individual columns,
- column names & row indices on axes,
- zooming in,
- panning,
- built-in one-click ability to save it as a PNG format,
- auto-scaling,
- comparison on hovering,
-
bubbles showing values so heatmap still looks good and you can see
values wherever you want:
import plotly.express as px fig = px.imshow(df.corr()) fig.show()
B) You can also use Bokeh:
All the same functionality with a tad much hassle. But still worth it if you do not want to opt-in for plotly and still want all these things:
from bokeh.plotting import figure, show, output_notebook
from bokeh.models import ColumnDataSource, LinearColorMapper
from bokeh.transform import transform
output_notebook()
colors = ['#d7191c', '#fdae61', '#ffffbf', '#a6d96a', '#1a9641']
TOOLS = "hover,save,pan,box_zoom,reset,wheel_zoom"
data = df.corr().stack().rename("value").reset_index()
p = figure(x_range=list(df.columns), y_range=list(df.index), tools=TOOLS, toolbar_location='below',
tooltips=[('Row, Column', '@level_0 x @level_1'), ('value', '@value')], height = 500, width = 500)
p.rect(x="level_1", y="level_0", width=1, height=1,
source=data,
fill_color={'field': 'value', 'transform': LinearColorMapper(palette=colors, low=data.value.min(), high=data.value.max())},
line_color=None)
color_bar = ColorBar(color_mapper=LinearColorMapper(palette=colors, low=data.value.min(), high=data.value.max()), major_label_text_font_size="7px",
ticker=BasicTicker(desired_num_ticks=len(colors)),
formatter=PrintfTickFormatter(format="%f"),
label_standoff=6, border_line_color=None, location=(0, 0))
p.add_layout(color_bar, 'right')
show(p)
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0






