Pandas has a great object: the Styler. It can do wonderful things. When doing research, it often helps to visualize the data with colors. I'm the first one to reach for Matplotlib when it is needed, but sometimes there is just no substitute for looking at the data itself, and coloring the data helps a great deal with that. Highlighting null values, understanding the scale of the data, or getting a sense of proportions is a lot easier with styling.
In the old days, I used to export my dataframe to an Excel file or a Google Sheet and deploy the reliable conditional formatting. I still love conditional formatting, but I hate exporting tables, and Excel and Google Sheets lack the programmatic abilities Python has.
So with the Styler object you research with style!
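import numpy as np
import pandas as pd
import blog
blog.set_blog_style()
# Note: pandas.util.testing is deprecated since pandas 1.0 and removed in 2.0
import pandas.util.testing as tm
tm.N, tm.K = 10, 7  # 10 rows, 7 columns (A to G)
st = tm.makeTimeDataFrame() * 100
st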
Now let's insert some null values:
stnan = st.copy()
stnan[np.random.rand(*stnan.shape) < 0.05] = np.nan
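Since the testing helpers used above are not available in recent pandas releases, here is a minimal sketch of how a similar dataframe could be built with NumPy alone. The seed, the start date and the variable name `rng` are my own assumptions for illustration, not something the helper guarantees.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility

# 10 business days x 7 columns (A to G) of standard-normal noise, scaled by 100
index = pd.date_range("2000-01-03", periods=10, freq="B")  # assumed start date
columns = list("ABCDEFG")
st = pd.DataFrame(rng.standard_normal((10, 7)) * 100, index=index, columns=columns)

# Replace roughly 5% of the cells with NaN, leaving st itself untouched
stnan = st.mask(rng.random(st.shape) < 0.05)
```

Using `mask` returns a new dataframe, so the explicit `copy()` is not needed in this variant.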
The Styler Object
Pandas uses an accessor to get a Styler object on the dataframe. This object implements a _repr_html_ method, which is what Jupyter Notebooks use to render dataframes so nicely. You can also export the HTML.
stnan.style # This looks just like the dataframe.
Basic Built-In Styling
The Styler object has some nice built-in functions. You can highlight nulls, min, max, etc. You can also apply them along an axis, just as you would when applying functions to a dataframe.
In the next example we should expect:
- nan values to be red
- each row to have one yellow cell
- each column to have one blue cell
(stnan
.style
.highlight_null('red')
.highlight_max(color='steelblue', axis = 0)
.highlight_min(color ='gold', axis = 1)
)
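If you want to double-check which cells should light up, one option is to compute the same positions with plain pandas. This is only a sanity-check sketch on a small made-up dataframe, not part of the Styler API: `idxmax(axis=0)` gives the row of each column's maximum, and `idxmin(axis=1)` gives the column of each row's minimum.

```python
import numpy as np
import pandas as pd

# Small hypothetical frame with a single missing value
df = pd.DataFrame(
    {"A": [1.0, 5.0, 3.0], "B": [4.0, np.nan, 9.0], "C": [7.0, 2.0, 6.0]}
)

# axis=0 works down each column: one maximum per column (the blue cells)
col_max_rows = df.idxmax(axis=0)

# axis=1 works across each row: one minimum per row (the yellow cells)
row_min_cols = df.idxmin(axis=1)

# Cells that highlight_null would color
null_count = int(df.isna().sum().sum())

print(col_max_rows.tolist())  # [1, 2, 0]
print(row_min_cols.tolist())  # ['A', 'C', 'A']
print(null_count)             # 1
```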
Color Scales
If you want to understand the scale of the data, applying a gradient creates a sort of "heat map" on the table. In the next example, lower values are white while higher values are dark blue.
st.style.background_gradient()
Custom
For me this is the best part: with a bit of CSS you can do anything to your dataframe. This is where we really differ from Excel or Sheets, since doing all of this programmatically makes life so much easier.
def custom_style(val):
    if val < -100:
        return 'background-color:red'  # Low values are red
    elif val > 100:
        return 'background-color:green'  # High values are green
    elif abs(val) < 5:
        return 'background-color:yellow'  # Values close to 0 are yellow
    else:
        return ''
# On pandas >= 2.1 this method is named Styler.map; applymap is deprecated there
st.style.applymap(custom_style)
Bars
Applying bars to your data is a nice way to see how the values compare with one another. I know economists make great use of them, but anybody can employ them to get a quick grasp of the data.
(st.style
.bar(subset=['A','D'],color='steelblue')
.bar(subset=['G'],color=['indianred','limegreen'], align='mid')
)