© 2022

Pandas Series Breakdown by Counts and Percentages

Example Data

name            role                  office  age  start_date  salary    currency
Airi Satou      Accountant            Tokyo   33   2008/11/28  162700    USD
Angelica Ramos  CEO                   London  47   2009/10/09  1200000   GBP
Michael Till    CTO                   London  59   2002/10/09  2200000   GBP
Bruno Nash      Software Engineer     London  35   2019/10/09  900000    GBP
Yuri Nazdev     Software Engineer     London  34   2020/12/01  950000    GBP
Carla Santos    Marketing Specialist  Miami   36   2015/07/14  200700    USD
Edward Bale     Marketing Lead        Miami   39   2010/07/14  300700    USD
Togashi Yuta    Designer              Tokyo   55   2005/01/28  23173600  JPY
Arinori Mori    Senior Advisor        Tokyo   55   2005/01/28  23173600  JPY
Jessica Leed    Senior Advisor        Austin  49   2008/03/24  350000    USD
Sanjeev Gupta   Senior Advisor        Austin  52   2012/04/26  320000    USD
Ken Vog         Marketing Specialist  Austin  22   2021/05/22  99000     USD
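
The calls further down refer to a DataFrame named df that is never constructed in the post. One way to build it from the table above is sketched here. The variable names columns and rows are illustrative, and leaving start_date as plain strings is an assumption, since the original does not say how that column is stored.

```python
import pandas as pd

columns = ["name", "role", "office", "age", "start_date", "salary", "currency"]

# One tuple per row of the example table
rows = [
    ("Airi Satou", "Accountant", "Tokyo", 33, "2008/11/28", 162700, "USD"),
    ("Angelica Ramos", "CEO", "London", 47, "2009/10/09", 1200000, "GBP"),
    ("Michael Till", "CTO", "London", 59, "2002/10/09", 2200000, "GBP"),
    ("Bruno Nash", "Software Engineer", "London", 35, "2019/10/09", 900000, "GBP"),
    ("Yuri Nazdev", "Software Engineer", "London", 34, "2020/12/01", 950000, "GBP"),
    ("Carla Santos", "Marketing Specialist", "Miami", 36, "2015/07/14", 200700, "USD"),
    ("Edward Bale", "Marketing Lead", "Miami", 39, "2010/07/14", 300700, "USD"),
    ("Togashi Yuta", "Designer", "Tokyo", 55, "2005/01/28", 23173600, "JPY"),
    ("Arinori Mori", "Senior Advisor", "Tokyo", 55, "2005/01/28", 23173600, "JPY"),
    ("Jessica Leed", "Senior Advisor", "Austin", 49, "2008/03/24", 350000, "USD"),
    ("Sanjeev Gupta", "Senior Advisor", "Austin", 52, "2012/04/26", 320000, "USD"),
    ("Ken Vog", "Marketing Specialist", "Austin", 22, "2021/05/22", 99000, "USD"),
]

# start_date is kept as a string here (assumption); the breakdowns below do not use it
df = pd.DataFrame(rows, columns=columns)
```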

Breakdown Function

import pandas as pd  # pandas==1.3.2


def break_down_series(series):
    """Return value counts, proportions and percentages of a pandas series"""
    counts = series.value_counts()
    props = series.value_counts(normalize=True).round(2)
    percentages = props.mul(100).round(2).astype(str) + "%"
    return pd.DataFrame(
        {"count": counts, "proportion": props, "percentage": percentages}
    )
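
For intuition about the proportion column: value_counts(normalize=True) returns each count divided by the total number of counted values (missing values are excluded by default). The small check below illustrates that relationship on a made-up toy series; the series s and the name manual_props are hypothetical and not part of the example data.

```python
import pandas as pd

# Hypothetical toy series, unrelated to the example data
s = pd.Series(["a", "b", "a", "c", "a", "b"])

counts = s.value_counts()                # a: 3, b: 2, c: 1
manual_props = counts / counts.sum()     # divide each count by the total
props = s.value_counts(normalize=True)   # same proportions, computed by pandas

print(props.round(2))
```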

Note: this kind of breakdown makes the most sense for categorical or discrete data. For continuous numeric data, pandas.Series.describe() is usually a better fit: it returns descriptive statistics, including the 25th, 50th, and 75th percentiles.
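
As a rough illustration of the describe() alternative, the sketch below summarizes the age values from the example data. The ages are typed in directly so the snippet runs on its own; the variable names ages and stats are illustrative.

```python
import pandas as pd

# Ages copied from the example data
ages = pd.Series([33, 47, 59, 35, 34, 36, 39, 55, 55, 49, 52, 22], name="age")

# count, mean, std, min, 25%, 50%, 75%, max
stats = ages.describe()
print(stats)
```

The result is a Series indexed by statistic name, so individual values can be read with, for example, stats["50%"] for the median.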

Breakdown by Office

break_down_series(df["office"])
office  count  proportion  percentage
London  4      0.33        33.0%
Tokyo   3      0.25        25.0%
Austin  3      0.25        25.0%
Miami   2      0.17        17.0%

Breakdown by Currency

break_down_series(df["currency"])
currency  count  proportion  percentage
USD       6      0.50        50.0%
GBP       4      0.33        33.0%
JPY       2      0.17        17.0%
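
One detail worth noting in the tables above: the percentage column is derived from the already-rounded proportion, so 2 out of 12 shows as 17.0% rather than 16.67%. If finer percentages are wanted, a variant could compute them from the unrounded proportions instead. The sketch below is a hypothetical alternative, not the function used in this post; the name break_down_series_precise and the decimals parameter are assumptions introduced for illustration. It is shown on the currency values from the example data.

```python
import pandas as pd


def break_down_series_precise(series, decimals=2):
    """Hypothetical variant: percentages come from unrounded proportions."""
    counts = series.value_counts()
    props = series.value_counts(normalize=True)  # not rounded yet
    percentages = props.mul(100).round(decimals).astype(str) + "%"
    return pd.DataFrame(
        {
            "count": counts,
            "proportion": props.round(decimals),
            "percentage": percentages,
        }
    )


# Currency values from the example data: 6 USD, 4 GBP, 2 JPY
currency = pd.Series(["USD"] * 6 + ["GBP"] * 4 + ["JPY"] * 2, name="currency")
result = break_down_series_precise(currency)
print(result)
```

With this variant GBP is reported as 33.33% and JPY as 16.67%, while the count and proportion columns match the table above.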