Using GroupBy and Aggregate on Python Dataframes

Let’s say we have a dataset of movie reviews, stored as a pandas DataFrame ‘df’. The examples below use its critic, title, review and fresh columns.

To find the critics with the most reviews, group by critic, count the rows in each group, and sort the counts in descending order:

# Critics with the most reviews (top 10)
df.groupby("critic").size().reset_index(name="count").sort_values("count", ascending=False).head(10)

Output: [image: table of the 10 critics with the highest review counts]
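To try the counting pattern without the original dataset, here is a minimal sketch on a made-up DataFrame. The critic names and values are hypothetical; only the column names follow the example above. It also shows value_counts, which should give the same counts with less typing.

```python
import pandas as pd

# Hypothetical toy data; only the column names mirror the article's dataset.
df = pd.DataFrame({
    "critic": ["Ann", "Bob", "Ann", "Cy", "Ann", "Bob"],
    "title": ["A", "A", "B", "B", "C", "C"],
    "fresh": [1, 0, 1, 1, 0, 1],
    "review": ["good", "bad", "fine", "great", "dull", "fun"],
})

# Count the rows per critic, turn the result back into a DataFrame,
# and keep the ten critics with the highest counts.
top_critics = (
    df.groupby("critic")
      .size()
      .reset_index(name="count")
      .sort_values("count", ascending=False)
      .head(10)
)
print(top_critics)

# Alternative: value_counts returns a Series of counts, already sorted descending.
top_critics_alt = df["critic"].value_counts().head(10)
print(top_critics_alt)
```

The groupby version returns a DataFrame with critic and count columns, while value_counts returns a Series indexed by critic, so pick whichever shape suits the next step.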

To apply different functions to more than one column, use the agg (aggregate) method and pass it a dictionary that maps each column to a function. For example, to get the movie titles sorted by number of reviews and mean freshness (the mean of the freshness ratings across all reviews of a movie):

result = df.groupby('title').agg({'review': 'count', 'fresh': 'mean'}).sort_values(['review', 'fresh'], ascending=False)
result.columns = ['numReviews', 'meanFreshness']
result

Output: [image: table of movie titles with their numReviews and meanFreshness values, sorted in descending order]
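As a variation, the aggregation and the column renaming can be done in one step with named aggregation (available in pandas 0.25 and later). This is a sketch on made-up data, and it assumes fresh is numeric (1 for fresh, 0 for rotten). If the real column holds text labels, it would need converting before taking the mean.

```python
import pandas as pd

# Hypothetical toy data; 'fresh' is assumed to be 1 (fresh) or 0 (rotten).
df = pd.DataFrame({
    "title": ["A", "A", "A", "B", "B", "B", "C"],
    "review": ["r1", "r2", "r3", "r4", "r5", "r6", "r7"],
    "fresh": [1, 1, 0, 1, 1, 1, 0],
})

# Named aggregation: output_column=(input_column, function).
# This produces the final column names directly, so no renaming is needed.
result = (
    df.groupby("title")
      .agg(numReviews=("review", "count"), meanFreshness=("fresh", "mean"))
      .sort_values(["numReviews", "meanFreshness"], ascending=False)
)
print(result)
```

Here titles A and B both have three reviews, so the second sort key (meanFreshness) decides their order.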