Let’s say we have a dataset of movie reviews, as shown below, stored as a pandas DataFrame ‘df’.
To find the critics with the most reviews:
# Critics with the most reviews
df.groupby('critic').size().reset_index(name='count').sort_values('count', ascending=False).head(10)
Output:
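As a rough illustration of what this produces, here is a minimal sketch on a small made-up DataFrame. The critic names and rows are invented for the example and are not taken from the real dataset; only the column name 'critic' comes from the snippet above.

```python
import pandas as pd

# Made-up rows for illustration only; not the real review data.
df = pd.DataFrame({
    "critic": ["Ann", "Bob", "Ann", "Cy", "Ann", "Bob"],
    "title": ["A", "A", "B", "B", "C", "C"],
})

# Count rows per critic, name the count column, and keep the top 10.
top_critics = (
    df.groupby("critic")
      .size()
      .reset_index(name="count")
      .sort_values("count", ascending=False)
      .head(10)
)
print(top_critics)
```

size() counts every row in each group, so the result has one row per critic with the number of reviews in the 'count' column, largest first.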
To apply different functions to more than one column, use the agg (aggregate) method. For example, to get the movie titles sorted by number of reviews and then by mean freshness (the mean of the freshness ratings across all reviews of a movie):
result = df.groupby(['title']).agg({'review':'count','fresh':'mean'}).sort_values(['review','fresh'],ascending=False)
result.columns = ['numReviews','meanFreshness']
result
Output:
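For illustration, the same aggregation can be tried on a made-up DataFrame. This sketch assumes 'fresh' is a numeric 0/1 column (otherwise the mean would not be meaningful), and the rows are invented. It also uses named aggregation, available in pandas 0.25 and later, as an alternative to renaming the columns afterwards; it is one possible way to write it, not the only one.

```python
import pandas as pd

# Made-up rows for illustration only; 'fresh' is assumed to be 0/1.
df = pd.DataFrame({
    "title": ["A", "A", "A", "B", "B", "C"],
    "review": ["good", "fine", "bad", "great", "ok", "meh"],
    "fresh": [1, 1, 0, 1, 0, 0],
})

# Named aggregation: output_column=(input_column, function).
result = (
    df.groupby("title")
      .agg(numReviews=("review", "count"), meanFreshness=("fresh", "mean"))
      .sort_values(["numReviews", "meanFreshness"], ascending=False)
)
print(result)
```

Note that 'count' skips missing values, so numReviews is the number of non-null entries in 'review' for each title.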