avatar Bite 177. Use Pandas to find most common genres in a movie excel sheet

Another pandas Bite: we took some fake movie data from Mockaroo, an awesome fake data service, and dumped it in an Excel file.

Here is how it looks: sample of the movie data

Complete group_by_genre below doing the following:

  1. Load the data in a pandas DataFrame using read_excel. Use skiprows and usecols to get the table of genres and movies (header did not work here).
  2. Split the genres, which are now separated by pipe (|) into separate rows. This is not trivial so we provided the explode function mentioned in this article.
  3. Next filter out the 5 rows without a genre = (no genres listed). At this point your DataFrame should have a shape of (rows, columns): (2024, 2)
  4. Lastly group the DataFrame by genre, counting the number of movies for each. Sort the resulting DataFrame on this count descending. This is how it should look: you should see Drama at the top and IMAX at the bottom.

Good luck, have fun and remember: keep calm and code in Python and pandas!

Login and get coding
go back Advanced level
Bitecoin 4X

70 out of 71 users completed this Bite.
Will you be Pythonista #71 to crack this Bite?
Resolution time: ~57 min. (avg. submissions of 5-240 min.)
Pythonistas rate this Bite 6.17 on a 1-10 difficulty scale.
» Up for a challenge? 💪

Focus on this Bite hiding sidebars, turn on Focus Mode.

Ask for Help