avatar Bite 177. Use Pandas to find most common genres in a movie excel sheet

Another pandas Bite: we took some fake movie data from Mockaroo, an awesome fake data service, and dumped it in an Excel file.

Here is how it looks: sample of the movie data

Complete group_by_genre below doing the following:

  1. Load the data in a pandas DataFrame using read_excel. Use skiprows and usecols to get the table of genres and movies (header did not work here).
  2. Split the genres, which are now separated by pipe (|) into separate rows. This is not trivial so we provided the explode function mentioned in this article.
  3. Next filter out the 5 rows without a genre = (no genres listed). At this point your DataFrame should have a shape of (rows, columns): (2024, 2)
  4. Lastly group the DataFrame by genre, counting the number of movies for each. Sort the resulting DataFrame on this count descending. This is how it should look: you should see Drama at the top and IMAX at the bottom.

Good luck, have fun and remember: keep calm and code in Python and pandas!

Login and get coding
go back Advanced level
Bitecoin 4X

71 out of 72 users completed this Bite.
Will you be the 72nd person to crack this Bite?
Resolution time: ~61 min. (avg. submissions of 5-240 min.)
Our community rates this Bite 6.17 on a 1-10 difficulty scale.
» Up for a challenge? 💪

Focus on this Bite hiding sidebars, turn on Focus Mode.

Ask for Help