Bite 177. Use Pandas to find most common genres in a movie excel sheet
Login and get codingAnother
pandasBite: we took some fake movie data from Mockaroo, an awesome fake data service, and dumped it in an Excel file.Here is how it looks:
Complete
group_by_genrebelow doing the following:
- Load the data in a
pandas DataFrameusingread_excel. Useskiprowsandusecolsto get the table of genres and movies (headerdid not work here).- Split the genres, which are now separated by pipe (|) into separate rows. This is not trivial so we provided the
explodefunction mentioned in this article.- Next filter out the 5 rows without a genre = (no genres listed). At this point your
DataFrameshould have ashapeof (rows, columns):(2024, 2)- Lastly group the
DataFrameby genre, counting the number of movies for each. Sort the resultingDataFrameon this count descending. This is how it should look: you should see Drama at the top and IMAX at the bottom.Good luck, have fun and remember: keep calm and code in Python and
pandas!
71 out of 72 users completed this Bite.
Will you be the 72nd person to crack this Bite?
Resolution time: ~61 min. (avg. submissions of 5-240 min.)
Our community rates this Bite 6.17 on a 1-10 difficulty scale.
» Up for a challenge? 💪