Bite 30. Movie data analysis

In this Bite we are going to parse a CSV movie dataset to identify the directors with the highest-rated movies.

  1. Write get_movies_by_director: use csv.DictReader to convert movie_metadata.csv into a (default)dict of lists of Movie namedtuples. Convert/filter the data:
    • Only extract director_name, movie_title, title_year and imdb_score, ignoring movies without all of these fields.
    • Type conversions: title_year -> int / imdb_score -> float
    • Discard any movies released before 1960.

    Here is an extract:

    ....
    { 'Woody Allen': [
        Movie(title='Midnight in Paris', year=2011, score=7.7),
        Movie(title='The Curse of the Jade Scorpion', year=2001, score=6.8),
        Movie(title='To Rome with Love', year=2012, score=6.3),  ....
        ], ...
    }
    
  2. Write the calc_mean_score helper that takes a list of Movie namedtuples and calculates the mean IMDb score, returning the score rounded to 1 decimal place.
  3. Complete get_average_scores, which takes the directors data structure returned by get_movies_by_director (see 1.) and returns a list of (director, average_score) tuples sorted by average score in descending order. Only take into account directors with >= MIN_MOVIES movies.
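
The three steps above can be sketched roughly as follows. This is one possible approach under stated assumptions, not the Bite's reference solution: the value of MIN_MOVIES, the csv_file parameter (the Bite reads movie_metadata.csv directly) and the SAMPLE_CSV rows other than the Woody Allen titles from the extract are placeholders chosen for illustration.

```python
import csv
import io
from collections import defaultdict, namedtuple

MIN_MOVIES = 3   # assumed value; the Bite defines its own threshold
MIN_YEAR = 1960  # movies released before this year are discarded

Movie = namedtuple("Movie", "title year score")


def get_movies_by_director(csv_file):
    """Map each director to a list of Movie namedtuples.

    csv_file is any iterable of CSV lines, e.g. an open file
    (an assumption: the Bite's function may take no argument).
    """
    directors = defaultdict(list)
    for row in csv.DictReader(csv_file):
        director = (row.get("director_name") or "").strip()
        title = (row.get("movie_title") or "").strip()
        try:
            year = int(row.get("title_year"))
            score = float(row.get("imdb_score"))
        except (TypeError, ValueError):
            continue  # year or score is missing or not numeric
        if not director or not title or year < MIN_YEAR:
            continue
        directors[director].append(Movie(title, year, score))
    return directors


def calc_mean_score(movies):
    """Return the mean IMDb score rounded to 1 decimal place."""
    return round(sum(movie.score for movie in movies) / len(movies), 1)


def get_average_scores(directors):
    """Return (director, average_score) tuples, highest average first."""
    averages = [
        (director, calc_mean_score(movies))
        for director, movies in directors.items()
        if len(movies) >= MIN_MOVIES
    ]
    return sorted(averages, key=lambda pair: pair[1], reverse=True)


# Tiny in-memory stand-in for movie_metadata.csv. The "Example Director"
# rows are hypothetical and exist only to exercise the filters.
SAMPLE_CSV = """\
director_name,movie_title,title_year,imdb_score
Woody Allen,Midnight in Paris,2011,7.7
Woody Allen,The Curse of the Jade Scorpion,2001,6.8
Woody Allen,To Rome with Love,2012,6.3
Example Director,Old Film,1950,9.0
Example Director,Untitled,,8.0
"""

directors = get_movies_by_director(io.StringIO(SAMPLE_CSV))
print(get_average_scores(directors))
```

With the real dataset you would pass an open file instead of io.StringIO, for example `with open("movie_metadata.csv", encoding="utf-8") as f: get_movies_by_director(f)`. Using a defaultdict(list) avoids checking whether a director key already exists before appending.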

See the tests for more info. This could be a tough one, but we really hope you learn a thing or two. Good luck and keep calm and code in Python!

Intermediate level
