44 - Marvel Data Analysis (Alicante PyChallengeDay)

This challenge write-up first appeared on PyBites.

It's not that I'm so smart, it's just that I stay with problems longer. - A. Einstein

Hi Pythonistas, this is a very special edition! Today, the 10th of November, we launch our first Live Code Challenge. We partnered up with Python Alicante and we will be hosting this code challenge with them at the University of Alicante. If you don't happen to live in Alicante but do want to code today from 10:00 to 13:00 CET, you are more than welcome to join this Gitter channel.

The Challenge

We all love Marvel, don't we? So here's the deal: we found a CSV with Marvel data taken from this database.

We are going to have you write some Python to get this data into a usable data structure to answer some questions about the data.


Update 4th of Oct 2018: originally this challenge used a separate repo because it was held at PyDay Alicante. If you take this challenge, just use the regular Git instructions.

Please answer ...

marvel.py already has some stubs. Here is what we want you to try:

  1. Parse the marvel-wikia-data.csv CSV file and load it into a data structure. You probably want a list of dicts or namedtuples, one for each row. Store this in data which will be in the module's namespace (already done in the template).

  2. Get the most popular characters based on the number of appearances they made in comics over the years.

  3. Get the years with the most and the fewest new Marvel characters introduced, and return them as a (max_year, min_year) tuple. Expect min/max to be pretty far apart.

  4. What percentage of the comics characters is female? Please give us the percentage rounded to 2 decimal places.

  5. Good vs bad characters: return a dictionary of bad vs good vs neutral characters per sex. The keys are Bad Characters, Good Characters and Neutral Characters; the values are integer percentages. Who plays the villain more often, a man or a woman?
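
To give you an idea of the shape of a solution, here is a minimal sketch. It is not the reference solution. It runs on a handful of made-up rows instead of the real file, and the column names (name, ALIGN, SEX, APPEARANCES, Year), the Character namedtuple and all function names are assumptions, so check them against the actual CSV header and the stubs in marvel.py before reusing anything.

```python
import csv
from collections import Counter, namedtuple
from io import StringIO

# Made-up sample rows, NOT real Marvel data. The column names are an
# assumption about the CSV header; verify them against the real file.
SAMPLE_CSV = """\
name,ALIGN,SEX,APPEARANCES,Year
Hero A,Good Characters,Male Characters,300,1963
Hero B,Good Characters,Female Characters,250,1963
Villain C,Bad Characters,Male Characters,120,1963
Villain D,Bad Characters,Male Characters,80,1975
Villain E,Bad Characters,Male Characters,40,1975
Bystander F,Neutral Characters,Female Characters,,1990
Hero G,Good Characters,Male Characters,10,
"""

# Hypothetical record type: one namedtuple per CSV row.
Character = namedtuple("Character", "name align sex appearances year")


def load_data(lines):
    """1. Parse CSV lines into a list of Character namedtuples."""
    return [
        Character(
            name=row["name"],
            align=row["ALIGN"],
            sex=row["SEX"],
            # Missing appearance counts are treated as 0 (a design choice).
            appearances=int(row["APPEARANCES"] or 0),
            year=row["Year"],  # kept as a string; empty if unknown
        )
        for row in csv.DictReader(lines)
    ]


# For the real challenge you would pass an open file object instead.
data = load_data(StringIO(SAMPLE_CSV))


def most_popular_characters(characters=data, top=5):
    """2. Names of the characters with the most appearances."""
    ranked = sorted(characters, key=lambda c: c.appearances, reverse=True)
    return [c.name for c in ranked[:top]]


def max_and_min_years_new_characters(characters=data):
    """3. (max_year, min_year) by number of newly introduced characters."""
    per_year = Counter(c.year for c in characters if c.year)
    ranked = per_year.most_common()  # sorted from most to least frequent
    return ranked[0][0], ranked[-1][0]


def percentage_female(characters=data):
    """4. Share of female characters, rounded to 2 decimal places.

    Rows without a sex value still count towards the total here;
    excluding them is an equally defensible choice.
    """
    female = sum(1 for c in characters if c.sex.startswith("Female"))
    return round(female / len(characters) * 100, 2)


def good_vs_bad(sex, characters=data):
    """5. Integer percentage per alignment for the given sex."""
    counts = Counter(
        c.align
        for c in characters
        if c.align and c.sex.lower().startswith(sex.lower())
    )
    total = sum(counts.values())
    return {align: round(n / total * 100) for align, n in counts.items()}
```

With the sample rows above, good_vs_bad("male") and good_vs_bad("female") can be compared side by side to answer the villain question. Swapping StringIO(SAMPLE_CSV) for an open file handle on marvel-wikia-data.csv should be the only change needed to run it on the real data, provided the assumed column names match.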

Data Viz Bonus

OK you know Python inside out, and this was pretty easy. Sounds like you? Please surprise us with:

  • Use your favorite Python visualization library and make one or more plots for 2.-5.
  • Or try to answer some question you might have about this data set.
  • Feel free to use nbviewer and just PR the link to your notebook.
  • You could even write a quick Flask app to wrap your graph, like we did here.

Here for example we used Bokeh to plot newly introduced characters per year:

example bokeh plot for bonus

This was our first live challenge held at the University of Alicante. You can read our review here: 5 Things we Learned Co-hosting a Live Code Challenge Workshop

Pybites Slack

You like these challenges? We have published quite a few and we're not planning to stop anytime soon.

You really like them and plan on PR'ing more in the future? Then consider joining our private Slack channel by sending us an email. This way you get the unique opportunity to learn from other passionate Pythonistas and share some of your experience.


Our goal is to learn and teach you Python through practical exercises. Learning a programming language is way more fun as a community!

For any feedback, issues or ideas use GH Issues, tweet us or drop us an email.

Keep Calm and Code in Python!

-- PyBites

We use Python 3.8