After our Code Challenge 62 / Alicante PyDay last week, we thought it would be nice to branch off a Bite exercise using what we learned. So prepare to do some web scraping using BeautifulSoup and discover a new library called gender_guesser. We are going to look at the percentage of female speakers at PyCon US 2019.
Here is what you need to do:
- Complete get_pycon_speaker_first_names, extracting all names from the PYCON_HTML page we cached for you. Note that some entries have multiple names separated by a comma (,) or a slash (/), so you will need to split those. Return a list of first names.
- Use gender_guesser's Detector() to determine the gender based on the first names passed in. This tool is not perfect: some names won't be found. However, we like Pareto's principle, so we're happy to get a rough indication. Return the percentage of female speakers rounded to 2 decimal places.
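To make the two steps concrete, here is a minimal sketch, not the Bite's reference solution. Several things in it are assumptions: the CSS selector span.speaker (the real markup of the cached page may differ), the helper names split_first_names and percentage_of_female_speakers, the choice to count "mostly_female" as female, and the choice to keep unknown names in the denominator. The BeautifulSoup and gender_guesser imports sit inside the functions that need them so the pure helpers can be tried without those packages installed.

```python
import re


def split_first_names(entry):
    """Split one speaker entry on commas and slashes, keeping first names."""
    people = [part.strip() for part in re.split(r"[,/]", entry)]
    # The first whitespace-separated token of each person is the first name.
    return [person.split()[0] for person in people if person]


def get_pycon_speaker_first_names(html, selector="span.speaker"):
    """Collect first names from the schedule HTML.

    The default selector is an assumption about the page markup.
    """
    from bs4 import BeautifulSoup  # third-party: beautifulsoup4

    soup = BeautifulSoup(html, "html.parser")
    names = []
    for tag in soup.select(selector):
        names.extend(split_first_names(tag.get_text()))
    return names


def percentage_of_female_speakers(first_names, get_gender=None):
    """Return the share of names guessed as female, rounded to 2 decimals.

    Assumptions: "mostly_female" counts as female, and names the detector
    cannot place stay in the denominator.
    """
    if get_gender is None:
        from gender_guesser.detector import Detector  # third-party

        get_gender = Detector().get_gender
    if not first_names:
        return 0.0
    female = sum(
        1 for name in first_names
        if get_gender(name) in ("female", "mostly_female")
    )
    return round(female / len(first_names) * 100, 2)
```

With both packages installed, calling percentage_of_female_speakers(get_pycon_speaker_first_names(html)) on the cached page content would give the rough indication the Bite asks for. Passing your own get_gender callable makes the calculation easy to check without the real detector.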
If next year's PyCon site doesn't change much, you now have a reusable script you can run against PyCon 2020's data ...
Have fun and keep calm and code in Python!
10 out of 10 users completed this Bite.
Will you be Pythonista #11 to crack this Bite?
It takes an average of ~71 minutes to solve this Bite (submissions 5-240 min).
Pythonistas rate this Bite 4.0 on a 1-10 difficulty scale.