22 - Packt Free Ebook Web Scraper




This challenge write-up first appeared on PyBites.

A smooth sea never made a skilled sailor. - Franklin D. Roosevelt

Hi Pythonistas, a new week, a new 'bite' of Python coding :)

This week we will do some web scraping. As you might know, Packt gives away a free ebook every (!) single day. In this challenge you will scrape that page and send out a notification so you never miss an interesting title.

Sponsor the Python Community

But it gets better: the guys from Pybonacci (a great Spanish Python science blog) partnered up with Packt:

Packt will donate up to $1,000 ($1 per free ebook download) to a Python-related non-profit (more info here, you can vote for the non-profit here).

So by taking this challenge you get to promote the awesome Python community. Isn't that cool?

The Challenge

The challenge is to write a script that scrapes the free learning link every day for metadata about the book (title, description, cover, promo time left).
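To make the scraping step concrete, here is a minimal sketch that pulls those four fields out of a page. It uses only the standard library's `html.parser` so it runs without installing anything. The sample markup, the class names (`book-title`, `book-description`, `book-cover`, `countdown`) and the `data-seconds-left` attribute are all made up for illustration; the real Packt page will look different, so inspect it in your browser first.

```python
from html.parser import HTMLParser

# Hypothetical markup: the real Packt page will differ, so adjust the
# class names and attributes below after inspecting the live page.
SAMPLE_HTML = """
<div class="deal">
  <h2 class="book-title">Example Python Book</h2>
  <p class="book-description">Learn things by building things.</p>
  <img class="book-cover" src="https://example.com/cover.png">
  <span class="countdown" data-seconds-left="3600"></span>
</div>
"""


class DealParser(HTMLParser):
    """Collect title, description, cover URL and time left from the markup."""

    # Maps a (hypothetical) CSS class to the key we store its text under.
    TEXT_FIELDS = {"book-title": "title", "book-description": "description"}

    def __init__(self):
        super().__init__()
        self.book = {}
        self._current = None  # text field currently being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        for css_class in classes:
            if css_class in self.TEXT_FIELDS:
                self._current = self.TEXT_FIELDS[css_class]
                self.book[self._current] = ""
        if "book-cover" in classes:
            self.book["cover"] = attrs.get("src")
        if "countdown" in classes:
            self.book["seconds_left"] = int(attrs.get("data-seconds-left") or 0)

    def handle_data(self, data):
        # Only keep text that sits inside one of the fields we care about.
        if self._current:
            self.book[self._current] += data

    def handle_endtag(self, tag):
        self._current = None


def parse_deal(html):
    """Return a dict with the book metadata found in the given HTML."""
    parser = DealParser()
    parser.feed(html)
    return {
        key: value.strip() if isinstance(value, str) else value
        for key, value in parser.book.items()
    }


book = parse_deal(SAMPLE_HTML)
print(book["title"])
```

With Beautiful Soup the same lookups would be a few one-liners, so treat this as a starting point rather than the way to do it.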

Then have the script share this info on your favorite channel (email, Twitter, Facebook, reddit, Slack, etc.), together with this affiliate link: https://www.packtpub.com/packt/offers/free-learning?utm_source=Pybonacci&utm_medium=referral&utm_campaign=FreeLearning2017CharityReferrals
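Whatever channel you pick, you will need to turn the scraped data into a message. Here is one possible sketch. The `book` dict layout (`title`, `description`, `cover`, `seconds_left`) and the function names are assumptions made for this example, not something the challenge prescribes.

```python
# The affiliate link from the challenge description, split for readability.
AFFILIATE_LINK = (
    "https://www.packtpub.com/packt/offers/free-learning"
    "?utm_source=Pybonacci&utm_medium=referral"
    "&utm_campaign=FreeLearning2017CharityReferrals"
)


def format_time_left(seconds):
    """Turn a number of seconds into a short 'Xh YYm' string."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes = remainder // 60
    return f"{hours}h {minutes:02d}m"


def build_message(book):
    """Build a plain-text notification from a dict of book metadata.

    The dict keys used here are hypothetical; match them to whatever
    your own scraper returns.
    """
    return (
        f"Free Packt ebook today: {book['title']}\n"
        f"{book['description']}\n"
        f"Cover: {book['cover']}\n"
        f"Time left: {format_time_left(book['seconds_left'])}\n"
        f"Grab it here: {AFFILIATE_LINK}"
    )


example_book = {
    "title": "Example Python Book",
    "description": "Learn things by building things.",
    "cover": "https://example.com/cover.png",
    "seconds_left": 3660,
}
print(build_message(example_book))
```

The resulting string can go into an email body, a Slack post or (trimmed down) a tweet.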

That's it for the basic requirements. You will probably want to schedule the script with your OS's cron, or you can use Dan Bader's schedule package.
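If you want to see what daily scheduling boils down to before reaching for cron or the schedule package, here is a bare-bones, standard-library-only sketch. It is a stand-in for those tools, not a replacement: `seconds_until` and `run_daily` are names invented for this example, and the 09:00 default is arbitrary.

```python
import time
from datetime import datetime, timedelta


def seconds_until(hour, minute, now=None):
    """Seconds from `now` until the next occurrence of hour:minute."""
    now = now or datetime.now()
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        # Today's slot has already passed, so aim for tomorrow.
        target += timedelta(days=1)
    return (target - now).total_seconds()


def run_daily(job, hour=9, minute=0):
    """Call `job` once a day at hour:minute (local time). Runs forever."""
    while True:
        time.sleep(seconds_until(hour, minute))
        job()
```

You would start it with something like `run_daily(check_free_book)`, where `check_free_book` is your own scrape-and-notify function. With the schedule package the equivalent is roughly `schedule.every().day.at("09:00").do(check_free_book)` plus a loop calling `schedule.run_pending()`.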

For the web scraping you could use Beautiful Soup or Scrapy for example. We did an article on the former and used it in our 100days Challenge.

Bonus

If you really want to challenge yourself, you could have the script log in to your Packt account and click the 'Claim Your Free eBook' button, making it fully automated. It might not be easy because they use a CAPTCHA, but hey, we like a good challenge, right? It would definitely be a useful tool and a good skill to add.

Not sure where to start? Check out this repo (GitHub is your friend!). They used Requests / Session to do this.

You could also look at Selenium (here is some 100days code).


Getting ready

See our INSTALL doc for how to fork our challenges repo to get cracking.

This doc also provides instructions on how to submit your code to our community branch via a Pull Request (PR). We will feature your PRs in our end-of-the-week challenge review (previous editions).

Feedback

If you have ideas for a future challenge or find any issues, please contact us or open a GH Issue.

Last but not least: there is no best solution, only learning more and better Python. Good luck!


Keep Calm and Code in Python!

-- Bob and Julian