This challenge write-up first appeared on PyBites.
This week, each one of you has a homework assignment ... - Tyler Durden (Fight Club)
In this three-part challenge you will analyze Twitter data. This week we will automate the retrieval of the data. In Part 2 we will task you with finding similar tweeters, and in Part 3 you will do a full sentiment analysis.
Start coding by forking our challenges repo:
$ git clone https://github.com/pybites/challenges
If you already forked it, sync it:
# assuming using ssh key
$ git remote add upstream git@github.com:pybites/challenges.git
$ git fetch upstream
# if not on master:
$ git checkout master
$ git merge upstream/master
$ cd 04
$ python3 -m venv venv
# the venv command above is for Python 3 (a Python 2 env might need virtualenv)
$ source venv/bin/activate
# install tweepy (and its dependencies)
$ pip install -r requirements.txt
# if you want to use another package like twython, feel free to do so
# get your API keys from Twitter - https://apps.twitter.com
$ cp config-template.py config.py
# paste the keys in config.py
# choose a template
$ cp usertweets-help.py usertweets.py
# or
$ cp usertweets-nohelp.py usertweets.py
# code
Create a tweepy API object using the tokens imported from config.py (again, you can use another package if you prefer).
Create an instance variable to hold the last 100 tweets of the user.
Implement the __len__() and __getitem__() magic (dunder) methods to make the UserTweets object iterable.
Save the generated data as CSV in the data subdirectory: data/some_handle.csv, columns: id_str,created_at,text
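To illustrate the two dunder methods from the steps above, here is a minimal offline sketch, not the reference solution. It assumes the tweets are handed to the constructor as a list (in the challenge they would come from the Twitter API), and the two-argument constructor is only an assumption made for this illustration. The sample tweets are the two shown later in this article.

```python
from collections import namedtuple
from datetime import datetime

Tweet = namedtuple("Tweet", "id_str created_at text")


class UserTweets:
    """Hold a user's tweets and expose them through the sequence protocol."""

    def __init__(self, handle, tweets):
        # Assumption for this sketch: tweets are passed in directly so the
        # example runs without network access or API keys.
        self.handle = handle
        self._tweets = list(tweets)

    def __len__(self):
        return len(self._tweets)

    def __getitem__(self, pos):
        # Delegating to the list gives indexing and slicing. The IndexError
        # the list raises past its end is what lets a for loop stop.
        return self._tweets[pos]


sample = [
    Tweet("825629570992726017", datetime(2017, 1, 29, 9, 0, 3),
          "Twitter digest 2017 week 04 https://t.co/L3njBuBats #python"),
    Tweet("825267189162733569", datetime(2017, 1, 28, 9, 0, 5),
          "Code Challenge 03 - PyBites blog tag analysis - Review "
          "https://t.co/xvcLQBbvup #python"),
]

pybites = UserTweets("pybites", sample)
print(len(pybites))
for tweet in pybites:  # works without __iter__, thanks to __getitem__
    print(tweet.id_str, tweet.created_at)
```

Python falls back to calling __getitem__() with 0, 1, 2, ... when a class defines no __iter__(), which is why these two methods are enough for len(), indexing and for loops.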
We posted two articles this week that you might find useful in this context: OOP primer and Python's data model.
If you decide to use Tweepy, you might want to check its API reference.
For developers who like to work towards tests, we included test_usertweets.py:
$ python test_usertweets.py
...
----------------------------------------------------------------------
Ran 3 tests in 0.001s
OK
We used a namedtuple here, but this is not required. Also note that the tweets can differ, so in the unit tests we test a fixed set (using the optional max_id parameter in the constructor):
$ python
>>> from usertweets import UserTweets
>>> pybites = UserTweets('pybites')
>>> len(pybites)
100
>>> pybites[0]
Tweet(id_str='825629570992726017', created_at=datetime.datetime(2017, 1, 29, 9, 0, 3), text='Twitter digest 2017 week 04 https://t.co/L3njBuBats #python')
>>> ^D
(venv) [bbelderb@macbook 04 (master)]$ ls -lrth data/
...
-rw-r--r-- 1 bbelderb staff 14K Jan 29 21:49 pybites.csv
(venv) [bbelderb@macbook 04 (master)]$ head -3 data/pybites.csv
id_str,created_at,text
825629570992726017,2017-01-29 09:00:03,Twitter digest 2017 week 04 https://t.co/L3njBuBats #python
825267189162733569,2017-01-28 09:00:05,Code Challenge 03 - PyBites blog tag analysis - Review https://t.co/xvcLQBbvup #python
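A CSV file in the layout shown above could be produced with the standard library's csv module. The following is a rough sketch rather than the way our solution does it: the write_tweets helper is a hypothetical name, and an in-memory buffer stands in for the real file so the example runs anywhere.

```python
import csv
import io
from datetime import datetime

FIELDS = ("id_str", "created_at", "text")


def write_tweets(rows, fileobj):
    """Write a header line plus one CSV row per (id_str, created_at, text) tuple."""
    writer = csv.writer(fileobj)
    writer.writerow(FIELDS)
    # Non-string values such as datetime objects are converted with str().
    writer.writerows(rows)


rows = [
    ("825629570992726017", datetime(2017, 1, 29, 9, 0, 3),
     "Twitter digest 2017 week 04 https://t.co/L3njBuBats #python"),
]

# Stand-in for: open("data/pybites.csv", "w", newline="")
buffer = io.StringIO()
write_tweets(rows, buffer)
print(buffer.getvalue())
```

Using csv.writer instead of joining strings by hand means tweets that contain commas, quotes or newlines are quoted automatically.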
Remember: there is no best solution, only learning more and better Python.
Enjoy, and we're looking forward to reviewing on Friday all the cool / creative / Pythonic stuff you come up with.
Have fun!
Again, to start coding, fork our challenges repo or sync it.
More background in our first challenge article.