Hello everybody, and welcome to Python for Everybody.
We're going to do a code walk-through.
If you want to follow along with the code,
you can download the sample code from Python for Everybody.
So, the code that we're going to play with is
the Twitter spider code that is both talking
to the Twitter API and talking to the database.
So, what we're going to be doing is we are going to run
code that's going to hit the Twitter API much like we did in a previous chapter,
and we're going to retrieve the data,
but we're going to remember the data.
So, we don't have to retrieve it again.
So, we're going to keep track of people's friends.
What we're doing here is pulling down, slowly
but surely, subject to our rate limit,
who our friends are.
So, let's take a look. We're going to use urllib and urllib.error,
and twurl, which is code that augments my URL to do all the OAuth calculation.
We're going to get json data back.
We're going to make a database, and we have to import ssl because of
the way Python doesn't trust any certificates no matter how good they are.
So, this is our URL to talk to the Twitter API.
We're going to make a database.
Again, the way SQLite works is that if this spider.sqlite doesn't exist, it creates it.
We get ourselves a cursor and we're going to do a create table
with if not exists. Not every SQL supports this,
but SQLite does.
Create the table if it doesn't exist.
Unlike the tracks example,
I want to be able to start this over and over
and not lose data.
This is a spidering process and we'll see a lot of these.
We want a restartable process where we use a database.
So, if we're starting with nothing and there's no file,
spider.sqlite, it creates this table.
It has the name of the person, whether we've retrieved them or not,
and how many friends this person has that we know of in our database.
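
A minimal sketch of that restartable setup might look like the following. The table name Twitter and the columns name, retrieved, and friends follow the description in this walkthrough; the column types and the helper open_spider_db are assumptions made for illustration.

```python
import sqlite3

def open_spider_db(path):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)  # creates the file if it is missing
    cur = conn.cursor()
    # IF NOT EXISTS makes this safe to run on every start-up:
    # an existing table, and the rows in it, are left alone.
    cur.execute('''CREATE TABLE IF NOT EXISTS Twitter
                   (name TEXT, retrieved INTEGER, friends INTEGER)''')
    conn.commit()
    return conn, cur

# ':memory:' keeps this demo off disk; the walkthrough uses 'spider.sqlite'.
conn, cur = open_spider_db(':memory:')
```

Because the create is conditional, running the program a second time against the same file picks up where the last run stopped instead of wiping the table.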
Now, this little bit is to deal with the SSL certificate errors.
The certificates are totally fine,
but Python doesn't trust any certificates by default,
which is frustrating, but whatever.
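
For reference, one common way to tell Python to skip certificate checks is sketched below. It uses the standard ssl module; the variable name ctx is just a choice for this example, and turning verification off is only reasonable for an exercise like this one.

```python
import ssl

# Build a context that skips certificate verification.
ctx = ssl.create_default_context()
ctx.check_hostname = False       # must be turned off before verify_mode
ctx.verify_mode = ssl.CERT_NONE  # accept any certificate

# The context would then be handed to the request,
# for example urlopen(url, context=ctx).
```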
So here, we're going to have a loop.
We're going to ask for a Twitter account.
We have to type quit to quit.
If we hit Enter in this case,
we're going to actually read from the database an
unretrieved Twitter person and then grab all that person's friends.
Okay. So then, we're going to do a fetchone,
get one, and that's going to get the name of the first person,
the sub zero. If we had more things than name here,
sub zero would be the first of those.
Fetchone means get one row from the database and
sub zero means the first column of that first row.
If this fails, then we retrieved all the Twitter accounts.
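
Here is a small sketch of that lookup. The rows are made up, and next_unretrieved is a hypothetical helper. Where the walkthrough talks about the lookup failing, this version checks for None instead, since fetchone returns None when no row matches.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE Twitter (name TEXT, retrieved INTEGER, friends INTEGER)')
cur.executemany('INSERT INTO Twitter VALUES (?, ?, ?)',
                [('alice', 1, 2), ('bob', 0, 1)])  # made-up rows

def next_unretrieved(cur):
    """Return the name of one unretrieved account, or None if there are none."""
    cur.execute('SELECT name FROM Twitter WHERE retrieved = 0 LIMIT 1')
    row = cur.fetchone()   # one row as a tuple, or None when nothing matches
    if row is None:
        return None
    return row[0]          # sub zero: the first (and only) selected column

acct = next_unretrieved(cur)
print(acct)
```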
So, we're going to augment this Twitter URL using twurl.
You can look at the twurl.py code.
This basically requires the hidden.py file,
which has your keys and secrets in it.
You've got to get hidden.py updated.
I've got it updated, but I'm not going to show you
because it has my keys and secrets in it.
So, we're only going to take the first five,
which means we're not going to find friends of friends of friends.
It's only at most five recent ones.
We could run this with a much higher number,
so we have more than one friend.
We'll show the URL while we retrieve it.
We will do our URL open.
We'll do a read, and then we'll do a decode.
The read will give us data in UTF-8,
and then decode will give us data in Unicode,
which is what we need inside of Python.
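
To make the read and decode step concrete, here is a tiny sketch. The bytes are a stand-in for what a read would return, not real Twitter data.

```python
# Stand-in for connection.read(): the bytes of a UTF-8 encoded response.
raw = 'caf\u00e9'.encode('utf-8')

text = raw.decode()   # decode() defaults to UTF-8 and returns a str

print(type(raw).__name__, len(raw))    # bytes 5
print(type(text).__name__, len(text))  # str 4
```

The accented character takes two bytes in UTF-8 but counts as one character once decoded, which is why the two lengths differ.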
We will ask for the headers from the connection.
We'll say, "Give me the headers.
Give me a dictionary of the headers."
The x-rate-limit-remaining header from the Twitter API tells us
how many requests we have left before we're told we can't use
this API anymore, because this is one of those rate-limited things.
Then we're going to parse and load the data that we got from Twitter,
and get a, I think it's a list.
Yes, it's a list, and then we could dump this in here if you want; you can uncomment that.
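
Going back to the headers for a moment, the dictionary step can be sketched like this. The header pairs are invented for the example; in the real program they would come from the connection.

```python
# Stand-in for the (name, value) pairs an HTTP response hands back.
# These header values are invented for the example.
header_pairs = [
    ('content-type', 'application/json;charset=utf-8'),
    ('x-rate-limit-remaining', '14'),
]

headers = dict(header_pairs)   # name -> value lookup

# Header values arrive as strings, so convert before doing arithmetic.
remaining = int(headers['x-rate-limit-remaining'])
print('Remaining', remaining)
```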
Then, what we're going to do is, we just
retrieved this person's screen name and their friends.
So, the first thing we want to do is update
the database and change the retrieved from zero to one.
That's because we're going to use this to know about unretrieved.
So, retrieved being one means we've already retrieved that account,
and at this point we have.
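
A sketch of that update, assuming the Twitter table described in this walkthrough and one made-up row:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE Twitter (name TEXT, retrieved INTEGER, friends INTEGER)')
cur.execute("INSERT INTO Twitter VALUES ('drchuck', 0, 1)")  # made-up row

acct = 'drchuck'
# The ? placeholder lets sqlite3 handle quoting of the account name.
cur.execute('UPDATE Twitter SET retrieved = 1 WHERE name = ?', (acct,))
conn.commit()

cur.execute('SELECT retrieved FROM Twitter WHERE name = ?', (acct,))
flag = cur.fetchone()[0]
print(flag)
```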
Then, what we're going to do is we're going to parse that.
So, this is similar to the Twitter code we did previously in the web services chapter.
We're going to go through all the users.
We're going to find their screen name.
We're going to print the screen name out, okay?
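
As a rough sketch of the parse and the loop, assuming the friend records sit in a list under a users key and each one carries a screen_name (the response body here is invented):

```python
import json

# Invented response body; the real one comes from the Twitter API.
data = '{"users": [{"screen_name": "alice"}, {"screen_name": "bob"}]}'

js = json.loads(data)              # JSON text -> Python objects
# print(json.dumps(js, indent=4))  # uncomment to inspect the whole structure

friends = []
for u in js['users']:              # assumed key holding the friend records
    friend = u['screen_name']
    print(friend)
    friends.append(friend)
```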
Then, what we're going to do is see if, let's see,
so we're going through all the users who are the friends of this person,
and we're going to say, "Okay.
Let's select the friends from Twitter where the name is the friend person."
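
That select might be sketched as follows. The row is made up, and friend_count is a hypothetical helper that only covers the lookup itself.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE Twitter (name TEXT, retrieved INTEGER, friends INTEGER)')
cur.execute("INSERT INTO Twitter VALUES ('alice', 0, 3)")  # made-up row

def friend_count(cur, friend):
    """Return the stored friends count for this name, or None if unseen."""
    cur.execute('SELECT friends FROM Twitter WHERE name = ? LIMIT 1', (friend,))
    row = cur.fetchone()
    return None if row is None else row[0]

print(friend_count(cur, 'alice'))  # 3
print(friend_count(cur, 'bob'))    # None
```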