A Step-by-Step How-To for Extracting Twitter Messages from R

I recently started a small hobby project to analyse accident frequency on Singapore roads. I decided to extract this information from the Singapore Land Transport Authority Twitter feed. (I could also have obtained the data through the Singapore Government's DataMall initiative using Python, but that will be the subject of another how-to.)

I thought I would share my experience and steps to do this and hopefully you will find this useful.

So what are we waiting for? Let’s begin!

Step 1: Download the twitteR package

We need to ensure that the latest twitteR package is installed in your R environment. Run the following command in RStudio:

install.packages("twitteR")
This will download and install twitteR and all the packages it depends on.
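
If you rerun your script often, you may prefer to install the package only when it is missing. The following is a minimal sketch of that idea; missing_packages and install_if_missing are hypothetical helper names of my own, not part of twitteR.

```r
# Hypothetical helper: return the subset of `pkgs` that is not yet installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Hypothetical helper: install only what is missing.
# Does nothing if every package is already present.
install_if_missing <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  invisible(to_install)
}
```

Calling install_if_missing("twitteR") at the top of a script should then be safe to repeat across sessions.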

Step 2: Setup a Twitter App

We need to create a Twitter App so that we can access the Twitter platform through its web API. Before you can create a Twitter App, you need to create a Twitter account first. You can do so on the Twitter Apps page.

Once you are done, you can start by clicking on the Create New App button.

Twitter Application Management landing page

Proceed to enter the mandatory fields as shown below.

Enter required mandatory fields

The Website address can be a placeholder for now. However, make sure that the Callback URL is left blank.

Acknowledge the developer agreement and click on the “Create your Twitter application” button. The following page will appear, confirming that you have successfully created the web application.

Successfully created a Twitter web application

Click on the Keys and Access Token tab to view the Consumer Key and Consumer Secret keys.

View keys and access tokens

At this point, you have not created your Access Token yet. Click on the “Create my access token” button to generate it.

Create access token

Your access tokens will be generated and displayed on the refreshed page.

Access token generated

Click on the Application Management icon above and you will see your new application created as shown below.

Twitter application successfully created

Step 3: Create R code to Access Twitter Feeds

Go back to RStudio and enter the following R code:

#load the twitteR package (installed in Step 1)
library(twitteR)

#necessary file for Windows
#download.file(url="http://curl.haxx.se/ca/cacert.pem", destfile="cacert.pem")

#to get your consumerKey and consumerSecret see the twitteR documentation for instructions
consumer_key <- 'your consumer key'
consumer_secret <- 'your consumer secret key'
access_token <- 'your access token'
access_secret <- 'your access secret'

setup_twitter_oauth(consumer_key, consumer_secret, access_token, access_secret)

Note that I have commented out the download.file command since I am running OS X in this example. I have not tested whether adding this download.file(…) code snippet will work.
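
If you plan to share your script (for example on GitHub), you may not want the keys hard-coded in it. One option is to read them from environment variables instead. The following is a minimal sketch under that assumption; the helper names and the TWITTER_* variable names are my own choices for illustration, not something twitteR requires.

```r
# Hypothetical helper: read a required setting from an environment variable
# and stop with a clear message if it is not set.
read_secret <- function(name) {
  value <- Sys.getenv(name, unset = "")
  if (!nzchar(value)) {
    stop("Environment variable ", name, " is not set", call. = FALSE)
  }
  value
}

# Assumed variable names; define them in ~/.Renviron or in your shell.
load_twitter_credentials <- function() {
  list(
    consumer_key    = read_secret("TWITTER_CONSUMER_KEY"),
    consumer_secret = read_secret("TWITTER_CONSUMER_SECRET"),
    access_token    = read_secret("TWITTER_ACCESS_TOKEN"),
    access_secret   = read_secret("TWITTER_ACCESS_SECRET")
  )
}
```

You could then call creds <- load_twitter_credentials() and pass creds$consumer_key, creds$consumer_secret, creds$access_token and creds$access_secret to setup_twitter_oauth in place of the literal strings.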

Once you have entered the above, run the code and you will see the following prompt in the RStudio console:

Running R code for twitter integration

You can select 1 or 2 depending on your preference. Regardless of the choice, you should see the “>” prompt on the next line of the console, indicating that the setup_twitter_oauth command was successfully executed.

Step 4: Extract your Twitter Feed

Once you have completed the above step, enter the following R code.

ltaTwtr <- searchTwitter("LTATrafficNews + Accident", n=500)
length(ltaTwtr)

#make data frame
tmpDf <- do.call("rbind", lapply(ltaTwtr, as.data.frame))

The searchTwitter command searches Twitter for tweets that match the supplied search string. Because searchTwitter returns a list, we need the do.call(“rbind”…) call to convert it into a data frame for subsequent processing.
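
To see what the do.call step does without calling Twitter at all, here is a small offline sketch. The two one-row data frames are made-up stand-ins for what as.data.frame produces per tweet, and the column names are illustrative rather than the full twitteR schema.

```r
# Made-up stand-ins for the per-tweet data frames built by
# lapply(ltaTwtr, as.data.frame); not real tweets.
rows <- list(
  data.frame(id = "1", text = "Accident on PIE", stringsAsFactors = FALSE),
  data.frame(id = "2", text = "Accident on AYE", stringsAsFactors = FALSE)
)

# do.call("rbind", rows) is equivalent to rbind(rows[[1]], rows[[2]]):
# it passes every list element to rbind as a separate argument.
combined <- do.call("rbind", rows)

nrow(combined)  # 2
ncol(combined)  # 2
```

As far as I know, twitteR also provides a twListToDF function that performs this conversion on a list of tweets directly, so twListToDF(ltaTwtr) may be a shorter alternative.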

Data from twitter feed based on search string

The above table is an example of the twitter messages that match my search criterion.
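
Once the tweets are in a data frame, a first step towards accident frequency could be counting tweets per day. The following is a rough sketch on made-up sample data; it assumes the data frame has a created timestamp column, which I believe is the case for data frames produced from twitteR results.

```r
# Made-up sample standing in for tmpDf; real data comes from searchTwitter.
sample_df <- data.frame(
  text = c("Accident on PIE", "Accident on AYE", "Accident on CTE"),
  created = as.POSIXct(
    c("2016-01-01 08:15:00", "2016-01-01 18:40:00", "2016-01-02 07:05:00"),
    tz = "UTC"
  ),
  stringsAsFactors = FALSE
)

# Count tweets per calendar day. Dates are taken in UTC here; convert the
# timestamps first if you want Singapore local days.
per_day <- table(as.Date(sample_df$created, tz = "UTC"))
per_day
```

With real data, you would replace sample_df with tmpDf.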

That’s it!

If you just want the code, you can download my sample code from GitHub.

I hope this short how-to has helped with your data science tasks! Happy coding!