Solution to my Twitter API - twitteR issues

With lots of help from Bob O’Hara (thank you!), I was able to solve my problems. I am looking at the tweets around #AGU10, but it occurred to me that I also wanted to know what other tweets the AGU twitterers were sending while at the meeting, because some might not have had the hashtag.

Here goes:

# Load the twitteR package and get the timeline
library(twitteR)
person <- userTimeline("person", n=500)

# Check to see how many you got
length(person)

# Check to see if that is far enough back (look at the last tweet returned,
# since you do not always get exactly 500)
person[[length(person)]]$getCreated()

# Get the time it was tweeted
Time <- sapply(person, function(lst) lst$getCreated())

# Get screen name
SN <- sapply(person, function(lst) lst$getScreenName())

# Get any reply to screen names
Rep2SN <- sapply(person, function(lst) lst$getReplyToSN())

# Get the text
Text <- sapply(person, function(lst) lst$getText())

# fix the date from number of seconds to a human readable format
TimeN <- as.POSIXct(Time,origin="1970-01-01", tz="UTC")

# replace the blanks with NA
Rep2SN.na <- sapply(Rep2SN, function(str) ifelse(length(str)==0, NA, str))

# make it into a data frame
Data.person <- data.frame(TimeN=TimeN, SN=SN, Rep2SN.na=Rep2SN.na, Text=Text)

# save it out to csv
write.csv(Data.person, file="person.csv")

 

I did this by finding and replacing person with the screen name in a text editor and pasting the result into the script window in Rcmdr. I found that 500 was rarely enough. For some people I had to request up to 3200 tweets, which is the maximum, and I had to skip one person because even 3200 didn’t get me back to December. It’s also worth noting the length() step: when you ask for 500 you sometimes get 550, sometimes 450, or anything in between, and it’s not because there aren’t any more. You may also wonder why I wrote the whole thing out to a csv file. I could have added a step to cut out the more recent and older tweets so that just the set from the meeting was left for more operations within R, but I actually need to do qualitative content analysis on the tweets, and I plan to do that in NVIVO9.
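For anyone who does want to trim the tweets inside R, here is a minimal sketch of that step. It assumes a data frame with the same columns as Data.person; the rows below are made up for illustration, and the start and end dates are placeholders to be replaced with the real meeting window.

```r
# Made-up rows standing in for one person's Data.person
Data.person <- data.frame(
  TimeN = as.POSIXct(c("2010-11-30 18:00:00", "2010-12-13 09:15:00",
                       "2010-12-16 21:40:00", "2011-01-05 12:00:00"),
                     tz = "UTC"),
  SN = "person",
  Rep2SN.na = c(NA, "someone", NA, "someone"),
  Text = c("before", "during 1", "during 2", "after"),
  stringsAsFactors = FALSE
)

# Assumed window: set these to the actual meeting dates
start <- as.POSIXct("2010-12-13 00:00:00", tz = "UTC")
end   <- as.POSIXct("2010-12-18 00:00:00", tz = "UTC")

# Keep only the tweets sent inside the window
Data.meeting <- Data.person[Data.person$TimeN >= start &
                            Data.person$TimeN < end, ]
nrow(Data.meeting)
```

Because TimeN is already POSIXct, the comparison works directly on the column and no string parsing of the dates is needed.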

I didn’t do this for all 860 people, either. I did it for the 30 or so who tweeted 15 or more times with the hashtag. I might expand that to 10 or more (17 more people). Also, I didn’t keep the organizational accounts (like theAGU).
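With 30 or more screen names, the find-and-replace routine could be swapped for a small function. The sketch below is one possible way to do it, not what was actually run: timeline_to_df and fake_status are hypothetical names, and the fake status objects only imitate the four accessor methods used in the script so the function can be tried without a Twitter connection.

```r
# Convert a list of status objects (as returned by userTimeline) to a data frame.
# Only the four accessor methods used in the script above are assumed.
timeline_to_df <- function(statuses) {
  Time   <- sapply(statuses, function(s) as.numeric(s$getCreated()))
  SN     <- sapply(statuses, function(s) s$getScreenName())
  Rep2SN <- sapply(statuses, function(s) {
    r <- s$getReplyToSN()
    if (length(r) == 0) NA_character_ else r   # blank reply becomes NA
  })
  Text   <- sapply(statuses, function(s) s$getText())
  data.frame(TimeN = as.POSIXct(Time, origin = "1970-01-01", tz = "UTC"),
             SN = SN, Rep2SN.na = Rep2SN, Text = Text,
             stringsAsFactors = FALSE)
}

# Hypothetical stand-ins for status objects, for trying the function offline
fake_status <- function(created, sn, reply, text) {
  list(getCreated    = function() as.POSIXct(created, tz = "UTC"),
       getScreenName = function() sn,
       getReplyToSN  = function() reply,
       getText       = function() text)
}
fake <- list(
  fake_status("2010-12-14 10:00:00", "person", character(0), "hello #AGU10"),
  fake_status("2010-12-13 09:00:00", "person", "someone", "@someone hi")
)
Data.fake <- timeline_to_df(fake)

# With twitteR loaded, the loop over people might look like this (not run):
# for (sn in c("person1", "person2")) {
#   Data <- timeline_to_df(userTimeline(sn, n = 3200))
#   write.csv(Data, file = paste(sn, ".csv", sep = ""))
# }
```

The commented-out loop writes one csv per screen name, which matches the one-file-per-person output of the original script.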

With that said, it’s very tempting to paste all of these data frames together, remove the text, and do the social network analysis using igraph. Even cooler would be to show an automated display of how the social network changes over time. Are there new connections formed at the meeting (I hope so)? Do the connections formed at the meeting continue afterward? If I succumb to the temptation, I’ll let you know. There’s also the text mining package and plugin for Rcmdr. This post gives an idea of what can be done with that.
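As a rough idea of what the pasting-together step could look like, here is a sketch in base R. The two data frames and the screen names alice and bob are invented for illustration; they only mirror the columns of Data.person. Treating each reply as an edge from the sender to the person replied to is an assumption about how the network would be defined.

```r
# Invented per-person data frames shaped like Data.person
Data.alice <- data.frame(
  TimeN = as.POSIXct(c("2010-12-13 10:00:00", "2010-12-14 11:00:00"), tz = "UTC"),
  SN = "alice", Rep2SN.na = c("bob", NA),
  Text = c("@bob see you at the poster", "great talk"),
  stringsAsFactors = FALSE)
Data.bob <- data.frame(
  TimeN = as.POSIXct(c("2010-12-13 10:05:00", "2010-12-15 09:00:00"), tz = "UTC"),
  SN = "bob", Rep2SN.na = c("alice", "alice"),
  Text = c("@alice yes!", "@alice coffee?"),
  stringsAsFactors = FALSE)

# Paste the data frames together and remove the text
All <- rbind(Data.alice, Data.bob)
All$Text <- NULL

# Keep only the replies: each row is an edge from SN to Rep2SN.na
Edges <- All[!is.na(All$Rep2SN.na), c("SN", "Rep2SN.na", "TimeN")]
names(Edges) <- c("from", "to", "time")

# Count how often each pair occurs, to use as an edge weight
Weights <- aggregate(list(weight = rep(1, nrow(Edges))),
                     by = list(from = Edges$from, to = Edges$to), FUN = sum)

# With igraph installed, igraph::graph_from_data_frame(Weights) should turn
# this edge list into a graph object (not run here).
```

Keeping the time column in Edges leaves the door open for the changes-over-time question, since the edges can be split by date before counting.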

2 responses so far

  • Dave C. says:

    So I have no idea what Twitter API is and I have never sent a tweet, although my publisher keeps recommending that I figure it out. I am impressed, however, that someone who works with R also works with qualitative analysis software like NVIVO. I thought I was one of the few working in R and qualitative analysis. Now I know someone else. Cool.
