Accidentally in Code

Exploring a Conference Hashtag

My supervisor had the idea of grabbing a conference dataset by hashtag, specifically the Eclipse Conference 2010 (hashtag #ese), which took place in Ludwigsburg, Germany, from November 2nd to November 4th.

You can get an idea of what people were talking about in the wordle, below (applet is here):

Apparently there were a lot of RTs. We’ll explore that later…

I started off with HTML files that he had grabbed for me, extracted all the tweet IDs (regular expressions ftw), and then downloaded all the information for each tweet from the API (rate-limiting is the new compiling). Finally I had a spreadsheet with a total of 640 tweets (only one couldn’t be retrieved) from 181 different users.
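As a rough illustration of the ID-extraction step, here is a minimal sketch. It assumes the saved HTML contains tweet permalinks of the form `twitter.com/<user>/status/<id>`; the pattern, the function name and the sample HTML are all made up for the example, and the real files may well have needed a different expression.

```python
import re

# Assumed permalink shape: twitter.com/<user>/status/<id>, optionally with the
# old "#!/" prefix or the "statuses" spelling. The real HTML may differ.
TWEET_ID_RE = re.compile(r"twitter\.com/(?:#!/)?\w+/status(?:es)?/(\d+)")

def extract_tweet_ids(html):
    """Return unique tweet IDs (as strings) in order of first appearance."""
    seen = set()
    ids = []
    for match in TWEET_ID_RE.finditer(html):
        tweet_id = match.group(1)
        if tweet_id not in seen:
            seen.add(tweet_id)
            ids.append(tweet_id)
    return ids

# Made-up sample input, including one duplicate link.
sample_html = (
    '<a href="http://twitter.com/alice/status/123456789">link</a>'
    '<a href="http://twitter.com/#!/bob/statuses/987654321">link</a>'
    '<a href="http://twitter.com/alice/status/123456789">dup</a>'
)
print(extract_tweet_ids(sample_html))
```

Keeping the IDs as strings avoids any surprises with very large numbers when they are later pasted into a spreadsheet.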

One user has a total of 26 tweets in the dataset; however, the majority tweeted the hashtag just once. The frequency distribution is shown in the chart below.
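A frequency distribution like this can be computed in two counting passes. The sketch below is illustrative only: the author list is invented, and the function name is hypothetical rather than anything used for the original chart.

```python
from collections import Counter

def tweets_per_user_distribution(authors):
    """authors: one entry per tweet, naming the user who posted it."""
    per_user = Counter(authors)                 # user -> number of tweets
    distribution = Counter(per_user.values())   # tweet count -> number of users
    return per_user, distribution

# Made-up data: alice tweets three times, everyone else once.
authors = ["alice", "bob", "alice", "carol", "alice", "dave"]
per_user, distribution = tweets_per_user_distribution(authors)

for tweet_count in sorted(distribution):
    print(tweet_count, "tweet(s):", distribution[tweet_count], "user(s)")
```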

The web and TweetDeck were by far the most popular clients, as per the chart below. Of course, this can be skewed by a few users posting a lot from the same client.

To reduce this skew, I counted each user/source combination only once to create the chart below:

TweetDeck now seems slightly less popular! Given how tech-savvy these users are (Eclipse is an IDE, amongst other things, and is also Open Source), it’s interesting that the web is so prevalent, and Android less so. Although Twitdroid and Twitter for Android are there, they are both dominated by Twitter for iPhone.

Just 38 of the 181 users use multiple clients, although one user uses 5 (!).
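The difference between counting clients per tweet and per user, and the count of multi-client users, can be sketched as follows. The rows are invented, and `client_stats` is a hypothetical helper, not the spreadsheet workflow actually used.

```python
from collections import Counter

def client_stats(rows):
    """rows: one (user, source) tuple per tweet."""
    by_tweet = Counter(source for _user, source in rows)

    # Each user/source combination is counted once, however often it occurs.
    unique_pairs = set(rows)
    by_user = Counter(source for _user, source in unique_pairs)

    # Users who appear with more than one distinct client.
    clients_per_user = Counter(user for user, _source in unique_pairs)
    multi_client_users = sorted(u for u, n in clients_per_user.items() if n > 1)
    return by_tweet, by_user, multi_client_users

# Made-up data: alice posts heavily from TweetDeck and once from the web.
rows = [("alice", "TweetDeck")] * 3 + [
    ("alice", "web"),
    ("bob", "web"),
    ("carol", "Twitter for iPhone"),
]
by_tweet, by_user, multi_client_users = client_stats(rows)
print(by_tweet.most_common())
print(by_user.most_common())
print(multi_client_users)
```

In this toy data TweetDeck leads when counting tweets but drops behind the web once each user/source pair counts only once, which is the same kind of shift described above.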

Below is a heat map of the locations of the users for the tweets in the dataset. The conference took place in Europe, so many of the participants were from that area, but we also see users from North America.

[iframe: src=”http://www.openheatmap.com/embed.html?map=PheromonesMotherboardNightstick” frameborder=”0″ width=”600″ height=”450″ scrolling=”yes”]

Only 8 tweets (1.25% of the 640-tweet dataset) had geo-location data, and just 75 (11.7%) were replies. 55 of the 181 user accounts (30.4%) are geo-enabled.
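The percentages are simple ratios of the counts above; a small helper (the name is arbitrary) makes them easy to re-check:

```python
def percentage(part, total, digits=2):
    """Return part as a percentage of total, rounded to `digits` places."""
    return round(100 * part / total, digits)

print(percentage(8, 640))    # tweets with geo-location data
print(percentage(75, 640))   # replies
print(percentage(55, 181))   # geo-enabled accounts
```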

I filtered the dataset to keep just one tweet per user (the last one they posted with the conference hashtag).
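One way to do this kind of filtering is to keep, for each user, the tweet with the latest timestamp. The sketch below assumes each tweet is a dict with `user` and `created_at` keys; those field names and the sample tweets are assumptions for illustration.

```python
from datetime import datetime

def last_tweet_per_user(tweets):
    """Return a dict mapping each user to their most recent tweet."""
    latest = {}
    for tweet in tweets:
        user = tweet["user"]
        if user not in latest or tweet["created_at"] > latest[user]["created_at"]:
            latest[user] = tweet
    return latest

# Made-up tweets within the conference dates.
tweets = [
    {"user": "alice", "created_at": datetime(2010, 11, 2, 9, 0), "text": "first #ese"},
    {"user": "bob", "created_at": datetime(2010, 11, 3, 12, 30), "text": "only #ese"},
    {"user": "alice", "created_at": datetime(2010, 11, 4, 17, 45), "text": "last #ese"},
]
reduced = last_tweet_per_user(tweets)
for user, tweet in sorted(reduced.items()):
    print(user, tweet["created_at"].isoformat(), tweet["text"])
```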

The location heatmap with the reduced dataset:

[iframe: src=”http://www.openheatmap.com/embed.html?map=HypercriticallyThesaurussStruts” frameborder=”0″ width=”600″ height=”450″ scrolling=”yes”]

Despite the worldwide locations, the vast majority of users have their language set to English:

How do people at the Eclipse Conference describe themselves? Wordles have limitations in terms of statistical significance, but I find them useful for picking out specific themes. The wordle for users’ bios is below (applet here); “Eclipse”, “software”, “Java” and “Developer” feature prominently.

The earliest user joined in December 2006, but some joined relatively recently – in the chart below, we see a spike around February/March 2009 (this makes sense, given the astounding growth of Twitter at that time).

Personally, I use my favorites to collect things I mean to read, so I had a look at how these users were favoriting too. Users had between 0 and 2366 favorites. A mean of 43.9, median of 3, and mode of 0 suggest that many of these users don’t use favorites at all. The standard deviation was, unsurprisingly, large: 204.23.
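To see why a few heavy users pull the mean far above the median, here is a small sketch using Python's `statistics` module. The favorite counts are invented and much smaller than the real dataset; only the shape (mostly zeros plus one very large value) mirrors it. Whether the original figure was a sample or population standard deviation isn't stated, so the sample version here is an assumption.

```python
import statistics

def summarize(values):
    """Return basic descriptive statistics for a list of counts."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "stdev": statistics.stdev(values),  # sample standard deviation (assumed)
    }

# Made-up favorite counts: most users have none, one has thousands.
favorites = [0, 0, 0, 3, 5, 40, 2366]
summary = summarize(favorites)
for name, value in summary.items():
    print(name, value)
```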

I graphed follower/following with size proportional to number of lists using Many Eyes.

Finally, URLs. I was surprised that 54 (29.8%) of users did not have a URL in their profile. Three, shockingly, have a Facebook URL (one of which is not a vanity URL). Blogspot (22 users) is more popular than WordPress (5 users).
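Grouping profile URLs by host could look something like the sketch below. It assumes missing URLs are stored as empty strings or `None` and that blogs are recognisable by a `blogspot.com` or `wordpress.com` host; self-hosted WordPress sites would land in "other". The category names and sample URLs are made up.

```python
from collections import Counter
from urllib.parse import urlparse

def profile_url_categories(urls):
    """Count profile URLs by a few coarse, hand-picked categories."""
    counts = Counter()
    for url in urls:
        if not url:
            counts["none"] += 1
            continue
        # hostname is None when the URL lacks a scheme; treat that as "other".
        host = (urlparse(url).hostname or "").lower()
        if host.endswith("blogspot.com"):
            counts["blogspot"] += 1
        elif host.endswith("wordpress.com"):
            counts["wordpress"] += 1
        elif host.endswith("facebook.com"):
            counts["facebook"] += 1
        else:
            counts["other"] += 1
    return counts

# Made-up profile URLs.
urls = [
    "http://example.blogspot.com/",
    "http://another.blogspot.com/",
    "http://example.wordpress.com/",
    "http://www.facebook.com/profile.php?id=1",
    "http://example.org/",
    "",
    None,
]
url_counts = profile_url_categories(urls)
print(url_counts.most_common())
```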

Next I’ll be looking at temporal rhythms and mapping @ mentions.
