
Who’s Talking About the Future of Newspapers?

My friend Caitlin is using Twitter to investigate the discourse around the future of newspapers. She has collected a bunch of data in a spreadsheet, and I get to visualize it – yay!

First up, extracting some general stats. I used Apache POI to get the enormous spreadsheet into Java (normally I would use Python for this kind of thing, but because I’ll use Java for the visualization later, I’m doing it all in Java). POI made this super easy. Literally:

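import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

// Reads every row of the first sheet of an .xls file into a Tweet.
public static List<Tweet> extractTweets(String filename) throws IOException {
	// try-with-resources closes the file even if parsing fails
	try (InputStream inp = new FileInputStream(filename)) {
		HSSFWorkbook wb = new HSSFWorkbook(new POIFSFileSystem(inp));
		List<Tweet> tweets = new ArrayList<>();
		HSSFSheet sheet = wb.getSheetAt(0);
		for (int i = 0; i <= sheet.getLastRowNum(); i++) {
			HSSFRow row = sheet.getRow(i);
			if (row == null) {
				continue; // getRow returns null for rows that were never filled in
			}
			String name = row.getCell(1).getStringCellValue(); // cell index 1: user name
			Date date = row.getCell(2).getDateCellValue();     // cell index 2: tweet date
			String tweet = row.getCell(4).toString();          // cell index 4: tweet text
			tweets.add(new Tweet(name, date, tweet));
		}
		return tweets;
	}
}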

With the tweets loaded, I’ve extracted a few overview stats: the total number of tweets, the number of tweets containing @ mentions, the number of @ replies, and the number of distinct users mentioned. You can see what this looks like for the 20 people in the chart below:

User Stats
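For the curious, here is a rough sketch of how per-user stats like these could be computed. It is not the exact code behind the chart. The `Tweet` class here is a hypothetical stand-in (the real one isn't shown in this post), and the `TweetStats` class, its field names, and the `byUser` helper are all made up for illustration. It also bakes in a few assumptions: a mention is an `@` followed by letters, digits, or underscores; a reply is a tweet that starts with a mention; replies also count as tweets containing mentions; and user names are compared case-insensitively.

```java
import java.util.Date;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TweetStats {
    // Hypothetical stand-in for the Tweet class used by extractTweets.
    static final class Tweet {
        final String name;
        final Date date;
        final String text;

        Tweet(String name, Date date, String text) {
            this.name = name;
            this.date = date;
            this.text = text;
        }
    }

    // Assumed mention syntax: '@' plus word characters, not preceded by a
    // word character (so e-mail addresses like a@example.com are skipped).
    private static final Pattern MENTION =
            Pattern.compile("(?<![A-Za-z0-9_])@([A-Za-z0-9_]+)");

    int totalTweets;
    int tweetsWithMentions;
    int replies;
    final Set<String> usersMentioned = new HashSet<>();

    // Folds one tweet into the running counts.
    void add(Tweet tweet) {
        totalTweets++;
        String text = tweet.text.trim();
        Matcher m = MENTION.matcher(text);
        boolean mentioned = false;
        while (m.find()) {
            mentioned = true;
            if (m.start() == 0) {
                replies++; // assumption: a reply starts with a mention
            }
            usersMentioned.add(m.group(1).toLowerCase(Locale.ROOT));
        }
        if (mentioned) {
            tweetsWithMentions++;
        }
    }

    // Groups tweets by author name and computes stats for each author.
    static Map<String, TweetStats> byUser(List<Tweet> tweets) {
        Map<String, TweetStats> stats = new TreeMap<>();
        for (Tweet t : tweets) {
            stats.computeIfAbsent(t.name, k -> new TweetStats()).add(t);
        }
        return stats;
    }
}
```

Calling `TweetStats.byUser(tweets)` on the extracted list gives one `TweetStats` per user name, and `usersMentioned.size()` is the distinct-users count. Retweets of the form "RT @user" count as mentions but not as replies under these rules, which may or may not be what you want.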

More to come!

