Accidentally in Code

Who’s Talking About the Future of Newspapers?

My friend Caitlin is using Twitter to investigate the discourse around the future of newspapers. She has collected a bunch of data in a spreadsheet, and I get to visualize it – yay!

First up, getting the data out. I used Apache POI to read the enormous spreadsheet into Java (normally I would use Python for this kind of thing, but because I'll use Java for the visualization later, I'm just doing it all in Java). POI made this super easy, literally:

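import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Date;
import java.util.LinkedList;
import java.util.List;
import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

public static List<Tweet> extractTweets(String filename) throws IOException {
	// try-with-resources closes the file even if parsing fails
	try (InputStream inp = new FileInputStream(filename)) {
		HSSFWorkbook wb = new HSSFWorkbook(new POIFSFileSystem(inp));

		List<Tweet> tweets = new LinkedList<>();
		HSSFSheet sheet = wb.getSheetAt(0);
		for (int i = 0; i <= sheet.getLastRowNum(); i++) {
			HSSFRow row = sheet.getRow(i);
			if (row == null) continue; // getRow returns null for rows that were never written
			// Cells 1, 2 and 4 (0-based) hold the user name, date and tweet text
			String name = row.getCell(1).getStringCellValue();
			Date date = row.getCell(2).getDateCellValue();
			String tweet = row.getCell(4).toString();
			tweets.add(new Tweet(name, date, tweet));
		}

		return tweets;
	}
}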
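
The Tweet class used by that method isn't shown here. A minimal sketch of what it could look like, assuming it is just an immutable holder for the three values passed to the constructor (the field and getter names are my guesses, not taken from the real code):

```java
import java.util.Date;

// Hypothetical value class: names are assumptions based on the
// constructor call new Tweet(name, date, tweet).
final class Tweet {
    private final String name;
    private final Date date;
    private final String text;

    Tweet(String name, Date date, String text) {
        this.name = name;
        // java.util.Date is mutable, so keep a private copy (null-safe).
        this.date = (date == null) ? null : new Date(date.getTime());
        this.text = text;
    }

    String getName() {
        return name;
    }

    Date getDate() {
        // Hand out a copy so callers cannot change the stored date.
        return (date == null) ? null : new Date(date.getTime());
    }

    String getText() {
        return text;
    }
}
```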

From that, I've extracted a couple of overview stats. Specifically: the total number of tweets, the number of tweets containing @ mentions, the number of @ replies, and the number of distinct users mentioned. You can see what this looks like for the 20 people in the chart below:
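
For the curious, here is a rough sketch of how those four numbers could be computed from the tweet text. It is not the exact code behind the chart: the TweetStats class is a made-up name, and it assumes that a mention is an @ followed by letters, digits or underscores, that a reply is a tweet whose text starts with a mention, and that user names are compared case-insensitively.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class TweetStats {
    // Assumption: '@' not preceded by a word character starts a mention,
    // so e-mail addresses such as me@example.com are not counted.
    private static final Pattern MENTION =
            Pattern.compile("(?<![A-Za-z0-9_])@([A-Za-z0-9_]+)");

    final int total;
    final int withMentions;
    final int replies;
    final Set<String> usersMentioned = new HashSet<>();

    TweetStats(List<String> tweetTexts) {
        int mentionCount = 0;
        int replyCount = 0;
        for (String text : tweetTexts) {
            Matcher m = MENTION.matcher(text);
            boolean found = false;
            while (m.find()) {
                found = true;
                // Lower-case so @Alice and @alice count as one user.
                usersMentioned.add(m.group(1).toLowerCase(Locale.ROOT));
            }
            if (found) {
                mentionCount++;
            }
            // Assumption: a reply is a tweet that begins with a mention.
            if (MENTION.matcher(text.trim()).lookingAt()) {
                replyCount++;
            }
        }
        this.total = tweetTexts.size();
        this.withMentions = mentionCount;
        this.replies = replyCount;
    }
}
```

Running this once per user over their tweet text would give one row of numbers per person, which is roughly the shape a chart like this needs.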

More to come!
