If you’re not a computer scientist, probably you think of cliques like in the movie Mean Girls. A bunch of people who talk the same and look the same and act the same…

To computer scientists (well, not all of them – I’m more inclined to think about Mean Girls to be honest) cliques are a part of graph theory. Essentially, it’s a group of nodes within a graph that are all connected to each other (Wikipedia).

### So what? Big who cares, right? Graph theory is super boring!

I’m not big on the maths side of computer science, been there, got the t-shirt, and now I want to make stuff! But recently I’ve been working on applying graph theory to the **Twitter Conversation Networks** I make in order to produce my graphs. Why? Because this will allow me to pull out the sub-networks that you’re a part of, and the cliques that you’re connected to. Maybe if you know this you’ll find other people you’re potentially interested in following or (even better!) talking to.

*For instance, here are my cliques of size 4 (the maximum size of the cliques found) within my 2 degree network:*

[1,2,10,24] – tgrevatt, kittenthebad (me), childspeter, isfalk

[1,2,20,21] – tgrevatt, kittenthebad (me), mazamengr, krusk

[1,2,21,24] – tgrevatt, kittenthebad (me), krusk, isfalk

[1,21,24,74] – tgrevatt, krusk, isfalk, erinblaskie

[1,21,24,85] – tgrevatt, krusk, isfalk, julien

[1,20,21,74] – tgrevatt, mazamengr, krusk, erinblaskie

[2,21,24,36] – kittenthebad (me), krusk, isfalk, foursquare

*There are more for size 3, but you get the picture.*

This is a step towards making simpler graphs, where I will graph only the cliques in someone’s network – this will make it easier to see your important connections, I think. For now, if you want one, you can have a mapping and the result of clique finding for a minimum size of 3 or more.

### How it’s done

- I use the GraphML generated by my Twitter datamining library, that can also be used to produce visualizations.
- A Python script generates a mapping and a list of connections
- I implemented clique finding using Haskell – this means it runs really quickly and I don’t have to worry about memory because of lazy evaluation.
- Minor optimization in the Haskell code, but significant due to the sparse nature of the graphs: I only try and find connections for nodes that have the minimum number of connections. Should be able to further optimize so that the same cliques aren’t found multiple times.