Tag: python

  • Creating a Randomized Emoji String in Python

    [Image: TS 2016 Report.png]

    You might have noticed this part of the 2016 Emoji Report: the graphic features an emoji for each of our subscribers. It’s a randomized ordering of all the different skin tones of the 👩‍💻 and 👨‍💻 emoji, repeated [number of subscribers] / 12 times. Our subscriber count wasn’t exactly divisible by 12, so I deleted a few blond 👨‍💻 because usually they are over-represented.

    A note on inclusion: I nearly used the 🗣 emoji, but I decided that was too anonymous, and anonymous defaults are usually male. We think that at least half our subscribers are women, so this seemed like a nice way to show that off. I also debated whether to use the default yellow emoji, but opted to in part because of my colleague John’s article Did I Grow Up And Become The Yellow Hand? 

    We needed so many emoji that I knew I would have to script it. I tend to script things in Python, and figured I could use IPython and that would be cool. In the end I ran into a couple of things and opted to make a file instead. First, emoji support in the terminal was terrible (come on Apple!), so I couldn’t see what I was doing (emoji support in Xcode is great). Second, writing to a file from IPython seemed a bit annoying (of course I went to look that up afterwards, and found that it’s actually straightforward. Is there a word for the StackOverflow answers you find after you’ve already solved your problem?).

    The only other gotcha was that I needed the elements in a list in order to shuffle them, which makes sense. Here’s the entire four lines of code! Notice that the encoding is specified at the top.

    # -*- coding: UTF-8 -*-
    
    import numpy
    import random
    
    # A list of the emoji we want to randomize.
    emojilist = ['👩‍💻','👩🏻‍💻','👩🏽‍💻','👩🏼‍💻','👩🏾‍💻','👩🏿‍💻','👨‍💻','👨🏻‍💻','👨🏼‍💻','👨🏽‍💻','👨🏾‍💻','👨🏿‍💻']
    
    # Repeat it X number of times (I'm using 10 here as an example).
    repeated = numpy.repeat(emojilist, 10)
    
    # Randomize the list (in place).
    random.shuffle(repeated)
    
    # Join the elements together with nothing in between, and print it out.
    # It's easy to pipe the output to a file using ">".
    print(''.join(repeated))
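
    For a subscriber count that isn’t a multiple of 12, the repeat-shuffle-trim idea can be sketched in Python 3 with only the standard library. This is a rough sketch, not the script I actually ran: the function names, the `rng` parameter and the output path are all made up for illustration, and it trims random surplus emoji rather than hand-picking which ones to delete.

```python
import math
import random

# The twelve emoji from the post: six variants of each technologist.
EMOJI = ['👩‍💻', '👩🏻‍💻', '👩🏽‍💻', '👩🏼‍💻', '👩🏾‍💻', '👩🏿‍💻',
         '👨‍💻', '👨🏻‍💻', '👨🏼‍💻', '👨🏽‍💻', '👨🏾‍💻', '👨🏿‍💻']


def emoji_for_subscribers(subscriber_count, rng=random):
    """Return a shuffled list with exactly subscriber_count emoji."""
    # Repeat the whole set enough times to cover everyone, rounding up.
    repeats = math.ceil(subscriber_count / len(EMOJI))
    pool = EMOJI * repeats
    rng.shuffle(pool)
    # Drop the surplus so the total matches the requested count.
    return pool[:subscriber_count]


def write_emoji(path, subscriber_count):
    """Write the joined emoji string to a UTF-8 file (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("".join(emoji_for_subscribers(subscriber_count)))
```

    Calling something like `write_emoji("emoji.txt", 100)` (with a made-up count of 100) would skip the terminal entirely, which sidesteps the emoji rendering problem.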
    
  • PyCon AU: Exploring Science on Twitter with IPython Notebook and Python Pandas

    Brenda gave a great talk at PyCon AU about using IPython and Pandas for her research. Slightly rough notes below.

    She has a dataset of 12 million tweets containing the word “science”: about a year’s worth of data, after filtering out non-English tweets and spam.

    She uses UTC to reduce timezone problems, although some remain: mostly, tools that expect the month first cause date-related problems.
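
    The talk notes don’t record how the timestamps were actually parsed, so the following is only a sketch of one way to avoid the month-first trap: state the field order explicitly and attach UTC. The sample string and both format strings are assumptions for illustration.

```python
from datetime import datetime, timezone

# Hypothetical timestamp string; "03/04/2013" is ambiguous on its own.
raw = "03/04/2013 18:30:00"

# State the field order explicitly instead of letting a parser guess,
# then mark the result as UTC.
day_first = datetime.strptime(raw, "%d/%m/%Y %H:%M:%S").replace(tzinfo=timezone.utc)
month_first = datetime.strptime(raw, "%m/%d/%Y %H:%M:%S").replace(tzinfo=timezone.utc)

print(day_first.isoformat())    # 2013-04-03T18:30:00+00:00
print(month_first.isoformat())  # 2013-03-04T18:30:00+00:00
```

    The same string lands a month apart depending on which order is assumed, which is exactly the kind of problem that survives a switch to UTC.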

    Found more tweets about science mid-week than at weekends – this matches wider patterns of Twitter use in other research.

    IPython features (describe() and groupby() are pandas methods, shown here running in IPython):

    • describe() – summary of the object.
    • groupby() – reorganize your data-structure to group by some attribute.
    • Exports to LaTeX.
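
    To give a feel for describe() and groupby(), here is a small sketch using pandas. The data is invented and the column names (`user`, `retweets`) are hypothetical; none of this is from Brenda’s actual dataset.

```python
import pandas as pd

# Made-up stand-in for the tweet dataset; column names are hypothetical.
tweets = pd.DataFrame({
    "user": ["ada", "ada", "grace", "grace", "grace"],
    "retweets": [0, 3, 1, 0, 5],
})

# describe() summarises a numeric column (count, mean, std, quartiles...).
summary = tweets["retweets"].describe()

# groupby() regroups rows by an attribute; here, total retweets per user.
per_user = tweets.groupby("user")["retweets"].sum()

print(summary)
print(per_user)
```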

    IP[y] : Notebook

    • Really cool – make notes about what you are doing, interleaved with code.
    • Great for research.
    • ? – inline help.
    • ?? – inline src.
    • %%timeit – times execution, useful for measuring performance.
    • %pastebin – sends code to pastebin.
    • %save – makes a .py file.
    • %run – runs a script.

    Pandas

    • Data structures.
    • Data analysis.
    • Time-based indexing.
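
    Time-based indexing is probably easiest to see in a tiny example. This is a sketch with invented timestamps, not the talk’s code: it selects a month by partial date string and counts tweets per day of the week, the kind of grouping behind the mid-week finding above.

```python
import pandas as pd

# Hypothetical UTC timestamps standing in for tweet creation times.
stamps = pd.to_datetime([
    "2013-04-01 09:00", "2013-04-03 12:30", "2013-04-03 18:45",
    "2013-04-06 10:15", "2013-05-02 08:00",
], utc=True)
tweets = pd.Series(1, index=stamps)

# Partial-string indexing: every tweet from April 2013.
april = tweets.loc["2013-04"]

# Count tweets per day of the week (0 = Monday ... 6 = Sunday).
per_weekday = tweets.groupby(tweets.index.dayofweek).sum()

print(len(april))
print(per_weekday)
```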

    Overall

    I’m pretty fascinated with the results of this research, which we didn’t see much of as the talk was about the technical setup. I feel like this would have been incredibly handy when doing my own research though, and it was good to chat to Brenda at our women’s breakfast and compare notes on other tools like Processing, Prefuse etc.