Optimist, both in belief in technical progress and in what we can do with it; underlying hope for humanity.
2015. We really do live in the future. You can go on Kickstarter and support a project to buy an LED disco suit for your dog.
If someone from the 1950s arrived today, what would be the hardest thing to explain?
Answer: device in pocket, cats, arguments.
Heard a theatre prof bemoan that we can still relate to characters written hundreds of years ago. Human nature has remained remarkably consistent.
Recent pictures of Obama – everyone in crowd has a phone. They are documenting their experience through technology, and they are gathering data.
First: technology. Again, we really live in the future. Most excited about “machine intelligence”.
Data products are everywhere. E.g. Foursquare. Built an interactive data product. Started with a game, used that data to figure out what the signal strength looks like when in this restaurant vs that one. Learns what you want. Now when you walk into a restaurant you like, it can give a tip, e.g. “you’re gonna love the pancakes here”.
Dark Sky. App takes US government-provided weather data, gets your location, and gives you a micro weather report.
Lauren Talbot @ NYC Mayor’s office of analytics. Looked at ambulance response times. Each precinct has so many ambulances. Calculated the optimal place for them to sit, checked where they actually sat, but it wasn’t the same. Asked why: 24hr bathrooms and coffee. Also figured out how to reorder questions to get the call handed off to an ambulance sooner. Managed to reduce response time by 1 minute. Simple. Just combining two data sources.
Google Maps traffic view – visually so boring. Metaphor that we all know. Uses historical models and real-time data, mindblowingly complicated. But makes something so simple we look at it and don’t give it a thought. Enhances our experience but we don’t know anything about how it works.
Why is 2015 a great year to think about data products? One is infrastructure. Computers are cheap, we have the infrastructure to actually process things. Not just cheap, but cheap enough to play with. At Bit.ly, tested a Hadoop cluster by checking if people click on cats or dogs more on social media. Realising that we can allocate resources to such a trivial question changes things (answer: dogs). Even a small startup can have the resources to analyse it.
Second: we kinda know what to do with it. Companies think their data is special and unique. Most is from transactions. Consumer behaviour is pretty consistent. This is actually a good thing.
Data science is a job that people can have. Distinct from statistician, and from software engineer. Do three things in one professional role. Math and stats. Code, well enough to get data out of a data store and to use it, sometimes well enough to write production code. Communicate: need to understand what that person needs to understand, go away, do analysis, come back and help them make a better decision – hard part is not the programming, it’s understanding what someone needs to understand and communicating back to them.
See data scientists working in product. Which algorithm is right for the product requires understanding the product, the goals, the user. Communication is key, not just about having a PhD in statistics.
“AI is whatever a computer can’t do today”. Used to be chess. Driving a car. Now it’s being creative.
Creating an organisation is just deploying a different kind of infrastructure.
Organisational structures limit the products that get made. When all organisations are structured the same way, they are all limited in the same ways.
Incentives matter. How you build companies, who is in charge of what: those incentives matter, affect what you can build.
Worked with VC, did it to learn how that side functions. Are startups really where the most innovative ideas happen? How do they go from being small to being big?
Academia. Was a professor – incentivised to publish, work limited to what you can publish. Free within that, but a very strong restriction. Even with tenure, you have students, compete for grant money.
Big companies, can take risks? But risk taking is very frowned upon in big companies. People don’t get promoted for taking risks and failing. Data science is about probability, but that means there is no definite positive outcome. Create a research group, or work with other companies. Can do it themselves, but incentives don’t align.
Startups. But actually hard, taking a technology risk at the same time as taking a business risk. Most startups aren’t that innovative. New business model on proven technology. Leads to a homogenisation of the kind of products that get built.
Advice: Think for yourself and love cats.
We think about startups like they exist in a vacuum. Absolutely not the case. Cannot exist without the context around it. A great idea this year might have been a terrible idea two years ago.
Bit.ly – couldn’t start that business today, couldn’t get it to scale.
Any idea, think about the context. Figure out whether the things you need around you are there to make it successful.
Can’t do any of this without people. What it means to be working in technology is changing dramatically today. If you had any one of the 3 components of data science 10 years ago, that would have been your whole career. We combine them not because we are more remarkable, but because technology has changed and you can master enough of each of these to be useful. Now it’s much cheaper, used to need to hire 3-4 people to do what one person does today.
Think about what the world might look like in 5-10 years. Find the opportunity in that change.
Don’t do it alone. Need a team who complement each other. Have to get other people to agree with you. Stand up and say “I sort of thought this would be interesting”, people will collaborate. Important to think, but also to get allies to actually get there.
People aren’t fungible. People are just not fungible. Even for one skill, can’t just say that one person is as good as another. Each person brings their own perspective.
Try to look for people who have more than one perspective. Really valuable to find people who have a perspective from more than one community. Who have gone through that mental exercise to be fluent in more than one domain, brings a lot of creativity to that work.
As professionals need to adapt, process needs to adapt. Tension between agile and data science. Process is just a programming language for people.
Fast Forward Labs
Attempt to tie these three philosophies together. Technology with a focus on organisational design, great people from interesting backgrounds.
Focus on innovation opportunities through data and algorithms.
Companies have interesting data but don’t know what to do with it. Trying to figure it out without data science on staff. Outsourced R&D lab. Team is mostly reformed academics, connected to academic community, startups, help companies figure out what to do with data.
Have a process for figuring out what technologies we think are going to be impactful. Things that are potentially interesting. For each idea, look at factors: theoretical breakthrough that makes it possible to do something that wasn’t possible before. Look for things that are more possible today than a year ago. Economic constraints. E.g. cost of GPUs has gone down. Commoditized. E.g. Hadoop open source project. Was well understood as an approach for a long time. But to use it, needed 20-30 engineers to build and manage a platform. Went down to 2-3. Now, AWS Elastic MapReduce can be done at home (.2 people). That makes something a technology you can build on top of in a reasonable amount of time.
Sometimes comes from commercial enterprise (e.g. face.com bought by Facebook – became uncommoditized when sold to FB, but then was commoditized again by Lambda Labs).
Last thing is new data. Can’t do anything if the data isn’t available. Sometimes new data becomes available that you can do new stuff with. Anything using language draws from Wikipedia. Tries to make new data available by creating it. Data also has to be made useful.
E.g. natural language generation. Build a working reference prototype of everything.
Probabilistic methods for realtime streams.
Deep learning for image analysis. Hot technology at the moment, but not one everyone needs to know about. Re-interpretation of neural networks. First one was built in hardware, not software.
Also consider where an idea sits between frivolous and making money. Tries to be in the middle.
One thing they can do because they are in the middle: always careful to talk about the ethics.
Very exciting time to be in technology. You have the ability to imagine the world you want to live in, see where we are today, and push on that.