The past five years have given me a tremendous opportunity to see, firsthand, the data of over eighty VC-backed tech companies. That is close to 100 teams and 300 individuals. Naturally, I’ve gotten to see a lot of data - very detailed information on every transaction, activity, click, and interaction. You might now expect me to urge everyone to collect as much information as possible in order to make data-driven decisions. Instead, I think all of us in the data profession should be honest about the pitfalls of always championing a data-centric approach. Here is why:
Let’s dive right into it…
If you are a non-native English speaker like me, you remember the days when you had to use an online dictionary. You would look up one word, then another, until you had translated the whole sentence. The same was true of translating emails and web pages from other languages into English.
Then, in 2006, Google launched its Translate service. Suddenly, it was possible to translate whole sentences and even pages. The result was not perfect, but it made online dictionary services obsolete almost overnight. A few dictionary services remained popular for complex languages (Mandarin Chinese, Russian, Hebrew…), but even those are quickly losing relevance.
Twelve years later, there is another revolution brewing in the language space: Interpretation of Emotions. And it is super important! Consider just how interconnected different cultures are today in the digital world. Yes, you can translate email messages word for word, or phrase for phrase, but can you really understand the real opinions and undertones of the people behind them? How do you read the emotions of someone located far away? How do you tell what someone is feeling if her written or spoken English is not perfect?
When I talk to friends of my father who have read Robert Pirsig’s Zen and the Art of Motorcycle Maintenance, they all say it had a profound effect on them back in the 1970s. Looking around, sometimes it seems the early tech of the 20th century was built primarily by people influenced by Pirsig’s writing. But it has been more than 40 years since the book was published—where are we now?
I recently watched a YouTube video of a technical manager from Criteo, Justin Coffey, demoing the in-house business intelligence tool his team had built. It’s a great presentation. Justin actually lists the reasons why his team decided to build its own tool; all are good reasons, except that none of them answers why they would reinvent something when a better tool already exists (see footnotes). Since I work across vendors and indirectly benefit from the complexity of new data tools (open source included), I also frequently hear reasonable arguments against buying a new tool.
Recently I left the best job I have had thus far, at Looker, to found a consulting company. Naturally, there were a lot of factors at play, but Looker’s gravitational pull on me was due to the technology itself. Looker is one of those technologies that everyone wants to build on top of because: