What a meme! “Data scientist is the sexiest job of the 21st century”, the Harvard Business Review decided in 2012. Since then, the hype about data science, analytics and big data has gone into full swing. I have spoken to a number of these elusive data scientist folks about their daily work. What I hear repeatedly is this: “80% of it is dull data cleaning, organizing and linking.” Not my kind of sexy. But I know another job that is closely related and a bit more colorful: it is the second sexiest job of the 21st century…
Pure data science can become a drag
Today, a data scientist is generally asked to be hyper-numeric, a god of statistics, fluent in programming and highly inclined towards mathematical modelling. A bit of machine-learning zest wouldn’t hurt. If you are such a person and you enjoy data mining, cleaning, rearranging, linking and querying, great.
However, a recent Fortune article argues that the next generation of software (taught by machine learning) will eat your daily grind for breakfast. Any dull, repetitive number-crunching task will be done by computers. They love statistics, can already program and are getting better at mathematical modelling every day. They can even provide conclusions! Artificial intelligence will sift through your data, provide a great analysis and explain the findings. The demand for data scientists might fall rapidly once the technology is in place.
So it might be a good idea to think ahead. Current data science jobs do not (yet) ask for three different skills: consulting, direct model building and simulation. Incidentally, these are exactly the areas where machine learning will not be able to replace human work.
Where data science can reveal hidden secrets from your data, there is always the other realm of the not-so-hidden: go out and talk to the people who work with the system in question. Also, sit down and try to understand the system from basic analysis. Every consultant will confirm this: talking to the shop-floor technician is worth a million data entries.
It sure is great to build models from your data if you lack the “obvious” knowledge about your system. However, most systems can be understood at least partially by building a simulation model. Even if details are opaque, just the act of building a model reveals many hidden gems, more so if done together with the shop-floor technician.
Building models of reality directly can be very useful
In order to provide good simulation models that complement good consulting, you should have a good understanding of simulation modelling techniques such as agent-based modelling, discrete-event processes or system dynamics. Ideally, you do not even think about techniques anymore but simply model the system in question, blending paradigms seamlessly. You focus on the problem, not the technique.
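To make "discrete-event" a little more tangible, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: a hypothetical bottle-filling station that handles one bottle at a time, bottles arriving at a fixed interval, and the names `simulate_filling_line`, `arrival_gap` and `fill_time`. A real project would more likely use a dedicated simulation tool, but the core idea is the same: keep a time-ordered list of events and jump from one event to the next.

```python
import heapq
from collections import deque


def simulate_filling_line(n_bottles, arrival_gap, fill_time):
    """Return the time at which each bottle leaves a single filling station.

    Illustrative sketch only: one station, fixed arrival interval,
    fixed filling time, first-come-first-served queue.
    """
    # Event queue of (time, sequence, kind, bottle). heapq always pops the
    # earliest event; the unique sequence number breaks ties in time.
    events = [(i * arrival_gap, i, "arrive", i) for i in range(n_bottles)]
    heapq.heapify(events)
    seq = n_bottles

    waiting = deque()  # bottles queued in front of the station
    busy = False       # is the station currently filling a bottle?
    done = {}          # bottle -> departure time

    while events:
        now, _, kind, bottle = heapq.heappop(events)
        if kind == "arrive":
            waiting.append(bottle)
        else:  # "depart": the station has finished this bottle
            done[bottle] = now
            busy = False
        # Whenever the station is free and a bottle is waiting, start filling
        # and schedule the matching departure event.
        if not busy and waiting:
            busy = True
            heapq.heappush(events, (now + fill_time, seq, "depart", waiting.popleft()))
            seq += 1

    return [done[i] for i in range(n_bottles)]


if __name__ == "__main__":
    # Bottles arrive faster than the station can fill them, so a queue builds up.
    print(simulate_filling_line(n_bottles=3, arrival_gap=1, fill_time=2))
```

Even a toy like this invites the questions you would put to the shop-floor technician: how long does filling really take, and what happens when the queue grows?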
So you might say “gosh, we should ask even more from future data scientists”? Not really. I am arguing that we might ask for the wrong skills. Skills that will be replaced by computers soon. Things that are hyped as cool but are 80% boring (except, maybe, if you work for Google). We should make data scientists learn skills that are here to stay beyond the machine-learning revolution.
When we ask for the right skills, we end up with a simulation modeller. Like data scientists, we simulation modellers try to make sense of data, are hyper-numeric and apply statistics all the time. The difference is that we try to understand systems more directly, by consulting with the system and its people. I realize that this is not possible for all systems. Still, we should use data science and simulation modelling in tandem, and avoid the hammer-and-nail problem that is becoming pervasive in data science.
Let me tell you about the second sexiest job of the 21st century: as a simulation modeller, my work is truly colorful. I build virtual worlds that behave like their real-world equivalents: aircraft move from A to B, legislation impacts water usage, or bottles of Coke move across a factory floor. Our work is a bit like a kid’s model railway: first we build it and then we get to play with it. We see our models grow and become “alive”. And then we use them for analysis, interpretation and real-world action. To me, this sounds very sexy indeed. Only topped, maybe, by being a commercial airline pilot (surely the sexiest job ever…).