FYI: 5 things academics need to know when they become data scientists. | by Dave Dale | Towards Data Science


5 things academics need to know when they become data scientists.

The academia to industry transition can be hard. Managers and ex-academics need to recognise the differences between the two worlds.

There are hard lessons coming if this is your idea of a working environment.

This one is personal. A younger version of me, in one of his first jobs out of academia, got fired. Yes, even as an ex-postdoctoral researcher at Oxford University, and college lecturer in Maths and Statistics, I was sent on my way!

Did they not know who I was?

It turns out they knew exactly who I was. I was a guy who’d been working for a year on a huge Markov chain Monte Carlo analysis that had yielded precisely no value. And the fact that it was using the latest convergence diagnostics and the most up-to-date methods for dimension reduction meant a sweet fat nothing [1]. Because they also knew who they were: a small start-up company that couldn’t carry unproductive people.

But #iamverysmart? Apparently not.

Of course the blow to my young man’s ego wasn’t pleasant, but, ho hum, it turned out to be a good thing in the end. We learn more from our mistakes than our successes, right? It’s why all experienced data scientists know so much. And now that I’m in management roles, I keep seeing other people straight out of academia make the same mistakes that I did. So here are 5 things you should know if you’re coming out of academia and entering data science.

1. No-one (and I mean no-one!) cares about how smart you are.

I get it. To have got on a PhD programme you must have been one of those kids who spent their whole lives excelling academically. You’ll have been in top sets at school, gone to a great university where you’ll have done well. By the time you hit your late 20s, pretty much your whole life will have been a series of people telling you that you’ll do well because you’re smart. So you’ve kept trying to demonstrate that you’re smart, so more people tell you you’re smart and so the cycle has continued.

Until now.

Look, the whole school to academia system is truly messed up, but that’s a post for another day. Here you just need to know that no-one cares how smart you are; they just care how useful you are.

Got that? Useful NOT smart!

If you truly grok that you can probably stop reading right now, because all the other things follow naturally. However if you’re the sort of person that thinks you grok that, then you should probably keep reading. You sound a bit smart to me …

2. Being useful is different to writing a paper.

Now you might say to yourself, "Right, got it, let’s be useful. OK so we use this technique in the company to do that thing, and I read this paper on this slightly different technique. So if I use this different technique on that company thing, then I’m being useful. Because it’s a company thing." And then you go tell your manager that you’re going to do this and they look at you like they want to throttle you. And that makes you feel bad.

This way of behaving seems familiar because it’s what academics are trained to do. Taking a technique from one field and applying it to your own is an easy way of getting a publication. Frankly, in academia, it doesn’t even have to work any better than the existing techniques. It just has to be novel.

Your manager couldn’t care less about novel. You’ve just told them that you’re going to spend an unknown period of time experimenting with an unproven technique that may or may not improve on a solution that is working satisfactorily already. And you’re wondering why they’re not enthusiastic?

There’s a variation on this, whereby the data scientist hunts around the company trying to find a problem that is just about easy enough to solve, but only with a pleasant amount of deep thinking. In some cases the data scientist might end up doing something marginally OK, but rarely will it be that useful. This is because the motivation for doing the thing isn’t really to be useful at all. It’s the deep thinking that’s the true motivator. Again, this is the sort of behaviour that gets rewarded in academia. There, showing that you’re smart by solving difficult problems is the whole point. Your manager doesn’t care.

3. Working out how to be useful is part of the job.

OK so you’ve broken those academic habits, but you find yourself adrift. If these people don’t want you to behave as you’ve been trained, then what on earth do they want?

Well firstly, relax. Part of your paycheck is for figuring out what’s useful and what isn’t. The difficulty is that no two environments are the same, so whatever my advice, you’re going to have to figure some stuff out for yourself. However, here are three archetypes for you to think about.

Firstly there’s the chaotic product environment. In this environment you’ll have a lot of product people running around saying "we want to do this thing asap, but we don’t know if it’s possible". Now this might sound like a difficult place to be, but as far as doing something useful goes, it’s actually the easiest! Just find a product person who seems sane, buddy up with them, and work on what they want. They should be experts in their product, have an understanding of what the potential market for it is, and so really know what will be useful and what won’t. In other words you’ve just outsourced the "working out how to be useful" part of your job to someone else. All you have to do now is build that useful thing. And you’re great at that, right?

Secondly, there’s the "improve the number" type environment. You often find these in companies with well-established data science teams. In this case you or your team will own a service that takes in data, does some maths and outputs something, often a prediction. There will be a number measuring the quality of that something and you have to improve that number.

Now it sounds like "working out how to be useful" has been done! Improve the number, stupid! Unfortunately, it’s not so simple. You have to figure out the thing that improves the number the most in as short a space of time as possible. The temptation to an ex-academic is to drop back to the old way of behaving and apply some novel technique you read about. Then after months of effort you realise the new method has hardly shifted the number at all. The right thing to do here is to analyse actual failure cases and have a hard and honest think about what will fix the largest number of them. (My prediction: improving the data will by and large trump a more complex method.) The point is, you have to have a real think about what will actually improve the number, and that might involve work that’s stupid and boring. Don’t just reach for a smart new shiny thing and hope for the best.
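To make "analyse actual failure cases" a bit more concrete, here is a minimal sketch in Python. It assumes you have already gone through a sample of predictions by hand and tagged each wrong one with a suspected cause. The field names (`predicted`, `actual`, `cause`), the cause labels and the helper `rank_failure_causes` are all made up for illustration; your own service will look different.

```python
from collections import Counter


def rank_failure_causes(cases):
    """Return (cause, count, share_of_failures) tuples, most common cause first.

    `cases` is a list of dicts with hypothetical keys:
    'predicted', 'actual' and a hand-assigned 'cause' for wrong predictions.
    """
    # A failure is any case where the prediction disagrees with the truth.
    failures = [c for c in cases if c["predicted"] != c["actual"]]
    counts = Counter(c["cause"] for c in failures)
    total = len(failures)
    # If there are no failures, most_common() is empty and nothing is divided.
    return [(cause, n, n / total) for cause, n in counts.most_common()]


# Illustrative, hand-labelled sample (invented data, not real results).
cases = [
    {"predicted": "cat", "actual": "dog", "cause": "mislabelled training data"},
    {"predicted": "cat", "actual": "cat", "cause": None},
    {"predicted": "dog", "actual": "cat", "cause": "mislabelled training data"},
    {"predicted": "dog", "actual": "cat", "cause": "missing feature"},
    {"predicted": "dog", "actual": "dog", "cause": None},
    {"predicted": "cat", "actual": "dog", "cause": "mislabelled training data"},
]

ranking = rank_failure_causes(cases)
for cause, n, share in ranking:
    print(f"{cause}: {n} failures ({share:.0%} of all failures)")
```

The tally itself is trivial. The stupid and boring part is the hand-labelling that feeds it, and that is usually where you find out which fix would clear the most failures.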

Finally, there’s the dysfunctional environment. You’ll recognise these because no-one really knows why they are there or what they are supposed to be doing. There are many reasons for this. Perhaps someone at the company knew they needed data science but didn’t have a clue why. Perhaps the person who had a clue left, and there’s now a dangling team with no home. Occasionally it’s a research team kept around long past the time when their research had any relevance to the company mission. If you’re in this situation you’re in trouble. The best you can do to make it work is to try and turn it into the first kind of environment by reaching out to product people outside your immediate team. However, in some cases, you might want to dust off that LinkedIn profile.

4. Lean on other people.

You shouldn’t try to be useful alone. In academia, especially as a postdoc, you are a one-man band, expected to be mathematician, writer, graphic designer, programmer, public speaker … the list goes on. Additionally, you are in fierce competition with the people around you for a very small number of jobs in a very rigid hierarchy that might have made sense to a late-19th-century Prussian. This all makes for a strange and isolating working environment, in which people develop overweening egos to survive. Making it worse is a cultural assumption within academia that smart people stay on to do A-levels at school, the smartest of those go to undergrad, the smartest of those do postgrad, the smartest of those stay on for a postdoc and so on and so on.

You need to drop all that. Firstly, if I think back to my time on the undergraduate / postgraduate transition, it definitely isn’t true that the smartest people stayed on to do PhDs. That means that they’re out there in the workplace and don’t need any #iamverysmart nonsense from you. More positively, it also means that they can help you with all that stuff which, if you’re honest, you know you’re not very good at. In fact, one of the great pleasures of working life is being in a cohesive, well-functioning team where each person takes on the roles that they enjoy.

So don’t be that guy tediously lecturing his colleagues on their jobs. Instead take it as an opportunity to work with, and learn from, other great people who think a little bit differently from you.

5. A small amount of paranoia is healthy in the workplace.

It’s 2020, and though capitalism has built many wonders, it is a ruthless beast. This "being useful" stuff rests on an unemotional calculus of known present costs vs estimated future benefits. And you are a known present cost. As soon as your costs outweigh your estimated future benefits you are in trouble.

So, unfortunately, you need to worry a bit about what expected future benefit the company gets from you. Not so much that you’re sleepless and paralysed by fear, of course. But just enough to keep you focused. You’re not in academia, where people either have tenure and rarely get fired, or are on fixed-term contracts and never get fired.

Now this is a description of how things are, not a judgement call on how they ought to be. Suffice it to say, if you think (or, worse, your manager is telling you) that everything’s fine and everyone can pootle along, clocking off at 5.00pm, then your team is in trouble. The kind of trouble that involves M.B.A.s with great skin, suspiciously neat haircuts, and spreadsheets that lay out the case for major budget cuts.

So that’s it. 5 things I wish I’d known as someone leaving academia and entering the workplace. If you can think of any more, or disagree wildly, please chime in down in the comments!

Footnote

[1] Oh yes, kids of today, it turns out that deep neural nets are not the first technique to come along, generate a lot of excitement by promising to approximate any function in the world, but then prove not to be a universal panacea. Of course I realise I’m giving away my age here.