
Navigating the Machine Learning job market

Over the past couple of months, I have been trying to navigate the machine learning job market. It has been a bewildering, confusing, and yet immensely satisfying and informative time. Talking with friends in similar situations, I find a lot of common threads, and I find surprisingly little clarity online regarding this.

So I’ve decided to put together the sum total of my experiences. Your mileage may vary. Once you’re done being a fresher, your situation and what you’re looking for get a little more specific to you, so take whatever I say with a pinch of salt.

I’ve been passionate about machine learning for six years or more now. Though I didn’t realize it at that time, a lot of project choices, career choices and course choices I made were with the thought of ‘does this help me get closer to a research-oriented job that involves text mining in some form?’.  I went to grad school at a university that was very research oriented and worked on a master’s thesis on an NLP problem, as well as a ton of projects in courses. My first job after that involved NLP in the finance industry. My second job also involved text processing. The jobs I got offers from after this period also involve NLP strongly. I’ve literally never worked on anything else. So you can understand where I’m coming from.

So. Machine learning jobs. Where are they, usually?

Literally everywhere, it turns out. Every company seems to have a research division that involves something to do with data, and data mining. The nature of these positions can vary.

There are positions where you need to have some knowledge of machine learning, and it kind of informs your job, which might or might not involve having to use ML-based solutions. Usually these positions are at large companies. As an example, you might be in a team whose output is, say, an email client. There’s some ML used in some features of the product, and it is important for you to be able to grasp and work around those algorithms, or be able to analyze data, but on a day to day basis you’re working on writing code that doesn’t involve any ML.

There are other similar positions where you deal with a higher volume of data, and the team has simple solutions for getting meaning out of it. Maybe they use Vowpal Wabbit on a Hadoop cluster on occasion. Or Mahout. But they’ve got the ML bit nailed down, and most of the work is just big data kind of work. These positions are more ubiquitous. If you have some ML on your resume, as well as Hadoop or HBase, these doors open up to you. Most of the places that require this kind of skillset are mid-sized companies that are just out of the startup phase.

Then you have the Data Scientist positions. This phrase is pretty catchall, and you find a wide variety of positions if you look for this title. Often at big firms, it means that you have knowledge of statistics, and can deal with tools like R, Excel, SQL databases, and maybe Python in order to find insights that help with business decisions. The volume of data you deal with isn’t usually large.

At startups though, this title means a lot more. You are usually interviewing to be the go-to person for all the ML needs in the company. The skills required include all the ones I mentioned above, plus a thorough knowledge of tools like scikit-learn and Weka, as well as experience working on ML projects. Some big data experience is usually a plus. Often, you’re finding insights in the data and prototyping things that an engineering team will put in production. Or maybe you’re also putting them in production yourself if ML is not central to the startup’s core business.

Most people are looking for the Research Engineer job. You aren’t usually coming up with new algorithms, but you’re implementing some. On the upper end of the scale, you’re going through research papers, implementing the algorithms in them and making them work. You need a fair idea of how to put code into production, and you deviate from the research by adding layers that make your system work in a more deterministic, debuggable fashion. An example would be several jobs at LinkedIn, where a lot of the features on the site need collaborative filtering or classification. Increasingly, these jobs work on large data, but often that is not the case, and people manage fine using parallel processing instead of graph databases and MapReduce.

In a mature team, this position might not require you to use your ML skills on a day to day basis. In a new team, this position would need you to work on end to end systems that happen to use ML that you will be implementing.

In larger firms, you probably just need to have worked on ML in grad school and in your past jobs. The kind of data you’ve worked on doesn’t matter. In startups though, they start looking for more specific skills. They’d want someone who’s specifically worked on topic modelling, say, or machine translation. The complexity of their system doesn’t usually call for a PhD. They would grab an off-the-shelf solution if they could. But ideally they want someone who understands these things to own the component, manage it completely, and hit the ground running, which is why they look for someone who’s worked on the same or similar things previously.

Which brings me to another point. Not all ML jobs are interviewed for in the same way.

Several large as well as mid-sized tech firms hire you for the company, not for a specific team or role. Usually, the recruiter finds you based on buzzwords in your resume, and sets up interviews with you. The folks interviewing you probably work in teams that have nothing to do with your skills. It is possible you go through interviews not answering even one ML question. Later when you get hired, they try to match you to a team, and they try to take into account your ML background to place you in a relevant team. If you’re interviewing for a specific kind of job, this makes it harder as you don’t know until you’re done with the whole process about what kind of work you’ll be doing.

Like I said before, at startups probably, you’ll know exactly what kinds of problems you’ll be working on. But more often, you’re hired into a group of sister teams. They all require similar skills. Maybe they work on different components of the same product, all of which use ML in different ways. So you have a fair idea of what you’ll be working on, but not necessarily a clear picture. You might end up working at the heart of the ML algorithm, or maybe you’re preprocessing text. The interviews will go over your ML background and previous projects as well as ML-related problem-solving.

Then there’s the Applied Researcher role. You usually need a demonstrated capability of working on reasonably complex ML problems. You are occasionally putting things in production and need good coding skills. Often, you’re prototyping things after researching different approaches. When you do put things in production, they are usually tools used by other teams that rely on ML in their solutions. Language is no bar, but usually there’s an agreed-upon suite of tools and languages that the team uses.

The Researcher role usually requires a PhD. Your team is probably the idea factory of the company, or that particular line of business of that company. Intellectual property generation is part of the job. I’m not highly insightful about this line of work, because I haven’t known very many people opting for these positions, and it feels increasingly like PhDs take up the Applied Researcher/Research Engineer role in a team, and do the prototyping and analyses while others help with that as well as put these prototypes into production.

There’s a lot of overlap in all these different types of positions I’ve mentioned, and it isn’t a watertight classification. It’s a rough guide to the different kinds of positions there are.

So where do you find these jobs?

LinkedIn is a great resource. You can use ‘machine learning’, ‘data mining’, ‘image processing’ or ‘data science’ or ‘text mining’ or ‘natural language processing’ as search keywords. I’ve also found Twitter to be a great place to search for jobs using these same keywords.

There are tons of job boards that also let you search using these keywords. Apart from them, I find a lot of ML-specific job fora. KDnuggets Jobs, NLPPeople and LinguistList are browsable job boards. There are also mailing lists like ML-News and SIG-IRList. I’ve also found /r/MachineLearning on Reddit to be a good resource on occasion for jobs.

Now that you’ve found a position and sent them off your resume and they got back to you, what do you expect in the interview? Wait for my next post to find out!

Recommender Systems Wiki

Use and contribute and link to:

Now there should be one like it for ML methods in NLP, and my life will be great.

How to read a research paper

If I’m completely in the groove, with a firm topic in mind, I find it relatively easy to read papers. However, when I’m attempting to get started on something, or am reading a paper which, say, I have to summarize for a course, I lose my footing. I procrastinate, and I become reluctant to start.

I decided I wanted out of this shite, and hence googled for ‘How To Read A Paper’. I found this paper by someone from the University of Waterloo, and I suspect this will help out greatly.

Let me summarize it for you.

Essentially, given a research paper, you go over it in three passes.

First Pass (5-10 minutes):

  • Read the Title, Abstract and Introduction.
  • Read the section/subsection headings and ignore all else.
  • Read the conclusions
  • Glance over the references and tick off those you’ve already read.
  • By the end of this pass, you should be able to answer 5 C’s about the paper:
    • Category
    • Context (What papers are related? What bases are used to analyze the problem?)
    • Correctness (Are the assumptions valid?)
    • Contributions of the paper
    • Clarity (Is the paper well-written?)

Second Pass (1 hour):

  • Read the paper more carefully, while ignoring details like proofs
  • Jot down points, make comments in the margins
  • Look carefully at all figures, especially graphs
  • Mark unread references for further reading (for background information).
  • Summarize main themes of the paper to someone else.
  • You mightn’t understand the paper completely. Jot down the points you don’t understand, and why.
  • Now, either
    • Decide not to read the paper
    • Return later to the paper after reading background material
    • Or persevere on to the third pass

Third Pass (4-5 hours):

  • You need to virtually reimplement the paper: recreate the work and its reasoning yourself.
  • Compare your recreation with the original
  • Think of how you would present the ideas, and compare with how the ideas are presented.
  • Here, you also jot down your ideas for future work
  • Reconstruct the entire structure of the paper from memory.
  • Now you should be able to identify the strong and weak points of the paper,  the implicit assumptions and the issues there might be with experimental or analytical techniques, as well as missing citational information.

That’s all.

Additionally, I think as a form of accountability (which I so need at the moment), I will blog every single paper I read, in accordance with the above structure.
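For what it’s worth, here is a rough sketch of how such per-paper notes could be kept so that each post follows the three passes above. This is only an illustration: the PaperNotes class, its field names and the layout of the generated post are all made up for this example and don’t come from the paper.

```python
from dataclasses import dataclass, field

# The five C's that the first pass should answer.
FIVE_CS = ("category", "context", "correctness", "contributions", "clarity")


@dataclass
class PaperNotes:
    """Hypothetical container for notes taken over the three passes."""
    title: str
    five_cs: dict = field(default_factory=dict)         # first pass
    open_questions: list = field(default_factory=list)  # second pass
    strengths: list = field(default_factory=list)       # third pass
    weaknesses: list = field(default_factory=list)      # third pass
    future_work: list = field(default_factory=list)     # third pass

    def missing_cs(self):
        """Return the C's that have not been answered yet."""
        return [c for c in FIVE_CS if not self.five_cs.get(c)]

    def to_post(self):
        """Render the notes as plain text, one section per pass."""
        lines = [self.title, "", "First pass:"]
        for c in FIVE_CS:
            answer = self.five_cs.get(c) or "TODO"
            lines.append(f"  {c.capitalize()}: {answer}")
        lines += ["", "Second pass (points I don't understand):"]
        lines += [f"  - {q}" for q in self.open_questions]
        lines += ["", "Third pass:"]
        lines += [f"  + {s}" for s in self.strengths]
        lines += [f"  - {w}" for w in self.weaknesses]
        lines += [f"  * future work: {f}" for f in self.future_work]
        return "\n".join(lines)
```

You would fill in five_cs after the first pass, append to open_questions during the second, and only bother with the remaining lists if you persevere to the third. missing_cs() is a quick check of whether the first pass is really done.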
