AI: Career advice
AI: Interview preparation
AI: Career FAQs
Q&A on Quora
The field of AI has changed dramatically over the last decade. Consequently, the role of a data scientist has also transformed and evolved into multiple specialized roles like data engineer, machine learning engineer, research scientist, applied scientist, AI product manager, and so on. I believe that we are still in the early days of AI, and it is as good a time as ever to break into data science.
Data science is also becoming more engineering-focused as companies recognize that business value cannot be realized until a robust infrastructure is in place to deploy, monitor, and maintain data science models in production. As a result, data science offers an opportunity for software engineers to transition laterally and work more closely with data and models in addition to code.
Additionally, data science has matured as a field with the advent of several tools and products that make the entire data science life cycle more efficient, transparent, and reproducible. The organizational time, effort, and resources needed to take machine learning models from conceptualization to production are shrinking, enabling data scientists to drive more significant business impact.
Another trend is the focus on deriving business value from massive amounts of unstructured business data like images, text, audio, and video apart from structured, tabular data. For such applications, deep learning models are particularly relevant. We are currently witnessing a tremendous amount of innovation and advances in this area, with groundbreaking models like BERT, GPT-3, DALL-E 2, Imagen, and Whisper, to name a few.
We will see a more significant business impact from innovative AI R&D as startups and large companies leverage these technologies to build new products and services. It is, therefore, even more exciting to be at the forefront of AI innovation and build a long-term career in data science and AI.
If you have a quantitative background in computer science, engineering, physics, finance, or a related discipline, you already have the core technical skill set to transition and excel in data science.
Candidates from a non-technical domain, on the other hand, have the advantage of domain knowledge. Doing well in data science requires a deep understanding of both the data (and the business domain) as well as the scientific aspects of analyzing data. I have seen and coached several candidates from non-traditional backgrounds in transitioning to data science and becoming successful practitioners and experts in the field.
My general advice to candidates interested in data science is to realize that they might already have several skills relevant to the data science industry. You only need to bridge the gap in the skills you lack or are less confident in to land jobs at top tech companies and startups.
Interviews for data scientist, machine learning engineer, and AI-focused roles comprise several rounds during a typical on-site interview. These interviews assess candidates' technical skills (coding, statistics, machine learning, systems design), product skills (product metrics, product sense, business case), and leadership and behavioral skills.
In a typical hour-long interview, candidates may get anywhere from 5 to 15 minutes to ask the interviewers questions. However, most candidates do not prepare or think about questions to ask in advance. This is a big missed opportunity to learn more about the role, team, org, company, tech stack, culture, leadership values, etc. directly from current employees and future teammates.
In the context of data science, candidates ought to ask pertinent questions that may shed more light on the day-to-day work, projects, teams, and culture in the org. With greater interviewing and real-world data science experience, candidates will be better able to decipher the answers to such questions and read between the lines to make a more informed decision about whether to join the company.
With everything else being more or less equal among the different job offers one may have, the quality of the hiring manager and team, the organizational culture, and the learning and career growth prospects become decisive factors.
Following is a sample list of 20 questions to consider asking the hiring team, in no particular order:
Mathematics is an integral component of data science. Building a strong foundation in topics like probability, statistics, linear algebra, differential calculus, and optimisation will stand you in good stead in your data science career.
Having said that, it is important to note that the required level of understanding of the mathematical underpinnings of machine learning methods varies depending on the type of data science role, company, and the business domain.
For example, if you are a product data scientist at a big e-commerce company, you may not need to dive deep into the underlying math to excel at your job. On the other hand, if you are a research/applied scientist in an R&D division or a financial trading firm, you do need strong fundamentals in mathematics to better understand existing algorithms and develop novel techniques.
Following is a list of recommended resources to get you started in your journey towards learning the mathematics of machine learning:
Machine learning engineering is a relatively new role in the data science job family. The ML engineer role is in high demand as organisations have realised that they cannot achieve their business goals without first deploying machine learning models to production.
In large organisations with big machine learning teams, machine learning engineer is a specialised role that is distinct from the role of a data engineer or a data scientist. In a previous article, I have compared the roles of a data scientist vs. machine learning engineer in detail.
As machine learning becomes an increasingly engineering-focused discipline, there is massive demand for professionals who combine strong software engineering skills with an understanding of the entire data science lifecycle, from raw data through to model production, maintenance, and monitoring. Machine learning engineers typically focus more on building the pipelines and infrastructure that keep MLOps workflows running smoothly and reliably.
I have shared a curated list of top 10 resources to help you become a machine learning engineer. Candidates can dive deeper into these courses, books, papers, and blogs to prepare for machine learning engineer job interviews.
Data science is now considered an integral function for modern companies. Data science provides companies a massive competitive edge in terms of:
Data-driven companies are able to achieve their goals faster and realise at least 20% more earnings. Statistics like this provide a significant impetus for businesses to invest in building and hiring data science teams that act as the catalyst for building a data culture across the entire organization.
Cracking data science interviews is getting tougher and tougher. Depending on the type of company, be it a startup or a big tech company, and the level that you are targeting, you can expect to have 3–6+ interviews in all. The core data science interview rounds focus on statistics, programming (Python), machine learning, product or business sense, and behavioral or leadership skills. Additionally, interviewers in each round also assess your problem-solving, thinking, and communication skills.
Data structures and algorithms (DSA) are not as relevant for data science interviews unless you are applying for the role of data engineer or machine learning engineer. These roles involve more software engineering than data science and therefore require a strong understanding of DSA.
Data science interviews lack structure and vary a lot, and therefore it helps to learn from experienced mentors who have worked at the kind of companies that you are targeting.
Conducting innovative research in AI is not straightforward. Researchers should focus on a problem area that they are deeply passionate about, e.g., NLP, multi-modal AI, computer vision, speech, synthetic data, or graph-based models. For research, the problem could be in the theoretical realm. However, if the problem area is grounded in the real world, then practitioners can actually test their algorithms on real-world data and learn from the feedback.
The most important skill for doing novel research is to think deeply about a particular problem and apply the scientific method systematically. This involves coming up with relevant hypotheses and conducting several experiments using the right datasets, algorithms, and models to test the validity of the hypotheses. An empirical, data-driven strategy coupled with creative ideas usually leads to novel research output.
To come up with innovative ideas, you need to know the existing literature and which ideas have or have not worked for a particular problem. Sometimes, it is sufficient to adapt existing ideas to your particular use cases. Knowing which ideas generalize and are practically feasible for solving a business problem is a rare skill that distinguishes the best applied researchers from the rest.
To become an expert in any discipline, it is important to build a solid knowledge base, which can take a significant amount of time. If you build this foundation earlier than others, then you can advance on the journey faster and develop better first-principles thinking and intuition for a variety of machine learning problems. This is particularly true in the case of AI, which requires strong fundamentals in diverse topics including statistics, mathematics, programming, data analysis, presentation, and communication.
However, regardless of how early or late you start your career in data science, the key is to keep practicing and honing your skills, given that the field of data science is going to continue evolving rapidly. I have worked with both bachelor's students and senior IT professionals in their 30s and 40s who are equally motivated to launch their careers in data science.
Given the lack of formal degree education in data science, every data scientist I know is self-taught to an extent. With so many open-source resources, courses, code repositories, and datasets available online, any ambitious and motivated person can become a good data scientist. However, to truly become a versatile data scientist, one needs to complement this self-study with training from experienced industry mentors and develop a deep understanding of business domains like e-commerce, healthcare, and fintech, and of how data science works in practice in industry.
Copyright © 2022, Sundeep Teki
All rights reserved. No part of these articles may be reproduced, distributed, or transmitted in any form or by any means, including electronic or mechanical methods, without the prior written permission of the author.
This is a personal blog. Any views or opinions represented in this blog are personal and belong solely to the blog owner and do not represent those of people, institutions or organizations that the owner may or may not be associated with in professional or personal capacity, unless explicitly stated.