“Data is the new oil” – Clive Humby. If you have found your way to this page, I am confident you are not alien to this quote.
Today, even the smaller players in the industry who were once oblivious to data are becoming increasingly reliant on data-based decision-making. Plus, the constant rhetoric put forward by reputed media companies and research houses has left little doubt about today’s being the age of data.
Given that, getting onboard this data wagon as a data scientist or analyst, and contributing to this data transformation is a dream for many. To help you fulfill this dream, I did some pretty heavy research!
I interviewed 100+ data science professionals (data scientists, hiring managers, recruiters – all included) over the last 6 months. Based on the feedback I received from them, I came up with 6 proven steps to becoming a data scientist.
If you read through these six steps and follow my recommendations, I am extremely confident that your journey to becoming a data scientist will become much clearer and smoother.
So, without further ado, let’s get started!
6 Proven Steps To Becoming a Data Scientist
Step-1: Understanding the Data Scientist role
If you have been searching the internet to find one consistent answer to ‘What do data scientists do?’, I am sure you have not had much luck. Data Science is still an evolving career track and could mean different things to different companies (and even to different teams within the same company).
However, irrespective of the company you pick and the industry you choose, there is one definition that successfully captures the essence of the data scientist role. It is as follows:
Data Scientists are problem solvers who help organizations make better decisions with the use of data. They leverage their skills not just to answer management’s questions, but also to help inform the questions that the management should be asking.
In short, data scientists do a whole lot more than it may appear on the surface. Accordingly, the level of skill required to be successful in this role can vary greatly.
Key Takeaway: Some data scientists need extremely pronounced technical skills to effectively do their jobs. However, if you choose to, there are plenty of data science positions needing a relatively moderate skillset that you can target as a beginner.
Next, we will look at the key skills that you absolutely need for targeting a data science position.
But, if you are still curious to learn more about the role of data scientists, I recommend reading Data Science For Business by Foster Provost and Tom Fawcett.
In fact, by signing up for Free Audible Premium Plus Membership Trial, you can listen to this amazing book ABSOLUTELY FREE!
Step-2: Learning Core Skills That Data Scientists Need
Now that we have established what a data scientist does on a day-to-day basis, let us take a deeper look at the skills that one absolutely needs to become a Data Scientist.
As mentioned before, this list was created after rigorously absorbing the feedback from 100+ data science professionals at top-tier companies (including Meta, Google, Microsoft, Amazon, Walmart, et cetera).
While there is no limit to how advanced and complicated data science can get, the skills highlighted in this section are enough to get you started as a data scientist.
This may come as a surprise to you, but if data science is your chosen career track, Advanced Excel is the first technical skill that you should aim to master.
Most aspiring data scientists just can’t accept Excel as an important skill for their chosen career. Intuitively, I can see why one would think that way. After all, as a data scientist, you are expected to do more complex number-crunching than what Microsoft Excel can support.
So, what makes Excel/Spreadsheet an important skill for data scientists?
There are three reasons for which a data scientist must have extremely good Excel skills:
- Analysis Review with Stakeholders: As a data scientist, you often present the findings of your analysis to business stakeholders. At most companies (especially the big tech ones), these stakeholders are very data-savvy and would often review your analysis in great detail. For such reviews, Microsoft Excel is a much better choice than using PPTs and dashboards.
- Sharing Analysis with Stakeholders: As a data scientist, you are often required to share your findings with different stakeholders over email or the cloud. When it comes to sharing analysis and findings that include data, there is no better tool in existence than Excel/Spreadsheet.
- Importing Data From Multiple Data Sources: As a data scientist, you would often work with multiple data sources. Some of your data may be sitting on a Teradata server, some in Hive, and the rest in MySQL. In situations such as these, by making import connections, you can easily get all the data that you need in one place for analysis.
On top of all that, there are companies (for example – Walmart) that, as part of their data science hiring process, specifically test you on your Excel skills.
Therefore, start your preparation for becoming a data scientist by solidifying your skills in Microsoft Excel.
Remember: we are talking about Advanced Excel skills here and not just a general level of comfort with the tool. So, you may need to put in some days’ worth of effort here.
To provide you with the best recommendation, I reviewed over 10 Excel courses online, and here is the only one that made the cut.
Excel Skills for Data Analytics and Visualization Specialization offered by Macquarie University on Coursera
Of the 10 Excel courses I personally took and reviewed, this is the only course that covers everything you would need as a data scientist.
To me this Coursera course offered by Macquarie University stood out on all fronts:
- The Topic Coverage: It covers every advanced Excel topic that a data science professional could possibly need. And, I got feedback on what is needed for data science positions at top companies by interacting with over 100 professionals. 🙂
- The Instruction Quality: The instructors have done a great job of making this course engaging. I promise, if you do decide to take this course, you will find all its lectures very interesting.
- Practice Problems: This is an area that many other advanced Excel courses struggled with. But, the practice problems included in this course are just amazing. Hence, just for its practice problems – this course is totally worth it.
- Schedule Flexibility: You can work through this course on your own schedule. Also, if you can finish this course within 7 Days – it is absolutely free! But, even beyond that – for the cost of a family dinner, you get really high value from this course.
With that, I would reiterate that you must not underestimate how critical Advanced Excel is. Make sure to give Advanced Excel the attention it deserves when preparing to be a data scientist.
It’s FREE if you can manage to complete this course within 7 days.
Plus, you can cancel your subscription anytime within the first 7 days.
To most aspiring data scientists, there is little doubt about SQL being a critical skill. However, many underestimate just how much one relies on Advanced SQL after getting into this profession.
You definitely need SQL to clean, structure, and query data. Plus, running relatively low complexity analysis in SQL is usually a breeze.
But, you may be surprised to learn how much SQL code gets inserted within a Python code that is written to analyze a complex data problem.
HINT: A LOT!
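To make that concrete, here is a minimal sketch of what SQL embedded in Python can look like. It is only an illustration, not how any particular company does it: the table, the column names, and the sample rows are all made up, and an in-memory SQLite database stands in for a real data warehouse.

```python
import sqlite3

# Hypothetical raw data: the table, columns, and values are illustrative only.
RAW_ORDERS = [
    (1, " Alice ", "2023-01-05", 120.0),
    (2, "bob", "2023-01-06", None),    # missing amount
    (3, "Alice", "2023-01-07", 80.0),
    (3, "Alice", "2023-01-07", 80.0),  # duplicate row
]


def clean_and_summarize(rows):
    """Load raw rows into an in-memory SQLite database, clean them with SQL,
    and return the total spend per customer."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE raw_orders "
            "(order_id INTEGER, customer TEXT, order_date TEXT, amount REAL)"
        )
        conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?, ?)", rows)
        # The SQL handles the cleaning: trim and lowercase names,
        # drop rows with a missing amount, and remove duplicates.
        query = """
            SELECT customer, SUM(amount) AS total_spend
            FROM (
                SELECT DISTINCT
                    order_id,
                    LOWER(TRIM(customer)) AS customer,
                    amount
                FROM raw_orders
                WHERE amount IS NOT NULL
            )
            GROUP BY customer
            ORDER BY customer
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()


summary = clean_and_summarize(RAW_ORDERS)
print(summary)
```

Notice that Python is only the wrapper here. The actual cleaning logic lives inside the SQL string, which is why strong SQL skills keep paying off even when most of your work happens in Python.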
In my conversations with over 100 data professionals, I referenced a study conducted by Forbes suggesting that data scientists spend nearly 60% of their work time organizing and cleaning data, so that I could get their perspective on it. To no one’s surprise, they all concurred with the findings of this study.
Plus, the 100+ data science professionals I interviewed added that besides running lower complexity analysis (again, relatively speaking), SQL is their ‘tool of choice’ and their ‘tool of need’ when it comes to cleaning data or preparing data for any of the more complex analysis.
Hence, as a data scientist, you can expect to spend roughly 60% of your time writing, optimizing, validating, and debugging SQL queries.
On top of that, there are two other reasons why you would want to pay special attention to learning SQL:
- SQL Focused Job Opportunities: At top companies (including Meta, Google, Walmart, etc.), there are plenty of data science and advanced analytics positions that almost fully rely on SQL and Advanced Excel.
- SQL Tests in Data Science Hiring Process: Most companies will test your knowledge of SQL as part of their data science hiring process. Without proper SQL knowledge, navigating through trick interview questions may prove challenging for you.
Now that we have established SQL as a critical skill to be a data scientist, let’s talk about some good resources for learning it.
I reviewed over 20 highly recommended SQL courses and evaluated these courses on two criteria. First, how well do these courses cover and instruct fundamental SQL concepts? Second, how well do these courses prepare you for technical SQL interviews?
Based on those criteria, below are my top course recommendations.
Top Recommendation: Databases and SQL for Data Science with Python offered by IBM on Coursera
First things first. The title of this course implies that there is some Python required for completing it. However, you will not even need to say the word ‘Python’ until its very last chapter.
This course covers SQL concepts that are needed to crack most data science interviews BETTER than any other SQL course that I came across and reviewed!
To top it all off, you can get started with this course absolutely FREE! (It offers a 7-day free trial.)
Now, that alone is a good enough reason to sign up for this course. But, there is one other reason that made this course stand out for me among the 20+ SQL courses that I reviewed:
- Pathway to IBM Data Science Professional Certificate: This is one of the 10 courses that you need to complete to earn the IBM Data Science Professional Certificate. Further in this article, I will explain why you may want to pursue this certification. For now, let me just conclude by saying – It is a big deal!
In short, this is simply the best SQL course for data science aspirants out there. Irrespective of where your current SQL skill level falls today, this course has something for you. You just cannot go wrong with this course!
Runner-up Recommendation: DataCamp
If you have not heard of DataCamp before, here is how I would introduce it to you:
DataCamp is the best all-rounder platform that offers the entire spectrum of data science courses (SQL, Python, R – you name it) under one membership umbrella.
Therefore, if you want every possible course that you could possibly need for building the right technical skills for data science on one single platform – DataCamp is that single platform for you.
Here are a few things that made DataCamp stand out among the 20+ SQL courses I personally reviewed:
- Course Content and Instruction Quality: DataCamp is the only 100% data science focused platform that I came across. With such great focus, comes really great quality – and, I really mean it. The course content and the instruction quality at DataCamp are just amazing. You just cannot go wrong Signing Up For DataCamp!
- Practice Problems: DataCamp offers more practice problems than any other platform that I have ever come across. Just this problem bank in itself makes DataCamp signup totally worth it.
- One Membership For All Data Science Courses: This, to me, is perhaps the most appealing feature of DataCamp. You sign up for DataCamp, and you are set for most data science courses that you could possibly need in your journey to becoming a data scientist.
Finally, DataCamp offers you the first chapter of every single course ABSOLUTELY FREE!
So, you don’t have to spend a penny if you don’t like what you find after signing up for DataCamp. Hence, there is a lot for you to gain and almost nothing for you to lose by signing up on this platform.
Data Visualization is another simple but important skill for succeeding as a data scientist.
From a technical know-how perspective, it is not a very complex skill to learn. With the advent of data visualization software (such as Tableau, Power BI, and Looker), once you have built the right data pipelines, creating different data visuals is not that complex. But, don’t let that fool you!
Knowing how to create visuals, and knowing what visuals to create for narrating your stories are two very different sets of skills. To succeed as a data scientist, you need both.
In any data science job, you will be tested on both these skills every day. Additionally, don’t be surprised if in your data science interview you are tested on things such as determining KPIs to track and dashboards to build. Such questions are very common in data science interviews at Tesla.
Therefore, put dedicated effort into practicing and building both of these skills!
Every data science professional that I interviewed emphasized the importance of storytelling and choosing the right visuals when it comes to data visualization.
However, of the 10+ data visualization courses that I reviewed for you, only 1 covered data visualization from the storytelling lens. This is in addition to the technical aspects of data visualization, of course.
For that reason, there is just 1 data visualization course that I can truly recommend to you. And, it is the Data Visualization with Tableau Specialization offered by UC Davis on Coursera.
As mentioned above, Data Visualization with Tableau Specialization on Coursera was the only course that checked all the boxes for me.
Plus, the overall structure of this course is just amazing. In my opinion, this is perhaps the only course that truly prepares you to handle data visualization problems from every single angle, be it storytelling or the technical aspects of building a dashboard. It is definitely the best one that I came across in my extremely thorough research.
The best part: Like most Coursera courses, you can Try This Course ABSOLUTELY FREE!
So, without any further ado, Click Here To Sign Up Today!
(You can always cancel within 7 days if you don’t find it worth your time and effort)
Python/R For Data Science
Now, let’s talk about Python and/or R for data science!
It for sure is the skill that you need for the fun and exciting stuff in data science – for example, Machine Learning and Deep Learning.
But, many aspiring data scientists make the mistake of assuming that it is just that fancy stuff that one needs Python and R for. When in fact, you will need to learn either Python or R for a much broader scope.
Interviewing 100+ data science professionals revealed that you need Python and/or R to perform three functions as a data scientist:
- Analyzing Data Spread Across Multiple Sources: At large companies, it is not uncommon for the data that one needs for analysis to be spread across multiple data sources. In such situations, by using Python/R (and plugging SQL queries within the Python/R code), you can establish data connections across multiple sources.
- Running More Complex Data Analysis: When analyzing millions of rows spread across multiple tables, SQL queries can take extremely long to compute even simple calculations. In situations such as these, you would need to rely on Python/R for data analysis.
- Machine Learning (and other advanced analytics): This is a no-brainer, of course. You will need either Python or R to run advanced forms of assessments that require Machine Learning, Deep Learning, or the likes of these.
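The first of these functions is the easiest to picture with a small example. The sketch below is only an illustration under stated assumptions: two in-memory SQLite databases stand in for two separate systems, and every table name, column name, and number in it is made up. SQL does the aggregation inside each source, and Python stitches the two results together.

```python
import sqlite3


def make_source(create_sql, insert_sql, rows):
    """Create an in-memory SQLite database that stands in for one data source."""
    conn = sqlite3.connect(":memory:")
    conn.execute(create_sql)
    conn.executemany(insert_sql, rows)
    return conn


# Hypothetical source 1: a sales database.
sales_db = make_source(
    "CREATE TABLE sales (customer_id INTEGER, amount REAL)",
    "INSERT INTO sales VALUES (?, ?)",
    [(1, 100.0), (2, 50.0), (1, 25.0)],
)

# Hypothetical source 2: a customer database living on a different system.
crm_db = make_source(
    "CREATE TABLE customers (customer_id INTEGER, region TEXT)",
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "East"), (2, "West")],
)

# SQL runs separately against each source...
totals = dict(
    sales_db.execute(
        "SELECT customer_id, SUM(amount) FROM sales GROUP BY customer_id"
    ).fetchall()
)
regions = dict(
    crm_db.execute("SELECT customer_id, region FROM customers").fetchall()
)

# ...and Python joins the two result sets, since the tables sit in separate databases.
revenue_by_region = {}
for customer_id, total in totals.items():
    region = regions.get(customer_id, "Unknown")
    revenue_by_region[region] = revenue_by_region.get(region, 0.0) + total

sales_db.close()
crm_db.close()
print(revenue_by_region)
```

In a real job, the connections would point at actual servers and you would likely use a data analysis library instead of plain dictionaries, but the division of labor between SQL and Python stays the same.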
If you are confused between the two, learning Python is typically easier than R for beginners, and Python is a more popular language in the workforce (as per many different studies).
Therefore, I would recommend that you get started with just learning Python. Once you have a solid grip on Python, it would be a lot easier for you to learn R too.
I reviewed 25+ Python courses and evaluated them based on feedback and perspective from 100+ data science professionals, and below are my top 2 recommendations for you.
Pick either of these two, and you will not need to look elsewhere for any Python concept for data science!
Top Recommendation: Python for Data Science, AI & Development offered by IBM on Coursera
Oh, boy! Where do I start? First, with IBM being the course creator, you can be confident that this course is not just academic, and it in fact prepares you for the real data science world.
Plus, you can Sign Up For This Course Absolutely Free, and won’t have to pay a dime if you can finish it in 7 days!
But, to give you full visibility, below are two reasons that made this course stand out to me:
- Coursework Quality: Both the curriculum and the instruction quality for this course are outstanding. Plus, this course is very practical in terms of preparing you for the data science jobs at top companies.
- Pathway to IBM Data Science Professional Certificate: Similar to the Databases and SQL for Data Science with Python course mentioned above, this is one of the 10 courses that you need to complete to earn the IBM Data Science Professional Certificate. And, trust me – having that certificate puts you at a great advantage vs. other candidates applying to a similar data science role.
So, without much delay, Sign Up For This Course Today!
(it’s completely free if completed within 7 days)
Runner-up Recommendation: DataCamp
As also mentioned above: DataCamp is the best all-rounder platform that offers the entire spectrum of data science courses (SQL, Python, R – you name it) under one membership umbrella.
Be it the course content, the instruction quality, or the practice problem set –> DataCamp Has It All!
You just can’t go wrong with DataCamp!
You can even learn R on it!
Sidenote: If learning R is on your to-do list, DataCamp is my top recommendation for it!
Did you really think you could get away without statistics? Unfortunately (or, fortunately for some) statistics is an important area for data scientists. For a ton of work that you would do in this profession, conceptual clarity in statistics is a must.
However, depending on your interests and target positions, the depth of your knowledge in statistics can vary.
That being said, irrespective of your chosen track within data science, having a working knowledge of Statistics is an absolute must for any data scientist!
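To give you a feel for what “working knowledge” means in practice, here is a tiny sketch using Python’s built-in statistics module. The numbers are made up purely for illustration. It shows why knowing the difference between the mean and the median matters when your data contains an outlier.

```python
import statistics

# Hypothetical sample: daily order counts for one week (made-up numbers).
# The value 90 is a deliberate outlier, e.g. a one-day promotion.
daily_orders = [12, 15, 11, 14, 90, 13, 16]

mean_orders = statistics.mean(daily_orders)      # pulled upward by the outlier
median_orders = statistics.median(daily_orders)  # robust to the outlier
stdev_orders = statistics.stdev(daily_orders)    # sample standard deviation

print(f"mean={mean_orders:.1f}, median={median_orders}, stdev={stdev_orders:.1f}")
```

A single unusual day is enough to pull the mean well above what a typical day looks like, while the median barely moves. Spotting this kind of thing, and explaining it to stakeholders, is exactly where conceptual clarity in statistics pays off.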
In my opinion (which was informed by interviewing 100+ data professionals :)), Naked Statistics by Charles Wheelan is the best resource for learning the statistics that you would need as a data scientist.
In this book, Wheelan covers statistics in plain English, making this a great audiobook as well. Trust me – learning basic statistics cannot get easier than that!
Plus, you can listen to this audiobook absolutely free by Signing Up For a Free Audible Premium Plus Trial!
Step-4: Strengthen Your Profile For Data Science Roles
Now that we have covered the core skills required for getting your dream job, it’s time to work towards building a profile that is appealing to recruiters and hiring managers.
Let’s be honest, no hiring manager or recruiter can fully evaluate a candidate purely based on the interview. All hiring managers look for certain validations, which may be in the form of your past experiences or educational background.
But, does this imply that without having a graduate degree in data science, your chances of becoming a data scientist are slim?
By interviewing 100+ data professionals, I identified a few certifications and steps that you can take to maximize your chances of getting your dream data science job!
So, without further ado, let me tell you what data science hiring managers and recruiters look for, and the things that you can do to stand out from the crowd.
Add Data Science Certificates and Credentials To Your Profile
It may sound like a cliché, but as per the 100+ data science professionals that I interviewed, having data science-specific certificates and credentials on your resume helps in two ways:
- It gives recruiters and hiring managers the confidence that you have the core skills necessary to be successful in the role.
- It proves your genuine interest in the domain and helps differentiate you from many other candidates applying for similar roles.
That being said, not all data science certificates and credentials are made the same, and for that reason, they are treated differently.
To ensure you get the best value for your time and energy, I gathered some direct feedback and perspective from industry leaders in data science and narrowed down my recommendation to just three certificates/credentials.
You can’t go wrong picking any among these. But if you are serious about getting into data science, I would strongly recommend that you get at least one.
It will really help you land your dream data science job!
Top Recommendation: IBM Data Science Professional Certificate
IBM Data Science Professional Certificate has a lot going for it. This certificate has long been my all-time favorite for what it offers. Plus, the feedback from 100+ data science professionals made giving this recommendation even easier for me!
Below is a quick summary of what made this certificate my top recommendation:
- Industry Recognition: 70% of the 100+ data science professionals that I interviewed confirmed that the IBM Data Science Professional Certificate is highly recognized at their company. They, in fact, added that seeing this certificate on a resume can at times help sway hiring decisions.
- Course Content and Instruction Quality: This certificate teaches everything you need to be successful as a data scientist. It covers best practices in the industry, and just the course content covered in this certificate program would be enough for most aspiring data scientists.
- Certificate Cost: The fee structure of this certificate program also makes a big difference. In essence, you pay for each month of enrollment vs. paying a lump sum here. With this, you can literally complete this entire certificate for just under 100 dollars. (YES! It is possible. I have personally done it.)
- Program Flexibility: You can pursue this certificate course entirely based on your schedule. It has no hard start and end dates that you need to plan your schedule around. If you like (and I would highly recommend it), you can enroll in this program today!
In short, this certificate program has a lot going for it, and you just can’t go wrong by enrolling in it.
Of the 20+ non-degree data science programs that I personally reviewed and gathered feedback on from 100+ data professionals, none offers better overall value!
Runner-up Recommendation: MIT MicroMasters Program in Statistics and Data Science
MIT MicroMasters Program in Statistics and Data Science stands second on my data science credential recommendation list for you. This is an incredible program, and it was only its higher cost and lower flexibility (when compared to the IBM Data Science Professional Certificate) that pushed it to the second spot.
Below are the things that make this course stand out:
- Superior Instruction and Course Content: Well, I believe there are a few things that MIT can never get wrong. This is a wonderfully designed course and the quality of instruction in it is truly amazing.
- Pathway to Graduate/Ph.D. Programs: MIT MicroMasters in Statistics and Data Science is truly a MicroMasters, and not just a certificate program. It is more rigorous than any other program on my recommendation list, and consequently many universities (including MIT itself) have favorable grad school admission criteria for applicants with this credential.
- MIT Brand: It goes without saying, the MIT brand does carry some weight in the industry. Getting this credential is definitely more difficult than other certificate options. But, this extra effort is totally worth it. Nearly all of the 100+ data professionals that I spoke to attested that this credential will help one stand out for almost every data science position.
In summary, MIT MicroMasters in Statistics and Data Science has a ton of great things going for it as well. It is more rigorous and detailed than the IBM Data Science Professional Certificate and also carries a slightly higher weight in the industry.
So, if your focus is to get the most comprehensive education in data science: MIT MicroMasters in Statistics and Data Science is simply the best out there!
Alternate Recommendation: Harvard Data Science Professional Certificate
Harvard Data Science Professional Certificate is another great option that you may want to consider. It has a lot going for it, and below are the things that make this credential stand out:
- Course Content and Instruction Quality: Well, you can expect nothing less from Harvard! It is an amazingly well-designed course with superior instruction quality.
- The Harvard Brand: It goes without saying, the Harvard brand does have a certain appeal to it. Having the Harvard brand on your resume can go a long way.
So, if this certificate program has so much going for it, why make it an alternate recommendation (and not a primary one)?
- This certificate is based on R, and not Python. If you are just getting started on your journey to becoming a data scientist, I strongly recommend that you start with a Python-based course (vs. an R-based course).
But, if for any reason R is your language of choice, the Harvard Data Science Professional Certificate is an amazing course to consider!
Add a Data Science Project Portfolio To Your Profile
Which hiring manager would not want to hire the candidate whose quality of work they have assessed firsthand? Years ago, giving hiring managers visibility into your quality of work would have been a difficult proposition. However, today it’s not that far-fetched of a proposition at all (at least, for domains such as data science).
There are a ton of websites and online forums (for example – GitHub) where you can build a portfolio of data science projects. A link to this portfolio can then be added to your resume, giving hiring managers and recruiters direct access to your work.
Feedback from 100+ data science professionals revealed that besides the right credentials on a resume, it is the project portfolio that makes a candidate stand out to them.
Now, the important question. Where does one get interesting projects to build a data science project portfolio?
When it’s about data science projects, nothing comes close to what DataCamp has to offer here.
Plus, it is not just the projects that DataCamp offers. You get a ton of data science courses, in addition to these projects with their low-cost membership plan.
While their premium membership gets you access to a ton of additional projects and resources, even their basic plan would meet the requirements of most people.
Step-5: Building the Perfect Data Science Resume
Now that you have done the hard work, it’s time to showcase your skills to the hiring managers and recruiters with a perfectly crafted resume.
Recruiters often overlook many great candidates because of minor things that could have been presented better on their resumes.
But, to be fair, your resume is the only thing recruiters have to compare you with 100 other applicants.
Therefore, you must at all costs ensure that the resume you submit with your job applications is well crafted and targeted!
When I asked data science recruiters for resume writing guidelines they would recommend, many of them recommended reading How to Write the Perfect Resume: Stand Out, Land Interviews, and Get the Job You Want By Dan Clay.
It is available on Amazon, and in the words of two data science recruiters at Google, “This is perhaps one of the best books you can find on resume writing and is totally worth a read”.
Step-6: Targeted Networking and Job Application
Now that you have done all the prep necessary for getting your dream job and have a well-crafted resume, what next?
It’s time to reap the benefits from all that hard work, and begin applying to your dream jobs!
But, before you do that: One final tip from my interactions with 100+ data science professionals.
Instead of bulk submitting your resume on different recruitment portals and company websites, targeted applications almost always yield better results.
In my conversations, both data science hiring managers and recruiters emphasized that they are more likely to move forward with an applicant who has connected with them on LinkedIn with a personalized message in the past.
If you think more tips such as these would be helpful in your job hunt, sign up for the Free 30-day Audible Premium Plus Trial Today, and listen to Knock ’em Dead: The Ultimate Job Search Guide by Martin Yate.
It was a highly recommended resource from quite a few recruiters!
You can even buy a copy of this book on Amazon!
In conclusion, I will reiterate that data science is an incredible domain and things in this space are barely getting started.
If you are even remotely interested in data science, take the first step toward learning more about it. Getting Started With DataCamp is extremely easy and very soft on the pocket.
If nothing more, Signing Up with DataCamp will allow you to get a feel for the data science career track. Hence, there is a lot to gain and almost nothing to lose in taking this first step.
In parallel, if data science is already the domain where your passion lies, you must add some data science credentials to your profile.
IBM Data Science Professional Certificate is the one that I would recommend to most people.
Therefore, if you are committed to being a data scientist and can spend energy and time on just one thing: Sign Up For IBM Data Science Professional Certificate on Coursera.
It has the right mix of everything and costs almost nothing. Plus, it is 100% Free to Try and Get Started! (in the form of a 7-Day free trial)
Among the other solid data science credentials you can pick up to strengthen your profile are the MIT MicroMasters Program in Statistics and Data Science, and the Harvard Data Science Professional Certificate.
But, irrespective of what option you choose, the important thing is to get started.
With that, I am rooting for you and I look forward to seeing you on the other side! Happy Learning!
Affiliate Disclosure: We participate in several affiliate programs and may be compensated if you make a purchase using our referral link, at no additional cost to you. You can, however, trust the integrity of our recommendation. Affiliate programs exist even for products that we are not recommending. We only choose to recommend you the products that we actually believe in.