How I Became a Data Scientist in 6 Months
Abstract: My name is Kate, and I had just walked away from 8 years of study and hard work without warning. You might be wondering why anyone would do that. I have to say it was my boss who wore me down, and he knew I was going to make a change.
When my boyfriend suggested that I become a data scientist, I thought he was crazy. I didn't know anything about programming, and I thought he was overestimating my abilities.
As it turned out, my friend Ana suggested the same thing about two weeks later. That made me reconsider the idea: why not try it? With that in mind, I decided to start over and reinvent myself as a data scientist.
Starting from zero: my goal is to become a data scientist
I wanted to study at my own pace, so I decided to take online courses. I figured that, with a PhD in neuroscience, I probably had enough formal training to do a job in data science. I just needed some practical skills.
This story will describe four different courses I took and how they led me to work in data science at a healthcare startup in Silicon Valley.
At the time, most online courses I came across were free. So I challenged myself to get the skills I needed without spending money - what can I say, I'm stingy.
- "I used to think that correlation implied causation. Then I took a statistics class, and now I don't."
- "Sounds like this class was helpful?"
- "Maybe so."
When I finished my postdoc at UCSF, I had no programming experience. I had used statistics in all my studies, but on a small scale. Every dataset I had analyzed was one I had generated myself in my lab, so the number of observations was very small. I needed to learn to program and to analyze data on a larger scale.
When I decided to become a data scientist, the first thing I wanted to learn was how to write code. I had never written any, so it was a complete unknown. I reasoned that if I really hated coding, data science would not be for me, and there would be no point in learning everything else from zero. So programming was a good place for me to start.
I was lucky because my partner Ben has a lot of experience working in tech and was able to point me in the right direction. He suggested that Python might be the best fit for me. Python is an excellent tool for data analysis, very versatile, and works well with large datasets, so I decided this was where I would start.
Learn to program
Course 1. Codecademy
To start learning to program, I used Codecademy. I started with "Getting Started with Python", though I'm not sure the course I completed in 2014 still exists in the same form. If I were starting to learn Python on Codecademy now, I would probably choose the "Analyzing Data with Python" course.
I found Codecademy to be a good starting point. Its main advantage is that you write code directly in the browser. Getting a programming environment properly installed on my computer is still my Achilles heel, so I was happy to avoid it at first. If my code didn't work, I knew it was because of a syntax issue and not because of a broken environment setup.
I also liked being able to use Codecademy for a few minutes at a time. If I had some free time, I logged in and did a few exercises, because everything was there waiting for me. Working in short bursts meant I never dreaded sitting down to it.
At the time I finished the course, only a few courses on Codecademy were free. I was amazed by the quality of the free courses available online.
After learning the basics of Python, I needed to improve my statistical skills and learn to analyze data on a larger scale.
Learn data analysis
Course 2. Coursera Data Science Specialization from Johns Hopkins
Next, I took the Coursera Data Science Specialization from Johns Hopkins University. At the time, you could take the honor code version for free and only had to pay if you wanted a verified certificate.
The verified certificate didn't seem important to me. What mattered was being able to demonstrate the skills taught in the course in technical interviews, so I opted for the free version of the specialization.
One downside for me was that the series is taught in R. R is an excellent programming language for statistical analysis and is favored by academia. However, I wanted to learn Python for data science. I thought Python would be more useful at the kind of startup where I wanted to work.
I looked into some Python data analysis courses, but they seemed to require prerequisite knowledge that I didn't have yet. Most of these courses appeared to be geared towards software engineers who wanted to transition into data science, so they assumed solid programming skills and a Python environment that was already set up.
The main thing I liked about the Coursera Data Science Specialization is that the first course gives step-by-step instructions, from the very beginning, on how to install R and RStudio. This made it easy to tackle the subsequent lessons knowing there wouldn't be any technical issues.
Another aspect of the Johns Hopkins specialization that suited me is that it is taught by the Department of Public Health. My background in the medical sciences allowed me to easily follow the examples they used, such as the impact of air quality on asthma and other datasets relevant to healthcare. I could therefore focus on the course content instead of getting lost in the case studies.
This series of courses gave me a basic understanding of the main aspects of working in data science. It covered R programming, basic data cleaning, analysis, regression, and machine learning. I really enjoyed learning to program and to use code to analyze data, so I was encouraged to keep learning.
Pay attention to recruitment information
During this phase of my training, I started asking friends in my circle whether they could introduce me to others transitioning from academia to data science in San Francisco. Several people helped me get in touch, so I set up as many interviews as possible.
A friend introduced me to a data scientist at Modcloth who had followed a path similar to mine. She is a former neuroscientist, and I found her advice especially helpful, above all her recommendation to learn SQL.
Learn to query databases
Course 3. DB5 SQL Stanford Online
The Johns Hopkins data science series on Coursera does not cover SQL at all. The data scientist at Modcloth told me that most of her day-to-day work consisted of querying the database. She had to extract insights for the business development and marketing teams, so only a fraction of her time was devoted to statistical analysis and machine learning.
I followed her advice and started a self-paced SQL course on Stanford Online. Of all the courses I've taken, this is my favorite. The teacher was amazing: she explained the concepts using simple examples, and she explained each concept in several different ways.
Since then, I have recommended this course to many people because I think a good SQL foundation is essential for any data scientist. The data science courses I've been exposed to don't cover how to get data from a database using SQL. I think this is a huge omission. Most courses have CSV data ready for students to use, but in my experience this is rarely the case for industry data science jobs.
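To make the difference from a ready-made CSV concrete, here is a minimal sketch of pulling analysis-ready data out of a database with SQL, using Python's built-in sqlite3 module. It is not taken from any of the courses above. The customers and orders tables, their columns, and the sample rows are all hypothetical and exist only for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory SQLite database with two small, made-up tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customers(id),
            amount REAL
        );
        """
    )
    # Hypothetical sample data: customer A has 2 orders, B has 1, C has 3.
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "A"), (2, "B"), (3, "C")],
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, 1, 20.0), (2, 1, 35.0), (3, 2, 15.0),
         (4, 3, 50.0), (5, 3, 10.0), (6, 3, 5.0)],
    )
    conn.commit()
    return conn


def top_customers(conn, min_orders):
    """Return (name, order count, total spend) for customers with at least
    min_orders orders, sorted by total spend, highest first."""
    query = """
        SELECT c.name, COUNT(o.id) AS n_orders, SUM(o.amount) AS total
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        HAVING COUNT(o.id) >= ?
        ORDER BY total DESC
    """
    # The ? placeholder lets sqlite3 bind the value safely instead of
    # building the SQL string by hand.
    return conn.execute(query, (min_orders,)).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for name, n_orders, total in top_customers(conn, 2):
        print(name, n_orders, total)
```

The join, grouping, and filtering all happen in the database, so only the summarized rows reach Python. In a real job the connection would point at the company's database rather than an in-memory one, but the query pattern is the same kind of thing a SQL course teaches.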
After finishing the Stanford SQL course, I started applying for data science positions. At that point, I went back to Australia and started interviewing over Skype with startups in the San Francisco Bay Area. While interviewing, I wanted to continue developing my skills.
Course 4. edX Data Analysis Fundamentals
Then I took an introductory data analysis course on edX, taught in R. It was helpful for revising many of the concepts I had already learned in the Coursera courses.
I believe that learning concepts from different teachers can provide new insights. The second time around, it was easier to understand the statistics and machine learning concepts, and I came away from this course with a deeper understanding.
While I was taking this course, I received a job offer from a healthcare startup in San Francisco, got a work visa, and moved to the US.
Getting a data science job
I think I was successful in the final interview because I had good programming skills and a good understanding of statistics, but more importantly because I had knowledge of the healthcare field and expertise in experimental design and the scientific method.
In my opinion, it was these other aspects that made my application stand out and led this startup to give me a chance. I was very junior and needed a lot of on-the-job training. I think the courses I had taken were enough for the recruiting team to consider me, and my experience in the healthcare field set me apart.
Therefore, if you want to move your career into data science, I suggest you aim for a company where your existing domain knowledge is very valuable.
What I learned
Before starting my new data science job, I wish I had filled a major gap in my knowledge: using git from the command line. I had never used the terminal or command line before and didn't know how to use git to commit code to the company's GitHub repo.
It took a few engineers a while to get me up to speed. I wish I had at least known the basics before starting, so that I didn't waste their precious time. My co-workers were great and didn't seem to mind teaching me, but I did feel a little overwhelmed for the first few days.
I finally caught up and found the "hard command line code learning" very useful.
If you're considering a similar path into data science, I encourage you to keep going! It was definitely the right choice for me. Of course, different people learn in different ways, but if you have the discipline to get started and keep learning, it is certainly feasible to teach yourself data science through online courses.
Knowledge Base Team