There's No Crying in Data Science

What is it like learning Python if you know R?

It’s like being every single character in this scene all at once, all the time:

Which is to say that the last two months have been some of the most frustrating months of my learning life - and I’m including 3.5 years in an Immunology & Infectious Disease PhD program in that calculation.

Sure, I’m a little frustrated with Python, and I’m kind of frustrated with the fields of machine learning (and deep learning and data science and AI), but nothing compares with how frustrated I am with how slowly I’m learning. My learning rate is awful, and every small, incremental, barely noticeable gain in knowledge comes at the cost of 10+ hours of studying. Nothing seems to stick, and there’s been a noticeable uptick in my desire to Ron Swanson my computer:

Is Python hard or are you just bad at it?

Both? Neither? I honestly don’t know anymore.

First of all, installing everything sucks (you’re going to need the command line – no one seems to mention that part), and almost all of the learning resources are made for people who either:

  1. Already know Python
  2. Have never written a line of code

And if you look online it can feel like everyone is talking about how easy Python is, and how elegant and beautiful it is, and it just kind of makes me feel worse for not “getting it” when I so desperately want to “get it” and am actively trying to “get it.”

Why do you need to learn Python if you already know R?

Ah, the million-dollar question! It would be so easy to say “Hey, I know R, why should I bother with Python?” right? Except that I want to take my career in a direction that relies heavily on knowing both Python and R, and I think that there’s inherent value in knowing both languages.

How do you learn Python effectively if you already know R?

It’s not you, it’s me
As much as I hate to say it, you’re going to have to break up with R for a little bit. It’s not forever, and it’s you - not R - that needs a little time and space, because you need to let go of the way things are done in R and be open and curious as to how things are done in Python.
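To make "letting go" a bit more concrete, here is a minimal sketch of a few R habits that quietly produce different results in Python. It is an illustration only, not taken from any particular course or book, and the variable names are made up.

```python
# Illustrative only: places where R instincts give surprising results in Python.

scores = [90, 85, 77, 62]

# Indexing starts at 0, not 1 (in R, scores[1] would be the first element).
first = scores[0]
second = scores[1]

# Slices exclude the end index (in R, scores[1:2] returns two elements).
first_two = scores[0:2]

# Assignment binds a second name to the same list; it does not make a copy.
alias = scores
alias.append(100)  # this also changes what `scores` refers to

# Ask for a copy explicitly when you want the R-like "modify my own version" behavior.
independent = scores.copy()
independent.append(0)  # `scores` is not affected by this append

print(first, second, first_two)
print(scores)
print(independent)
```

None of this is hard once you know it, but each one is a spot where code that "looks right" to an R user does something else.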

Know what you’re trying to learn, and remember why you’re learning
There’s a lot of really cool stuff out there to learn, and in the beginning it feels like you’ll never keep up with the rate of development. And that’s OK. Take a deep breath and think through what you’re trying to learn, and remind yourself why you’re learning.

It can also be helpful to look at what you’re trying to learn and determine if you’re starting with something too far outside of your learning zone.

For example, I really want to get into generative deep learning, but most of the books I’ve found say that I should first be pretty good at Python and Machine Learning before diving into generative deep learning. This doesn’t mean I’ve got to put everything on hold until I’ve taken every fundamentals course out there, but it does mean that I’ll be better served by getting a good enough understanding of Python and Machine Learning before tackling generative deep learning.

Remind yourself (often) that you’re really good at things!
Whenever I get really frustrated with myself for not being great at Python (yet) I take a moment to remember that there are things I’m really good at - like teaching, curriculum design, and public speaking. It’s helpful to remember that you’re an expert in something, and that becoming an expert took a lot of time, learning, and patience.

What do you recommend for learning Python if you know R?

Skip installation and use a cloud-hosted notebook
Seriously. Your goal is learning how to program in Python, not set up and optimize a system for production, which means anything you can do to reduce the amount of time spent between deciding to learn Python and actually programming in Python is worth the effort.
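One quick way to confirm that a hosted notebook really does save you the installation step is to run a first cell that checks what is already there. The sketch below is an assumption about what you might want to check, not a requirement of any notebook service. `check_packages` is a hypothetical helper, and the module names in `wanted` are just examples.

```python
import sys
from importlib.util import find_spec


def check_packages(names):
    """Return a dict mapping each module name to True if it can be imported here."""
    return {name: find_spec(name) is not None for name in names}


print("Python version:", sys.version.split()[0])

# Example module names only; swap in whatever your book or course uses.
wanted = ["numpy", "pandas", "sklearn", "tensorflow"]
for name, available in check_packages(wanted).items():
    print(f"{name}: {'available' if available else 'not installed'}")
```

If everything you need shows up as available, you can start writing Python right away.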

Some options for cloud-hosted notebooks are:

Find your “just right” resources
For me, this has involved one paper and two books. Start by reading A Whirlwind Tour of Python. This is a fantastic resource that assumes you can already program in another language, and spends its time pointing out some of the landmarks that make Python different and unique.

From there, find a book (or class or course or whatever learning method you prefer) and use it to learn. I’m working through Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, and my process looks like this:

  • Read a chapter while writing and executing all of the code in a Kaggle notebook.
  • Re-read the chapter (without coding) and annotate the chapter by highlighting vocabulary words, important concepts, and anything unfamiliar.
  • Research what I’ve flagged in the chapter by taking notes and creating illustrations to better understand the concepts.
  • Review the code and start linking what the code is doing to chapter vocabulary and concepts, especially to any illustrations that I’ve made.
  • Dig deeper into the code by looking up anything I have questions about in the Python Data Science Handbook.
  • Use my chapter notes to re-create the analysis using a new dataset.
  • Complete the end-of-chapter questions.

It takes a long time and it can be exhausting, but this method works really well for me.

Find your learning community
We learn better when we learn together. I haven’t found my Python learning community - yet - but I’m going to keep looking.