Learning Machine Learning: How should I attempt to start?

Aug 11, 2019

I could just list a bunch of resources, but, although there is certainly no shortage of excellent content on the topic, there’s a reason I’m not doing that.

Hear me out. I’ve had this conversation over and over — each time with the same disappointing result:

1. Person X and I talk about work.
2. Person X expresses their desire to learn Machine Learning.
3. I explain what my from-scratch approach would be today: jump straight in by understanding a single form of Machine Learning, preferably the one that has been receiving all the hype, a Neural Network (NN). Then do a practical online course on that.
4. Later that day, I fulfil my promise of sending through some resources, which typically include a 3Blue1Brown video and a course from Andrew Ng.

And then:

5. We see each other again and I hear that Person X hasn’t worked through any of the resources, but is still super keen to learn Machine Learning.

I’ve found this result confusing for a couple of reasons:

The resources themselves are absurdly good. 3Blue1Brown’s legendary animated NN video gave me a better understanding of the topic in 20 minutes than my 6-month college course did. And Andrew Ng is world-renowned for his ML courses, which are available in multiple programming languages.
Person X was almost always genuinely excited about the topic and seemed to realize the potential impact of ML.
Plenty of these people were highly capable, both intellectually and in terms of work rate.

All this has brought about a bigger question, only 50% related to Machine Learning:

How should one approach learning a completely new field?

On this, I have come to believe in mental tricks. Ideally, we’d all be rational enough to take the steps that serve our best interest, but it seems that we need to play mind games on ourselves to get there.

One of these is to get some skin in the game. It plays on the natural human tendency to avoid loss*. Here are some practical examples:

Pay for the course you are taking. It’s almost unfortunate that the best content is freely available. But making a firm decision and paying for a certificate can be the answer. This dramatically increases the probability of completing the course because of the “I paid for this; I’d better get value out of it” argument.
Sign up to do work you don’t yet know how to do, preferably with a deadline and small compensation. Look for the opportunity to do basic machine learning work in an informal setting. This might just be building a simple regression model for data your friend or relative is working on, with a 3-course dinner out as payment.
If you’re close to someone who works in ML, see if you can arrange scheduled meetups where you work through some material. Maybe tackle a model a night. And again, preferably compensate the person with dinner or something similar.

A core problem I’ve noticed preventing Person X from starting with ML is that they don’t actually believe they can understand it. The media only writes about cases where ML has been used to reach some almost unthinkable achievement. Amazement is the goal of the article, not understanding. And let’s face it: ML is not the simplest concept around. But neither is economics.

So here I’ve found this trick helpful: portraying something as simply as possible in order to make Person X believe that they can actually understand ML. The idea is that once you’ve tasted the satisfaction of grasping a previously pie-in-the-sky concept, your brain is hungry for more.

Let’s actually end this article by doing that right now.

K-Means Clustering in two minutes

Problem: (taken from work a friend is doing) Sneaky Ltd. wants to categorize customers into groups based on which products they’re currently buying. Afterwards, Sneaky markets to each customer the products that they are not yet buying but that are highly popular with the rest of their cluster.

K-Means is an unsupervised clustering algorithm (often loosely described as unsupervised classification). It is unsupervised because the data points (customers) aren’t labelled (as group 1, group 2, …); we don’t know in advance which groups these customers belong to.

Let’s confine this problem to just 2 products** with 3 customer categories and see how the algorithm tackles it:

K-Means merely chooses 3 (“k”, equal to the number of output categories) random points (“means” or “centroids”) and repeats these two steps until the groups stop changing:

Links each data point to the nearest centroid, forming a new category group.
Calculates the new centroids for these new groups of data points.
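
To make those two steps concrete, here is a minimal sketch in plain Python. It is only an illustration under a few assumptions: the function name `k_means`, the made-up customer numbers, and the choice to start from k randomly picked customers (rather than arbitrary random points) are all mine and are not taken from the Sneaky Ltd. work.

```python
import random


def k_means(points, k, iterations=100, seed=0):
    """Group 2-D points into k clusters; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumption: use k randomly picked data points as the starting centroids.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iterations):
        # Step 1: link each data point to its nearest centroid.
        new_labels = [
            min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]
        if new_labels == labels:
            break  # the groups stopped changing, so we are done
        labels = new_labels
        # Step 2: move each centroid to the mean of its group.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty group keeps its previous centroid
                centroids[c] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return centroids, labels


# Hypothetical data: (units of product A, units of product B) bought per year.
customers = [
    (1, 2), (2, 1), (1, 1),     # buy little of either product
    (10, 1), (11, 2), (12, 1),  # mostly buy product A
    (1, 10), (2, 11), (1, 12),  # mostly buy product B
]

centroids, labels = k_means(customers, k=3)
for customer, label in zip(customers, labels):
    print(customer, "-> group", label)
```

The result depends on the random starting points, and an unlucky start can settle on a poor grouping, so it is common to run the algorithm with a few different seeds and keep the tightest result.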

Simple enough, right? I wouldn’t claim a Neural Net is comparable in complexity, but I would stand by the notion that many algorithms in ML can be reduced to an intuitive representation.

I sincerely hope this article helps, in whatever way. When I originally pondered this subject, it led me to the question of whether a field as complicated as ML should even be considered by outsiders. Were those abstract college courses in statistics actually fundamental to my application of the field? But then I think about coding: plenty of professionals from other industries had to learn this useful new paradigm even though a deep mathematical background was supposedly “required”. And although that background does help, and coders who were taught from the ground up usually make better developers, the usefulness of coding and the domain knowledge of the newcomers resulted in an excellent return on investment. I don’t quite believe that ML will have an impact on the same scale, but it certainly has a ton of applications outside of where it is used today.

Feedback is highly appreciated.

* This is not quite the principle of loss aversion, though, which refers to preferring to avoid a loss over acquiring a similar gain.
** With only 2 products, there is very little use in the outcome. And there are a couple of other changes you’d want to make in a real-life situation, such as square-root scaling the amount bought per year or even just having it as a boolean value. But we humans have difficulty understanding higher dimensions.

Written by Chrisjan Wust