Practical Machine Learning

Prologue

My summer internship project involves using machine learning to help small businesses with funding. I learned a lot about machine learning in the process, so I gave a talk about it to some of my co-workers and shared the slides online.

I also shared the code on my GitHub. The following is an essay version of the talk.

Introduction

Before I say anything, I want to show you a video from WWDC 2017. WWDC is the annual conference which Apple hosts, and it is one of the most important events in the tech calendar for showcasing the top technology applications that will be used in the near future.

So the machine learning supercut gives some context for how I think society generally views machine learning. On the one hand, it’s a technology which has a lot of potential and will drastically change aspects of our society. On the other hand, because it has so much potential, people have a tendency to over-promise and over-advertise the things which machine learning is capable of doing, and often turn it into a marketing gimmick and an annoying buzzword. For a high-level, non-technical summary of what machine learning is about and of the future of technology in general, I recommend Homo Sapiens by Yuval Noah Harari.

I take a more middle-ground approach and say that you should judge it on the merits of what you can actually build with machine learning, but first, you have to understand what machine learning is.

(Gauge audience level: How many of you have never coded before … have used ML in a small side project … have studied ML at a Master’s or Ph.D. level, or have written or helped write a paper about ML, etc.?) I’ve tried to structure my talk in such a way that non-technical people will find it interesting and the more technical, ML-experienced people may find some new, interesting concepts.

My talk is called practical machine learning because I consider myself a very pragmatic, practical person, and whenever I learn something, the first thing I ask myself is how I can apply what I’ve learned and put it into practice. Hopefully, after today’s talk, you will be able to apply what you’ve learned and build actual ML projects. Alright, so let’s get started.

What is Machine Learning?

My talk was greatly inspired by 2 tutorials which I did. The website is awesome, and the guy who runs it, Harrison Kinsley, is a very good teacher.

https://pythonprogramming.net/machine-learning-tutorial-python-introduction/

https://pythonprogramming.net/machine-learning-python-sklearn-intro/

  • ML is a software program that can learn from given inputs and give you the desired output, without you explicitly teaching it what you want the output to be.
  • Google gets approximately .00000003% better each time you use it. [1]
  • In the past, people used to say that the fundamental difference between humans and computers was that humans get better at a task with more information/“experience”, which is something computers can’t do. ML changes that by allowing algorithms/programs to become more accurate with more information.
    • This also speaks to a pattern: whenever people say “Computers will never be able to do X because they need a certain skill that only humans have”, it’s usually just a matter of time before computers acquire those skills.
  • Broadly speaking, there are 2 types of machine learning: supervised and unsupervised learning.
    • Within supervised, we have regression and classification problems.
      • Further still, within classification, we can use either k-nearest neighbors or support vector machine algorithms to solve classification problems. (Time permitting; I may just do k-nearest neighbors.)
    • While within unsupervised, we have clustering and neural networks.
      • Within clustering, we can use k-means or mean shift.
      • While within neural networks, you can have either a wide-learning, deep learning or ensemble method (which is just a bit of both).

Supervised Learning

  • Supervised learning differs from unsupervised in that you have a dataset and then the supervisor tells the machine what the correct answer is supposed to be (called a label) for each feature-set.
    • E.g. if you had a data set of heights and weights, and you told the algorithm the gender for each feature-set.
    • While in unsupervised, the algorithm would just get the heights and weights and group them in whichever way it thinks makes the most sense.

Classification

  • Classification is the most intuitive type of ML and is the type most people are familiar with. As I said, there are two types of algorithms to solve classification problems: k-nearest neighbors and the support vector machine.
  • Classification is essentially when you give a model a certain number of variables and it classifies the feature-set into one of 2 categories.
  • So for today’s example, we’ll be using the Wisconsin breast cancer dataset. Essentially, what we’re going to do is take about 10 variables on a breast sample and predict whether the tumor is malignant or benign.
  • ML is nice because you can build cool things without knowing anything about the underlying math, but I think it’s good to have some understanding of the theory behind what you build, and the math is pretty intuitive.
  • The way k-nearest neighbors works is:
    1. You have a dataset with data points belonging to 2 groups.
    2. You then get a new data point, and you are trying to predict which group it belongs to.
    3. So you find the distance between each data point and the point you are trying to predict, and find the k nearest data points.
      • You find the distances between the data points using something called the Euclidean distance.
    4. You then find what class the majority of the k nearest data points belong to, and that is the class for your new data point.
  • Walk through the code and live demo
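
The k-nearest neighbors steps can be sketched in a few lines of plain Python. This is a minimal illustration and not the code from the live demo: the function names (`euclidean_distance`, `knn_predict`) and the toy data points are made up for this example, and a real project would more likely lean on a library.

```python
import math
from collections import Counter


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length feature-sets."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(labeled_points, new_point, k=3):
    """Classify new_point by majority vote among its k nearest neighbors.

    labeled_points is a list of (features, label) pairs.
    """
    # Distance from every known data point to the point we want to classify,
    # sorted so the closest ones come first.
    distances = sorted(
        (euclidean_distance(features, new_point), label)
        for features, label in labeled_points
    )
    # Keep the labels of the k closest points and take the majority class.
    k_nearest_labels = [label for _, label in distances[:k]]
    return Counter(k_nearest_labels).most_common(1)[0][0]


# Toy two-feature data, invented for illustration (not the Wisconsin dataset).
training_data = [
    ((1.0, 2.0), "benign"),
    ((2.0, 3.0), "benign"),
    ((3.0, 1.0), "benign"),
    ((6.0, 5.0), "malignant"),
    ((7.0, 7.0), "malignant"),
    ((8.0, 6.0), "malignant"),
]

prediction = knn_predict(training_data, (5.0, 7.0), k=3)
print(prediction)
```

Choosing an odd k for a 2-class problem avoids tied votes, which is why the sketch defaults to 3.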

Importance of ML tangent

  • So, in the space of 5 minutes, I just showed you the basics of ML and we built a functioning ML application.
  • One of the key takeaways I wanted to leave you with is that every group in here should try and find a way to apply ML or big data analytics to their project.
  • Whether or not you think it “makes sense” for your project is almost beside the point.
    • Especially for developers, but even for the non-technical people, an understanding of ML and data analytics is a very valuable skill to have in the job market, and having hands-on experience with it will be a good way to differentiate yourself.
    • So, invest 1 or 2 days in browsing Kaggle, UCI, etc. for data sets (I’m always impressed by the things people open source. It’s cliche, but I’m still going to say it because it’s true: the only limit is your imagination). Think of how you can incorporate it into your app as a small feature and build something.

Regression

  • Regression is very similar to classification, only instead of getting a binary output (1 or 0, true or false, etc.) the output can be any real number.
  • This is done using a line of best fit, where you take your data points and try to draw a line which gets as close to as many of the data points as possible.
  • Line of best-fit pseudocode:
  • The example we will be building is a stock-price prediction table. Essentially, if we have different financial metrics such as % change, closing price and volume traded, we should be able to predict the price for the next 10 days.
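
A line of best fit can be computed with the standard least-squares formulas for the slope and intercept. The sketch below is only an illustration with a single input variable: the function name `best_fit_line` and the day/price numbers are invented here, and the stock example described above would use several features instead of one.

```python
from statistics import mean


def best_fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    x_bar = mean(xs)
    y_bar = mean(ys)
    xy_bar = mean([x * y for x, y in zip(xs, ys)])
    x_squared_bar = mean([x * x for x in xs])
    # Standard least-squares slope, written in terms of means.
    slope = (x_bar * y_bar - xy_bar) / (x_bar ** 2 - x_squared_bar)
    # The best-fit line always passes through the point (x_bar, y_bar).
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Made-up closing prices for five consecutive days (illustrative only).
days = [1.0, 2.0, 3.0, 4.0, 5.0]
prices = [10.0, 12.0, 11.0, 14.0, 15.0]

slope, intercept = best_fit_line(days, prices)

# Extend the line to "predict" the next day.
next_day = 6.0
predicted_price = slope * next_day + intercept
print(slope, intercept, predicted_price)
```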

Unsupervised Learning

Even within clustering, you have 2 types. With flat clustering, you don’t have to tell the computer which group each datapoint belongs in, but you do have to tell it the number of groups that you want. With hierarchical clustering, you don’t have to tell it anything. I’ll be covering hierarchical clustering as that is, in my opinion, the TRUE unsupervised learning.

Flat Clustering – K-Means

  • K-means
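
As a rough sketch of flat clustering, here is a small k-means in plain Python. It assumes a few things that the talk does not specify: the centroids are seeded with the first k data points (a simplification that keeps the example deterministic), the function name `k_means` is made up, and the data points are toy values.

```python
import math


def k_means(points, k=2, max_iterations=100):
    """Group points into k clusters; returns (centroids, clusters)."""
    # Assumption for this sketch: seed the centroids with the first k points.
    centroids = [tuple(p) for p in points[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(max_iterations):
        # Assign every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of the points assigned to it.
        new_centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        # Stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


# Toy data, invented for illustration.
points = [(1.0, 2.0), (1.5, 1.8), (5.0, 8.0), (8.0, 8.0), (1.0, 0.6), (9.0, 11.0)]

centroids, clusters = k_means(points, k=2)
print(centroids)
print(clusters)
```

Note that the only thing the algorithm is told is k, the number of groups; it works out the membership of each group on its own.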

Hierarchical Clustering – Mean Shift

  • This works using centroids:
    1. Each datapoint starts out as a centroid, and we do a hill climb: around each centroid we draw a radius of a specified bandwidth.
    2. Each data point within the bandwidth is added to that centroid’s cluster. Take the mean of the cluster and a new centroid is found. Repeat until the centroids stop moving; centroids that end up in the same spot merge into one cluster.
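
The centroid idea can be sketched as a simplified mean shift with a flat (unweighted) radius. This is an illustration under stated assumptions, not a production implementation: the function name `mean_shift`, the bandwidth of 4 and the toy points are all chosen for this example.

```python
import math


def mean_shift(points, bandwidth, max_iterations=100):
    """Return the centroids found by a flat-kernel mean shift."""
    # Every data point starts out as its own centroid.
    centroids = sorted(set(tuple(p) for p in points))
    for _ in range(max_iterations):
        shifted = set()
        for centroid in centroids:
            # Collect every data point within the bandwidth of this centroid.
            in_range = [p for p in points if math.dist(p, centroid) <= bandwidth]
            if not in_range:
                # Defensive guard: keep the centroid where it is.
                shifted.add(centroid)
                continue
            # The new centroid is the mean of those points.
            shifted.add(tuple(sum(c) / len(in_range) for c in zip(*in_range)))
        # Using a set merges centroids that landed on exactly the same spot.
        new_centroids = sorted(shifted)
        # Stop once nothing moves any more.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids


# Toy data, invented for illustration.
points = [(1.0, 2.0), (1.5, 1.8), (5.0, 8.0), (8.0, 8.0), (1.0, 0.6), (9.0, 11.0)]

centroids = mean_shift(points, bandwidth=4.0)
print(len(centroids), centroids)
```

The number of clusters is never passed in; it falls out of how many distinct centroids survive. On this toy data the six starting centroids collapse to two.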

Neural Network (Deep Learning)

Neural networks are my favorite part of ML to talk about and arguably the one which I actually understand the least (I wonder if there’s a correlation). It’s also arguably the most esoteric branch of ML, but in my opinion, this is where most of the interesting things are happening.

Something very interesting I learned preparing this talk is that a lot of the underlying theory for neural networks has actually existed for many years, dating as far back as the 1940s.

The fundamental problem was that the hardware didn’t yet exist to make the computations required, so it was put on the shelf until the rise of things like GPUs, Apache Hadoop, MapReduce, etc. This is a very important point that makes math and science very interesting, and something for us to keep in mind: many things that we are researching or building today, and that we think are novel ideas, are actually ideas that have existed for a long time. Our predecessors just didn’t have the technology to implement them. It’s another one of the many reasons why learning history is so important.

So neural networks are based on the idea of modeling how the brain works.

Our brain is composed of neurons. Neurons are cells composed of dendrites, an axon, and axon terminals. Neurons talk to each other to transmit and process information via their dendrites and axon terminals, across junctions called synapses. So for example, if you put your hand in hot water, your fingertips (input) will send that data to your brain, and certain neurons will be activated to let you know that you are in pain and that your arm muscles should pull your hand out (output).

Now how does that relate to computer science? Scientists had the idea that we could use nodes in a system that communicate with each other, where each node translates its inputs into a binary output using a step function. The reason why everyone is so excited about neural networks is that they are able to derive insights from large data sets (ideally, you need ½ billion data points to get reliable results) without programmers and scientists providing many explicit rules.
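
To make the node idea concrete, here is a minimal sketch of a single node with a step function. The names `step` and `neuron` are made up for this example, and the weights and bias are picked by hand so the node behaves like a logical AND; in a real neural network those numbers are learned from data.

```python
def step(x):
    """Step function: fire (1) if the input reaches the threshold, else stay off (0)."""
    return 1 if x >= 0 else 0


def neuron(inputs, weights, bias):
    """Weigh each input, add them up with a bias, and pass the sum through the step function."""
    weighted_sum = sum(i * w for i, w in zip(inputs, weights)) + bias
    return step(weighted_sum)


# Hand-picked (not learned) values: the node only fires when both inputs are 1.
and_weights = [1.0, 1.0]
and_bias = -1.5

for a in (0, 1):
    for b in (0, 1):
        print(a, b, neuron([a, b], and_weights, and_bias))
```

A network is many of these nodes wired together, with the output of one layer of nodes becoming the input of the next.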

If you want to learn more about ML, TensorFlow is a very good place to start, and they have a good tutorial.

Personally, I find it interesting because I love the multi-disciplinary approach to problem-solving: the idea that someone understood math and how computers worked, and that they also understood biology and the anatomy of brains. Finally, they had the presence of mind to think “hmm, I wonder if I could take the logic behind how a brain works and apply it to computer algorithms”. The underlying theory behind what is happening under the hood is way out of the scope of this talk (maybe I can talk about it later). But nevertheless, it’s good to have a general idea of what neural networks are.

Conclusion

  • The key takeaway for this talk is that ML and big data are eating our world.
  • It seems intimidating at first, but there are many tools out there to learn how it works, and much of the hard math is already being handled by very smart people.
  • Instead of joining in the hype and just talking about it, we should try and use these powerful technologies to build cool, useful things.

 

References

1. I forget the exact number, but I’m just repeating what Scott Galloway said