  There are many different approaches or models in machine learning, but generally, they can be broken down into two major categories called supervised learning and unsupervised learning.
Vocabulary and Definitions
- Model: An algorithm or an approach to a problem
- Probability: A means of expressing how likely it is that an event will occur, or a way of measuring how close a value might be to the actual correct value
- Supervised learning: A case where a machine intelligence learns from labeled examples and is tasked with predicting a category or a quantity
- Unsupervised learning: A case where a computer analyzes unlabeled data, has no previously labeled examples to learn from, and tries to identify patterns in the data
- Classification: A supervised machine learning model that makes a prediction about how a piece of data should be categorized
- Regression: A supervised machine learning model that attempts to predict a quantity or a number
- Clustering: An unsupervised machine learning model that attempts to group similar examples together
Further Reading
- Ethical Design: Treehouse course - Stage 3, in particular, covers machine learning.
                      There are many different approaches,
or models, in machine learning.
                      0:00
                    
                    
                      But generally,
they can be broken down into two
                      0:04
                    
                    
                      major categories called supervised
learning and unsupervised learning.
                      0:07
                    
                    
                      I'm going to use a lot of new terms
here that might be confusing at first.
                      0:13
                    
                    
                      But don't worry,
                      0:17
                    
                    
                      I'll break each one down a little
further after the initial explanations.
                      0:18
                    
                    
                      You might wanna pause the video
periodically to read the teacher notes as
                      0:23
                    
                    
                      each concept is introduced.
                      0:27
                    
                    
                      You might also need to go back and
rewatch parts of a video to review.
                      0:28
                    
                    
                      Let's start with an important
vocabulary word.
                      0:34
                    
                    
                      When you hear the word model or algorithm
in reference to machine learning,
                      0:38
                    
                    
                      it essentially means
an approach to the problem.
                      0:44
                    
                    
                      A model in machine learning is just like
an architectural model or a doll house.
                      0:47
                    
                    
                      It's a simplification that
attempts to simulate or
                      0:53
                    
                    
                      demonstrate some aspect of the real world.
                      0:57
                    
                    
                      Meteorologists use weather models
to try and forecast the weather.
                      1:01
                    
                    
                      Because even all the computing
power in the world
                      1:06
                    
                    
                      couldn't perfectly simulate
every tiny air particle and
                      1:09
                    
                    
                      temperature and pressure change
that would influence the outcome.
                      1:13
                    
                    
                      So instead, we can use a model
to simplify the problem and
                      1:17
                    
                    
                      get a useful approximation of the result.
                      1:21
                    
                    
                      A model in machine learning
is typically probabilistic.
                      1:25
                    
                    
                      In other words, it usually does
not produce an exact result.
                      1:29
                    
                    
                      Rather, it makes a prediction with
a corresponding percentage of confidence.
                      1:34
                    
                    
                      This isn't something you need to
understand in great detail, but
                      1:39
                    
                    
                      it's good to know the basics.
                      1:43
                    
                    
                      Probability is a means of expressing how
likely it is that an event will occur, or
                      1:46
                    
                    
                      a way of measuring how close a value
might be to the actual correct value.
                      1:51
                    
                    
                      Probability is typically quantified
as a value between 0 and 1,
                      1:58
                    
                    
                      with 0 meaning the event is impossible,
and 1 meaning complete certainty.
                      2:03
                    
                    
                      For example, if you're rolling a die and
hoping to roll a 2,
                      2:08
                    
                    
                      you have a 1 in 6 chance of rolling a 2,
                      2:13
                    
                    
                      because the die has 6 sides,
and only one of them is a 2.
                      2:17
                    
                    
                      1 divided by 6 is 0.1666 repeating,
                      2:22
                    
                    
                      or about a 16.7% chance.
                      2:27
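
The die arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the helper name `probability` is made up for illustration, and it assumes every outcome is equally likely, as with a fair die.

```python
from fractions import Fraction


def probability(favorable_outcomes: int, total_outcomes: int) -> Fraction:
    """Probability of an event when every outcome is equally likely."""
    return Fraction(favorable_outcomes, total_outcomes)


# A six-sided die has exactly one face showing a 2.
p_two = probability(1, 6)

print(p_two)                   # 1/6
print(f"{float(p_two):.1%}")   # 16.7%
```

The result always falls between 0 and 1, which is the same scale a model uses when it reports how confident it is in a prediction.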
                    
                    
                      This is relevant to machine
learning because if a prediction is
                      2:30
                    
                    
                      far outside the norm, the model might
have low confidence in the answer,
                      2:35
                    
                    
                      because it's highly probable
that it's not correct.
                      2:40
                    
                    
                      If the prediction matches
up almost perfectly with
                      2:44
                    
                    
                      an existing example in a data set,
then the confidence will be very high.
                      2:47
                    
                    
                      Now, with the definition
of a model in mind,
                      2:53
                    
                    
                      let's take a look at the two major
categories of machine learning.
                      2:56
                    
                    
                      The first of the two categories for
machine learning approaches or
                      3:01
                    
                    
                      models is called Supervised Learning.
                      3:05
                    
                    
                      Supervised learning is when
a machine intelligence
                      3:09
                    
                    
                      is tasked with predicting a category or
a quantity.
                      3:12
                    
                    
                      Predicting a category or
                      3:17
                    
                    
                      a quantity comprises the two
subcategories of supervised learning,
                      3:19
                    
                    
                      which are called classification and
regression, respectively. Put another way:
                      3:24
                    
                    
                      A classifier looks at a piece of data and
                      3:30
                    
                    
                      tries to categorize it, or,
in other words, classify it.
                      3:33
                    
                    
                      And a regression tries to
predict a quantity or a number.
                      3:38
                    
                    
                      The second of the two major
categories is Unsupervised Learning.
                      3:43
                    
                    
                      Unsupervised learning is when
a computer analyzes unlabeled data, and
                      3:48
                    
                    
                      has no previous examples, and
tries to identify patterns in the data.
                      3:53
                    
                    
                      One of the most common subcategories
of unsupervised learning is called
                      3:58
                    
                    
                      clustering, which are models that attempt
to group similar things together.
                      4:03
                    
                    
                      Because learning is unsupervised,
                      4:08
                    
                    
                      the model's definition of similar
might be different than our own.
                      4:11
                    
                    
                      Regressions, classification,
and clustering are not the only
                      4:16
                    
                    
                      approaches in the two categories of
supervised and unsupervised learning.
                      4:20
                    
                    
                      There are many more, and
                      4:25
                    
                    
                      you should check out the notes associated
with this video for more resources.
                      4:26
                    
                    
                      That's a lot of concepts all at once.
                      4:31
                    
                    
                      So now, let's focus and break things
down further by thinking about
                      4:33
                    
                    
                      an example application for one of these
starting with supervised learning, and
                      4:38
                    
                    
                      one of its subcategories, classification.
                      4:43
                    
                    
                      Let's say you want to classify
an email as spam or not spam.
                      4:48
                    
                    
                      You could give a machine intelligence
millions of email messages that
                      4:54
                    
                    
                      are already labeled as not spam, and
millions that are labeled as spam.
                      4:58
                    
                    
                      With each example message, you would
identify features of the data, like
                      5:06
                    
                    
                      the subject line, the sender, the body of
the email, the attachments, and so forth.
                      5:11
                    
                    
                      Then, when a new email comes through,
the machine intelligence can refer
                      5:18
                    
                    
                      to all of the features of the spam
and not-spam messages and
                      5:23
                    
                    
                      decide how closely the new email
matches any patterns in the data.
                      5:28
                    
                    
                      Then, it assigns a category
to the new message
                      5:34
                    
                    
                      with some percentage of confidence.
                      5:37
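
As a rough illustration of the spam example, here is a toy classifier. It is a sketch under several assumptions, not the classifier built later in this course: the four training messages are invented, the only feature is the words in the message, and the technique (naive Bayes with add-one smoothing) is one common choice among many.

```python
import math
from collections import Counter

# Hypothetical, hand-labeled examples (a real data set would be far larger).
TRAINING = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("meeting notes for monday", "not spam"),
    ("lunch on monday with the team", "not spam"),
]


def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {"spam": Counter(), "not spam": Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.split())
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Return (label, confidence) using naive Bayes with add-one smoothing."""
    vocabulary = set().union(*word_counts.values())
    log_scores = {}
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        # Start from how common the label is in the training data.
        score = math.log(label_counts[label] / sum(label_counts.values()))
        for word in text.split():
            # Add-one smoothing keeps unseen words from zeroing out the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        log_scores[label] = score
    # Convert the log scores into confidences that sum to 1.
    highest = max(log_scores.values())
    weights = {label: math.exp(s - highest) for label, s in log_scores.items()}
    best = max(weights, key=weights.get)
    return best, weights[best] / sum(weights.values())


word_counts, label_counts = train(TRAINING)
label, confidence = classify("claim your free prize", word_counts, label_counts)
print(label, f"{confidence:.0%}")
```

The new message shares most of its words with the examples labeled spam, so the sketch assigns that category with a high confidence. A message that resembles neither group would come back with a confidence closer to 50%.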
                    
                    
                      In this course, we're going to create our
own classifier that will attempt to label
                      5:41
                    
                    
                      new entries into a data set
based on the existing data.
                      5:46
                    
                    
                      However, for completeness, let's take a
quick look at regressions and clustering.
                      5:50
                    
                    
                      A regression is another type
of supervised learning.
                      5:57
                    
                    
                      Instead of attempting to categorize data,
it tries to predict quantities.
                      6:01
                    
                    
                      For example,
say you're opening a restaurant, and
                      6:06
                    
                    
                      you're trying to decide how
dishes should be priced.
                      6:10
                    
                    
                      A regression algorithm could look at a
data set of other restaurants in the area,
                      6:13
                    
                    
                      and use features like the average price
of a dish, the relative distance to
                      6:19
                    
                    
                      the new restaurant's location, the average
review score from Yelp, and so forth.
                      6:24
                    
                    
                      Based on that information, the regression
could try and predict appropriate prices.
                      6:29
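
The restaurant example can be sketched with the simplest kind of regression, a straight line fitted by least squares. Everything specific here is an assumption for illustration: the data is invented, only one feature (the average review score) is used, and `fit_line` is a hypothetical helper rather than a library function.

```python
# Hypothetical data: (average review score, average dish price in dollars)
NEARBY_RESTAURANTS = [(3.0, 10.0), (3.5, 12.0), (4.0, 14.0), (4.5, 16.0)]


def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in points)
    variance = sum((x - mean_x) ** 2 for x, _ in points)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(NEARBY_RESTAURANTS)

# Predict a price for a new restaurant expecting a 4.2 review score.
predicted_price = slope * 4.2 + intercept
print(f"Predicted average dish price: ${predicted_price:.2f}")
```

Unlike the classifier, the output here is a number rather than a category, which is the difference between regression and classification.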
                    
                    
                      The last approach I mentioned
is called clustering,
                      6:36
                    
                    
                      which is one of the biggest categories
of approaches to unsupervised learning.
                      6:40
                    
                    
                      Have you ever been on a social network and
                      6:46
                    
                    
                      been shown suggested friends or
targeted advertisements?
                      6:48
                    
                    
                      Or have you watched something
on a video sharing site
                      6:52
                    
                    
                      that shows you suggested videos?
                      6:56
                    
                    
                      [SOUND] How do these websites
know what to show you?
                      6:58
                    
                    
                      Or what content is similar?
                      7:01
                    
                    
                      A cluster analysis can attempt to group
                      7:04
                    
                    
                      data by similarity without attempting
to apply any type of labels.
                      7:07
                    
                    
                      This cluster analysis might help to
identify hidden patterns in the data,
                      7:12
                    
                    
                      and automatically group things together
based on features of the data.
                      7:18
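
One well-known clustering technique is k-means, sketched below. This is an assumption-laden illustration, not a description of how any particular website works: the viewer data is invented, and the starting centroids are simply the first k points, which a real implementation would choose more carefully.

```python
# Hypothetical viewers: (hours of cooking videos, hours of gaming videos)
VIEWERS = [(9, 1), (1, 9), (8, 2), (2, 8), (10, 1)]


def squared_distance(a, b):
    """Squared straight-line distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, iterations=10):
    """Group points into k clusters; returns one cluster index per point."""
    centroids = list(points[:k])  # simplistic start: the first k points
    assignments = []
    for _ in range(iterations):
        # Step 1: assign every point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Step 2: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments


print(k_means(VIEWERS, k=2))  # [0, 1, 0, 1, 0]
```

No labels such as "cooking fan" or "gamer" are ever supplied. The groups emerge only from which viewers sit close together in the data.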
                    
                    
                      This was a broad overview of the different
approaches to machine learning.
                      7:23
                    
                    
                      And, as you can imagine, it's a deep
topic with lots more to explore.
                      7:28
                    
                    
                      I encourage you to check the notes
associated with this video
                      7:33
                    
                    
                      to help review what we've learned.
                      7:36
                    
                    
                      I also want to mention again,
                      7:39
                    
                    
                      don't worry if you're not
understanding everything right away.
                      7:41
                    
                    
                      In upcoming videos, we'll take a closer
look at some of these concepts.
                      7:44
                    
                    
                      Machine learning is a huge area of study,
and
                      7:49
                    
                    
                      it might take some review for
you to fully absorb everything.
                      7:52
                    
                    
                      Remember, you can always go back and
                      7:57
                    
                    
                      rewatch videos if you
feel like you need to.
                      7:59
                    
              