      You have completed Preparing Data for Analysis!
  Determine the context necessary for working with our dataset.
                      Understanding the context behind your
analysis may affect how you clean
                      0:00
                    
                    
                      the data.
                      0:05
                    
                    
                      Here are a few questions to ask
yourself when reviewing the dataset.
                      0:06
                    
                    
                      [MUSIC]
                      0:10
                    
                    
                      Do you know enough about
the topic of the dataset?
                      0:13
                    
                    
                      Why are you working with this dataset?
                      0:17
                    
                    
                      How do you plan on using the dataset?
                      0:19
                    
                    
                      What questions do you need to answer?
                      0:23
                    
                    
                      Who will be viewing the analysis?
                      0:25
                    
                    
                      For instance, when sharing analysis with
people from the US, you'll probably want to
                      0:30
                    
                    
                      have numbers in a format they're
familiar with, like inches or pounds.
                      0:35
                    
                    
                      If you're presenting the information
to people from really anywhere else in
                      0:39
                    
                    
                      the world, it might make sense to share
the numbers in centimeters or kilograms.
                      0:44
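
                      A minimal sketch of the audience point above, assuming the data stores height in centimeters and weight in kilograms. The function names and the `audience` parameter are hypothetical and are not part of the course materials; only the conversion factors (1 inch = 2.54 cm, 1 pound = 0.45359237 kg) are fixed definitions.

```python
# Exact conversion factors by definition.
CM_PER_INCH = 2.54
KG_PER_POUND = 0.45359237


def to_us_units(height_cm, weight_kg):
    """Convert a metric height and weight to inches and pounds."""
    return height_cm / CM_PER_INCH, weight_kg / KG_PER_POUND


def format_measurements(height_cm, weight_kg, audience="metric"):
    """Format measurements for the audience (hypothetical values: 'us' or 'metric')."""
    if audience == "us":
        inches, pounds = to_us_units(height_cm, weight_kg)
        return f"{inches:.1f} in, {pounds:.1f} lb"
    return f"{height_cm:.1f} cm, {weight_kg:.1f} kg"


print(format_measurements(70.0, 6.9, audience="us"))      # 27.6 in, 15.2 lb
print(format_measurements(70.0, 6.9, audience="metric"))  # 70.0 cm, 6.9 kg
```

                      Keeping the stored values in one unit system and converting only when presenting is one way to avoid mixing units during cleaning.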
                    
                    
                      Understanding the context also means
knowing enough about the topic to be able
                      0:49
                    
                    
                      to make accurate predictions and
assumptions about the data.
                      0:54
                    
                    
                      Let's use the Pokemon
dataset as an example.
                      0:58
                    
                    
                      You can find a link to the dataset
in the teacher's notes below.
                      1:01
                    
                    
                      If I were going to use this dataset for
analysis but
                      1:05
                    
                    
                      I didn't know anything about Pokemon,
I would start looking for
                      1:08
                    
                    
                      resources to help me
understand what Pokemon are.
                      1:12
                    
                    
                      What does a weakness or type mean and
how do they interact with one another?
                      1:15
                    
                    
                      Understanding the terminology of the
dataset and how different elements relate
                      1:21
                    
                    
                      to each other will also help you
clean the dataset properly and
                      1:26
                    
                    
                      start building out questions for
your analysis.
                      1:29
                    
                    
                      With this dataset for example,
once I understand how weaknesses and
                      1:33
                    
                    
                      types work, I could answer questions like,
                      1:37
                    
                    
                      what type of Pokemon has
the most weaknesses on average?
                      1:40
                    
                    
                      Or which weakness shows up the most?
                      1:44
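
                      The two questions above could be sketched like this. The rows are a small hand-typed sample, not the course dataset, and the field names `type` and `weaknesses` are assumptions about how such data might be laid out. The sketch also assumes one type per Pokemon, which the real dataset may not guarantee.

```python
from collections import Counter, defaultdict

# Hand-typed sample rows for illustration only; the real dataset's
# column names and structure may differ.
pokemon = [
    {"name": "Tangela", "type": "Grass",
     "weaknesses": ["Fire", "Ice", "Poison", "Flying", "Bug"]},
    {"name": "Charmander", "type": "Fire",
     "weaknesses": ["Water", "Ground", "Rock"]},
    {"name": "Squirtle", "type": "Water",
     "weaknesses": ["Electric", "Grass"]},
    {"name": "Vulpix", "type": "Fire",
     "weaknesses": ["Water", "Ground", "Rock"]},
    {"name": "Pikachu", "type": "Electric",
     "weaknesses": ["Ground"]},
]


def average_weaknesses_by_type(rows):
    """Return a dict mapping each type to its mean number of weaknesses."""
    counts = defaultdict(list)
    for row in rows:
        counts[row["type"]].append(len(row["weaknesses"]))
    return {ptype: sum(c) / len(c) for ptype, c in counts.items()}


def most_common_weakness(rows):
    """Return the (weakness, count) pair that appears most often."""
    tally = Counter(w for row in rows for w in row["weaknesses"])
    return tally.most_common(1)[0]


averages = average_weaknesses_by_type(pokemon)
print(max(averages, key=averages.get))  # Grass
print(most_common_weakness(pokemon))    # ('Ground', 3)
```

                      Neither function is meaningful without the context discussed here: you need to know that a weakness is a type, and that one Pokemon can list several, before counting them makes sense.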
                    
                    
                      Once you feel you have enough context
to move on, you can start looking for
                      1:48
                    
                    
                      what needs to be cleaned.
                      1:52
                    
                    
                      Let's take a look at the different
types of bad data to look out for
                      1:54
                    
                    
                      in the next video.
                      1:57
                    
              