Determine the context necessary for working with our dataset.
Understanding the context behind your
analysis may affect how you clean
0:00
the data.
0:05
Here are a few questions to ask
yourself when reviewing the dataset.
0:06
[MUSIC]
0:10
Do you know enough about
the topic of the dataset?
0:13
Why are you working with this dataset?
0:17
How do you plan on using the dataset?
0:19
What questions do you need to answer?
0:23
Who will be viewing the analysis?
0:25
For instance, when sharing analysis with
people from the US, you'll probably wanna
0:30
have numbers in a format they're
familiar with like inches or pounds.
0:35
If you're presenting the information
to people from really anywhere else in
0:39
the world, it might make sense to share
the numbers in centimeters or kilograms.
0:44
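As a rough illustration of tailoring units to the audience, here is a minimal Python sketch. It assumes the measurements are stored in centimeters and kilograms. The function name `format_measurements` and the `audience` labels are hypothetical and are not part of the course materials.

```python
# Conversion factors (both are exact by definition).
CM_PER_INCH = 2.54
KG_PER_POUND = 0.45359237


def format_measurements(height_cm, weight_kg, audience):
    """Return a height/weight string in units the audience is likely to know.

    `audience` is a hypothetical label: "US" gives inches and pounds,
    anything else gives centimeters and kilograms.
    """
    if audience == "US":
        height_in = height_cm / CM_PER_INCH
        weight_lb = weight_kg / KG_PER_POUND
        return f"{height_in:.1f} in, {weight_lb:.1f} lb"
    return f"{height_cm:.1f} cm, {weight_kg:.1f} kg"


# The same record presented for two different audiences.
print(format_measurements(100, 10, "US"))
print(format_measurements(100, 10, "elsewhere"))
```

Storing one canonical unit and converting only at presentation time keeps the cleaned data consistent, whoever ends up viewing the analysis.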
Understanding the context also means
knowing enough about the topic to be able
0:49
to make accurate predictions and
assumptions about the data.
0:54
Let's use the Pokemon
dataset as an example.
0:58
You can find a link to the dataset
in the teacher's notes below.
1:01
If I were going to use this dataset for
analysis but
1:05
I didn't know anything about Pokemon,
I would start looking for
1:08
resources to help me
understand what Pokemon are.
1:12
What does a weakness or type mean and
how do they interact with one another?
1:15
Understanding the terminology of the
dataset and how different elements relate
1:21
to each other will also help you
clean the dataset properly and
1:26
start building out questions for
your analysis.
1:29
With this dataset, for example,
once I understand how weaknesses and
1:33
types work, I could answer questions like,
1:37
what type of Pokemon has
the most weaknesses on average?
1:40
Or which weakness shows up the most?
1:44
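To make those two questions concrete, here is a minimal Python sketch. It runs on a tiny hand-typed sample, not on the real dataset. The field names `name`, `type`, and `weaknesses` are assumptions about how the records are laid out, and both helper functions are hypothetical.

```python
from collections import Counter, defaultdict

# Tiny illustrative sample, NOT the full dataset. Field names are assumed.
sample = [
    {"name": "Bulbasaur", "type": ["Grass", "Poison"],
     "weaknesses": ["Fire", "Ice", "Flying", "Psychic"]},
    {"name": "Ivysaur", "type": ["Grass", "Poison"],
     "weaknesses": ["Fire", "Ice", "Flying", "Psychic"]},
    {"name": "Charmander", "type": ["Fire"],
     "weaknesses": ["Water", "Ground", "Rock"]},
    {"name": "Squirtle", "type": ["Water"],
     "weaknesses": ["Electric", "Grass"]},
]


def average_weaknesses_by_type(records):
    """Map each type to the mean number of weaknesses of Pokemon with that type."""
    counts = defaultdict(list)
    for record in records:
        # A Pokemon with two types contributes to both types' averages.
        for pokemon_type in record["type"]:
            counts[pokemon_type].append(len(record["weaknesses"]))
    return {t: sum(values) / len(values) for t, values in counts.items()}


def weakness_counts(records):
    """Count how many Pokemon list each weakness."""
    return Counter(w for record in records for w in record["weaknesses"])


averages = average_weaknesses_by_type(sample)
print(averages)
print(weakness_counts(sample).most_common(3))
```

Notice that even this small sketch depends on context: you have to know that a Pokemon can have more than one type before you can decide how to group the records.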
Once you feel you have enough context
to move on, you can start looking for
1:48
what needs to be cleaned.
1:52
Let's take a look at the different
types of bad data to look out for
1:54
in the next video.
1:57