Determine which author in the dataset has written the most pages and more.
Welcome back.
0:00
In this video we're going to tackle
some questions around pages.
0:01
Let's add them now.
0:04
Markdown, and here we go.
0:08
Who wrote the most pages?
0:12
Another markdown,
what's an author's average page count?
0:18
And one more.
0:32
How many books have been written
with less than 200 pages?
0:35
Let's tackle who wrote the most pages.
0:45
First, we need to find all of
the unique authors in the dataset
0:52
since many authors wrote multiple books.
0:57
We're going to run books,
1:00
where authors
1:03
are unique.
1:07
And you can see it gives us an array
of all the different authors.
1:12
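As a rough sketch of this step: the snippet below assumes the DataFrame is called `books` and has `authors` and `num_pages` columns, as the narration suggests. The rows are a made-up toy stand-in, not the course dataset.

```python
import pandas as pd

# Toy stand-in for the course dataset; these rows are invented for illustration.
books = pd.DataFrame({
    "title": ["Book A", "Book B", "Book C", "Book D"],
    "authors": ["Author One", "Author Two", "Author One", "Author Three"],
    "num_pages": [310, 150, 420, 95],
})

# unique() returns an array listing each author once,
# in the order they first appear in the column.
all_authors = books["authors"].unique()
print(all_authors)
```

Even though Author One appears twice in the toy data, the array only lists them once.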
Now we need to work out
how to get the sum for
1:17
all of the pages related
to a specific author.
1:20
Let's set this equal to a variable
right now, all_authors.
1:25
And then let's do books.loc,
1:33
where books, authors, and
1:38
let's just do Stephen King.
1:43
Cuz I know he's a relatively
famous author, and
1:48
I know he's written multiple books, so
I feel like he's a good example to use.
1:52
Num_pages.
1:59
Okay, so we can see all of the IDs,
and then all of the page counts, or
2:03
the number of pages for all of the books
that have the author of Stephen King.
2:08
So it filtered the books down to all
the books that have an authors value that
2:13
equals Stephen King.
2:17
And it's only returning
the number of pages column.
2:19
So we can see it's quite a lot.
2:23
Now, a fun thing we can do here at the end
that makes our lives a lot easier.
2:25
We just add .sum, and
it will sum it all up for us.
2:29
So we get a total of 1,800.
2:34
No sorry, we get a total of
18,219 pages for Stephen King.
2:38
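Here's a minimal sketch of that filter-then-sum step. It assumes the same `books`, `authors`, and `num_pages` names as the narration, and uses invented toy rows with a placeholder author name in place of Stephen King.

```python
import pandas as pd

# Toy stand-in for the course dataset; these rows are invented for illustration.
books = pd.DataFrame({
    "authors": ["Author One", "Author Two", "Author One", "Author Three"],
    "num_pages": [310, 150, 420, 95],
})

# .loc takes a row filter (a boolean mask) and a column label.
# This keeps only Author One's rows and only the num_pages column.
author_pages = books.loc[books["authors"] == "Author One", "num_pages"]
print(author_pages)

# .sum() adds up the page counts that survived the filter.
total_pages = author_pages.sum()
print(total_pages)
```

The first print shows the row IDs next to each page count, and the second shows the single total.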
Now, let's think this through.
2:46
We know how to get all of our authors, and
2:48
we know how to get a single author's
page count to see who has the most.
2:51
So we're going to need to compare
all of the authors' page totals, and
2:57
then see who has the highest value.
3:02
There are a few different
ways to tackle this.
3:05
One way is to create a max variable,
3:07
I'm going to put it up here at the top,
and set it equal to zero.
3:11
And then we can loop through our
authors to calculate their page total.
3:18
Compare it to this max value.
3:23
And if it's larger,
then we can update the value.
3:26
And let's also hold the author's name
as well so we know who we end up with.
3:30
And I'm going to do top_author, and
I'm going to set it equal to None for
3:35
now so
that we can set it as an author's name.
3:40
So let's turn this into a loop.
3:44
So we need to get all of our authors and
now we need to loop through them all.
3:47
So for author in all_authors.
3:52
We're going to do, tab this over, and
3:58
this is going to be our
total_pages equals.
4:02
And instead of Stephen King,
we need to pass in our author so
4:08
that we get to each author
as it loops through.
4:13
Awesome, and
then next we need to check if their
4:22
total_pages is greater than
the current max value.
4:27
If it is, then the max needs to
now be set equal to total pages so
4:35
that they now have the top spot.
4:40
And our top author now is going
to be set equal to that author.
4:45
And then let's print out the max value.
4:53
And let's print out the author or
the top_author.
4:57
Actually, it doesn't matter cuz
they will be the same thing.
5:01
And then at the end, outside of our for
5:05
loop, I'm going to print the max again,
5:10
and print the author.
5:15
And this is just so
we can see as the for loop is running,
5:20
which authors kind of
take over the top spot,
5:24
the leaderboard and then at the end,
who came out on top.
5:28
And I think something, I think this
one I need to do as top_author.
5:34
That was my mistake.
5:40
Let's run it again.
5:42
Okay, so we can see it's running and
we got J.K. Rowling, and
5:44
then another form of J.K. Rowling, cuz
sometimes it's not splitting them up but
5:47
that's okay for
what we're doing right now.
5:52
And then we got J.R.R. Tolkien, and
then we got Stephen King, and
5:55
then Stephen King ended up being
our top author with 18,219 pages.
5:59
Awesome.
6:04
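Putting the loop together might look something like the sketch below. It uses invented toy rows, and it names the running total `max_pages` rather than `max` so it doesn't shadow Python's built-in `max` function; that rename is a choice made here, not something from the video.

```python
import pandas as pd

# Toy stand-in for the course dataset; these rows are invented for illustration.
books = pd.DataFrame({
    "authors": ["Author Two", "Author One", "Author Three",
                "Author One", "Author Two"],
    "num_pages": [150, 310, 95, 420, 200],
})

all_authors = books["authors"].unique()

max_pages = 0      # highest page total seen so far
top_author = None  # author who holds that total

for author in all_authors:
    # Total pages for the author in this pass of the loop.
    total_pages = books.loc[books["authors"] == author, "num_pages"].sum()
    if total_pages > max_pages:
        # This author takes over the top spot.
        max_pages = total_pages
        top_author = author
        print(max_pages, top_author)

# Final result after every author has been checked.
print(max_pages, top_author)

# One of the other ways to tackle this: group by author, sum, and take the largest.
page_totals = books.groupby("authors")["num_pages"].sum()
print(page_totals.idxmax(), page_totals.max())
```

The prints inside the loop show the leaderboard changing hands, and the `groupby` lines at the end are a quick cross-check that lands on the same author.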
Now our next question,
what's an author's average page count?
6:05
We can use the same count code from above.
6:09
So.
6:12
We got our total pages.
6:15
I'm just going to copy this
6:17
row here, and paste it.
6:21
And I'm going to do their pages.
6:25
And then the same thing as before,
6:29
I'm just going to use
Stephen King as our example.
6:31
Just cuz he just won the top
number of pages written.
6:37
And then, so we've got their number of
pages, now we need to know the number of
6:45
books that they've written.
6:49
So their books, we can do
6:51
books where the authors is
6:56
equal to Stephen King.
7:02
And we can do.
7:07
Okay, so we can wrap this in
parentheses and then do value_counts.
7:11
And it looks like we have some True and
False counts for when that is equal.
7:19
And let's return just this first value.
7:25
So we can see that they have 40
books where the author ends up being
7:27
Stephen King.
7:32
So this will be their books.
7:35
Now for a bit of math.
7:41
Their average_pages is
7:43
their pages divided by their books.
7:48
And then we can print their average pages,
7:56
and we get about 455 pages per book.
8:02
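A sketch of the average calculation, again on invented toy rows with a placeholder author. The variable names `their_pages`, `their_books`, and `average_pages` are guesses based on the narration. To count the books, this version sums the True values in the comparison instead of indexing into `value_counts`; that is a swapped-in shortcut, and the `value_counts` output is printed alongside it for comparison.

```python
import pandas as pd

# Toy stand-in for the course dataset; these rows are invented for illustration.
books = pd.DataFrame({
    "authors": ["Author One", "Author Two", "Author One", "Author Three"],
    "num_pages": [310, 150, 420, 95],
})

author = "Author One"  # swap in another name to check a different author

# Total pages, same filter-then-sum as before.
their_pages = books.loc[books["authors"] == author, "num_pages"].sum()

# True for each row written by this author, False otherwise.
is_author = books["authors"] == author
print(is_author.value_counts())  # counts of True and False rows

# True counts as 1 when summed, so this is the number of matching books.
their_books = is_author.sum()

average_pages = their_pages / their_books
print(average_pages)
```

Switching authors only means changing the `author` string at the top, as long as it matches the spelling and punctuation used in the dataset exactly.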
And then all you have to do if you wanna
see a different author is just switch out
8:10
the author's name to someone else.
8:14
I could do J.R.R. Tolkien,
make sure I spell that right.
8:16
I did not, I-E-N.
8:23
And make sure I have the right
number of periods and things, cool.
8:27
Copy it, and paste it,
and run it, 737 pages.
8:33
Wow.
8:40
I can't imagine writing
that many pages for a novel.
8:41
Lastly, we have how many books have
been written with less than 200 pages?
8:45
I wanna give you this to
try on your own first.
8:50
Pause me and see what you come up with,
then unpause me and see what I wrote.
8:53
Okay, so we need to filter our books
9:00
where the number of
pages is less than 200.
9:06
Cool.
9:14
And then to figure out how many there are,
9:15
we can actually just use
len to get the length.
9:18
And it looks like there are 2,898 books
with less than 200 pages in our dataset.
9:23
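One possible way to write this last answer, assuming the same `books` and `num_pages` names and using invented toy rows. The variable names `short_books` and `short_count` are placeholders chosen for this sketch.

```python
import pandas as pd

# Toy stand-in for the course dataset; these rows are invented for illustration.
books = pd.DataFrame({
    "title": ["Book A", "Book B", "Book C", "Book D", "Book E"],
    "num_pages": [150, 310, 95, 420, 200],
})

# Keep only the rows where the page count is strictly less than 200.
short_books = books[books["num_pages"] < 200]

# len() on a DataFrame gives the number of rows.
short_count = len(short_books)
print(short_count)
```

Note that the book with exactly 200 pages is left out, since the comparison is less than and not less than or equal to.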
Nice job, Pythonistas.
9:30