Heads up! To view this whole video, sign in with your Courses account or enroll in your free 7-day trial. Sign In Enroll
Preview
Start a free Courses trial
to watch this video
Clean the data imported from the CSV file for entry into the database.
Related Discussions
Have questions about this video? Start a discussion with the community and Treehouse staff.
Sign upRelated Discussions
Have questions about this video? Start a discussion with the community and Treehouse staff.
Sign up
Back inside of our add_csv function,
0:00
let's create a variable
called date = clean_date,
0:04
and we're going to need
to pass in our date,
0:10
which if we remember I'll
scroll a little bit 0, 1, 2.
0:14
It will be item number two in the row.
0:20
So row 2, well, it's the third item,
I mean, but index 012.
0:23
Cool, and we know we aren't gonna have
to do anything with title and authors.
0:29
Let's go ahead and do this now.
0:33
title =row[0], and
0:34
author = row[1].
0:38
Okay, sweet the date's all
cleaned up, onto the price.
0:42
Let's create a function just like we
did with clean_date called clean_price.
0:48
Same thing,
it's going to take a price string.
1:00
And inside we need to convert the number
to a float because it has a decimal point.
1:06
So we need to take the number,
1:13
and let's say price_float
1:18
= float(price_str).
1:22
And I'm gonna print(price_float).
1:27
And we'll see it in the console.
1:34
So I'm gonna pop down here,
change clean data clean price and
1:38
then copy a price from console.
1:44
Cool, scroll back to where I was.
1:53
And let's run it so
we see I get a number 28.84.
1:56
So that part is working.
2:00
Let's get back up here.
2:04
But the next thing I'm going to do is I'm
going to turn that number into an integer,
2:06
because we told our column,
the numbers are going to be an integer.
2:11
So we need to convert it.
2:16
And also because working with floats
can lead to some interesting issues
2:17
which come up a lot when dealing
with money and currency conversions.
2:22
I'll link some resources in the teacher's
notes if you wanna dive into this, but
2:28
it is kind of a rabbit
hole just a heads up.
2:32
So let's do return
2:37
int(price_float * 100),
2:41
so this essentially makes
2:48
our price in cents, so
2:53
2884 cents or $28.84.
2:58
Little math fun, and we don't need
this print statement anymore.
3:07
Actually, we can remove this one too,
just so we don't have extra ones.
3:10
Yeah, there we go.
3:15
Cool, so now we have our price and
date cleaned, let's make sure we use it.
3:17
So inside of our add_csv,
3:24
let's do price = clean_price(row3).
3:27
And if you wanna check all scroll
back up 0,1, 2 and 3, perfect.
3:34
Okay, now we can create a new book.
3:42
We have all the information we need.
3:44
I'm actually gonna remove this just
to clear up our console a little bit.
3:47
And we're gonna do new_book
3:52
= Book(title=title,
3:57
author=author, published_date=date,
4:05
price=price) awesome.
4:14
Then we need to add it to session,
session.add(new_book) and
4:21
then outside of our loop
let me make that smaller so
4:27
l can show more of our function.
4:31
There we go.
Outside of our loop we're going to commit.
4:35
So session.commit.
4:37
So, for each row in our database,
it's going to set our title, author, date,
4:42
and price, create a new book and add it to
the session and that will loop back and
4:47
do the next book in our row or
the next row in our database.
4:52
And then create a book at it, and then
after all of those books are added to our
4:56
session, then we're going to finally
commit them to the database.
5:01
Great, let's run it in the console and
5:08
to make sure that we are getting
all of our books, and delete that.
5:11
Uncomment that.
5:19
Let's loop through all of our books
to make sure they've been added
5:21
to the database.
5:24
for book in session.query(Book);
5:25
print(book) okay, cool.
5:34
Now, I'm gonna make this
almost full screen here.
5:41
There we go.
5:47
So we can see the entire console or
as much of the console as we can.
5:48
And let's run the file.
5:52
Awesome, all of our books are here,
one, two, three,
5:54
four, five, six, seven, eight, nine, ten.
5:59
Ten books, amazing.
6:04
One slight problem though,
if I run the file again, now we have,
6:07
More than ten books because
now we have duplicates.
6:17
Back inside of our csv loop,
we can check for the book and
6:23
make sure we only add it to the database
if it doesn't already exist.
6:28
So we're gonna create
6:35
a variable called book_in_db =
6:40
session.query(book).filter(Book.title
6:48
= = row[0]).one_or_none.
6:59
And this is going to either return
the book if there is one or
7:08
return none if there isn't a book.
7:15
Then we can use that in
a conditional statement.
7:19
So if book_in_db == None,
7:24
and we can tab all of that over so
7:28
that it only gets created if it doesn't
7:32
already exist in the database.
7:38
Save and since we already know
the database has duplicates,
7:42
we can actually just delete it.
7:48
And then when we run the file again It
will create the database again for us.
7:51
And we can check.
7:59
Yep, that looks like about ten books.
8:03
And if I run it again,
8:05
I should still have only about ten books.
8:08
Awesome, no more duplicates.
8:14
Great work Pythonistas.
8:17
Add your changes to get git add.
8:19
Commit them, git commit
8:24
-m added csv to db.
8:30
And push them.
8:35
Amazing
8:43
You need to sign up for Treehouse in order to download course files.
Sign upYou need to sign up for Treehouse in order to set up Workspace
Sign up