Once again, Python makes it easy to automate tasks. In this video let's take a look at how to search for files and directories.
One of the benefits that directories
give us is a logical grouping of files.
0:00
We can keep all of our cat photos,
0:04
project files or favorite rock
operas together in one place.
0:06
Often, when we're creating software for
working with directories and files,
0:10
we wanna be able to search for
particular files and directories.
0:13
Luckily, Python makes this pretty easy for
us.
0:16
So I've imported os.
0:18
I'm gonna do os.listdir, and
you can see all of the files and
0:20
directories that are currently
in this directory.
0:24
The listdir method gives us back
everything that's in the directory.
0:27
By default,
it uses the current working directory,
0:31
but we can provide it a path
right here if you wanted to.
0:33
And it'll tell you everything
that's in that path.
0:36
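As a minimal sketch of the two ways of calling listdir described above: the variable names are illustrative, and os.curdir (".") stands in for whatever path you would actually pass.

```python
import os

# With no argument, listdir() lists the current working directory.
here = os.listdir()

# With a path argument, it lists that directory instead.
# os.curdir (".") is used only so the example runs anywhere.
same = os.listdir(os.curdir)

# listdir() returns plain strings (names), in arbitrary order.
print(sorted(here))
```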
Slightly more useful though,
is the scandir method.
0:40
Now we're gonna pass this one to list
because it gives us an iterable to consume
0:43
and we wouldn't see anything good
if we didn't pass it to list.
0:48
So we can see here all
of these DirEntry objects.
0:51
Each of these DirEntry instances is an object
that represents an entry in the directory,
0:53
so it's a file, a directory,
or something else, such as a symlink.
0:57
What's cool about these is we can use them
to get some basic information about their
1:02
equivalent entry without having
to go back to the file system and
1:05
inspect a particular file.
1:09
For instance, let's look at this one here.
1:11
So, 0123.
1:15
Okay?
1:18
So, I'm going to say
files = list(os.scandir()).
1:19
And then I'm gonna say files[3].name.
1:23
And I get bootstrap-3.3.7-dist.zip
which you can get off
1:27
the getbootstrap.com website.
1:31
So let's find out if files[3] is a file, with is_file().
1:33
And it is, so cool.
1:37
So now I can get some statistics about
the object by using the stat method.
1:39
So I can do files[3] and
then I can say .stat().
1:44
And I get these stat results.
1:48
Now the one of these
that is the most useful,
1:50
the most interesting is this
one over here, this st_size.
1:52
This is the size of the file in bytes.
1:55
This is really handy if you wanted to,
for example,
1:58
flag files that are above a certain size.
2:01
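Here is a rough sketch of that size-flagging idea. The function name large_files and the 1,000,000-byte default are made up for illustration; name, is_file() and stat() are the DirEntry attributes used above. The with block assumes Python 3.6 or later.

```python
import os

def large_files(path=".", min_bytes=1_000_000):
    """Return (name, size) pairs for files of at least min_bytes.

    Hypothetical helper: the name and default threshold are
    assumptions for this example, not part of the os module.
    """
    flagged = []
    with os.scandir(path) as entries:
        for entry in entries:
            # Skip directories and anything else that isn't a file.
            if entry.is_file():
                size = entry.stat().st_size  # size in bytes
                if size >= min_bytes:
                    flagged.append((entry.name, size))
    return flagged

for name, size in large_files("."):
    print(name, size)
```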
Now one more thing to bring up.
2:04
scandir gives you a stream-like iterator,
2:05
like when you use the open function.
2:08
So if you're not consuming it right away
and then ending the block it's in,
2:10
like a for loop or a function,
2:13
or you're not using it with
a context manager like with,
2:15
you'll want to call
the close method on it.
2:19
So for instance if I had
scanner = os.scandir() I would
2:21
eventually want to do scanner.close().
2:25
That would close out the scanner and
free up the resources it holds.
2:29
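The two options just described might look like this. It is a sketch only: the variable names first and names are illustrative, and using scandir in a with block assumes Python 3.6 or later.

```python
import os

# Option 1: close the iterator explicitly when you are done with it.
scanner = os.scandir()
first = next(scanner, None)  # only partly consumed, so it is still open
scanner.close()

# Option 2: a with block closes the iterator for you on exit.
with os.scandir() as scanner:
    names = [entry.name for entry in scanner]

print(names)
```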
Now, let me show you another way to
work your way through directories.
2:32
We can use the os.walk method to
step through all of the files and
2:35
directories in a particular directory.
2:39
This isn't exactly the same
as using scandir, but
2:40
it gives us a handy way
to explore file trees.
2:43
I've already started, but I'm going to
finish a script here called tree.py.
2:46
And when I say started, I mean I've created the file.
2:51
Now inside here, in this directory,
I have a bootstrap directory and
2:54
that's full of all the files that
you get when you download Bootstrap.
3:00
So, I want to look through that.
3:02
So, I'm gonna import os, and
3:05
then I'm gonna make a new
function named treewalker.
3:06
And it's going to start at some directory.
3:10
So, my total_size is 0.
3:12
And my total_files is 0.
3:15
So for root, dirs, and
3:18
files in os.walk(start),
whatever the start directory is.
3:22
My subtotal is going to be equal
to the sum of os.path.getsize(),
3:28
which is similar to doing os.stat and
then pulling out the st_size.
3:34
But getsize just pulls that off directly.
3:38
And I'm gonna use os.path.join,
and the root, and
3:40
the name, and I'm gonna do that for
every name that's in files.
3:44
And then I'm gonna say that total_size
will have the subtotal added to it.
3:50
And I'm gonna say that file_count
will be equal to the len(files).
3:59
Total number of files over there.
4:07
And then total_files is going
to plus equal the file_count.
4:09
Now I could of course just
combine those two lines but
4:12
sometimes it's nice to
print the whole thing out.
4:14
So, I'm gonna print root and then consumes.
4:17
And I'm gonna end that with a space, and
4:24
then I'm gonna print the subtotal,
and end that with a space.
4:27
And then I'm gonna print bytes in,
and then the file count,
4:33
and I'm gonna say non-directory files.
4:39
And then out here outside of the for
loop, I'm gonna print start
4:44
contains and then total_files.
4:49
And then files with a combined size of,
4:55
and then total size, and then bytes.
4:59
So a lot of stuff to do there.
5:06
But then I'm gonna call treewalker
down here with "bootstrap".
5:08
There we go, save that,
come back over here, and
5:12
I'm gonna execute tree.py.
5:16
So bootstrap consumes zero bytes
in zero non-directory files.
5:20
The css directory consumes 1.3 million
bytes in 8 non-directory files.
5:23
The fonts directory consumes 215,000 bytes.
5:30
The js directory is 117,000.
5:34
Bootstrap in total has 16 files with
a combined size of 1.6 million bytes.
5:37
So that's pretty cool.
5:44
That's a really nice way just to
kind of quickly see what's going on.
5:46
There's a lot more to the os module, and
5:49
we're going to explore some more
of it in the next two sections.
5:51
Feel free though to explore
it a bit more on your own.
5:55
It's one of the more interesting modules
in Python, especially since dealing with
5:57
differences between operating systems
is often such a thorny area.
6:00
All right, take a little break, have
a snack or stretch, and then come back to
6:02
learn about manipulating files with
Python, instead of just looking at them.
6:06