Phil Spelman
6,664 Points

Just wondering if there's a reason to use a for loop here; I thought soup.find(...) would only get the first result

I wanted to know if there's a specific reason for using a for loop in the example: for button in soup.find(attrs={'class': 'button button--primary'}): print(button)

My understanding was that soup.find() (unlike soup.find_all()) returns only a single result.

Gari Merrifield
9,598 Points

It does seem like overkill, but a 'for' loop works on a single-item array just as well as on a multi-item one.

It would also make things easier if you later switched to '.find_all()': you wouldn't have to rewrite any code, just add the "_all".

My two cents worth...

2 Answers

Jason Anders
Treehouse Moderator 145,860 Points

Hey Phil,

That's a very good question! I don't understand why a loop was used either. Tagging Ken Alger for further clarification.

:)

Alex Koumparos
Python Development Techdegree Student 36,887 Points

I think this is a bug in Ken's script, not mere overkill: the behaviour is significantly different. Gari is not quite right about the return value of find(). It doesn't return a single-item list; it returns the item itself. As the Beautiful Soup documentation puts it:

"The only difference is that find_all() returns a list containing the single result, and find() just returns the result."

This means that Ken's for loop is not iterating (once) over a single-item list. It is iterating several times, once for each child of that one result, and printing the string representation of each child. With find_all(), it would iterate once over a single-item list and print the string representation of the single element with the class "button button--primary".
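One way to check the two return types directly is the minimal sketch below. It assumes the bs4 package is installed, and the `markup` string is a made-up snippet, not the page from the course.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical markup (an assumption, not the course page).
markup = '<div class="my_class"><h1>A heading</h1><p>A paragraph</p></div>'
soup = BeautifulSoup(markup, "html.parser")

single = soup.find(class_="my_class")    # the matching element itself
many = soup.find_all(class_="my_class")  # a list-like ResultSet of matches

print(isinstance(single, Tag))  # find() hands back a Tag, not a list
print(isinstance(many, list))   # ResultSet is a subclass of list
print(len(many))                # number of matching elements

# Iterating over the Tag walks its children, not "the one result".
child_names = [child.name for child in single]
print(child_names)
```

Looping over `single` visits the h1 and p children, while looping over `many` visits the div once.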

Compare these two outputs:

>>> from bs4 import BeautifulSoup
>>> # create a simple demonstration HTML snippet
>>> html = """<div class="my_class">
...  <h1>A heading</h1>
...  <p>A paragraph</p>
...  </div>"""
>>> soup = BeautifulSoup(html, 'html.parser')

>>> for elem in soup.find(class_="my_class"):
...    print("before elem")
...    print(elem)
...    print("after elem")
before elem


after elem
before elem
<h1>A heading</h1>
after elem
before elem


after elem
before elem
<p>A paragraph</p>
after elem
before elem


after elem

>>> for elem in soup.find_all(class_="my_class", limit=1):
...    print("before elem")
...    print(elem)
...    print("after elem")
before elem
<div class="my_class">
<h1>A heading</h1>
<p>A paragraph</p>
</div>
after elem
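Given that difference, the loop in the course example could plausibly be written in one of two ways: drop the loop and use find(), or keep the loop and switch to find_all(). The sketch below is only an illustration under assumptions: the `html` string and the `primary` dict are hypothetical stand-ins for the real page, and it also shows that find() returns None when nothing matches, so looping over that result would raise a TypeError.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the real course page is not reproduced here.
html = """
<a class="button button--primary" href="/signup">Sign up</a>
<a class="button" href="/about">About</a>
"""
soup = BeautifulSoup(html, "html.parser")
primary = {"class": "button button--primary"}  # same filter as the question

# Option 1: only one element is wanted, so no loop is needed.
# find() returns None when there is no match, so check before using it.
button = soup.find(attrs=primary)
if button is not None:
    print(button)

# Option 2: every match is wanted, so loop over find_all().
for button in soup.find_all(attrs=primary):
    print(button)

# With no match, find() gives None and find_all() gives an empty list.
missing = soup.find(class_="no-such-class")
none_found = soup.find_all(class_="no-such-class")
print(missing, none_found)
```

Both options print the whole matching element rather than its children.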

Hope that is clear.

Cheers,

Alex