I don't get \b and ^ $ in Python regex

When I click "Check Work", I get this message: "Got "@teamtreehouse"."

I specified that my regex should sit between word boundaries, so I expected the search to return only something that lies between two non-alphanumeric characters. To the left of '@teamtreehouse' there is an alphanumeric character, so how can '@teamtreehouse' be returned? Only words with a space or non-alphanumeric character to the left and right of the @ sign should be returned.

So, looking at the first line of 'string', only '@kennethlove' should be returned, because I set a boundary and there is whitespace on both sides of the word. In 'kenneth+challenge*@teamtreehouse.com*', the bolded part should NOT be returned, because I set a boundary to the left of the @ sign, yet it is returned. I know this was a lot of text, but I don't understand.

emails.py
import re

string = '''Love, Kenneth, kenneth+challenge@teamtreehouse.com, 555-555-5555, @kennethlove
Chalkley, Andrew, andrew@teamtreehouse.co.uk, 555-555-5556, @chalkers
McFarland, Dave, dave.mcfarland@teamtreehouse.com, 555-555-5557, @davemcfarland
Kesten, Joy, joy@teamtreehouse.com, 555-555-5558, @joykesten'''

contacts = re.search(r'''
        (?P<email>[-\w\d.+?]+@[-\w\d.]+),\s
        (?P<phone>\d{3}[-\.\s]\d{3}[-\.\s]?\d{4})
''', string, re.M | re.X)

#task 2

twitters = re.search(r'\b@\w+\b', string, re.M)

Also, I don't know when to use ^ or $.

I'm aware that the following works:

twitters = re.search(r'@\w+$', string, re.M)

But why do I need to use $ for the code to work?

1 Answer

Chris Freeman
Treehouse Moderator 68,457 Points

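The '\b' represents a word boundary. It is zero-width: it consumes no character and matches at a position where a word character (\w) is on one side and a non-word character, or the start or end of the string, is on the other. In the twitter search, the @ sign is a non-word character, so putting the \b before the @ requires a word character immediately before the @ (otherwise that position wouldn't be a word boundary). That is true inside an email address, where a letter precedes the @, but not for a twitter name, where a space precedes it. This is why you got "@teamtreehouse". Moving the \b to after the @, where a word character follows, gives: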

twitters = re.search(r'@\b\w+\b', string, re.M)
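A minimal sketch of where each \b can and cannot match may help here. The `sample` string is a shortened stand-in for the first line of the challenge string, not the full data, and the variable names are only for illustration.

```python
import re

# Shortened stand-in for the first line of the challenge string.
sample = 'kenneth+challenge@teamtreehouse.com, 555-555-5555, @kennethlove'

# \b before @ needs a word character immediately to the left of @.
# That holds inside the email ("e@") but not for the handle (" @").
before = re.search(r'\b@\w+\b', sample)
print(before.group())  # @teamtreehouse

# \b after @ sits between @ (non-word) and the first letter (word),
# so it holds for both the email and the handle; re.search returns
# whichever match comes first in the string.
after = re.search(r'@\b\w+\b', sample)
print(after.group())  # @teamtreehouse

# With \b in front, a handle preceded by a space can never match.
handle_only = re.search(r'\b@\w+\b', ' @kennethlove')
print(handle_only)  # None
```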

The issue now is that while the second \b marks the end of the twitter name, it also matches at the end of "teamtreehouse" in the email domain, since that part of the domain is followed by a non-word-character period before the top-level domain ("com", etc.). The search therefore still returns "@teamtreehouse".

Since re.search() can match anywhere within the string, there are two other ways to anchor a pattern: ^ and $. With the re.M (re.MULTILINE) flag, these anchor the pattern to the beginning or end of each line; without it, they anchor to the beginning or end of the whole string. Since the twitter name ends the line, adding a '$' at the end of the pattern says that the end of the line must follow the run of word characters.

twitters = re.search(r'@\b\w+\b$', string, re.M)
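As a rough illustration of what the anchors and the re.M flag do, here is a small sketch. The two-line `text` value is a made-up, shortened stand-in for the challenge string.

```python
import re

# Hypothetical two-line stand-in for the challenge string.
text = ('kenneth@teamtreehouse.com, @kennethlove\n'
        'andrew@teamtreehouse.co.uk, @chalkers')

# With re.M, $ matches at the end of every line, so the first
# handle that ends a line is returned.
with_m = re.search(r'@\w+$', text, re.M)
print(with_m.group())  # @kennethlove

# Without re.M, $ only matches at the very end of the string,
# so only the handle on the last line qualifies.
without_m = re.search(r'@\w+$', text)
print(without_m.group())  # @chalkers

# ^ works the same way at the start: with re.M it anchors to the
# beginning of each line.
line_starts = re.findall(r'^\w+', text, re.M)
print(line_starts)  # ['kenneth', 'andrew']
```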

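Now, since '@' is a non-word character and '$' only matches at the end of a line, both \b are redundant: the first is always true between the @ and a word character, and the second is always true between the last word character and the end of the line. The \b can be removed: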

twitters = re.search(r'@\w+$', string, re.M)
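To check that dropping the \b changes nothing, one option is to compare both patterns over the whole string. This sketch uses re.findall instead of re.search purely to show every line at once; the challenge itself asks for re.search.

```python
import re

string = '''Love, Kenneth, kenneth+challenge@teamtreehouse.com, 555-555-5555, @kennethlove
Chalkley, Andrew, andrew@teamtreehouse.co.uk, 555-555-5556, @chalkers
McFarland, Dave, dave.mcfarland@teamtreehouse.com, 555-555-5557, @davemcfarland
Kesten, Joy, joy@teamtreehouse.com, 555-555-5558, @joykesten'''

# Same pattern with and without the redundant word boundaries.
with_boundaries = re.findall(r'@\b\w+\b$', string, re.M)
without_boundaries = re.findall(r'@\w+$', string, re.M)

print(without_boundaries)
# ['@kennethlove', '@chalkers', '@davemcfarland', '@joykesten']
print(with_boundaries == without_boundaries)  # True

# re.search stops at the first match, which is the first line's handle.
twitters = re.search(r'@\w+$', string, re.M)
print(twitters.group())  # @kennethlove
```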

Ohhh, I understand it now. Thank you!