Leo Marco Corpuz
18,975 Points
Email groups
Checking to see if my code makes sense for the email group. Any feedback? Thanks!
import re
string = '''Love, Kenneth, kenneth+challenge@teamtreehouse.com, 555-555-5555, @kennethlove
Chalkley, Andrew, andrew@teamtreehouse.co.uk, 555-555-5556, @chalkers
McFarland, Dave, dave.mcfarland@teamtreehouse.com, 555-555-5557, @davemcfarland
Kesten, Joy, joy@teamtreehouse.com, 555-555-5558, @joykesten'''
contacts=re.search(r'^(?P<email>[\w+\W?\w?]@[\w+\W\w+\W?\w?])?',string)
2 Answers
Chris Freeman
Treehouse Moderator 68,441 Points
I think you may be confused about defining character sets using square brackets [ ].
Just as \w, \w?, \w*, and \w+ mean a single word character, an optional word character, any number of word characters, and one or more word characters, the modifiers go outside the brackets when you define a character set with [ ]: [ ], [ ]?, [ ]*, [ ]+.
When a + or ? is used inside a character set, it means a literal plus sign or question mark.
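A minimal sketch of that difference, using only the standard re module (the sample text 'a+b?' is made up for illustration):

```python
import re

sample = 'a+b?'  # hypothetical sample text, not from the challenge

# Inside a set, + and ? are literal characters, and the set matches ONE character.
inside = re.findall(r'[\w+?]', sample)
print(inside)   # ['a', '+', 'b', '?']

# Outside the set, + is a quantifier: one or more characters from the set.
outside = re.findall(r'[\w]+', sample)
print(outside)  # ['a', 'b']

# Both at once: a literal + inside the set, a quantifier + outside it.
both = re.findall(r'[\w+]+', sample)
print(both)     # ['a+b']
```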
Reading your regex, it says:
^ # Starting anchored at the beginning of the string
(?P<email> # Start a group named email
[ # Start a character set
\w+\W?\w? # Set contains any word character,
# a + sign, a non-word character, a question mark
] # Look for exactly one character matching this set
@ # Set is followed by an @sign
[\w+\W\w+\W?\w?] # Define another character set
# exactly the same as the first one
) # End named group
? # This group may optionally be present
By having a \W non-word character in the set, it could match the space preceding the email address.
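To see this in practice, here is a small sketch that runs the original pattern against the string from the question. The second pattern (without the caret and the trailing ?) is only an illustration of what the two sets match on their own, not a suggested fix:

```python
import re

string = '''Love, Kenneth, kenneth+challenge@teamtreehouse.com, 555-555-5555, @kennethlove
Chalkley, Andrew, andrew@teamtreehouse.co.uk, 555-555-5556, @chalkers
McFarland, Dave, dave.mcfarland@teamtreehouse.com, 555-555-5557, @davemcfarland
Kesten, Joy, joy@teamtreehouse.com, 555-555-5558, @joykesten'''

# The original pattern: anchored at the start, with the whole group optional.
original = re.search(r'^(?P<email>[\w+\W?\w?]@[\w+\W\w+\W?\w?])?', string)
print(repr(original.group()))   # '' (an empty match at the start of the string)
print(original.group('email'))  # None (the optional group did not take part)

# The same two sets without the caret or the trailing ?:
# exactly one character on each side of the @ sign.
unanchored = re.findall(r'[\w+\W?\w?]@[\w+\W\w+\W?\w?]', string)
print(unanchored[:2])           # ['e@t', ' @k']
```

The second result, ' @k', comes from the Twitter handle: the \W in the set matched the space in front of it.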
The characters listed within a set should be as explicit as necessary to not get false matches. For example, [\w+.]+ says "Match one or more word characters, plus signs, or periods."
You should remove the leading caret ^ since the email is not at the start of the line. You'll also need to use re.M if you anchor to the start or end of a line, since there are multiple lines within the string.
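A quick sketch of what re.M changes, using the string from the question. The flag affects where ^ (and $) are allowed to match; the \w+ pattern here is just a stand-in for illustration:

```python
import re

string = '''Love, Kenneth, kenneth+challenge@teamtreehouse.com, 555-555-5555, @kennethlove
Chalkley, Andrew, andrew@teamtreehouse.co.uk, 555-555-5556, @chalkers
McFarland, Dave, dave.mcfarland@teamtreehouse.com, 555-555-5557, @davemcfarland
Kesten, Joy, joy@teamtreehouse.com, 555-555-5558, @joykesten'''

# Without re.M, ^ only matches at the very start of the string.
start_only = re.findall(r'^\w+', string)
print(start_only)   # ['Love']

# With re.M, ^ also matches right after each newline.
every_line = re.findall(r'^\w+', string, re.M)
print(every_line)   # ['Love', 'Chalkley', 'McFarland', 'Kesten']
```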
Your two character sets are the same for the same reason that [banana] and [nba] are the same: repeated characters do not change the set.
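If you want to convince yourself, a tiny check along these lines should do it (the sample text is invented for the example):

```python
import re

text = 'a banana and an nba ban'  # hypothetical sample text

# Repeating a character inside a set adds nothing: both sets mean "b, a, or n".
from_banana = re.findall(r'[banana]+', text)
from_nba = re.findall(r'[nba]+', text)
print(from_banana == from_nba)  # True
```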
Post back if you need more help. Good luck!!!
Leo Marco Corpuz
18,975 Points
So here's my revised code:
contacts=re.search(r'(?P<email>[\w]+[\W]?[\w]?@[\w]+[\W][\w]+[\W]?[\w]?)',string)
[\w]+[\W]?[\w]? (example kenneth+challenge, dave.mcfarland)
[\w]+[\W][\w]+[\W]?[\w]? (example teamtreehouse.co.uk)
Chris Freeman
Treehouse Moderator 68,441 Points
Getting closer. I see your intentions. The email group would pass with one change: the character set just before the @ sign should be followed by a * instead of a ?. This allows more than one character to follow the plus or period in the username:
r'(?P<email>[\w]+[\W]?[\w]*@[\w]+[\W][\w]+[\W]?[\w]?)'
That said, if a character set only has one item in the set, then square brackets aren't needed:
r'(?P<email>\w+\W?\w*@\w+\W\w+\W?\w?)'
A simplified version would be to put the optional characters inside a set:
r'(?P<email>[+.\w]+@[.\w]+)'
Note the period and plus inside the set are literal, and the plus outside the set is "one or more".
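Here is a rough comparison of the three patterns above against the string from the question. The variable names are just labels for this sketch:

```python
import re

string = '''Love, Kenneth, kenneth+challenge@teamtreehouse.com, 555-555-5555, @kennethlove
Chalkley, Andrew, andrew@teamtreehouse.co.uk, 555-555-5556, @chalkers
McFarland, Dave, dave.mcfarland@teamtreehouse.com, 555-555-5557, @davemcfarland
Kesten, Joy, joy@teamtreehouse.com, 555-555-5558, @joykesten'''

bracketed = r'(?P<email>[\w]+[\W]?[\w]*@[\w]+[\W][\w]+[\W]?[\w]?)'
unbracketed = r'(?P<email>\w+\W?\w*@\w+\W\w+\W?\w?)'
simplified = r'(?P<email>[+.\w]+@[.\w]+)'

# Single-item sets behave the same with or without the square brackets.
with_brackets = re.search(bracketed, string).group('email')
without_brackets = re.search(unbracketed, string).group('email')
print(with_brackets == without_brackets)  # True

# The simplified sets pick up every address in the string.
emails = [m.group('email') for m in re.finditer(simplified, string)]
print(emails)
# ['kenneth+challenge@teamtreehouse.com', 'andrew@teamtreehouse.co.uk',
#  'dave.mcfarland@teamtreehouse.com', 'joy@teamtreehouse.com']
```

One thing to watch for: used on their own, the two longer patterns end in an optional \W, which is free to pick up the comma right after the address. The simplified sets cannot do that because a comma is not in either of them.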
Post back if you need more help. Good luck!!!
Chris Freeman
Treehouse Moderator 68,441 Points
Getting closer still. Your latest attempt:
contacts=re.search(r'(?P<email>[\w]+[\W]?[\w]*@[\w]+[\W][\w]+[\W]?[\w]?) \t (?P<phone>\d{3}\-\d{3}\-\d{4})' ,string)
This will match if you fix the characters between the groups. You have space, tab, space, which is not in the target string. You need comma, space. This can be written as ,\s
Once this passes, try to simplify the regex character sets using my comment above.
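As a sketch of that fix, here is the same pattern with ,\s between the two groups, run against the string from the question. I also dropped the backslash before each hyphen, since a hyphen outside a character set is already literal; keeping the backslash works too:

```python
import re

string = '''Love, Kenneth, kenneth+challenge@teamtreehouse.com, 555-555-5555, @kennethlove
Chalkley, Andrew, andrew@teamtreehouse.co.uk, 555-555-5556, @chalkers
McFarland, Dave, dave.mcfarland@teamtreehouse.com, 555-555-5557, @davemcfarland
Kesten, Joy, joy@teamtreehouse.com, 555-555-5558, @joykesten'''

# Email group, then a literal comma and one whitespace character, then the phone group.
contacts = re.search(
    r'(?P<email>[\w]+[\W]?[\w]*@[\w]+[\W][\w]+[\W]?[\w]?),\s'
    r'(?P<phone>\d{3}-\d{3}-\d{4})',
    string,
)
print(contacts.group('email'))  # kenneth+challenge@teamtreehouse.com
print(contacts.group('phone'))  # 555-555-5555

# The same idea with the simplified character sets, collecting every line.
simplified = r'(?P<email>[+.\w]+@[.\w]+),\s(?P<phone>\d{3}-\d{3}-\d{4})'
phones = [m.group('phone') for m in re.finditer(simplified, string)]
print(phones)  # ['555-555-5555', '555-555-5556', '555-555-5557', '555-555-5558']
```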
Leo Marco Corpuz
18,975 Points
Thanks for your help!
Leo Marco Corpuz
18,975 Points
So I made the correction before the @ sign. How's my phone group? It's not included in the string if it's on a new line. Is that where \t comes in?
contacts=re.search(r'(?P<email>[\w]+[\W]?[\w]*@[\w]+[\W][\w]+[\W]?[\w]?) \t (?P<phone>\d{3}\-\d{3}\-\d{4})' ,string)
I placed a '\' before '-' but it's not showing when I pasted this code.
[MOD: added ```python formatting -cf]
frankgenova
Python Web Development Techdegree Student 15,616 Points
You can use the website https://pythex.org/ to test various emails. Sometimes an email will contain numbers, like lillie1974@aol.com. I do not think your regex will find those kinds of emails.
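That is easy to check locally as well. A small sketch, assuming the email group from the latest attempt in this thread: in Python, \w matches letters, digits, and the underscore, so an address with numbers in it should still be found.

```python
import re

# \w covers letters, digits, and the underscore, but not the hyphen.
word_chars = re.findall(r'\w', 'a1_-')
print(word_chars)            # ['a', '1', '_']

# The email group from the latest attempt, tried on an address with digits.
latest = r'(?P<email>[\w]+[\W]?[\w]*@[\w]+[\W][\w]+[\W]?[\w]?)'
match = re.search(latest, 'lillie1974@aol.com')
print(match.group('email'))  # lillie1974@aol.com
```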