Regular expressions, or regex, provide a way to search for patterns within strings. In Python, the re module handles regex-based pattern matching, enabling advanced string manipulation.
Regular Expressions in Python
1. Importing the `re` Module
The re module in Python offers functions for working with regular expressions. Import it with:
import re
Module Basics Quiz
Which module is used for regular expressions in Python?
What does the 'r' prefix before a regex pattern mean?
2. Basic Pattern Matching
Use re.search() to find the first match of a pattern within a string:
import re
pattern = r"hello"
text = "hello world"
match = re.search(pattern, text)
if match:
    print("Match found:", match.group())
The r before the pattern marks a raw string literal: Python leaves the backslashes in it untouched, so sequences such as \b or \d reach the regex engine exactly as written.
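As a small illustration of why the raw-string prefix matters, the sketch below compares the same pattern written with and without it. The variable names and the sample text are made up for this example.

```python
import re

# Without the r prefix, Python turns "\b" into a backspace character
# before the regex engine ever sees the pattern.
plain = "\bcat\b"
raw = r"\bcat\b"

print(len(plain))  # 5: backspace, c, a, t, backspace
print(len(raw))    # 7: backslash, b, c, a, t, backslash, b

text = "a cat sat"
plain_match = re.search(plain, text)  # searches for literal backspace characters
raw_match = re.search(raw, text)      # searches for "cat" as a whole word

print(plain_match)        # None
print(raw_match.group())  # cat
```

Writing every pattern as a raw string is a safe habit even when it contains no backslashes, since adding one later will not silently change the pattern.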
Pattern Matching Quiz
What does re.search() return if no match is found?
How do you access the matched text from a match object?
3. Using `re.findall()`
re.findall() returns all matches of a pattern in a list:
text = "cat bat rat mat"
matches = re.findall(r"\b\w+at\b", text)
print(matches) # Output: ['cat', 'bat', 'rat', 'mat']
This example uses the word boundary \b on both sides of the pattern so that only whole words ending in "at" are matched.
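Two details of re.findall() are easy to miss: its result when nothing matches, and how capture groups change what it returns. The following sketch reuses the sample text from above; the patterns themselves are only illustrative.

```python
import re

text = "cat bat rat mat"

# No matches: the result is an empty list, not None.
no_hits = re.findall(r"\b\w+og\b", text)
print(no_hits)  # []

# One capture group: each result is the group's text, not the whole match.
first_letters = re.findall(r"\b(\w)at\b", text)
print(first_letters)  # ['c', 'b', 'r', 'm']

# Two capture groups: each result is a tuple with one entry per group.
pairs = re.findall(r"\b(\w)(at)\b", text)
print(pairs)  # [('c', 'at'), ('b', 'at'), ('r', 'at'), ('m', 'at')]
```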
findall() Quiz
What does re.findall() return if no matches are found?
What does \b represent in regex?
4. Replacing Text with `re.sub()`
Use re.sub() to replace matches in a string:
text = "I like cats"
new_text = re.sub(r"cats", "dogs", text)
print(new_text) # Output: I like dogs
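By default re.sub() replaces every match. A short sketch of limiting that with the count argument, and of counting replacements with re.subn(), might look like this (the sample sentence is invented for the example):

```python
import re

text = "cats chase cats"

# Default behaviour: every occurrence is replaced.
replace_all = re.sub(r"cats", "dogs", text)
print(replace_all)    # dogs chase dogs

# count=1 stops after the first replacement.
replace_first = re.sub(r"cats", "dogs", text, count=1)
print(replace_first)  # dogs chase cats

# re.subn() returns the new string together with the number of replacements.
new_text, n = re.subn(r"cats", "dogs", text)
print(n)  # 2
```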
sub() Quiz
How many replacements does re.sub() make by default?
How would you limit replacements to just the first match?
5. Pattern Modifiers
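Modifiers, called flags in Python, control how a pattern is matched. Commonly used flags include:
re.IGNORECASE or re.I: Case-insensitive matching.
re.MULTILINE or re.M: Multi-line matching, so that ^ and $ match at the start and end of each line rather than only at the start and end of the whole string.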
text = "Hello world"
match = re.search(r"hello", text, re.IGNORECASE)
if match:
    print("Case-insensitive match found!")
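The effect of re.MULTILINE is easiest to see side by side with the default behaviour. The three-line sample string below is an assumption made for this sketch, which also shows one way to combine flags.

```python
import re

text = "first line\nsecond line\nthird line"

# Without re.MULTILINE, ^ only matches at the very start of the string.
default_starts = re.findall(r"^\w+", text)
print(default_starts)  # ['first']

# With re.MULTILINE, ^ also matches right after each newline.
line_starts = re.findall(r"^\w+", text, re.MULTILINE)
print(line_starts)  # ['first', 'second', 'third']

# Flags can be combined with the | operator.
combined = re.findall(r"^\w+ LINE$", text, re.MULTILINE | re.IGNORECASE)
print(combined)  # ['first line', 'second line', 'third line']
```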
Modifiers Quiz
Which flag makes matching case-insensitive?
What does re.MULTILINE affect?
6. Common Regex Patterns
\d: Matches any digit (0-9).
\w: Matches any word character: a letter, a digit, or an underscore (a-z, A-Z, 0-9, _).
\s: Matches any whitespace character (space, tab, newline).
^: Matches the beginning of a string.
$: Matches the end of a string.
Example:
text = "My phone number is 123-456-7890"
pattern = r"\d{3}-\d{3}-\d{4}"
match = re.search(pattern, text)
if match:
    print("Phone number found:", match.group())
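To see the character classes and anchors from the list working together, here is a minimal sketch on a made-up sentence. The + quantifier used here means "one or more", in the same way that {3} in the phone number pattern means "exactly three".

```python
import re

text = "Order 42 shipped to user_7 on 2024-01-15"

# \d+ : runs of one or more digits.
digits = re.findall(r"\d+", text)
print(digits)  # ['42', '7', '2024', '01', '15']

# \w+ : runs of word characters (the underscore counts, the hyphen does not).
words = re.findall(r"\w+", text)
print(words)  # ['Order', '42', 'shipped', 'to', 'user_7', 'on', '2024', '01', '15']

# \s+ : split the text wherever whitespace occurs.
parts = re.split(r"\s+", text)
print(parts)  # ['Order', '42', 'shipped', 'to', 'user_7', 'on', '2024-01-15']

# ^ and $ : anchor a pattern to the start or the end of the string.
starts_with_order = re.search(r"^Order", text) is not None
last_two_digits = re.search(r"\d{2}$", text).group()
print(starts_with_order)  # True
print(last_two_digits)    # 15
```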
Patterns Quiz
What does \d match in regex?
What does {3} in \d{3} mean?