Views

Report Content

×

Trie Data Structure: Prefix Trees & Applications

Master trie operations, pattern matching, and autocomplete implementations. This tutorial covers trie structure, insertion/search algorithms, and real-world applications with coding examples.

Trie Usage in Applications (2023)

Autocomplete (40%)
Spell Checkers (30%)
IP Routing (20%)
Other (10%)

1. Trie Fundamentals

Trie (Prefix Tree)

Definition

Trie (Digital Tree or Prefix Tree) is an ordered tree data structure used to store a dynamic set of strings. Each node represents a character, and paths from root to leaf form words. Common prefixes are shared, making it efficient for prefix-based operations.

How It Works

1
Insert Operation
  • Start from root node
  • For each character, traverse or create child node
  • Mark last node as end of word (isEndOfWord = true)
2
Search Operation
  • Traverse nodes following characters
  • If any character missing → word doesn't exist
  • Check isEndOfWord flag at final node
3
Prefix Search
  • Same traversal as search
  • Don't need isEndOfWord flag
  • Returns true if prefix exists
class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_end_of_word = False

class Trie:
    def __init__(self):
        self.root = TrieNode()
    
    def insert(self, word: str) -> None:
        node = self.root
        for char in word:
            if char not in node.children:
                node.children[char] = TrieNode()
            node = node.children[char]
        node.is_end_of_word = True
    
    def search(self, word: str) -> bool:
        node = self.root
        for char in word:
            if char not in node.children:
                return False
            node = node.children[char]
        return node.is_end_of_word
    
    def starts_with(self, prefix: str) -> bool:
        node = self.root
        for char in prefix:
            if char not in node.children:
                return False
            node = node.children[char]
        return True

Complexity: Insert O(L), Search O(L), Prefix O(P) where L = word length
Space: O(total characters across all words)
Use When: Prefix matching, autocomplete, dictionary storage

2. Autocomplete & Prefix Search

Autocomplete (Word Suggestions)

Definition

Autocomplete returns all words in the trie that start with a given prefix. It's used in search engines, text editors, and command-line interfaces to suggest completions as the user types.

How It Works

1
Navigate to prefix node
  • Traverse the trie following prefix characters
  • If any character missing, return empty list
2
DFS traversal from prefix node
  • Perform depth-first search from current node
  • Collect all paths where isEndOfWord is true
3
Return suggestions
  • Build full words by tracking path characters
  • Return list of matching words
def get_suggestions(self, prefix):
    node = self.root
    # Navigate to prefix node
    for char in prefix:
        if char not in node.children:
            return []
        node = node.children[char]
    
    # DFS to collect all words
    suggestions = []
    self._dfs(node, prefix, suggestions)
    return suggestions

def _dfs(self, node, current_word, suggestions):
    if node.is_end_of_word:
        suggestions.append(current_word)
    
    for char, child_node in node.children.items():
        self._dfs(child_node, current_word + char, suggestions)

# Usage
trie = Trie()
words = ["apple", "application", "apply", "apt", "cat", "car"]
for word in words:
    trie.insert(word)

suggestions = get_suggestions("ap")  # Returns ["apple", "application", "apply"]

Complexity: O(P + W) where P = prefix length, W = total characters in matching words
Use When: Search bars, typeahead, command suggestions

3. Longest Common Prefix

Longest Common Prefix Using Trie

Definition

Longest Common Prefix (LCP) finds the longest string that is a prefix of all words in a list. Using a trie, we traverse until we reach a node with multiple children or a word ending.

How It Works

1
Insert all words into trie
  • Build trie containing all words
  • Common prefixes will share nodes
2
Traverse from root
  • Start with empty prefix string
  • Continue while node has exactly one child
  • Stop if node is end of a word or has multiple children
3
Build and return LCP
  • Append each character as we traverse
  • Return accumulated prefix
def longest_common_prefix(words):
    if not words:
        return ""
    
    trie = Trie()
    for word in words:
        trie.insert(word)
    
    lcp = []
    node = trie.root
    
    # Traverse while single child and not end of word
    while len(node.children) == 1 and not node.is_end_of_word:
        char = next(iter(node.children))
        lcp.append(char)
        node = node.children[char]
    
    return ''.join(lcp)

# Example
words = ["flower", "flow", "flight"]
result = longest_common_prefix(words)  # Returns "fl"

Complexity: O(N × L) where N = number of words, L = average length
Use When: Finding common prefixes in word lists

4. Delete Operation

Trie Deletion

Definition

Delete Operation removes a word from the trie. It must handle three cases: word is a prefix of another word, word shares prefix with others, or word has unique path. Careful cleanup prevents memory leaks.

How It Works

1
Traverse to word end
  • Check if word exists in trie
  • If not found, return false
2
Unmark end of word
  • Set isEndOfWord = false at final node
  • If node has children, we're done
3
Backtrack and delete
  • Recursively delete nodes with no children
  • Stop when encountering node with multiple children
def delete(self, word):
    return self._delete(self.root, word, 0)

def _delete(self, node, word, index):
    if index == len(word):
        if not node.is_end_of_word:
            return False
        node.is_end_of_word = False
        return len(node.children) == 0
    
    char = word[index]
    if char not in node.children:
        return False
    
    should_delete_child = self._delete(node.children[char], word, index + 1)
    
    if should_delete_child:
        del node.children[char]
        return len(node.children) == 0 and not node.is_end_of_word
    
    return False

Complexity: O(L) where L = word length
Use When: Removing words from dictionary

5. Advanced Trie Variants

Binary Trie

Definition

Binary Trie stores binary representations of numbers, with each node having two children (0 and 1). It's optimal for maximum XOR problems and bit manipulation.

How It Works

1
Insert number bits
  • Traverse bits from most significant to least
  • Create child nodes for 0 and 1 paths
2
Find max XOR
  • For each number, try opposite bits at each level
  • Maximize XOR value
class BinaryTrieNode:
    def __init__(self):
        self.children = [None, None]

class BinaryTrie:
    def __init__(self):
        self.root = BinaryTrieNode()
    
    def insert(self, num):
        node = self.root
        for i in range(31, -1, -1):
            bit = (num >> i) & 1
            if not node.children[bit]:
                node.children[bit] = BinaryTrieNode()
            node = node.children[bit]
    
    def max_xor(self, num):
        node = self.root
        result = 0
        for i in range(31, -1, -1):
            bit = (num >> i) & 1
            # Try opposite bit for max XOR
            desired = 1 - bit
            if node.children[desired]:
                result |= (1 << i)
                node = node.children[desired]
            else:
                node = node.children[bit]
        return result

# Find maximum XOR of any two numbers
def find_max_xor(nums):
    trie = BinaryTrie()
    max_xor = 0
    for num in nums:
        trie.insert(num)
        max_xor = max(max_xor, trie.max_xor(num))
    return max_xor

Use When: Maximum XOR problems, bit manipulation
Complexity: O(32) per operation

6. Real-World Applications

Search Engine Autocomplete

Prefix-based suggestions: Google, Bing typeahead

class SearchEngine:
    def __init__(self):
        self.trie = Trie()
        self.popularity = {}
    
    def index_page(self, keyword, score):
        self.trie.insert(keyword)
        self.popularity[keyword] = score
    
    def autocomplete(self, prefix, limit=10):
        words = self.trie.get_suggestions(prefix)
        return sorted(words, key=lambda x: self.popularity.get(x, 0), 
                     reverse=True)[:limit]

Spell Checker

Dictionary lookup: Check if word exists, suggest corrections

class SpellChecker:
    def __init__(self):
        self.trie = Trie()
    
    def load_dictionary(self, words):
        for word in words:
            self.trie.insert(word.lower())
    
    def check(self, word):
        return self.trie.search(word.lower())
    
    def suggest(self, word):
        # Find words with edit distance 1
        suggestions = []
        word = word.lower()
        letters = 'abcdefghijklmnopqrstuvwxyz'
        
        # Generate possible corrections
        for i in range(len(word) + 1):
            # Insertions
            for c in letters:
                candidate = word[:i] + c + word[i:]
                if self.trie.search(candidate):
                    suggestions.append(candidate)
            
            # Deletions
            if i < len(word):
                candidate = word[:i] + word[i+1:]
                if self.trie.search(candidate):
                    suggestions.append(candidate)
            
            # Replacements
            if i < len(word):
                for c in letters:
                    candidate = word[:i] + c + word[i+1:]
                    if self.trie.search(candidate):
                        suggestions.append(candidate)
        
        return list(set(suggestions))[:10]

IP Routing (Longest Prefix Match)

Network routing tables: Find most specific route

class IPRoutingTable:
    def __init__(self):
        self.trie = {}
    
    def binary_str(self, ip):
        return ''.join(f'{int(octet):08b}' for octet in ip.split('.'))
    
    def add_route(self, network, mask_len, next_hop):
        bits = self.binary_str(network)[:mask_len]
        node = self.trie
        for bit in bits:
            if bit not in node:
                node[bit] = {}
            node = node[bit]
        node['next_hop'] = next_hop
    
    def lookup(self, ip):
        bits = self.binary_str(ip)
        node = self.trie
        last_hop = None
        
        for bit in bits:
            if bit not in node:
                break
            node = node[bit]
            if 'next_hop' in node:
                last_hop = node['next_hop']
        
        return last_hop

Trie Problem-Solving Cheat Sheet

Problem Type Approach Time Complexity
Word Search II (Board) Trie + Backtracking/Traversal) O(M × 4 × 3^(L-1)))
Replace Words (Prefix replacement) Trie prefix matching + substitution) O(N × L) (where N = words, L = length)
Concatenated Words (Multi-word combo) Trie with DFS + memoization) O(N × L²) (where N = words, L = length)
Maximum XOR of Two Numbers) Binary Trie (bit-by-bit traversal) O(N × 32) (optimal for integers)
Longest Common Prefix) Trie with single child traversal) O(N × L) (efficient prefix detection)
Pattern Matching (Wildcard) Trie with DFS + wildcard handling) O(26^K) (where K = wildcard count)

Trie Implementation Checklist

Use array for children when alphabet is fixed (26 letters)
Use hash map for Unicode or large character sets
Track word count/end at each node for deletion
Consider compressed/radix tries for memory optimization
Add frequency counters for autocomplete ranking
Handle case sensitivity based on requirements
Implement iterative DFS instead of recursive for deep tries

DSA Expert Insight: Trie problems appear in 15% of technical interviews, especially in search-related and prefix matching scenarios. Mastering both basic operations and specialized trie variants is crucial for efficient string manipulation problems. For production systems, combine tries with caches (e.g., Redis) for popular prefixes and use Ternary Search Tries for better space efficiency when dealing with large character sets.

Share and Join the Discussion

You need to be logged in to participate in this discussion.

×
×