Report Content
×Graph Data Structures: Algorithms & Applications
Master graph representations, traversal algorithms, and pathfinding techniques. This tutorial covers adjacency matrices/lists, BFS/DFS, shortest path algorithms, and advanced graph theory concepts with implementation examples.
Graph Usage in Systems (2023)
1. Graph Representations
Adjacency Matrix
Definition
Adjacency Matrix represents a graph as a 2D array where matrix[i][j] = weight of edge from vertex i to vertex j (or 1 if unweighted). It's optimal for dense graphs where V² is manageable.
How It Works
Initialize matrix
- Create V×V matrix filled with 0 or ∞
- Set diagonal to 0 for self-loops
Add edges
- For directed: matrix[u][v] = weight
- For undirected: matrix[u][v] = matrix[v][u] = weight
Query edges
- Check matrix[u][v] for direct edge
- O(1) time complexity
class GraphMatrix:
def __init__(self, vertices, directed=False):
self.V = vertices
self.directed = directed
# Initialize matrix with 0 (no edge)
self.matrix = [[0] * vertices for _ in range(vertices)]
def add_edge(self, u, v, weight=1):
self.matrix[u][v] = weight
if not self.directed:
self.matrix[v][u] = weight
def remove_edge(self, u, v):
self.matrix[u][v] = 0
if not self.directed:
self.matrix[v][u] = 0
def has_edge(self, u, v):
return self.matrix[u][v] != 0
def get_neighbors(self, u):
neighbors = []
for v in range(self.V):
if self.matrix[u][v] != 0:
neighbors.append(v)
return neighbors
def display(self):
print("Adjacency Matrix:")
for row in self.matrix:
print(row)
# Example
g = GraphMatrix(4)
g.add_edge(0, 1)
g.add_edge(0, 2)
g.add_edge(1, 2)
g.add_edge(2, 3)
g.display()
Adjacency List
class GraphList:
def __init__(self, directed=False):
self.adj_list = {}
self.directed = directed
def add_vertex(self, vertex):
if vertex not in self.adj_list:
self.adj_list[vertex] = []
def add_edge(self, u, v, weight=1):
self.add_vertex(u)
self.add_vertex(v)
self.adj_list[u].append((v, weight))
if not self.directed:
self.adj_list[v].append((u, weight))
def remove_edge(self, u, v):
if u in self.adj_list:
self.adj_list[u] = [(node, w) for node, w in self.adj_list[u] if node != v]
if not self.directed and v in self.adj_list:
self.adj_list[v] = [(node, w) for node, w in self.adj_list[v] if node != u]
def get_neighbors(self, u):
return self.adj_list.get(u, [])
def display(self):
print("Adjacency List:")
for vertex, edges in self.adj_list.items():
print(f"{vertex}: {edges}")
# Example
g = GraphList()
g.add_edge('A', 'B')
g.add_edge('A', 'C')
g.add_edge('B', 'D')
g.add_edge('C', 'D')
g.display()
2. Breadth-First Search (BFS)
BFS Traversal
Definition
Breadth-First Search (BFS) traverses graph level by level, exploring all neighbors of a vertex before moving to the next level. It uses a queue and guarantees shortest path in unweighted graphs.
How It Works
Initialize queue
- Push start node to queue
- Mark start node as visited
Process level by level
- Dequeue front node
- Process the node
- Add all unvisited neighbors to queue
Repeat until queue empty
- Continues until all reachable nodes visited
- Finds shortest path in unweighted graphs
from collections import deque
def bfs(graph, start):
visited = set()
queue = deque([start])
visited.add(start)
traversal = []
while queue:
node = queue.popleft()
traversal.append(node)
for neighbor in graph.get_neighbors(node):
if neighbor not in visited:
visited.add(neighbor)
queue.append(neighbor)
return traversal
def shortest_path(graph, start, end):
visited = set()
queue = deque([(start, [start])])
visited.add(start)
while queue:
node, path = queue.popleft()
if node == end:
return path
for neighbor in graph.get_neighbors(node):
if neighbor not in visited:
visited.add(neighbor)
queue.append((neighbor, path + [neighbor]))
return None
# Example
graph = GraphList()
graph.add_edge(0, 1)
graph.add_edge(0, 2)
graph.add_edge(1, 3)
graph.add_edge(2, 3)
print("BFS:", bfs(graph, 0))
print("Shortest path 0->3:", shortest_path(graph, 0, 3))
Complexity: O(V + E) time, O(V) space
Use When: Shortest path (unweighted), level-order traversal
3. Depth-First Search (DFS)
DFS Traversal
Definition
Depth-First Search (DFS) explores as far as possible along each branch before backtracking. It uses a stack (implicit recursion) and is useful for topological sort, cycle detection, and connectivity.
How It Works
Recursive approach
- Mark current node as visited
- Process the node
- Recursively visit each unvisited neighbor
Backtrack when done
- When no more neighbors, return to previous node
- Continue with remaining neighbors
def dfs_recursive(graph, start, visited=None):
if visited is None:
visited = set()
visited.add(start)
print(start, end=' ')
for neighbor in graph.get_neighbors(start):
if neighbor not in visited:
dfs_recursive(graph, neighbor, visited)
return visited
def dfs_iterative(graph, start):
visited = set()
stack = [start]
traversal = []
while stack:
node = stack.pop()
if node not in visited:
visited.add(node)
traversal.append(node)
# Add neighbors in reverse order for original order
neighbors = graph.get_neighbors(node)
for neighbor in reversed(neighbors):
if neighbor not in visited:
stack.append(neighbor)
return traversal
# Connected components using DFS
def connected_components(graph):
visited = set()
components = []
for vertex in graph.get_all_vertices():
if vertex not in visited:
component = []
stack = [vertex]
while stack:
node = stack.pop()
if node not in visited:
visited.add(node)
component.append(node)
for neighbor in graph.get_neighbors(node):
if neighbor not in visited:
stack.append(neighbor)
components.append(component)
return components
4. Dijkstra's Shortest Path
Dijkstra's Algorithm
Definition
Dijkstra's Algorithm finds shortest paths from source to all vertices in weighted graphs with non-negative edges. It uses a priority queue to greedily select the closest unvisited vertex.
How It Works
Initialize distances
- Set distance to source = 0, others = ∞
- Add all vertices to priority queue
Extract minimum
- Get vertex with smallest distance from queue
- If distance is larger than known, skip
Relax edges
- For each neighbor, calculate new distance
- If smaller than known, update and re-add to queue
import heapq
def dijkstra(graph, start):
distances = {v: float('inf') for v in graph.get_all_vertices()}
distances[start] = 0
pq = [(0, start)]
predecessors = {v: None for v in graph.get_all_vertices()}
while pq:
current_dist, u = heapq.heappop(pq)
if current_dist > distances[u]:
continue
for v, weight in graph.get_neighbors(u):
distance = current_dist + weight
if distance < distances[v]:
distances[v] = distance
predecessors[v] = u
heapq.heappush(pq, (distance, v))
return distances, predecessors
def reconstruct_path(predecessors, end):
path = []
current = end
while current is not None:
path.append(current)
current = predecessors[current]
return path[::-1]
# Example usage
g = GraphList()
g.add_edge('A', 'B', 4)
g.add_edge('A', 'C', 2)
g.add_edge('B', 'C', 1)
g.add_edge('B', 'D', 5)
g.add_edge('C', 'D', 8)
distances, pred = dijkstra(g, 'A')
print(f"Distances: {distances}")
print(f"Path to D: {reconstruct_path(pred, 'D')}")
5. Topological Sort
Topological Sort (DFS-based)
Definition
Topological Sort orders vertices in a Directed Acyclic Graph (DAG) such that for every edge u→v, u appears before v. It's used for task scheduling and dependency resolution.
How It Works
DFS on each unvisited node
- Perform DFS traversal
- Add node to stack after processing all neighbors
Post-order processing
- Push to stack when leaving node
- Ensures dependencies come first
def topological_sort(graph):
visited = set()
stack = []
def dfs(node):
visited.add(node)
for neighbor, _ in graph.get_neighbors(node):
if neighbor not in visited:
dfs(neighbor)
stack.append(node) # Post-order
for vertex in graph.get_all_vertices():
if vertex not in visited:
dfs(vertex)
return stack[::-1] # Reverse for topological order
# Kahn's Algorithm (BFS-based topological sort)
from collections import deque
def topological_sort_kahn(graph):
in_degree = {v: 0 for v in graph.get_all_vertices()}
for u in graph.get_all_vertices():
for v, _ in graph.get_neighbors(u):
in_degree[v] += 1
queue = deque([v for v in in_degree if in_degree[v] == 0])
result = []
while queue:
u = queue.popleft()
result.append(u)
for v, _ in graph.get_neighbors(u):
in_degree[v] -= 1
if in_degree[v] == 0:
queue.append(v)
if len(result) != len(in_degree):
raise ValueError("Graph has a cycle")
return result
# Example: Course prerequisites
courses = GraphList(directed=True)
courses.add_edge('C1', 'C2')
courses.add_edge('C1', 'C3')
courses.add_edge('C2', 'C4')
courses.add_edge('C3', 'C4')
print("Topological order:", topological_sort(courses))
6. Cycle Detection
Detect Cycle in Directed Graph
Definition
Cycle Detection identifies cycles in directed graphs using DFS with recursion stack tracking. For undirected graphs, tracking parent nodes suffices.
def has_cycle_directed(graph):
visited = set()
rec_stack = set()
def dfs(node):
if node in rec_stack:
return True
if node in visited:
return False
visited.add(node)
rec_stack.add(node)
for neighbor, _ in graph.get_neighbors(node):
if dfs(neighbor):
return True
rec_stack.remove(node)
return False
for vertex in graph.get_all_vertices():
if dfs(vertex):
return True
return False
def has_cycle_undirected(graph):
visited = set()
def dfs(node, parent):
visited.add(node)
for neighbor, _ in graph.get_neighbors(node):
if neighbor not in visited:
if dfs(neighbor, node):
return True
elif neighbor != parent:
return True
return False
for vertex in graph.get_all_vertices():
if vertex not in visited:
if dfs(vertex, -1):
return True
return False
# Example with cycle
graph_cycle = GraphList(directed=True)
graph_cycle.add_edge(0, 1)
graph_cycle.add_edge(1, 2)
graph_cycle.add_edge(2, 0) # Creates cycle
print("Has cycle:", has_cycle_directed(graph_cycle)) # True
Graph Problem-Solving Cheat Sheet
| Problem Type | Algorithm | Time Complexity |
|---|---|---|
| Shortest Path (Unweighted) | BFS | O(V + E) |
| Shortest Path (Weighted, no negative) | Dijkstra | O(E + V log V) |
| Negative Weights / Cycle Detection | Bellman-Ford | O(VE) |
| All Pairs Shortest Path | Floyd-Warshall | O(V³) |
| Minimum Spanning Tree | Prim / Kruskal | O(E log V) |
| Topological Sort (DAG) | DFS / Kahn's | O(V + E) |
| Connected Components | BFS / DFS | O(V + E) |
| Cycle Detection | DFS with recStack / Union-Find | O(V + E) |
7. Real-World Applications
Social Network Friend Suggestions
BFS on social graph: Find friends of friends
def friend_suggestions(social_graph, user, max_depth=2):
visited = {user}
queue = deque([(user, 0)])
suggestions = {}
while queue:
current_user, depth = queue.popleft()
if depth >= max_depth:
continue
for friend in social_graph.get_neighbors(current_user):
if friend not in visited:
visited.add(friend)
suggestions[friend] = suggestions.get(friend, 0) + 1
queue.append((friend, depth + 1))
return sorted(suggestions.items(), key=lambda x: x[1], reverse=True)
GPS Navigation
Dijkstra/A*: Find shortest route
def find_shortest_route(graph, start, destination):
distances, predecessors = dijkstra(graph, start)
if distances[destination] == float('inf'):
return None
path = []
current = destination
while current is not None:
path.append(current)
current = predecessors[current]
return {
'path': path[::-1],
'distance': distances[destination]
}
Graph Implementation Checklist
DSA Expert Insight: Graph problems appear in 40% of technical interviews, especially in pathfinding and optimization scenarios. Mastering both fundamental traversals (BFS/DFS) and advanced algorithms (Dijkstra, Topological Sort) is crucial for solving complex real-world problems. When choosing between adjacency matrix and list, use matrix for dense graphs (where V² ≈ E) and list for sparse graphs (where V² ≫ E). For pathfinding, BFS for unweighted, Dijkstra for weighted without negatives, and Bellman-Ford for graphs with negative edges.
You need to be logged in to participate in this discussion.