Views

Report Content

×

Graph Data Structures: Algorithms & Applications

Master graph representations, traversal algorithms, and pathfinding techniques. This tutorial covers adjacency matrices/lists, BFS/DFS, shortest path algorithms, and advanced graph theory concepts with implementation examples.

Graph Usage in Systems (2023)

Social Networks (35%)
Routing Systems (25%)
Recommendation Engines (20%)
Dependency Analysis (20%)

1. Graph Representations

Adjacency Matrix

Definition

Adjacency Matrix represents a graph as a 2D array where matrix[i][j] = weight of edge from vertex i to vertex j (or 1 if unweighted). It's optimal for dense graphs where V² is manageable.

How It Works

1
Initialize matrix
  • Create V×V matrix filled with 0 or ∞
  • Set diagonal to 0 for self-loops
2
Add edges
  • For directed: matrix[u][v] = weight
  • For undirected: matrix[u][v] = matrix[v][u] = weight
3
Query edges
  • Check matrix[u][v] for direct edge
  • O(1) time complexity
class GraphMatrix:
    def __init__(self, vertices, directed=False):
        self.V = vertices
        self.directed = directed
        # Initialize matrix with 0 (no edge)
        self.matrix = [[0] * vertices for _ in range(vertices)]
    
    def add_edge(self, u, v, weight=1):
        self.matrix[u][v] = weight
        if not self.directed:
            self.matrix[v][u] = weight
    
    def remove_edge(self, u, v):
        self.matrix[u][v] = 0
        if not self.directed:
            self.matrix[v][u] = 0
    
    def has_edge(self, u, v):
        return self.matrix[u][v] != 0
    
    def get_neighbors(self, u):
        neighbors = []
        for v in range(self.V):
            if self.matrix[u][v] != 0:
                neighbors.append(v)
        return neighbors
    
    def display(self):
        print("Adjacency Matrix:")
        for row in self.matrix:
            print(row)

# Example
g = GraphMatrix(4)
g.add_edge(0, 1)
g.add_edge(0, 2)
g.add_edge(1, 2)
g.add_edge(2, 3)
g.display()

Adjacency List

class GraphList:
    def __init__(self, directed=False):
        self.adj_list = {}
        self.directed = directed
    
    def add_vertex(self, vertex):
        if vertex not in self.adj_list:
            self.adj_list[vertex] = []
    
    def add_edge(self, u, v, weight=1):
        self.add_vertex(u)
        self.add_vertex(v)
        self.adj_list[u].append((v, weight))
        if not self.directed:
            self.adj_list[v].append((u, weight))
    
    def remove_edge(self, u, v):
        if u in self.adj_list:
            self.adj_list[u] = [(node, w) for node, w in self.adj_list[u] if node != v]
        if not self.directed and v in self.adj_list:
            self.adj_list[v] = [(node, w) for node, w in self.adj_list[v] if node != u]
    
    def get_neighbors(self, u):
        return self.adj_list.get(u, [])
    
    def display(self):
        print("Adjacency List:")
        for vertex, edges in self.adj_list.items():
            print(f"{vertex}: {edges}")

# Example
g = GraphList()
g.add_edge('A', 'B')
g.add_edge('A', 'C')
g.add_edge('B', 'D')
g.add_edge('C', 'D')
g.display()

2. Breadth-First Search (BFS)

BFS Traversal

Definition

Breadth-First Search (BFS) traverses graph level by level, exploring all neighbors of a vertex before moving to the next level. It uses a queue and guarantees shortest path in unweighted graphs.

How It Works

1
Initialize queue
  • Push start node to queue
  • Mark start node as visited
2
Process level by level
  • Dequeue front node
  • Process the node
  • Add all unvisited neighbors to queue
3
Repeat until queue empty
  • Continues until all reachable nodes visited
  • Finds shortest path in unweighted graphs
from collections import deque

def bfs(graph, start):
    visited = set()
    queue = deque([start])
    visited.add(start)
    traversal = []
    
    while queue:
        node = queue.popleft()
        traversal.append(node)
        
        for neighbor in graph.get_neighbors(node):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    
    return traversal

def shortest_path(graph, start, end):
    visited = set()
    queue = deque([(start, [start])])
    visited.add(start)
    
    while queue:
        node, path = queue.popleft()
        
        if node == end:
            return path
        
        for neighbor in graph.get_neighbors(node):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [neighbor]))
    
    return None

# Example
graph = GraphList()
graph.add_edge(0, 1)
graph.add_edge(0, 2)
graph.add_edge(1, 3)
graph.add_edge(2, 3)

print("BFS:", bfs(graph, 0))
print("Shortest path 0->3:", shortest_path(graph, 0, 3))

Complexity: O(V + E) time, O(V) space
Use When: Shortest path (unweighted), level-order traversal

3. Depth-First Search (DFS)

DFS Traversal

Definition

Depth-First Search (DFS) explores as far as possible along each branch before backtracking. It uses a stack (implicit recursion) and is useful for topological sort, cycle detection, and connectivity.

How It Works

1
Recursive approach
  • Mark current node as visited
  • Process the node
  • Recursively visit each unvisited neighbor
2
Backtrack when done
  • When no more neighbors, return to previous node
  • Continue with remaining neighbors
def dfs_recursive(graph, start, visited=None):
    if visited is None:
        visited = set()
    
    visited.add(start)
    print(start, end=' ')
    
    for neighbor in graph.get_neighbors(start):
        if neighbor not in visited:
            dfs_recursive(graph, neighbor, visited)
    
    return visited

def dfs_iterative(graph, start):
    visited = set()
    stack = [start]
    traversal = []
    
    while stack:
        node = stack.pop()
        if node not in visited:
            visited.add(node)
            traversal.append(node)
            
            # Add neighbors in reverse order for original order
            neighbors = graph.get_neighbors(node)
            for neighbor in reversed(neighbors):
                if neighbor not in visited:
                    stack.append(neighbor)
    
    return traversal

# Connected components using DFS
def connected_components(graph):
    visited = set()
    components = []
    
    for vertex in graph.get_all_vertices():
        if vertex not in visited:
            component = []
            stack = [vertex]
            
            while stack:
                node = stack.pop()
                if node not in visited:
                    visited.add(node)
                    component.append(node)
                    for neighbor in graph.get_neighbors(node):
                        if neighbor not in visited:
                            stack.append(neighbor)
            components.append(component)
    
    return components

4. Dijkstra's Shortest Path

Dijkstra's Algorithm

Definition

Dijkstra's Algorithm finds shortest paths from source to all vertices in weighted graphs with non-negative edges. It uses a priority queue to greedily select the closest unvisited vertex.

How It Works

1
Initialize distances
  • Set distance to source = 0, others = ∞
  • Add all vertices to priority queue
2
Extract minimum
  • Get vertex with smallest distance from queue
  • If distance is larger than known, skip
3
Relax edges
  • For each neighbor, calculate new distance
  • If smaller than known, update and re-add to queue
import heapq

def dijkstra(graph, start):
    distances = {v: float('inf') for v in graph.get_all_vertices()}
    distances[start] = 0
    pq = [(0, start)]
    predecessors = {v: None for v in graph.get_all_vertices()}
    
    while pq:
        current_dist, u = heapq.heappop(pq)
        
        if current_dist > distances[u]:
            continue
        
        for v, weight in graph.get_neighbors(u):
            distance = current_dist + weight
            if distance < distances[v]:
                distances[v] = distance
                predecessors[v] = u
                heapq.heappush(pq, (distance, v))
    
    return distances, predecessors

def reconstruct_path(predecessors, end):
    path = []
    current = end
    while current is not None:
        path.append(current)
        current = predecessors[current]
    return path[::-1]

# Example usage
g = GraphList()
g.add_edge('A', 'B', 4)
g.add_edge('A', 'C', 2)
g.add_edge('B', 'C', 1)
g.add_edge('B', 'D', 5)
g.add_edge('C', 'D', 8)

distances, pred = dijkstra(g, 'A')
print(f"Distances: {distances}")
print(f"Path to D: {reconstruct_path(pred, 'D')}")

5. Topological Sort

Topological Sort (DFS-based)

Definition

Topological Sort orders vertices in a Directed Acyclic Graph (DAG) such that for every edge u→v, u appears before v. It's used for task scheduling and dependency resolution.

How It Works

1
DFS on each unvisited node
  • Perform DFS traversal
  • Add node to stack after processing all neighbors
2
Post-order processing
  • Push to stack when leaving node
  • Ensures dependencies come first
def topological_sort(graph):
    visited = set()
    stack = []
    
    def dfs(node):
        visited.add(node)
        
        for neighbor, _ in graph.get_neighbors(node):
            if neighbor not in visited:
                dfs(neighbor)
        
        stack.append(node)  # Post-order
    
    for vertex in graph.get_all_vertices():
        if vertex not in visited:
            dfs(vertex)
    
    return stack[::-1]  # Reverse for topological order

# Kahn's Algorithm (BFS-based topological sort)
from collections import deque

def topological_sort_kahn(graph):
    in_degree = {v: 0 for v in graph.get_all_vertices()}
    
    for u in graph.get_all_vertices():
        for v, _ in graph.get_neighbors(u):
            in_degree[v] += 1
    
    queue = deque([v for v in in_degree if in_degree[v] == 0])
    result = []
    
    while queue:
        u = queue.popleft()
        result.append(u)
        
        for v, _ in graph.get_neighbors(u):
            in_degree[v] -= 1
            if in_degree[v] == 0:
                queue.append(v)
    
    if len(result) != len(in_degree):
        raise ValueError("Graph has a cycle")
    
    return result

# Example: Course prerequisites
courses = GraphList(directed=True)
courses.add_edge('C1', 'C2')
courses.add_edge('C1', 'C3')
courses.add_edge('C2', 'C4')
courses.add_edge('C3', 'C4')

print("Topological order:", topological_sort(courses))

6. Cycle Detection

Detect Cycle in Directed Graph

Definition

Cycle Detection identifies cycles in directed graphs using DFS with recursion stack tracking. For undirected graphs, tracking parent nodes suffices.

def has_cycle_directed(graph):
    visited = set()
    rec_stack = set()
    
    def dfs(node):
        if node in rec_stack:
            return True
        if node in visited:
            return False
        
        visited.add(node)
        rec_stack.add(node)
        
        for neighbor, _ in graph.get_neighbors(node):
            if dfs(neighbor):
                return True
        
        rec_stack.remove(node)
        return False
    
    for vertex in graph.get_all_vertices():
        if dfs(vertex):
            return True
    return False

def has_cycle_undirected(graph):
    visited = set()
    
    def dfs(node, parent):
        visited.add(node)
        
        for neighbor, _ in graph.get_neighbors(node):
            if neighbor not in visited:
                if dfs(neighbor, node):
                    return True
            elif neighbor != parent:
                return True
        return False
    
    for vertex in graph.get_all_vertices():
        if vertex not in visited:
            if dfs(vertex, -1):
                return True
    return False

# Example with cycle
graph_cycle = GraphList(directed=True)
graph_cycle.add_edge(0, 1)
graph_cycle.add_edge(1, 2)
graph_cycle.add_edge(2, 0)  # Creates cycle

print("Has cycle:", has_cycle_directed(graph_cycle))  # True

Graph Problem-Solving Cheat Sheet

Problem Type Algorithm Time Complexity
Shortest Path (Unweighted) BFS O(V + E)
Shortest Path (Weighted, no negative) Dijkstra O(E + V log V)
Negative Weights / Cycle Detection Bellman-Ford O(VE)
All Pairs Shortest Path Floyd-Warshall O(V³)
Minimum Spanning Tree Prim / Kruskal O(E log V)
Topological Sort (DAG) DFS / Kahn's O(V + E)
Connected Components BFS / DFS O(V + E)
Cycle Detection DFS with recStack / Union-Find O(V + E)

7. Real-World Applications

Social Network Friend Suggestions

BFS on social graph: Find friends of friends

def friend_suggestions(social_graph, user, max_depth=2):
    visited = {user}
    queue = deque([(user, 0)])
    suggestions = {}
    
    while queue:
        current_user, depth = queue.popleft()
        
        if depth >= max_depth:
            continue
        
        for friend in social_graph.get_neighbors(current_user):
            if friend not in visited:
                visited.add(friend)
                suggestions[friend] = suggestions.get(friend, 0) + 1
                queue.append((friend, depth + 1))
    
    return sorted(suggestions.items(), key=lambda x: x[1], reverse=True)

GPS Navigation

Dijkstra/A*: Find shortest route

def find_shortest_route(graph, start, destination):
    distances, predecessors = dijkstra(graph, start)
    
    if distances[destination] == float('inf'):
        return None
    
    path = []
    current = destination
    while current is not None:
        path.append(current)
        current = predecessors[current]
    
    return {
        'path': path[::-1],
        'distance': distances[destination]
    }

Graph Implementation Checklist

Choose adjacency matrix for dense graphs (V² ≤ E)
Choose adjacency list for sparse graphs (E << V²)
Track visited nodes to avoid infinite loops in traversal
For weighted graphs, store weights alongside neighbors
Use priority queue (heap) for Dijkstra's algorithm
For negative weights, use Bellman-Ford instead of Dijkstra
Topological sort only works on DAGs (Directed Acyclic Graphs)
BFS gives shortest path in unweighted graphs

DSA Expert Insight: Graph problems appear in 40% of technical interviews, especially in pathfinding and optimization scenarios. Mastering both fundamental traversals (BFS/DFS) and advanced algorithms (Dijkstra, Topological Sort) is crucial for solving complex real-world problems. When choosing between adjacency matrix and list, use matrix for dense graphs (where V² ≈ E) and list for sparse graphs (where V² ≫ E). For pathfinding, BFS for unweighted, Dijkstra for weighted without negatives, and Bellman-Ford for graphs with negative edges.

Share and Join the Discussion

You need to be logged in to participate in this discussion.

×
×