Dijkstra Algorithm

>Graph>Shortest Path>Dijkstra Algorithm
Occasional
Sometimes appears

Dijkstra's algorithm is a fundamental algorithm used to find the shortest path in a graph where all edge weights are non-negative. Below, you'll find a template for implementing the algorithm. If you're new to Dijkstra's algorithm, I recommend starting with a basic understanding of its concept. Here are two excellent resources to help you get started:

This post focuses on the general idea of Dijkstra's algorithm and provides a template for practice and memorization. If you're unfamiliar with Python syntax, check out this cheatsheet for an overview of essential syntax.

Struggling to memorize?
Rather than memorizing the template line by line, please refer to the How to Memorize section.

Important!

Key Points about Dijkstra's Algorithm and This Template:

  • Works only with non-negative weights: Dijkstra's algorithm is not reliable if the graph contains negative edge weights.
  • Time complexity of this template: O((E + V) * log(E))
    • Curious about why? Consider researching the algorithm's heap-based implementation.
  • Optimal time complexity: When implemented with a Fibonacci heap, Dijkstra achieves O(V * log(V) + E).
    • However, this is rarely practical in interviews due to its complexity.
  • Space complexity of this template: O(E + V)

Dijkstra Template

dijkstra_demo.png
The adjacency list in the template will be based on this figure
from collections import defaultdict
import heapq
 
""" Example adjacency list """
adj_list = {
    0: [(1, 1), (2, 5)],  # e.g., edge to node 2 with weight 5
    1: [(3, 100)],
    2: [(3, 6), (4, 8)],
    3: [(5, 2)],
    4: [(5, 3)]
}
 
 
def dijkstra(adj_list):
    """
    Initialize distances to all nodes as infinite.
    Start node has a distance of 0.
    """
    dist_to_reach_node = defaultdict(lambda: float('inf'))
    start_node = 0
    start_dist = 0
    end_node = 5  # The target node we want to reach
 
    dist_to_reach_node[start_node] = start_dist
    min_heap = [(start_dist, start_node)]
 
    while min_heap:
        """ Get the node with the smallest distance """
        curr_dist, curr_node  = heapq.heappop(min_heap)
 
        """ Ignore outdated distances """
        if curr_dist != dist_to_reach_node[curr_node]:
            continue
 
        if curr_node == end_node:
            return dist_to_reach_node[end_node]
 
        for neighbor, weight in adj_list[curr_node]:
            new_dist = curr_dist + weight
 
            """ If the new distance is shorter than the current stored one,
            update the dictionary and push the new distance onto the heap. """
            if new_dist < dist_to_reach_node[neighbor]:
                dist_to_reach_node[neighbor] = new_dist
                heapq.heappush(min_heap, (new_dist, neighbor))
 
    return -1 # Return -1 if the target node is unreachable
 
print(dijkstra(adj_list))
""" Output (shortest path: 0 -> 2 -> 3 -> 5)
13
"""

How to Memorize

  • Dijkstra's algorithm is a greedy algorithm.
  • We only process a node once we've already computed the shortest distance to reach it.
  • A priority queue (min heap) is used to efficiently retrieve the node with the shortest distance.
  • We ensure that the node and distance popped from the priority queue are not outdated:
    • Why? A node can be pointed to by multiple other nodes. This means there can be multiple entries in the heap referring to the same node, but only one of these entries contains the correct (shortest) distance.
    • This check is done with the line: if curr_dist != dist_to_reach_node[curr_node]: continue

Dijkstra Template (Trace Path)

To return the shortest path itself (not just the minimum cost), additional bookkeeping is required to track the path from the source to the destination. This is done using a prev_node array that stores the predecessor of each node on the shortest path.

from collections import defaultdict, deque
import heapq
 
""" Example adjacency list"""
adj_list = {
    0: [(1, 1), (2, 5)],  # e.g., edge to node 2 with weight 5
    1: [(3, 100)],
    2: [(3, 6), (4, 8)],
    3: [(5, 2)],
    4: [(5, 3)]
}
 
 
def dijkstra(adj_list):
    """
    Initialize distances to all nodes as infinite.
    Start node has a distance of 0.
    """
    dist_to_reach_node = defaultdict(lambda: float('inf'))
    n = 6 # number of nodes in graph
    prev_node = [None] * n # to track the previous node for reconstructing the path
    start_node = 0
    start_dist = 0
    end_node = 5  # The target node we want to reach
 
    dist_to_reach_node[start_node] = start_dist
    min_heap = [(start_dist, start_node)]
 
    while min_heap:
        """ Get the node with the smallest distance """
        curr_dist, curr_node  = heapq.heappop(min_heap)
 
        """ Ignore outdated distances """
        if curr_dist != dist_to_reach_node[curr_node]:
            continue
 
        if curr_node == end_node: # If the end node is reached, terminate early
            break
 
        for neighbor, weight in adj_list[curr_node]:
            new_dist = curr_dist + weight
 
            """ If the new distance is shorter than the current stored one,
            update the dictionary and push the new distance onto the heap. """
            if new_dist < dist_to_reach_node[neighbor]:
                dist_to_reach_node[neighbor] = new_dist
                prev_node[neighbor] = curr_node
                heapq.heappush(min_heap, (new_dist, neighbor))
 
    shortest_path = deque()
    curr_node = end_node
    curr_node = end_node
    while curr_node is not None:
        shortest_path.appendleft(curr_node)
        curr_node = prev_node[curr_node]
 
    if shortest_path[0] != start_node:
        return []  # No valid path found
 
    return list(shortest_path)
 
print(dijkstra(adj_list))
""" Output
[0, 2, 3, 5]
"""

Deep Dive on Dijkstra's Time Complexity

I like to think of Dijkstra's time complexity as consisting of two parts: 1) the number of times we pop from the heap, and 2) the number of times we push to the heap. For popping, we know that we can pop at most V times because we never revisit a node. On the other hand, each time we visit a node, we push at most all of its neighbors to the queue. In other words, we push at most E edges to the heap in total. This implies that the size of the heap is at most E. Combining these observations, we can conclude that pushing to the heap takes E*log(E), and popping takes V * log(E), resulting in a total time complexity of (V + E) * log(E).

Misconception on Time Complexity

You might have heard that Dijkstra's algorithm runs in O((V + E) * log (E)) instead of O((V + E) * log (V)), but they are actually mathematically equivalent. This is because E can at most be V^2, which means that O((V + E) * log (V^2)) = O(2 * (V + E) * log (V)) = O((V + E) * log (V)).

Back to Top