Network Delay Time - Djikstra Algorithm ~ Naveen's Blog

Problem Statement:

You are given a network of n nodes, labeled from 1 to n. You are also given times, a list of travel times as directed edges times[i] = (u_i, v_i, w_i), where u_i is the source node, v_i is the target node, and w_i is the time it takes for a signal to travel from source to target.

We will send a signal from a given node k. Return the minimum time it takes for all the n nodes to receive the signal. If it is impossible for all the n nodes to receive the signal, return -1.

ELI5

Just find the shortest path between the given node n to all the other nodes. When i am writing this post, i have no idea about djisktra, i am going to read about djikstra, develop my own intuition then apply on this problem

Trying to solve..

First i would like to create a graph data structure from the given input, so i just created a dict where each key is a node and values will be the nodes which are connected to the node in k.

        for u, v, time in times:
            graph[u].append([v, time]) 

Now we have a dictionary which kind of feels like a linked list, i.e we just have a node which is connected to other nodes.

How our result look like ?

we just need to return a list of times, so we will initialize that

        times = [float("inf")] * n

why float("inf") ? there is a edge case where reaching all nodes might not be possible, this times will be used to store numbers, initializing it to infinity seems logical because times is in range of 1 to 6000 in this case, this infinity can represent node is not reachable instead of using None, in that case i would need to perform a type check every time when i need to access the value.

Base Case

I am sending the signal from node k, so the time taken to reach the node k from k would always obviously be 0, lets code that up

        times[k-1] = 0

k-1, because arrays start from 0

Brute Force:

Before starting to perform any sort of iterations, i want to keep a list of vertices i visit, because i dont want to end up in a cycle where the code would run forever, what i meant is take this example graph

If we start from A and compute min distance between A and B, then we go to B and find C and from C we will find A again. How do you prevent this from happening? The obvious way to do this is by remembering which nodes you visited.

Thats ok we can keep a list, but we would need O(n) time to query that list. Ok scratch that, we can use a set, it just takes O(1) lookup time.

        visited = set()

Alright, all set to start and fail with our brute force solution.

Steps:

1. We will start with node k, explore the neighbours of k

2. Then we will collect all of them in to a list

3. Then we start exploring elements in list and then add back the newly found nodes in the queue.

4. If the node is already in visited then we dont need to visit it again.

        nodes = graph[k]
        while nodes:
            temp = []
            for i in range(len(nodes)):
                if nodes[i][0] not in visited:
                    destination_node, time_took = nodes[i]
                    temp.extend(graph[destination_node])
                    # we will mark the destination node as visited, since
                    # we collected all the nodes connected to it in the temp
                    # array
                    visited.add(destination_node)
            # once we finish exploring the nodes then we set to explore
            # its descendants
            nodes = temp
                    

This now does the basic breadth first search on the graph. This doesnt calculate the time it took for the signal to travel, lets include that logic in the code

        nodes = graph[k]
        while nodes:
            temp = []
            for i in range(len(nodes)):
                if nodes[i][0] not in visited:
                    destination_node, time_took = nodes[i]
                    child_nodes = graph[destination_node]
                    child_nodes = [[x[0], x[1] + time_took] for x in child_nodes]
                    temp.extend(child_nodes)
                    # we will mark the destination node as visited, since
                    # we collected all the nodes connected to it in the temp
                    # array
                    visited.add(destination_node)
            # once we finish exploring the nodes then we set to explore
            # its descendants
            nodes = temp

                    child_nodes = graph[destination_node]
                    child_nodes = [[x[0], x[1] + time_took] for x in child_nodes]

time took to reach child node = time took to travel from parent to child node + existing time which will use to reach its children

we are updating this since the signal is transmitted from node k. When we are updating time we also need to check if that could be the minimum time to reach the node, lets do that with few lines of code

                    destination_node, time_took = nodes[i]
                    result[destination_node-1] = min(result[destination_node-1], time_took)

alright, i guess we completed the code, lets run it

after submission i got error on this test case

i am quite surprised, i was under the impression that i covered all the edge cases, ok lets visualize

I left this edge unexplored, because according to my logic visited nodes should not be visited again, but i am not sure how would i be able to prevent the cycle if i cant keep track of visited nodes.. i could formulate some thing like this, if i reach the initial node again, then its a cycle since i started from k

That doesnt get me far, this is too big to visualize but i am sure i got hit by this edge case

Any nodes which are not the start pointed at each other, ok its very clear that i need to maintain a visited set, we are back to square one.

What i want to do is to find the minimum edge, may be i can explore this edge if its less than previous distance ?

Lets go back to this failing test case

here i didnt explore edge with weight 2 since i had visited 3 via edge with weight 4. i think i can change this condition

                if nodes[i][0] not in visited: 

                if nodes[i][0] not in visited or result[nodes[i][0] - 1] > nodes[i][1]:  

if you hate this nested array comparison syntax you are not alone, i hate it too.

Usually in production code i would probably make this in to a helper method which will be more readable. But in terms of psuedo code this exactly what i am writing

```

if node is not visited or time took to visit the node is greater than our current path

```

We explore the same node again if that condition matches, lets submit

Boom, it worked :-)

This completes our brute force solution, lets analyze why this very inefficient before moving to djisktra

In the green i marked the distance of nodes when starting from A. The distances are marked before discovering our edge B - C which has distance of 2.Once its discovered now every edge leaving C must be recomputed. The reason is we discovered a shortest path to C. Before discovering that edge the shortest path to D would be 10, after discovering the edge it will just be 4

The brute force solution is inefficient because once we discover a shortest path to a node, then we have to recompute all the edges leaving that node. This is exactly what we will try to optimize.

When i thought about this, i was wondering why we need to visit all the paths from node C if the shortest path to C is discovered, we could simply skip it. But then i realized the distance from A is computed by adding these edge lengths on level by level.

So when you discover a shortest path to a node, then all of its children needs to be revisited.

How to do that ?

I created another test case for intuition

Lets say there are three edges leaving A to C, each has distinct weight 1,2,3

Which edge you would choose so that you dont have recalculate the nodes starting from C? Thats easy we can go with edge 1, because other edges are greater than this edge. So we would never have to recompute the nodes connected from C. Well thats about it, we just discovered djisktra algorithm. We just need to apply this optimization over our bruteforce solution.

The solution stays same, instead of a simple list we are going to use min_heap to store nodes. This will not explore nodes level by level, we are going to put the nodes we discover in to our min heap, we will pick the one which has the minimum value.

class Solution:
    def networkDelayTime(self, times: List[List[int]], n: int, k: int) -> int:
        graph = defaultdict(list)
        for u, v, time_took_to_reach_parent_vertex in times:
            graph[u].append([time_took_to_reach_parent_vertex, v])

        result = [float("+inf")] * n
        result[k - 1] = 0
        visited = set()
        visited.add(k)

        min_heap = [[0,k]]

        while min_heap:
            time_took_to_reach_parent_vertex, parent_vertex = heapq.heappop(min_heap)

            result[parent_vertex - 1] = min(result[parent_vertex - 1], time_took_to_reach_parent_vertex)

            for connected_vertex_time, connected_vertex in graph[parent_vertex]:
                total_time = time_took_to_reach_parent_vertex + connected_vertex_time
                if total_time < result[connected_vertex-1]:
                    result[connected_vertex-1] = total_time
                    heapq.heappush(min_heap, [connected_vertex_time + time_took_to_reach_parent_vertex, connected_vertex])

        minimum_time_taken = max(result)
        return minimum_time_taken if minimum_time_taken != float("inf") else -1

thats about it, we implemented djisktra optimization over our bruteforce solution. The major thing we prevented is number of times the distance is recomputed once a minium length to node is discovered.

Naveen's Blog

Network Delay Time - Djikstra Algorithm

0 comments:

Post a Comment