How can I make a recursive search for longest node more efficient?

AucklandInvader

I'm trying to find the longest path in a Directed Acyclic graph. At the moment, my code seems to be running time complexity of O(n3).

The graph is of input {0: [1,2], 1: [2,3], 3: [4,5] }

#Input: dictionary: graph, int: start, list: path
#Output: List: the longest path in the graph (Recurrance)
# This is a modification of a depth first search
def find_longest_path(graph, start, path=[]):
    path = path + [start]
    paths = path
    for node in graph[start]:
        if node not in path:
            newpaths = find_longest_path(graph, node, path)
            #Only take the new path if its length is greater than the current path
            if(len(newpaths) > len(paths)):
                paths = newpaths

    return paths

It returns a list of nodes in the form e.g. [0,1,3,5]

How can I make this more efficient than O(n3)? Is recursion the right way to solve this or should I be using a different loop?

Parakleta

You can solve this problem in O(n+e) (i.e. linear in the number of nodes + edges).

The idea is that you first create a topological sort (I'm a fan of Tarjan's algorithm) and the set of reverse edges. It always helps if you can decompose your problem to leverage existing solutions.

You then walk the topological sort backwards pushing to each parent node its child's distance + 1 (keeping maximums in case there are multiple paths). Keep track of the node with the largest distance seen so far.

When you have finished annotating all of the nodes with distances you can just start at the node with the largest distance which will be your longest path root, and then walk down your graph choosing the children that are exactly one count less than the current node (since they lie on the critical path).

In general, when trying to find an optimal complexity algorithm don't be afraid to run multiple stages one after the other. Five O(n) algorithms run sequentially is still O(n) and is still better than O(n2) from a complexity perspective (although it may be worse real running time depending on the constant costs/factors and the size of n).

ETA: I just noticed you have a start node. This makes it simply a case of doing a depth first search and keeping the longest solution seen so far which is just O(n+e) anyway. Recursion is fine or you can keep a list/stack of visited nodes (you have to be careful when finding the next child each time you backtrack).

As you backtrack from your depth first search you need to store the longest path from that node to a leaf so that you don't re-process any sub-trees. This will also serve as a visited flag (i.e. in addition to doing the node not in path test also have a node not in subpath_cache test before recursing). Instead of storing the subpath you could store the length and then rebuild the path once you're finished based on sequential values as discussed above (critical path).

ETA2: Here's a solution.

def find_longest_path_rec(graph, parent, cache):
    maxlen = 0
    for node in graph[parent]:
        if node in cache:
            pass
        elif node not in graph:
            cache[node] = 1
        else:
            cache[node] = find_longest_path_rec(graph, node, cache)

        maxlen = max(maxlen, cache[node])

    return maxlen + 1

def find_longest_path(graph, start):
    cache = {}
    maxlen = find_longest_path_rec(graph, start, cache)
    path = [start]
    for i in range(maxlen-1, 0, -1):
        for node in graph[path[-1]]:
            if cache[node] == i:
                path.append(node)
                break
        else:
            assert(0)
    return path

Note that I've removed the node not in path test because I'm assuming that you're actually supplying a DAG as claimed. If you want that check you should really be raising an error rather than ignoring it. Also note that I've added the assertion to the else clause of the for to document that we must always find a valid next (sequential) node in the path.

ETA3: The final for loop is a little confusing. What we're doing is considering that in the critical path all of the node distances must be sequential. Consider node 0 is distance 4, node 1 is distance 3 and node 2 is distance 1. If our path started [0, 2, ...] we have a contradiction because node 0 is not 1 further from a leaf than 2.

Collected from the Internet

Please contact [email protected] to delete if infringement.

edited at
0

Comments

0 comments
Login to comment

Related

How can I make this more efficient in Android?

How can I make this more efficient? (Merging arrays in C)

How can I make my trie more efficient?

How can I make this Python code more efficient

How can I make this PHP function more efficient at scale?

How can i make this modification on Dijkstra Algorithm more efficient?

How can I make a code more efficient and shorter?

How can I make my pandas code more efficient?

How can I make this PyTorch heatmap function faster and more efficient?

How can I make my website more efficient?

How can I make this algorithm more efficient for a puzzle?

Can I make this macro more efficient or faster?

How can I make my IsItAHoliday function more efficient?

How can I make this pl/sql cursor more efficient?

How can I make this search query more efficient?

How can I make large IN clauses more efficient in SQL Server?

How can I make this loop more efficient?

Can I make this SQL query more efficient?

How can I make this more efficient path finding?

How can I use iteration to make this vbnet code more efficient?

How can I concatenate my code properly and make this more efficient?

How can i make this script more efficient?(Python)

How can I make this program more efficient, short, and less complex?

How can I make this R function more efficient?

How can I make my VBA error handling more efficient

How can I make this python code more efficient?

How can I refactor this code snippet to make it more efficient?

How can I make this C# code more efficient?

How can I optimize this VBA code to make it more efficient and faster?