Performance of Recursive Programming

user3396283

The following recursive code will cause a stack overflow if the array list is too large. How can you fix this and still retain the recursive pattern?

var list = readHugeList();
var nextListItem = function() {
  var item = list.pop();
  if (item) {
    // process the list item...
    nextListItem();
  }
};
VLAZ

Use the event queue

What you can easily do in this case is offload the recursion onto the event queue.

var list = readHugeList();
var nextListItem = function() {
  var item = list.pop();
  if (item) {
    console.log("processing", item);
    
    // schedule the processing of the next list item
    setTimeout(nextListItem);
  }
};

nextListItem();


function readHugeList() {
  return [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]; //example
}

This is the simplest thing you can do - setTimeout will schedule nextListItem to be called again, but only after the current stack frame has cleared. This way, you never cause a stack overflow and can handle arbitrarily large input.

NOTE: This will not cause a stack overflow through recursion; however, it's going to be slow. The problem is that browsers clamp nested timeouts to a minimum delay of 4ms, so if each call takes less than 4ms to complete, the whole operation will take longer. For example, if each operation takes 1ms, the 4ms clamp makes the whole run several times longer. Even where the minimum delay does not apply (for example with setImmediate in Node.js), it will be faster but still slower than other alternatives - each call still has to wait for its turn on the event loop. While faster than a forced 4ms minimum wait, that is still extra overhead.

However, this method does allow other things to happen in the meantime, so even if it's slower, it might be a suitable implementation to prevent blocking. One use for this is producing incremental updates; another is letting other, unrelated tasks run so they are not delayed. So, it's a useful technique, but probably not the best one - if your dataset is large enough to cause a stack overflow, then you are definitely going to run into a time issue.
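If you do go the setTimeout route, one way to soften the speed penalty is to process a batch of items per tick rather than one, so the minimum delay is paid once per chunk instead of once per item. This is a sketch of my own, not part of the original answer - the names processInChunks and chunkSize are made up:

```javascript
// Process the list in chunks: each event-loop tick handles up to
// chunkSize items synchronously, then yields via setTimeout.
function processInChunks(list, processItem, chunkSize = 1000) {
  return new Promise(resolve => {
    function nextChunk() {
      // process up to chunkSize items synchronously in this tick
      for (let i = 0; i < chunkSize && list.length > 0; i++) {
        processItem(list.pop());
      }
      if (list.length > 0) {
        setTimeout(nextChunk); // yield to the event loop between chunks
      } else {
        resolve();
      }
    }
    nextChunk();
  });
}

// usage: 100 000 items but only ~100 timeouts, so the 4ms clamp barely matters
var items = Array.from({ length: 100000 }, (_, i) => i);
processInChunks(items, item => { /* process the item... */ })
  .then(() => console.log("done"));
```

The chunk size is a tuning knob: larger chunks reduce scheduling overhead but block the event loop for longer per tick.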

Use a trampoline

An alternative is to employ a trampoline. This is basically a manual implementation of tail call optimisation. Here is a sample implementation:

function trampoline(fn) {
  while (typeof fn === "function") {
    fn = fn();
  }

  return fn;
}

This is a simple implementation of the basics - it doesn't take arguments into account but better implementations do. There are library implementations of a trampoline, so you don't need to write it yourself.
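For illustration, here is one way such an argument-aware version can look - a sketch of my own, not from any particular library. The trampoline itself stays the same; the arguments simply travel inside the closure of each thunk:

```javascript
function trampoline(fn) {
  while (typeof fn === "function") {
    fn = fn();
  }
  return fn;
}

// Example: a recursive sum in trampoline style. Each "recursive call" is
// returned as a zero-argument thunk that captures the next arguments.
function sum(list, i = 0, total = 0) {
  if (i === list.length) {
    return total; // base case: a plain value stops the trampoline
  }
  return () => sum(list, i + 1, total + list[i]); // thunk carries the new args
}

console.log(trampoline(() => sum([1, 2, 3, 4, 5]))); // 15
```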

At any rate, a trampoline also requires a change to the recursive function - instead of calling itself directly, it has to return a thunk (a function wrapping the next call) which the trampoline then evaluates. So, here is how it can work:

var list = readHugeList();
var nextListItem = function() {
  var item = list.pop();
  if (item) {
    console.log("processing", item);
    
    // a thunk for the next list item
    return () => nextListItem();
  }
};

trampoline(nextListItem);


function trampoline(fn) {
  while (typeof fn === "function") {
    fn = fn();
  }

  return fn;
}

function readHugeList() {
  return [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]; //example
}

(Note: return () => nextListItem() can also be done simply as return nextListItem but I wanted to be explicit here)

The trampoline will keep calling the function until it stops returning more thunks to be evaluated. And each thunk itself will contain the recursive call to the function. So, instead of the function directly calling itself, you now call it indirectly when evaluating the thunk via fn(). And since this is done in a loop, it still prevents the stack overflow from having a very deeply nested recursion.
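To see the effect side by side, here is a small comparison of my own (the exact recursion depth that overflows varies by engine): the direct version blows the stack at a depth of a million, while the trampolined version completes:

```javascript
function countDirect(n) {
  if (n === 0) return "done";
  return countDirect(n - 1); // direct recursive call - one stack frame each
}

function countTrampolined(n) {
  if (n === 0) return "done";
  return () => countTrampolined(n - 1); // thunk - evaluated by the loop instead
}

function trampoline(fn) {
  while (typeof fn === "function") {
    fn = fn();
  }
  return fn;
}

try {
  countDirect(1e6); // overflows the call stack
} catch (e) {
  console.log(e.name); // "RangeError" in V8-based engines
}

console.log(trampoline(() => countTrampolined(1e6))); // done
```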

Third option: use the event queue and cheat a bit

OK, it's not really cheating, but you can give yourself an advantage. As mentioned, delegating the recursion to the event queue solves the stack overflow problem but introduces a speed problem, because it imposes a minimum delay on each of the deferred calls. Just to demonstrate how this affects performance, here is an example:

var list = readHugeList();
var nextListItem = function() {
  var item = list.pop();
  if (item !== undefined) { // a plain `if (item)` would stop early at the 0 item
    console.log("processing", item);
    
    // schedule the processing of the next list item
    setTimeout(nextListItem);
  } else {
    console.log("operation took (in ms):", performance.now() - startTime)
  }
};

var startTime = performance.now();
nextListItem();

function readHugeList() {
  //generate an array of 100 sequential numbers
  return Array.from({length: 100}, (_, i) => i);
}

Recursively going through a list of just one hundred items takes about a second (on my machine, at least). Whatever the exact time, it's too long - a hundred items is not that many and wouldn't even cause a call stack overflow in the first place. I won't try it with a number that might actually cause an overflow: it would work, but it would take dreadfully long to finish.

However, you can still leverage the event loop via microtasks. As a very quick summary, there are two types of tasks the event loop handles:

  • macrotasks (like what setTimeout schedules) - nested timeouts are clamped to a minimum delay of 4ms, and between macrotasks the browser can do other work, like rendering and animations.
  • microtasks - they are higher priority, have no minimum delay, and will execute as soon as possible.
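The priority difference is easy to demonstrate with a standalone snippet of mine, assuming a standard event loop such as a browser's or Node's:

```javascript
var order = [];

setTimeout(() => {
  order.push("macrotask");
  // by the time the macrotask runs, the microtask has already happened:
  console.log(order.join(", ")); // sync, microtask, macrotask
});

Promise.resolve().then(() => order.push("microtask"));

order.push("sync");
```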

If you offload to the microtask queue instead of the macrotask queue, then you will get much faster processing. To do that, you can use a Promise in the browser - resolved promises will add any further processing needed (callbacks added with .then) to the microtask queue.

Here is how this looks and behaves:

var list = readHugeList();
var nextListItem = function() {
  var item = list.pop();
  if (item !== undefined) { // a plain `if (item)` would stop early at the 0 item
    console.log("processing", item);
    
    // schedule the processing of the next list item
    Promise.resolve()//resolve immediately
      .then(nextListItem); //adds a microtask
  } else {
    console.log("operation took (in ms):", performance.now() - startTime)
  }
};

var startTime = performance.now();
nextListItem();

function readHugeList() {
  //generate an array of 10 000 sequential numbers
  return Array.from({length: 10000}, (_, i) => i);
}

This takes about a second and a half (on my machine). That is comparable with the setTimeout version, except that the microtask version processes two orders of magnitude more items: 10 000 versus only 100.
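As a side note, modern environments also expose queueMicrotask(), which schedules a microtask directly without allocating a promise per item. A small sketch (the explicit undefined check is mine, so a 0 in the data doesn't end the run early):

```javascript
var list = Array.from({ length: 10000 }, (_, i) => i);

var nextListItem = function () {
  var item = list.pop();
  if (item !== undefined) {
    // process the list item...
    queueMicrotask(nextListItem); // schedule the next item as a microtask
  }
};

nextListItem();
```

All of the queued microtasks drain before the next macrotask runs, so this behaves like the Promise version without the per-item promise allocation.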
