Finding unique values in an array with possible nested arrays

Question

I'm writing a function that receives an array and returns an object with the count of unique values, pretty much like other questions here or in StackExchange. Except that in my case, this array may contain a nested array, like:

var array = [1, 2, 'a', 'b', [3, 'c', 1], 'd', 2, 'b'];

Here is my code:

function findUniques(array, occurrences) {
    if (!occurrences) {
        occurrences = {};
    }

    //checking if parameter passed is an array
    if (Array.isArray(array)) {
        for (let i = 0; i < array.length; i++) {
            //if item is a nested array, call findUniques again
            if (Array.isArray(array[i])) {
                occurrences = findUniques(array[i], occurrences);
            } else {
                //add to/update count in occurrences object
                occurrences[array[i]] = occurrences[array[i]] + 1 || 1;
            }
        }
    }

    return occurrences;
}

It's returning the following object, as it was supposed to:

{ 
    '1': 2, 
    '2': 2, 
    '3': 1, 
    a: 1, 
    b: 2, 
    c: 1, 
    d: 1 
}

My worry is that I'm using recursion.

Is it possible to optimize this function?

(Bear in mind that I must only use vanilla JavaScript)

Just noting that the input array you have above does not generate the output object you have below. — cbojar, Commented Sep 7, 2015 at 20:54

cbojar · Accepted Answer · 2015-09-07 20:52:53Z

When we use recursion, we are building an algorithm based on an implicit¹ stack, the function call stack. We are pushing and popping elements, essentially every time we recurse down and back up. If we make the stack explicit in our code, we can make the method iterative instead, and skip the recursive calls.

A pseudocode version of the algorithm would look like:

function findOccurences(array):
    ensure array is actually an array

    set occurrences to empty object

    create a stack
    push all elements in array onto the stack
    while the stack is not empty:
        if pop returns an array:
            push all of the elements in the popped array onto the stack
            continue while

        process the popped element

    return occurrences

A nice feature in Javascript arrays is that they have stack semantics², so we can just use a built in array as our stack. We can actually get a little more clever than that, though. Since we are putting every element from one array into our stack, which is also an array, we could just as effectively clone the original array and use that as our stack³. We can then just translate the pseudocode into Javascript.

function findOccurrences(array) {
    if(!Array.isArray(array)) {
        return {};
    }

    var occurrences = {};

    // JS arrays don't have a native clone method so
    // Array#slice with a start point of 0 creates
    // a copy of our array. If handling in the same
    // order is important, tack a .reverse() at the
    // end.
    var stack = array.slice(0);

    while(stack.length !== 0) {
        var nextElement = stack.pop();
        if(Array.isArray(nextElement)) {
            // We use some fanciness here so we don't have
            // to write out an explicit loop
            [].push.apply(stack, nextElement);
            continue;
        }

        occurrences[nextElement] = occurrences[nextElement] + 1 || 1;
    }

    return occurrences;
}

¹ Implicit to our algorithm, explicit to our runtime.

² Array#push, Array#pop

³ We could just use the original array, but since it is passed by reference, we are going to corrupt the array in the process and that could lead to unwanted side effects.

I really like the way you explained it all. But as @ErikR mentioned in the other answer, it will create a new copy of the array. Won't it affect performance on large arrays just like in the other answer? — gfnogueira, Commented Sep 8, 2015 at 15:01
@gfnogueira It does create a new copy of the array, which might affect performance. To know for sure, someone would have to profile the code to compare. If the array copy becomes a bottleneck, this can be adapted. If we don't care about preserving the original array, we can just use it directly as a stack. If we need to preserve it, but don't want to copy it, we can make the code much more complicated by only using the stack on elements that are arrays. That would look ugly and confusing, but would use less memory. — cbojar, Commented Sep 8, 2015 at 15:20

ErikR · Accepted Answer · 2015-09-07 18:54:12Z

This may or may not result in faster code, but you can avoid checking that occurrences is defined as well as avoid returning it by writing the code this way:

function findOccurrences(array) {
    var occurrences = {};
    findOccurrences0(array, occurrences)
    return occurrences
}

function findOccurrences0(array, occurrences) {
    var len = array.length;
    for (var i = 0; i < array.length; i++) {
        if (Array.isArray(array[i])) {
            findOccurences0(array[i], occurrences);
        } else {
          ...
        }
    }
}

Note that findOccurrences0 always assumes that occurrences is defined, and it doesn't have to return it since all invocations modify the same object.

Joseph · Accepted Answer · 2015-09-07 21:03:16Z

1

First of all, I would correct that data if possible. If it's "a list of items", then it should be "a list of items", and not a list of mixed data. Unpredictable data structures is one of the nightmares that causes apps to break. Developers tend to work around them instead of solving the root problem, because writing code that works around the issue is faster than updating the logic the creates it... and deadlines.

Anyways, if you're just counting occurrences, you could just flatten out the array, and start counting. Flattening it out first would mean you have a 1D array to play with, easier to comprehend at first read than having to step through the code.

function findOcurrences(array){
  // flatten out the array
  var flat = array.reduce(function(carry, item){
    // This allows us to append either a value or an array without fuss
    return carry.concat(item);
  }, []);
  
  // count the ocurrences
  return flat.reduce(function(carry, item){
    carry[item] = (carry[item] || 0) + 1;
    return carry;
  }, {});
}

var array = [1, 2, 'a', 'b', [3, 'c', 1], 'd', 2, 'b'];

var occurrences = findOcurrences(array);

// SE should seriously have a console printer instead of an HTML results view
document.write(JSON.stringify(occurrences));

edited Sep 7, 2015 at 21:03

answered Sep 7, 2015 at 20:54

Joseph

25.3k2 gold badges26 silver badges37 bronze badges

2

\$\begingroup\$ Correct me if I'm wrong, but flattening will create a whole new copy of the array(even if the array has no nested sub-arrays), and this will affect performance when used on large arrays. A recursive solution only requires O(d) space where d is the maximum number of nested arrays. \$\endgroup\$
– ErikR
Commented Sep 8, 2015 at 1:03
1

\$\begingroup\$ @ErikR Yup. That's true. But the OP never specifically said optimize for performance, and I'm optimizing for conciseness. Fast code is worthless if you start to use bandaids 2 weeks from now. \$\endgroup\$
– Joseph
Commented Sep 8, 2015 at 2:04
\$\begingroup\$ @JosephtheDreamer my mistake, I should've mentioned it was to optimize for performance. But your idea to flatten the array gave me ideas to refactor other projects that I'm working on. \$\endgroup\$
– gfnogueira
Commented Sep 8, 2015 at 14:56

Add a comment |

Stack Exchange Network

Finding unique values in an array with possible nested arrays

3 Answers 3

Hot Network Questions

Finding unique values in an array with possible nested arrays

3 Answers 3

Related

Hot Network Questions