0

I have a for loop running over a list of objects like:

[{a: 2001, b: "hello"}, {a: 54, b: "hi"}....]

In this loop, I filter out objects based on certain field values (like, b == hello?) and create a new list of filtered objects. Once the for loop is complete, I sort the remaining objects by the value of a. This is an oversimplified example, so presorting by field a isn't possible.

My question is, is it faster to do the sort after the loop is complete vs doing something like a binary search at the end of each iteration and then insert the object then? That would avoid sorting at the end, but I don't really know if that ends up costing more.

1
  • 1
    Have you profiled it?
    – user40980
    Commented Apr 28, 2014 at 1:04

1 Answer 1

3

It is in general faster to build a collection of values first, then sort. If you insert into an ordered container, you'll be paying the price for maintaining an ordered collection (which, depending on the underlying implementation may take ~2 lg n compares, or, as you seem to be implying by saying "binary search," ~lg n compares and then ~n/2 swaps).

If, on the other hand, you simply store your items in an unordered manner and then sort, you allow faster algorithms to sort all your items at the end for ~n lg n compares, which, if we amortize to the n elements you've been inserting, this method would take ~lg n compares per element (and fewer swaps) cost, cheaper than both options for maintaining the ordered collection.

Of course, profiling will always tell you what works best on your machine. However, for sufficiently large data sets it would be best to filter, then sort.

Minor note: Technically, when inserting into an ordered container, the operations won't be dependent on the total number of items n but rather the current number of items k, increasing from 1 to n. However, the average cost for these operations comes out to be ~lg n to within a constant factor - see Stirling's approximation.

2
  • Note that if the OP switched to a sort-optimized collection (rather than an array), insertion of a single element could be done in O(lg n), rather than O(n).
    – Brian
    Commented Apr 28, 2014 at 3:00
  • @Brian I mention that, though briefly. This would still be slower than sorting after because of tree rotation costs or relaxed tree depth requirements.
    – VF1
    Commented Apr 28, 2014 at 7:10

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.