A co worker emailed out an interesting question, how would we optimise the following piece of code –

The code looks fast, light and up to the job right? Sure, it will work fine and it will be reasonably fast, but it can always be faster. The first obvious waste of time is with the sort; sorts are lengthy beasts and as much as possible should be done to minimise their impact. The aim of this function is to get the average of the k smallest values in the array, imagine if the array is ten thousand elements big and you only need the average of the two smallest? Sorting that entire array just to get two values is a waste of time.

So, how do we optimise this? My lecturer at university always used to tell us how good the STL is and that it was nearly always better and faster than anything we can write. The obvious solution is to move away from the primitive programming shown in the above function and see what the STL can do for us; luckily enough there are several great functions that do exactly what we want. The first great time saver is the partial sort which simply sorts the array in such a way that the range we specify contains the sorted elements but anything above that middle point is left unsorted. Simply using this method would save a large amount of time over the original method of doing a full sort.

The only requirement of these improvements is to change the function parameters to accept a vector rather than a float array, luckily there were no rules set in the original question!

In order to show that these changes would be faster I wrote some code to prove it. It runs the old function and the new function ten thousand times with the same data and calculates the time taken based on clock cycles. The quicker the better. Take note that the time also includes creating a copy of the input array. If we just pass in the array it will be sorted in place so in reality we are testing how quick it is on already sorted data.

And the results speak for themselves, across ten thousand iterations of each function the STL function is generally twice as fast.

Original function: (0.0157705) 20.56 sec
STL function: (0.0157705) 7.75 sec