Sorting algorithms were once a basic part of every software developer’s repertoire. Developers were supposed to know how to use them the way a carpenter knows how to use a saw. With so many other demands on their time today, it’s easy to lose sight of this basic skill set. Here’s a quick overview of them, with references for further study.
The basics of sorting
The underlying idea is simple. A set of data has an ordering property, so that for any two items, one is less than the other or the two are equal. It’s a transitive property. If a < b and b < c, a < c. They start in some arbitrary order, and the goal is to put them in ascending or descending order.
Let’s use array notation to make that a little more formal. The data to be sorted are in an array A, of length n. This will be an ascending sort. The algorithm re-orders the array. After sorting, the relationship A[i] <= A[i+1] will be true for all values of i from 0 through n – 2.
Many algorithms are possible, but they aren’t all equally good. One algorithm may be faster, simpler to code, or more economical in its use of memory. Some work better with certain data structures. There’s no one best algorithm for all use cases; you can sort sorting algorithms by quality. This article presents brief discussions of just a few of them, to encourage your appetite. For simplicity, they assume an ascending sort.
One of the simplest methods is the interchange sort. It’s also called the bubble sort because on each pass, the largest value will “bubble up” to the top. It’s easy to implement but competes poorly with other algorithms on efficiency. It just goes through the array and swaps two values if they’re out of sequence. It repeats the process until there’s nothing left to swap. Each pass is guaranteed to move the largest value to the end, so the upper bound for the sort can be decremented by 1 each time.
To see why this is inefficient, suppose that A[n – 1] is the smallest value in the array. It will only move down by 1 position on each pass, so the sort will take n – 1 passes.
The insertion sort is easy to understand. It builds up the sorted part of the array from left to right. On the first pass, it takes A and puts it either before or after A as the sorting order requires. There’s now a sorted subarray of 2 items. The algorithm then takes A and puts it in the right position so A through A are correctly sorted. Continuing this way, it sorts all the items in just one pass. However, each pass requires comparing a value with all the previously sorted ones, as well as a lot of moving around of previously sorted items.
This sort is especially well suited for linked lists, since each item just has to be linked in the right place without moving anything else. There’s a more extended explanation of the insertion sort on Interactive Python.
The selection sort is similar to the insertion sort but simpler and less efficient. Its main advantage is that it doesn’t require much memory beyond the array to be sorted.
This algorithm takes multiple passes over the array, building up a sorted part on the left. Initially the whole array is unsorted. The selection sort finds the smallest value and moves it to the start of the array by swapping with the value already there. The sorted part is now one item longer, and the unsorted part one item shorter. Then the algorithm repeats the operation on the remaining unsorted part.
That is, it starts by looking at items 0 through n – 1 and puts the smallest item at index 0. Then it repeats with 1 through n – 1, 2 through n – 1, and so on.
With a linked list, the process has the same effect but works a little differently. It starts by creating an empty sorted list. On each pass, the smallest item is removed from the unsorted list and appended to the sorted list.
Here’s a YouTube video which explains and demonstrates a selection sort.
This algorithm claims speed in its name. Every pass splits the sort into partitions, each of which is sorted separately. It gets the values into place faster than the simple interchange sort.
It uses two indices, i and j. Initially, i is set to 0, and j is set to n – 1. The algorithm compares A[i] with A[j]. If they’re out of sequence, it swaps A[i] and A[j] and then adds 1 to i. If they’re in the correct sequence, it leaves them alone and subtracts 1 from j. The process continues until i and j meet in the middle. This will produce two partitions such that all the values in the first partition (the one with the lower index values) are less than or equal to all the values in the second partition. Then a Quicksort is performed on each partition separately unless a partition is down to size 1.
TutorialsPoint has a more extended explanation of Quicksort.
The heap sort is the most complicated of the algorithms we’ll look at here. It requires building a binary tree. That’s a tree structure where each node has no more than two children. Every node, including both parent and child nodes, corresponds to one of the values in the array to sort. Initially it reflects the unsorted order of the array.
The next step is to reorganize the tree into what’s called a “heap.” It’s still a binary tree, but every node has the property that a child node always has a value which is less than its parent node’s value. The tree maintains certain properties so that it’s efficient. To the greatest extent possible, all branches go to the same depth, and nodes always have either zero or two children.
Once the heap is built, the algorithm takes the root node, which will always be the largest value, and puts it at the end of the array. It then “heapifies” the tree minus the root node. This will put the next largest value at the root, so it pulls that out and puts it in the next position to the left. The process repeats until the heap is gone and all values are in the array in ascending order.
This discussion omits a lot of steps due to space limitations. A more complete explanation of heap sort can be found on Programiz.
These are a very small sampling of the algorithms available for sorting. For further study, look at the long list of algorithms on GeeksforGeeks, or start exploring at the Wikipedia page on the subject.