:orphan:

Iterative merge sort
---------------------------------------------------


For this problem, we have a method ``merge`` that takes two *sorted* arrays and merges them into an also sorted array. The strategy is to start with the smallest possible arrays as inputs to this method. These are arrays with one element only. Single-element arrays are, by definition, sorted. Merged pairs of single-element arrays are sorted arrays with two elements. Now we can take these 2-element arrays and merge them into 4-element arrays, and so on.

Let's work out this with an example, using an array with 8 elements:

.. code-block:: java

   int[] arr = {44, 11, 34, 21, 13, 12, 10, 38};
   

And for simplicity of notation, let's assume that ``arr[p:q]`` is a subarray of ``arr`` from position ``p`` up to (but not including) position ``q``. For example, ``arr[7:8]`` will be ``[38]``. (This is just a notation; in Java we must use a method to do that, for example ``slice(arr,p,q)`` as we did in the classroom; or use ``Arrays.copyOfRange``.)

Our first task is to partition ``arr`` into the following single element arrays:

.. code-block:: java
   
   arr[0:1], arr[1:2], arr[2:3], arr[3:4], arr[4:5], arr[5:6], arr[6,7], arr[7:8]
   //  [44]      [11]      [34]      [21]      [13]      [12]      [10]      [38]
   
Now we can merge these arrays in pairs:

.. code-block:: java

   merge(arr[0:1], arr[1:2]);  // merged result [11,44]
   merge(arr[2:3], arr[3:4]);  // merged result [21,34]
   merge(arr[4:5], arr[5:6]);  // merged result [12,13]
   merge(arr[6,7], arr[7:8]);  // merged result [10,38]

The next step is to put these results back into array ``arr``, which becomes:

.. code-block:: java

   [11, 44, 21, 34, 12, 13, 10, 38] // arr after putting merged results back in

At this point ``arr`` is far from sorted, but now we can partition it into sorted subarrays of size 2:


.. code-block:: java
   
      arr[0:2],    arr[2:4],    arr[4:6],    arr[6,8]
   //  [11,44]      [21,34]      [12,13]      [10,38]

And we can pass these pairs of 2-element subarrays to our ``merge`` method:

.. code-block:: java

   merge(arr[0:2], arr[2:4]);  // merged result [11,21,34,44]
   merge(arr[4:6], arr[6,8]);  // merged result [10,12,13,38]

These results will go back into ``arr``:

.. code-block:: java

   [11, 21, 34, 44, 10, 12, 13, 38] // arr after putting merged results back in

The array is not quite sorted yet, but we are getting there. Because now it can be partitioned into two subarrays of size 4 that they are already sorted. This pair of 4-element arrays will go into the ``merge`` method:

.. code-block:: java

   merge(arr[0:4], arr[4:8]);  // merged result [10,11,12,13,21,34,38,44]

The last step is to copy the results of the merge back to the array. And now ``arr`` is finally sorted.

To summarize our partition and merge strategy:

.. code-block:: java

   // merge calls for subarray size = 1

   merge(arr[0:1], arr[1:2]);  //
   merge(arr[2:3], arr[3:4]);  // Merging 
   merge(arr[4:5], arr[5:6]);  // 4 pairs
   merge(arr[6,7], arr[7:8]);  // 

   // merge calls for subarray size = 2

   merge(arr[0:2], arr[2:4]);  // Merging 
   merge(arr[4:6], arr[6,8]);  // 2 pairs

   // merge calls for subarray size = 4
   
   merge(arr[0:4], arr[4:8]);  // Merging 1 pair


Is there a way to relate the subarray size to the number of pairs we must merge? Yes; first by recognizing that the largest subarray we can have in a pair is half the size of the array we attempt to sort. If the array has :math:`n` elements -- and we can assume that :math:`n` is a power of 2 -- then the subarray sizes will progress from 1 to :math:`n/2`, doubling at each step, This can be accomplished with a for-loop:

.. code-block:: java
   
   for (int size = 1; size <= arr.length/2; size = size*2) {
   }

The loop above will give the subarray sizes but we also need a way to determine where each subarray of a given size begins and ends. It helps to see how the size of subarrays determines the number of pairs we get, and where the subarrays in these pairs begin and end. 

.. code-block:: text

    size = 1      size = 2      size = 4
    4 pairs       2 pairs       1 pair
   
   begin:end     begin:end     begin:end
   
       0:1           0:2           0:4
       1:2           2:4           4:8
       
       2:3           4:6
       3:4           6:8
       
       4:5
       5:6
       
       6:7
       7:8
       
Looking at the column for ``size=1`` above, we see a pattern for the pairs: the first pair begins at position 0, the second pair begins at position 2, the third at position 4, and the last pair at position 6. This seems like a loop along ``arr`` with step 2. 

Next, look at the column for ``size=2``. The first pair begins at 0 and the second pair at 4. This seems like a loop along ``arr`` with step 4.

And the last column, for ``size=4`` has only one pair that begins at 0. This seems like a loop along ``arr`` with step 8.

In each case above, we need a loop along ``arr``. The step of the loop changes as the size of subarrays changes. Specifically, the loop step is ``2*size``. This implies that the loop along ``arr`` depends on the loop for ``size`` we wrote above, leading to the following nested loop snippet:

.. code-block:: java
   
   for (int size = 1; size <= arr.length/2; size = size*2) {
       for (int i = 0; i < arr.length; i = i+2*size) {
       }
   }

Are we there yet? Yes we are! We have a loop that tells us where each pair of a given size begins. Each part of the pair is an array with ``size`` number of elements. If the first part of the pair begins at position ``i``, it corresponds to the array slice ``arr[i:i+size]``. Its second part will begin at ``i+size`` and because it also have ``size`` elements it will end at ``i+size+size``. The two subarrays in this pair, let's call them ``foo`` and ``bar`` can be added to the code:

.. code-block:: java

   for (int size = 1; size <= arr.length/2; size = size*2) {
       for (int i = 0; i < arr.length; i = i+2*size) {
           int[] foo = arr[i:i+size];
           int[] bar = arr[i+size:i+2*size];
       }
   }

Finally, we need to merge the pair ``foo`` and ``bar`` and place their merged result back to the array, at the positions where that pair came from.

.. code-block:: java

   for (int size = 1; size <= arr.length/2; size = size*2) {
       for (int i = 0; i < arr.length; i = i+2*size) {
           int[] foo = arr[i:i+size];
           int[] bar = arr[i+size:i+2*size];
           int[] temp = merge(foo,bar);  // merge the pair
           for (int j = 0; j < temp.length; j++) {
               arr[i+j] = temp[j];  // copy the result of the merge back to input arr
           }
       }
   }

In the code snippets above we use the notation ``arr[p:q]`` to indicate a subarray of ``arr`` from position ``p`` up to but not including ``q``. This notation does not work in Java, so we'll need to replace it either with a our own method (e.g., ``slice``) or use ``Arrays.copyOfRange``. The final method is shown below.


.. code-block:: java

    static void iterativeMergeSort(int[] arr) {
        for (int size = 1; size <= arr.length/2; size = size*2) {
            for (int i = 0; i < arr.length; i = i+2*size) {
                int[] foo = slice(arr, i, i+size);
                int[] bar = slice(arr, i+size, i+2*size);
                int[] temp = merge(foo,bar);  // merge the pair
                for (int j = 0; j < temp.length; j++) {
                    arr[i+j] = temp[j];  // copy the result of the merge back to input arr
                }
            }
        }
    }

The code above can be shortened a bit by omitting ``foo`` and ``bar`` and writing the slices directly into the merge method:

.. code-block:: java

    static void iterativeMergeSort(int[] arr) {
        for (int size = 1; size <= arr.length/2; size = size*2) {
            for (int i = 0; i < arr.length; i = i+2*size) {
                int[] temp = merge(slice(arr, i, i+size), slice(arr, i+size, i+2*size));  // merge the pair
                for (int j = 0; j < temp.length; j++) {
                    arr[i+j] = temp[j];  // copy the result of the merge back to input arr
                }
            }
        }
    }