Welcome to the Treehouse Community

Want to collaborate on code errors? Have bugs you need feedback on? Looking for an extra set of eyes on your latest project? Get support with fellow developers, designers, and programmers of all backgrounds and skill levels here with the Treehouse Community! While you're at it, check out some resources Treehouse students have shared here.

Looking to learn something new?

Treehouse offers a seven day free trial for new students. Get access to thousands of hours of content and join thousands of Treehouse students and alumni in the community today.

Start your free trial

Computer Science

How does merge sort work?

Read the above ^^

I looked it up on Wikipedia and a couple other pages but it is still very confusing... Help appreciated.

Thanks in advance. Tagging Pasan Premaratne

I just noticed that this is actually the first Computer Science question... ever. :D

5 Answers

Robert Stefanic
Robert Stefanic
35,170 Points

Merge sort works by continuously/recursively dividing your data into smaller sub-sets, sorting those smaller sub-sets (because sorting a small sub-set is easier than sorting a large set), and then merging all of the smaller sets together.

So in steps, merge sort does the following:

  1. Divides all of your data into their single elements
  2. Merge and sort from the smaller lists until you have a single sorted list

So let's walk through an example. Let's say that you have a set of 6 numbers, and you want them in ascending order.

{AllElements} = [6, 5, 4, 3, 2, 1]

Following the first step of a merge sort, we will break these 6 numbers into two sets of 3 numbers (we'll call these sets of 3 set {A} and set {B}).

Dividing {AllElements}:

{A} = [6, 5, 4]
{B} = [3, 2, 1]

Now we divide all the elements again with {A}.

Dividing {A}:

{A1} = [6, 5]
{A2} = [4]

And again now to {A1}.

Dividing {A1}:

{A1-1} = [6]
{A1-2} = [5]

Now we can't divide {A1-1} and {A1-2} any further. They are in their most basic form. So now we can move on to sorting and merging them. While we are merging these sets, we are sorting them as well.

Sort and Merge {A1-1} and {A1-2}:

{A1'} = [5, 6]

Great! {A1'} is now sorted. So now we go back to our subset of {A2}. Can we divide {A2} any further? No, we can't. So again, now we merge sort {A1'} and {A2}.

Sort and Merge {A1'} and {A2}:

{A'} = [4, 5, 6]

{A} has now been sorted into {A'}.

We then repeat the process with set {B}, and we'll end up with {B'}, which is {B}, but sorted (just like {A} was sorted into {A'}). Once we have {A'} and {B'}, we can then merge both of these sets, and we'll end up with a sorted list using merge sort.

Pasan Premaratne
STAFF
Pasan Premaratne
Treehouse Teacher

Hey Alexander Davison,

I agree merge sort is confusing. Good news is we have a course releasing next month that covers it in sufficient detail. In general merge sort can be broken down into three steps: (1) Divide step. Here we keep splitting the array by half until we end up with several arrays that are either empty or contain no elements.

[10, 34, 7, 2, 18, 51]

# After first split we end up with two arrays that look like this. Let's focus only on the left array.
[10, 34, 7]  [2, 18, 51] 

# Here we carry out another split. 
[10] [34, 7]

# We can leave the left array alone now since it only has one element. Let's focus on the right array and split it.
[34] [7]

When we're down to two single element arrays we're going to work backwards. We'll compare the value in each array, and merge them back into one array, sorting them at the same time.

# Let's start where we left off with two single element arrays.
[34] [7]

# When we compare and merge them back, we get a sorted array. Since we're working backwards
# we're back to the point where we have a left and right array again. 

[10] [7, 34]

# Again, we'll compare the first element of each array and add them to a new array. Since 7 is lower than 10. We'll add 7 first and then 10. Since there's no value to compare with 34, we'll just go ahead and add that last. After comparing and merging, we go back one more step. 

# When we do it this way, the entire process looks like this.

     [10, 34, 7, 2, 18, 51]
  [10, 34, 7]      [2, 18, 51]
 [10] [34, 7]     [2] [18, 51]
[10]  [34] [7]   [2]   [18] [51]  # At this point we have a bunch of single element arrays and we can compare/sort/merge.
 [10] [7, 34]       [2]  [18, 51]
  [7, 10, 34]         [2, 18, 51]
      [2, 7, 10, 18, 34, 51] 

Hope this helps. There's a more detailed explanation in the upcoming course

Thanks for everybody's great answers! A follow-up question, though: How does the "merging" part work? I understand how you divide up the list/array, but how to you "merge" the split up lists/arrays?

Pasan Premaratne
Pasan Premaratne
Treehouse Teacher

Generally by creating a new array with the sorted elements and returning it. There is a in-place version of merge sort (using one array for the entire process) but its harder to implement

No, I understand that, my question is how do you actually get from...

[[2, 4], [1, 3]]

...to this:

[1, 2, 3, 4]
Pasan Premaratne
Pasan Premaratne
Treehouse Teacher

First some minor clarification. When arrays are split they are not nested inside an array. Each split basically assigns the elements to new arrays.

[2, 4], [1, 3]

In this example we would have two separate arrays, which are passed to a merge function. This is what the merge step looks like:

def merge(left, right):
   # New empty array that we add sorted elements to
    l = []

   # Variables to keep track of indexes across left and right array
    i=0
    j=0

   # As long as left and right contain elements, compare them
    while i < len(left) and j < len(right):
        # If element in left array is less than corresponding element in right, add element in left array to the empty "merged" array 
        if left[i] < right[j]:
            l.append(left[i])
            i=i+1
        else:
        # Otherwise add element from right array to "merged" array
            l.append(right[j])
            j=j+1

    # If there are elements left over in left array after previous iteration, add them all to l
    while i < len(left):
        l.append(left[i])
        i=i+1

    # If there are elements left over in right array after previous iteration, add them all to l
    while j < len(right):
        l.append(right[j])
        j=j+1

    return l

Hope this clears up any questions. There's no magic going on. Just a manual process of grab one element from left, grab one element from right, append whichever one is less than to the new array. Iterate through all the elements this way and then return the new array.

Thanks! :)

I managed to refactor your merge function, as well:

def merge(left, right):
    list = []
    i = 0
    j = 0
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            list.append(left[i])
            i += 1
        else:
            list.append(right[j])
            j += 1
    list.extend(left[i:])
    list.extend(right[j:])
    return list
Pasan Premaratne
Pasan Premaratne
Treehouse Teacher

Thanks! :) The approach we've taken with these courses is to define the algorithms in a way that can be understood by students with any sort of programming experience and language background, which means we try to avoid (at least in the videos) Python specific or syntax optimized implementations. We've tried to be as explicit as possible with each line of code so that you don't have to work too hard to translate it to your language. We have however included text based steps with more refactored versions like yours in various languages that you can use.

I do like your refactor though ?

Balazs Peak
Balazs Peak
46,160 Points

Pasan Premaratne I created a small research study about the efficiency of these algorithms. The results are not so visual though since the Python collection could ot handle that much number of records that would challenge the binary search algorithm with my computer. I summarized the results, check out the readme here: https://github.com/bradib0y/searchalgorithms

Cool :)

Troy Wilson
seal-mask
.a{fill-rule:evenodd;}techdegree seal-36
Troy Wilson
Front End Web Development Techdegree Graduate 19,559 Points

So from what I understand, the merge sort algorithm is more of a split, then sort, then merge algorithm. So we have a list [12,4,8,23,25,90]. The first step is breaking them down into their own separate arrays down the middle so, [12,4,8] and [23,25,90]. That step is repeated until they're all in single variable arrays. So [12],[4],[8],[23][25],[90]. Then you work back up comparing the values. So 12 and 4 are compared and placed back into an array together [4,12] and then this step repeats back upward. eventually, you'll end up with a sorted array [4,8,12,23,25,90].