Date created: Sunday, November 20, 2016 10:59:51 PM. Last modified: Tuesday, January 16, 2024 3:31:44 PM

## Multi-Process Parallel Sorting Comparison

A comparison of the same sorting algorithms run firsty using a single Python process and then split into multiple threads using the multiprocessing library in Python for parallelisation. The integer argument to each script is the length of a list to generate of random numbers which is used for sorting.

This script (single_process_sorts.py) shows the running time for various sorting algorithms run serially in a single Python thread:

```bensley@ubuntu-laptop:~/Python/threading\$ python single_process_sorts.py 5000
Bubble Sort:
Duration:  1.64284515381Selection Sort:Duration: 0.722052097321

Insertion Sort:
Duration:  1.59017801285
```

This script (multi_process_bubble_sort_worse.py) runs a bubble sorting algorithm over multiple processes however it actually takes longer than the standard single threaded version above:

```bensley@ubuntu-laptop:~/Python/threading\$ python multi_process_bubble_sort.py 5000
Bubble Sort:
Duration:  2.35310292244
```

This script (multi_process_insertion_sort_worse.py) runs an insertion sorting algorithm over multiple processes however it also takes longer that the standard single threaded version:

```bensley@ubuntu-laptop:~/Python/threading\$ python multi_process_insertion_sort.py 5000
Insertion Sort:
Duration:  1.67110109329
```

Taking a closer look at the Bubble sort algorithm, bubble sort has an upper bound performance of O(n^2). When the list is already sorted (the best case scenario) the complexity of bubble sort is only O(n) (at the least the whole input list must be walked to check it is in the required order already). The Bubble sort works by "bubbling" up the largest/smallest number to the top of the list on the first iteration, then the next largest/smallest to the top-1 list index on the second iteration, and so on.

In this first bubble sort multi-processing example the list is split into C chunks where C is the number of CPU cores. On the example machine there are four cores so C == 4.  This means the four quarters of random test data should be sorted in about a quarter of the time the single threaded version sorts the entire list:

```bensley@ubuntu-laptop:~/Python/threading\$ python multi_process_bubble_sort_worse.py 5000
Bubble Sort:
Duration:  0.423403024673
# Break after initial chunk sort ^