Parallelize odd-even sort algorithm in python

I developed the following serial code to perform Odd-even sort, which first sorts The Odd/Even indexes and then odd/even.

import time

def oddevenSort(x):
    sorted = False
    while not sorted:
        sorted = True

        #ordena indices par/impar
        for i in range(0, len(x)-1, 2):
            if x[i] > x[i+1]:
                x[i], x[i+1] = x[i+1], x[i]
                sorted = False

        #ordena indices impar/par
        for i in range(1, len(x)-1, 2):
            if x[i] > x[i+1]:
                x[i], x[i+1] = x[i+1], x[i]
                sorted = False
    print(x)
    return x


tic = time.time()
oddevenSort([4,3,2,1,3,6,9,1,2,3,54,76,98,12,333,331,33,1,7,9])
toc = time.time()

print("tempo gasto para processar = " + str(toc-tic)+"s")

Now I need to create for this same sort algorithm a parallel code in python in order to be able to calculate the speedup (time gain between running serial code VS parallel code).

The first approach I was having was to create a thread for even / odd ordering and another one for odd/even ordering, but I can't figure out what the behavior of this code is going to be like. In my view this algorithm should run the whole Odd / Even part and then in sequence the odd/even part. Thus, I would have to synchronize thread 1 and thread 2, preventing the code from occurring in parallel.

Example thread 1 (Even / Odd):

for i in range(0, len(x)-1, 2):
    if x[i] > x[i+1]:                  
        x[i], x[i+1] = x[i+1], x[i]    
        sorted = False   

Example thread 2 (odd/even):

for i in range(1, len(x)-1, 2):
    if x[i] > x[i+1]:
        x[i], x[i+1] = x[i+1], x[i]
        sorted = False

A second approach would be to parallelize the even/odd sort (type thread 1 takes the first half of the numbers and sorts while thread 2 takes the second half and sorts). Same thing for odd/even part (thread 3 sorts half and thread 4 sorts the other half).

Example: Full vector: (9,8,7,6,5,4,3,2,1,0)
Thread 1 sort: 9,8,7,6,5
Thread 2 sort: 4,3,2,1,0
Expected result thread 1: 5,6,7,8,9
Expected result thread 2: 0,1,2,3,4
Expected result of final sorting: 0,1,2,3,4,5,6,7,8,9

Approach 3? Suggestions? If there is no other more efficient way, I would like help to implement one of the two suggestions.

Author: lucasgvarela, 2017-06-20

1 answers

General idea for sorting

To make the most of parallelism, try to use as little as possible the communication between parallel tasks, so there will be no need for synchronizations during the computations made and leaving specific moments to synchronize. This computing model is called BSP .

So that there is no competition in the process, we can use a scheme similar to that of bitonic merge sort, both processing and preview.

The scheme of the bitonic merge sort is based on sort networks. A sorting network consists of horizontal wires representing the positions in the array, and also of vertical Bridges connecting wires. Each bridge connects only and exclusively two wires; on each bridge, the elements of the two connected wires are compared and, if necessary, exchanged. It also has another interesting property in sorting Networks: time passes from left to right, and one wire can only be on one bridge at a time.

See Here a sorting network:

sorting network for 8 wires

Well, in our case, for even/odd ordering, we have a repeat of the following ordering network:

---+--------------
   |
---+--+-----------
      |
---+--+-----------
   |
---+--+-----------
      |
---+--+-----------
   |
---+--+-----------
      |
---+--+-----------
   |
---+--------------

In this example network, we need 4 parallel processes in the even part and 3 parallel processes in the odd part.

On the part of sorted = False, note that any process you write in this variable it will make written of the same value. So if two simultaneous processes need to write the same value, the end result will be the same, regardless of the running condition. Therefore, it is not necessary in this case to do write synchronization on the variable sorted. The next part that requires the synchronization of the parts in parallel is just to know if it will be necessary to repeat the processing of the ordering network.

In general, the ordering would be more or less like this:

  1. move in parallel the swap function if necessary all index elements pair with their immediate successor
  2. wait for the parallelism of the previous step to end...
  3. move in parallel the swap function if necessary all odd index elements with their immediate successor
  4. wait for the parallelism of the previous step to end...
  5. make sure you need to do the process again

Python implementation

O @Anderson Carlos Woss implemented this solution in Python:

from threading import Thread

c_sorted = False
swaps = 0

# Função que inverte os valores se necessário:
def swap (sequence, i, j):
    global c_sorted, swaps
    if sequence[i] > sequence[j]:
        sequence[i], sequence[j] = sequence[j], sequence[i]
        c_sorted = False
        swaps += 1

# Sequência a ser ordenada:
sequence = [9,8,7,6,5,4,3,2,1,0]

# Condição se a lista está ordenada:
while not c_sorted:
    c_sorted = True
    # Lista de threads para índices pares:
    evens = []

    # Swap entre todos os valores de índice par com seu sucessor imediato:
    for i in range(0, len(sequence), 2):
        t = Thread(target=swap, args=(sequence, i, i+1))
        t.run()
        evens.append(t)

    # Aguarda todas as threads anteriores:
    (t.join() for t in evens)

    # Lista de threads para índices ímpares:
    odds = []

    # Swap entre todos os valores de índice ímpar com seu sucessor imediato:
    for i in range(1, len(sequence)-1, 2):
        t = Thread(target=swap, args=(sequence, i, i+1))
        t.run()
        odds.append(t)

    # Aguarda todas as threads anteriores:
    (t.join() for t in odds)

print(sequence)
print("swaps: %d" % swaps)

See working on repl.it

 8
Author: Jefferson Quesado, 2018-03-05 23:51:16