Elliott C. Back: Internet & Technology

Sorting in Linear time

Posted in Computers & Technology, Scalability by Elliott Back on January 7th, 2006.

I recently followed an algorithms trail from Digg where I found sorting algorithm called Flashsort that claims to sort arbitrary elements in linear time. Since java source code is provided, I’ll take them up on it, and compare java.util.Arrays.sort with flash sort and see what can be learned:

Wow!  Linear!

So, overall, flashsort is linear, but with a higher time constant than Java’s built in sorting algorithms. Strange, but it probably only works on integers, and a description of the algorithm makes it sound like a bucket sort. Code after the cut:

import java.util.Arrays;

public class Flashsort {

    public static void main(String[] args) {
        System.out.println("length\tjava:\tflashsort:");
        
        for(int i = 50; i < Integer.MAX_VALUE – 10; i *= 1.5){
            int n = 50;
            int flashtotal = 0;
            int javatotal = 0;
            
            for(int j = 0; j < n; j++){
                int [] arr1 = generateRandomArray(i);
                int [] arr2 = arr1.clone();
            
                long flashtime = System.currentTimeMillis();
                partialFlashSort(arr1);
                insertionSort(arr1);
                flashtotal += System.currentTimeMillis() – flashtime;
            
                if(!checkArr(arr1)){
                    System.err.println("Array not sorted by Flash Sort!!");
                }
            
                long javatime = System.currentTimeMillis();
                Arrays.sort(arr2);
                javatotal += System.currentTimeMillis() – javatime;
            
                if(!checkArr(arr2)){
                    System.err.println("Array not sorted by Java!!");
                }
            }
            
            javatotal = javatotal / n;
            flashtotal = flashtotal / n;
            
            System.out.println(i + "\t" + javatotal + "\t" + flashtotal);
        }
    }

    /**
     * Generate the random array
     */

    private static int [] generateRandomArray(int n) {
        int [] a = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = (int) (Math.random() * 10000 * n);
        }
        return a;
    }
    
    /**
     * Check the array for correctness
     */
    private static boolean checkArr(int [] arr){
        for(int i = 0; i < arr.length – 1; i++){
            if(arr[i] > arr[i+1])
                return false;
        }
        
        return true;
    }
    
    private static void insertionSort(int [] a){
int i, j, hold;

for (i=a.length-3; i>=0; i–) {
if (a[i+1] < a[i]) {
hold = a[i];
j=i;

while (a[j+1] < hold) {
a[j] = a[j+1];
j++;
}

a[j] = hold;
}
}
}
    
    public static void printArr(int [] arr){
        for(int i = 0; i < arr.length; i++){
            System.out.print(arr[i] + " ");
        }
        System.out.print("\n");
    }

    /**
     * Partial flash sort
     */

    private static void partialFlashSort(int [] a) {
        int n = a.length;
        int m = n / 20;
        int [] l = new int[m];
        
        int i = 0, j = 0, k = 0;
        int anmin = a[0];
        int nmax = 0;

        for (i = 1; i < n; i++) {
            if (a[i] < anmin)
                anmin = a[i];
            if (a[i] > a[nmax])
                nmax = i;
        }

        if (anmin == a[nmax])
            return;

        double c1 = ((double) m – 1) / (a[nmax] – anmin);

        for (i = 0; i < n; i++) {
            k = (int) (c1 * (a[i] – anmin));
            l[k]++;
        }

        for (k = 1; k < m; k++){
            l[k] += l[k - 1];
        }

        int hold = a[nmax];
        a[nmax] = a[0];
        a[0] = hold;

        int nmove = 0;
        int flash;
        j = 0;
        k = m – 1;

        while (nmove < n – 1) {
            while (j > (l[k] – 1)) {
                j++;
                k = (int) (c1 * (a[j] – anmin));
            }

            flash = a[j];

            while (!(j == l[k])) {
                k = (int) (c1 * (flash – anmin));

                hold = a[l[k] – 1];
                a[l[k] – 1] = flash;
                flash = hold;

                l[k]–;
                nmove++;
            }
        }
    }
}

This entry was posted on Saturday, January 7th, 2006 at 2:34 pm and is tagged with . You can follow any responses to this entry through the RSS 2.0 feed. You can leave a response, or trackback.

4 Responses to “Sorting in Linear time”

  1. e2 says:

    Here is the Java Standard sort(). It does not seem to use any tricks like not bounds checking. It just improves upon quicksort a bit.(copied from java standard source)

    // Insertion sort on smallest arrays
    if (len < 7) {
    for (int i=off; ioff && x[j-1]>x[j]; j–)
    swap(x, j, j-1);
    return;
    }

    // Choose a partition element, v
    int m = off + (len >> 1); // Small arrays, middle element
    if (len > 7) {
    int l = off;
    int n = off + len – 1;
    if (len > 40) { // Big arrays, pseudomedian of 9
    int s = len/8;
    l = med3(x, l, l+s, l+2*s);
    m = med3(x, m-s, m, m+s);
    n = med3(x, n-2*s, n-s, n);
    }
    m = med3(x, l, m, n); // Mid-size, med of 3
    }
    int v = x[m];

    // Establish Invariant: v* (v)* v*
    int a = off, b = a, c = off + len – 1, d = c;
    while(true) {
    while (b <= c && x[b] = b && x[c] >= v) {
    if (x[c] == v)
    swap(x, c, d–);
    c–;
    }
    if (b > c)
    break;
    swap(x, b++, c–);
    }

    // Swap partition elements back to middle
    int s, n = off + len;
    s = Math.min(a-off, b-a ); vecswap(x, off, b-s, s);
    s = Math.min(d-c, n-d-1); vecswap(x, b, n-s, s);

    // Recursively sort non-partition-elements
    if ((s = b-a) > 1)
    sort1(x, off, s);
    if ((s = d-c) > 1)
    sort1(x, n-s, s);
    }

  2. Robert says:

    Here’s a working example where Flashsort prevails.

    Not in Java, but in Adobe Flash, ironically.

    http://blog.inspirit.ru/?p=271

  3. Elliott Back says:

    I totally agree with you!

  4. Andrasek says:

    I feel like I should mention that you’ll never be able to write an algorithm *in Java* that is faster than the in-built .sort() algorithm. The reason for this is due to the overhead from using arrays in Java.

    For example, it’s not possible for an array to have a subscript that is beyond the bounds of the array. For example, you can’t have a 100 element array and then reference cell 105. Java throws an error.

    This simple range checking functionality on every single instance of referencing an array causes a noticeable amount of overhead. Imagine putting an “if” check before every array subscript usage. I count 24 array subscripts in the PartialFlashSort algorithm alone. That’s 24 extra checks your code performs that the in-built .sort() routine can ignore.

    Also, the inbuilt .sort() algorithm uses a “tuned quicksort” which performs significantly better than the standard quicksort on several troublesome data sets. Trying to beat a quicksort is a challenge in itself. If you add to that the additional overhead that the Java language adds, there’s no chance that any Java-based sorting algorithm can compete with the internal Java .sort() routine.

Leave a Reply

Powered by WP Hashcash