Jump to content

Sorting efficiency

- - - - -

This topic has been archived. This means that you cannot reply to this topic.
9 replies to this topic

#1
chili5

chili5

    Writes binary right handed and hex left handed

  • Members
  • PipPipPipPipPipPipPipPipPip
  • 7,247 posts
I'm working on this problem that is basically this:

Quote

Given a list of numbers, you are to sort them into descending order.


Input
t – the number of numbers in list, then t lines follow [t <= 10^6].
Each line contains one integer: N [0 <= N <= 10^6]

Output

Output given numbers in non decreasing order.
Example

Input:

5
5
3
6
7
1

Output:

1
3
5
6
7

The major problem I am running into is the built-in functions in Java and C++ are taking way too long. I figure if the built-in functions in the JDK and STL are not able to handle it, then if I write my own sorting algorithm that won't work also. Then again maybe quick sort and merge sort are not the right sorts, and I should be using my own quicksort with insertion sort of something to speed it up?

I'm obviously missing something because these codes are not working fast enough:

class Main {
    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) {
        Scanner sin = new Scanner(System.in);
        ArrayList<Integer> aloNums = new ArrayList<Integer>();

        int nNums = sin.nextInt();

        for (int i=0;i<nNums;i++) {
            aloNums.add(sin.nextInt());
        }

        Collections.sort(aloNums);

        for (int i=0;i<aloNums.size();i++) {
            System.out.println(aloNums.get(i));
        }
    }
}

#include <stdlib.h>
#include <iostream>
#include <algorithm>
#include <vector>

using namespace std;

/*
 * 
 */
int main(int argc, char** argv) {
    vector<int> v;
    int nNum;
    int n;

    cin >> nNum;

    for (int i=0;i<nNum;i++) {
        cin>>n;
        v.push_back(n);
    }

    sort(v.begin(),v.end());

    for (int i=0;i<nNum;i++) {
        cout << v[i] << endl;
    }
    

    return 0;
}

It is going to have to use some kind of list, or vector as an array is not big enough, I tried. Thoughts?

#2
WingedPanther

WingedPanther

    A spammer's worst nightmare

  • Moderators
  • 16,831 posts
Balanced Binary tree. Insert the items (non-repeating) into the tree, then traverse to produce the output.
Programming is a branch of mathematics.
My CodeCall Blog | My Personal Blog

#3
BlaineSch

BlaineSch

    Writes binary right handed and hex left handed

  • Members
  • PipPipPipPipPipPipPipPipPip
  • 2,448 posts
Sorting algorithm - Wikipedia, the free encyclopedia

Thats my recommendation!

Edit: just so you know that has an anchor link lol..

#4
chili5

chili5

    Writes binary right handed and hex left handed

  • Members
  • PipPipPipPipPipPipPipPipPip
  • 7,247 posts
I came up with this but it is still taking to long.

import java.util.*;

/**
 *
 * @author James
 */
class Main {

    /**
     * @param args the command line arguments
     */
    public static void main(String[] args)  {
        Scanner sin = new Scanner(System.in);

        tree t = new tree();
        Node root;
        int x; // the number of cases

        x = sin.nextInt();

        for (int i=0;i<x;i++) {
            t.insert(sin.nextInt());
        }

        root = t.root;

        t.inorder(root);

        sin.close();
    }

}

class Node {
    int i; // the data at this number
    Node left; // a reference to the left child node.
    Node right; // a reference to the right child node.

    void displayNode() {
        System.out.println(i);
    }
}

class tree {
    /*
     * We do not need fields here for other nodes because they are accesssed
     * by traversing from root.
     */
    Node root; // the root node of the tree

    public void insert(int id) {
        Node newNode = new Node();
        Node current;
        Node parent;

        newNode.i = id;

        if (root == null) {
            root = newNode; // no root node
        } else {
            current = root;

            while (true) {
                parent = current;
                if (id < current.i) {
                    // go left
                    current = current.left;
                    if (current == null) {
                        parent.left = newNode;
                        return;
                    }
                } else {
                    // go right
                    current = current.right;
                    if (current == null) {
                        parent.right = newNode;
                        return;
                    }
                }
            }
        }
    }

    public void inorder(Node start) {
        if (start == null) {
            return;
        }
        inorder(start.left);
        start.displayNode();
        inorder(start.right);
    }    
}

This does work but it is taking to long. I think my problem is not in the displaying but adding the values to the structure. Obviously a vector was the wrong choice because I know it is slow with insertion.

#5
WingedPanther

WingedPanther

    A spammer's worst nightmare

  • Moderators
  • 16,831 posts
Given the amount of data you have, it may be slow, regardless. Perhaps a hash table?
Programming is a branch of mathematics.
My CodeCall Blog | My Personal Blog

#6
chili5

chili5

    Writes binary right handed and hex left handed

  • Members
  • PipPipPipPipPipPipPipPipPip
  • 7,247 posts
I'm not sure how that would help though.

#7
WingedPanther

WingedPanther

    A spammer's worst nightmare

  • Moderators
  • 16,831 posts
Have you done anything to see exactly where the speed is taking a hit? It could be file access.
Programming is a branch of mathematics.
My CodeCall Blog | My Personal Blog

#8
PythonPower

PythonPower

    Programming Professional

  • Members
  • PipPipPipPipPip
  • 230 posts

WingedPanther said:

Balanced Binary tree. Insert the items (non-repeating) into the tree, then traverse to produce the output.

That algorithm is O(n log n).

BlaineSch said:

Thats my recommendation!

That algorithm in O(n2).

Why not use a linear sorting algorithm?

Radix sort and counting sort are both extremely efficient for this problem. For example, this is a cheap implementation of counting sort:

package sort;

import java.util.Scanner;

public class Main {

    public static void main(String[] args)
	{
        Scanner in = new Scanner(System.in);

		int n = in.nextInt();
		int[] count = new int[1000001];

		for(int i=0; i<n; i++)
		{
			count[in.nextInt()]++;
		}

		for(int i=0; i<=1000000; i++)
		{
			for(int j=0; j<count[i]; j++)
			{
				System.out.println(i);
			}
		}
    }

}


#9
chili5

chili5

    Writes binary right handed and hex left handed

  • Members
  • PipPipPipPipPipPipPipPipPip
  • 7,247 posts
Well Winged was right, the sorting is not the problem, reading the input is the problem. Using the scanner on my computer, reading the input takes 10 seconds.

So I guess there must be a more optimal way of reading the data?

#10
zoranh

zoranh

    Programming Professional

  • Members
  • PipPipPipPipPip
  • 207 posts
According to problem statement, values in the array are limited to 1,000,001 different values. This is quite small number these days, so bucket sort can be applied very easy.

In the most extreme version of bucket sort you don't sort values from the input, but rather count how many times each of the values appears in the input array. Then you just output each of the values as many times as it has appeared on input.

This is the C# implementation which sorts randomly generated million numbers in a blink of an eye. Though needs dozen of seconds to put them in file then.

using System;

namespace BucketSort
{
    class Program
    {
        static void Main(string[] args)
        {

            int maxValue = 1000000;
            int inputValuesCount = 1000000;
            int[] sortedCount = new int[maxValue + 1];
            int[] input = new int[inputValuesCount];

            Random rnd = new Random();
            for (int i = 0; i < inputValuesCount; i++)
                input[i] = rnd.Next(maxValue + 1);

            // Now comes the bucket sort
            for (int i = 0; i < input.Length; i++)
                sortedCount[input[i]]++;

            // Now comes the printing out
            for (int i = 0; i < sortedCount.Length; i++)
                for (int j = 0; j < sortedCount[i]; j++)
                    Console.WriteLine(i);

        }
    }
}
At the end, you should know that any comparison-based sorting algorithm (insert sort, quick sort, etc.) is bound to at least O(N logN) complexity, which is easily proved based on decision trees and Stirling's approximation of logarithm function (you can easily find this proof on Google I suppose). So technically, this conclusion stands firm for balanced trees and any other implementation which is based on comparing values.

So to run under O(N logN) you must do something that is not comparison based. Bucket sort is one of solutions.

And one curiosity for the end. Bucket sort and Radix sort were devised many years before personal computers. These algorithms had to be devised because processing machines were quite rudimentary in their structure and had to perform on extremely efficient algorithms in order to do anything in their domain.