Jump to content

Lowest distance

- - - - -

  • Please log in to reply
18 replies to this topic

#1
boost

boost

    Newbie

  • Members
  • Pip
  • 9 posts
Hi.

I've got the problem to solve:

We've got n numbers. The "distance" between two numbers d(A,B) is the amount of prime numbers (got after factorization) of the number A*B/GCD(A,B)^2. We have to output (for every given numbers) the lowest position of number whose distance between the actual number is lowest. Of course the position of actual number can't be the same as the position of number to output (but the values can).

For example:

We've got six numbers:

1,2,3,4,5,6

And the correct output is: 2,1,1,2,1,2.

I've already written the algorithm working in O(n^2) - which compares every numbers, counts the minimal distance and position and output it. But unfortunately that's too slow. I need someting working in O(nlogn). Could somebody help me with this problem?

Thanks and sorry for my bad English.

#2
WingedPanther

WingedPanther

    A spammer's worst nightmare

  • Moderators
  • 16,831 posts
  • Location:Upstate, South Carolina
  • Programming Language:C, C++, PL/SQL, Delphi/Object Pascal, Pascal, Transact-SQL, Others
  • Learning:Java, C#, PHP, JavaScript, Lisp, Fortran, Haskell, Others
Can you show us your current logic? There may be a simple fix to the code, or it may require a completely new algorithm.

Also, given the small numbers, why are you worried about the algorithmic complexity?
Programming is a branch of mathematics.
My CodeCall Blog | My Personal Blog

#3
boost

boost

    Newbie

  • Members
  • Pip
  • 9 posts
Generally the distance between two numbers d(A,B) is called "the number of operations to get number A from number B and by operation we understand division by some prime number or multiplication by some prime number. After some operations I realised that "d(A,B) is the amount of prime numbers (got after factorization) of the number A*B/GCD(A,B)^2".

My algorithm is:


1. we got int TAB[500001];

2. scanf n

3. for from 0 to n [i]:

      scanf TAB[i]

4. for from 0 to n [i]:

      for from 0 to n [j]:

          check if i!=j, count the d(TAB[i],TAB[j]), check if d(TAB[i],TAB[j]) is lower than prevoius minimum, if so - set: int out=j;

      printf out;


and my function to count the d(A,B) is:


long long int countPrimes(long long int x){

	long long int i, e, how;

	i = 2;

	e = (long long int)sqrt(x);

	while (i <= e) {

		while ((x % i) == 0) {

			x /= i;

			e = (int)sqrt(x);

			how++;

		}

		i++;

	}

	if(x > 1){how++;};

	return how;

}



long long int gcd(long long int a, long long int b){

	while(b){swap(a%=b,b);}

	return a;

}


long long int d(long long int a, long long int b){

	return  countPrimes(a*b/pow(gcd(a,b),2));

}


As you can see, it runs in O(n^2), but n is <=500000, so it takes hours to count for the maxtests. I'm afraid it requires to think about new algorithm.

#4
WingedPanther

WingedPanther

    A spammer's worst nightmare

  • Moderators
  • 16,831 posts
  • Location:Upstate, South Carolina
  • Programming Language:C, C++, PL/SQL, Delphi/Object Pascal, Pascal, Transact-SQL, Others
  • Learning:Java, C#, PHP, JavaScript, Lisp, Fortran, Haskell, Others
I have a few problems with your code, and your analysis.

1) sqrt(x) is expensive. You call it several times, but it isn't needed if you instead compare i*i to x. You would really need to do some tests to see which is cheaper.
2) Your outer loop runs a maximum of sqrt(x) times. The inner loop runs a maximum of how times. Since how is a maximum of log_2(x), your inner loop is contributing ln(x), not x.
Programming is a branch of mathematics.
My CodeCall Blog | My Personal Blog

#5
boost

boost

    Newbie

  • Members
  • Pip
  • 9 posts
Oops, my bad. The maxiumum size of n is 100000, so it probably needs an algorithm running in O(nlogn).

Mhm... the loops are:


for(int i=0;i<n;++i){

   for(int j=0;j<n;++j){

      // operations

   }

}


So it runs n*n times. Sure, sqrt(x) is expensive, but if I have an algorithm running in logarithmic time, it wouldn't change the time so much. The maximum time this algorithm can run for maxtests (where n=100000) is about 3-4 seconds. I don't know. Maybe I don't have to count the distance every time... I have no idea how to solve this problem without comparing every elements and counting the output. I think about some hack with GCD function, but it seems to be a bit heavier problem...

#6
WingedPanther

WingedPanther

    A spammer's worst nightmare

  • Moderators
  • 16,831 posts
  • Location:Upstate, South Carolina
  • Programming Language:C, C++, PL/SQL, Delphi/Object Pascal, Pascal, Transact-SQL, Others
  • Learning:Java, C#, PHP, JavaScript, Lisp, Fortran, Haskell, Others
Your original code has no for loops, just while loops.
Programming is a branch of mathematics.
My CodeCall Blog | My Personal Blog

#7
boost

boost

    Newbie

  • Members
  • Pip
  • 9 posts
The part of code I've presented counts the "distance" between two numbers A and B. I use it in general part of my solution, which compares all the numbers and counts distance between them using the several functions I've posted before. These runs quite fast, but the main part of the program looks like for(for()), which generates O(n^2) complexity, where to solve this problem correctly, it requires O(nlogn).

#8
WingedPanther

WingedPanther

    A spammer's worst nightmare

  • Moderators
  • 16,831 posts
  • Location:Upstate, South Carolina
  • Programming Language:C, C++, PL/SQL, Delphi/Object Pascal, Pascal, Transact-SQL, Others
  • Learning:Java, C#, PHP, JavaScript, Lisp, Fortran, Haskell, Others
At a guess, you aren't taking advantage of the symmetry in the problem. For example, if 1's closest neighbor is 2, then 2's is 1.
As you build your list, you can also take advantage of some other properties, for example: when looking at 1, 2, 6, and 27: 1 has distances of 1, 2, and 3 to the others. That immediately tells you the distance between 2 and 6 is AT LEAST (2-1)=1, and that the distance between 6 and 27 is AT LEAST (3-2)=1 (though actually 3). So, if you keep track of the distances you've calculated so far, you can quickly determine certain items are disqualified as potential nearest neighbors.

As a side note, this problem feels really familiar. Generally, these types of problems require you to do one of two things: you either need to make a space/time complexity tradeoff, or you have to use a non-intuitive algorithm as you search. For example, you have a neat algorithm to measure distance, but if you start by building a list of primes that are used by ALL your numbers, then you may want to build UP from your smaller numbers to larger numbers, keeping track of distances.
Programming is a branch of mathematics.
My CodeCall Blog | My Personal Blog

#9
lethalwire

lethalwire

    while(false){ ... }

  • Members
  • PipPipPipPipPipPipPip
  • 748 posts
  • Programming Language:Java, PHP
  • Learning:Java, PHP

WingedPanther said:

As a side note, this problem feels really familiar. Generally, these types of problems require you to do one of two things: you either need to make a space/time complexity tradeoff, or you have to use a non-intuitive algorithm as you search. For example, you have a neat algorithm to measure distance, but if you start by building a list of primes that are used by ALL your numbers, then you may want to build UP from your smaller numbers to larger numbers, keeping track of distances.

Dynamic Programming?

#10
boost

boost

    Newbie

  • Members
  • Pip
  • 9 posts
Okay, I generally understand your idea, but I'm still wondering how to place it in code. To make my mind clear... could you tell me how does this solution work for this test:

n=10

10

6

9

7

2

1

5

4

3

8

After that I'll try to force it and code.

Thanks so much for all your help ;)

#11
WingedPanther

WingedPanther

    A spammer's worst nightmare

  • Moderators
  • 16,831 posts
  • Location:Upstate, South Carolina
  • Programming Language:C, C++, PL/SQL, Delphi/Object Pascal, Pascal, Transact-SQL, Others
  • Learning:Java, C#, PHP, JavaScript, Lisp, Fortran, Haskell, Others
10 has the following distances:
6: 2, 9: 4, 7: 3, 2: 1, 1: 2, 5: 1, 4: 2, 3: 3, 8: 3 This immediately gives us boundaries for the remaining pairs of [a-b] and [a+b] where a is distance from 10 of one number, and b is distance from 10 of the other.
   6   9   7   2   1   5   4   3   8
6  - 
9 2-6  - 
7 1-5 1-7  - 
2 1-3 3-5 2-4  - 
1 0-4 2-6 1-5 1-3  - 
5 1-3 3-5 2-4 0-2 1-3  - 
4 0-4 2-6 1-5 1-3 0-4 1-3  - 
3 1-5 1-7 0-6 2-4 1-5 2-4 1-5  - 
8 1-5 1-7 0-6 2-4 1-5 2-4 1-5 0-6  - 
This gives you an order to go through comparisons with 6, and you can use the results to refine your ranges.

I'll say right now, that the big thing this does is cut down on the number of calls to countPrimes, which is an expensive function. I'd estimate that calling countPrimes really makes your program more like O(n^3 ln(n)) If you can cut way down on calls to it, that will save you a lot on processing time.
Programming is a branch of mathematics.
My CodeCall Blog | My Personal Blog

#12
boost

boost

    Newbie

  • Members
  • Pip
  • 9 posts
Okay, so what will be the output and how to get it? I can see, that you count only the value of minimal distance, but how to count minimal position of the number with minimal distance?




1 user(s) are reading this topic

0 members, 1 guests, 0 anonymous users