I got a very strange result today. I was looking for fun at C# and I wrote a small simulation with it, a Wright-Fisher simulation to be exact. I took my code from C, made some small modifications to make it work in C#, and I got the following result;
C = With MinGW = ~52 sec.
C# 3.5 SP1 = ~36 sec.
Why is C# faster ?
I use the Mersenne Twister to generate the random number. For the C implementation I use the new Fast Mersenne Twister, which make extensive use of the preprocessor to generate efficient code. With C# I use an implementation written in 2003. So the Mersenne Twister that I use with C# should be slower. In fact the simulation in C# is even faster with their standard RNG, but I use the Mersenne Twister to get a fair comparison.
I'm surprised by this result, I thought C# would be as slow as Java for Monte Carlo simulations, perhaps a little better as the language can be optimized for windows, but I never expected C# to beat C. Of course it's just a single program but still I would really like to understand how a relatively high level language such as C# can be faster than C, which is directly compiled to machine code.
Here's the code for the two simulations;
Code:using System; using CenterSpace.Free; // Mersenne Twister (from c-sharpcorner dot com) namespace WrightFisher { class WF { static void Main(string[] args) { MersenneTwister generator = new MersenneTwister(); int x = 1000; // Number of simulations int t = 1000; // Number of generations/simulations int Ne = 1000; // Effective population size int i0 = 500; // Initial number of allele 'A1' int i; // Used for trials int i1; // Alleles at the end of a generation int[] f = new int[x]; // Vector for the number of allele 'A1' at the end of each simulation // Features of A1; double s = 0.00; // Selection coefficient for 'a' // Store the probabilities; double[] p = new double[Ne + 1]; for (int j = 0; j < Ne + 1; j++) { p[j] = ((1.0 + s) * j) / ((1.0 + s) * j + Ne - j); } // The simulation; for (int sim = 0; sim < x; sim++) { i1 = i0; for (int gen = 1; gen < t + 1; gen++) { i = 0; for (int allele = 0; allele < Ne; allele++) if (generator.NextDouble() < p[i1]) ++i; // NextDouble() uses genrand_real2() i1 = i; } f[sim] = i1; } int ext = 0, fix = 0, pol = 0; for (int j = 0; j < x; j++) { if (f[j] == 0) ++ext; else if (f[j] == Ne) ++fix; else ++pol; } Console.WriteLine("Fixation = " + (double)fix / x); Console.WriteLine("Polymorphism = " + (double)pol / x); Console.WriteLine("Extinction = " + (double)ext / x); } } }Code:#include <stdio.h> #include <stdlib.h> #include <time.h> // For the initialization of the pseudorandom number generator #define MEXP 19937 #include "RNG/SFMT.c" int main(void) { init_gen_rand((unsigned)time(NULL)); // Initialization of the RNG int x = 1000; // Number of simulations int t = 1000; // Number of generations/simulations int Ne = 1000; // Effective population size int i0 = 500; // Initial number of allele 'A1' int i; // Used for trials int i1; // Alleles at the end of a generation int f[x]; // Vector for the number of allele 'A1' at the end of each simulation // Features of A1; double s = 0.0; // Selection coefficient for 'a' double p[Ne+1]; // Store the probability int j; for(j = 0; j < Ne+1; j++) p[j] = ((1.+s)*j)/((1.+s)*j+Ne-j); int sim, gen, allele; for(sim = 0; sim < x; sim++) { i1 = i0; for(gen = 1; gen < t+1; gen++) { i = 0; for(allele = 0; allele < Ne; allele++) { if(genrand_real2() < p[i1]) ++i; // i += (genrand_real2() < p[i1]) // Faster ??? } i1 = i; } f[sim] = i1; } int ext = 0, fix = 0, pol = 0; for(j = 0; j < x; j++) { if(f[j]==0) ++ext; else if(f[j]==Ne) ++fix; else ++pol; } printf("Fixation = %.4f\n", (double)fix/x); printf("Polymorphism = %.4f\n", (double)pol/x); printf("Extinction = %.4f\n", (double)ext/x); return 0; }
Without knowing your compiler options, it's hard to comment. MinGW is, in general, not the best compiler for highly efficient code, but the optimization level can make a huge difference. If you chop out sections of the code and compare the timing it may give you some insight as to where the issue came from.
I made some other tests and C is about as fast as C#... the problem was the Mersenne Twister.
Still I wonder how C# can compete in terms of speed with languages directly compiled to machine code such as C and C++. I saw some tests where C# was noticeably faster than C/C++.
Like all things, it depends on the code and whether it's a generic algorithm or optimized to function well for the language.
in all tests i have done c# is generally the speed of unoptimized c++, unfortuantly. however in most applications this simply doesn't matter.
Posted via CodeCall Mobile
There are currently 1 users browsing this thread. (0 members and 1 guests)
Bookmarks