And vs Modulo performance: Difference between revisions

From EggeWiki
m (New page: In school I was taught the modulo operator is slow compared with bitwise operators. This is because often modulo is implemented using divide. See http://en.wikipedia.org/wiki/Modulo_oper...)
 
mNo edit summary
 
Line 1: Line 1:
In school I was taught the modulo operator is slow compared with bitwise operators.  This is because often modulo is implemented using divide.  See http://en.wikipedia.org/wiki/Modulo_operation.  So when I write code which is going to sample a variable, I often use AND instead of modulo.  Well, it turns out, I may have fallen victim to premature optimization - at least in Java anyway.
In school I was taught the modulo operator is slow compared with bitwise operators.  This is because often modulo is implemented using divide.  See http://en.wikipedia.org/wiki/Modulo_operation.  So when I write code which is going to sample a variable, I often use AND instead of modulo.  Well, it turns out, I may have fallen victim to premature optimization - at least in Java anyway.


<geshi lang="java">
<syntaxhighlight lang="java">
public class ModuloSpeed {
public class ModuloSpeed {


Line 45: Line 45:
}
}
</geshi>
</syntaxhighlight>


Running on my desktop (Intel Pentium D CPU 2.8GHz) yielded:
Running on my desktop (Intel Pentium D CPU 2.8GHz) yielded:
<geshi>
<syntaxhighlight>
Warm-up completed (-894978952)
Warm-up completed (-894978952)
Warm-up completed (-868618467)
Warm-up completed (-868618467)
Line 55: Line 55:
1000000000 ands completed in 1484ms (-1349018880)
1000000000 ands completed in 1484ms (-1349018880)
1000000000 modulos completed in 15485ms (715752192)
1000000000 modulos completed in 15485ms (715752192)
</geshi>
</syntaxhighlight>


These tests show AND is about 10 times faster than modulo.  I next decided to run the same test on a server (Dual-Core AMD Opteron Processor 2216).   
These tests show AND is about 10 times faster than modulo.  I next decided to run the same test on a server (Dual-Core AMD Opteron Processor 2216).   


<geshi>
<syntaxhighlight>
$ java ModuloSpeed
$ java ModuloSpeed
Warm-up completed (930431069)
Warm-up completed (930431069)
Line 72: Line 72:
1000000000 ands completed in 877ms (-1349018880)
1000000000 ands completed in 877ms (-1349018880)
1000000000 modulos completed in 3244ms (715752192)
1000000000 modulos completed in 3244ms (715752192)
</geshi>
</syntaxhighlight>


On the AMD hardware, AND is only five times faster, but still fast.  However, as soon as I add another operation, like comparing the system clock, the performance is about 100x worse, and one ends up measuring the performance of the other function.  Very few loops in Java are going to run 1 billion operations per second.  So while the performance of AND is faster, unless I was writing a high performance routine, I wouldn't concern myself with the difference.
On the AMD hardware, AND is only five times faster, but still fast.  However, as soon as I add another operation, like comparing the system clock, the performance is about 100x worse, and one ends up measuring the performance of the other function.  Very few loops in Java are going to run 1 billion operations per second.  So while the performance of AND is faster, unless I was writing a high performance routine, I wouldn't concern myself with the difference.


[[Category:Java]] [[Category:Solaris]]
[[Category:Java]] [[Category:Solaris]]

Latest revision as of 18:47, 10 December 2011

In school I was taught the modulo operator is slow compared with bitwise operators. This is because often modulo is implemented using divide. See http://en.wikipedia.org/wiki/Modulo_operation. So when I write code which is going to sample a variable, I often use AND instead of modulo. Well, it turns out, I may have fallen victim to premature optimization - at least in Java anyway.

public class ModuloSpeed {


	public void testWarmup() {
		long s = System.currentTimeMillis();
		int sum = 0, count = 0;
		while(System.currentTimeMillis() - s < 10000) {
			sum += count++ & 0xff % 200;
		}

		System.out.println("Warm-up completed (" + sum + ")");
	}
	
	public void testAnd() {
		long s = System.currentTimeMillis();
		int sum = 0, count = 0;
		while(count < 1000000000) {
			sum += count++ & 0xff;
		}

		System.out.println(count + " ands completed in " + (System.currentTimeMillis() - s) + "ms (" + sum + ")");
	}
	
	public void testModulo() {
		long s = System.currentTimeMillis();
		int sum = 0, count = 0;
		while(count < 1000000000) {
			sum += count++ % 200;
		}

		System.out.println(count + " modulos completed in " + (System.currentTimeMillis() - s) + "ms (" + sum + ")");
	}
	
	public static void main(String[] args) {
		ModuloSpeed m = new ModuloSpeed();
		m.testWarmup();
		m.testAnd();
		m.testModulo();
		m.testAnd();
		m.testModulo();
	}
	
}

Running on my desktop (Intel Pentium D CPU 2.8GHz) yielded:

Warm-up completed (-894978952)
Warm-up completed (-868618467)
1000000000 ands completed in 1453ms (-1349018880)
1000000000 modulos completed in 15579ms (715752192)
1000000000 ands completed in 1484ms (-1349018880)
1000000000 modulos completed in 15485ms (715752192)

These tests show AND is about 10 times faster than modulo. I next decided to run the same test on a server (Dual-Core AMD Opteron Processor 2216).

$ java ModuloSpeed
Warm-up completed (930431069)
1000000000 ands completed in 781ms (-1349018880)
1000000000 modulos completed in 3249ms (715752192)
1000000000 ands completed in 766ms (-1349018880)
1000000000 modulos completed in 3289ms (715752192)

$ java -d64 ModuloSpeed
1000000000 ands completed in 814ms (-1349018880)
1000000000 modulos completed in 3666ms (715752192)
1000000000 ands completed in 877ms (-1349018880)
1000000000 modulos completed in 3244ms (715752192)

On the AMD hardware, AND is only five times faster, but still fast. However, as soon as I add another operation, like comparing the system clock, the performance is about 100x worse, and one ends up measuring the performance of the other function. Very few loops in Java are going to run 1 billion operations per second. So while the performance of AND is faster, unless I was writing a high performance routine, I wouldn't concern myself with the difference.