Java JMH Benchmark Tutorial
In Java, we can use the JMH (Java Microbenchmark Harness) framework to measure the performance of a function.
Tested with
- JMH 1.21
- Java 10
- Maven 3.6
- CPU i7-7700
In this tutorial, we will show you how to use JMH to measure the performance of different looping methods – for, while, iterator and foreach.
1. JMH
To use JMH, we need to declare the jmh-core and jmh-generator-annprocess (the JMH annotation processor) dependencies.
<properties>
    <jmh.version>1.21</jmh.version>
</properties>

<dependencies>
    <dependency>
        <groupId>org.openjdk.jmh</groupId>
        <artifactId>jmh-core</artifactId>
        <version>${jmh.version}</version>
    </dependency>
    <dependency>
        <groupId>org.openjdk.jmh</groupId>
        <artifactId>jmh-generator-annprocess</artifactId>
        <version>${jmh.version}</version>
    </dependency>
</dependencies>
2. JMH – Mode.AverageTime
2.1 A JMH Mode.AverageTime example that measures the performance of different looping methods over a List containing 10 million Strings.
package com.mkyong.benchmark;

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.TimeUnit;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Benchmark)
@Fork(value = 2, jvmArgs = {"-Xms2G", "-Xmx2G"})
//@Warmup(iterations = 3)
//@Measurement(iterations = 8)
public class BenchmarkLoop {

    // List size, injected by JMH before @Setup runs
    @Param({"10000000"})
    private int N;

    private List<String> DATA_FOR_TESTING;

    public static void main(String[] args) throws RunnerException {

        Options opt = new OptionsBuilder()
                .include(BenchmarkLoop.class.getSimpleName())
                .forks(1)
                .build();

        new Runner(opt).run();
    }

    // Build the test data once, outside the measured code
    @Setup
    public void setup() {
        DATA_FOR_TESTING = createData();
    }

    @Benchmark
    public void loopFor(Blackhole bh) {
        for (int i = 0; i < DATA_FOR_TESTING.size(); i++) {
            String s = DATA_FOR_TESTING.get(i); // fetch the element, then consume it, so the work matches forEach
            bh.consume(s);
        }
    }

    @Benchmark
    public void loopWhile(Blackhole bh) {
        int i = 0;
        while (i < DATA_FOR_TESTING.size()) {
            String s = DATA_FOR_TESTING.get(i);
            bh.consume(s);
            i++;
        }
    }

    @Benchmark
    public void loopForEach(Blackhole bh) {
        for (String s : DATA_FOR_TESTING) {
            bh.consume(s);
        }
    }

    @Benchmark
    public void loopIterator(Blackhole bh) {
        Iterator<String> iterator = DATA_FOR_TESTING.iterator();
        while (iterator.hasNext()) {
            String s = iterator.next();
            bh.consume(s);
        }
    }

    private List<String> createData() {
        List<String> data = new ArrayList<>();
        for (int i = 0; i < N; i++) {
            data.add("Number : " + i);
        }
        return data;
    }

}
2.2 With the above code, JMH creates 2 forks; each fork runs 5 warmup iterations (JVM warmup, results are ignored) and 5 measurement iterations (used for the calculation). For example:
# Run progress: 0.00% complete, ETA 00:13:20
# Fork: 1 of 2
# Warmup Iteration 1: 60.920 ms/op
# Warmup Iteration 2: 60.745 ms/op
# Warmup Iteration 3: 60.818 ms/op
# Warmup Iteration 4: 60.659 ms/op
# Warmup Iteration 5: 60.765 ms/op
Iteration 1: 63.579 ms/op
Iteration 2: 61.622 ms/op
Iteration 3: 61.869 ms/op
Iteration 4: 61.730 ms/op
Iteration 5: 62.207 ms/op

# Run progress: 12.50% complete, ETA 00:11:50
# Fork: 2 of 2
# Warmup Iteration 1: 60.915 ms/op
# Warmup Iteration 2: 61.527 ms/op
# Warmup Iteration 3: 62.329 ms/op
# Warmup Iteration 4: 62.729 ms/op
# Warmup Iteration 5: 61.693 ms/op
Iteration 1: 60.822 ms/op
Iteration 2: 61.220 ms/op
Iteration 3: 61.216 ms/op
Iteration 4: 60.652 ms/op
Iteration 5: 61.818 ms/op

Result "com.mkyong.benchmark.BenchmarkLoop.loopFor":
  61.673 ±(99.9%) 1.251 ms/op [Average]
  (min, avg, max) = (60.652, 61.673, 63.579), stdev = 0.828
  CI (99.9%): [60.422, 62.925] (assumes normal distribution)
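To see where the summary numbers come from, here is a minimal sketch that recomputes them from the ten measurement iterations in the log above. The class name LoopForSummary is made up for illustration. The constant 4.781 is an assumption: it is the two-sided 99.9% Student's t critical value for 9 degrees of freedom from a standard t-table, and the reported error is consistent with an interval of the form t * stdev / sqrt(n).

```java
import java.util.Arrays;

// Recomputes the loopFor summary from the ten measurement iterations (illustrative helper, not part of JMH).
final class LoopForSummary {

    // Measurement iterations of fork 1 and fork 2, copied from the log (ms/op).
    static final double[] SCORES = {
        63.579, 61.622, 61.869, 61.730, 62.207,
        60.822, 61.220, 61.216, 60.652, 61.818
    };

    // Assumption: two-sided 99.9% Student's t critical value for 9 degrees of
    // freedom, taken from a t-table rather than computed here.
    static final double T_999_DF9 = 4.781;

    static double mean(double[] xs) {
        return Arrays.stream(xs).average().orElse(Double.NaN);
    }

    // Sample standard deviation (divides by n - 1).
    static double stdev(double[] xs) {
        double m = mean(xs);
        double sumSq = Arrays.stream(xs).map(x -> (x - m) * (x - m)).sum();
        return Math.sqrt(sumSq / (xs.length - 1));
    }

    // Half-width of the confidence interval: t * s / sqrt(n).
    static double error(double[] xs, double t) {
        return t * stdev(xs) / Math.sqrt(xs.length);
    }

    public static void main(String[] args) {
        System.out.printf("mean  = %.3f ms/op%n", mean(SCORES));
        System.out.printf("stdev = %.3f%n", stdev(SCORES));
        System.out.printf("error = %.3f ms/op%n", error(SCORES, T_999_DF9));
    }
}
```

The three values should agree with the reported 61.673, 0.828 and 1.251 to within rounding of the last printed digit, since the log itself only shows three decimals per iteration.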
2.3 Warmup iterations and measurement iterations are configurable:
@Warmup(iterations = 3)      // Warmup Iteration = 3
@Measurement(iterations = 8) // Iteration = 8
2.4 We can even warm up entire forks before the real measuring forks start. With warmups = 2, JMH first runs two extra forks and discards their results.
@Fork(value = 2, jvmArgs = {"-Xms2G", "-Xmx2G"}, warmups = 2)
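These settings directly drive how long a run takes. The sketch below is a rough estimate only, and the class RunTimeEstimate is made up for illustration. It assumes 10 seconds per iteration (the value shown in the full log in section 5.2), ignores JVM startup and @Setup time, and assumes warmup forks run the same number of iterations as measured forks.

```java
// Rough wall-clock estimate for a JMH run (illustrative helper, not part of JMH).
final class RunTimeEstimate {

    // seconds = benchmarks * (forks + warmupForks) * (warmup + measurement iterations) * secondsPerIteration
    static long seconds(int benchmarks, int forks, int warmupForks,
                        int warmupIterations, int measurementIterations,
                        int secondsPerIteration) {
        long forkRuns = (long) forks + warmupForks;
        long iterations = (long) warmupIterations + measurementIterations;
        return benchmarks * forkRuns * iterations * secondsPerIteration;
    }

    // Formats a duration in seconds as hh:mm:ss, the way JMH prints its ETA.
    static String hms(long s) {
        return String.format("%02d:%02d:%02d", s / 3600, (s % 3600) / 60, s % 60);
    }

    public static void main(String[] args) {
        // 4 benchmark methods, 2 forks, 5 warmup + 5 measurement iterations
        System.out.println("defaults:           " + hms(seconds(4, 2, 0, 5, 5, 10)));
        // @Warmup(iterations = 3), @Measurement(iterations = 8)
        System.out.println("3 warmup, 8 measure: " + hms(seconds(4, 2, 0, 3, 8, 10)));
        // @Fork(value = 2, warmups = 2)
        System.out.println("2 warmup forks:      " + hms(seconds(4, 2, 2, 5, 5, 10)));
    }
}
```

For the default settings the estimate is 800 seconds, which matches the initial "ETA 00:13:20" in the log; the other two figures are estimates under the stated assumptions, not measured values.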
3. How to run JMH - #1 Maven
There are two ways to run the JMH benchmark: use Maven, or run it via a JMH Runner class directly.
3.1 Maven: package it as a JAR and run it via the org.openjdk.jmh.Main class.
<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <version>3.2.0</version>
            <executions>
                <execution>
                    <phase>package</phase>
                    <goals>
                        <goal>shade</goal>
                    </goals>
                    <configuration>
                        <finalName>benchmarks</finalName>
                        <transformers>
                            <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                <mainClass>org.openjdk.jmh.Main</mainClass>
                            </transformer>
                        </transformers>
                    </configuration>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
3.2 Run mvn package; it generates benchmarks.jar. Then start the JAR normally.
$ mvn package
$ java -jar target\benchmarks.jar BenchmarkLoop
4. How to run JMH - #2 JMH Runner
You can run the benchmark via a JMH Runner class directly.
package com.mkyong.benchmark;

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.TimeUnit;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Fork(value = 2, jvmArgs = {"-Xms2G", "-Xmx2G"})
public class BenchmarkLoop {

    private static final int N = 10_000_000;

    private static List<String> DATA_FOR_TESTING = createData();

    public static void main(String[] args) throws RunnerException {

        Options opt = new OptionsBuilder()
                .include(BenchmarkLoop.class.getSimpleName())
                .forks(1)
                .build();

        new Runner(opt).run();
    }

    // Benchmark code

}
5. Result
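5.1 Review the result: to loop a List containing 10 million String objects, the classic while loop has the lowest score (60.660 ms/op). However, the difference isn't that significant: the for loop (61.673 ± 1.251 ms/op) is within its error margin of the while loop, and the forEach and iterator loops are roughly 9 to 11% slower.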
Benchmark                        (N)  Mode  Cnt   Score   Error  Units
BenchmarkLoop.loopFor       10000000  avgt   10  61.673 ± 1.251  ms/op
BenchmarkLoop.loopForEach   10000000  avgt   10  67.582 ± 1.034  ms/op
BenchmarkLoop.loopIterator  10000000  avgt   10  66.087 ± 1.534  ms/op
BenchmarkLoop.loopWhile     10000000  avgt   10  60.660 ± 0.279  ms/op
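One quick way to read the Score and Error columns is to check whether two results' intervals (score ± error) overlap. The sketch below does that for the four rows above. The class IntervalOverlap is made up for illustration, and an overlap check is only a rough screen, not a formal significance test.

```java
// Checks whether two "score ± error" intervals overlap (illustrative helper, not part of JMH).
final class IntervalOverlap {

    // True when [a - ea, a + ea] and [b - eb, b + eb] share at least one point.
    static boolean overlaps(double a, double ea, double b, double eb) {
        return a - ea <= b + eb && b - eb <= a + ea;
    }

    public static void main(String[] args) {
        // Scores and errors copied from the result table (ms/op).
        double whileScore = 60.660, whileErr = 0.279;

        System.out.printf("while vs for      overlap: %b%n",
                overlaps(whileScore, whileErr, 61.673, 1.251));
        System.out.printf("while vs forEach  overlap: %b%n",
                overlaps(whileScore, whileErr, 67.582, 1.034));
        System.out.printf("while vs iterator overlap: %b%n",
                overlaps(whileScore, whileErr, 66.087, 1.534));
    }
}
```

With these numbers only the while and for intervals overlap, so this run does not separate those two loops, while forEach and iterator sit clearly above both.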
5.2 Full detail, good for reference.
$ java -jar target\benchmarks.jar BenchmarkLoop

# JMH version: 1.21
# VM version: JDK 10.0.1, Java HotSpot(TM) 64-Bit Server VM, 10.0.1+10
# VM invoker: C:\Program Files\Java\jre-10.0.1\bin\java.exe
# VM options: -Xms2G -Xmx2G
# Warmup: 5 iterations, 10 s each
# Measurement: 5 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: com.mkyong.benchmark.BenchmarkLoop.loopFor
# Parameters: (N = 10000000)

# Run progress: 0.00% complete, ETA 00:13:20
# Fork: 1 of 2
# Warmup Iteration 1: 60.920 ms/op
# Warmup Iteration 2: 60.745 ms/op
# Warmup Iteration 3: 60.818 ms/op
# Warmup Iteration 4: 60.659 ms/op
# Warmup Iteration 5: 60.765 ms/op
Iteration 1: 63.579 ms/op
Iteration 2: 61.622 ms/op
Iteration 3: 61.869 ms/op
Iteration 4: 61.730 ms/op
Iteration 5: 62.207 ms/op

# Run progress: 12.50% complete, ETA 00:11:50
# Fork: 2 of 2
# Warmup Iteration 1: 60.915 ms/op
# Warmup Iteration 2: 61.527 ms/op
# Warmup Iteration 3: 62.329 ms/op
# Warmup Iteration 4: 62.729 ms/op
# Warmup Iteration 5: 61.693 ms/op
Iteration 1: 60.822 ms/op
Iteration 2: 61.220 ms/op
Iteration 3: 61.216 ms/op
Iteration 4: 60.652 ms/op
Iteration 5: 61.818 ms/op

Result "com.mkyong.benchmark.BenchmarkLoop.loopFor":
  61.673 ±(99.9%) 1.251 ms/op [Average]
  (min, avg, max) = (60.652, 61.673, 63.579), stdev = 0.828
  CI (99.9%): [60.422, 62.925] (assumes normal distribution)

# JMH version: 1.21
# VM version: JDK 10.0.1, Java HotSpot(TM) 64-Bit Server VM, 10.0.1+10
# VM invoker: C:\Program Files\Java\jre-10.0.1\bin\java.exe
# VM options: -Xms2G -Xmx2G
# Warmup: 5 iterations, 10 s each
# Measurement: 5 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: com.mkyong.benchmark.BenchmarkLoop.loopForEach
# Parameters: (N = 10000000)

# Run progress: 25.00% complete, ETA 00:10:08
# Fork: 1 of 2
# Warmup Iteration 1: 67.938 ms/op
# Warmup Iteration 2: 67.921 ms/op
# Warmup Iteration 3: 68.064 ms/op
# Warmup Iteration 4: 68.172 ms/op
# Warmup Iteration 5: 68.181 ms/op
Iteration 1: 68.378 ms/op
Iteration 2: 68.069 ms/op
Iteration 3: 68.487 ms/op
Iteration 4: 68.300 ms/op
Iteration 5: 67.635 ms/op

# Run progress: 37.50% complete, ETA 00:08:27
# Fork: 2 of 2
# Warmup Iteration 1: 67.303 ms/op
# Warmup Iteration 2: 67.062 ms/op
# Warmup Iteration 3: 66.516 ms/op
# Warmup Iteration 4: 66.973 ms/op
# Warmup Iteration 5: 66.843 ms/op
Iteration 1: 67.157 ms/op
Iteration 2: 66.763 ms/op
Iteration 3: 67.237 ms/op
Iteration 4: 67.116 ms/op
Iteration 5: 66.679 ms/op

Result "com.mkyong.benchmark.BenchmarkLoop.loopForEach":
  67.582 ±(99.9%) 1.034 ms/op [Average]
  (min, avg, max) = (66.679, 67.582, 68.487), stdev = 0.684
  CI (99.9%): [66.548, 68.616] (assumes normal distribution)

# JMH version: 1.21
# VM version: JDK 10.0.1, Java HotSpot(TM) 64-Bit Server VM, 10.0.1+10
# VM invoker: C:\Program Files\Java\jre-10.0.1\bin\java.exe
# VM options: -Xms2G -Xmx2G
# Warmup: 5 iterations, 10 s each
# Measurement: 5 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: com.mkyong.benchmark.BenchmarkLoop.loopIterator
# Parameters: (N = 10000000)

# Run progress: 50.00% complete, ETA 00:06:46
# Fork: 1 of 2
# Warmup Iteration 1: 67.336 ms/op
# Warmup Iteration 2: 73.008 ms/op
# Warmup Iteration 3: 66.646 ms/op
# Warmup Iteration 4: 70.157 ms/op
# Warmup Iteration 5: 68.373 ms/op
Iteration 1: 66.385 ms/op
Iteration 2: 66.309 ms/op
Iteration 3: 66.474 ms/op
Iteration 4: 68.529 ms/op
Iteration 5: 66.447 ms/op

# Run progress: 62.50% complete, ETA 00:05:04
# Fork: 2 of 2
# Warmup Iteration 1: 65.499 ms/op
# Warmup Iteration 2: 65.540 ms/op
# Warmup Iteration 3: 67.328 ms/op
# Warmup Iteration 4: 65.926 ms/op
# Warmup Iteration 5: 65.790 ms/op
Iteration 1: 65.350 ms/op
Iteration 2: 65.634 ms/op
Iteration 3: 65.353 ms/op
Iteration 4: 65.164 ms/op
Iteration 5: 65.225 ms/op

Result "com.mkyong.benchmark.BenchmarkLoop.loopIterator":
  66.087 ±(99.9%) 1.534 ms/op [Average]
  (min, avg, max) = (65.164, 66.087, 68.529), stdev = 1.015
  CI (99.9%): [64.553, 67.621] (assumes normal distribution)

# JMH version: 1.21
# VM version: JDK 10.0.1, Java HotSpot(TM) 64-Bit Server VM, 10.0.1+10
# VM invoker: C:\Program Files\Java\jre-10.0.1\bin\java.exe
# VM options: -Xms2G -Xmx2G
# Warmup: 5 iterations, 10 s each
# Measurement: 5 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: com.mkyong.benchmark.BenchmarkLoop.loopWhile
# Parameters: (N = 10000000)

# Run progress: 75.00% complete, ETA 00:03:22
# Fork: 1 of 2
# Warmup Iteration 1: 60.290 ms/op
# Warmup Iteration 2: 60.161 ms/op
# Warmup Iteration 3: 60.245 ms/op
# Warmup Iteration 4: 60.613 ms/op
# Warmup Iteration 5: 60.697 ms/op
Iteration 1: 60.842 ms/op
Iteration 2: 61.062 ms/op
Iteration 3: 60.417 ms/op
Iteration 4: 60.650 ms/op
Iteration 5: 60.514 ms/op

# Run progress: 87.50% complete, ETA 00:01:41
# Fork: 2 of 2
# Warmup Iteration 1: 60.845 ms/op
# Warmup Iteration 2: 60.927 ms/op
# Warmup Iteration 3: 60.832 ms/op
# Warmup Iteration 4: 60.817 ms/op
# Warmup Iteration 5: 61.078 ms/op
Iteration 1: 60.612 ms/op
Iteration 2: 60.516 ms/op
Iteration 3: 60.647 ms/op
Iteration 4: 60.607 ms/op
Iteration 5: 60.733 ms/op

Result "com.mkyong.benchmark.BenchmarkLoop.loopWhile":
  60.660 ±(99.9%) 0.279 ms/op [Average]
  (min, avg, max) = (60.417, 60.660, 61.062), stdev = 0.184
  CI (99.9%): [60.381, 60.939] (assumes normal distribution)

# Run complete. Total time: 00:13:31

REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.

Benchmark                        (N)  Mode  Cnt   Score   Error  Units
BenchmarkLoop.loopFor       10000000  avgt   10  61.673 ± 1.251  ms/op
BenchmarkLoop.loopForEach   10000000  avgt   10  67.582 ± 1.034  ms/op
BenchmarkLoop.loopIterator  10000000  avgt   10  66.087 ± 1.534  ms/op
BenchmarkLoop.loopWhile     10000000  avgt   10  60.660 ± 0.279  ms/op
Hope this tutorial gives you a quick start guide to benchmarking with JMH. For more advanced JMH examples, please visit the official JMH samples.
How about Forward loop vs Reverse loop? Which one is faster? Visit this JMH test
References
- OpenJDK: jmh
- JMH examples
- Maven – How to create a Java project
- Java – While vs For vs Iterator Performance Test