class: center, middle .vcenter[ # Microbenchmarking Mysteries Explored through JMH ] --- .left-column[ ## DIY ] .right-column[ ``` void harness(int iterations) { long start = System.currentTimeMillis(); for (int i = 0; i < iterations; i++) { benchmark(); } long end = System.currentTimeMillis(); System.out.println(end - start); } void benchmark() { * new ArrayList<>(); } ``` ] -- .right-column[.monospace[ .fixed-width-250[iterations=1,000] total time=1ms ]] -- .right-column[.monospace[ .fixed-width-250[iterations=10,000] total time=3ms ]] -- .right-column[.monospace[ .fixed-width-250[iterations=100,000] total time=7ms ]] -- .right-column[.monospace[ .fixed-width-250[iterations=100,000,000] total time= ]] --- .left-column[ ## DIY ] .right-column[ ``` void harness(int iterations) { long start = System.currentTimeMillis(); for (int i = 0; i < iterations; i++) { benchmark(); } long end = System.currentTimeMillis(); System.out.println(end - start); } void benchmark() { * new ArrayList<>(); } ``` ] .right-column[.monospace[ .fixed-width-250[iterations=1,000] total time=1ms ]] .right-column[.monospace[ .fixed-width-250[iterations=10,000] total time=3ms ]] .right-column[.monospace[ .fixed-width-250[iterations=100,000] total time=7ms ]] .right-column[.monospace[ .fixed-width-250[iterations=100,000,000] total time=7ms ]] --- .left-column[ ## Does this change anything? ] .right-column[ ``` void harness(int iterations) { long start = System.currentTimeMillis(); for (int i = 0; i < iterations; i++) { benchmark(); } long end = System.currentTimeMillis(); System.out.println(end - start); } void benchmark() { * new HashMap<>(); } ``` ] -- .right-column[.monospace[ .fixed-width-250[iterations=1,000] total time=1ms ]] -- .right-column[.monospace[ .fixed-width-250[iterations=10,000] total time=9ms ]] -- .right-column[.monospace[ .fixed-width-250[iterations=100,000] total time=17ms ]] -- .right-column[.monospace[ .fixed-width-250[iterations=100,000,000] total time= ]] --- .left-column[ ## Does this change anything? ] .right-column[ ``` void harness(int iterations) { long start = System.currentTimeMillis(); for (int i = 0; i < iterations; i++) { benchmark(); } long end = System.currentTimeMillis(); System.out.println(end - start); } void benchmark() { * new HashMap<>(); } ``` ] .right-column[.monospace[ .fixed-width-250[iterations=1,000] total time=1ms ]] .right-column[.monospace[ .fixed-width-250[iterations=10,000] total time=9ms ]] .right-column[.monospace[ .fixed-width-250[iterations=100,000] total time=17ms ]] .right-column[.monospace[ .fixed-width-250[iterations=100,000,000] total time=4251ms ]] --- .left-column[ ## JMH to the rescue? ] .right-column[ ``` public class Example3A { @Benchmark public void newArrayList() { * new ArrayList<>(); } } ``` ] -- .right-column-incremental[.monospace[.section-top-padding[.smaller[ java -jar target/benchmarks.jar Example3A ]]]] -- .right-column-incremental[ ```bash Benchmark Score Units Example3A.newArrayList 0.242 ns/op ``` ] -- .right-column-incremental[.monospace[.section-top-padding[.smaller[ java -jar target/benchmarks.jar Example3A .inline-highlighted[-prof gc] ]]]] -- .right-column-incremental[ ```bash Benchmark Score Units * Example3A.newArrayList:gc.alloc.rate.norm B/op ``` ] --- .left-column[ ## JMH to the rescue? ] .right-column[ ``` public class Example3A { @Benchmark public void newArrayList() { * new ArrayList<>(); } } ``` ] .right-column-incremental[.monospace[.section-top-padding[.smaller[ java -jar target/benchmarks.jar Example3A ]]]] .right-column-incremental[ ```bash Benchmark Score Units Example3A.newArrayList 0.242 ns/op ``` ] .right-column-incremental[.monospace[.section-top-padding[.smaller[ java -jar target/benchmarks.jar Example3A .inline-highlighted[-prof gc] ]]]] .right-column-incremental[ ```bash Benchmark Score Units * Example3A.newArrayList:gc.alloc.rate.norm 0.000 B/op ``` ] --- .left-column[ ## Rule #1: Prevent dead code elimination ] -- .right-column[ ``` public class Example3B { @Benchmark public Object newArrayList() { * return new ArrayList<>(); } } ``` ] -- .right-column-incremental[.monospace[.section-top-padding[.smaller[ java -jar target/benchmarks.jar Example3B ]]]] -- .right-column-incremental[ ```bash Benchmark Score Units Example3A.newArrayList 0.242 ns/op * Example3B.newArrayList 4.741 ns/op ``` ] -- .right-column-incremental[.monospace[.section-top-padding[.smaller[ java -jar target/benchmarks.jar Example3B -prof gc ]]]] -- .right-column-incremental[ ```bash Benchmark Score Units Example3A.newArrayList:gc.alloc.rate.norm 0.000 B/op * Example3B.newArrayList:gc.alloc.rate.norm B/op ``` ] --- .left-column[ ## Rule #1: Prevent dead code elimination ] .right-column[ ``` public class Example3B { @Benchmark public Object newArrayList() { * return new ArrayList<>(); } } ``` ] .right-column-incremental[.monospace[.section-top-padding[.smaller[ java -jar target/benchmarks.jar Example3B ]]]] .right-column-incremental[ ```bash Benchmark Score Units Example3A.newArrayList 0.242 ns/op * Example3B.newArrayList 4.741 ns/op ``` ] .right-column-incremental[.monospace[.section-top-padding[.smaller[ java -jar target/benchmarks.jar Example3B -prof gc ]]]] .right-column-incremental[ ```bash Benchmark Score Units Example3A.newArrayList:gc.alloc.rate.norm 0.000 B/op * Example3B.newArrayList:gc.alloc.rate.norm 24.000 B/op ``` ] --- .left-column[ ## What about this? ] .right-column[ ``` public class Example4A { @Benchmark public double logarithm() { * return Math.log(31); } } ``` ] -- .right-column-incremental[.monospace[.section-top-padding[.smaller[ java -jar target/benchmarks.jar Example4A ]]]] -- .right-column-incremental[ ```bash Benchmark Score Units Example4A.logarithm ns/op ``` ] --- .left-column[ ## What about this? ] .right-column[ ``` public class Example4A { @Benchmark public double logarithm() { * return Math.log(31); } } ``` ] .right-column-incremental[.monospace[.section-top-padding[.smaller[ java -jar target/benchmarks.jar Example4A ]]]] .right-column-incremental[ ```bash Benchmark Score Units Example4A.logarithm 2.300 ns/op ``` ] --- .left-column[ ## Rule #2: Prevent constant folding ] -- .right-column[ ``` * @State(Scope.Thread) public class Example4B { * private int x = 31; @Benchmark public double logarithm() { return Math.log(x); } } ``` ] -- .right-column-incremental[ ```bash Benchmark Score Units Example4A.logarithm 2.300 ns/op * Example4B.logarithm ns/op ``` ] --- .left-column[ ## Rule #2: Prevent constant folding ] .right-column[ ``` * @State(Scope.Thread) public class Example4B { * private int x = 31; @Benchmark public double logarithm() { return Math.log(x); } } ``` ] .right-column-incremental[ ```bash Benchmark Score Units Example4A.logarithm 2.300 ns/op * Example4B.logarithm 21.695 ns/op ``` ] --- .left-column[ ## Don't listen to your IDE ] .right-column[ ``` @State(Scope.Thread) public class Example4C { * private final int x = 31; @Benchmark public double logarithm() { return Math.log(x); } } ``` ] -- .right-column-incremental[ ```bash Benchmark Score Units Example4A.logarithm 2.300 ns/op Example4B.logarithm 21.695 ns/op * Example4C.logarithm 2.308 ns/op ``` ] --- .left-column[ ## Or, trick your IDE ] .right-column[ ``` @State(Scope.Thread) public class Example4D { private int x; * @Setup * public void setup() { * x = 31; * } @Benchmark public double logarithm() { return Math.log(x); } } ``` ] -- .right-column-incremental[ ```bash Benchmark Score Units Example4A.logarithm 2.300 ns/op Example4B.logarithm 21.695 ns/op Example4C.logarithm 2.308 ns/op * Example4D.logarithm 22.654 ns/op ``` ] --- class: center, middle .vcenter[ ## JMH Golden Rules ### #1: Always return output to prevent dead code elimination ### #2: Always read input from state objects to prevent constant folding ] --- class: center, middle .vcenter[ ## Now for fun and profit! ] --- .left-column[ ## What's the overhead of a method call? ] -- .right-column[ ```java @State(Scope.Thread) public class Example5B { private Map
map; @Setup public void setup() { map = new HashMap<>(); } @Benchmark public int mapSize() { * return map.size(); } } ``` ] -- .right-column-incremental[ ```bash Benchmark Score Units Example5B.mapSize 2.450 ns/op ``` ] --- .left-column[ ## What's the harness overhead? ] .right-column[ ```java @State(Scope.Thread) public class Example5A { private int x; @Setup public void setup() { x = 0; } @Benchmark public int harnessOverhead() { * return x; } } ``` ] -- .right-column-incremental[ ```bash Benchmark Score Units * Example5A.harnessOverhead 1.931 ns/op Example5B.mapSize 2.450 ns/op ``` ] --- .left-column[ ## Does this change anything? ] .right-column[ ```java @State(Scope.Thread) public class Example5C { private Map
map; @Setup public void setup() { * map = new LinkedHashMap<>(); * for (int i = 0; i < 1000; i++) { * mapSize(); * } map = new HashMap<>(); } @Benchmark public int mapSize() { return map.size(); } } ``` ] -- .right-column-incremental[ ```bash Benchmark Score Units Example5A.harnessOverhead 1.931 ns/op Example5B.mapSize 2.450 ns/op * Example5C.mapSize 2.439 ns/op ``` ] --- .left-column[ ## What about this? ] .right-column[ ```java @State(Scope.Thread) public class Example5D { private Map
map; @Setup public void setup() { map = new LinkedHashMap<>(); for (int i = 0; i < 1000; i++) { mapSize(); } * map = new TreeMap<>(); * for (int i = 0; i < 1000; i++) { * mapSize(); * } map = new HashMap<>(); } @Benchmark public int mapSize() { return map.size(); } } ``` ] -- .right-column-incremental[ ```bash Benchmark Score Units Example5A.harnessOverhead 1.931 ns/op Example5B.mapSize 2.450 ns/op Example5C.mapSize 2.439 ns/op * Example5D.mapSize 3.399 ns/op ``` ] --- .left-column[ ## A more realistic example ] .right-column[ ```java @State(Scope.Thread) public class Example5E { private Map
map; @Setup public void setup() { * map = ImmutableMap.of(); for (int i = 0; i < 1000; i++) { mapSize(); } * map = ImmutableMap.of(1, 1); for (int i = 0; i < 1000; i++) { mapSize(); } * map = ImmutableMap.of(1, 1, 2, 2); } @Benchmark public int mapSize() { return map.size(); } } ``` ] -- .right-column-incremental[ ```bash Benchmark Score Units Example5A.harnessOverhead 1.931 ns/op Example5B.mapSize 2.450 ns/op Example5C.mapSize 2.439 ns/op Example5D.mapSize 3.399 ns/op * Example5D.mapSize 3.383 ns/op ``` ] ??? https://github.com/google/guava/commit/b2f0d0fad93f11bb86a2ce1770bcb43fd014fa06 --- class: center, middle .vcenter[ #Down the rabbit hole Courtesy of -XX:+PrintAssembly and JITWatch ] --- ## Monomorphic call site (HashMap only type seen at call site) ``` public int mapSize() { * return map.size(); } ``` ```x86asm [Verified Entry Point] 0x0000000002b5d1b0: mov %eax,-0x6000(%rsp) 0x0000000002b5d1b7: push %rbp 0x0000000002b5d1b8: sub $0x10,%rsp ;*synchronization entry ; - benchmark.Example5B::mapSize@-1 (line 44) 0x0000000002b5d1bc: mov 0xc(%rdx),%r11d ;*getfield map ; - benchmark.Example5B::mapSize@1 (line 44) 0x0000000002b5d1c0: mov 0x8(%r11),%r10d ; implicit exception: dispatches to 0x0000000002b5d1f1 0x0000000002b5d1c4: cmp $0x16d4c708,%r10d ; {metadata('java/util/HashMap')} *(A) 0x0000000002b5d1cb: jne 0x0000000002b5d1e0 0x0000000002b5d1cd: mov %r11,%r10 ;*invokeinterface size ; - benchmark.Example5B::mapSize@4 (line 44) 0x0000000002b5d1d0: mov 0x14(%r10),%eax ;*getfield size ; - java.util.HashMap::size@1 (line 525) ; - benchmark.Example5B::mapSize@4 (line 44) 0x0000000002b5d1d4: add $0x10,%rsp 0x0000000002b5d1d8: pop %rbp 0x0000000002b5d1d9: test %eax,-0xa3d1df(%rip) # 0x0000000002120000 ; {poll_return} 0x0000000002b5d1df: retq *(B) 0x0000000002b5d1e0: mov $0xffffffde,%edx ... ``` --- ## Bimorphic call site (HashMap and LinkedHashMap) ```x86asm [Verified Entry Point] 0x0000000002cf4d30: mov %eax,-0x6000(%rsp) 0x0000000002cf4d37: push %rbp 0x0000000002cf4d38: sub $0x10,%rsp ;*synchronization entry ; - benchmark.Example5C::mapSize@-1 (line 49) 0x0000000002cf4d3c: mov 0xc(%rdx),%r11d ;*getfield map ; - benchmark.Example5C::mapSize@1 (line 49) 0x0000000002cf4d40: mov 0x8(%r11),%r10d ; implicit exception: dispatches to 0x0000000002cf4d7d 0x0000000002cf4d44: mov %r11,%r8 0x0000000002cf4d47: cmp $0x16e8c708,%r10d ; {metadata('java/util/HashMap')} *(A) 0x0000000002cf4d4e: jne 0x0000000002cf4d60 ;*invokeinterface size ; - benchmark.Example5C::mapSize@4 (line 49) 0x0000000002cf4d50: mov 0x14(%r8),%eax ;*synchronization entry ; - benchmark.Example5C::mapSize@-1 (line 49) 0x0000000002cf4d54: add $0x10,%rsp 0x0000000002cf4d58: pop %rbp 0x0000000002cf4d59: test %eax,-0x2aa4d5f(%rip) # 0x0000000000250000 ; {poll_return} 0x0000000002cf4d5f: retq *(B) 0x0000000002cf4d60: cmp $0x16e97638,%r10d ; {metadata('java/util/LinkedHashMap')} *(C) 0x0000000002cf4d67: jne 0x0000000002cf4d6f ;*invokeinterface size ; - benchmark.Example5C::mapSize@4 (line 49) 0x0000000002cf4d69: mov 0x14(%r8),%eax ;*getfield size ; - java.util.HashMap::size@1 (line 525) ; - benchmark.Example5C::mapSize@4 (line 49) 0x0000000002cf4d6d: jmp 0x0000000002cf4d54 *(D) 0x0000000002cf4d6f: mov $0xffffffc6,%edx ... ``` --- ## Megamorphic call site (HashMap, LinkedHashMap and TreeMap) ```x86asm [Verified Entry Point] 0x0000000002bd4c90: mov %eax,-0x6000(%rsp) 0x0000000002bd4c97: push %rbp 0x0000000002bd4c98: sub $0x10,%rsp ;*synchronization entry ; - benchmark.Example5D::mapSize@-1 (line 54) 0x0000000002bd4c9c: mov 0xc(%rdx),%r10d 0x0000000002bd4ca0: mov %r10,%rdx ;*getfield map ; - benchmark.Example5D::mapSize@1 (line 54) 0x0000000002bd4ca3: xchg %ax,%ax 0x0000000002bd4ca5: movabs $0xffffffffffffffff,%rax * 0x0000000002bd4caf: callq 0x0000000002a86120 ; OopMap{off=52} * ;*invokeinterface size * ; - benchmark.Example5D::mapSize@4 (line 54) * ; {virtual_call} 0x0000000002bd4cb4: add $0x10,%rsp 0x0000000002bd4cb8: pop %rbp 0x0000000002bd4cb9: test %eax,-0x2974cbf(%rip) # 0x0000000000260000 ; {poll_return} 0x0000000002bd4cbf: retq ;*invokeinterface size ; - benchmark.Example5D::mapSize@4 (line 54) ``` --- class: center, middle .vcenter[ #No more rabbit holes! (back to fun and profit) ] --- .left-column[ ## What's the overhead of reflection? ] -- .right-column[ ```java @State(Scope.Thread) public class Example6A { private String str; private File file; @Setup public void setup() throws Exception { str = null; file = new File("."); } @Benchmark public String harnessOverhead() { * return str; } @Benchmark public String normalCall() { * return file.getPath(); } } ``` ] -- .right-column-incremental[ ```bash Benchmark Score Units Example6A.harnessOverhead 2.145 ns/op Example6A.normalCall 2.523 ns/op ``` ] --- .left-column[ ## What's the overhead of reflection? ] .right-column[ ```java @State(Scope.Thread) public class Example6B { private File file; @Setup public void setup() throws Exception { file = new File("."); } @Benchmark public Object reflection() throws Exception { * Method method = File.class.getMethod("getPath"); * return method.invoke(file); } } ``` ] -- .right-column-incremental[ ```bash Benchmark Score Units Example6A.harnessOverhead 2.145 ns/op Example6A.normalCall 2.523 ns/op * Example6B.reflection ns/op ``` ] --- .left-column[ ## What's the overhead of reflection? ] .right-column[ ```java @State(Scope.Thread) public class Example6B { private File file; @Setup public void setup() throws Exception { file = new File("."); } @Benchmark public Object reflection() throws Exception { * Method method = File.class.getMethod("getPath"); * return method.invoke(file); } } ``` ] .right-column-incremental[ ```bash Benchmark Score Units Example6A.harnessOverhead 2.145 ns/op Example6A.normalCall 2.523 ns/op * Example6B.reflection 118.563 ns/op ``` ] --- .left-column[ ## That last slide was terribly misleading ] .right-column[ ```java @State(Scope.Thread) public class Example6C { * private Method method; private File file; @Setup public void setup() throws Exception { * method = File.class.getMethod("getPath"); file = new File("."); } @Benchmark public Object reflection() throws Exception { return method.invoke(file); } } ``` ] -- .right-column-incremental[ ```bash Benchmark Score Units Example6A.harnessOverhead 2.145 ns/op Example6A.normalCall 2.523 ns/op Example6B.reflection 118.563 ns/op * Example6C.reflection ns/op ``` ] --- .left-column[ ## That last slide was terribly misleading ] .right-column[ ```java @State(Scope.Thread) public class Example6C { * private Method method; private File file; @Setup public void setup() throws Exception { * method = File.class.getMethod("getPath"); file = new File("."); } @Benchmark public Object reflection() throws Exception { return method.invoke(file); } } ``` ] .right-column-incremental[ ```bash Benchmark Score Units Example6A.harnessOverhead 2.145 ns/op Example6A.normalCall 2.523 ns/op Example6B.reflection 118.563 ns/op * Example6C.reflection 4.881 ns/op ``` ] --- .left-column[ ## Every last nanosecond counts(?) ] .right-column[ ```java @State(Scope.Thread) public class Example6D { private Method method; private File file; @Setup public void setup() throws Exception { method = File.class.getMethod("getPath"); * method.setAccessible(true); file = new File("."); } @Benchmark public Object reflection() throws Exception { return method.invoke(file); } } ``` ] -- .right-column-incremental[ ```bash Benchmark Score Units Example6A.harnessOverhead 2.145 ns/op Example6A.normalCall 2.523 ns/op Example6B.reflection 118.563 ns/op Example6C.reflection 4.881 ns/op * Example6D.reflection 4.318 ns/op ``` ] --- .left-column[ ## JMH:
multi-threaded benchmark support ] -- .right-column[ ``` * @State(Scope.Group) public class Example7A { private volatile long x; private volatile long y; * @Group * @GroupThreads(1) @Benchmark public long incrementX() { return x++; } * @Group * @GroupThreads(2) @Benchmark public long readY() { return y; } } ``` ] -- .right-column-incremental[ ```bash Benchmark Score Units Example7A.group:incrementX ns/op Example7A.group:readY ns/op ``` ] --- .left-column[ ## JMH:
multi-threaded benchmark support ] .right-column[ ``` * @State(Scope.Group) public class Example7A { private volatile long x; private volatile long y; * @Group * @GroupThreads(1) @Benchmark public long incrementX() { return x++; } * @Group * @GroupThreads(2) @Benchmark public long readY() { return y; } } ``` ] .right-column-incremental[ ```bash Benchmark Score Units Example7A.group:incrementX 30.000 ns/op Example7A.group:readY 12.710 ns/op ``` ] --- .left-column[ ## What? ] .right-column[ ``` @State(Scope.Group) public class Example7B { private volatile long x; * long p1, p2, p3, p4, p5, p6, p7; private volatile long y; @Group @GroupThreads(1) @Benchmark public int incrementX() { return x++; } @Group @GroupThreads(2) @Benchmark public int readY() { return y; } } ``` ] -- .right-column-incremental[ ```bash Benchmark Score Units Example7A.group:incrementX 30.000 ns/op Example7A.group:readY 12.710 ns/op * Example7B.group:incrementX ns/op * Example7B.group:readY ns/op ``` ] --- .left-column[ ## What? ] .right-column[ ``` @State(Scope.Group) public class Example7B { private volatile long x; * long p1, p2, p3, p4, p5, p6, p7; private volatile long y; @Group @GroupThreads(1) @Benchmark public int incrementX() { return x++; } @Group @GroupThreads(2) @Benchmark public int readY() { return y; } } ``` ] .right-column-incremental[ ```bash Benchmark Score Units Example7A.group:incrementX 30.000 ns/op Example7A.group:readY 12.710 ns/op * Example7B.group:incrementX 6.469 ns/op * Example7B.group:readY 2.006 ns/op ``` ] --- .left-column[ ## False sharing ] .right-column[.cache-line-image[ ![Cache line](cache-line.png) .cache-line-caption[(http://mechanical-sympathy.blogspot.com/2011/07/false-sharing.html)] ]] --- .left-column[ ## False sharing ] .right-column[ ### @sun.misc.Contended Introduced in JDK8 to address false sharing. ] -- .right-column-incremental[ Used in java.util.concurrent * ConcurrentHashMap * Exchanger * ForkJoinPool * Striped64 ] -- .right-column-incremental[ Can only be used outside of the bootstrap class loader with .restrict-contended[ ```bash -XX:-RestrictContended ``` ] ] --- class: center .vcenter[.recap-top-padding[ ## Recap ]] -- .vcenter[ ### JMH golden rule #1 Always return output
in order to prevent dead code elimination ] -- .vcenter[ ### JMH golden rule #2 Always read input from state objects
in order to prevent constant folding ] -- .vcenter[ ### Have fun and profit! ] --- .left-column[ ## Some resources ] .right-column[.references[ https://github.com/openjdk/jmh/tree/master/jmh-samples/src/main/java/org/openjdk/jmh/samples JMH javadocs java -jar target/benchmarks.jar -h https://shipilev.net https://psy-lob-saw.blogspot.com/p/jmh-related-posts.html ]]