Why is there no false sharing effect without volatile padding?
public class VolatileTest {
    private static class T {
        public long p1, p2, p3, p4, p5; // padding: comment this line out and run again
        public long x = 0L;
        public long y = 0L;
    }

    public static T[] arr = new T[2];

    static {
        arr[0] = new T();
        arr[1] = new T();
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(() -> {
            for (long a = 0; a < 999999999L; a++) {
                arr[0].x = a;
            }
        });
        Thread t2 = new Thread(() -> {
            for (long a = 0; a < 999999999L; a++) {
                arr[1].y = a;
            }
        });
        final long start = System.nanoTime();
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println((System.nanoTime() - start) / 1000000); // elapsed time in ms
    }
}
The first time I ran the code above, it took 519 ms. When I commented out the padding line (the p1..p5 fields) and ran it again, it took 510 ms. Why does the second version (without padding) run faster? Thanks.
Solution 1:[1]
This benchmark can't be used to draw any conclusions because you don't run it long enough. I would suggest converting it to JMH and trying again.
Apart from that, the padding is broken:
- You typically want to pad on both sides.
- With the current approach you have no clue whether the padding actually ends up in front of the fields. You typically fix this by putting the padding in a superclass and a subclass. See JCTools for examples.
- You don't pad enough. You typically want to pad 128 bytes on both sides due to adjacent cache line prefetching. You are 'padding' just 56 bytes.
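The superclass/subclass approach from the list above can be sketched roughly as follows. This is only an illustration, not the JCTools code: the names `PaddingSketch`, `LeftPad`, `Value` and `PaddedValue` are made up, and it assumes the JVM lays out superclass fields in front of subclass fields, which is not guaranteed by the language. Each side gets 16 longs (128 bytes).

```java
class PaddingSketch {
    // Hypothetical names. 16 longs = 128 bytes of padding in front of the hot field.
    static class LeftPad {
        long l01, l02, l03, l04, l05, l06, l07, l08,
             l09, l10, l11, l12, l13, l14, l15, l16;
    }

    // The field the thread actually writes. volatile, because in practice
    // HotSpot does not collapse repeated volatile stores into a single store.
    static class Value extends LeftPad {
        volatile long value;
    }

    // 16 more longs = 128 bytes of padding behind the hot field.
    static class PaddedValue extends Value {
        long r01, r02, r03, r04, r05, r06, r07, r08,
             r09, r10, r11, r12, r13, r14, r15, r16;
    }

    public static void main(String[] args) throws InterruptedException {
        PaddedValue a = new PaddedValue();
        PaddedValue b = new PaddedValue();

        // Each thread writes only to its own padded object.
        Thread t1 = new Thread(() -> {
            for (long i = 0; i < 1_000_000L; i++) {
                a.value = i;
            }
        });
        Thread t2 = new Thread(() -> {
            for (long i = 0; i < 1_000_000L; i++) {
                b.value = i;
            }
        });
        t1.start();
        t2.start();
        t1.join();
        t2.join();

        // Reading the results keeps the stores observable.
        System.out.println(a.value + " " + b.value); // prints "999999 999999"
    }
}
```

Whether the fields really end up in this order depends on the JVM version and flags, so it is worth checking the actual layout with a tool such as JOL before trusting any measurement.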
[edit]
The JIT could optimize the following code:
    Thread t1 = new Thread(() -> {
        for (long a = 0; a < 999999999L; a++) {
            arr[0].x = a;
        }
    });
to:
    Thread t1 = new Thread(() -> {
        arr[0].x = 999999999L - 1;
    });
And since the value isn't read, in theory, it could even optimize it to:
    Thread t1 = new Thread(() -> {
    });
So you need to make sure that the JIT doesn't apply dead code elimination. JMH has facilities for that.
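If you want to experiment by hand before moving to JMH, one rough way to keep the stores alive is to make the written field volatile, read it after the threads finish, and repeat the measurement several times so that early runs act as warm-up. The sketch below is a stand-in under those assumptions, not a replacement for JMH: the names `StoreLoopSketch`, `Cell` and `runOnce` are invented for illustration, the iteration count is arbitrary, and it addresses only the dead code concern, not the padding.

```java
class StoreLoopSketch {
    // Hypothetical holder. volatile, because in practice HotSpot does not
    // collapse or drop repeated volatile stores.
    static class Cell {
        volatile long v;
    }

    // Runs two writer threads, one per cell, and returns the elapsed time in ms.
    static long runOnce(Cell c1, Cell c2, long iterations) throws InterruptedException {
        Thread t1 = new Thread(() -> {
            for (long a = 0; a < iterations; a++) {
                c1.v = a;
            }
        });
        Thread t2 = new Thread(() -> {
            for (long a = 0; a < iterations; a++) {
                c2.v = a;
            }
        });
        long start = System.nanoTime();
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws InterruptedException {
        Cell c1 = new Cell();
        Cell c2 = new Cell();
        long iterations = 50_000_000L; // arbitrary; tune for your machine

        // Several runs in one JVM: the first ones mostly measure warm-up.
        for (int run = 0; run < 5; run++) {
            long ms = runOnce(c1, c2, iterations);
            // Printing the final values means the results are actually read.
            System.out.println("run " + run + ": " + ms + " ms, last values "
                    + c1.v + ", " + c2.v);
        }
    }
}
```

Numbers from a loop like this are still only indicative. JMH additionally handles forking, warm-up iterations and result consumption for you.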
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |