Memory bandwidth and latency measurements Tier-1

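This page lists memory bandwidth and latency for main memory, as well as cache-to-cache transfer latencies for the L2 and last-level cache (LLC). All measurements were performed with Intel’s Memory Latency Checker (mlc and mlc_avx512). The raw tool output is reproduced below for each node type.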

skylake nodes Tier-1

Intel(R) Memory Latency Checker - v3.5
Measuring idle latencies (in ns)...
                Numa node
Numa node            0       1  
       0          79.8   131.4  
       1         130.6    77.4  

Measuring Peak Injection Memory Bandwidths for the system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads        :      221550.1        
3:1 Reads-Writes :      191572.6        
2:1 Reads-Writes :      188084.9        
1:1 Reads-Writes :      181321.7        
Stream-triad like:      168425.4        

Measuring Memory Bandwidths between nodes within system 
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
                Numa node
Numa node        0         1
       0  111180.9   34369.0
       1   34380.0  110908.4

Measuring Loaded Latencies for the system
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
Inject  Latency Bandwidth
Delay   (ns)    MB/sec
==========================
 00000  147.22   221598.3
 00002  150.03   221530.8
 00008  146.98   221641.0
 00015  146.20   221253.5
 00050  134.65   214083.3
 00100  109.19   148517.0
 00200   95.20    95575.8
 00300   91.28    68113.2
 00400   89.16    53086.4
 00500   87.85    42872.3
 00700   85.61    31657.3
 01000   83.95    22734.5
 01300   83.71    17786.4
 01700   83.31    13849.3
 02500   84.12     9699.3
 03500   82.46     7183.2
 05000   82.13     5268.8
 09000   81.33     3273.9
 20000   79.21     1936.2

Measuring cache-to-cache transfer latency (in ns)...
Local Socket L2->L2 HIT  latency        49.4
Local Socket L2->L2 HITM latency        49.7
Remote Socket L2->L2 HITM latency (data address homed in writer socket)
                        Reader Numa Node
Writer Numa Node     0       1  
            0        -   110.8  
            1    111.2       -  
Remote Socket L2->L2 HITM latency (data address homed in reader socket)
                        Reader Numa Node
Writer Numa Node     0       1  
            0        -   177.1  
            1    178.1       -  
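
One way to read the loaded-latency table above is through Little's law: the amount of data in flight equals bandwidth times latency. The sketch below applies this to a few rows copied from the skylake table. It assumes 64-byte cache lines; the names `CACHE_LINE_BYTES`, `SAMPLES` and `lines_in_flight` are illustrative and are not part of mlc.

```python
# Rough Little's-law estimate: data in flight = bandwidth x latency.
CACHE_LINE_BYTES = 64  # assumption: 64-byte cache lines

# (inject delay, latency in ns, bandwidth in MB/s), copied from the skylake table
SAMPLES = [
    (0, 147.22, 221598.3),
    (100, 109.19, 148517.0),
    (1000, 83.95, 22734.5),
    (20000, 79.21, 1936.2),
]


def lines_in_flight(latency_ns, bandwidth_mb_s):
    """Estimate how many cache lines are outstanding system-wide."""
    # MB/s uses 1 MB = 1,000,000 bytes, as stated in the mlc output.
    bytes_in_flight = bandwidth_mb_s * 1e6 * latency_ns * 1e-9
    return bytes_in_flight / CACHE_LINE_BYTES


for delay, lat, bw in SAMPLES:
    print(f"delay {delay:>5}: ~{lines_in_flight(lat, bw):.0f} cache lines in flight")
```

At zero inject delay this gives roughly 510 outstanding cache lines for the whole node, while at the largest delay only a handful are in flight, which is consistent with the latency approaching the idle value.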

broadwell nodes Tier-1

Intel(R) Memory Latency Checker - v3.1a
Measuring idle latencies (in ns)...
                Numa node
Numa node            0       1       2       3  
       0          73.8   151.1   185.7   196.0  
       1         141.1    76.5   184.6   194.2  
       2         185.3   195.0    73.7   150.9  
       3         184.8   194.6   141.1    76.4  

Measuring Peak Memory Bandwidths for the system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads        :      144906.6        
3:1 Reads-Writes :      130347.1        
2:1 Reads-Writes :      126626.7        
1:1 Reads-Writes :      114846.3        
Stream-triad like:      125606.5        

Measuring Memory Bandwidths between nodes within system 
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
                Numa node
Numa node        0         1         2         3
       0   36481.7   18909.5   16039.3   15127.7
       1   20194.2   36376.3   15812.5   14946.7
       2   16090.9   15191.0   36507.0   18850.3
       3   15775.5   14939.6   20151.5   36348.1

Measuring Loaded Latencies for the system
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
Inject  Latency Bandwidth
Delay   (ns)    MB/sec
==========================
 00000  203.58   145200.4
 00002  202.96   145260.1
 00008  198.28   145314.3
 00015  192.23   145305.6
 00050  164.33   144868.6
 00100  103.73   115424.2
 00200   91.29    70055.4
 00300   87.57    49582.2
 00400   85.15    38364.5
 00500   83.05    30718.6
 00700   80.21    22785.3
 01000   78.70    16523.2
 01300   78.55    13031.8
 01700   77.55    10250.7
 02500   76.68     7307.4
 03500   75.41     5504.1
 05000   74.85     4127.0
 09000   74.42     2685.3
 20000   74.12     1687.0

Measuring cache-to-cache transfer latency (in ns)...
Local Socket L2->L2 HIT  latency        34.4
Local Socket L2->L2 HITM latency        39.2
Remote Socket LLC->LLC HITM latency (data address homed in writer socket)
                        Reader Numa Node
Writer Numa Node     0       1       2       3  
            0        -    46.9    77.5    80.6  
            1     48.8       -    82.1    85.3  
            2     78.3    80.7       -    46.8  
            3     82.6    85.5    47.6       -  
Remote Socket LLC->LLC HITM latency (data address homed in reader socket)
                        Reader Numa Node
Writer Numa Node     0       1       2       3  
            0        -    98.9   132.0   145.5  
            1     95.5       -   134.4   148.4  
            2    132.2   145.5       -    98.9  
            3    134.6   148.7    94.1       -
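
To compare node types, it can help to reduce an idle-latency matrix to a single remote-to-local ratio. The following is a minimal sketch that parses a matrix in the layout mlc prints and divides the mean off-diagonal latency by the mean diagonal latency. The helper names `parse_numa_matrix` and `remote_to_local_ratio` are hypothetical, and the parser only assumes the layout shown on this page.

```python
BROADWELL_IDLE = """
                Numa node
Numa node            0       1       2       3
       0          73.8   151.1   185.7   196.0
       1         141.1    76.5   184.6   194.2
       2         185.3   195.0    73.7   150.9
       3         184.8   194.6   141.1    76.4
"""


def parse_numa_matrix(text):
    """Parse an mlc-style NUMA matrix into {(row_node, col_node): value}."""
    rows = [ln.split() for ln in text.strip().splitlines()]
    # The column header is the "Numa node" line that also carries node ids.
    header = next(r for r in rows if r[:2] == ["Numa", "node"] and len(r) > 2)
    cols = [int(c) for c in header[2:]]
    matrix = {}
    for r in rows:
        if r and r[0].isdigit() and len(r) == len(cols) + 1:
            for col, val in zip(cols, r[1:]):
                matrix[(int(r[0]), col)] = float(val)
    return matrix


def remote_to_local_ratio(matrix):
    """Mean off-diagonal latency divided by mean diagonal latency."""
    local = [v for (r, c), v in matrix.items() if r == c]
    remote = [v for (r, c), v in matrix.items() if r != c]
    return (sum(remote) / len(remote)) / (sum(local) / len(local))


ratio = remote_to_local_ratio(parse_numa_matrix(BROADWELL_IDLE))
print(f"remote/local idle latency ratio: {ratio:.2f}")
```

For the broadwell matrix this comes out at about 2.3; the same functions applied to the two-node skylake matrix give about 1.7. Averaging all remote nodes hides the difference between the nearer and the farther remote nodes visible in the broadwell table, so the ratio is only a coarse summary.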