Memory bandwidth and latency measurements (Thinking)
This page reports memory bandwidth and latency for main memory, as well as
cache-to-cache transfer latencies for the L2 and last-level (LLC) caches. The
measurements were performed with Intel’s Memory Latency Checker (mlc and
mlc_avx512).
IvyBridge CPUs
Intel(R) Memory Latency Checker - v3.1a
Measuring idle latencies (in ns)...
            Numa node
Numa node       0       1
0            61.7   115.1
1           114.6    61.2
Measuring Peak Memory Bandwidths for the system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads : 110395.9
3:1 Reads-Writes : 93339.7
2:1 Reads-Writes : 88859.9
1:1 Reads-Writes : 81146.3
Stream-triad like: 93082.4
Measuring Memory Bandwidths between nodes within system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
              Numa node
Numa node         0         1
0           55263.8   25501.2
1           25279.7   55268.5
Measuring Loaded Latencies for the system
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
Inject  Latency  Bandwidth
Delay   (ns)     MB/sec
==========================
00000   189.18   110655.5
00002   188.50   110637.9
00008   184.44   110496.9
00015   180.58   110233.8
00050   127.01   105138.4
00100    99.42    83223.3
00200    85.10    54041.9
00300    79.93    39605.1
00400    76.24    31287.1
00500    74.38    25905.8
00700    70.63    19396.5
01000    70.87    14183.6
01300    69.09    11290.3
01700    67.61     8962.1
02500    66.46     6485.2
03500    65.77     4950.3
05000    65.14     3783.7
09000    64.49     2558.4
20000    64.10     1706.5
Measuring cache-to-cache transfer latency (in ns)...
Local Socket L2->L2 HIT latency 24.8
Local Socket L2->L2 HITM latency 28.3
Remote Socket LLC->LLC HITM latency (data address homed in writer socket)
                   Reader Numa Node
Writer Numa Node         0         1
0                        -      70.4
1                     69.7         -
Remote Socket LLC->LLC HITM latency (data address homed in reader socket)
                   Reader Numa Node
Writer Numa Node         0         1
0                        -      69.6
1                     69.1         -
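
As a rough way to read the IvyBridge matrices above, the following sketch computes the remote-to-local ratio for idle latency and for read bandwidth. The values are copied from the output; the helper name `numa_ratio` and the choice to average over both nodes are illustrative assumptions, not something MLC provides.

```python
# Values copied from the IvyBridge MLC output above, laid out as printed.
idle_latency_ns = [[61.7, 115.1],
                   [114.6, 61.2]]
read_bandwidth_mb_s = [[55263.8, 25501.2],
                       [25279.7, 55268.5]]


def numa_ratio(matrix):
    """Return mean(off-diagonal) / mean(diagonal) of a square matrix.

    Hypothetical helper: the diagonal is treated as local access,
    every other entry as remote access.
    """
    n = len(matrix)
    local = [matrix[i][i] for i in range(n)]
    remote = [matrix[i][j] for i in range(n) for j in range(n) if i != j]
    return (sum(remote) / len(remote)) / (sum(local) / len(local))


latency_ratio = numa_ratio(idle_latency_ns)
bandwidth_ratio = numa_ratio(read_bandwidth_mb_s)

print(f"remote/local idle latency:   {latency_ratio:.2f}x")
print(f"remote/local read bandwidth: {bandwidth_ratio:.2f}x")
```

With these figures, remote access costs roughly 1.9 times the local idle latency and reaches a bit under half of the local read bandwidth.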
Haswell CPUs
Intel(R) Memory Latency Checker - v3.1a
Measuring idle latencies (in ns)...
            Numa node
Numa node       0       1
0            92.1   127.1
1           129.4    90.0
Measuring Peak Memory Bandwidths for the system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads : 122130.1
3:1 Reads-Writes : 112893.9
2:1 Reads-Writes : 110465.2
1:1 Reads-Writes : 99124.2
Stream-triad like: 109274.7
Measuring Memory Bandwidths between nodes within system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
              Numa node
Numa node         0         1
0           60758.6   30587.2
1           30602.6   61562.1
Measuring Loaded Latencies for the system
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
Inject  Latency  Bandwidth
Delay   (ns)     MB/sec
==========================
00000   200.00   122365.1
00002   198.94   122264.6
00008   191.95   122679.7
00015   186.45   122510.3
00050   158.72   121452.3
00100   121.01   106422.8
00200   106.05    68622.5
00300   101.73    49855.1
00400    98.84    38790.7
00500    96.82    31029.3
00700    94.95    23026.0
01000    93.60    16702.3
01300    93.42    13138.7
01700    92.76    10329.2
02500    91.57     7323.9
03500    91.29     5477.1
05000    90.65     4065.1
09000    90.06     2586.9
20000    89.46     1562.5
Measuring cache-to-cache transfer latency (in ns)...
Local Socket L2->L2 HIT latency 33.2
Local Socket L2->L2 HITM latency 36.6
Remote Socket LLC->LLC HITM latency (data address homed in writer socket)
                   Reader Numa Node
Writer Numa Node         0         1
0                        -      75.7
1                     75.8         -
Remote Socket LLC->LLC HITM latency (data address homed in reader socket)
                   Reader Numa Node
Writer Numa Node         0         1
0                        -      76.0
1                     76.0         -
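
One possible way to interpret the loaded-latency tables of both CPU types is Little's law: the amount of data in flight equals bandwidth times latency. The sketch below applies it to the zero-delay rows copied from the two tables. The 64-byte cache line size and the helper name `lines_in_flight` are assumptions made for illustration; MLC does not report either.

```python
CACHE_LINE_BYTES = 64  # assumption: typical x86 cache line size


def lines_in_flight(latency_ns, bandwidth_mb_s, line_bytes=CACHE_LINE_BYTES):
    """Little's law: concurrency = throughput * latency.

    MLC reports bandwidth in MB/sec with 1 MB = 1,000,000 bytes.
    """
    bytes_per_ns = bandwidth_mb_s * 1e6 / 1e9
    return bytes_per_ns * latency_ns / line_bytes


# Zero-inject-delay rows (latency in ns, bandwidth in MB/sec)
# copied from the loaded-latency tables above.
zero_delay = {
    "IvyBridge": (189.18, 110655.5),
    "Haswell": (200.00, 122365.1),
}

in_flight = {
    cpu: lines_in_flight(latency, bandwidth)
    for cpu, (latency, bandwidth) in zero_delay.items()
}
for cpu, lines in in_flight.items():
    print(f"{cpu}: about {lines:.0f} cache lines in flight")
```

Because the loaded-latency run uses all threads, the result is an aggregate over the whole system rather than a per-core figure.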