Each core of the AMD EPYC 7742 64-Core Processor has two 256-bit Fused Multiply-Add (FMA) units and can deliver up to 16 double-precision floating-point operations (flops) per cycle. With two processors (sockets) per node, the peak double-precision performance per node is therefore: 2 sockets x 64 cores x 2.25 GHz x 16 flops/cycle = 4.6 TF.
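
The arithmetic behind this figure can be reproduced with a short Python sketch. The variable names are illustrative only; the breakdown of 16 flops per cycle into two FMA units, four doubles per 256-bit register, and two operations (one multiply and one add) per FMA is the usual way this number is derived and is stated here as an assumption about how the 16 is obtained.

```python
# Theoretical peak double-precision performance of one node.
# Input values are taken from the text above; names are illustrative.
sockets_per_node = 2             # two EPYC 7742 processors per node
cores_per_socket = 64
clock_ghz = 2.25                 # nominal clock frequency in GHz

# Assumed decomposition of the 16 flops/cycle quoted above:
fma_units = 2                    # 256-bit FMA units per core
doubles_per_vector = 256 // 64   # four 64-bit doubles per 256-bit register
flops_per_fma = 2                # one multiply plus one add

flops_per_cycle = fma_units * doubles_per_vector * flops_per_fma
peak_gflops = sockets_per_node * cores_per_socket * clock_ghz * flops_per_cycle

print(f"{flops_per_cycle} flops/cycle per core")
print(f"{peak_gflops / 1000:.1f} TF per node")
```

This is a theoretical upper bound at the nominal clock frequency; sustained application performance is normally lower.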

Further details on the processor configuration can be retrieved in a terminal with the commands lscpu and papi_avail (the latter after loading the module cray on the system). The output should look like the following:

lscpu
$ lscpu

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
Address sizes:       43 bits physical, 48 bits virtual
CPU(s):              256
On-line CPU(s) list: 0-255
Thread(s) per core:  2
Core(s) per socket:  64
Socket(s):           2
NUMA node(s):        8
Vendor ID:           AuthenticAMD
CPU family:          23
Model:               49
Model name:          AMD EPYC 7742 64-Core Processor
Stepping:            0
CPU MHz:             2250.000
CPU max MHz:         2250.0000
CPU min MHz:         1500.0000
BogoMIPS:            4500.15
Virtualization:      AMD-V
L1d cache:           32K
L1i cache:           32K
L2 cache:            512K
L3 cache:            16384K
NUMA node0 CPU(s):   0-15,128-143
NUMA node1 CPU(s):   16-31,144-159
NUMA node2 CPU(s):   32-47,160-175
NUMA node3 CPU(s):   48-63,176-191
NUMA node4 CPU(s):   64-79,192-207
NUMA node5 CPU(s):   80-95,208-223
NUMA node6 CPU(s):   96-111,224-239
NUMA node7 CPU(s):   112-127,240-255
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 mwaitx cpb cat_l3 cdp_l3 hw_pstate sme ssbd sev ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif umip rdpid overflow_recov succor smca
papi_avail
$ module load cray
$ papi_avail | head -20

Available PAPI preset and user defined events plus hardware information.
--------------------------------------------------------------------------------
PAPI version             : 6.0.0.6
Operating system         : Linux 4.12.14-197.78_9.1.58-cray_shasta_c
Vendor string and code   : AuthenticAMD (2, 0x2)
Model string and code    : AMD EPYC 7742 64-Core Processor (49, 0x31)
CPU revision             : 0.000000
CPUID                    : Family/Model/Stepping 23/49/0, 0x17/0x31/0x00
CPU Max MHz              : 2250
CPU Min MHz              : 1500
Total cores              : 256
SMT threads per core     : 2
Cores per socket         : 64
Sockets                  : 2
Cores per NUMA region    : 32
NUMA regions             : 8
Running in a VM          : no
Number Hardware Counters : 6
Max Multiplex Counters   : 384
Fast counter read (rdpmc): no
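
The two listings count "cores" differently: lscpu reports 256 CPU(s), which are hardware threads, and PAPI reports the same number as "Total cores". A minimal Python sketch, assuming the `Key: value` layout of the lscpu output shown above, derives the physical core count and the per-NUMA-node figures. The function names `parse_lscpu` and `summarize` are hypothetical helpers, and `SAMPLE` is an excerpt copied from that listing.

```python
# Excerpt of the lscpu output shown above.
SAMPLE = """\
CPU(s):              256
On-line CPU(s) list: 0-255
Thread(s) per core:  2
Core(s) per socket:  64
Socket(s):           2
NUMA node(s):        8
"""


def parse_lscpu(text):
    """Map each 'Key: value' line of lscpu output to a dict entry."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            info[key.strip()] = value.strip()
    return info


def summarize(info):
    """Derive physical and logical counts from the parsed fields."""
    sockets = int(info["Socket(s)"])
    physical_cores = sockets * int(info["Core(s) per socket"])
    logical_cpus = int(info["CPU(s)"])
    numa_nodes = int(info["NUMA node(s)"])
    return {
        "physical_cores": physical_cores,
        "logical_cpus": logical_cpus,
        "cores_per_numa": physical_cores // numa_nodes,
        "cpus_per_numa": logical_cpus // numa_nodes,
    }


summary = summarize(parse_lscpu(SAMPLE))
for name, value in summary.items():
    print(f"{name}: {value}")
```

With the values above, each NUMA node holds 16 physical cores and 32 hardware threads, which matches both the "NUMA node0 CPU(s): 0-15,128-143" line of lscpu and the "Cores per NUMA region: 32" line of papi_avail.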

A plot of the processor's topology can be written to a Scalable Vector Graphics (SVG) file with the lstopo command, which is available after loading the hwloc module, as shown in the example below:

Processor's topology
$ module load cray hwloc
$ lstopo --of svg --no-io --no-bridges > topology.svg