OV - Compare Loops

MAQAO

Loops

▶simulation.cpp: 221 - 16.27 %

ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)
Run Skylake ICPX Ofast						Run Neoverse ACFL Ofast
Loop Source Regions	/home/fmusial/MD_Benchmarks/simulation.cpp: 221-221					Loop Source Regions	/home/fmusial/MD_Benchmarks/simulation.cpp: 221-221
32	3.12	2.92	5.63	0	12.5	23	2.50	2.22	10.63	0	25

Sum on 1 analyzed binary loop (md-icpx-Ofast - 32)						Sum on 1 analyzed binary loop (md-acfl-Ofast - 23)
Analysis					Count	Analysis					Count
Loop Computation Issues						Loop Computation Issues
Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA					1	Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA					1

▶simulation.cpp: 229 - 15.86 %

ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)
Run Skylake ICPX Ofast						Run Neoverse ACFL Ofast
Loop Source Regions	/home/fmusial/MD_Benchmarks/simulation.cpp: 229-229					Loop Source Regions	/home/fmusial/MD_Benchmarks/simulation.cpp: 229-229
34	3.48	3.18	6.13	0	12.5	25	2.17	2.03	9.73	0	25

Sum on 1 analyzed binary loop (md-icpx-Ofast - 34)						Sum on 1 analyzed binary loop (md-acfl-Ofast - 25)
Analysis					Count	Analysis					Count
Loop Computation Issues						Loop Computation Issues
Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA					1	Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA					1

▶simulation.cpp: 225 - 15.62 %

ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)
Run Skylake ICPX Ofast						Run Neoverse ACFL Ofast
Loop Source Regions	/home/fmusial/MD_Benchmarks/simulation.cpp: 225-225					Loop Source Regions	/home/fmusial/MD_Benchmarks/simulation.cpp: 225-225
33	3.58	3.31	6.39	0	12.5	24	2.08	1.92	9.23	0	25

Sum on 1 analyzed binary loop (md-icpx-Ofast - 33)						Sum on 1 analyzed binary loop (md-acfl-Ofast - 24)
Analysis					Count	Analysis					Count
Loop Computation Issues						Loop Computation Issues
Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA					1	Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA					1

▶simulation.cpp: 208 - 13.66 %

ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)
Run Skylake ICPX Ofast						Run Neoverse ACFL Ofast
Loop Source Regions	/home/fmusial/MD_Benchmarks/simulation.cpp: 208-208					Loop Source Regions	/home/fmusial/MD_Benchmarks/simulation.cpp: 208-208
29	2.95	2.71	5.22	0	12.5	20	1.98	1.76	8.45	0	25

Sum on 1 analyzed binary loop (md-icpx-Ofast - 29)						Sum on 1 analyzed binary loop (md-acfl-Ofast - 20)
Analysis					Count	Analysis					Count
Loop Computation Issues						Loop Computation Issues
Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA					1	Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA					1

▶simulation.cpp: 212 - 10.37 %

ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)
Run Skylake ICPX Ofast						Run Neoverse ACFL Ofast
Loop Source Regions	/home/fmusial/MD_Benchmarks/simulation.cpp: 212-212					Loop Source Regions	/home/fmusial/MD_Benchmarks/simulation.cpp: 212-212
30	3.05	2.61	5.03	0	12.5	21	1.24	1.11	5.34	0	25

Sum on 1 analyzed binary loop (md-icpx-Ofast - 30)						Sum on 1 analyzed binary loop (md-acfl-Ofast - 21)
Analysis					Count	Analysis					Count
Loop Computation Issues						Loop Computation Issues
Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA					1	Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA					1

▶simulation.cpp: 216 - 10.36 %

ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)
Run Skylake ICPX Ofast						Run Neoverse ACFL Ofast
Loop Source Regions	/home/fmusial/MD_Benchmarks/simulation.cpp: 216-216					Loop Source Regions	/home/fmusial/MD_Benchmarks/simulation.cpp: 216-216
31	2.80	2.53	4.88	0	12.5	22	1.25	1.14	5.48	0	25

Sum on 1 analyzed binary loop (md-icpx-Ofast - 31)						Sum on 1 analyzed binary loop (md-acfl-Ofast - 22)
Analysis					Count	Analysis					Count
Loop Computation Issues						Loop Computation Issues
Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA					1	Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA					1
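
For a quick side-by-side reading of the six hottest loops above, the "Max Time Over Threads (s)" values of the two runs can be divided loop by loop. The sketch below is illustrative only: the numbers are copied from the tables above, while the names `max_time_s` and `time_ratio` are hypothetical helpers and not part of MAQAO. The ratio compares two different machines and compilers, so it describes these two runs only.

```python
# Max Time Over Threads (s) per source line of simulation.cpp,
# copied from the tables above as (Skylake ICPX Ofast, Neoverse ACFL Ofast).
max_time_s = {
    221: (3.12, 2.50),
    229: (3.48, 2.17),
    225: (3.58, 2.08),
    208: (2.95, 1.98),
    212: (3.05, 1.24),
    216: (2.80, 1.25),
}


def time_ratio(skylake_s, neoverse_s):
    """Return Skylake time divided by Neoverse time.

    A value above 1 means the loop took less time in the Neoverse run.
    """
    return skylake_s / neoverse_s


# One ratio per source line.
ratios = {line: time_ratio(*times) for line, times in max_time_s.items()}

for line, ratio in sorted(ratios.items()):
    print(f"simulation.cpp:{line}  Skylake/Neoverse = {ratio:.2f}")
```

All six ratios are above 1, and the largest one belongs to the loop at line 212.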

▶simulation.cpp: 341 - 0.66 %

ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)
Run Skylake ICPX Ofast						Run Neoverse ACFL Ofast
Loop Source Regions	/home/fmusial/MD_Benchmarks/simulation.cpp: 341-345					Loop Source Regions	/home/fmusial/MD_Benchmarks/simulation.cpp: 342-344
46	0.23	0.17	0.33	100	50	39	0.11	0.07	0.34	100	100

No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.						No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Analysis					Count	Analysis					Count

▶simulation.cpp: 312 - 0.53 %

ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)
Run Skylake ICPX Ofast						Run Neoverse ACFL Ofast
Loop Source Regions	/home/fmusial/MD_Benchmarks/simulation.cpp: 312-326					Loop Source Regions	/home/fmusial/MD_Benchmarks/simulation.cpp: 313-325
43	0.31	0.24	0.47	91.3	46.47	37	0.03	0.01	0.06	100	93.27

Sum on 1 analyzed binary loop (md-icpx-Ofast - 43)						No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Analysis					Count	Analysis					Count
Loop Computation Issues
Presence of a large number of scalar integer instructions					1
Control Flow Issues
Presence of calls					1
Data Access Issues
More than 20% of the loads are accessing the stack					1
Vectorization Roadblocks
Presence of calls					1

▶simulation.cpp: 107 - 0.25 %

ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)
Run Skylake ICPX Ofast						Run Neoverse ACFL Ofast
Loop Source Regions	/usr/lib64/gcc/x86_64-pc-linux-gnu/15.1.1/../../../../include/c++/15.1.1/bits/stl_vector.h: 1264-1264 /usr/lib64/gcc/x86_64-pc-linux-gnu/15.1.1/../../../../include/c++/15.1.1/bits/stl_algobase.h: 239-239 /home/fmusial/MD_Benchmarks/simulation.cpp: 107-116 /home/fmusial/MD_Benchmarks/simulation.cpp: 118-118 /home/fmusial/MD_Benchmarks/simulation.cpp: 124-126					Loop Source Regions	/opt/arm/gcc-14.2.0_Ubuntu-20.04/lib/gcc/aarch64-linux-gnu/14.2.0/../../../../include/c++/14.2.0/bits/stl_algobase.h: 238-238 /opt/arm/gcc-14.2.0_Ubuntu-20.04/lib/gcc/aarch64-linux-gnu/14.2.0/../../../../include/c++/14.2.0/bits/stl_vector.h: 1131-1131 /home/fmusial/MD_Benchmarks/simulation.cpp: 107-125
21	0.10	0.09	0.17	0	6.25	16	0.03	0.02	0.08	0	18.75
23	0.02	0.00	0.01	100	41.67

No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.						No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Analysis					Count	Analysis					Count

▶<unknown>: 0 - 0.13 %

ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)
Run Skylake ICPX Ofast						Run Neoverse ACFL Ofast
Loop Source Regions						Loop Source Regions
7	0.01	0.00	0.00	0	0	12	0.03	0.01	0.07	0	0
12	0.01	0.00	0.00	0	0	9	0.00	0.00	0.00	0	0
17	0.05	0.03	0.05	0	0	64	0.00	0.00	0.00	0	0
18	0.01	0.00	0.00	0	0	15	0.00	0.00	0.00	0	0
55	0.00	0.00	0.00	0	0
74	0.01	0.00	0.00	0	0
61	0.00	0.00	0.00	0	0

No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.						No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Analysis					Count	Analysis					Count

▶simulation.cpp: 51 - 0.08 %

ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)
Run Skylake ICPX Ofast						Run Neoverse ACFL Ofast
Loop Source Regions	/usr/lib64/gcc/x86_64-pc-linux-gnu/15.1.1/../../../../include/c++/15.1.1/bits/stl_algobase.h: 239-239 /home/fmusial/MD_Benchmarks/simulation.cpp: 51-53 /home/fmusial/MD_Benchmarks/simulation.cpp: 62-62 /home/fmusial/MD_Benchmarks/simulation.cpp: 68-69					Loop Source Regions	/opt/arm/gcc-14.2.0_Ubuntu-20.04/lib/gcc/aarch64-linux-gnu/14.2.0/../../../../include/c++/14.2.0/bits/stl_algobase.h: 238-238 /home/fmusial/MD_Benchmarks/simulation.cpp: 51-68
11	0.05	0.03	0.05	0	6.25	10	0.02	0.01	0.04	0	19.12

No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.						No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Analysis					Count	Analysis					Count
