Metric                                                       r0    r1    r2    r3    r4    r5    r6
Perfect OpenMP/MPI/Pthread/TBB + Perfect Load Distribution   1.21  1.22  1.21  1.20  1.20  1.20  1.20
No Scalar Integer: Potential Speedup                         1.43  1.43  1.05  1.05  1.09  1.09  1.09
No Scalar Integer: Nb Loops to get 80%                       6     6     1     1     1     1     1
FP Vectorised: Potential Speedup                             2.71  2.66  2.41  2.40  2.42  2.43  2.43
FP Vectorised: Nb Loops to get 80%                           7     7     6     6     6     6     6
Fully Vectorised: Potential Speedup                          3.21  3.13  3.04  3.03  3.08  3.08  3.08
Fully Vectorised: Nb Loops to get 80%                        7     7     7     7     7     7     7
Only FP Arithmetic: Potential Speedup                        1.60  1.59  1.25  1.25  1.25  1.25  1.25
Only FP Arithmetic: Nb Loops to get 80%                      6     6     1     1     1     1     1

Cumulated Speedup (charts): If No Scalar Integer, If FP Vectorized, If Fully Vectorized, If Only FP Arithmetic
Loop Based Profiles (chart): Innermost / Single Loops, Inbetween Loops, Outermost Loops, Cumulated Coverage With All Loops
Innermost Loop Based Profiles (chart): Coverage, Count
Application Categorization (chart): Time, Coverage

Compilation Options (Source Object: Issue)

r0, r1: md-gcc-Ofast
  simulation.cpp: no issue text listed

r2, r3: md-clang-O3-ffast-math
  simulation.cpp:
    ○ -g is missing for some functions (possibly ones added by the compiler), but debug locations are available. Some analysis may be inaccurate. Try to complement -g with -grecord-gcc-switches or -frecord-command-line.
    ○ -O2, -O3 or -Ofast is missing.
    ○ -march=(target) is missing.

r4, r5: md-icpx-Ofast
  random.h: no issue text listed
  simulation.cpp: no issue text listed

r6: md-icpx-O3
  random.h: no issue text listed
  simulation.cpp: no issue text listed
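
The three messages reported for the Clang builds each name a compiler flag, so a build line that addresses them might look like the sketch below. It is only an illustration: the report does not show the actual build commands, so the single-file compile of simulation.cpp and the use of -march=native (which assumes the build happens on the Skylake machine itself) are assumptions. The output name is taken from the report.

```sh
# Hypothetical rebuild of the Clang variant with the flags the report asks for.
# Assumption: compiling on the target node, so -march=native picks the right ISA;
# when cross-compiling, name the target explicitly (e.g. -march=skylake-avx512).
# -g with -grecord-gcc-switches records the command line in the debug info,
# as suggested by the first message.
clang++ -O3 -ffast-math -march=native -g -grecord-gcc-switches \
    simulation.cpp -o md-clang-O3-ffast-math
```

After rebuilding, rerunning the analysis should show whether the three messages for simulation.cpp have cleared.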

Path Count Profiles (chart): Coverage, Count
Low Iteration Count Profiles (chart): Coverage, Count
Average Number of Active Threads (chart)
Run 1 - Skylake GCC Ofast
Run 2 - Skylake GCC Ofast - Code Optimisations version
Run 3 - Skylake Clang O3-ffast-math
Run 4 - Skylake Clang O3-ffast-math - Code Optimisations version
Run 5 - Skylake ICPX Ofast
Run 6 - Skylake ICPX Ofast - Code Optimisations version
Run 7 - Skylake ICPX O3 - Code Optimisations version
Experiment Summaries
r0
r1
r2
r3
r4
r5
r6
Experiment Name
MD scalability Skylake 2-52 threads runs | Version : gcc-Ofast
MD scalability Skylake 2-52 threads runs - Code Optimisations version | Version : gcc-Ofast