* [MAQAO] Info: Detected 2 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 5412840 208814389
executing #MPI = 1 and #OMP = 1
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L D L^T Solver for general symmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 5412840
Average density of rows/columns = 75
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 67.6370
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 3.9708
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 4678450993
-- (3) Real space for factors (estimated) = 4815647661
-- (4) Integer space for factors (estimated) = 63960178
-- (5) Maximum frontal size (estimated) = 15351
-- (6) Number of nodes in the tree = 167568
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 0
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 0
Number of split nodes = 0
RINFOG(1) Operations during elimination (estim)= 1.833D+13
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Total space in MBytes, IC factorization (INFOG(17)): 54260
Total space in MBytes, OOC factorization (INFOG(27)): 6388
Elapsed time in analysis driver= 79.4702
Analysis time by clock_gettime(): 79.470 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 5412840 208814389
executing #MPI = 1 and #OMP = 1
Elapsed time in save structure driver= 0.0004
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 5412840 208814389
executing #MPI = 1 and #OMP = 1
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
[0m
****** FACTORIZATION STEP ********
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 1
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 4815647661
INFOG(4) Integer space for factors (estim.)= 63960178
Maximum frontal size (estimated) = 15351
Number of nodes in the tree = 167568
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
Statistics on the scaling phase
Elapsed time for scaling = 4.7532
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.12D+00
Effective size of S (based on INFO(39))= 6394605306
Redistrib: total data local/sent = 0 0
Elapsed time to reformat/distribute matrix = 4.7626
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 230000
Size of async. emission buffer (bytes).. = 566623
Small emission buffer (bytes) .......... = 20
** Memory allocated, total in Mbytes (INFOG(19)): 54260
** Memory effectively used, total in Mbytes (INFOG(22)): 42563
Elapsed time for factorization = 464.8930
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 9.091D+09
------ (3) Operations in node elimination = 1.836D+13
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 4829216999
INFOG (10) Integer space for factors = 64006664
INFOG (11) Maximum front size = 15351
INFOG (29) Number of entries in factors = 4691271697
INFOG (12) Number of negative pivots = 73938
INFOG (13) Number of delayed pivots = 23243
Number of 2x2 pivots in type 1 nodes = 36969
Number of 2x2 pivots in type 2 nodes = 0
RINFOG(19) Smallest pivot WITH perturbed pivots = 9.314D-07
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 9.314D-07
RINFOG(21) Largest pivot in absolute value = 1.000D+00
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 0
Elapsed time in factorization driver = 474.5204
Factorization time by clock_gettime(): 474.5039 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 1 and #OMP = 1
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_0
To display your profiling results:
################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_0 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_0 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_0 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_0 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_0 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_0 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_0 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_0 #
################################################################################################################################################################
* [MAQAO] Info: Detected 2 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 5412840 208814389
executing #MPI = 2 and #OMP = 1
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L D L^T Solver for general symmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 5412840
Average density of rows/columns = 75
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 67.3363
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 3.9260
A root of estimated size 8025 has been selected for Scalapack.
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 4678450993
-- (3) Real space for factors (estimated) = 4844627931
-- (4) Integer space for factors (estimated) = 63985906
-- (5) Maximum frontal size (estimated) = 15351
-- (6) Number of nodes in the tree = 167568
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 0
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 2
Number of split nodes = 0
RINFOG(1) Operations during elimination (estim)= 1.850D+13
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Maximum estim. space in Mbytes, IC facto. (INFOG(16)): 32074
Total space in MBytes, IC factorization (INFOG(17)): 58146
Maximum estim. space in Mbytes, OOC facto. (INFOG(26)): 5552
Total space in MBytes, OOC factorization (INFOG(27)): 10871
Elapsed time in analysis driver= 80.0160
Analysis time by clock_gettime(): 80.014 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 5412840 208814389
executing #MPI = 2 and #OMP = 1
Elapsed time in save structure driver= 0.0004
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 5412840 208814389
executing #MPI = 2 and #OMP = 1
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
****** FACTORIZATION STEP ********
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 2
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 4844627931
INFOG(4) Integer space for factors (estim.)= 63985906
Maximum frontal size (estimated) = 15351
Number of nodes in the tree = 167568
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
[0m
Statistics on the scaling phase
Elapsed time for scaling = 4.7992
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.12D+00
Average Effective size of S (based on INFO(39))= 3418775043
Elapsed time to reformat/distribute matrix = 5.7580
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 19754932
Size of async. emission buffer (bytes).. = 79217269
Small emission buffer (bytes) .......... = 248
** Memory allocated, max in Mbytes (INFOG(18)): 32074
** Memory allocated, total in Mbytes (INFOG(19)): 58154
** Memory effectively used, max in Mbytes (INFOG(21)): 25254
** Memory effectively used, total in Mbytes (INFOG(22)): 45256
Elapsed time to process root node = 2.8534
Elapsed time for factorization = 261.7727
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 9.091D+09
------ (3) Operations in node elimination = 1.853D+13
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 4858197269
INFOG (10) Integer space for factors = 64032420
INFOG (11) Maximum front size = 15351
INFOG (29) Number of entries in factors = 4691271697
INFOG (12) Number of negative pivots = 73938
INFOG (13) Number of delayed pivots = 23243
Number of 2x2 pivots in type 1 nodes = 36969
Number of 2x2 pivots in type 2 nodes = 0
RINFOG(19) Smallest pivot WITH perturbed pivots = 9.314D-07
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 9.314D-07
RINFOG(21) Largest pivot in absolute value = 1.000D+00
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 0
Elapsed time in factorization driver = 272.4179
Factorization time by clock_gettime(): 272.4086 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 2 and #OMP = 1
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_1
To display your profiling results:
################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_1 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_1 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_1 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_1 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_1 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_1 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_1 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_1 #
################################################################################################################################################################
* [MAQAO] Info: Detected 4 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 5412840 208814389
executing #MPI = 4 and #OMP = 1
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L D L^T Solver for general symmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 5412840
Average density of rows/columns = 75
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 67.5444
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 3.9754
A root of estimated size 8025 has been selected for Scalapack.
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 4678450993
-- (3) Real space for factors (estimated) = 4844627931
-- (4) Integer space for factors (estimated) = 64090936
-- (5) Maximum frontal size (estimated) = 15351
-- (6) Number of nodes in the tree = 167568
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 0
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 5
Number of split nodes = 0
RINFOG(1) Operations during elimination (estim)= 1.850D+13
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Maximum estim. space in Mbytes, IC facto. (INFOG(16)): 17751
Total space in MBytes, IC factorization (INFOG(17)): 60826
Maximum estim. space in Mbytes, OOC facto. (INFOG(26)): 4510
Total space in MBytes, OOC factorization (INFOG(27)): 14610
Elapsed time in analysis driver= 80.3059
Analysis time by clock_gettime(): 80.303 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 5412840 208814389
executing #MPI = 4 and #OMP = 1
Elapsed time in save structure driver= 0.0004
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 5412840 208814389
executing #MPI = 4 and #OMP = 1
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
****** FACTORIZATION STEP ********
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 4
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 4844627931
INFOG(4) Integer space for factors (estim.)= 64090936
Maximum frontal size (estimated) = 15351
Number of nodes in the tree = 167568
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
[0m
Statistics on the scaling phase
Elapsed time for scaling = 4.7977
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.12D+00
Average Effective size of S (based on INFO(39))= 1779886916
Elapsed time to reformat/distribute matrix = 4.8547
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 18243492
Size of async. emission buffer (bytes).. = 73156393
Small emission buffer (bytes) .......... = 644
** Memory allocated, max in Mbytes (INFOG(18)): 17788
** Memory allocated, total in Mbytes (INFOG(19)): 60935
** Memory effectively used, max in Mbytes (INFOG(21)): 13832
** Memory effectively used, total in Mbytes (INFOG(22)): 46664
Elapsed time to process root node = 1.7368
Elapsed time for factorization = 155.2789
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 9.093D+09
------ (3) Operations in node elimination = 1.853D+13
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 4858197269
INFOG (10) Integer space for factors = 64113599
INFOG (11) Maximum front size = 15351
INFOG (29) Number of entries in factors = 4691271697
INFOG (12) Number of negative pivots = 73938
INFOG (13) Number of delayed pivots = 23243
Number of 2x2 pivots in type 1 nodes = 36969
Number of 2x2 pivots in type 2 nodes = 0
RINFOG(19) Smallest pivot WITH perturbed pivots = 9.314D-07
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 9.314D-07
RINFOG(21) Largest pivot in absolute value = 1.000D+00
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 0
Elapsed time in factorization driver = 164.9866
Factorization time by clock_gettime(): 165.0230 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 4 and #OMP = 1
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_2
To display your profiling results:
################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_2 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_2 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_2 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_2 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_2 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_2 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_2 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_2 #
################################################################################################################################################################
* [MAQAO] Info: Detected 8 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 5412840 208814389
executing #MPI = 8 and #OMP = 1
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L D L^T Solver for general symmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 5412840
Average density of rows/columns = 75
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 67.6677
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 3.9994
A root of estimated size 8025 has been selected for Scalapack.
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 4678450993
-- (3) Real space for factors (estimated) = 4844627931
-- (4) Integer space for factors (estimated) = 64272958
-- (5) Maximum frontal size (estimated) = 15351
-- (6) Number of nodes in the tree = 167568
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 0
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 13
Number of split nodes = 0
RINFOG(1) Operations during elimination (estim)= 1.850D+13
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Maximum estim. space in Mbytes, IC facto. (INFOG(16)): 9839
Total space in MBytes, IC factorization (INFOG(17)): 66177
Maximum estim. space in Mbytes, OOC facto. (INFOG(26)): 3385
Total space in MBytes, OOC factorization (INFOG(27)): 21479
Elapsed time in analysis driver= 80.4866
Analysis time by clock_gettime(): 80.484 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 5412840 208814389
executing #MPI = 8 and #OMP = 1
Elapsed time in save structure driver= 0.0005
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 5412840 208814389
executing #MPI = 8 and #OMP = 1
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
[0m ** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
****** FACTORIZATION STEP ********
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 8
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 4844627931
INFOG(4) Integer space for factors (estim.)= 64272958
Maximum frontal size (estimated) = 15351
Number of nodes in the tree = 167568
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
Statistics on the scaling phase
Elapsed time for scaling = 4.8205
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.12D+00
Average Effective size of S (based on INFO(39))= 961384347
Elapsed time to reformat/distribute matrix = 4.4300
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 15786264
Size of async. emission buffer (bytes).. = 63302908
Small emission buffer (bytes) .......... = 1956
** Memory allocated, max in Mbytes (INFOG(18)): 9915
** Memory allocated, total in Mbytes (INFOG(19)): 66406
** Memory effectively used, max in Mbytes (INFOG(21)): 7575
** Memory effectively used, total in Mbytes (INFOG(22)): 49260
Elapsed time to process root node = 0.9232
Elapsed time for factorization = 92.4802
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 9.104D+09
------ (3) Operations in node elimination = 1.853D+13
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 4858197269
INFOG (10) Integer space for factors = 64260751
INFOG (11) Maximum front size = 15351
INFOG (29) Number of entries in factors = 4691271697
INFOG (12) Number of negative pivots = 73938
INFOG (13) Number of delayed pivots = 23243
Number of 2x2 pivots in type 1 nodes = 36969
Number of 2x2 pivots in type 2 nodes = 0
RINFOG(19) Smallest pivot WITH perturbed pivots = 9.314D-07
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 9.314D-07
RINFOG(21) Largest pivot in absolute value = 1.000D+00
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 0
Elapsed time in factorization driver = 101.8244
Factorization time by clock_gettime(): 101.8274 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 8 and #OMP = 1
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_3
To display your profiling results:
################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_3 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_3 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_3 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_3 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_3 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_3 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_3 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_3 #
################################################################################################################################################################
* [MAQAO] Info: Detected 16 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 5412840 208814389
executing #MPI = 16 and #OMP = 1
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L D L^T Solver for general symmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 5412840
Average density of rows/columns = 75
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 67.5787
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 4.0059
A root of estimated size 8025 has been selected for Scalapack.
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 4678450993
-- (3) Real space for factors (estimated) = 4843250499
-- (4) Integer space for factors (estimated) = 64481250
-- (5) Maximum frontal size (estimated) = 15351
-- (6) Number of nodes in the tree = 167569
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 0
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 24
Number of split nodes = 1
RINFOG(1) Operations during elimination (estim)= 1.850D+13
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Maximum estim. space in Mbytes, IC facto. (INFOG(16)): 6826
Total space in MBytes, IC factorization (INFOG(17)): 73423
Maximum estim. space in Mbytes, OOC facto. (INFOG(26)): 3784
Total space in MBytes, OOC factorization (INFOG(27)): 30814
Elapsed time in analysis driver= 80.3587
Analysis time by clock_gettime(): 80.356 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 5412840 208814389
executing #MPI = 16 and #OMP = 1
Elapsed time in save structure driver= 0.0005
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 5412840 208814389
executing #MPI = 16 and #OMP = 1
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
[0m
****** FACTORIZATION STEP ********
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 16
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 4843250499
INFOG(4) Integer space for factors (estim.)= 64481250
Maximum frontal size (estimated) = 15351
Number of nodes in the tree = 167569
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
Statistics on the scaling phase
Elapsed time for scaling = 4.8266
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.12D+00
Average Effective size of S (based on INFO(39))= 522629196
Elapsed time to reformat/distribute matrix = 4.1893
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 18928808
Size of async. emission buffer (bytes).. = 75904514
Small emission buffer (bytes) .......... = 6400
** Memory allocated, max in Mbytes (INFOG(18)): 6837
** Memory allocated, total in Mbytes (INFOG(19)): 73509
** Memory effectively used, max in Mbytes (INFOG(21)): 5144
** Memory effectively used, total in Mbytes (INFOG(22)): 55144
Elapsed time to process root node = 0.6183
Elapsed time for factorization = 45.2283
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 9.166D+09
------ (3) Operations in node elimination = 1.853D+13
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 4856855549
INFOG (10) Integer space for factors = 64441504
INFOG (11) Maximum front size = 15351
INFOG (29) Number of entries in factors = 4691271697
INFOG (12) Number of negative pivots = 73938
INFOG (13) Number of delayed pivots = 23243
Number of 2x2 pivots in type 1 nodes = 36969
Number of 2x2 pivots in type 2 nodes = 0
RINFOG(19) Smallest pivot WITH perturbed pivots = 9.314D-07
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 9.314D-07
RINFOG(21) Largest pivot in absolute value = 1.000D+00
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 2
Elapsed time in factorization driver = 54.3124
Factorization time by clock_gettime(): 54.3267 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 16 and #OMP = 1
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_4
To display your profiling results:
################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_4 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_4 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_4 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_4 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_4 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_4 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_4 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_4 #
################################################################################################################################################################
* [MAQAO] Info: Detected 32 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 5412840 208814389
executing #MPI = 32 and #OMP = 1
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L D L^T Solver for general symmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 5412840
Average density of rows/columns = 75
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 72.7035
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 4.5457
A root of estimated size 8025 has been selected for Scalapack.
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 4678450993
-- (3) Real space for factors (estimated) = 4843056339
-- (4) Integer space for factors (estimated) = 65395880
-- (5) Maximum frontal size (estimated) = 15351
-- (6) Number of nodes in the tree = 167569
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 0
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 48
Number of split nodes = 1
RINFOG(1) Operations during elimination (estim)= 1.850D+13
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Maximum estim. space in Mbytes, IC facto. (INFOG(16)): 3305
Total space in MBytes, IC factorization (INFOG(17)): 77769
Maximum estim. space in Mbytes, OOC facto. (INFOG(26)): 1666
Total space in MBytes, OOC factorization (INFOG(27)): 39055
Elapsed time in analysis driver= 87.5509
Analysis time by clock_gettime(): 87.548 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 5412840 208814389
executing #MPI = 32 and #OMP = 1
Elapsed time in save structure driver= 0.0010
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 5412840 208814389
executing #MPI = 32 and #OMP = 1
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
[0m
****** FACTORIZATION STEP ********
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 32
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 4843056339
INFOG(4) Integer space for factors (estim.)= 65395880
Maximum frontal size (estimated) = 15351
Number of nodes in the tree = 167569
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
Statistics on the scaling phase
Elapsed time for scaling = 4.7540
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.12D+00
Average Effective size of S (based on INFO(39))= 269228014
Elapsed time to reformat/distribute matrix = 4.6679
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 10992152
Size of async. emission buffer (bytes).. = 44078523
Small emission buffer (bytes) .......... = 23008
** Memory allocated, max in Mbytes (INFOG(18)): 3341
** Memory allocated, total in Mbytes (INFOG(19)): 78126
** Memory effectively used, max in Mbytes (INFOG(21)): 2594
** Memory effectively used, total in Mbytes (INFOG(22)): 57266
Elapsed time to process root node = 0.3769
Elapsed time for factorization = 23.6537
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 9.162D+09
------ (3) Operations in node elimination = 1.853D+13
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 4856855549
INFOG (10) Integer space for factors = 65112396
INFOG (11) Maximum front size = 15351
INFOG (29) Number of entries in factors = 4691271697
INFOG (12) Number of negative pivots = 73938
INFOG (13) Number of delayed pivots = 23243
Number of 2x2 pivots in type 1 nodes = 36570
Number of 2x2 pivots in type 2 nodes = 399
RINFOG(19) Smallest pivot WITH perturbed pivots = 9.314D-07
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 9.314D-07
RINFOG(21) Largest pivot in absolute value = 1.000D+00
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 14
Elapsed time in factorization driver = 33.1300
Factorization time by clock_gettime(): 33.1924 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 32 and #OMP = 1
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_5
To display your profiling results:
################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_5 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_5 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_5 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_5 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_5 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_5 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_5 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_5 #
################################################################################################################################################################
* [MAQAO] Info: Detected 64 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 5412840 208814389
executing #MPI = 64 and #OMP = 1
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L D L^T Solver for general symmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 5412840
Average density of rows/columns = 75
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 73.6373
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 4.7448
A root of estimated size 8025 has been selected for Scalapack.
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 4678450993
-- (3) Real space for factors (estimated) = 4840692933
-- (4) Integer space for factors (estimated) = 66894318
-- (5) Maximum frontal size (estimated) = 15351
-- (6) Number of nodes in the tree = 167572
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 0
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 99
Number of split nodes = 4
RINFOG(1) Operations during elimination (estim)= 1.850D+13
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Maximum estim. space in Mbytes, IC facto. (INFOG(16)): 1763
Total space in MBytes, IC factorization (INFOG(17)): 85781
Maximum estim. space in Mbytes, OOC facto. (INFOG(26)): 1048
Total space in MBytes, OOC factorization (INFOG(27)): 48670
Elapsed time in analysis driver= 88.9595
Analysis time by clock_gettime(): 88.957 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 5412840 208814389
executing #MPI = 64 and #OMP = 1
Elapsed time in save structure driver= 0.0015
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 5412840 208814389
executing #MPI = 64 and #OMP = 1
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
[0m ** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
****** FACTORIZATION STEP ********
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 64
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 4840692933
INFOG(4) Integer space for factors (estim.)= 66894318
Maximum frontal size (estimated) = 15351
Number of nodes in the tree = 167572
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
Statistics on the scaling phase
Elapsed time for scaling = 4.8284
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.12D+00
Average Effective size of S (based on INFO(39))= 140250363
Elapsed time to reformat/distribute matrix = 5.0127
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 8604236
Size of async. emission buffer (bytes).. = 34502984
Small emission buffer (bytes) .......... = 87004
** Memory allocated, max in Mbytes (INFOG(18)): 1772
** Memory allocated, total in Mbytes (INFOG(19)): 86248
** Memory effectively used, max in Mbytes (INFOG(21)): 1502
** Memory effectively used, total in Mbytes (INFOG(22)): 62623
Elapsed time to process root node = 0.3105
Elapsed time for factorization = 14.2133
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 9.300D+09
------ (3) Operations in node elimination = 1.853D+13
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 4854951552
INFOG (10) Integer space for factors = 66210064
INFOG (11) Maximum front size = 15351
INFOG (29) Number of entries in factors = 4691271697
INFOG (12) Number of negative pivots = 73938
INFOG (13) Number of delayed pivots = 23243
Number of 2x2 pivots in type 1 nodes = 36072
Number of 2x2 pivots in type 2 nodes = 897
RINFOG(19) Smallest pivot WITH perturbed pivots = 9.314D-07
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 9.314D-07
RINFOG(21) Largest pivot in absolute value = 1.000D+00
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 43
Elapsed time in factorization driver = 24.1465
Factorization time by clock_gettime(): 24.2105 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 64 and #OMP = 1
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_6
To display your profiling results:
################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_6 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_6 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_6 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_6 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_6 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_6 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_6 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_6 #
################################################################################################################################################################
* [MAQAO] Info: Detected 128 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 5412840 208814389
executing #MPI = 128 and #OMP = 1
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L D L^T Solver for general symmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 5412840
Average density of rows/columns = 75
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 73.0576
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 4.6089
A root of estimated size 8025 has been selected for Scalapack.
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 4678450993
-- (3) Real space for factors (estimated) = 4839049631
-- (4) Integer space for factors (estimated) = 69214990
-- (5) Maximum frontal size (estimated) = 15351
-- (6) Number of nodes in the tree = 167577
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 0
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 200
Number of split nodes = 9
RINFOG(1) Operations during elimination (estim)= 1.850D+13
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Maximum estim. space in Mbytes, IC facto. (INFOG(16)): 1406
Total space in MBytes, IC factorization (INFOG(17)): 103749
Maximum estim. space in Mbytes, OOC facto. (INFOG(26)): 1039
Total space in MBytes, OOC factorization (INFOG(27)): 68852
Elapsed time in analysis driver= 88.2552
Analysis time by clock_gettime(): 88.254 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 5412840 208814389
executing #MPI = 128 and #OMP = 1
Elapsed time in save structure driver= 0.0028
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
[0m ** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 5412840 208814389
executing #MPI = 128 and #OMP = 1
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
****** FACTORIZATION STEP ********
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 128
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 4839049631
INFOG(4) Integer space for factors (estim.)= 69214990
Maximum frontal size (estimated) = 15351
Number of nodes in the tree = 167577
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
Statistics on the scaling phase
Elapsed time for scaling = 4.8790
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.12D+00
Average Effective size of S (based on INFO(39))= 77713835
Elapsed time to reformat/distribute matrix = 5.5197
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 7472040
Size of async. emission buffer (bytes).. = 29962870
Small emission buffer (bytes) .......... = 337856
** Memory allocated, max in Mbytes (INFOG(18)): 1423
** Memory allocated, total in Mbytes (INFOG(19)): 104134
** Memory effectively used, max in Mbytes (INFOG(21)): 922
** Memory effectively used, total in Mbytes (INFOG(22)): 76004
Elapsed time to process root node = 0.4133
Elapsed time for factorization = 14.1727
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 9.552D+09
------ (3) Operations in node elimination = 1.853D+13
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 4853811711
INFOG (10) Integer space for factors = 67928956
INFOG (11) Maximum front size = 15351
INFOG (29) Number of entries in factors = 4691271697
INFOG (12) Number of negative pivots = 73938
INFOG (13) Number of delayed pivots = 23243
Number of 2x2 pivots in type 1 nodes = 35478
Number of 2x2 pivots in type 2 nodes = 1491
RINFOG(19) Smallest pivot WITH perturbed pivots = 9.314D-07
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 9.314D-07
RINFOG(21) Largest pivot in absolute value = 1.000D+00
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 102
Elapsed time in factorization driver = 24.6660
Factorization time by clock_gettime(): 24.8363 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 128 and #OMP = 1
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_7
To display your profiling results:
################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_7 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_7 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_7 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_7 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_7 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_7 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_7 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-128_o1_perf009_allowextra_scala_kptr_probe/tools/lprof_run_7 #
################################################################################################################################################################