Loop id | Source Location | Source Function | Level | Max Thread Time / Walltime gcc_3 (%) | Exclusive Coverage gcc_3 (%) | Inclusive Coverage gcc_3 (%) | Max Exclusive Time Over Threads gcc_3 (s) | Max Inclusive Time Over Threads gcc_3 (s) | Exclusive Time w.r.t. Wall Time gcc_3 (s) | Inclusive Time w.r.t. Wall Time gcc_3 (s) | Nb Threads gcc_3 | GFLOPS gcc_3 | Vectorization Ratio (%) | Vector Length Use (%) | Speedup If No Scalar Integer | Speedup If FP Vectorized | Speedup If Fully Vectorized | Speedup If Perfect Load Balancing gcc_3 | Stride 0 | Stride 1 | Stride n | Stride Unknown | Stride Indirect | Array Access Efficiency |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1832 | libggml-cpu.so - sgemm.cpp:399-399 [...] | void (anonymous namespace)::tinyBLAS<16, float __vector(16), float __vector(16), unsigned short, unsigned short, float>::gemm_bloc<4, 6>(long, long) | Single | 0.11 | 1.19 | 1.19 | 0.16 | 0.16 | 0.09 | 0.09 | 192 | 2406.29 | 100 | 85.29 | 1 | 1 | 1.29 | 1.78 | 0 | 10 | 0 | 0 | 0 | 100.00 |
1390 | libggml-cpu.so - vec.h:89-89 | ggml_compute_forward_soft_max | Innermost | 0.06 | 0.48 | 0.48 | 0.09 | 0.09 | 0.04 | 0.04 | 192 | 12.03 | 100 | 50 | 1 | 1 | 2 | 2.33 | 0 | 2 | 0 | 0 | 0 | 100.00 |
1370 | libggml-cpu.so - ops.cpp:5552-5563 | ggml_compute_forward_set_rows | Innermost | 0.04 | 0.35 | 0.35 | 0.06 | 0.06 | 0.03 | 0.03 | 192 | 2.02 | 0 | 12.5 | 1 | 1 | 8 | 2.3 | 2 | 0 | 0 | 4 | 2 | 50.00 |
983 | libggml-cpu.so - ops.cpp:6210-6245 [...] | ggml_compute_forward_rope_f32(ggml_compute_params const*, ggml_tensor*, bool) | Innermost | 0.04 | 0.27 | 0.27 | 0.06 | 0.06 | 0.02 | 0.02 | 192 | 786.09 | 7.55 | 7.84 | 1.17 | 1.56 | 8.17 | 2.93 | NA | NA | NA | NA | NA | 0.00 |
66 | libggml-cpu.so - ggml-cpu.c:1291-1297 | ggml_compute_forward_mul_mat | Innermost | 0.04 | 0.24 | 0.24 | 0.05 | 0.05 | 0.02 | 0.02 | 191 | 20.09 | 0 | 11.51 | 1 | 1 | 2 | 2.99 | 2 | 0 | 0 | 0 | 0 | 100.00 |
1303 | libggml-cpu.so - ops.cpp:4325-4326 | ggml_compute_forward_rms_norm | Innermost | 0.04 | 0.23 | 0.23 | 0.05 | 0.05 | 0.02 | 0.02 | 192 | 48.62 | 100 | 39.29 | 1 | 1 | 2.59 | 3.13 | 0 | 0 | 2 | 0 | 0 | 75.00 |
1389 | libggml-cpu.so - vec.h:677-677 [...] | ggml_compute_forward_soft_max | Innermost | 0.04 | 0.23 | 0.23 | 0.05 | 0.05 | 0.02 | 0.02 | 184 | 36.66 | 100 | 100 | 1 | 1 | 1 | 3.1 | 1 | 1 | 0 | 0 | 0 | 100.00 |
951 | libggml-cpu.so - vec.cpp:372-372 [...] | ggml_vec_swiglu_f32 | Single | 0.03 | 0.18 | 0.18 | 0.05 | 0.05 | 0.01 | 0.01 | 171 | 1406.99 | 100 | 100 | 1.02 | 1 | 1 | 3.33 | 0 | 3 | 0 | 0 | 0 | 100.00 |
703 | libggml-cpu.so - binary-ops.cpp:18-32 [...] | ggml_compute_forward_mul | Innermost | 0.03 | 0.15 | 0.15 | 0.04 | 0.04 | 0.01 | 0.01 | 165 | 41.52 | 100 | 50 | 1 | 1.06 | 2 | 3.03 | 1 | 3 | 0 | 0 | 0 | 100.00 |
619 | libggml-cpu.so - binary-ops.cpp:10-32 [...] | ggml_compute_forward_add_non_quantized | Innermost | 0.03 | 0.15 | 0.15 | 0.04 | 0.04 | 0.01 | 0.01 | 164 | 38.53 | 100 | 50 | 1 | 1.06 | 2 | 3.04 | 1 | 3 | 0 | 0 | 0 | 100.00 |
1831 | libggml-cpu.so - sgemm.cpp:425-427 [...] | void (anonymous namespace)::tinyBLAS<16, float __vector(16), float __vector(16), unsigned short, unsigned short, float>::gemm_bloc<4, 6>(long, long) | Single | 0.02 | 0.15 | 0.15 | 0.04 | 0.04 | 0.01 | 0.01 | 192 | 2769.41 | 90.48 | 34.23 | 1 | 1.24 | 3.33 | 3.18 | 0 | 1 | 0 | 1 | 0 | 75.00 |
952 | libggml-cpu.so - vec.cpp:411-415 [...] | ggml_vec_soft_max_f32 | Single | 0.02 | 0.12 | 0.12 | 0.04 | 0.04 | 0.01 | 0.01 | 192 | 2305.39 | 90.47 | 73.78 | 1.05 | 1 | 1.17 | 3.98 | 0 | 0 | 0 | 2 | 0 | 50.00 |
446 | libggml-cpu.so - mmq.cpp:822-1392 [...] | void parallel_for<(anonymous namespace)::convert_B_packed_format<block_q8_0, 32>(void*, block_q8_0 const*, int, int)::{lambda(int, int)#1}>(int, (anonymous namespace)::convert_B_packed_format<block_q8_0, 32>(void*, block_q8_0 const*, int,... | Innermost | 0.03 | 0.11 | 0.11 | 0.04 | 0.04 | 0.01 | 0.01 | 139 | 0.00 | 52.38 | 47.77 | 1 | 1 | 1.11 | 3.87 | 2 | 0 | 0 | 20 | 0 | 54.55 |
2203 | libggml-cpu.so - quants.c:298-321 [...] | quantize_row_q8_0 | Single | 1.04 | 0.11 | 0.11 | 1.57 | 1.57 | 0.01 | 0.01 | 1 | 643.58 | 59.65 | 29.28 | 1.03 | 1.4 | 2.86 | 1 | NA | NA | NA | NA | NA | 0.00 |
1379 | libggml-cpu.so - vec.h:1444-1445 | ggml_compute_forward_soft_max | Innermost | 0.02 | 0.11 | 0.11 | 0.03 | 0.03 | 0.01 | 0.01 | 192 | 290.58 | 0 | 6.25 | 1 | 1 | 16 | 3.07 | 1 | 0 | 2 | 0 | 0 | 83.33 |