| Function: __svml_idiv4_e9 | Module: exec | Source: :0-0 | Coverage: 0.02% |
|---|
Source panel intentionally left blank: the given object has no debug symbols.

| Address | Instruction |
|---|---|
| 0x478c50 | ENDBR64 |
| 0x478c54 | VPXOR %XMM2,%XMM2,%XMM2 |
| 0x478c58 | VPCMPEQD %XMM2,%XMM1,%XMM2 |
| 0x478c5c | VTESTPS %XMM2,%XMM2 |
| 0x478c61 | JE 478c6a |
| 0x478c63 | MOV $0,%EAX |
| 0x478c68 | DIV %AL |
| 0x478c6a | VCVTDQ2PS %XMM1,%XMM2 |
| 0x478c6e | VRCPPS %XMM2,%XMM2 |
| 0x478c72 | VCVTPS2PD %XMM2,%YMM2 |
| 0x478c76 | VCVTDQ2PD %XMM1,%YMM1 |
| 0x478c7a | VMULPD %YMM2,%YMM1,%YMM3 |
| 0x478c7e | VBROADCASTSD 0x12b09(%RIP),%YMM4 |
| 0x478c87 | VSUBPD %YMM3,%YMM4,%YMM3 |
| 0x478c8b | VMULPD %YMM2,%YMM3,%YMM2 |
| 0x478c8f | VCVTDQ2PD %XMM0,%YMM0 |
| 0x478c93 | VMULPD %YMM0,%YMM2,%YMM0 |
| 0x478c97 | VMULPD %YMM1,%YMM2,%YMM1 |
| 0x478c9b | VBROADCASTSD 0x2535c(%RIP),%YMM2 |
| 0x478ca4 | VSUBPD %YMM1,%YMM2,%YMM1 |
| 0x478ca8 | VMULPD %YMM1,%YMM0,%YMM0 |
| 0x478cac | VCVTTPD2DQ %YMM0,%XMM0 |
| 0x478cb0 | VZEROUPPER |
| 0x478cb3 | RET |
| 0x478cb4 | NOPW %CS:(%RAX,%RAX,1) |
| 0x478cbe | XCHG %AX,%AX |
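
The instruction sequence above reads as a packed 32-bit integer division built from a low-precision reciprocal (`VRCPPS`) that is refined twice in double precision and then truncated (`VCVTTPD2DQ`). The following is a minimal Python sketch of one lane under stated assumptions, not the library's implementation. The two constants loaded by `VBROADCASTSD` are not shown in this report, so `c1 = 2.0` and `c2 = 2.0 + 2**-40` are hypothetical values chosen so that the sketch truncates correctly. The names `approx_rcp`, `idiv_lane`, and `idiv4` are illustrative, and `approx_rcp` only mimics the limited precision of `VRCPPS`.

```python
import math


def approx_rcp(x: float) -> float:
    """Coarse stand-in for VRCPPS: 1/x cut down to 12 significant bits (assumption)."""
    mantissa, exponent = math.frexp(1.0 / x)
    return math.ldexp(math.trunc(mantissa * 4096) / 4096, exponent)


def idiv_lane(a: int, b: int, c1: float = 2.0, c2: float = 2.0 + 2.0 ** -40) -> int:
    """One lane of the sequence. c1 and c2 are assumed values, not read from the binary."""
    if b == 0:
        # The listing reaches DIV %AL with AL = 0 in this case, which faults.
        raise ZeroDivisionError("zero divisor lane")
    bd = float(b)                        # VCVTDQ2PD %XMM1,%YMM1
    r = approx_rcp(bd)                   # VCVTDQ2PS, VRCPPS, VCVTPS2PD
    r = r * (c1 - bd * r)                # VMULPD, VSUBPD, VMULPD: first refinement
    q = (float(a) * r) * (c2 - bd * r)   # second refinement folded into the quotient
    return int(q)                        # VCVTTPD2DQ truncates toward zero


def idiv4(a, b):
    """Apply the lane sketch to four dividend/divisor pairs."""
    return [idiv_lane(x, y) for x, y in zip(a, b)]


print(idiv4([6, 7, -7, 2147483647], [3, 2, 2, 1]))
```

With `c2` exactly 2.0 the sketch would return 1 for `6 / 3`, because the refined product lands just below 2 before truncation. That is one plausible reason the listing loads its second constant from a different address than the first.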

| Metric | Average over the two paths |
|---|---|
| Source file and lines | |
| Module | exec |
| nb instructions | 23 |
| nb uops | 27.50 |
| loop length | 96.50 |
| used x86 registers | 0.50 |
| used mmx registers | 0 |
| used xmm registers | 3 |
| used ymm registers | 5 |
| used zmm registers | 0 |
| nb stack references | 0 |
| ADD-SUB / MUL ratio | 0.40 |
| micro-operation queue | 4.58 cycles |
| front end | 4.58 cycles |

| | ALU0/BRU0 | ALU1 | ALU2 | ALU3 | BRU1 | AGU0 | AGU1 | AGU2 | FP0 | FP1 | FP2 | FP3 | FP4 | FP5 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| uops | 1.50 | 0.25 | 0.13 | 0.13 | 1.50 | 0.67 | 0.67 | 0.67 | 4.75 | 4.75 | 4.75 | 4.75 | 0.50 | 0.50 |
| cycles | 1.50 | 0.25 | 0.13 | 0.13 | 1.50 | 0.67 | 0.67 | 0.67 | 4.75 | 4.75 | 4.75 | 4.75 | 0.50 | 0.50 |
| Cycles executing div or sqrt instructions | 2.00 |
| Front-end | 4.58 |
| Dispatch | 4.75 |
| DIV/SQRT | 2.00 |
| Overall L1 | 4.79 |
| all | 92% |
| load | NA (no load vectorizable/vectorized instructions) |
| store | NA (no store vectorizable/vectorized instructions) |
| mul | NA (no mul vectorizable/vectorized instructions) |
| add-sub | NA (no add-sub vectorizable/vectorized instructions) |
| fma | NA (no fma vectorizable/vectorized instructions) |
| other | 92% |
| all | 84% |
| load | 0% |
| store | NA (no store vectorizable/vectorized instructions) |
| mul | 100% |
| add-sub | 100% |
| fma | NA (no fma vectorizable/vectorized instructions) |
| div/sqrt | 100% |
| other | 60% |
| all | 87% |
| load | 0% |
| store | NA (no store vectorizable/vectorized instructions) |
| mul | 100% |
| add-sub | 100% |
| fma | NA (no fma vectorizable/vectorized instructions) |
| div/sqrt | 100% |
| other | 78% |
| all | 23% |
| load | NA (no load vectorizable/vectorized instructions) |
| store | NA (no store vectorizable/vectorized instructions) |
| mul | NA (no mul vectorizable/vectorized instructions) |
| add-sub | NA (no add-sub vectorizable/vectorized instructions) |
| fma | NA (no fma vectorizable/vectorized instructions) |
| other | 23% |
| all | 38% |
| load | 12% |
| store | NA (no store vectorizable/vectorized instructions) |
| mul | 50% |
| add-sub | 50% |
| fma | NA (no fma vectorizable/vectorized instructions) |
| div/sqrt | 25% |
| other | 25% |
| all | 33% |
| load | 12% |
| store | NA (no store vectorizable/vectorized instructions) |
| mul | 50% |
| add-sub | 50% |
| fma | NA (no fma vectorizable/vectorized instructions) |
| div/sqrt | 25% |
| other | 24% |

| Metric | Path through the `MOV`/`DIV` sequence (24 instructions) |
|---|---|
| Source file and lines | |
| Module | exec |
| nb instructions | 24 |
| nb uops | 29 |
| loop length | 100 |
| used x86 registers | 1 |
| used mmx registers | 0 |
| used xmm registers | 3 |
| used ymm registers | 5 |
| used zmm registers | 0 |
| nb stack references | 0 |
| ADD-SUB / MUL ratio | 0.40 |
| micro-operation queue | 4.83 cycles |
| front end | 4.83 cycles |

| | ALU0/BRU0 | ALU1 | ALU2 | ALU3 | BRU1 | AGU0 | AGU1 | AGU2 | FP0 | FP1 | FP2 | FP3 | FP4 | FP5 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| uops | 2.00 | 0.50 | 0.25 | 0.25 | 2.00 | 0.67 | 0.67 | 0.67 | 4.75 | 4.75 | 4.75 | 4.75 | 0.50 | 0.50 |
| cycles | 2.00 | 0.50 | 0.25 | 0.25 | 2.00 | 0.67 | 0.67 | 0.67 | 4.75 | 4.75 | 4.75 | 4.75 | 0.50 | 0.50 |
| Cycles executing div or sqrt instructions | 4.00 |
| Front-end | 4.83 |
| Dispatch | 4.75 |
| DIV/SQRT | 4.00 |
| Overall L1 | 4.83 |
| all | 85% |
| load | NA (no load vectorizable/vectorized instructions) |
| store | NA (no store vectorizable/vectorized instructions) |
| mul | NA (no mul vectorizable/vectorized instructions) |
| add-sub | NA (no add-sub vectorizable/vectorized instructions) |
| fma | NA (no fma vectorizable/vectorized instructions) |
| other | 85% |
| all | 84% |
| load | 0% |
| store | NA (no store vectorizable/vectorized instructions) |
| mul | 100% |
| add-sub | 100% |
| fma | NA (no fma vectorizable/vectorized instructions) |
| div/sqrt | 100% |
| other | 60% |
| all | 85% |
| load | 0% |
| store | NA (no store vectorizable/vectorized instructions) |
| mul | 100% |
| add-sub | 100% |
| fma | NA (no fma vectorizable/vectorized instructions) |
| div/sqrt | 100% |
| other | 75% |
| all | 22% |
| load | NA (no load vectorizable/vectorized instructions) |
| store | NA (no store vectorizable/vectorized instructions) |
| mul | NA (no mul vectorizable/vectorized instructions) |
| add-sub | NA (no add-sub vectorizable/vectorized instructions) |
| fma | NA (no fma vectorizable/vectorized instructions) |
| other | 22% |
| all | 38% |
| load | 12% |
| store | NA (no store vectorizable/vectorized instructions) |
| mul | 50% |
| add-sub | 50% |
| fma | NA (no fma vectorizable/vectorized instructions) |
| div/sqrt | 25% |
| other | 25% |
| all | 32% |
| load | 12% |
| store | NA (no store vectorizable/vectorized instructions) |
| mul | 50% |
| add-sub | 50% |
| fma | NA (no fma vectorizable/vectorized instructions) |
| div/sqrt | 25% |
| other | 23% |
| Instruction | Nb FU | ALU0/BRU0 | ALU1 | ALU2 | ALU3 | BRU1 | AGU0 | AGU1 | AGU2 | FP0 | FP1 | FP2 | FP3 | FP4 | FP5 | Latency | Recip. throughput |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ENDBR64 | | | | | | | | | | | | | | | | | |
| VPXOR %XMM2,%XMM2,%XMM2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 |
| VPCMPEQD %XMM2,%XMM1,%XMM2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 1 | 0.25 |
| VTESTPS %XMM2,%XMM2 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0.50 | 0.50 | 7 | 1 |
| JE 478c6a <__svml_idiv4_e9+0x1a> | 1 | 0.50 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50-1 |
| MOV $0,%EAX | 1 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 |
| DIV %AL | 2 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 10-14 | 4 |
| VCVTDQ2PS %XMM1,%XMM2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 3 | 0.50 |
| VRCPPS %XMM2,%XMM2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4 | 0.50 |
| VCVTPS2PD %XMM2,%YMM2 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 1 | 0.50 | 0 | 0 | 4 | 0.67 |
| VCVTDQ2PD %XMM1,%YMM1 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 1 | 0.50 | 0 | 0 | 4 | 0.67 |
| VMULPD %YMM2,%YMM1,%YMM3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 3 | 0.50 |
| VBROADCASTSD 0x12b09(%RIP),%YMM4 | 1 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
| VSUBPD %YMM3,%YMM4,%YMM3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 3 | 0.50 |
| VMULPD %YMM2,%YMM3,%YMM2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 3 | 0.50 |
| VCVTDQ2PD %XMM0,%YMM0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 1 | 0.50 | 0 | 0 | 4 | 0.67 |
| VMULPD %YMM0,%YMM2,%YMM0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 3 | 0.50 |
| VMULPD %YMM1,%YMM2,%YMM1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 3 | 0.50 |
| VBROADCASTSD 0x2535c(%RIP),%YMM2 | 1 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
| VSUBPD %YMM1,%YMM2,%YMM1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 3 | 0.50 |
| VMULPD %YMM1,%YMM0,%YMM0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 3 | 0.50 |
| VCVTTPD2DQ %YMM0,%XMM0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 1 | 0.50 | 0 | 0 | 6 | 0.67 |
| VZEROUPPER | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| RET | 1 | 0.50 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 |

| Metric | Path that skips the `MOV`/`DIV` sequence (22 instructions) |
|---|---|
| Source file and lines | |
| Module | exec |
| nb instructions | 22 |
| nb uops | 26 |
| loop length | 93 |
| used x86 registers | 0 |
| used mmx registers | 0 |
| used xmm registers | 3 |
| used ymm registers | 5 |
| used zmm registers | 0 |
| nb stack references | 0 |
| ADD-SUB / MUL ratio | 0.40 |
| micro-operation queue | 4.33 cycles |
| front end | 4.33 cycles |

| | ALU0/BRU0 | ALU1 | ALU2 | ALU3 | BRU1 | AGU0 | AGU1 | AGU2 | FP0 | FP1 | FP2 | FP3 | FP4 | FP5 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| uops | 1.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.67 | 0.67 | 0.67 | 4.75 | 4.75 | 4.75 | 4.75 | 0.50 | 0.50 |
| cycles | 1.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.67 | 0.67 | 0.67 | 4.75 | 4.75 | 4.75 | 4.75 | 0.50 | 0.50 |
| Cycles executing div or sqrt instructions | NA |
| Front-end | 4.33 |
| Dispatch | 4.75 |
| Overall L1 | 4.75 |
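
The cycle figures reported for the two paths are consistent with a simple bound: front-end cycles equal the uop count divided by 6, dispatch cycles equal the busiest port, and the overall L1 figure is the largest of front-end, dispatch, and DIV/SQRT. The sketch below recomputes the reported numbers on that basis. The 6 uops per cycle width is an assumption inferred from the figures themselves, and `l1_bound` is an illustrative name, not part of the tool that produced this report.

```python
UOPS_PER_CYCLE = 6  # assumption: inferred from 29 uops -> 4.83 and 26 uops -> 4.33


def l1_bound(nb_uops, port_cycles, div_sqrt_cycles=0.0):
    """Return (front_end, dispatch, overall) cycle estimates for one path."""
    front_end = nb_uops / UOPS_PER_CYCLE
    dispatch = max(port_cycles)
    overall = max(front_end, dispatch, div_sqrt_cycles)
    return round(front_end, 2), dispatch, round(overall, 2)


# Port order: ALU0/BRU0, ALU1, ALU2, ALU3, BRU1, AGU0, AGU1, AGU2, FP0..FP5
with_div = l1_bound(
    29,
    [2.00, 0.50, 0.25, 0.25, 2.00, 0.67, 0.67, 0.67, 4.75, 4.75, 4.75, 4.75, 0.50, 0.50],
    div_sqrt_cycles=4.00,
)
without_div = l1_bound(
    26,
    [1.00, 0.00, 0.00, 0.00, 1.00, 0.67, 0.67, 0.67, 4.75, 4.75, 4.75, 4.75, 0.50, 0.50],
)

print(with_div)     # (4.83, 4.75, 4.83)
print(without_div)  # (4.33, 4.75, 4.75)
```

The path containing `DIV` is front-end bound at 4.83 cycles, while the path without it is bound by the FP0 to FP3 ports at 4.75 cycles. The 4.79 reported in the averaged block matches the mean of those two values.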
| all | 100% |
| load | NA (no load vectorizable/vectorized instructions) |
| store | NA (no store vectorizable/vectorized instructions) |
| mul | NA (no mul vectorizable/vectorized instructions) |
| add-sub | NA (no add-sub vectorizable/vectorized instructions) |
| fma | NA (no fma vectorizable/vectorized instructions) |
| other | 100% |
| all | 84% |
| load | 0% |
| store | NA (no store vectorizable/vectorized instructions) |
| mul | 100% |
| add-sub | 100% |
| fma | NA (no fma vectorizable/vectorized instructions) |
| div/sqrt | 100% |
| other | 60% |
| all | 89% |
| load | 0% |
| store | NA (no store vectorizable/vectorized instructions) |
| mul | 100% |
| add-sub | 100% |
| fma | NA (no fma vectorizable/vectorized instructions) |
| div/sqrt | 100% |
| other | 81% |
| all | 25% |
| load | NA (no load vectorizable/vectorized instructions) |
| store | NA (no store vectorizable/vectorized instructions) |
| mul | NA (no mul vectorizable/vectorized instructions) |
| add-sub | NA (no add-sub vectorizable/vectorized instructions) |
| fma | NA (no fma vectorizable/vectorized instructions) |
| other | 25% |
| all | 38% |
| load | 12% |
| store | NA (no store vectorizable/vectorized instructions) |
| mul | 50% |
| add-sub | 50% |
| fma | NA (no fma vectorizable/vectorized instructions) |
| div/sqrt | 25% |
| other | 25% |
| all | 34% |
| load | 12% |
| store | NA (no store vectorizable/vectorized instructions) |
| mul | 50% |
| add-sub | 50% |
| fma | NA (no fma vectorizable/vectorized instructions) |
| div/sqrt | 25% |
| other | 25% |
| Instruction | Nb FU | ALU0/BRU0 | ALU1 | ALU2 | ALU3 | BRU1 | AGU0 | AGU1 | AGU2 | FP0 | FP1 | FP2 | FP3 | FP4 | FP5 | Latency | Recip. throughput |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ENDBR64 | | | | | | | | | | | | | | | | | |
| VPXOR %XMM2,%XMM2,%XMM2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 |
| VPCMPEQD %XMM2,%XMM1,%XMM2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 1 | 0.25 |
| VTESTPS %XMM2,%XMM2 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0.50 | 0.50 | 7 | 1 |
| JE 478c6a <__svml_idiv4_e9+0x1a> | 1 | 0.50 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50-1 |
| VCVTDQ2PS %XMM1,%XMM2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 3 | 0.50 |
| VRCPPS %XMM2,%XMM2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 4 | 0.50 |
| VCVTPS2PD %XMM2,%YMM2 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 1 | 0.50 | 0 | 0 | 4 | 0.67 |
| VCVTDQ2PD %XMM1,%YMM1 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 1 | 0.50 | 0 | 0 | 4 | 0.67 |
| VMULPD %YMM2,%YMM1,%YMM3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 3 | 0.50 |
| VBROADCASTSD 0x12b09(%RIP),%YMM4 | 1 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
| VSUBPD %YMM3,%YMM4,%YMM3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 3 | 0.50 |
| VMULPD %YMM2,%YMM3,%YMM2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 3 | 0.50 |
| VCVTDQ2PD %XMM0,%YMM0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 1 | 0.50 | 0 | 0 | 4 | 0.67 |
| VMULPD %YMM0,%YMM2,%YMM0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 3 | 0.50 |
| VMULPD %YMM1,%YMM2,%YMM1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 3 | 0.50 |
| VBROADCASTSD 0x2535c(%RIP),%YMM2 | 1 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0.50 |
| VSUBPD %YMM1,%YMM2,%YMM1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 3 | 0.50 |
| VMULPD %YMM1,%YMM0,%YMM0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 3 | 0.50 |
| VCVTTPD2DQ %YMM0,%XMM0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 | 1 | 0.50 | 0 | 0 | 6 | 0.67 |
| VZEROUPPER | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| RET | 1 | 0.50 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 |
| Name | Coverage (%) | Time (s) |
|---|---|---|
| __svml_idiv4_e9 | 0.02 | 0.01 |
