29.2 Floating point instructions


Date: 2000-04-03 14:00

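The micro-ops column lists the nonempty per-port counts in the order p0, p1, p01, p2, p3, p4. A dash marks an empty cell.

Instruction             Operands  micro-ops  delay    throughput
FLD                     r         1          -        -
FLD                     m32/64    1          1        -
FLD                     m80       2 2        -        -
FBLD                    m80       38 2       -        -
FST(P)                  r         1          -        -
FST(P)                  m32/m64   1 1        1        -
FSTP                    m80       2 2 2      -        -
FBSTP                   m80       165 2 2    -        -
FXCH                    r         0          -        3/1 f)
FILD                    m         3 1        5        -
FIST(P)                 m         2 1 1      5        -
FLDZ                    -         1          -        -
FLD1 FLDPI FLDL2E etc.  -         2          -        -
FCMOVcc                 r         2          2        -
FNSTSW                  AX        3          7        -
FNSTSW                  m16       1 1 1      -        -
FLDCW                   m16       1 1 1      10       -
FNSTCW                  m16       1 1 1      -        -
FADD(P) FSUB(R)(P)      r         1          3        1/1
FADD(P) FSUB(R)(P)      m         1 1        3-4      1/1
FMUL(P)                 r         1          5        1/2 g)
FMUL(P)                 m         1 1        5-6      1/2 g)
FDIV(R)(P)              r         1          38 h)    1/37
FDIV(R)(P)              m         1 1        38 h)    1/37
FABS                    -         1          -        -
FCHS                    -         3          2        -
FCOM(P) FUCOM           r         1          1        -
FCOM(P) FUCOM           m         1 1        1        -
FCOMPP FUCOMPP          -         1 1        1        -
FCOMI(P) FUCOMI(P)      r         1          1        -
FCOMI(P) FUCOMI(P)      m         1 1        1        -
FIADD FISUB(R)          m         6 1        -        -
FIMUL                   m         6 1        -        -
FIDIV(R)                m         6 1        -        -
FICOM(P)                m         6 1        -        -
FTST                    -         1          1        -
FXAM                    -         1          2        -
FPREM                   -         23         -        -
FPREM1                  -         33         -        -
FRNDINT                 -         30         -        -
FSCALE                  -         56         -        -
FXTRACT                 -         15         -        -
FSQRT                   -         1          69       e,i)
FSIN FCOS               -         17-97      27-103   e)
FSINCOS                 -         18-110     29-130   e)
F2XM1                   -         17-48      66       e)
FYL2X                   -         36-54      103      e)
FYL2XP1                 -         31-53      98-107   e)
FPTAN                   -         21-102     13-143   e)
FPATAN                  -         25-86      44-143   e)
FNOP                    -         1          -        -
FINCSTP FDECSTP         -         1          -        -
FFREE                   r         1          -        -
FFREEP                  r         2          -        -
FNCLEX                  -         3          -        -
FNINIT                  -         13         -        -
FNSAVE                  -         141        -        -
FRSTOR                  -         72         -        -
WAIT                    -         2          -        -
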
Notes:

e) not pipelined

f) FXCH generates 1 micro-op that is resolved by register renaming without going to any port.

g) FMUL uses the same circuitry as integer multiplication. Therefore, the combined throughput of mixed floating point and integer multiplications is 1 FMUL + 1 IMUL per 3 clock cycles.

h) FDIV delay depends on precision specified in control word: precision 64 bits gives delay 38, precision 53 bits gives delay 32, precision 24 bits gives delay 18. Division by a power of 2 takes 9 clocks. Throughput is 1/(delay-1).

i) faster for lower precision.
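
The rule in note h) can be turned into a small calculation. The Python sketch below is only an illustration: the function names are invented for this example, and the two clock estimates assume a simplified model in which FDIV is the only thing limiting execution (no cache misses, no other instructions competing for the divider). It tabulates the delay for each precision setting, applies throughput = 1/(delay-1), and compares a chain of dependent divisions with a run of independent ones.

```python
from fractions import Fraction

# FDIV delay in clock cycles for each precision-control setting, from note h).
FDIV_DELAY = {64: 38, 53: 32, 24: 18}


def fdiv_delay(precision_bits: int) -> int:
    """Return the FDIV delay in clocks for a precision of 64, 53 or 24 bits."""
    return FDIV_DELAY[precision_bits]


def fdiv_throughput(precision_bits: int) -> Fraction:
    """Return FDIV throughput in instructions per clock: 1/(delay-1)."""
    return Fraction(1, fdiv_delay(precision_bits) - 1)


def dependent_chain_clocks(n: int, precision_bits: int) -> int:
    """Assumed model: each division needs the previous result, so delays add up."""
    return n * fdiv_delay(precision_bits)


def independent_clocks(n: int, precision_bits: int) -> int:
    """Assumed model: a new division starts every (delay-1) clocks and the
    last one still needs its full delay to complete."""
    d = fdiv_delay(precision_bits)
    return (n - 1) * (d - 1) + d


if __name__ == "__main__":
    for bits in (64, 53, 24):
        print(bits, fdiv_delay(bits), fdiv_throughput(bits))
```

Under this model, four independent 24-bit divisions take 69 clocks against 72 for a dependent chain. The small difference reflects how little FDIV is pipelined, so lowering the precision in the control word saves far more than reordering divisions does.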


