26.10 FSCALE and exponential function (all processors)
FSCALE is slow on all processors. Computing integer powers of 2 can be done much faster by inserting the desired power, plus the exponent bias of the format, in the exponent field of the floating-point number. To calculate 2^N, where N is a signed integer, select from the examples below the one that fits your range of N:
For |N| < 2^7-1 you can use single precision:
        MOV     EAX, [N]
        SHL     EAX, 23
        ADD     EAX, 3F800000H
        MOV     DWORD PTR [TEMP], EAX
        FLD     DWORD PTR [TEMP]
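For readers who want to check the single precision trick outside assembly, it can be sketched in C. This is only an illustration, assuming float is IEEE 754 binary32; the name pow2_float is hypothetical and not part of the original text.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build 2^n as a float by placing n in the
   exponent field (bits 23..30) and adding the bias 127, which is
   what SHL EAX,23 / ADD EAX,3F800000H do in the assembly version.
   Assumes float is IEEE 754 binary32. */
float pow2_float(int n)
{
    uint32_t bits;
    float result;

    assert(n > -127 && n < 127);          /* |n| < 2^7-1, as in the text */
    bits = ((uint32_t)n << 23) + 0x3F800000u;
    memcpy(&result, &bits, sizeof result); /* reinterpret the bit pattern */
    return result;
}
```

The unsigned arithmetic wraps modulo 2^32, so negative n gives the same bit pattern as the 32-bit register arithmetic does.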
For |N| < 2^10-1 you can use double precision:
        MOV     EAX, [N]
        SHL     EAX, 20
        ADD     EAX, 3FF00000H
        MOV     DWORD PTR [TEMP], 0
        MOV     DWORD PTR [TEMP+4], EAX
        FLD     QWORD PTR [TEMP]
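The double precision case differs only in layout: the exponent field sits in the upper 32 bits, starting at bit 20 of that half, and the lower 32 bits are zero. A rough C sketch follows, assuming double is IEEE 754 binary64; pow2_double is a hypothetical name used for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build 2^n as a double. The upper dword is
   computed like EAX in the assembly version (n << 20, plus the bias
   1023 placed at the same position); the lower dword is zero.
   Assumes double is IEEE 754 binary64. */
double pow2_double(int n)
{
    uint32_t high;
    uint64_t bits;
    double result;

    assert(n > -1023 && n < 1023);        /* |n| < 2^10-1, as in the text */
    high = ((uint32_t)n << 20) + 0x3FF00000u;
    bits = (uint64_t)high << 32;          /* low 32 bits stay zero */
    memcpy(&result, &bits, sizeof result);
    return result;
}
```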
For |N| < 2^14-1 use long double precision:
        MOV     EAX, [N]
        ADD     EAX, 00003FFFH
        MOV     DWORD PTR [TEMP], 0
        MOV     DWORD PTR [TEMP+4], 80000000H
        MOV     DWORD PTR [TEMP+8], EAX
        FLD     TBYTE PTR [TEMP]
FSCALE is often used in the calculation of exponential functions. The following code shows an exponential function without the slow FRNDINT and FSCALE instructions:
; extern "C" long double _cdecl exp (double x);
_exp    PROC    NEAR
PUBLIC  _exp
        FLDL2E
        FLD     QWORD PTR [ESP+4]             ; x
        FMUL                                  ; z = x*log2(e)
        FIST    DWORD PTR [ESP+4]             ; round(z)
        SUB     ESP, 12
        MOV     DWORD PTR [ESP], 0
        MOV     DWORD PTR [ESP+4], 80000000H
        FISUB   DWORD PTR [ESP+16]            ; z - round(z)
        MOV     EAX, [ESP+16]
        ADD     EAX, 3FFFH
        MOV     [ESP+8], EAX
        JLE     SHORT UNDERFLOW
        CMP     EAX, 8000H
        JGE     SHORT OVERFLOW
        F2XM1
        FLD1
        FADD                                  ; 2^(z-round(z))
        FLD     TBYTE PTR [ESP]               ; 2^(round(z))
        ADD     ESP, 12
        FMUL                                  ; 2^z = e^x
        RET
UNDERFLOW:
        FSTP    ST
        FLDZ                                  ; return 0
        ADD     ESP, 12
        RET
OVERFLOW:
        PUSH    07F800000H                    ; +infinity
        FSTP    ST
        FLD     DWORD PTR [ESP]               ; return infinity
        ADD     ESP, 16
        RET
_exp    ENDP