
26.16 FLDCW (PPro, PII and PIII)

Date: 2000-04-02 15:00

The PPro, PII and PIII suffer a serious stall when the FLDCW instruction is followed by any floating point instruction that reads the control word, which almost all floating point instructions do.

Compiled C or C++ code often contains many FLDCW instructions, because conversion of floating point numbers to integers is done with truncation while other floating point instructions use rounding. Each conversion therefore typically switches the control word to truncation and back again. After translation to assembly, you can improve this code by using rounding instead of truncation where possible, or, where truncation is needed inside a loop, by moving the FLDCW out of the loop.

See chapter 27.5 on how to convert floating point numbers to integers without changing the control word.
