How to optimize for the Pentium family of microprocessors
Date: 2000-04-01 14:00
English-language resource: how to optimize code for the Pentium family of CPUs
Introduction
Literature
Calling assembly functions from high level language
Debugging and verifying
Memory model
Alignment
Cache
First time versus repeated execution
Address generation interlock (PPlain and PMMX)
Pairing integer instructions (PPlain and PMMX)
Perfect pairing
Imperfect pairing
Splitting complex instructions into simpler ones (PPlain and PMMX)
Prefixes (PPlain and PMMX)
Overview of PPro, PII and PIII pipeline
Instruction decoding (PPro, PII and PIII)
Instruction fetch (PPro, PII and PIII)
Register renaming (PPro, PII and PIII)
Eliminating dependencies
Register read stalls
Out of order execution (PPro, PII and PIII)
Retirement (PPro, PII and PIII)
Partial stalls (PPro, PII and PIII)
Partial register stalls
Partial flags stalls
Flags stalls after shifts and rotates
Partial memory stalls
Dependency chains (PPro, PII and PIII)
Searching for bottlenecks (PPro, PII and PIII)
Jumps and branches (all processors)
Branch prediction in PPlain
Branch prediction in PMMX, PPro, PII and PIII
Avoiding jumps (all processors)
Avoiding conditional jumps by using flags (all processors)
Replacing conditional jumps by conditional moves (PPro, PII and PIII)
Reducing code size (all processors)
Scheduling floating point code (PPlain and PMMX)
Loop optimization (all processors)
Loops in PPlain and PMMX
Loops in PPro, PII and PIII
Problematic Instructions
XCHG (all processors)
Rotates through carry (all processors)
String instructions (all processors)
Bit test (all processors)
Integer multiplication (all processors)
WAIT instruction (all processors)
FCOM + FSTSW AX (all processors)
FPREM (all processors)
FRNDINT (all processors)
FSCALE and exponential function (all processors)
FPTAN (all processors)
FSQRT (PIII)
MOV [MEM], ACCUM (PPlain and PMMX)
TEST instruction (PPlain and PMMX)
Bit scan (PPlain and PMMX)
FLDCW (PPro, PII and PIII)
Special topics
LEA instruction (all processors)
Division (all processors)
Freeing floating point registers (all processors)
Transitions between floating point and MMX instructions (PMMX, PII and PIII)
Converting from floating point to integer (all processors)
Using integer instructions to do floating point operations (all processors)
Using floating point instructions to do integer operations (PPlain and PMMX)
Moving blocks of data (all processors)
Self-modifying code (all processors)
Detecting processor type (all processors)
List of instruction timings for PPlain and PMMX
Integer instructions
Floating point instructions
MMX instructions (PMMX)
List of instruction timings and micro-op breakdown for PPro, PII and PIII
Integer instructions
Floating point instructions
MMX instructions (PII and PIII)
XMM instructions (PIII)
Testing speed
Comparison of the different microprocessors
Copyright © 1996, 2000 by Agner Fog. Last modified 2000-03-31