18. Retirement (PPro, PII and PIII)
Retirement is the process in which the temporary registers used by the uops are copied into the permanent registers EAX, EBX, etc. When a uop has been executed, it is marked in the ROB as ready to retire.
The retirement station can handle three uops per clock cycle. This may not seem like a problem because the throughput is already limited to 3 uops per clock in the RAT. But retirement may still be a bottleneck for two reasons. Firstly, instructions must retire in order. If a uop is executed out of order, it cannot retire before all preceding uops in program order have retired. Secondly, taken jumps must retire in the first of the three slots in the retirement station. Just as decoders D1 and D2 can be idle if the next instruction only fits into D0, the last two slots in the retirement station can be idle if the next uop to retire is a taken jump. This is significant if you have a small loop where the number of uops in the loop is not divisible by three.
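The effect of the slot-0 rule on a small loop can be illustrated with a simplified model. The sketch below is not a description of the actual hardware. It assumes that every uop is ready to retire as soon as its turn comes, that the only taken jump is the last uop of the loop, and that retirement is the only bottleneck. The function names are hypothetical and chosen for this illustration only.

```python
SLOTS = 3  # retirement slots per clock, as stated in the text


def retire_cycles(uops, slots=SLOTS):
    """Count the clocks needed to retire `uops` in order.

    `uops` is a list of booleans; True marks a taken jump,
    which may only retire in slot 0 of a clock.
    """
    cycles = 0  # clocks already completed
    slot = 0    # next free slot in the current clock
    for is_taken_jump in uops:
        # Start a new clock if the current one is full, or if a
        # taken jump would otherwise land in slot 1 or 2.
        if slot == slots or (is_taken_jump and slot != 0):
            cycles += 1
            slot = 0
        slot += 1
    if slot > 0:
        cycles += 1  # count the last, partly filled clock
    return cycles


def cycles_per_iteration(n_uops):
    """Steady-state retirement clocks per iteration of a loop with
    n_uops uops, the last of which is a taken jump (assumption)."""
    body = [False] * (n_uops - 1) + [True]
    # Difference between 11 and 10 iterations removes start-up effects.
    return retire_cycles(body * 11) - retire_cycles(body * 10)


if __name__ == "__main__":
    for n in range(3, 10):
        print(n, "uops:", cycles_per_iteration(n), "clocks per iteration")
```

Under these assumptions a loop of 4 or 5 uops takes as many retirement clocks per iteration as a loop of 6 uops, because the slots in front of the taken jump are left empty.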
All uops stay in the reorder buffer (ROB) until they retire. The ROB can hold 40 uops. This sets a limit on the number of instructions that can execute during the long delay of a division or other slow operation. Before the division is finished, the ROB will be filled up with executed uops waiting to retire. Only when the division is finished and retired can the subsequent uops begin to retire, because retirement takes place in order.
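A rough calculation shows how quickly the ROB fills up behind a slow operation. The sketch below is a simplification under stated assumptions: the slow operation occupies one ROB entry (the parameter slow_uops, an assumption), the following uops enter the ROB at the RAT limit of 3 per clock, and none of them can retire until the slow operation has retired. The latency values in the example are hypothetical and are not taken from this chapter.

```python
import math

ROB_SIZE = 40        # ROB capacity in uops, as stated in the text
UOPS_PER_CLOCK = 3   # RAT throughput, as stated in the text


def uops_in_shadow(latency, slow_uops=1,
                   rob_size=ROB_SIZE, rate=UOPS_PER_CLOCK):
    """Uops that can enter the ROB behind a slow operation
    during `latency` clocks, before the ROB is full."""
    free_entries = rob_size - slow_uops
    return min(rate * latency, free_entries)


def stalled_clocks(latency, slow_uops=1,
                   rob_size=ROB_SIZE, rate=UOPS_PER_CLOCK):
    """Clocks during which no new uops can enter the ROB
    because it is full of uops waiting to retire."""
    free_entries = rob_size - slow_uops
    clocks_to_fill = math.ceil(free_entries / rate)
    return max(0, latency - clocks_to_fill)


if __name__ == "__main__":
    for latency in (10, 39):  # hypothetical latencies in clocks
        print("latency", latency,
              "uops in shadow:", uops_in_shadow(latency),
              "stalled clocks:", stalled_clocks(latency))
```

In this model the 39 free entries are used up after 13 clocks, so any part of the delay beyond that point cannot be overlapped with further uops.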
In case of speculative execution of predicted branches (see chapter 22) the speculatively executed uops cannot retire until it is certain that the prediction was correct. If the prediction turns out to be wrong then the speculatively executed uops are discarded without retirement.
The following instructions cannot execute speculatively: memory writes, IN, OUT, and serializing instructions.