Hardware Thread Granularity

1. Coarse-Grained Multithreading

Concept

A thread keeps running until it hits a high-latency event (such as a memory stall); only then does the core context switch to another thread.

Thread A: [C][C][C][C][M stall!]
                              ↓ switch
Thread B:                    [C][C][C][C][M stall!]
                                              ↓ switch back
Thread A:                                    [C][C]...
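
The timeline above can be reproduced with a small simulation. This is only a sketch under simplifying assumptions: each thread is a list of `"C"` (compute) and `"M"` (memory stall) operations, the function name `coarse_grained_trace` is made up for illustration, and the stall duration and switch cost are ignored so that only the issue order is shown.

```python
def coarse_grained_trace(threads):
    """Return the issue order as (thread_name, op) pairs.

    threads: dict mapping a thread name to a list of "C"/"M" ops.
    A thread runs until it issues an "M" (memory stall) or finishes;
    only then does the core move on to the next thread.
    """
    order = list(threads)
    next_op = {name: 0 for name in order}   # per-thread position
    trace = []
    current = 0
    while any(next_op[n] < len(threads[n]) for n in order):
        name = order[current]
        while next_op[name] < len(threads[name]):
            op = threads[name][next_op[name]]
            next_op[name] += 1
            trace.append((name, op))
            if op == "M":                    # high-latency event: switch
                break
        current = (current + 1) % len(order)
    return trace


example = coarse_grained_trace({
    "A": ["C", "C", "M", "C"],
    "B": ["C", "M"],
})
```

Thread A issues three operations before its stall hands the core to Thread B; A resumes only after B stalls in turn.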

Problem: High Context Switch Cost

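For performance, the CPU feeds upcoming instructions into the pipeline ahead of time (prefetch, decode).
On a context switch, all of Thread A's in-flight instructions must first be flushed, and the pipeline must then be refilled with Thread B's instructions.
This flush-and-refill sequence has a high latency.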

So coarse-grained multithreading context switches only when not switching would cost even more, that is, when the memory stall latency is much greater than the pipeline flush latency.
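
The break-even rule can be written as a one-line comparison. The sketch below is a deliberately simplified cost model: the function names and the cycle counts (a 200-cycle stall, a 10-cycle flush and refill) are hypothetical values chosen for illustration, not measurements from any real CPU.

```python
def worth_switching(stall_cycles, switch_cost_cycles):
    """Switch only if waiting out the stall costs more than the switch."""
    return stall_cycles > switch_cost_cycles


def cycles_lost(stall_cycles, switch_cost_cycles):
    """Cycles lost under the rule above: the cheaper of the two options."""
    if worth_switching(stall_cycles, switch_cost_cycles):
        return switch_cost_cycles   # pay for flush + refill, then run Thread B
    return stall_cycles             # cheaper to sit idle through the stall


# Hypothetical numbers: a long memory stall versus a short one.
long_stall = cycles_lost(stall_cycles=200, switch_cost_cycles=10)
short_stall = cycles_lost(stall_cycles=3, switch_cost_cycles=10)
```

With these assumed numbers, the long stall justifies a switch (10 cycles lost instead of 200), while the short one does not (3 cycles lost instead of 10).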

2. Fine-Grained Multithreading

Concept

Each thread executes a single instruction and then the core context switches immediately, whether or not there is a memory stall.

Cycle: 1        2        3        4        5        6
       [A instr][B instr][A instr][B instr][A instr][B instr]

Feature: Low Context Switch Cost

The hardware gives each core multiple hardware threads, and each hardware thread has its own state, so a context switch needs no flush; the core simply switches to a different set of state.
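
One way to picture "switching to a different set of state" is a core that holds one state bank per hardware thread and only changes which bank is active. The sketch below is a toy model, not a description of any real microarchitecture: the class names, the four-register bank, and the strict round-robin policy are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class HardwareThreadState:
    """Private state of one hardware thread (toy model)."""
    name: str
    pc: int = 0
    registers: list = field(default_factory=lambda: [0] * 4)


class FineGrainedCore:
    """Issues one instruction per cycle, rotating through the state banks."""

    def __init__(self, names):
        self.banks = [HardwareThreadState(n) for n in names]
        self.active = 0

    def step(self):
        state = self.banks[self.active]
        issued = f"{state.name}{state.pc}"   # e.g. "A0" = thread A, instruction 0
        state.pc += 1
        # The "context switch": select the next bank. Nothing is copied or flushed.
        self.active = (self.active + 1) % len(self.banks)
        return issued


core = FineGrainedCore(["A", "B"])
schedule = [core.step() for _ in range(6)]
```

Each thread's program counter advances independently inside its own bank, which is why the switch in this model amounts to changing a single index.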

Chilfox
