Chilfox

D-DL4CV-Lec21c-Q-Function

Type: digest

Aliases: Q-Function

Source Link: D-DL4CV-Lec21-Reinforcement_Learning

Q-Function

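The Q-function $Q^{\pi}(s, a)$ gives the expected cumulative discounted reward obtained by taking action $a$ in state $s$ and following policy $\pi$ from then on. Here $\gamma$ is the discount factor and $r_{t}$ is the reward received at time step $t$.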

$$Q^{\pi}(s, a) = \mathbb{E}\left[\sum_{t \geq 0} \gamma^{t} r_{t} \,\middle|\, s_{0} = s,\ a_{0} = a,\ \pi\right]$$
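
The expectation in this definition can be approximated by averaging discounted returns over sampled trajectories. The following is a minimal Monte Carlo sketch, not a method taken from the lecture: the names `discounted_return`, `estimate_q`, `chain_step`, and `always_right` are hypothetical, the environment is a toy 4-state chain invented for illustration, and the infinite sum is truncated at a finite `horizon`, which is an assumption.

```python
import random


def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over one trajectory's reward sequence."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def estimate_q(step, policy, state, action, gamma,
               n_rollouts=100, horizon=50, seed=0):
    """Monte Carlo estimate of Q^pi(state, action).

    Assumed (hypothetical) interfaces:
      step(s, a, rng) -> (next_state, reward, done)
      policy(s, rng)  -> action
    The first action is fixed to `action`; all later actions come from `policy`.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_rollouts):
        s, a = state, action
        rewards = []
        for _ in range(horizon):  # truncation of the infinite sum (assumption)
            s, r, done = step(s, a, rng)
            rewards.append(r)
            if done:
                break
            a = policy(s, rng)
        total += discounted_return(rewards, gamma)
    return total / n_rollouts


def chain_step(s, a, rng):
    """Toy chain with states 0..3: action 1 moves right, action 0 stays.

    Reaching state 3 gives reward 1 and ends the episode.
    """
    s_next = min(s + a, 3)
    reward = 1.0 if s_next == 3 else 0.0
    return s_next, reward, s_next == 3


def always_right(s, rng):
    """Policy that always chooses action 1 (move right)."""
    return 1


q_right = estimate_q(chain_step, always_right, state=0, action=1, gamma=0.9)
q_stay = estimate_q(chain_step, always_right, state=0, action=0, gamma=0.9)
```

In this toy chain everything is deterministic, so `q_right` is approximately $\gamma^{2} = 0.81$ (the reward arrives at $t = 2$) and `q_stay` is approximately $\gamma^{3} = 0.729$ (the first step is wasted, so the reward arrives one step later). With a stochastic environment or policy, the average over rollouts would only approximate the expectation.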

Backlinks

D-DL4CV-Lec21-Reinforcement_Learning
D-DL4CV-Lec21e-Deep_Q-Learning
