Title | Journal | Journal Categories | Citations | Publication Date |
---|---|---|---|---|
On the theory of policy gradient methods: Optimality, approximation, and distribution shift | | | | 2021 |
A comprehensive survey on safe reinforcement learning | | | | 2015 |
Sur la détermination des polynômes d’approximation de degré donnée | | | | 1934 |
Policy evaluation with temporal differences: A survey and comparison | | | | 2014 |
CRPO: A new approach for safe reinforcement learning with convergence guarantee |