Classification of Offline Reinforcement Learning
Offline Reinforcement Learning
Conservative value-based approaches
- [CQL]
- Kumar, A., Zhou, A., Tucker, G., and Levine, S. Conservative Q-learning for offline reinforcement learning. In Advances in Neural Information Processing Systems (NeurIPS), 2020.
- [COMBO]
- Yu, T., Kumar, A., Rafailov, R., Rajeswaran, A., Levine, S., and Finn, C. COMBO: conservative offline model-based policy optimization. In Advances in Neural Information Processing Systems (NeurIPS), pp. 28954–28967, 2021.
- [SAC-N]
- An, G., Moon, S., Kim, J., and Song, H. O. Uncertainty-based offline reinforcement learning with diversified Q-ensemble. In Advances in Neural Information Processing Systems (NeurIPS), pp. 7436–7447, 2021.
- [MCQ]
- Lyu, J., Ma, X., Li, X., and Lu, Z. Mildly conservative Q-learning for offline reinforcement learning. arXiv preprint arXiv:2206.04745, 2022.
- [RORL]
- Yang, R., Bai, C., Ma, X., Wang, Z., Zhang, C., and Han, L. RORL: robust offline reinforcement learning via conservative smoothing. arXiv preprint arXiv:2206.02829, 2022.
Regularized policy-based approaches
- [BCQ]
- Fujimoto, S., Meger, D., and Precup, D. Off-policy deep reinforcement learning without exploration. In International Conference on Machine Learning (ICML), pp. 2052–2062, 2019.
- [AWAC]
- Nair, A., Dalal, M., Gupta, A., and Levine, S. Accelerating online reinforcement learning with offline datasets. arXiv preprint arXiv:2006.09359, 2020.
- [Fisher-BRC]
- Kostrikov, I., Fergus, R., Tompson, J., and Nachum, O. Offline reinforcement learning with Fisher divergence critic regularization. In International Conference on Machine Learning (ICML), pp. 5774–5783, 2021.
- [TD3+BC]
- Fujimoto, S. and Gu, S. S. A minimalist approach to offline reinforcement learning. In Advances in Neural Information Processing Systems (NeurIPS), pp. 20132–20145, 2021.
- [SPOT]
- Wu, J., Wu, H., Qiu, Z., Wang, J., and Long, M. Supported policy optimization for offline reinforcement learning. In Advances in Neural Information Processing Systems (NeurIPS), pp. 31278–31291, 2022.
- [PRDC]
- Ran, Y., Li, Y.-C., Zhang, F., Zhang, Z., and Yu, Y. Policy regularization with dataset constraint for offline reinforcement learning. In International Conference on Machine Learning (ICML), 2023.