Improving Observability of Stochastic Complex Networks under the Supervision of Cognitive Dynamic Systems • Paper • 1412.6162 • Published Nov 7, 2014
Policy Networks with Two-Stage Training for Dialogue Systems • Paper • 1606.03152 • Published Jun 10, 2016
Using a Logarithmic Mapping to Enable Lower Discount Factors in Reinforcement Learning • Paper • 1906.00572 • Published Jun 3, 2019
Hybrid Reward Architecture for Reinforcement Learning • Paper • 1706.04208 • Published Jun 13, 2017
Medical Dead-ends and Learning to Identify High-risk States and Treatments • Paper • 2110.04186 • Published Oct 8, 2021
Orchestrated Value Mapping for Reinforcement Learning • Paper • 2203.07171 • Published Mar 14, 2022
Semi-Markov Offline Reinforcement Learning for Healthcare • Paper • 2203.09365 • Published Mar 17, 2022
Systematic Rectification of Language Models via Dead-end Analysis • Paper • 2302.14003 • Published Feb 27, 2023