Learning algorithms that enable software agents to take actions that maximize some notion of cumulative reward.
Inspired by behavioural psychology, reinforcement learning (RL) agents use feedback to determine the ideal behaviour in a given context. Our group works on both the theoretical and practical aspects of RL, more recently in the context of deep learning. Ongoing projects range from establishing theoretical bounds on augmented multi-armed bandit algorithms to developing safe and stable learning algorithms for deployment in safety-critical applications such as autonomous driving.
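As a rough illustration of learning from feedback to maximize cumulative reward, the sketch below runs a plain epsilon-greedy agent on a Bernoulli multi-armed bandit. It is a textbook baseline chosen for brevity, not one of the group's algorithms; the function name `epsilon_greedy_bandit`, its parameters, and the arm probabilities are all assumptions made for this example.

```python
import random


def epsilon_greedy_bandit(arm_means, steps, epsilon=0.1, seed=0):
    """Run epsilon-greedy on a Bernoulli bandit (illustrative sketch only).

    arm_means: hypothetical success probability of each arm.
    Returns (value estimates, pull counts, cumulative reward).
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms        # number of times each arm was pulled
    estimates = [0.0] * n_arms   # running mean reward observed per arm
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Feedback from the environment: reward 1 with probability arm_means[arm].
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0

        # Incremental update of the running mean for the chosen arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return estimates, counts, total_reward


if __name__ == "__main__":
    estimates, counts, total = epsilon_greedy_bandit([0.1, 0.5, 0.9], steps=5000)
    print("estimates:", [round(e, 2) for e in estimates])
    print("pulls per arm:", counts)
    print("cumulative reward:", total)
```

The agent is never told which arm is best. It has only the rewards it observes, so the exploration rate `epsilon` trades off trying uncertain arms against exploiting the current best estimate.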