Discovering Blind Spots in Reinforcement Learning

AI systems are being applied to increasingly complex tasks that involve interaction with humans. However, such systems can be dangerous if they did not learn during training to avoid actions that could cause serious harm. In this week’s episode, Kyle is joined by Ramya Ramakrishnan, a PhD candidate studying computer science and AI at MIT, to discuss the idea of “blind spots” in reinforcement learning. Their discussion focuses on Ramya’s recent paper, “Discovering Blind Spots in Reinforcement Learning.”

A famous example of reinforcement learning is AlphaGo and its successor AlphaZero, which used simulated games to generate training data for their neural nets to learn from. In the case of AlphaGo, the simulator was able to perfectly capture the state of the game of Go, but this is not always possible. In some cases the simulated data doesn’t match the real-world application, resulting in “blind spots.”

Ramya’s paper focuses on these blind spots in reinforcement learning and proposes a way to learn how to recognize them. What are “blind spots”? Blind spots are unexpected errors that a machine or agent makes because its training environment was different from the real-world, or execution, environment. Many AI systems get deployed in the real world, where they are prone to mistakes in situations they never saw during training. These errors can be extremely difficult to catch because the agent has no a priori knowledge of what to watch out for. Nonetheless, they are important to catch.

For example, autonomous driving is one important application of reinforcement learning in which agents are deployed into the real world. A simulator for automated driving may include all the basic components needed to learn how to drive: taking turns, slowing down, making appropriate stops, etc. However, if the simulator does not include police cars, ambulances, or fire trucks, the agent may not know how to react appropriately when it encounters an emergency vehicle in the real world. Blind spots arise in medical applications as well, when a doctor can’t see the full state of a patient’s health and has to make decisions without knowing what the patient’s health is really like.
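Returning to the driving example, a toy sketch may make the idea concrete. It is not the method from Ramya’s paper, only an illustration under stated assumptions: the names `WorldState`, `observe`, `policy`, `safe_action`, and `find_blind_spots` are all hypothetical, and the "safe" behavior is hard-coded here rather than learned.

```python
from typing import List, NamedTuple


class WorldState(NamedTuple):
    """Full real-world state (hypothetical fields, for illustration only)."""
    gap_m: int           # distance to the vehicle ahead, in meters
    is_emergency: bool   # True if the vehicle ahead is an emergency vehicle


def observe(state: WorldState) -> int:
    """What the simulator-trained agent perceives: the gap, nothing else."""
    return state.gap_m


def policy(observation: int) -> str:
    """Toy policy from a simulator that contained no emergency vehicles."""
    return "brake" if observation < 10 else "keep_lane"


def safe_action(state: WorldState) -> str:
    """Assumed ground truth for acceptable real-world behavior."""
    if state.is_emergency:
        return "pull_over"
    return policy(observe(state))


def find_blind_spots(states: List[WorldState]) -> List[WorldState]:
    """Return the states where the agent's action differs from the safe one."""
    return [s for s in states if policy(observe(s)) != safe_action(s)]


car = WorldState(gap_m=30, is_emergency=False)
fire_truck = WorldState(gap_m=30, is_emergency=True)

# Both states produce the same observation, so the agent acts identically.
blind = find_blind_spots([car, fire_truck])
```

In this sketch the ordinary car and the fire truck are indistinguishable to the agent, so only the fire truck state is flagged. In practice the ground truth would not be available as a function and would have to be inferred from some other signal.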

Blind spots can occur for a variety of reasons. In their paper, Ramya and her colleagues focus on blind spots that occur due to incomplete state representation. Consider again the case of an autonomous vehicle trained in a simulated environment. When the agent is deployed into the real world, it might see a fire truck and a car as the same thing and fail to react differently, because it doesn't know that it's supposed to behave differently around emergency vehicles. How can we discover blind spots during training so that the agent makes fewer errors in the real world? Tune in to learn more.