
Exploration and Development in Autonomous Systems

Inspired by developmental psychology and the way animals learn, an autonomous system that aims at life-long learning needs to adapt, decide what to learn, and follow a specific developmental trajectory. To this end, specific mechanisms of curiosity, information seeking, and exploration need to be developed.

Recently, I contributed a theoretical analysis of empirical measures of learning progress, generalizing well-known exploration methods for Markov decision processes to situations with non-stationary noise. We showed that empirical measures of the learning rate reduce to the standard measure commonly used in machine learning, while being more robust to non-stationarities.
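To make the idea concrete, here is a minimal Python sketch (an illustration of the general idea, not the exact algorithm from the NIPS 2012 paper below) of an exploration bonus driven by empirical learning progress: each state-action pair keeps a short window of the transition model's prediction errors, and the recent decrease of that error serves as an intrinsic reward. The class name, the window size, and the bonus scale beta are illustrative choices.

from collections import defaultdict, deque

class LearningProgressBonus:
    """Exploration bonus from empirically estimated learning progress."""

    def __init__(self, window=20, beta=1.0):
        self.window = window                  # half-window for old/new comparison
        self.beta = beta                      # bonus scale (illustrative)
        self.errors = defaultdict(lambda: deque(maxlen=2 * window))

    def update(self, state, action, prediction_error):
        # Record the model's prediction error for this state-action pair,
        # e.g. 1 - P_model(observed next state | state, action).
        self.errors[(state, action)].append(prediction_error)

    def bonus(self, state, action):
        errs = self.errors[(state, action)]
        if len(errs) < 2 * self.window:
            return self.beta                  # rarely visited pairs stay attractive
        errs = list(errs)
        old = sum(errs[:self.window]) / self.window
        new = sum(errs[self.window:]) / self.window
        return self.beta * max(0.0, old - new)  # error decrease = learning progress

Because the progress signal vanishes both for mastered state-action pairs and for irreducibly noisy ones, an agent driven by it is not trapped by noise the way purely error-based bonuses can be; in an R-max-style model-based learner, such a bonus would simply be added to the reward used by the planner.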

Several different exploration methods can often be applied to the same problem, and theoretical results frequently assign them similar sample complexity, so for any particular case it is not clear which method to use. I proposed a method that selects the best exploration strategy online, while learning.
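A natural way to frame this selection is as a non-stationary multi-armed bandit whose arms are exploration strategies. The sketch below uses EXP3 for that purpose; it illustrates the framing and is not the exact algorithm from the AAMAS 2013 paper below. The payoff fed back for a strategy (assumed normalized to [0, 1]) could be, for instance, the recent improvement in task reward obtained while following it.

import math
import random

def select_strategy(weights, gamma=0.1):
    # EXP3 sampling: mix the weight-proportional distribution with a
    # uniform one so every strategy keeps being evaluated.
    k = len(weights)
    total = sum(weights)
    probs = [(1 - gamma) * w / total + gamma / k for w in weights]
    i = random.choices(range(k), weights=probs)[0]
    return i, probs

def update_strategy(weights, probs, i, payoff, gamma=0.1):
    # Importance-weighted update for the strategy that was actually run;
    # `payoff` is assumed normalized to [0, 1].
    k = len(weights)
    weights[i] *= math.exp(gamma * (payoff / probs[i]) / k)

EXP3's adversarial-bandit guarantees are a reasonable fit here because the payoff of each strategy changes as the agent learns.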

I introduced a generic perspective on life-long learning called the strategic student problem. In life-long learning situations there is typically a large variety of tasks to be learned, and the learner must decide which tasks to learn and at what pace. I showed that the problem can be solved efficiently only in the particular case of submodular costs. For this reason, we introduced a bandit-type algorithm that can address more complex cost structures, albeit with looser guarantees.
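As a concrete illustration of the bandit-style treatment (a sketch under assumed interfaces, not the exact algorithm from the ICDL 2012 paper below), the learner below practices, epsilon-greedily, the task whose competence has been improving fastest over a recent window. Each task is represented here by a hypothetical practice() callable that returns the current competence in [0, 1].

import random

def strategic_student(tasks, horizon, epsilon=0.1, window=10):
    # tasks: dict mapping task name -> practice() callable that performs one
    # unit of practice and returns the current competence in [0, 1].
    history = {t: [] for t in tasks}

    def progress(t):
        h = history[t]
        if len(h) < 2 * window:
            return float("inf")              # force a few tries of every task
        new = sum(h[-window:])
        old = sum(h[-2 * window:-window])
        return (new - old) / window          # recent competence improvement

    for _ in range(horizon):
        if random.random() < epsilon:
            t = random.choice(list(tasks))   # occasional random exploration
        else:
            t = max(tasks, key=progress)     # greedy on learning progress
        history[t].append(tasks[t]())
    return history

Allocating time by recent learning progress, rather than by raw competence, steers practice away from tasks that are already mastered as well as from tasks that are currently too hard to improve on.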

We started a collaboration with Columbia University (USA) to better develop the relation between computational methods and results from neuroscience. We are currently developing biologically plausible computational models of curiosity and information seeking in animals.

Relevant Publications:

Information-seeking, curiosity, and attention: computational and neural mechanisms, Jacqueline Gottlieb, Pierre-Yves Oudeyer, Manuel Lopes and Adrien Baranes. Trends in Cognitive Sciences, 2013. (pdf)

Learning Exploration Strategies in Model-Based Reinforcement Learning, Todd Hester, Manuel Lopes and Peter Stone. International Conference on Autonomous Agents and Multiagent Systems (AAMAS), Saint Paul, Minnesota, USA, 2013. (pdf)

Exploration in Model-based Reinforcement Learning by Empirically Estimating Learning Progress, Manuel Lopes, Tobias Lang, Marc Toussaint and Pierre-Yves Oudeyer. Neural Information Processing Systems (NIPS), Lake Tahoe, USA, 2012. (pdf)

The Strategic Student Approach for Life-Long Exploration and Learning, Manuel Lopes and Pierre-Yves Oudeyer. IEEE - International Conference on Development and Learning (ICDL), 2012. (pdf)

A Developmental Roadmap for Learning by Imitation in Robots, Manuel Lopes and José Santos-Victor. IEEE Transactions on Systems, Man, and Cybernetics - Part B: Cybernetics, 37(2), 2007. (pdf)

Body Schema Acquisition through Active Learning, Ruben Martinez-Cantin, Manuel Lopes and Luis Montesano. IEEE - International Conference on Robotics and Automation (ICRA), Anchorage, Alaska, USA, 2010. (pdf)

Active Learning for Reward Estimation in Inverse Reinforcement Learning, Manuel Lopes, Francisco Melo and Luis Montesano. European Conference on Machine Learning (ECML/PKDD), Bled, Slovenia, 2009. (pdf)