Motion decomposition in goal space

This work studies how to understand and imitate others' behaviors. It presents a new approach to inverse reinforcement learning that goes beyond the one-task-at-a-time hypothesis by considering activities that are induced by composite rewards. With such a target, inverse reinforcement learning becomes a matter of learning invisible features of the world by observing others' behaviors.

The paper "Feature learning for multi-task inverse reinforcement learning" introduces an algorithm based on alternating gradient descent that simultaneously learns a dictionary of primitive reward functions and their combination into an approximation of the rewards underlying observed behaviors. It illustrates how this approach enables knowledge to be reused on new tasks: the learner observes a set of tasks during training and extracts knowledge about the underlying task structure. The learner then uses this information to improve the perception and imitation of new composite behaviors. This process thus achieves transfer of knowledge between tasks.
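
To make the alternation concrete, here is a minimal sketch in plain Python. It is not the paper's algorithm. It assumes, as a strong simplification, that one reward vector per task is directly available, whereas in inverse reinforcement learning the rewards are hidden and must be inferred from demonstrations. Under that assumption the problem reduces to factorizing a task-by-state reward matrix into per-task coefficients and a shared dictionary of primitive rewards. All names (alternating_descent, rewards, the toy values) are hypothetical and chosen only for illustration.

```python
import random

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def residual(rewards, coeffs, dictionary):
    """Reconstruction error: coeffs @ dictionary - rewards."""
    approx = matmul(coeffs, dictionary)
    return [[p - t for p, t in zip(prow, trow)] for prow, trow in zip(approx, rewards)]

def loss(rewards, coeffs, dictionary):
    """Squared reconstruction error summed over all tasks and states."""
    return sum(e * e for row in residual(rewards, coeffs, dictionary) for e in row)

def descend(m, grad, lr):
    """One gradient step on matrix m."""
    return [[x - lr * g for x, g in zip(mrow, grow)] for mrow, grow in zip(m, grad)]

def alternating_descent(rewards, k, lr=0.02, iters=500, seed=0):
    """Alternate gradient steps on the coefficients and on the dictionary."""
    rng = random.Random(seed)
    n_tasks, n_states = len(rewards), len(rewards[0])
    coeffs = [[rng.uniform(-0.5, 0.5) for _ in range(k)] for _ in range(n_tasks)]
    dictionary = [[rng.uniform(-0.5, 0.5) for _ in range(n_states)] for _ in range(k)]
    history = [loss(rewards, coeffs, dictionary)]
    for _ in range(iters):
        # Dictionary fixed: the gradient w.r.t. coeffs is E @ D^T
        # (the constant factor 2 is folded into lr).
        err = residual(rewards, coeffs, dictionary)
        coeffs = descend(coeffs, matmul(err, transpose(dictionary)), lr)
        # Coefficients fixed: the gradient w.r.t. dictionary is A^T @ E.
        err = residual(rewards, coeffs, dictionary)
        dictionary = descend(dictionary, matmul(transpose(coeffs), err), lr)
        history.append(loss(rewards, coeffs, dictionary))
    return coeffs, dictionary, history

# Toy data (hypothetical): four tasks over four states, each task being a
# combination of two primitive rewards [1, 0, 0, 1] and [0, 1, 1, 0].
rewards = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
    [2.0, -1.0, -1.0, 2.0],
]
coeffs, dictionary, history = alternating_descent(rewards, k=2)
print("initial loss:", history[0], "final loss:", history[-1])
```

In this toy setting, transfer to a new task would amount to keeping the learned dictionary fixed and fitting only a new row of coefficients, which needs far fewer parameters than learning a full reward from scratch.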