Conference on Robot Learning (CoRL) 2018
October 29 -31, 2018, Zürich, Switzerland


Program at a glance


Sunday, October 28
  17h00 - 21h00   Welcome: Badge & Pizza, GEP Pavillon

Monday, October 29
  08h30 - 09h15   Plenary M1 - Leslie Kaelbling
  09h15 - 10h00   Session M1: RL Systems, Tools, and Benchmarks
  10h00 - 10h30   Coffee Break
  10h30 - 11h15   Session M2: Imitation and Transfer
  11h15 - 11h45   Lunch Bags
  11h45 - 13h00   Tutorials: Jean-Jacques Slotine (HG E 1.2), Benjamin Recht (AudiMax)
  13h15 - 14h00   Plenary M2 - Danica Kragic
  14h00 - 15h30   Session M3: Manipulation
  15h30 - 16h00   Coffee Break
  16h00 - 16h30   Session M4: Learning Theory
  16h30 - 18h00   Poster Session M, Exhibit & ETH-Lab Demos
  19h00           Boat Trip - Panta-RHEI

Tuesday, October 30
  08h30 - 09h15   Plenary T1 - Joelle Pineau
  09h15 - 10h00   Session T1: Reward and Goal Inference
  10h00 - 10h30   Coffee Break
  10h30 - 11h30   Session T2: Learning from and for Interaction with Humans
  11h30 - 13h00   Women in ML/RL Luncheon
  13h15 - 14h00   Plenary T2 - Daisuke Okanohara
  14h00 - 15h30   Session T3: Language, Model-Based Learning
  15h30 - 16h00   Coffee Break
  16h00 - 16h30   Session T4: Planning
  16h30 - 18h00   Poster Session T, Exhibit & ETH-Lab Demos
  19h00           Conference Dinner - GIARDINO VERDE

Wednesday, October 31
  08h30 - 09h15   Plenary W1 - Marc Toussaint
  09h15 - 10h00   Session W1: Model-Free RL
  10h00 - 10h30   Coffee Break
  10h30 - 11h15   Session W2: Perception
  11h15 - 11h45   Lunch Bags
  11h45 - 13h00   Tutorials: Emo Todorov (AudiMax), Masashi Sugiyama (HG D 7.2)
  13h15 - 14h00   Plenary W2 - Masahiro Fujita
  14h00 - 15h30   Session W3: Perception and Driving Systems
  15h30 - 16h00   Coffee Break
  16h00 - 16h30   Session W4: Driving
  16h30 - 16h45   Award Ceremony
  16h45 - 18h00   Poster Session W, Exhibit & ETH-Lab Demos
  18h00           Farewell Party - CLA Glass Hall


Keynotes


Keynote 1:  Leslie Pack Kaelbling    [Google Scholar]

Title: Doing for our robots what evolution did for us

Abstract:
Within robotics and AI, we have made great strides in planning, learning, and reasoning.  But these improvements are moving forward largely independently, resulting in competing strategies for solving similar classes of problems.  A better strategy would be to figure out which aspects of our techniques for planning and reasoning can leverage learning to be more effective, robust, and sample-efficient.  I will discuss many different roles that learning can play in an intelligent robotic system and ways in which they can be scaffolded by architectural and algorithmic commitments.


Bio: 
Leslie is a Professor at MIT.  She has an undergraduate degree in Philosophy and a PhD in Computer Science from Stanford, and was previously on the faculty at Brown University.  She was the founding editor-in-chief of the Journal of Machine Learning Research. Her research agenda is to make intelligent robots using methods including estimation, learning, planning, and reasoning.  She is not a robot.







Keynote 2: Danica Kragic    [Google Scholar]

Title: Robot systems that act, interact and collaborate

Abstract:
The integral ability of any robot is to act in the environment, and to interact and collaborate with people and other robots. Interaction between two agents builds on the ability to engage in mutual prediction and signaling. Thus, human-robot interaction requires a system that can interpret and make use of human signaling strategies in a social context. In such scenarios, there is a need for an interplay between processes such as attention, segmentation, object detection, recognition and categorization in order to interact with the environment. In addition, the parameterization of these processes is inevitably guided by the task or goal a robot is supposed to achieve. In this talk, I will present the current state of the art in robot perception and interaction and discuss open problems in the area. I will also show how visual input can be integrated with proprioception, tactile and force-torque feedback in order to plan, guide and assess a robot's action and interaction with the environment. For interaction, we employ deep generative models that make inferences over future human motion trajectories given the intention of the human, the history, and the task setting of the interaction. With the help of predictions drawn from the model, we can determine the most likely future motion trajectory and make inferences over intentions and objects of interest.

Bio:
Danica Kragic is a Professor at the School of Computer Science and Communication at the Royal Institute of Technology, KTH. She received her MSc in Mechanical Engineering from the Technical University of Rijeka, Croatia in 1995 and her PhD in Computer Science from KTH in 2001. She has been a visiting researcher at Columbia University, Johns Hopkins University and INRIA Rennes. She is the Director of the Centre for Autonomous Systems. Danica received the 2007 IEEE Robotics and Automation Society Early Academic Career Award. She is a member of the Royal Swedish Academy of Sciences, the Royal Swedish Academy of Engineering Sciences and the Young Academy of Sweden. She holds an Honorary Doctorate from the Lappeenranta University of Technology. She chaired the IEEE RAS Technical Committee on Computer and Robot Vision and served as an IEEE RAS AdCom member. Her research is in the area of robotics, computer vision and machine learning. In 2012, she received an ERC Starting Grant. Her research is supported by the EU, the Knut and Alice Wallenberg Foundation, the Swedish Foundation for Strategic Research and the Swedish Research Council. She is an IEEE Fellow.



Keynote 3: Joelle Pineau    [Google Scholar]

Title: Reproducibility, Reusability, and Robustness in Deep Reinforcement Learning

Abstract:
In recent years, significant progress has been made in solving challenging problems across various domains using deep reinforcement learning. However, reproducing results for state-of-the-art deep RL methods is seldom straightforward. The high variance of some methods can make learning particularly difficult when environments or rewards are strongly stochastic. Furthermore, results can be brittle to even minor perturbations in the domain or experimental procedure. In this talk, I will discuss challenges that arise in experimental techniques and reporting procedures in deep RL, and will suggest methods and guidelines to make future results more reproducible, reusable and robust.
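As a minimal sketch of the kind of reporting practice this talk argues for, the snippet below aggregates final performance over several independent random seeds instead of reporting a single (possibly lucky) run. The training function, seed count, and learning-curve shape are illustrative stand-ins, not material from the talk.

```python
import numpy as np

def train_policy(seed, n_episodes=100):
    """Stand-in for a deep RL training run; returns per-episode returns.

    A real run would train an agent; here a noisy learning curve is
    simulated so the aggregation logic can be shown in isolation."""
    rng = np.random.default_rng(seed)
    progress = np.linspace(0.0, 1.0, n_episodes)      # idealized improvement
    noise = rng.normal(scale=0.3, size=n_episodes)    # seed-dependent variance
    return progress + noise

def evaluate_over_seeds(seeds):
    """Average the final performance (last 10 episodes) over seeds and
    report the spread, rather than a single curve."""
    finals = np.array([train_policy(s)[-10:].mean() for s in seeds])
    return finals.mean(), finals.std(ddof=1)

mean, std = evaluate_over_seeds(range(10))
print(f"final return: {mean:.2f} +/- {std:.2f} over 10 seeds")
```

Reporting the spread across seeds makes visible exactly the variance and brittleness the abstract warns about.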




Bio:

Joelle Pineau is an Associate Professor and William Dawson Scholar at McGill University where she co-directs the Reasoning and Learning Lab. She also leads the Facebook AI Research lab in Montreal. Dr. Pineau’s research focuses on developing new models and algorithms for planning and learning in complex partially-observable domains. She also works on applying these algorithms to complex problems in robotics, health care, games and conversational agents. She serves on the editorial board of the Journal of Artificial Intelligence Research and the Journal of Machine Learning Research and is currently President of the International Machine Learning Society. She is a Senior Fellow of the Canadian Institute for Advanced Research and in 2016 was named a member of the College of New Scholars, Artists and Scientists by the Royal Society of Canada. 



Keynote 4:  Daisuke Okanohara     [Google Scholar]

Title: Robots for solving real-world tasks


Abstract:
I will introduce two example tasks from our activities to show the possibilities and difficulties of using robots to solve real-world tasks. The first is a picking task. Recent studies and challenges (e.g. the Amazon Picking Challenge) have shown remarkable progress in using robots for picking tasks. In particular, deep-learning-based recognition and planning allow for the construction of robust and scalable picking systems. To deploy these systems in industry, robots need to pick from several million distinct items, including unknown items. To solve picking at this scale we require robust, generalizable, easy-to-prepare solutions. The second is using unconstrained spoken language for instruction. Using spoken language for interaction has two merits. First, a user is not required to learn a special skill (e.g. programming) to perform a task. Second, spoken language can contain varied information such as object description, orientation, space and time. We show that a spoken language interface is easy to use and can handle many different situations.



Short Bio:
Daisuke Okanohara is a co-founder and executive vice president of Preferred Networks. He received a Ph.D. in Computer Science from the University of Tokyo in 2010. His interests range from deep learning, reinforcement learning, generative models, and natural language processing to life science. Preferred Networks has been collaborating with FANUC since 2015 to develop intelligent machine tools and robotics, and with Toyota since 2014 to develop an autonomous driving system.








Keynote 5: Marc Toussaint    [Google Scholar]

Title: Symbols for Manipulation and Physical Reasoning

Abstract:
I will start by discussing general-purpose physical reasoning and manipulation, the idea of differentiable simulations, and the limitations of this idea. My focus is then on planning and optimization approaches to physical reasoning and sequential manipulation planning. In both cases, abstractions or symbols are essential to formulate bounds and heuristics, which are at the core of efficient solvers. In my view, this research area sheds new light on the role of abstractions for reasoning, and eventually for learning. While the focus of this talk is on planning and reasoning, I will argue, and give examples, that an understanding of these problems is essential also for (sample-efficient) learning.



Bio:

Marc Toussaint has been a full professor for Machine Learning and Robotics at the University of Stuttgart since 2012. Before that he was an assistant professor, leading an Emmy Noether research group at FU & TU Berlin. His research focuses on the combination of decision theory and machine learning, motivated by fundamental research questions in robotics. Specific interests are in combining geometry, logic and probabilities in learning and reasoning, and in appropriate representations and priors for real-world learning and reasoning.








Keynote 6: Masahiro Fujita     [ResearchGate]

Title: Entertainment Robot Aibo (Tentative)

Abstract:
Sony announced a new entertainment robot, aibo, on November 11, 2017. Almost 20 years have passed since the original AIBO went on sale in 1999. I will describe the new aibo in comparison with the original AIBO. In addition, I will review Sony's robotics activities up to 2018.


Bio:

Masahiro Fujita received a B.A. degree in Electronics and Communications from Waseda University, Tokyo, in 1981, and an M.S.E.E. degree from the University of California, Irvine, in 1989. He joined Sony Corporation in 1981. He started the Robot Entertainment project in 1993, and developed the entertainment robot AIBO and the small humanoid robot QRIO. He became a director of Sony Intelligence Dynamics Laboratories Inc., established in 2004, where he led a new approach to the study of intelligence, aiming to realize the emergence of intelligence with an emphasis on embodiment and dynamics. In 2012 he became head of the S-Project Office at the System Research & Development Group (SRDG), R&D PF, where he again started robotics R&D. In 2016, he became head of the Technology Strategy Department, where he is in charge of strategy planning for SRDG's technology and incubation development. In addition to his R&D position, since 2016 he has been Chief Technology Engineer of the AI & Robotics Business Group, Business Incubation Platform, where new businesses are being developed in line with Sony's "AI x Robotics" strategy announced in 2016.








Tutorials

Tutorial 1: Jean-Jacques Slotine      [Google Scholar]

Title: Combining physics and neural networks for stable adaptive control and system identification

Abstract:
Concurrent learning and control in dynamical systems is the subject of adaptive nonlinear control. We discuss systematic algorithms in this context, both vintage and recent. 

The algorithms, applicable to both adaptive control and system identification, have guaranteed stability and convergence properties based on Lyapunov theory and contraction analysis. They easily combine information from physics (e.g., in robotics) with network-based representations, can accommodate local minima or redundancy, and can be made explicitly robust to unmodeled uncertainties. A key enabling aspect of this performance is that adaptation and learning occur on a "need-to-know" basis, in the sense that the system learns just enough to achieve the desired real-time control or identification task.

The algorithms can be extended to exploit dynamic representations based on multilayer or deep networks, with similar stability and convergence guarantees. They can also exploit both physics-based and local basis functions for real-time computational efficiency.


Bio:
Jean-Jacques Slotine is Professor of Mechanical Engineering and Information Sciences, Professor of Brain and Cognitive Sciences, and Director of the Nonlinear Systems Laboratory at the Massachusetts Institute of Technology. He received his Ph.D. from MIT in 1983, at age 23. After working at Bell Labs in the computer research department, he joined the MIT faculty in 1984. His  research focuses on developing rigorous but practical tools for nonlinear systems analysis and control. These have included key advances and experimental demonstrations in the contexts of sliding control, adaptive nonlinear control, adaptive robotics, machine learning, and contraction analysis of nonlinear dynamical systems. Professor Slotine is the co-author of two graduate textbooks, “Robot Analysis and Control” (Asada and Slotine, Wiley, 1986), and “Applied Nonlinear Control” (Slotine and Li, Prentice-Hall, 1991) and is one of the most cited researchers in systems science and robotics. He was a member of the French National Science Council from 1997 to 2002 and of Singapore’s A*STAR SigN Advisory Board from 2007 to 2010, and currently is a member of the Scientific Advisory Board of the Italian Institute of Technology.
    


Tutorial 2: Benjamin Recht      [Google Scholar]

Title: Optimization Perspectives on Learning to Control

Abstract:
Given the dramatic successes in machine learning over the past half decade, there has been a resurgence of interest in applying learning techniques to continuous control problems in robotics, self-driving cars, and unmanned aerial vehicles. Though such applications appear to be straightforward generalizations of reinforcement learning, it remains unclear which machine learning tools are best equipped to handle decision making, planning, and actuation in highly uncertain dynamic environments.

This tutorial will survey the foundations required to build machine learning systems that reliably act upon the physical world. The primary technical focus will be on numerical optimization tools at the interface of statistical learning and dynamical systems.  We will investigate how to learn models of dynamical systems, how to use data to achieve objectives in a timely fashion, how to balance model specification and system controllability, and how to safely acquire new information to improve performance. We will close by listing several exciting open problems that must be solved before we can build robust, reliable learning systems that interact with an uncertain environment.
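As a small, self-contained example of the first step named above (learning a model of a dynamical system from data), the sketch below fits a linear model x_{t+1} = A x_t + B u_t by least squares on simulated rollouts; the system, noise level, and rollout counts are arbitrary choices for illustration, not the tutorial's own setup.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[0.9, 0.2], [0.0, 0.8]])   # true (unknown) dynamics
B = np.array([[0.0], [1.0]])

# Collect rollouts with random exciting inputs: x_{t+1} = A x_t + B u_t + w_t
X, U, Y = [], [], []
for _ in range(50):
    x = rng.normal(size=2)
    for _ in range(20):
        u = rng.normal(size=1)
        x_next = A @ x + B @ u + 0.01 * rng.normal(size=2)
        X.append(x); U.append(u); Y.append(x_next)
        x = x_next

# Least-squares system identification: solve Y ~ [A B] [x; u]
Z = np.hstack([np.array(X), np.array(U)])            # (N, 3) regressors
Theta, *_ = np.linalg.lstsq(Z, np.array(Y), rcond=None)
A_hat, B_hat = Theta.T[:, :2], Theta.T[:, 2:]
print("max |A_hat - A|:", np.abs(A_hat - A).max())
```

With the identified model in hand, one can then ask the questions the tutorial takes up: how estimation error propagates to closed-loop performance, and how to acquire data safely to reduce it.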


Bio:
Benjamin Recht is an Associate Professor in the Department of Electrical Engineering and Computer Sciences at the University of California, Berkeley. Ben's research group studies the theory and practice of optimization algorithms with a focus on applications in machine learning, data analysis, and controls. Ben is the recipient of a Presidential Early Career Award for Scientists and Engineers, an Alfred P. Sloan Research Fellowship, the 2012 SIAM/MOS Lagrange Prize in Continuous Optimization, the 2014 Jamon Prize, the 2015 William O. Baker Award for Initiatives in Research, and the 2017 NIPS Test of Time Award.






Tutorial 3: Emo Todorov      [Google Scholar]

Title: Sensorimotor intelligence via model-based optimization

Abstract:
Model-free reinforcement learning has produced surprisingly good results for a brute-force method. However, it appears to be reaching an asymptote that is not competitive with model-based optimization. Furthermore, it is mostly limited to simulation, where a model is available by definition. So we might as well take full advantage of that model, and reserve model-free methods for fine-tuning on real data.

In this tutorial I will discuss state-of-the-art methods that become available once we admit that we have a model. As with any other form of optimization, the single most important ingredient is access to analytical derivatives. This is standard in supervised learning, for example, but general-purpose physics simulators are difficult to differentiate. Nevertheless, this is now possible in MuJoCo as well as in some more limited simulators, opening up possibilities for much more efficient optimization.

Another essential ingredient in the control context is inverse dynamics. This enables trajectory optimization methods where the consistency between states and controls no longer needs to be enforced numerically; instead, one enforces under-actuation constraints, which are lower-dimensional. A further challenge, specific to problems with contact dynamics, is that contacts produce very complex optimization landscapes that can be difficult to navigate even for a full Newton method. Unlike the situation in neural networks, where saddle points appear to be the problem, here the problem is harder: the gradient is large yet changes rapidly in non-linear ways that are not captured by the Hessian (and we don't have 3rd-order methods). This can be alleviated with continuation methods, in which the physics model is smoothed early in optimization and gradually made harder while tracking the solution. The algorithm best suited for this problem class is Gauss-Newton.

Putting all the ingredients together, one iteration of trajectory optimization can be performed in a few milliseconds on a single computer. The same machinery can be used to solve state estimation and system identification problems, in addition to control problems, simply by modifying the cost function and keeping everything else the same.

A down-side of this framework is that it involves more mathematics, physics, optimization and software engineering than what the community has gotten used to. A possible solution is to produce software that does it automatically, leaving parameter tuning to the user. We are in the process of developing such software (called Optico) and will show demos at the tutorial.
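As a one-dimensional cartoon of the continuation idea mentioned in the abstract (the objective and annealing schedule are invented for illustration and are unrelated to MuJoCo itself): the nonsmooth term |x - 2| is replaced by the smoothed sqrt((x-2)^2 + eps^2), and eps is annealed toward zero while each solve is warm-started from the previous solution.

```python
import math

def grad(x, eps):
    # Gradient of f_eps(x) = sqrt((x-2)^2 + eps^2) + 0.5*x^2,
    # a smoothed version of the nonsmooth f(x) = |x - 2| + 0.5*x^2.
    return (x - 2.0) / math.sqrt((x - 2.0) ** 2 + eps ** 2) + x

x = 0.0
for eps in (1.0, 0.3, 0.1, 0.03, 0.01, 1e-3):   # continuation schedule
    for _ in range(2000):                        # descend on the smoothed problem,
        x -= 0.1 * grad(x, eps)                  # warm-started from the previous eps
print(f"x* = {x:.4f}")    # the nonsmooth minimizer is x = 1
```

Each smoothed problem is easy for a gradient-based solver, and tracking its solution as the smoothing vanishes reaches the minimizer of the hard nonsmooth objective, which is the role smoothing of the contact model plays in the trajectory-optimization setting described above.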


Bio:

Emo Todorov obtained his PhD from MIT in 1998 in Cognitive Neuroscience, with a focus on biological motor control. He was then a postdoc at the Gatsby Computational Neuroscience Unit at UCL, research scientist in Biomedical Engineering at USC, assistant professor in Cognitive Science at UCSD, and associate professor in Applied Mathematics and Computer Science & Engineering at UW.  Currently he is on leave from academia, to develop the MuJoCo physics simulator as well as model-based optimization software built on top of it.





Tutorial 4: Masashi Sugiyama       [Google Scholar]

Title: Machine learning from weak supervision---Towards accurate classification with low labeling costs

Abstract:
Recent advances in machine learning with big labeled data allow us to achieve human-level performance in various tasks such as speech recognition, image understanding, and natural language translation. On the other hand, there are still many application domains---including robotics---where human labor is involved in the data acquisition process, making massive labeled data prohibitively costly. In this tutorial, I will introduce recent advances in classification techniques from weak supervision, including classification from two sets of unlabeled data, classification from positive and unlabeled data, and a novel approach to semi-supervised classification.
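To make the positive-unlabeled setting mentioned above concrete, the sketch below numerically checks the identity that makes it work: the negative-class part of the classification risk can be rewritten using only positive and unlabeled data, (1-pi) E_n[l(-f(x))] = E_u[l(-f(x))] - pi E_p[l(-f(x))]. The distributions, class prior, and scoring function here are invented for the example, a hedged illustration of the general idea rather than code from the tutorial.

```python
import numpy as np

rng = np.random.default_rng(0)
pi = 0.5                                     # class prior (assumed known)
x_p = rng.normal(+1.0, 1.0, 20000)           # positive samples
x_n = rng.normal(-1.0, 1.0, 20000)           # negative samples (unseen in PU)
x_u = np.concatenate([x_p[:10000], x_n[:10000]])  # unlabeled mixture

def loss(margin):                            # logistic loss: log(1 + e^{-z})
    return np.logaddexp(0.0, -margin)

f = lambda x: x                              # a fixed toy scoring function

# Fully supervised risk (needs negative labels):
r_pn = pi * loss(f(x_p)).mean() + (1 - pi) * loss(-f(x_n)).mean()

# PU risk: the negative-class term rewritten with unlabeled data only.
r_pu = pi * loss(f(x_p)).mean() + loss(-f(x_u)).mean() - pi * loss(-f(x_p)).mean()

print(f"supervised risk: {r_pn:.3f}   PU estimate: {r_pu:.3f}")
```

The point of the identity is that the left-hand side needs negative labels while the right-hand side does not, so an unbiased risk estimate (and hence a trained classifier) can be obtained without any labeled negatives.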


Bio:
Masashi Sugiyama received the PhD degree in Computer Science from Tokyo Institute of Technology, Japan in 2001. He has been Professor at the University of Tokyo since 2014 and was concurrently appointed Director of the RIKEN Center for Advanced Intelligence Project in 2016. His research interests include theory, algorithms, and applications of machine learning. He has (co-)authored several books, including Density Ratio Estimation in Machine Learning (Cambridge University Press, 2012), Machine Learning in Non-Stationary Environments (MIT Press, 2012), Statistical Reinforcement Learning (Chapman and Hall, 2015), and Introduction to Statistical Machine Learning (Morgan Kaufmann, 2015). He served as a Program co-chair and General co-chair of the Neural Information Processing Systems conference in 2015 and 2016, respectively. Masashi Sugiyama received the Japan Society for the Promotion of Science Award and the Japan Academy Medal in 2017.