The 2nd International Workshop on Dynamic Scene Reconstruction

Reconstruction of general dynamic scenes is motivated by potential applications in film and broadcast production together with the ultimate goal of automatic understanding of real-world scenes from distributed camera networks. With recent advances in hardware and the advent of virtual and augmented reality, dynamic scene reconstruction is being applied to more complex scenes with applications in Entertainment, Games, Film, Creative Industries and AR/VR/MR. We welcome contributions to this workshop in the form of oral presentations and posters. Suggested topics include, but are not limited to:

  • Dynamic 3D reconstruction from single, stereo or multiple views
  • Learning-based methods in dynamic scene reconstruction and understanding
  • Multi-modal dynamic scene modelling (RGBD, LIDAR, 360 video, light fields)
  • 4D reconstruction and modelling
  • 3D/4D data acquisition, representation, compression and transmission
  • Scene analysis and understanding in 2D and 3D
  • Structure from motion, camera calibration and pose estimation
  • Digital humans: motion and performance capture, bodies, faces, hands
  • Geometry processing
  • Computational photography
  • Appearance and reflectance modelling
  • Scene modelling in the wild, moving cameras, handheld cameras
  • Applications of dynamic scene reconstruction (VR/AR, character animation, free-viewpoint video, relighting, medical imaging, creative content production, animal tracking, HCI, sports)

The objectives for this workshop are to:

  • Bring together leading experts in the field of general dynamic scene reconstruction to help propel the field forward.
  • Create and maintain an online database of datasets and papers.
  • Accelerate research progress in the field of dynamic scene reconstruction to match the requirements of real-world applications by identifying the challenges and ways to address them through a panel discussion between experts, presenters and attendees.

Dynavis Recap


Yaser Sheikh is Director of the Facebook Reality Lab in Pittsburgh, devoted to achieving photorealistic social interactions in augmented reality (AR) and virtual reality (VR), and an Associate Professor at the Robotics Institute, Carnegie Mellon University. His research broadly focuses on machine perception and rendering of social behavior, spanning sub-disciplines in computer vision, computer graphics, and machine learning. With colleagues and students, he has won the Honda Initiation Award (2010), Popular Science’s "Best of What’s New" Award, as well as several conference best paper and demo awards. In 2004, he received the Hillman Fellowship for Excellence in Computer Science Research. Yaser has served as a senior committee member at leading conferences in computer vision, computer graphics, and robotics, and served as an Associate Editor of the Elsevier journal Computer Vision and Image Understanding. His research has been featured by various news and media outlets including The New York Times, BBC, MSNBC, Popular Science, WIRED, The Verge, and New Scientist.

Talk Title: Metric Telepresence
In this talk, I will describe early steps taken at FRL Pittsburgh toward achieving photorealistic telepresence: real-time social interactions in AR/VR with avatars that look like you, move like you, and sound like you. Telepresence is, perhaps, the application with the greatest potential to bring billions of people into VR. It is the next step along the evolution from telegraphy to telephony to videoconferencing. As with telephony and videoconferencing, the key attribute of success will be “authenticity”: users' trust that received signals (e.g., audio for the telephone and video/audio for VC) are truly those transmitted by their friends, colleagues, or family. The challenge arises from this seeming contradiction: how do we enable authentic interactions in artificial environments? Our approach to this problem centers around codec avatars: the use of neural networks to address the computer vision (encoding) and computer graphics (decoding) problems in signal transmission and reception. The creation of codec avatars requires capture systems of unprecedented 3D sensing resolution, which I will also describe.

Raquel Urtasun is the Chief Scientist of Uber ATG and the Head of Uber ATG Toronto. She is also an Associate Professor in the Department of Computer Science at the University of Toronto, a Canada Research Chair in Machine Learning and Computer Vision and a co-founder of the Vector Institute for AI. Prior to this, she was an Assistant Professor at the Toyota Technological Institute at Chicago (TTIC), an academic computer science institute affiliated with the University of Chicago. She was also a visiting professor at ETH Zurich during the spring semester of 2010. She received her Ph.D. degree from the Computer Science department at École Polytechnique Fédérale de Lausanne (EPFL) in 2006 and did her postdoc at MIT and UC Berkeley. She is a world-leading expert in AI for self-driving cars. Her research interests include machine learning, computer vision, robotics and remote sensing. Her lab was selected as an NVIDIA NVAIL lab. She is a recipient of an NSERC EWR Steacie Award, an NVIDIA Pioneers of AI Award, a Ministry of Education and Innovation Early Researcher Award, three Google Faculty Research Awards, an Amazon Faculty Research Award, a Connaught New Researcher Award, a Fallona Family Research Award, a UPNA alumni award and two Best Paper Runner-Up Prizes, awarded at the Conference on Computer Vision and Pattern Recognition (CVPR) in 2013 and 2017 respectively. She was also named Chatelaine's 2018 Woman of the Year and one of Toronto's top influencers of 2018 by Adweek magazine.

Talk Title: Photorealistic Simulation with Geometry-Aware Composition for Self-Driving


Time (PDT)


08:30 - 08:40 Welcome and Introduction
08:40 - 09:25 Keynote 1: Yaser Sheikh
Metric Telepresence
09:25 - 09:35 Break
09:35 - 10:35 Paper Session (15 mins each)

Semi-supervised 3D Face Representation Learning from Unconstrained Photo Collections
Zhongpai Gao, Juyong Zhang, Yudong Guo, Chao Ma, Guangtao Zhai, Xiaokang Yang

Bilinear Parameterization For Differentiable Rank-Regularization
Marcus Valtonen Örnhag, Carl Olsson, Anders Heyden

The “Vertigo Effect” on Your Smartphone: Dolly Zoom via Single Shot View Synthesis
Yangwen Liang, Rohit Ranade, Shuangquan Wang, Dongwoon Bai, Jungwon Lee

RGBD-Dog: Predicting Canine Pose from RGBD Sensors (Invited CVPR poster)
Sinead Kearney, Wenbin Li, Martin Parsons, Kwang In Kim, Darren Cosker

10:35 - 10:50 Break
10:50 - 11:35 Keynote 2: Raquel Urtasun
Photorealistic Simulation with Geometry-Aware Composition for Self-Driving
11:35 - 12:05 Panel Discussion
12:05 - 12:15 Close and Best Paper


We welcome submissions from both industry and academia, including interdisciplinary work and work from those outside of the mainstream computer vision community.


Papers are limited to 8 pages in the CVPR format (main conference author guidelines). All papers will be reviewed under a double-blind policy. Papers will be selected based on relevance, significance and novelty of results, technical merit, and clarity of presentation.

Submission website

Important Dates

Action                       Date
Paper submission deadline    March 16, 2020
Notification to authors      March 30, 2020
Camera-ready deadline        April 16, 2020


The best paper will receive an NVIDIA TITAN RTX GPU, courtesy of our workshop sponsor NVIDIA.