Pablo Cesar: The Outlook for Virtual Meetups

When Centrum Wiskunde & Informatica celebrated its 75th anniversary online due to a lockdown earlier this year, two colleagues immersed in a virtual world cut a virtual birthday cake together. Fortunately, CWI has an in-house specialist who, together with his team, made this possible using virtual reality (VR): Pablo Cesar.

Publication date: 25-10-2021

How do you cut a birthday cake with your friends if the coronavirus pandemic does not allow you to get close to each other? That was the challenge that the national research institute for mathematics and computer science in the Netherlands, Centrum Wiskunde & Informatica (CWI), faced with professional cake designer Cake Researcher when CWI celebrated its 75th anniversary earlier this year.

Fortunately, CWI has an in-house specialist who solved that problem using virtual reality (VR): Pablo Cesar, a researcher in human-centered multimedia systems and leader of the Distributed and Interactive Systems group at CWI, who is also a professor and chair of Human-Centered Multimedia Systems at the Netherlands' Delft University of Technology (TU Delft). Cesar, named an ACM Distinguished Member in 2020, investigates how to improve the ways people use interactive systems to communicate with each other.

While we currently use interactive systems to communicate person to person via flat screens, it would be much more convenient for many applications to communicate via three-dimensional (3D) video, also called volumetric video. Ultimately, we might want to transfer high-quality 3D models of people anywhere in the world in real time, something that Microsoft calls holoportation.

Working on the path to holoportation, Cesar develops state-of-the-art technology for capturing and distributing volumetric video. He showed Bennie Mols around CWI's two VR rooms. Surrounded by Kinect cameras standing on tripods and hanging from the ceiling, Cesar spoke about where the technology stands right now, and what the future holds.

How would you describe the point where we are at present with regard to distributing volumetric video?

We are now with volumetric video where we were with 2D video in the 1990s: we understand that it is possible, and we know more or less the path to explore. There is a lot of potential, but there is still a lot of work to be done.

Can you describe a concrete example from your research?

Cutting the virtual birthday cake earlier this year represented the state of the art in what is possible with volumetric video. Two people were immersed in a virtual world by wearing VR glasses, and had to cut a virtual birthday cake together. Each person and the birthday cake were recorded in separate rooms using three Kinect cameras each. The volumetric videos were combined in real time so that the two people got the feeling that they were really cutting the cake together. You can see the demonstration online.

How does this compare to what other labs are capable of?

Other labs might have a better video capturing system or a better rendering system; each lab has its own strong points. But in terms of the full pipeline to capture, transmit, and render volumetric video, my team is state of the art. Our goal is to make our technology available as open source later this year or next year.

Do you have another example from your work that might have practical applications?

In another recent demonstration, we used a normal mobile phone camera instead of three Kinect cameras. Though not perfect, the mobile phone camera can also capture depth information.

In a nearby parking lot, we simulated a cyclist who had fallen off his bicycle and seriously injured his knee. A chance passer-by starts a video conversation with a doctor using the public 5G mobile network from KPN. This was a premiere: the first time that the public 5G network was used for volumetric videoconferencing.

We demonstrated that, based on the volumetric video she sees, the doctor can give advice to the passer-by on how to help the injured cyclist. This is a good example of the future of remote consultation, where professionals really need 3D video and data to make decisions, and not just 2D video.

Text: Bennie Mols for ACM
