Without massive computing power, the metaverse is a mirage

Introduction: On September 26, 2022, the immersive interview program "Breaking the Metaverse" was broadcast simultaneously on the Alibaba Cloud developer community, the Alibaba Cloud developer video channel, Alibaba Cloud Watch, and other official channels. He Zhan, head of NVIDIA's Omniverse business in China, Lou Yanxin, founder of Shahe Nuclear Technology, and Zhang Xintao, an Alibaba Cloud elastic computing product expert, shared their understanding of the industry, implementation cases, bottlenecks and challenges, and more.

A digital world pioneer, a sci-art creator, and a cloud computing expert: what insights can these three share? Click the video below to watch highlights of the program.


Video: Breaking the "Metaverse" | A Dialogue Among Experts

The following is a transcript of the program:

Q1 What are the metaverse and immersive experience?

He Zhan: How do you understand the metaverse? What is the metaverse in your mind?

Zhang Xintao: We think of it as the next generation of the Internet. In the future, our food, clothing, travel, learning, and work will no longer be carried out on mobile phones and PCs; instead, there will be some kind of lightweight XR terminal that carries all of these activities.

He Zhan: My understanding is that it is a new evolution of the Internet, where we interact through new media such as XR, or even brain-computer interfaces in the future.

Lou Yanxin: I think we have not yet fully arrived at a 100% metaverse. Rather, we are jointly building a new generation of the Internet, and we are all still contributing to it.

He Zhan: Yes, this ecosystem needs to be built together.

He Zhan: Immersion is also an interesting word. How do we understand it?

Lou Yanxin: Immersion itself is an adjective describing a feeling: users place themselves in a scene and become part of it. AR and VR, especially VR, give you a very direct version of that feeling.

Zhang Xintao: Beyond the VR equipment and high-quality content Mr. Lou just mentioned, immersion also requires interaction, and interaction poses a huge challenge to the cloud computing and chip industries. It requires real-time feedback, that is, real-time computation. For example, if the experiencer touches a flower in the virtual space, the flower should move, and with the corresponding gloves on, there should be matching tactile feedback. All of this must be computed in real time.

We need enough technical means to "trick" the human brain into believing it is in the real world, unable to distinguish virtual space from real space. When the brain can no longer tell the difference, the sense of immersion arises naturally.

He Zhan: I was lucky enough to attend Alibaba's U Design Week a few weeks ago. Many sessions (talks and exhibition areas) covered algorithm optimization in vision and touch: virtual reality interaction through VR glasses and gloves, and even the simulation of smell through a wearable collar.

For example, when a chocolate cake appears in a film, the collar releases the smell of chocolate; when a particularly smelly stable appears on screen, the stench is synthesized in sync, in real time. Taste is simulated with a small pad inserted into the phone. But every one of these sensory experiences depends on one thing: computing.

Lou Yanxin: The cost of simulation differs across the senses. What we are doing now is mostly vision and hearing. Touch is more expensive, and its cost has not yet come down to the point where everyone can afford it. But all of this work is about reproducing and simulating the senses.

He Zhan: Yes, you mentioned hearing just now. A while ago at a university, I experienced spatial audio: wearing a headset, an AI-generated audio channel carried a piece of music that seemed to run from the left ear to the right ear. We have now discussed immersion in terms of vision, hearing, taste, smell, and touch. Think about it: if all five senses are to be rendered in real time, as Xintao said, that is not something a pair of glasses alone can support.

Lou Yanxin: Yes, there is a long way to go.

Q2 What immersive experience practices can you share?

He Zhan: Has Alibaba Cloud worked on any related scenarios and applications recently? Please share them with us.

Zhang Xintao: One example is a virtual concert. In virtual space, actors are not limited by physics: they can become larger or smaller. This is a new art form that cannot be produced on a real stage.

He Zhan: I understand this is different from an ordinary virtual concert. It supports switching between different scenes and lets the audience and performers interact in real time.

Lou Yanxin: As Mr. Zhang just said, we held a concert in the metaverse. It was not a single performer but a whole band driving characters in the virtual space through motion capture, performing for everyone in real time. What made it special is that the concert could be watched both in VR and on flat screens, with a different interaction mode on each kind of device.

During one song in VR, the audience could fly through the whole space. The character stood in front of a black hole, and so did the performer. While flying, audience members could draw colored light trails, which became part of the stage effects. That effect was produced not by the performers or the stage, but by the audience's interaction.

He Zhan: By flat screens, you mean mobile phones or tablets?

Lou Yanxin: Yes. The whole performance is rendered in the cloud, and the finished frames are distributed to VR headsets, tablets, and mobile phones. The performance was actually part of a larger activity on our newly developed platform, called "Daqian", a space aggregation platform that hosts performances and exhibitions in virtual space. We have also developed a fully cloud-based version, so whether users attend exhibitions, performances, or other activities, everything can be accessed through cloud rendering.
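One practical detail in a pipeline like this is that each client device needs a different encode profile: a VR headset wants a stereo pair at high resolution, while a phone wants a single low-resolution stream. The sketch below illustrates that per-device fan-out; the device names, resolutions, and default behavior are assumed for illustration, not taken from the Daqian platform itself.

```python
# Minimal sketch of per-device stream selection in a cloud-rendering
# fan-out. All profile values here are hypothetical examples.
from dataclasses import dataclass


@dataclass
class StreamProfile:
    width: int
    height: int
    stereo: bool  # VR headsets need a left/right eye pair


PROFILES = {
    "vr_headset": StreamProfile(3840, 1920, stereo=True),
    "tablet":     StreamProfile(1920, 1080, stereo=False),
    "phone":      StreamProfile(1280, 720,  stereo=False),
}


def profile_for(device: str) -> StreamProfile:
    """Pick the encode profile for a client device (fall back to phone)."""
    return PROFILES.get(device, PROFILES["phone"])


if __name__ == "__main__":
    for d in ("vr_headset", "tablet", "unknown_tv"):
        print(d, profile_for(d))
```

The key design point is that the scene is rendered once in the cloud, and only the encode step differs per device, which is what lets one performance reach heterogeneous endpoints.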

He Zhan: Is this platform still real-time?

Lou Yanxin: Yes, it is completely real-time, and it runs on NVIDIA GPUs.

He Zhan: How much concurrency can you achieve in projects you have worked on?

Lou Yanxin: In terms of traditional network concurrency, it can reach the level of thousands of people.

He Zhan: I remember that at the end of last year, at GTC (our technical conference), Jensen Huang (Huang Renxun, NVIDIA's founder) personally demonstrated a conversation with his own virtual human. As far as I can remember, that was the first time I saw a real person interact with a digital avatar in real time.

Lou Yanxin: We happen to have a work that was shortlisted for this year's Venice International Film Festival and is showing in Venice right now. It is a theatrical performance project in which the performance space transforms across several different stages. Throughout the show, the actors wear motion-capture suits, and one actor performs for six audience members. The actor is in Paris while the audience is in Venice, so the piece relies on transnational transmission of motion-capture data, with the entire process driven by real-time computing.

Q3 Why must cloud computing be used for immersive experience?

He Zhan: I have a question for Xintao, our cloud expert. Why is "immersive experience" so strongly tied to cloud computing? How do you see this? Is implementing an "immersive experience" really that demanding on computing power?

Zhang Xintao: Take virtual humans as an example. So far, the large language models of even the world's leading companies still have many problems and require huge computing clusters. Every sentence I say to a virtual human, and every answer it gives back, means a great deal of computing power has to be mobilized behind the scenes.

The other part is 3D rendering. To render remotely, you need to find the right computing node and the right network path, and push network latency down to a very low level, so that users cannot perceive the delay while interacting. There are still many challenges here.

He Zhan: Especially for a major event, failures are simply not acceptable.

Zhang Xintao: Delivering that computing power stably sounds relatively simple, but it is actually very challenging. Our own phones crash and our PCs fail, but the cloud cannot allow that. Take Alipay: a user may be at the hospital trying to pay at that very moment, and a failure then would be a very big problem. The other issue is scale. Some concerts can get extremely popular, and concurrency may suddenly jump to 20,000 or 30,000. The cloud can handle this because it has a huge pool of computing resources that can be allocated immediately.

He Zhan: Yes, the Alipay example really struck me. It is a small thing in daily life, but without stable computing behind it, the impact turns it into a big thing. I would also like to ask Mr. Lou: in the course of your business, why did you choose Alibaba Cloud?

Lou Yanxin: In the past, while building the "Daqian" platform, many of our plans were based on end devices. We had to decide whether a 1080 or some other graphics card would be the computing baseline. To put it bluntly, we planned everything around the level of the user's graphics card.

Later, when we moved to the cloud and built the cloud version there, we found we were relieved of that concern. The current Daqian platform runs both in the cloud and on end devices, meaning both kinds of access are supported.

We also had to consider transnational scenarios: our performers may be in China while performing for overseas audiences, so we had to think about how to deploy nodes and which providers could offer that capability. It seemed that only Alibaba Cloud could, so we chose Alibaba Cloud from the beginning.

He Zhan: To sum up, without the cloud, picking some hardware standard as the basis of computing power would be painful.

Lou Yanxin: Yes, it's really painful.

He Zhan: Just now you mentioned concurrency of hundreds, then thousands, then tens of thousands of people. Can we really reach tens of thousands now?

Zhang Xintao: A few years ago, one of our customers built an application that basically represented a leap for cloud computing: more than 13,000 GPUs serving a single app at the same time, with tens of millions of people logged in and using that one app simultaneously.

He Zhan: I think only the Chinese market can have such a large concurrency.

Q4 What challenges in the XR field still need to be broken through?

He Zhan: What technologies do you think need to be improved in XR or VR?

Zhang Xintao: The challenges here are still quite big. Our current computing capacity, communication capacity, and computing scale are far from the level we just imagined. For example, if we want to build a very high-definition digital human today, even NVIDIA's most powerful chip may not be able to keep up. So we may consider working with engine companies to try to parallelize the work.

As for AI, you will find that our current large language models, much of AIGC, and the ability to recognize human micro-expressions are still at the stage of weak AI. If a virtual human has a low IQ, users will have no sense of immersion, because they subconsciously know it is a machine, right? But if it has a high IQ, recognizes your expressions, and understands your emotions, then users will feel they are facing a real "person". We believe there are still theoretical breakthroughs to be made in computing, communication, and various algorithms.

Lou Yanxin: Obtaining computing power stably and cheaply is always difficult. Today's VR devices do contain chips, but the standalone headsets people commonly use, whether Pico or Meta, contain mobile ARM chips whose performance is far below a 1080 and may not even reach a 6600.

The computing power of the consumer VR devices the public uses is very limited, but what we want to do goes far beyond it. We want to build gorgeous, interesting scenes, yet we cannot deliver them on such devices. The cloud can really help here and provide that capability. But how ordinary consumers can obtain this computing power cheaply and stably is something that still requires joint effort.

Another thing I want to talk about is interoperability. In the VR space, much of what we build ends up as isolated islands of information. Our "Daqian" platform aggregates virtual spaces created by different people, and for that alone we have to settle what the formats and the interfaces are.

I think that, at the asset-format level, we may gradually embrace USD (Universal Scene Description). But USD alone is not enough, because USD is a description of assets, and we also need logic: how users play and interact within the engine, which USD does not standardize. When we participated in the Metaverse Standards Forum, this is exactly what we were all discussing: how to jointly build an interconnected network architecture for the metaverse, in which assets and information can flow between worlds.
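To make the assets-versus-logic distinction concrete, here is a minimal, hypothetical `.usda` scene fragment (USD's human-readable text format). The prim names and values are invented for illustration: USD can standardize the asset's geometry, transform, and display color, but the interactive behavior Lou describes has no standard slot in the format.

```usda
#usda 1.0
(
    defaultPrim = "Stage"
)

def Xform "Stage"
{
    # A flower asset: geometry, transform, and color are all
    # standardized by USD and portable between tools.
    def Sphere "Flower"
    {
        double3 xformOp:translate = (0, 1, 0)
        uniform token[] xformOpOrder = ["xformOp:translate"]
        color3f[] primvars:displayColor = [(1, 0.4, 0.6)]
    }
    # ...but "the flower sways when a user touches it" has no
    # standard USD encoding; that interaction logic still lives
    # in each engine.
}
```

This is why scene description alone does not make spaces interoperable: two platforms can exchange this file and render the same flower, yet still disagree on how users may interact with it.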

He Zhan: Yes, given the demand for computing power we just heard about, our CloudXR is also cooperating with Alibaba Cloud. In addition, we were among the first group to join the open standards body you mentioned. A total of 36 enterprises have been discussing standards for data formats, scene description, material definition, and what is called the digital economic system. Developing something everyone can interoperate with is genuinely difficult.

Lou Yanxin: Yes, I think it is time for us to go back to the spirit of Web 1.0 and start building a new network architecture together.

He Zhan: Yes, so we also describe USD as the HTML of the next-generation Internet, the next-generation metaverse.

Lou Yanxin: Yes, an interoperability format.

He Zhan: Yes, we have been dreaming of this for a long time, and we look forward to working with Alibaba Cloud and our users to build a better ecosystem. That concludes today's interview. We look forward to more new, production-ready immersive projects in the metaverse. Thank you.
