“Increasing demand is driving the digital human boom,” says Shiyan Li, head of Baidu’s robotics and digital human business, who created the digital model actor, Gong. “In China alone, there are more than 400 million fans of ACGN (animation, comics, games and novels) and a business market worth hundreds of billions of dollars focused on digital humans.” And according to Qichacha, a company that tracks business records, China now has more than 280,000 companies engaged in digital human-related activities.
Another type of digital
The debut of Baidu’s digital celebrity may not seem like much at first, as the concept of “virtual idols” has been around for years. For example, American virtual influencer Lil Miquela has appeared alongside real human celebrities in online ads and TV commercials since 2016, gaining more than three million followers on Instagram. However, there is something different about the virtual Chinese star: it is a digital human with the ability to listen, speak, and interact with real humans on a level never seen before. And Gong’s digital work isn’t just about singing. In the latest update of the Baidu app, China’s leading search and feed app, Gong appears on users’ phones, helping them with searches and queries using the real voice of the model actor. Since this interactive search experience was launched in 2021, the number of voice search queries on the Baidu app has increased by 18.2%.
Baidu AI Cloud began developing a digital employee in 2019 in collaboration with Shanghai Pudong Development (SPD) Bank. Subsequently, they focused their efforts on building a digital financial advisor to provide a service equivalent to that of a human bank representative when real-life employees were unavailable. Today, SPD Bank says more than 460,000 customers rely on digital humans for banking and portfolio management each month. “Access to digital humans outside of normal business hours allows SPD Bank to offer customer service 24 hours a day, 7 days a week, at low cost and high efficiency,” says a bank representative.
More recently, a virtual anchor created by Baidu offered live commentary in sign language at the 2022 Beijing Winter Games for hearing-impaired viewers. In addition to looking like a real person, the avatar was endowed with speech recognition and sign language interpretation skills to ensure fast and highly accurate input and output. With approximately 430 million people worldwide suffering from “disabling” hearing loss, according to the World Health Organization, there is great potential for this technology to help them access a wide range of content.
XiLing: a new generation on an AI platform
From entertainment to public services, digital humans will play a more important role in our daily lives. But behind their natural, effortless appearance is a complex network of new and emerging technologies that push the boundaries of AI innovation.
Baidu AI Cloud’s digital celebrities and virtual sign language anchors were created through XiLing, a new digital platform launched in 2021. At the Baidu World 2022 event held on June 21, the company announced a new capability in XiLing that supports the creation of digital humans who can host live broadcasts, sing, dance, and respond to comments in real time, without the need for a single break. XiLing is unique in its ability to support the entire process of creating a digital human, from creating a realistic person to equipping them with conversational skills and content generation. One of its most striking attributes is speed. The platform can generate a 3D avatar based on a real person in one or two weeks, while a 2D avatar can be done in minutes.
In addition, using XiLing’s intelligent dialogue tools, creators can quickly customize a digital human’s conversational ability, letting it adapt and learn over time. This capability is driven by Baidu’s PLATO, a one-hundred-billion-parameter dialogue model that allows digital humans to engage in open-domain conversations, that is, to understand any topic and offer relevant answers. Speech recognition and high-precision lip synchronization, with accuracy greater than 98.5%, allow digital humans to have more fluid, human-like interactions. “The use of advanced AI technologies will continue to reduce the cost of building digital humans and significantly improve their interactions with real humans,” Li says.
Just as every real human has their own set of skills and talents, so does the new generation of digital humans. This may even include giving digital humans the ability to be creative themselves, thanks to recent advances made by large AI models like Baidu’s ERNIE, which can generate text and create realistic images when asked. Digital humans designed to serve as brand spokespersons, for example, can independently create and post on social media, design posters, and act in videos.