Realistic 3D avatars have become prevalent in video games, virtual reality/augmented reality experiences, and the film industry. The advent of the Metaverse and AI agents has fueled a growing need for customized and expressive character creation, especially in virtual meetings, conversational agents, and intelligent customer service. Despite this demand, crafting a personalized 3D avatar with conventional digital creation tools remains intricate and time-intensive, which makes it hard for general users to produce detailed facial attributes.
A team of researchers from the Institute for Intelligent Computing and Alibaba Group introduces Make-A-Character (Mach), a system designed to simplify the creation of 3D digital human models. Leveraging advanced language and vision models, Mach transforms basic text descriptions into detailed and realistic 3D avatars. This streamlined approach lets users effortlessly generate personalized avatars that align with their envisioned personas. Mach also integrates easily with existing CG pipelines for dynamic expressiveness.
The researchers put forward a conversion mechanism named Triplane, which improves geometry generation and makes it easier to optimize camera parameters and Triplane maps based on dense facial landmarks and a reference image. To collect ground-truth data, they captured the faces of 193 individuals under uniform illumination and artificially created textures under varying illumination conditions. To improve data diversity and avoid overfitting, they augmented the skin colors of the ground-truth diffuse albedos based on the Individual Typology Angle (ITA). High dynamic range (HDR) lights were generated for each ground-truth sample to cover a wide range of natural lighting conditions.
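For readers unfamiliar with the Individual Typology Angle, the short sketch below shows how ITA is commonly computed from CIELAB values, using the standard definition ITA = arctan((L* - 50) / b*) × 180/π and the commonly cited skin-tone category thresholds. The function names are illustrative, and this is not the authors' augmentation code; the article does not describe how the albedo colors were actually shifted.

```python
import math


def individual_typology_angle(l_star: float, b_star: float) -> float:
    """Return the ITA in degrees from CIELAB lightness L* and yellow-blue b*.

    atan2 is used so that b* == 0 does not raise; for b* > 0 (typical for
    skin) it matches the usual arctan((L* - 50) / b*) definition.
    """
    return math.degrees(math.atan2(l_star - 50.0, b_star))


def ita_category(ita_degrees: float) -> str:
    """Map an ITA value to the commonly used six skin-tone groups."""
    if ita_degrees > 55:
        return "very light"
    if ita_degrees > 41:
        return "light"
    if ita_degrees > 28:
        return "intermediate"
    if ita_degrees > 10:
        return "tan"
    if ita_degrees > -30:
        return "brown"
    return "dark"


if __name__ == "__main__":
    # Hypothetical CIELAB samples (L*, b*), chosen only for illustration.
    for l_star, b_star in [(70.0, 12.0), (60.0, 18.0), (40.0, 15.0)]:
        ita = individual_typology_angle(l_star, b_star)
        print(f"L*={l_star}, b*={b_star} -> ITA={ita:.1f} deg, {ita_category(ita)}")
```

An augmentation step in this spirit would sample target ITA values across these groups and recolor each albedo toward them, which spreads the training textures over a wider range of skin tones than the captured subjects alone provide.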
A series of 2D face parsing and 3D generation modules is used to generate the mesh and textures of the target face, along with matched accessories, so that the resulting 3D avatar can be animated easily. The process employs differentiable rendering and enhancement methods to extract and refine the diffuse texture from a reference image. The hair generation module also contributes to overall expressiveness through detailed strand-level synthesis. Accessories such as garments, glasses, eyelashes, and irises are sourced from a tagged 3D asset library: semantic attributes are extracted, matching assets are selected, and the elements are then assembled into a complete 3D figure.
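The idea of selecting accessories from a tagged asset library can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Asset` type, the example library, and the tag-overlap scoring are hypothetical, and the article does not say how Mach actually matches extracted attributes to assets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    """Hypothetical entry in a tagged 3D asset library."""
    name: str
    category: str        # e.g. "glasses", "garment"
    tags: frozenset      # semantic tags attached to the asset


# Hypothetical library contents, invented for this example.
LIBRARY = [
    Asset("round_metal_glasses", "glasses", frozenset({"round", "metal", "thin"})),
    Asset("square_black_glasses", "glasses", frozenset({"square", "black", "thick"})),
    Asset("red_hoodie", "garment", frozenset({"red", "casual", "hoodie"})),
    Asset("grey_suit", "garment", frozenset({"grey", "formal", "suit"})),
]


def pick_asset(library, category, wanted_tags):
    """Return the asset in `category` sharing the most tags with `wanted_tags`.

    Returns None if the category has no assets. Ties go to the first asset
    encountered, because max() keeps the earliest maximal element.
    """
    candidates = [asset for asset in library if asset.category == category]
    if not candidates:
        return None
    wanted = set(wanted_tags)
    return max(candidates, key=lambda asset: len(asset.tags & wanted))


if __name__ == "__main__":
    # Attributes as they might be extracted from a prompt such as
    # "a man in a formal grey suit wearing round glasses".
    extracted = {"glasses": {"round", "metal"}, "garment": {"formal", "grey"}}
    for category, tags in extracted.items():
        chosen = pick_asset(LIBRARY, category, tags)
        print(category, "->", chosen.name if chosen else "no match")
```

Counting shared tags is the simplest possible scoring rule; a production system would more plausibly compare text or image embeddings, but the overall flow of extracting attributes, retrieving the best match per category, and assembling the results stays the same.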
The study presents visual results of the generated 3D avatars, including expressive animations driven by facial rig control. The researchers demonstrate the effectiveness of their approach by generating detailed facial attributes from text prompts using a large language model (LLM), Stable Diffusion, and ControlNet. The generated 3D avatars exhibit realistic textures and geometry. The researchers also showcase strand-based hair generation guided by hairstyle images produced with Stable Diffusion (SD) models.
In conclusion, the study proposes a method for generating detailed 3D avatars with realistic textures and geometry, guided by text prompts and dense facial landmarks. The researchers demonstrate the effectiveness of their approach through visual results, including expressive animations driven by facial rig control. An LLM, Stable Diffusion, and ControlNet together enable the generation of detailed facial attributes. The study highlights the importance of dense facial landmarks in accurately reconstructing the face and head structure. Mach uses synthetic images as training data and establishes a multi-view capture and processing pipeline to produce head scans with uniform topology.
Sana Hassan, a consulting intern at Marktechpost and dual-degree student at IIT Madras, is passionate about applying technology and AI to address real-world challenges. With a keen interest in solving practical problems, he brings a fresh perspective to the intersection of AI and real-life solutions.