This topic describes how agent animations are implemented in the Real-time Conversational AI demo.
Overview
In a voice call scenario, you can customize an agent avatar to express different emotions based on emotional tags. Compared to a standard agent, an expressive one interacts more naturally with users. Varied tones, expressions, and movements enhance emotional connection and make interactions more engaging and realistic. This type of agent can also perceive user emotions to provide more personalized services, making it ideal for customer service, education, and entertainment.
Official agent animation
How it works
The official AI Real-Time Interaction demo uses After Effects to design the agent avatar animation, which is then exported as a Lottie resource file. The client integrates a Lottie component library to play and control the animation. For more information about Lottie, see LottieFiles.
In the official demo, animations designed in After Effects are split into modular components: head, hand, eyes, and covering eyes. The animation resources for each module are exported separately. The complete Lottie resource structure is as follows:
├── Avatar                  // Root directory for avatar animation resources
│   ├── bg.png              // Background image
│   ├── Enter               // Intro animation (while the call is connecting)
│   ├── Head                // Head animation
│   ├── Hand                // Hand animation
│   ├── CoveringEyes        // Hand animation for covering eyes (when the agent is interrupted)
│   └── EyeEmotions         // Eye animations
│       ├── Interrupting    // Eye animation - Interrupted
│       ├── Listening       // Eye animation - Listening
│       ├── Thinking        // Eye animation - Thinking
│       ├── Speaking        // Eye animation - Speaking
│       ├── Happy           // Eye animation - Speaking happily
│       └── Sad             // Eye animation - Speaking sadly
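On the client, each animation module can be driven by its own Lottie player. The following minimal sketch assumes the Android Lottie library (com.airbnb.android:lottie) and hypothetical asset file names (for example, Avatar/Head/data.json); it shows how a single module, such as the head animation, might be loaded from the app's assets and looped:

import com.airbnb.lottie.LottieAnimationView;
import com.airbnb.lottie.LottieDrawable;

public class AvatarModuleLoader {
    // Load one animation module into a LottieAnimationView and loop it.
    // The asset paths below are hypothetical examples; use the actual
    // file names exported from After Effects.
    public static void playHeadModule(LottieAnimationView headView) {
        headView.setImageAssetsFolder("Avatar/Head/images");  // Folder for the module's bitmap assets, if any.
        headView.setAnimation("Avatar/Head/data.json");       // Lottie JSON exported for the head module.
        headView.setRepeatCount(LottieDrawable.INFINITE);     // Loop the module continuously.
        headView.playAnimation();
    }
}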
Reference
The source code for the animations in the official demo can be found here:
Platform | Scenario          | File name | Class name
Android  | Intro animation   |           |
Android  | In-call animation |           |
iOS      | Intro animation   |           |
iOS      | In-call animation |           |
Web      | Intro animation   |           |
Web      | In-call animation |           |
Implementation details
In the official demo, the agent's head (nodding motion) and hand animations play in a loop. When the agent's state or emotion changes, the corresponding eye animation plays in a loop. When the agent is interrupted, the "covering eyes" hand animation plays once and then returns to the previous state. Additionally, the following logic requires special handling:
As the head animation loops up and down, the eyes follow this movement. You can achieve this by listening to the head animation's playback progress to estimate the Y-axis offset and then updating the eye animation's position in real time.
// Listen for update events on the head animation.
headAnimator.addAnimatorUpdateListener(new ValueAnimator.AnimatorUpdateListener() {
    @Override
    public void onAnimationUpdate(ValueAnimator animation) {
        float animatedFraction = animation.getAnimatedFraction();
        float speed = headAnimator.getSpeed();
        float progress;
        if (speed < 0) {
            progress = 1 - animatedFraction;
        } else {
            progress = animatedFraction;
        }
        // Estimate the offset based on the playback progress.
        float y = eyeOffset * (progress - 1.0f);
        // Update the eye animation's Y translation.
        eyeAnimatorContainerView.setTranslationY(y);
    }
});

Define the eye animation types and handle the transition logic:
- The Error state is final and cannot be transitioned from.
- If the next state is Interrupting, it cannot be overridden.
- While in the Interrupting state, the animation cannot be switched. The next state is cached and will play after the interruption animation finishes.
- When the agent is speaking with an emotion (HappySpeaking or SadSpeaking), it cannot be switched to the neutral Speaking state.
- Transitions to Interrupting or Error are queued and execute after the current animation cycle finishes.
public enum EyeAnimator {
    None,           // No animation
    Start,          // Agent starting up
    Error,          // Agent error state
    Interrupting,   // Agent is interrupted
    Thinking,       // Agent is thinking or loading
    Listening,      // Agent is listening
    Speaking,       // Agent is speaking (neutral)
    HappySpeaking,  // Agent is speaking (happy)
    SadSpeaking     // Agent is speaking (sad)
}

public void setEyeAnimatorType(EyeAnimator type) {
    Log.d("Animator", "Will Set Animator Type Curr: " + currEyeAnimatorType + " To: " + type);
    if (type == EyeAnimator.None) {
        return;
    }
    if (currEyeAnimatorType == type) {
        return;
    }
    if (currEyeAnimatorType == EyeAnimator.Error) {
        // The Error state is final and cannot be transitioned from.
        return;
    }
    if (nextEyeAnimatorType == EyeAnimator.Interrupting || nextEyeAnimatorType == EyeAnimator.Error) {
        // If the next state is Interrupting, it cannot be overridden.
        return;
    }
    if (currEyeAnimatorType == EyeAnimator.Interrupting) {
        // While in the Interrupting state, the animation cannot be switched.
        // The next state is cached and will play after the interruption animation finishes.
        if (type == EyeAnimator.Speaking && nextEyeAnimatorType.ordinal() > type.ordinal()) {
            // If the next state is an emotional speaking state, it cannot be switched to the neutral Speaking state.
            return;
        }
        nextEyeAnimatorType = type;
        return;
    }
    if (type == EyeAnimator.Speaking && currEyeAnimatorType.ordinal() > type.ordinal()) {
        // If the agent is speaking with an emotion, it cannot be switched to the neutral Speaking state.
        return;
    }
    if (type == EyeAnimator.Interrupting || type == EyeAnimator.Error) {
        // If the next state is Interrupting or Error, wait for the current animation to complete.
        nextEyeAnimatorType = type;
        return;
    }
    Log.d("Animator", "Set Animator Type Next: " + type + " Curr: " + currEyeAnimatorType);
    startNextEyeAnimator(type);
}

When an interruption event is triggered, the "covering eyes" animation must be synchronized with the head animation. It must start simultaneously with the first frame (progress=0) of the head animation's cycle. During this time, the eye animation cannot be switched to another state.
// Listen for the end of the head animation.
headAnimator.addAnimatorListener(new AnimatorListenerAdapter() {
    @Override
    public void onAnimationEnd(Animator animation) {
        headAnimator.reverseAnimationSpeed();
        // Loop the head animation by playing it again.
        headAnimator.playAnimation();
        if (nextEyeAnimatorType != EyeAnimator.None) {
            // Check for a queued animation type.
            // Synchronously start the next animation (e.g., covering eyes).
            startNextEyeAnimator(nextEyeAnimatorType);
            if (currEyeAnimatorType == EyeAnimator.Interrupting) {
                nextEyeAnimatorType = EyeAnimator.Listening;
            } else {
                nextEyeAnimatorType = EyeAnimator.None;
            }
        }
    }
});

Avatar animations can be difficult to design purely with vector graphics and may rely on many rasterized assets (bitmaps), which can increase memory consumption. To reduce memory usage, you can export only a one-way animation (e.g., nodding down) from After Effects and use the AutoReverse mode during playback to create a loop.
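As a rough sketch of that approach with the Android Lottie library, you can set a reversing repeat mode on the view that plays the one-way clip; the helper class and view names below are hypothetical:

import com.airbnb.lottie.LottieAnimationView;
import com.airbnb.lottie.LottieDrawable;

public class HeadLoopHelper {
    // Loop a one-way (nod-down) clip by playing it forward, then backward,
    // instead of shipping a full up-and-down animation.
    public static void loopByAutoReverse(LottieAnimationView headView) {
        headView.setRepeatMode(LottieDrawable.REVERSE);    // Reverse direction on each repeat.
        headView.setRepeatCount(LottieDrawable.INFINITE);  // Keep alternating indefinitely.
        headView.playAnimation();
    }
}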
Implement custom agent animations
Method 1
Follow the official demo's approach by reusing the existing control logic. Design and export your own Lottie assets using After Effects, then replace the official Lottie assets located in the AUIAICall/src/main/assets/Avatar directory. This allows you to reuse the demo's animation playback and control source code.
Method 2
Design your own animations in After Effects and split out the animated parts that change based on state (such as listening, thinking, and happily speaking). Then, implement your own animation control logic by referencing the official demo's implementation. This approach allows you to add your own custom logic and special handling.
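For example, a minimal custom controller might drive one Lottie view per module and swap only the eye animation when the agent's state or emotion changes. The sketch below assumes the Android Lottie library and hypothetical asset paths and state names; the interruption queuing and emotion priority rules from the official demo are omitted:

import com.airbnb.lottie.LottieAnimationView;
import com.airbnb.lottie.LottieDrawable;

public class CustomAvatarController {
    private final LottieAnimationView headView;  // Looping head module.
    private final LottieAnimationView eyeView;   // Eye module swapped per state.

    public CustomAvatarController(LottieAnimationView headView, LottieAnimationView eyeView) {
        this.headView = headView;
        this.eyeView = eyeView;
    }

    // Start the always-on head loop.
    public void start() {
        headView.setAnimation("Avatar/Head/data.json");   // Hypothetical asset path.
        headView.setRepeatMode(LottieDrawable.REVERSE);   // One-way clip looped by auto-reverse.
        headView.setRepeatCount(LottieDrawable.INFINITE);
        headView.playAnimation();
        setEyeState("Listening");
    }

    // Swap the eye animation when the agent's state or emotion changes.
    public void setEyeState(String state) {
        eyeView.cancelAnimation();
        eyeView.setAnimation("Avatar/EyeEmotions/" + state + "/data.json");  // Hypothetical asset path.
        eyeView.setRepeatCount(LottieDrawable.INFINITE);
        eyeView.playAnimation();
    }
}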