1. Works with the user’s live speech. Parsed voice input animates the avatar’s lips to produce real-time lip synchronization.

2. Random eye movement to simulate a typical human gaze.

3. Real-time head tracking and eyebrow motion. The avatar follows the user’s head rotation, and eyebrows change with head posture.

4. Body interaction. The avatar’s body follows the user’s motions.

5. Lighting, background images, and sound reflect the emotional content of the user’s speech (currently based on three emotional categories: happiness, sadness, and anger).

6. The user can choose a male or female avatar at any of four developmental stages: childhood, adolescence, adulthood, or old age.

7. Live text input for questions or comments (from a caregiver, friend, or family member) to mediate the flow of storytelling.
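
As a rough illustration of item 5, the mapping from a detected emotion category to scene settings could be sketched as follows. This is only a sketch under stated assumptions: the emotion label is assumed to come from a speech-analysis step that is not shown, and every name here (`SceneSettings`, `SCENE_PRESETS`, `scene_for_emotion`), along with the colors and file names, is hypothetical rather than taken from the system described.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SceneSettings:
    """Hypothetical bundle of the three scene elements named in item 5."""
    light_rgb: Tuple[int, int, int]  # lighting color
    background: str                  # background image file (placeholder name)
    sound: str                       # ambient sound file (placeholder name)


# Illustrative presets for the three emotion categories listed above.
# The specific values are assumptions chosen only for the example.
SCENE_PRESETS = {
    "happiness": SceneSettings((255, 240, 200), "sunny_park.png", "bright_theme.ogg"),
    "sadness": SceneSettings((120, 140, 180), "rainy_window.png", "soft_piano.ogg"),
    "anger": SceneSettings((220, 80, 60), "storm_sky.png", "low_drums.ogg"),
}

# Fallback used when the label is not one of the three known categories.
NEUTRAL_SCENE = SceneSettings((255, 255, 255), "neutral_room.png", "")


def scene_for_emotion(label: str) -> SceneSettings:
    """Return the scene preset for an emotion label, or the neutral scene."""
    return SCENE_PRESETS.get(label.strip().lower(), NEUTRAL_SCENE)
```

Calling `scene_for_emotion("Sadness")` returns the sadness preset, while an unrecognized label such as `"surprise"` falls back to the neutral scene. A lookup table keeps the sketch easy to extend if more emotion categories are added later.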