Our goal in the area of facial animation is to build a facial model that can be used in speech perception and production research. By driving a realistic facial model, we learn about the neural control of speech production and about how neural signals interact with the biomechanical and physiological characteristics of the articulators and the vocal tract. The model also makes it possible to manipulate physical parameters systematically in order to study their effect on speech perception.
Our facial model is an extension of previous work on muscle-based models of facial animation (Lee, Terzopoulos, and Waters, 1993, 1995; Parke and Waters, 1996; Terzopoulos and Waters, 1993; Waters and Terzopoulos, 1991, 1992). The modeled face consists of a deformable multi-layered mesh with the following generic geometry: the nodes in the mesh are point masses, connected by spring and damping elements (i.e., each segment connecting two nodes consists of a spring and a damper in a parallel configuration). The nodes are arranged in three layers representing the structure of facial tissues. The top layer represents the epidermis, the middle layer represents the fascia, and the bottom layer represents the skull surface. The elements between the top and middle layers represent the dermal-fatty tissues, and the elements between the middle and bottom layers represent the muscle. The skull nodes are fixed in three-dimensional space. The fascia nodes are connected to the skull layer except in the region around the upper and lower lips and the cheeks. The mesh is driven by modeling the activation and motion of several facial muscles in various facial expressions.
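The mesh description above can be made concrete with a minimal sketch of one parallel spring-damper element and a single time step. This is an illustration only, not the model's actual code: the function names, the element tuple layout, the linear spring law, and the semi-implicit Euler integrator are all assumptions introduced here.

```python
import math


def spring_damper_force(pos_a, pos_b, vel_a, vel_b, rest_length, stiffness, damping):
    """Force on node a from a parallel spring-damper element joining a and b.

    Assumes a linear spring and a linear damper acting along the element axis.
    """
    delta = [b - a for a, b in zip(pos_a, pos_b)]
    length = math.sqrt(sum(d * d for d in delta))
    if length == 0.0:
        return [0.0, 0.0, 0.0]  # coincident nodes: direction is undefined
    direction = [d / length for d in delta]
    # Rate of change of element length (positive while the element lengthens).
    rel_speed = sum((vb - va) * u for va, vb, u in zip(vel_a, vel_b, direction))
    magnitude = stiffness * (length - rest_length) + damping * rel_speed
    return [magnitude * u for u in direction]


def step(positions, velocities, masses, fixed, elements, dt):
    """Advance the mesh by one semi-implicit Euler step, in place.

    elements: list of (i, j, rest_length, stiffness, damping) tuples
              (hypothetical layout chosen for this sketch).
    fixed:    per-node flags; fixed nodes (e.g. the skull layer) never move.
    """
    forces = [[0.0, 0.0, 0.0] for _ in positions]
    for i, j, rest_length, stiffness, damping in elements:
        f = spring_damper_force(positions[i], positions[j],
                                velocities[i], velocities[j],
                                rest_length, stiffness, damping)
        for axis in range(3):
            forces[i][axis] += f[axis]   # force on node i
            forces[j][axis] -= f[axis]   # equal and opposite force on node j
    for n in range(len(positions)):
        if fixed[n]:
            continue
        for axis in range(3):
            velocities[n][axis] += dt * forces[n][axis] / masses[n]
            positions[n][axis] += dt * velocities[n][axis]
```

In this sketch a skull node would be flagged as fixed, so a stretched element pulls only its free fascia or epidermis node back toward the skull. Muscle forces would enter as additional terms in the per-node force sum.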
The figure below shows the full face mesh. In this figure, we have individualized the shape of the mesh by adapting it to a subject's morphology using data from a Cyberware scanner, a 3-D laser rangefinder. The scanner provides a range map, which is used to reproduce the subject's morphology, and a texture map (shown below), which is used to simulate the subject's skin quality.
The red lines on the face mesh represent the lines of action of the modeled facial muscles. The lines of action, origins, insertions, and physiological cross-sectional areas are based on the anatomy literature and on our own measurements of muscle geometry in cadavers. Our muscle model is a variant of the standard Hill model and includes the dependence of force on muscle length and velocity.
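As a rough illustration of a Hill-type force computation, the sketch below scales a maximum isometric force (physiological cross-sectional area times a specific tension) by activation, a force-length factor, and a force-velocity factor. The curve shapes and every numeric default are assumptions chosen for illustration (a Gaussian force-length curve, the classic Hill hyperbola for shortening, and a simple saturating curve for lengthening); they are not the parameters of the model described here.

```python
import math


def force_length(norm_length, width=0.45):
    """Active force-length factor; peaks at 1.0 at optimal length.

    Gaussian shape and width are illustrative assumptions.
    """
    return math.exp(-((norm_length - 1.0) / width) ** 2)


def force_velocity(norm_velocity, curvature=0.25, eccentric_max=1.5):
    """Force-velocity factor.

    norm_velocity is shortening velocity divided by maximum shortening
    velocity (positive = shortening, negative = lengthening).
    """
    if norm_velocity >= 1.0:
        return 0.0  # no force at or beyond maximum shortening velocity
    if norm_velocity >= 0.0:
        # Normalized Hill hyperbola: 1 at v = 0, falling to 0 at v = 1.
        return (1.0 - norm_velocity) / (1.0 + norm_velocity / curvature)
    # Lengthening: force rises smoothly from 1 toward eccentric_max
    # (assumed saturating form, continuous with the hyperbola at v = 0).
    stretch_rate = -norm_velocity
    return 1.0 + (eccentric_max - 1.0) * stretch_rate / (stretch_rate + curvature)


def muscle_force(activation, length, shortening_velocity,
                 optimal_length, max_shortening_velocity,
                 pcsa_cm2, specific_tension=30.0):
    """Active muscle force in newtons.

    specific_tension (N/cm^2) is a placeholder value, not a measured one.
    """
    activation = min(max(activation, 0.0), 1.0)  # clamp to [0, 1]
    max_isometric_force = pcsa_cm2 * specific_tension
    return (activation * max_isometric_force
            * force_length(length / optimal_length)
            * force_velocity(shortening_velocity / max_shortening_velocity))
```

In a mesh simulation, the resulting scalar force would be applied along the muscle's line of action between its origin and insertion.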
At present, we can drive the model in two ways: 1) by simulating the activation of several facial muscles during various facial gestures or 2) by using processed electromyographic (EMG) recordings from a subject's actual facial muscles. In the animation below you can watch the face model when it is driven by EMG recordings from the muscles around the mouth. The speaker is repeating the nonsense utterance /upae/. This animation of the lower face movements was produced using only the EMG recordings and thus several seconds of realistic animation were produced from previously recorded muscle activity.
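The text does not say how the EMG recordings are processed. One common approach, shown here purely as an assumption, is to remove the mean, full-wave rectify the signal, low-pass filter it to obtain an envelope, and normalize the envelope to a 0-1 activation level. The function name, the first-order filter, and the cutoff frequency are all hypothetical choices for this sketch.

```python
import math


def emg_to_activation(samples, sample_rate_hz, cutoff_hz=10.0, max_level=None):
    """Convert a raw EMG trace into a normalized activation envelope.

    Assumed pipeline: mean removal, full-wave rectification, first-order
    low-pass filtering, then normalization to the range [0, 1].
    """
    if not samples:
        return []
    mean = sum(samples) / len(samples)
    rectified = [abs(s - mean) for s in samples]

    # First-order low-pass filter implemented as exponential smoothing.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate_hz)
    envelope = []
    level = 0.0
    for r in rectified:
        level += alpha * (r - level)
        envelope.append(level)

    # Normalize by a reference level (e.g. a maximum voluntary contraction)
    # if one is supplied; otherwise by the peak of this recording.
    peak = max_level if max_level is not None else max(envelope)
    if peak <= 0.0:
        return [0.0] * len(envelope)
    return [min(e / peak, 1.0) for e in envelope]
```

The resulting activation sequence could then be resampled to the simulation time step and supplied as the activation input of each muscle.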
Please note that the speed at which the movie plays will be determined by the computer on which it is viewed. It does not represent the real-time speed of the animation.