The central component of the program is the face graph. It combines bone poses and morph targets into a single target, which is driven by animation curves. These curves are generated by a speech recognition system but can also be edited manually. The resulting facial animation is then output to the game engine frame by frame, as either bone transformations or morph target weights.
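The idea of a combined target driven by a curve can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class names `Curve` and `CombinedTarget`, the `bake` helper, the linear interpolation between keys, and the target and bone names are hypothetical and do not reflect the program's actual data model or API.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Curve:
    """Piecewise-linear animation curve given as sorted (time, value) keys."""
    keys: list

    def evaluate(self, t):
        times = [k[0] for k in self.keys]
        # Clamp outside the keyed range.
        if t <= times[0]:
            return self.keys[0][1]
        if t >= times[-1]:
            return self.keys[-1][1]
        # Interpolate linearly between the two surrounding keys.
        i = bisect_right(times, t)
        (t0, v0), (t1, v1) = self.keys[i - 1], self.keys[i]
        return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


@dataclass
class CombinedTarget:
    """Hypothetical graph node: one curve value drives several outputs."""
    # Morph target name -> weight when the curve value is 1.0.
    morph_weights: dict = field(default_factory=dict)
    # Bone name -> rotation offset in degrees when the curve value is 1.0.
    bone_offsets: dict = field(default_factory=dict)

    def evaluate(self, value):
        morphs = {name: w * value for name, w in self.morph_weights.items()}
        bones = {name: r * value for name, r in self.bone_offsets.items()}
        return morphs, bones


def bake(curve, target, duration, fps):
    """Sample the curve once per frame and return the per-frame outputs."""
    frame_count = int(duration * fps) + 1
    return [target.evaluate(curve.evaluate(f / fps)) for f in range(frame_count)]


# Assumed example: a single "open" target that opens the mouth and rotates the jaw.
open_target = CombinedTarget(morph_weights={"mouth_open": 1.0},
                             bone_offsets={"jaw": 20.0})
open_curve = Curve([(0.0, 0.0), (0.5, 1.0), (1.0, 0.0)])
frames = bake(open_curve, open_target, duration=1.0, fps=4)
```

Each entry of `frames` holds the morph target weights and bone offsets for one frame, which corresponds to the frame-by-frame output described above. A real face graph would likely allow nodes to feed into other nodes; this sketch keeps a single level for brevity.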
To produce realistic animation curves, the program recognizes 42 phonemes and attempts to account for co-articulation, the way neighbouring sounds influence how each phoneme is articulated. It also attempts to generate head movement, blinks, and eyebrow raises from emphasis points in the audio file.
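One simple way to approximate co-articulation is to let each phoneme's mouth shape fade in shortly before the phoneme starts and fade out shortly after it ends, so that adjacent shapes overlap. The sketch below shows that idea only; it is not the program's actual rule set. The `PHONEME_TO_VISEME` table, the phoneme labels, the `overlap` parameter, and the function name `viseme_weights` are all hypothetical.

```python
# Hypothetical mapping for a handful of phonemes; a real system covers many more.
PHONEME_TO_VISEME = {"M": "closed", "AA": "open", "UW": "round"}


def viseme_weights(segments, t, overlap=0.05):
    """Return viseme -> weight at time t.

    segments is a list of (phoneme, start, end) tuples in seconds.
    Each phoneme has full weight inside its interval and ramps linearly
    over `overlap` seconds on either side, so neighbours blend together.
    """
    weights = {}
    for phoneme, start, end in segments:
        viseme = PHONEME_TO_VISEME[phoneme]
        if start <= t <= end:
            w = 1.0
        elif start - overlap < t < start:
            w = (t - (start - overlap)) / overlap   # fading in
        elif end < t < end + overlap:
            w = ((end + overlap) - t) / overlap     # fading out
        else:
            continue
        weights[viseme] = max(weights.get(viseme, 0.0), w)
    # Normalize so the active weights sum to 1.
    total = sum(weights.values())
    return {v: w / total for v, w in weights.items()} if total else {}


# Assumed timing for the word "ma": lips closed, then open.
segments = [("M", 0.0, 0.1), ("AA", 0.1, 0.3)]
early = viseme_weights(segments, 0.02)   # only the closed shape is active
blend = viseme_weights(segments, 0.08)   # the open shape is already fading in
late = viseme_weights(segments, 0.2)     # only the open shape is active
```

Sampling `viseme_weights` once per frame would yield curves of the kind that drive the face graph. Linear ramps and a fixed overlap are the simplest possible choice; practical systems typically use phoneme-specific rules.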