When you’re designing facial capture protocols, there are many things you need to consider to prevent poor-quality data and participant fatigue.
Factors like…
- what poses you choose
- how you sequence the poses
- how you explain/show the poses, etc.
…make a huge difference in how your sessions can turn out.
In my work with major game and tech companies to refine their facial performance capture pipelines, it has become quite obvious that people are recycling similar, dated protocols. Given what many captures are used for today, these old protocols are often clunky, with illogical pose combinations, redundant expressions, and inefficient flow.
There is an endless list of dos and don’ts for facial capture design, but here are a few general pointers:
1. Be use-case-minded!
- Are you defining machine learning training data for real-time applications? Are you gathering extreme poses for high-intensity fight scenes in film or games? Think about your end goal.
- If your purpose is to capture facial data for an avatar product aimed toward co-working, you’d likely want to prioritize prosocial, collaborative, and natural facial expressions. Stop wasting time, energy, and production budget by bloating your session with every possible unsightly “scream” pose or hyper-compressed “lemon” face. Save those for Diablo and Planet of the Apes.
2. Design for logical flow.
- Group similar expressions together, e.g. keep brow-based FACS poses (action units like inner brow raiser, outer brow raiser, brow lowerer) in one section and eye-based FACS poses (action units like upper lid raiser, lid tightener, cheek raiser, eye closure, blink, wink) in another.
- Go from easy to difficult within each section. If you move from easy to difficult across ALL poses, you will end up forcing your user to leap from eyes to mouth to brows to jaw.
- Grouping expressions strategically not only helps with user fatigue and comprehension, but it also opens up the opportunity to order your poses so that each description can build on the previous ones.
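
To make the ordering rule concrete, here is a minimal Python sketch of "group by facial region, then go easy to difficult within each group." The `Pose` class, the `REGION_ORDER` list, and the difficulty ratings are all hypothetical placeholders for illustration, not values from a real protocol; you would substitute your own pose list and ratings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose:
    name: str
    region: str      # facial region the pose belongs to, e.g. "brows" or "eyes"
    difficulty: int  # 1 = easy; placeholder ratings, not measured values


# Assumed section order for the session (hypothetical).
REGION_ORDER = ["brows", "eyes", "mouth", "jaw"]


def sequence_poses(poses):
    """Order poses by region first, then easy to difficult within each region."""
    return sorted(poses, key=lambda p: (REGION_ORDER.index(p.region), p.difficulty))


def region_switches(poses):
    """Count how often the participant has to jump to a different facial region."""
    return sum(1 for a, b in zip(poses, poses[1:]) if a.region != b.region)


poses = [
    Pose("wink", "eyes", 3),
    Pose("brow lowerer", "brows", 2),
    Pose("blink", "eyes", 1),
    Pose("inner brow raiser", "brows", 1),
    Pose("outer brow raiser", "brows", 3),
    Pose("upper lid raiser", "eyes", 2),
]

grouped = sequence_poses(poses)
# For comparison: easy to difficult across ALL poses, ignoring regions.
global_order = sorted(poses, key=lambda p: p.difficulty)

for pose in grouped:
    print(f"{pose.region:>5}  {pose.name}")
print("region switches (grouped):", region_switches(grouped))
print("region switches (global): ", region_switches(global_order))
```

With these placeholder ratings, the grouped order needs a single jump between regions, while the global easy-to-difficult order bounces between eyes and brows several times.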
3. Make sure your example imagery and descriptions match the target pose and intended blendshapes.
- Too many times, I see prompts like “raise brows without widening eyes” – yet the actor in the example shot is clearly widening their eyes. A large percentage of users will do as they see, not as they hear or read. So, don’t give conflicting instructions, and make sure you rigorously review the example poses!
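
If your example shots are annotated with the actions they actually show, part of this review can be automated. The sketch below is one possible way to do it in Python, assuming someone has manually tagged each example image; the dictionary keys (`target`, `avoid`, `example_shows`) and the `find_conflicts` helper are made up for illustration, and mapping "widening eyes" to the upper lid raiser action is my assumption here.

```python
def find_conflicts(prompt):
    """Compare what a prompt asks for with what its example image shows."""
    target = set(prompt["target"])        # actions the pose should contain
    avoid = set(prompt["avoid"])          # actions the text says to leave out
    shown = set(prompt["example_shows"])  # actions tagged in the example shot
    return {
        "missing": sorted(target - shown),  # requested but not demonstrated
        "unwanted": sorted(avoid & shown),  # excluded in text, yet demonstrated
    }


# Hypothetical prompt record; the example image tags are hand-annotated.
prompt = {
    "text": "raise brows without widening eyes",
    "target": ["inner brow raiser", "outer brow raiser"],
    "avoid": ["upper lid raiser"],
    "example_shows": ["inner brow raiser", "outer brow raiser", "upper lid raiser"],
}

conflicts = find_conflicts(prompt)
print(conflicts)
```

Any non-empty `unwanted` or `missing` list marks a prompt whose example shot should be reshot or whose wording should change. A check like this only supplements a human review of the example poses; it does not replace one.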
📝 One more tip: In the video, notice how I prompt the action known as chin raiser by saying “push up your bottom lip.” It’s easy to get stuck in describing poses based on their formal names, but by using more accessible descriptors you can increase the likelihood of your participant hitting the pose.



