Bypassing VR Headset Occlusion in Face Tracking – PREVIEW

VR headset occlusion guide


Tracking What You Cannot See

Throughout my years working in face tracking, I observed a superstition among engineers and researchers that you cannot track facial landmarks (features like the eyebrows, eyes, and mouth) when they fall outside the camera view. This belief is not entirely true. You don’t need to see an eyebrow to know whether it’s raising or furrowing, and you don’t need to see the nose to know when it’s wrinkling.

Our faces bulge, stretch, and wrinkle uniquely with each facial action. I have used these changes to train labelers to recognize and accurately classify discrete expressions from skin movement alone. With well-directed documentation, a comprehensive set of examples, and a high-quality camera, you can extrapolate a great deal of information from a limited field of view (FOV).

The minimal FOV required for eye tracking is often enough to track a handful of actions. View A (in the image set below) is closest to what a gaze-based tracker’s view may look like. Though the main goal of eye-tracking cameras is to cover just enough of the eye to observe changes in gaze, their potential is much greater. Even with this concentrated view, you can still detect upper lid raiser (AU5), cheek raiser (AU6), and lid tightener (AU7) with a relatively high degree of certainty. These actions are useful for measuring attention, reactions, and engagement; they are also crucial signals in communication.

Many people get stuck on action unit names like cheek raiser and assume, “We can’t track cheek raiser because our FOV doesn’t cover the cheek area.” But cheek raiser is more than its name reveals; it’s an action caused by the contraction of orbicularis oculi, a muscle that surrounds the eye. While movements of the orbicularis oculi do affect the cheeks, many of the changes actually take place in the eye socket area. As long as you have a marginal view of the [REMOVED FROM PREVIEW] or a sliver of skin under the [REMOVED FROM PREVIEW], you can determine whether or not cheek raiser is occurring. Similar reasoning applies to the other actions listed in the images below.


If you find the content in this post helpful, please read “Big Tech’s Homogenous Hiring Habits” and educate yourself on the importance of valuing cross-disciplinary knowledge in emerging technology.


Capabilities With Different FOVs

This chart shows which action units (AUs) can be detected with various fields of view. Keep in mind that this is an abridged breakdown of what may or may not be possible with different FOVs. (If you wish to learn about predictions for lower face and combination shapes, I am available for consultation.) Conditions will change based on additional factors such as:

  • camera angle
  • how the headset rests on the face (Is the headset heavy? How do its weight and pressure affect various areas of the face?)
  • individual anatomy (I have ideas for how to classify variations of the face by section. Again, available for consultation.)

If you are working on in-headset face-tracking, don’t let assumptions limit your potential. The face is complicated and full of clues. All you need to do is find the right clues, and you can accomplish a lot from a little.


AU1 = inner brow raiser
AU2 = outer brow raiser
AU4 = brow lowerer
AU5 = upper lid raiser
AU6 = cheek raiser
AU7 = lid tightener
AU9 = nose wrinkler
AU10 = upper lip raiser
AU12 = lip corner puller 

green box = detectable at most levels of intensity, robust to facial structure
yellow box = detectable at moderate to high intensity levels, less robust to facial structure
orange box = contingent on intensity level, fallible to certain facial structures

VR headset occlusion guide with various FOVs
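
If you want to carry the legend into your own tooling, here is a minimal Python sketch of one way to do it. The AU names and the three color descriptions are copied from the lists above. Everything else (AU_NAMES, Detectability, VIEW_A_AUS, label, describe) is a hypothetical name chosen for illustration and does not come from any existing library. The only per-view data included is the View A set of AU5, AU6, and AU7 mentioned earlier in this post. The text does not state which color each of those receives, so no tier is assigned to them here.

```python
from enum import Enum

# AU numbers and names, copied from the legend in this post.
AU_NAMES = {
    1: "inner brow raiser",
    2: "outer brow raiser",
    4: "brow lowerer",
    5: "upper lid raiser",
    6: "cheek raiser",
    7: "lid tightener",
    9: "nose wrinkler",
    10: "upper lip raiser",
    12: "lip corner puller",
}


class Detectability(Enum):
    """Color tiers, with the wording used in the chart legend."""
    GREEN = "detectable at most levels of intensity, robust to facial structure"
    YELLOW = "detectable at moderate to high intensity levels, less robust to facial structure"
    ORANGE = "contingent on intensity level, fallible to certain facial structures"


# From the text: an eye-tracking view (View A) can still pick up AU5, AU6, and AU7.
# Assumption: tiers are left out on purpose, because the text does not give them.
VIEW_A_AUS = frozenset({5, 6, 7})


def label(au: int) -> str:
    """Return a readable label such as 'AU6 (cheek raiser)'."""
    return f"AU{au} ({AU_NAMES[au]})"


def describe(au: int, tier: Detectability) -> str:
    """Pair an AU label with the legend text for a tier you read off the chart."""
    return f"{label(au)}: {tier.value}"


if __name__ == "__main__":
    for au in sorted(VIEW_A_AUS):
        print(label(au))
```

Keeping the tiers as an enum instead of free-form strings means a labeling tool cannot silently accept a misspelled color. You would fill in per-view tiers yourself from the chart and from your own headset’s camera placement.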
