With the development of deep learning techniques, recent methods that learn human motion priors can achieve motion capture from only 6 inertial measurement units (IMUs), even though the sensor inputs are sparse and noisy. However, reconstructing global human motion from IMUs remains challenging. This paper addresses the problem by incorporating physics. We propose a physical optimization scheme based on multiple contacts that enables physically plausible translation estimation in full 3D space, including z-directional motion, which previous works typically struggle with. We also account for gravity in local pose estimation, which constrains the global orientation of the body and refines the local pose in a joint estimation manner. Experiments demonstrate that our method achieves more accurate motion capture for both local poses and global motions. Furthermore, by deeply integrating physics, our method can also estimate 3D contacts, contact forces, joint torques, and interacting proxy surfaces.
This work was supported by the National Key R&D Program of China (2023YFC3305600), the Zhejiang Provincial Natural Science Foundation (LDT23F02024F02), and the NSFC (No. 61822111, 62021002). This work was also supported by THUIBCS, Tsinghua University, and BLBCI, Beijing Municipal Education Commission. The authors would like to thank Wenbin Lin and Yunzhe Shao for their help with the live demos. Feng Xu is the corresponding author.