
Sparse Inertial Motion Capture

IMU-based pose estimation and sensor fusion

Human motion capture traditionally relies on marker-based optical systems (Vicon, OptiTrack) or dense sensor suits, both of which require controlled environments and expensive infrastructure. Sparse inertial motion capture has emerged as a compelling alternative, using only six Inertial Measurement Units (IMUs) placed at key body locations to reconstruct full-body pose in unconstrained environments (Yi, Zhou, and Xu 2021). Unlike vision-based methods, IMU-based approaches are immune to occlusion and work in any lighting condition, though they face challenges from sensor drift, noise, and the inherent ambiguity of mapping sparse measurements to full-body pose.

Early learning-based methods applied recurrent neural networks to model temporal dependencies in IMU sequences, but struggled with physical plausibility, producing motions with floating, sliding, or ground-penetration artifacts. Physical Inertial Poser (PIP) (Yi et al. 2022) introduced physics-aware optimization to enforce ground contact constraints and biomechanical feasibility, combining neural kinematics estimation with physics-based refinement. Physical Non-Inertial Poser (PNP) (Yi, Zhou, and Xu 2024) extended this by modeling fictitious forces in non-inertial reference frames, addressing errors that arise when the pelvis (root) undergoes significant acceleration or rotation.
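To make the non-inertial effect concrete, the following minimal sketch evaluates the textbook fictitious (apparent) acceleration of a point observed from an accelerating, rotating root frame. It is not the PNP implementation; the function name `fictitious_acceleration` and its argument names are illustrative assumptions, and all vectors are assumed to be expressed in the same coordinate basis.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def scale(s, v):
    """Multiply a 3-vector by a scalar."""
    return tuple(s * c for c in v)


def add(*vectors):
    """Component-wise sum of any number of 3-vectors."""
    return tuple(sum(c) for c in zip(*vectors))


def fictitious_acceleration(a_root, omega, alpha, r, v):
    """Apparent acceleration of a point seen from the moving root frame.

    a_root: linear acceleration of the root frame origin
    omega:  angular velocity of the root frame
    alpha:  angular acceleration of the root frame
    r, v:   position and velocity of the point relative to the root frame
    """
    linear = scale(-1.0, a_root)                              # frame translation
    euler = scale(-1.0, cross(alpha, r))                      # angular acceleration
    coriolis = scale(-2.0, cross(omega, v))                   # motion within the frame
    centrifugal = scale(-1.0, cross(omega, cross(omega, r)))  # frame rotation
    return add(linear, euler, coriolis, centrifugal)


# Hypothetical example: root spinning at 2 rad/s about z, point 1 m away on x.
zero = (0.0, 0.0, 0.0)
print(fictitious_acceleration(zero, (0.0, 0.0, 2.0), zero, (1.0, 0.0, 0.0), zero))
```

In this example only the centrifugal term is non-zero, and it points outward along +x with magnitude ω²r = 4 m/s². An estimator that treats the root frame as inertial would attribute such terms to real body motion.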

Calibration presents a fundamental challenge: sensors must be precisely aligned to body segments, and magnetometer drift causes heading errors over time. Traditional calibration requires the user to hold static poses (T-pose, N-pose), which interrupts the workflow and fails when sensors shift afterwards. Transformer IMU Calibrator (TIC) (Zuo et al. 2025) achieves dynamic, implicit calibration by learning to estimate calibration matrices from diverse motion patterns, enabling seamless "put on and use" operation without explicit calibration procedures.
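The traditional static procedure can be sketched as follows. This is a simplified illustration, not the method of any cited paper: it assumes the sensor reports its orientation as a 3x3 rotation matrix in a global frame that is already aligned with the body model's frame, and that the bone orientation in the T-pose is known (identity here). The helper names `calibrate_offset` and `bone_orientation` are hypothetical.

```python
import math

IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested sequences."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(m):
    """Transpose of a 3x3 matrix (the inverse, for a rotation)."""
    return [[m[j][i] for j in range(3)] for i in range(3)]


def rot_z(degrees):
    """Rotation about the z axis, used here only to build test inputs."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def calibrate_offset(sensor_at_tpose, bone_at_tpose=IDENTITY):
    """Sensor-to-bone offset captured while the user holds the T-pose."""
    return matmul(transpose(sensor_at_tpose), bone_at_tpose)


def bone_orientation(sensor_now, offset):
    """Bone orientation recovered from a live sensor reading."""
    return matmul(sensor_now, offset)


# Hypothetical scenario: the sensor is strapped on with a 30 degree twist,
# then the limb rotates 45 degrees about z.
offset = calibrate_offset(rot_z(30))
bone = bone_orientation(matmul(rot_z(45), rot_z(30)), offset)
```

Here `bone` matches `rot_z(45)` up to floating-point error. The offset is only valid while the sensor stays where it was during the T-pose; if the strap slips, the stale offset silently biases every later estimate, which is the failure mode that dynamic calibration targets.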

Multi-modal fusion enhances accuracy by combining IMUs with complementary sensors. DiffCap (Pan et al. 2025) fuses sparse IMUs with a monocular camera using diffusion models, where visual information provides dense constraints when available and the IMUs ensure robustness during occlusion. BaroPoser (Zhang, Yi, and Xu 2025) incorporates barometric pressure from everyday devices (smartphones, smartwatches) to estimate height changes, enabling motion capture on non-flat terrain. Ultra Inertial Poser (Armani et al. 2024) adds ultra-wideband (UWB) ranging between sensors, providing absolute inter-sensor distances that dramatically reduce drift and jitter.
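As an illustration of why a barometer carries height information, the sketch below converts a pressure change into a height change using the standard isothermal hypsometric relation. This is generic atmospheric physics, not BaroPoser's pipeline; the function name `height_change`, the hPa units, and the default temperature of 288.15 K are assumptions made for the example.

```python
import math

R_GAS = 8.314462618  # universal gas constant, J/(mol*K)
M_AIR = 0.0289644    # molar mass of dry air, kg/mol
G0 = 9.80665         # standard gravity, m/s^2


def height_change(p_ref_hpa, p_hpa, temp_kelvin=288.15):
    """Height gained (metres) when pressure goes from p_ref_hpa to p_hpa.

    Assumes a constant air temperature between the two readings.
    Positive result means the sensor moved upward.
    """
    if p_ref_hpa <= 0 or p_hpa <= 0:
        raise ValueError("pressures must be positive")
    scale_height = R_GAS * temp_kelvin / (M_AIR * G0)
    return scale_height * math.log(p_ref_hpa / p_hpa)


# Hypothetical readings: pressure drops by 1 hPa near sea level.
print(round(height_change(1013.25, 1012.25), 2))
```

Near sea level a 1 hPa drop corresponds to roughly 8 m of ascent, so a single stair step changes the pressure by only a few hundredths of a hectopascal. Raw barometer readings are therefore noisy at body scale, which is why they are fused with inertial data rather than used alone.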

Recent work addresses practical deployment challenges. Loose Inertial Poser (Zuo et al. 2024) enables motion capture from sensors embedded in loose-fitting clothing by modeling secondary motion effects. MagShield (Shao et al. 2025) detects and corrects magnetic disturbances that corrupt orientation estimates in real-world environments. Group Inertial Poser (Xue et al. 2025) extends to multi-person tracking by leveraging inter-person UWB distances for relative positioning.

The field has converged on four ingredients: (1) deep neural networks for learning motion priors from large datasets, (2) physics-based refinement for physical plausibility, (3) multi-modal fusion for enhanced accuracy, and (4) dynamic calibration for practical deployment. Global translation estimation, particularly in the vertical direction, remains challenging, with physics-based contact reasoning (Yi, Pan, and Xu 2025) providing the current best solution.

References

Armani, Rayan, Changlin Qian, Jiaxi Jiang, and Christian Holz. 2024. "Ultra Inertial Poser: Scalable Motion Capture and Tracking from Sparse Inertial Sensors and Ultra-Wideband Ranging." arXiv. https://doi.org/10.48550/arXiv.2404.19541.
Pan, Shaohua, Xinyu Yi, Yan Zhou, Weihua Jian, Yuan Zhang, Pengfei Wan, and Feng Xu. 2025. "DiffCap: Diffusion-Based Real-Time Human Motion Capture Using Sparse IMUs and a Monocular Camera." arXiv. https://doi.org/10.48550/arXiv.2508.06139.
Shao, Yunzhe, Xinyu Yi, Lu Yin, Shihui Guo, Junhai Yong, and Feng Xu. 2025. "MagShield: Towards Better Robustness in Sparse Inertial Motion Capture Under Magnetic Disturbances." arXiv. https://doi.org/10.48550/arXiv.2506.22907.
Xue, Ying, Jiaxi Jiang, Rayan Armani, Dominik Hollidt, Yi-Chi Liao, and Christian Holz. 2025. "Group Inertial Poser: Multi-Person Pose and Global Translation from Sparse Inertial Sensors and Ultra-Wideband Ranging." arXiv. https://doi.org/10.48550/arXiv.2510.21654.
Yi, Xinyu, Shaohua Pan, and Feng Xu. 2025. "Improving Global Motion Estimation in Sparse IMU-Based Motion Capture with Physics." arXiv. https://doi.org/10.48550/arXiv.2505.05010.
Yi, Xinyu, Yuxiao Zhou, Marc Habermann, Soshi Shimada, Vladislav Golyanik, Christian Theobalt, and Feng Xu. 2022. "Physical Inertial Poser (PIP): Physics-Aware Real-Time Human Motion Tracking from Sparse Inertial Sensors." arXiv. https://doi.org/10.48550/arXiv.2203.08528.
Yi, Xinyu, Yuxiao Zhou, and Feng Xu. 2021. "TransPose: Real-Time 3D Human Translation and Pose Estimation with Six Inertial Sensors." arXiv. https://doi.org/10.48550/arXiv.2105.04605.
Yi, Xinyu, Yuxiao Zhou, and Feng Xu. 2024. "Physical Non-Inertial Poser (PNP): Modeling Non-Inertial Effects in Sparse-Inertial Human Motion Capture." arXiv. https://doi.org/10.48550/arXiv.2404.19619.
Zhang, Libo, Xinyu Yi, and Feng Xu. 2025. "BaroPoser: Real-Time Human Motion Tracking from IMUs and Barometers in Everyday Devices." In Proceedings of the 38th Annual ACM Symposium on User Interface Software and Technology, 1-9. https://doi.org/10.1145/3746059.3747731.
Zuo, Chengxu, Jiawei Huang, Xiao Jiang, Yuan Yao, Xiangren Shi, Rui Cao, Xinyu Yi, Feng Xu, Shihui Guo, and Yipeng Qin. 2025. "Transformer IMU Calibrator: Dynamic On-Body IMU Calibration for Inertial Motion Capture." ACM Transactions on Graphics 44 (4): 1-14. https://doi.org/10.1145/3730937.
Zuo, Chengxu, Yiming Wang, Lishuang Zhan, Shihui Guo, Xinyu Yi, Feng Xu, and Yipeng Qin. 2024. "Loose Inertial Poser: Motion Capture with IMU-Attached Loose-Wear Jacket." In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2209-19. Seattle, WA, USA: IEEE. https://doi.org/10.1109/CVPR52733.2024.00215.