UniFacePoint-FM: A Foundation Model for Generalizable 3D Facial Representation Learning and Multi-Attribute Prediction
Li, D.; Fu, C.-H.; Tang, K.
The human face is a rich medium for biometric, behavioral, and clinical information. However, technologies based on 2D facial images lack critical geometric detail and are susceptible to pose and illumination variation, while 3D facial deep learning frameworks are hindered by complex annotation, heavy preprocessing, and task-specific designs with poor cross-domain generalization. To address these challenges, we propose UniFacePoint-FM, a 3D facial foundation model built on a self-supervised Point-MAE framework and tailored for high-fidelity point cloud representation learning. The model was pretrained on a self-constructed dataset of high-resolution 3D facial scans, then fine-tuned with supervision and comprehensively evaluated on three independent datasets across diverse downstream tasks. Experimental results demonstrate that UniFacePoint-FM is both pretraining-efficient and highly generalizable: it achieves state-of-the-art performance on gender classification, age regression, and BMI prediction, and matches the accuracy of the ResMLP model (while outperforming other baselines) on facial expression recognition. Notably, by learning high-quality, fine-grained representations directly from raw point clouds, UniFacePoint-FM delivers robust generalization and transferability across tasks, datasets, and even different face scanning platforms. Overall, our work establishes an effective foundation model paradigm for 3D facial analysis, with promising implications for biometric security, health monitoring, and advanced human-computer interaction systems.
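The Point-MAE pretraining idea referenced in the abstract can be illustrated with a minimal sketch: group a point cloud into local patches, mask most of them, and score a reconstruction of the masked patches with a Chamfer-style loss. This is a hedged NumPy illustration, not the authors' implementation; the patch grouping (random centers in place of farthest-point sampling plus k-NN), the mask ratio, and the trivial stand-in "prediction" are all placeholder assumptions.

```python
# Illustrative sketch of masked point-patch autoencoding (Point-MAE-style);
# all names, shapes, and hyperparameters here are assumptions for exposition.
import numpy as np

rng = np.random.default_rng(0)

def make_patches(points, num_patches=16, patch_size=32):
    """Group a point cloud (N, 3) into local patches around sampled centers
    (a stand-in for farthest-point sampling + k-NN grouping)."""
    centers = points[rng.choice(len(points), num_patches, replace=False)]
    d = np.linalg.norm(points[None, :, :] - centers[:, None, :], axis=-1)
    idx = np.argsort(d, axis=1)[:, :patch_size]          # k nearest points
    return points[idx] - centers[:, None, :]             # center-normalized

def mask_patches(patches, ratio=0.6):
    """Randomly hide a high ratio of patches; the model must rebuild them."""
    m = rng.random(len(patches)) < ratio
    return patches[~m], patches[m]                       # visible, masked targets

def chamfer(a, b):
    """Symmetric Chamfer distance between two (K, 3) point sets."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

face = rng.normal(size=(1024, 3))                        # toy stand-in for a 3D scan
patches = make_patches(face)
visible, target = mask_patches(patches)
# In real pretraining, a transformer encoder embeds `visible` and a decoder
# predicts the masked patches; here a trivial all-zeros "prediction" just
# shows how the reconstruction loss would be scored.
loss = np.mean([chamfer(np.zeros_like(p), p) for p in target])
```

After pretraining with such a reconstruction objective, the encoder is kept and a lightweight task head is fine-tuned for downstream prediction (classification or regression), which is the paradigm the abstract describes.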