This work presents TTFDNet, a transformer-based network that uses transfer learning for end-to-end depth estimation from single-frame fringe patterns in fringe projection profilometry. TTFDNet comprises a precise contour and coarse depth (PCCD) pre-processor, a global multi-dimensional fusion (GMDF) module and a progressive depth extractor (PDE). It applies transfer learning through fringe structure consistency evaluation (FSCE) so that the benefits of the transformer can be exploited even on a small dataset. Tested on 208 scenes, the model achieved a mean absolute error (MAE) of 0.00372 mm, outperforming Unet (0.03458 mm), PDE (0.01063 mm) and PCTNet (0.00518 mm). It measured a 25.4 mm radius ball with a deviation of ~90 μm and a 20 mm thick metal part with a deviation of ~6 μm. Additionally, TTFDNet generalized well and remained robust in dynamic reconstruction and under varied imaging conditions, making it suitable for practical applications in manufacturing, automation and computer vision.
Keywords: deep learning; depth estimation; fringe projection profilometry; transfer learning.
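
For reference, the MAE figures quoted above compare a predicted depth map with a ground-truth depth map pixel by pixel. The following is a minimal sketch of such a metric, not the evaluation code used in this work. The function names `mean_absolute_error` and `dataset_mae`, the optional validity mask, and the choice to average per scene before averaging over scenes are all assumptions made for illustration.

```python
import numpy as np


def mean_absolute_error(pred, target, mask=None):
    """Mean absolute depth error between two depth maps (same units as the input).

    `mask` is a hypothetical optional boolean array marking valid pixels;
    whether the original evaluation masks the background is not stated.
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")

    abs_err = np.abs(pred - target)
    if mask is None:
        return float(abs_err.mean())

    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        raise ValueError("mask selects no pixels")
    return float(abs_err[mask].mean())


def dataset_mae(pairs):
    """Average the per-scene MAE over an iterable of (pred, target) pairs.

    Averaging per scene first is an assumption; pooling all pixels would
    give a different value if scenes differed in size.
    """
    return float(np.mean([mean_absolute_error(p, t) for p, t in pairs]))
```

With depth maps expressed in millimetres, `dataset_mae` returns a value in millimetres, the same unit as the figures reported above.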