Forensic feature extraction of document paper using periodic marks: PCA and t-SNE for manufacturer discrimination and document dating

Forensic Sci Int. 2024 Dec 20:367:112348. doi: 10.1016/j.forsciint.2024.112348. Online ahead of print.

Abstract

Paper differentiation can play a critical role in forensic document examination, alongside handwriting identification, impressed-writing examination, and ink and printer toner analyses. If a reference database is available for comparison, paper analysis is also useful for determining when the document paper was produced. In this study, two datasets were constructed, one for manufacturer discrimination and one for document paper dating, and each was analyzed using principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE). A database of the angle and step data of the periodic marks with the 10 highest intensities was established using a two-dimensional lab formation sensor. Model performance was evaluated using clustering indexes: the silhouette index, normalized mutual information, the Calinski-Harabasz index, and the Davies-Bouldin index. Periodic-mark analysis with an unsupervised machine learning model was performed to differentiate manufacturers and to investigate the production date in cases where the forming fabric had been changed. We found that forensic differentiation of paper is feasible using a combined PCA and t-SNE model on test document data and the two datasets, because the forming fabric of a paper-making machine inevitably leaves periodic marks on the surface of the paper. Our findings demonstrate that these periodic marks can play a key role in forensic feature extraction, and the combined PCA and t-SNE model showed high performance on both target tasks.
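The analysis pipeline described in the abstract (PCA followed by t-SNE, scored with the four clustering indexes) can be illustrated with a minimal sketch in Python using scikit-learn. This is not the authors' implementation. The synthetic data, the 20-feature layout (10 angles plus 10 steps per sheet), the standardization step, the use of k-means to obtain cluster labels, and every function name and parameter value below are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    normalized_mutual_info_score,
    silhouette_score,
)
from sklearn.preprocessing import StandardScaler


def make_synthetic_marks(n_makers=3, n_sheets=40, n_peaks=10, seed=0):
    """Hypothetical stand-in for the periodic-mark database.

    Each sheet is described by n_peaks angles (degrees) and n_peaks steps
    (arbitrary length units). Each maker gets its own random pattern, and
    individual sheets are noisy copies of that pattern.
    """
    rng = np.random.default_rng(seed)
    noise_scale = np.concatenate([np.full(n_peaks, 1.0), np.full(n_peaks, 0.02)])
    features, labels = [], []
    for maker in range(n_makers):
        centre = np.concatenate(
            [rng.uniform(0.0, 180.0, n_peaks), rng.uniform(0.2, 2.0, n_peaks)]
        )
        noise = rng.normal(size=(n_sheets, 2 * n_peaks)) * noise_scale
        features.append(centre + noise)
        labels.append(np.full(n_sheets, maker))
    return np.vstack(features), np.concatenate(labels)


def embed_and_score(X, y, n_pca=10, seed=0):
    """Standardize, reduce with PCA, embed with t-SNE, then score clusters."""
    X_std = StandardScaler().fit_transform(X)
    X_pca = PCA(n_components=n_pca, random_state=seed).fit_transform(X_std)
    X_2d = TSNE(
        n_components=2, perplexity=30.0, init="pca", random_state=seed
    ).fit_transform(X_pca)

    # Assumption: k-means on the 2-D embedding supplies the predicted labels
    # that normalized mutual information compares against the known makers.
    pred = KMeans(
        n_clusters=len(np.unique(y)), n_init=10, random_state=seed
    ).fit_predict(X_2d)

    scores = {
        "silhouette": silhouette_score(X_2d, y),
        "nmi": normalized_mutual_info_score(y, pred),
        "calinski_harabasz": calinski_harabasz_score(X_2d, y),
        "davies_bouldin": davies_bouldin_score(X_2d, y),
    }
    return X_2d, scores


if __name__ == "__main__":
    X, y = make_synthetic_marks()
    _, scores = embed_and_score(X, y)
    for name, value in scores.items():
        print(f"{name}: {value:.3f}")
```

Higher silhouette, normalized mutual information, and Calinski-Harabasz values indicate better-separated groups, whereas a lower Davies-Bouldin value is better. In this sketch three of the indexes are computed against the known maker labels, and only normalized mutual information uses the k-means assignments.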

Keywords: Forensic document examination; Forming fabric; questioned document; unsupervised model; weave marks.