As the internet expands and technology advances, distinguishing real news from fake news is becoming increasingly difficult, leaving users exposed to misinformation. The existing literature focuses primarily on analyzing individual features of fake news and largely overlooks recognition based on multimodal feature fusion. Compared with single-modal approaches, multimodal fusion captures richer and more comprehensive information from different data modalities (such as text and images), thereby improving model performance. This study proposes a multimodal fusion model for identifying fake news, with the aim of curbing misinformation. The framework cleans the textual and visual data, extracts features from each modality, and combines them using early fusion, joint fusion and late fusion strategies before classification. The resulting classifier achieves accuracies of 85% and 90% on the Gossipcop and Fakeddit datasets, with F1-scores of 90% and 88%. The study also reports results across different training periods, demonstrating that fusing text and image recognition is effective for combating fake news. This research contributes to addressing the critical issue of misinformation and emphasizes a comprehensive approach to improving detection accuracy.
Keywords: Fake news detection; deep learning; multimodal fusion; natural language processing; text classification.
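
The three fusion strategies named in the abstract can be illustrated with a minimal sketch. The abstract does not specify how they are implemented in the proposed framework, so the code below follows the common textbook definitions only: early fusion concatenates the modality features before a single classifier, joint fusion passes each modality through its own encoder and classifies the concatenated hidden representations (with all parts normally trained end to end), and late fusion combines the outputs of separate per-modality classifiers. All function names, feature dimensions, the batch size and the randomly initialised weights are hypothetical placeholders, and only the forward pass is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def linear(in_dim, out_dim):
    """Return randomly initialised (untrained) weights and a zero bias."""
    return rng.normal(scale=0.1, size=(in_dim, out_dim)), np.zeros(out_dim)

# Hypothetical pre-extracted features for a batch of 4 news items.
text_feats = rng.normal(size=(4, 16))   # e.g. output of a text encoder
image_feats = rng.normal(size=(4, 32))  # e.g. output of an image encoder

def early_fusion(text, image, clf):
    """Concatenate the modality features, then apply one classifier."""
    W, b = clf
    fused = np.concatenate([text, image], axis=1)
    return sigmoid(fused @ W + b).ravel()

def joint_fusion(text, image, text_enc, image_enc, clf):
    """Encode each modality, concatenate the hidden states, then classify.

    In practice the encoders and the classifier are trained together;
    here only the forward pass with untrained weights is shown.
    """
    Wt, bt = text_enc
    Wi, bi = image_enc
    W, b = clf
    h_text = np.tanh(text @ Wt + bt)
    h_image = np.tanh(image @ Wi + bi)
    fused = np.concatenate([h_text, h_image], axis=1)
    return sigmoid(fused @ W + b).ravel()

def late_fusion(text, image, text_clf, image_clf, text_weight=0.5):
    """Classify each modality separately, then average the probabilities."""
    Wt, bt = text_clf
    Wi, bi = image_clf
    p_text = sigmoid(text @ Wt + bt).ravel()
    p_image = sigmoid(image @ Wi + bi).ravel()
    return text_weight * p_text + (1.0 - text_weight) * p_image

# Each call returns one fake-news probability per news item.
p_early = early_fusion(text_feats, image_feats, linear(16 + 32, 1))
p_joint = joint_fusion(text_feats, image_feats,
                       linear(16, 8), linear(32, 8), linear(8 + 8, 1))
p_late = late_fusion(text_feats, image_feats, linear(16, 1), linear(32, 1))
```

The strategies differ only in where the modalities meet: at the input features, at learned intermediate representations, or at the final predictions. The weighted average used for late fusion is one simple choice among several; voting or a small meta-classifier are equally plausible.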