Multimodal analytics in Big Data architectures entails complex configurations of data processing tasks: each data modality requires specific analytics that, in turn, trigger specific processing tasks. Scalability can be achieved only at the cost of careful calibration of the resources shared by these tasks, seeking a trade-off among the multiple requirements they impose. We propose a methodology that addresses multimodal analytics within a single data processing approach, yielding a simplified architecture that fully exploits the parallel processing potential of Big Data infrastructures. Multiple data sources are first integrated into a unified knowledge graph (KG). The different data modalities are then handled by specifying ad hoc views on the KG, each producing a rewriting of the graph that contains only the data to be processed; this speeds up graph traversal and rule extraction. Using graph embedding methods, the ad hoc views are transformed into low-dimensional representations that share the same data format, so that a single machine learning procedure can address all modalities, further simplifying the architecture of our system. Our experiments show that the approach reduces execution cost and improves the accuracy of the analytics.
Keywords: Big Data; big graph; data fusion; multimodal analysis.
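To make the pipeline described above concrete, the following minimal sketch (not the paper's implementation) illustrates the three steps with hypothetical data: heterogeneous sources are merged into a single knowledge graph, an ad hoc view is extracted as a subgraph, and a simple graph embedding (here a truncated SVD of the adjacency matrix, standing in for whatever embedding method is used) produces fixed-size vectors that a single downstream model can consume. All identifiers such as `sensor_triples` are illustrative assumptions.

```python
# Illustrative sketch of the KG -> view -> embedding -> single-model pipeline.
import networkx as nx
import numpy as np
from sklearn.linear_model import LogisticRegression

# 1. Integrate heterogeneous sources into a unified knowledge graph.
kg = nx.MultiDiGraph()
sensor_triples = [("sensor:1", "observes", "room:A"),   # hypothetical sensor data
                  ("sensor:2", "observes", "room:B")]
text_triples = [("doc:17", "mentions", "room:A"),       # hypothetical text data
                ("doc:17", "mentions", "sensor:1")]
for subj, pred, obj in sensor_triples + text_triples:
    kg.add_edge(subj, obj, key=pred, predicate=pred)

# 2. Specify an ad hoc view: keep only the nodes relevant to one analysis.
view_nodes = [n for n in kg if n.startswith(("sensor:", "room:"))]
view = kg.subgraph(view_nodes).copy()

# 3. Embed the view into a low-dimensional space; truncated SVD of the
#    adjacency matrix is used here as a simple stand-in embedding.
nodes = sorted(view.nodes())
adj = nx.to_numpy_array(view, nodelist=nodes)
u, s, _ = np.linalg.svd(adj, full_matrices=False)
dim = min(2, len(s))
embeddings = u[:, :dim] * s[:dim]        # one fixed-size vector per node

# 4. Because every view shares this format, one model serves all modalities.
labels = np.array([n.startswith("sensor:") for n in nodes], dtype=int)  # toy labels
model = LogisticRegression().fit(embeddings, labels)
print(model.predict(embeddings))
```

In a real deployment the SVD step would be replaced by the chosen graph embedding method and the toy labels by the actual analytics target, but the shape of the pipeline stays the same: one KG, per-modality views, a common vector format, and a single learning procedure.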