Enhancing Multimodal Knowledge Graph Representation Learning through Triple Contrastive Learning
Yuxing Lu, Weichen Zhao, Nan Sun, Jinzhuo Wang
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence
Main Track. Pages 5963-5971.
https://doi.org/10.24963/ijcai.2024/659
Multimodal knowledge graphs incorporate multimodal information rather than pure symbols, which significantly enhances the representation of knowledge graphs and their capacity to understand the world. Despite these advancements, existing multimodal fusion techniques still face significant challenges in representing modalities and fully integrating the diverse attributes of entities, particularly when dealing with more than one modality. To address this issue, this article proposes a Knowledge Graph Multimodal Representation Learning (KG-MRI) method. The method uses foundation models to represent the individual modalities and incorporates a triple contrastive learning model together with a dual-phase training strategy to effectively fuse the different modalities with knowledge graph embeddings. We conducted comprehensive comparisons with several knowledge graph embedding methods to validate the effectiveness of KG-MRI. Furthermore, validation on a real-world Non-Alcoholic Fatty Liver Disease (NAFLD) cohort demonstrated that the vector representations learned by our method possess enhanced representational capabilities, showing promise for broader applications in complex multimodal environments.
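To make the "triple contrastive learning" idea concrete, the sketch below shows one plausible reading: pairwise InfoNCE losses summed over three modality views of each entity (graph structure, text, and image features projected into a shared space). This is a minimal illustration, not the authors' implementation; the function names, the choice of symmetric InfoNCE, and the temperature value are all assumptions.

```python
# Hypothetical sketch of a triple contrastive objective over three
# modality views of the same batch of entities. Not the paper's code.
import torch
import torch.nn.functional as F

def info_nce(a, b, temperature=0.07):
    """Symmetric InfoNCE between two batches of embeddings.

    a, b: (batch, dim) tensors; matching rows are positive pairs and
    all other rows in the batch act as in-batch negatives.
    """
    a = F.normalize(a, dim=-1)
    b = F.normalize(b, dim=-1)
    logits = a @ b.t() / temperature  # (batch, batch) similarity matrix
    targets = torch.arange(a.size(0), device=a.device)
    # Cross-entropy in both directions so each view anchors the other.
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))

def triple_contrastive_loss(graph_emb, text_emb, image_emb, temperature=0.07):
    """Assumed form of the triple objective: sum of the three pairwise
    InfoNCE losses, aligning all modality pairs simultaneously."""
    return (info_nce(graph_emb, text_emb, temperature)
            + info_nce(graph_emb, image_emb, temperature)
            + info_nce(text_emb, image_emb, temperature))

if __name__ == "__main__":
    batch, dim = 32, 256
    g = torch.randn(batch, dim)  # knowledge-graph entity embeddings
    t = torch.randn(batch, dim)  # text features from a frozen foundation model (projected)
    v = torch.randn(batch, dim)  # image features from a frozen foundation model (projected)
    print(triple_contrastive_loss(g, t, v).item())
```

In a dual-phase setup of the kind the abstract describes, a loss like this would plausibly align the modality encoders in a first phase, with a second phase fusing the aligned representations into the knowledge graph embedding objective; the exact schedule is given in the paper itself.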
Keywords:
Multidisciplinary Topics and Applications: MTA: Bioinformatics
Data Mining: DM: Knowledge graphs and knowledge base completion
Machine Learning: ML: Multi-modal learning
Multidisciplinary Topics and Applications: MTA: Life sciences