Shentong Mo
2020 – today
- 2024
- [j3] Xiaokang Chen, Mingyu Ding, Xiaodi Wang, Ying Xin, Shentong Mo, Yunhao Wang, Shumin Han, Ping Luo, Gang Zeng, Jingdong Wang: Context Autoencoder for Self-supervised Representation Learning. Int. J. Comput. Vis. 132(1): 208-223 (2024)
- [j2] Shentong Mo, Miao Xin: BSTG-Trans: A Bayesian Spatial-Temporal Graph Transformer for Long-Term Pose Forecasting. IEEE Trans. Multim. 26: 673-686 (2024)
- [c26] Tanvir Mahmud, Shentong Mo, Yapeng Tian, Diana Marculescu: MA-AVT: Modality Alignment for Parameter-Efficient Audio-Visual Transformers. CVPR Workshops 2024: 7996-8005
- [c25] Shentong Mo, Pedro Morgado: Unveiling the Power of Audio-Visual Early Fusion Transformers with Dense Interactions Through Masked Modeling. CVPR 2024: 27176-27186
- [c24] Shentong Mo, Enze Xie, Yue Wu, Junsong Chen, Matthias Nießner, Zhenguo Li: Fast Training of Diffusion Transformer with Extreme Masking for 3D Point Clouds Generation. ECCV (84) 2024: 354-370
- [c23] Shentong Mo, Pedro Morgado: Audio-Visual Generalized Zero-Shot Learning the Easy Way. ECCV (71) 2024: 377-395
- [c22] Shentong Mo, Miao Xin: Tree of Uncertain Thoughts Reasoning for Large Language Models. ICASSP 2024: 12742-12746
- [i48] Miao Xin, Zhongrui You, Zihan Zhang, Taoran Jiang, Tingjia Xu, Haotian Liang, Guojing Ge, Yuchen Ji, Shentong Mo, Jian Cheng: We Choose to Go to Space: Agent-driven Human and Multi-Robot Collaboration in Microgravity. CoRR abs/2402.14299 (2024)
- [i47] Shentong Mo, Yansen Wang, Xufang Luo, Dongsheng Li: LSPT: Long-term Spatial Prompt Tuning for Visual Representation Learning. CoRR abs/2402.17406 (2024)
- [i46] Lin Zhang, Shentong Mo, Yijing Zhang, Pedro Morgado: Audio-Synchronized Visual Animation. CoRR abs/2403.05659 (2024)
- [i45] Shentong Mo, Jing Shi, Yapeng Tian: Text-to-Audio Generation Synchronized with Videos. CoRR abs/2403.07938 (2024)
- [i44] Jiantao Wu, Shentong Mo, Sara Atito, Zhenhua Feng, Josef Kittler, Muhammad Awais: DailyMAE: Towards Pretraining Masked Autoencoders in One Day. CoRR abs/2404.00509 (2024)
- [i43] Shentong Mo, Xufang Luo, Yansen Wang, Dongsheng Li: A Large-scale Medical Visual Task Adaptation Benchmark. CoRR abs/2404.12876 (2024)
- [i42] Shentong Mo, Haofan Wang, Huaxia Li, Xu Tang: Unified Video-Language Pre-training with Synchronized Audio. CoRR abs/2405.07202 (2024)
- [i41] Shentong Mo, Yapeng Tian: Scaling Diffusion Mamba with Bidirectional SSMs for Efficient Image and Video Generation. CoRR abs/2405.15881 (2024)
- [i40] Shentong Mo, Sukmin Yun: DMT-JEPA: Discriminative Masked Targets for Joint-Embedding Predictive Architecture. CoRR abs/2405.17995 (2024)
- [i39] Tanvir Mahmud, Shentong Mo, Yapeng Tian, Diana Marculescu: MA-AVT: Modality Alignment for Parameter-Efficient Audio-Visual Transformers. CoRR abs/2406.04930 (2024)
- [i38] Shentong Mo: Efficient 3D Shape Generation via Diffusion Mamba with Bidirectional SSMs. CoRR abs/2406.05038 (2024)
- [i37] Shentong Mo, Yapeng Tian: Semantic Grouping Network for Audio Source Separation. CoRR abs/2407.03736 (2024)
- [i36] Shentong Mo, Russ Salakhutdinov, Louis-Philippe Morency, Paul Pu Liang: IoT-LM: Large Multisensory Language Models for the Internet of Things. CoRR abs/2407.09801 (2024)
- [i35] Shentong Mo, Pedro Morgado: Audio-visual Generalized Zero-shot Learning the Easy Way. CoRR abs/2407.13095 (2024)
- [i34] Shentong Mo, Paul Pu Liang: MultiMed: Massively Multimodal and Multitask Medical Understanding. CoRR abs/2408.12682 (2024)
- [i33] Shentong Mo, Haofan Wang: Multi-scale Multi-instance Visual Sound Localization and Segmentation. CoRR abs/2409.00486 (2024)
- 2023
- [j1] Paul Pu Liang, Yiwei Lyu, Xiang Fan, Jeffrey Tsaw, Yudong Liu, Shentong Mo, Dani Yogatama, Louis-Philippe Morency, Russ Salakhutdinov: High-Modality Multimodal Transformer: Quantifying Modality & Interaction Heterogeneity for High-Modality Representation Learning. Trans. Mach. Learn. Res. 2023 (2023)
- [c21] Jiantao Wu, Shentong Mo, Xingshen Zhang, Muhammad Awais, Sara Ahmed, Zhenhua Feng, Lin Wang, Xiang Yang: Variational Autoencoders with Decremental Information Bottleneck for Disentanglement. BMVC 2023: 433-436
- [c20] Shentong Mo, Yapeng Tian: Audio-Visual Grouping Network for Sound Localization from Mixtures. CVPR 2023: 10565-10574
- [c19] Shentong Mo, Weiguo Pian, Yapeng Tian: Class-Incremental Grouping Network for Continual Audio-Visual Learning. ICCV 2023: 7754-7764
- [c18] Weiguo Pian, Shentong Mo, Yunhui Guo, Yapeng Tian: Audio-Visual Class-Incremental Learning. ICCV 2023: 7765-7777
- [c17] Shentong Mo, Pedro Morgado: A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition. ICML 2023: 25006-25017
- [c16] Ruihang Chu, Enze Xie, Shentong Mo, Zhenguo Li, Matthias Nießner, Chi-Wing Fu, Jiaya Jia: DiffComplete: Diffusion-based Generative 3D Shape Completion. NeurIPS 2023
- [c15] Shentong Mo, Bhiksha Raj: Weakly-Supervised Audio-Visual Segmentation. NeurIPS 2023
- [c14] Shentong Mo, Enze Xie, Ruihang Chu, Lanqing Hong, Matthias Nießner, Zhenguo Li: DiT-3D: Exploring Plain Diffusion Transformers for 3D Shape Generation. NeurIPS 2023
- [c13] Shentong Mo, Zhun Sun, Chao Li: Representation Disentanglement in Generative Models with Contrastive Learning. WACV 2023: 1531-1540
- [c12] Shentong Mo, Zhun Sun, Chao Li: Multi-level Contrastive Learning for Self-Supervised Vision Transformers. WACV 2023: 2777-2786
- [i32] Jiantao Wu, Shentong Mo, Muhammad Awais, Sara Atito, Xingshen Zhang, Lin Wang, Xiang Yang: Variantional autoencoder with decremental information bottleneck for disentanglement. CoRR abs/2303.12959 (2023)
- [i31] Shentong Mo, Yapeng Tian: Audio-Visual Grouping Network for Sound Localization from Mixtures. CoRR abs/2303.17056 (2023)
- [i30] Shentong Mo, Jingfei Xia, Ihor Markevych: CAVL: Learning Contrastive and Adaptive Representations of Vision and Language. CoRR abs/2304.04399 (2023)
- [i29] Shentong Mo, Yapeng Tian: AV-SAM: Segment Anything Model Meets Audio-Visual Localization and Segmentation. CoRR abs/2305.01836 (2023)
- [i28] Shentong Mo, Jing Shi, Yapeng Tian: DiffAVA: Personalized Text-to-Audio Generation with Visual Alignment. CoRR abs/2305.12903 (2023)
- [i27] Shentong Mo, Pedro Morgado: A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition. CoRR abs/2305.19458 (2023)
- [i26] Ruihang Chu, Enze Xie, Shentong Mo, Zhenguo Li, Matthias Nießner, Chi-Wing Fu, Jiaya Jia: DiffComplete: Diffusion-based Generative 3D Shape Completion. CoRR abs/2306.16329 (2023)
- [i25] Shentong Mo, Enze Xie, Ruihang Chu, Lewei Yao, Lanqing Hong, Matthias Nießner, Zhenguo Li: DiT-3D: Exploring Plain Diffusion Transformers for 3D Shape Generation. CoRR abs/2307.01831 (2023)
- [i24] Weiguo Pian, Shentong Mo, Yunhui Guo, Yapeng Tian: Audio-Visual Class-Incremental Learning. CoRR abs/2308.11073 (2023)
- [i23] Jiantao Wu, Shentong Mo, Muhammad Awais, Sara Atito, Zhenhua Feng, Josef Kittler: Masked Momentum Contrastive Learning for Zero-shot Semantic Understanding. CoRR abs/2308.11448 (2023)
- [i22] Shentong Mo, Weiguo Pian, Yapeng Tian: Class-Incremental Grouping Network for Continual Audio-Visual Learning. CoRR abs/2309.05281 (2023)
- [i21] Shentong Mo, Miao Xin: Tree of Uncertain Thoughts Reasoning for Large Language Models. CoRR abs/2309.07694 (2023)
- [i20] Shentong Mo, Zhun Sun, Chao Li: Exploring Data Augmentations on Self-/Semi-/Fully- Supervised Pre-trained Models. CoRR abs/2310.18850 (2023)
- [i19] Shentong Mo, Paul Pu Liang, Russ Salakhutdinov, Louis-Philippe Morency: MultiIoT: Towards Large-scale Multisensory Learning for the Internet of Things. CoRR abs/2311.06217 (2023)
- [i18] Shentong Mo, Bhiksha Raj: Weakly-Supervised Audio-Visual Segmentation. CoRR abs/2311.15080 (2023)
- [i17] Shentong Mo, Pedro Morgado: Unveiling the Power of Audio-Visual Early Fusion Transformers with Dense Interactions through Masked Modeling. CoRR abs/2312.01017 (2023)
- [i16] Jiantao Wu, Shentong Mo, Sara Atito, Josef Kittler, Zhenhua Feng, Muhammad Awais: Beyond Accuracy: Statistical Measures and Benchmark for Evaluation of Representation from Self-Supervised Learning. CoRR abs/2312.01118 (2023)
- [i15] Shentong Mo, Enze Xie, Yue Wu, Junsong Chen, Matthias Nießner, Zhenguo Li: Fast Training of Diffusion Transformer with Extreme Masking for 3D Point Clouds Generation. CoRR abs/2312.07231 (2023)
- 2022
- [c11] Shentong Mo, Zhun Sun, Chao Li: Rethinking Prototypical Contrastive Learning through Alignment, Uniformity and Correlation. BMVC 2022: 299
- [c10] Shentong Mo, Pedro Morgado: Localizing Visual Sounds the Easy Way. ECCV (37) 2022: 218-234
- [c9] Fangyi Chen, Han Zhang, Zaiwang Li, Jiachen Dou, Shentong Mo, Hao Chen, Yongxin Zhang, Uzair Ahmed, Chenchen Zhu, Marios Savvides: Unitail: Detecting, Reading, and Matching in Retail Scene. ECCV (7) 2022: 705-722
- [c8] Shentong Mo, Pedro Morgado: A Closer Look at Weakly-Supervised Audio-Visual Source Localization. NeurIPS 2022
- [c7] Shentong Mo, Yapeng Tian: Multi-modal Grouping Network for Weakly-Supervised Audio-Visual Video Parsing. NeurIPS 2022
- [i14] Xiaokang Chen, Mingyu Ding, Xiaodi Wang, Ying Xin, Shentong Mo, Yunhao Wang, Shumin Han, Ping Luo, Gang Zeng, Jingdong Wang: Context Autoencoder for Self-Supervised Representation Learning. CoRR abs/2202.03026 (2022)
- [i13] Paul Pu Liang, Yiwei Lyu, Xiang Fan, Shentong Mo, Dani Yogatama, Louis-Philippe Morency, Ruslan Salakhutdinov: HighMMT: Towards Modality and Task Generalization for High-Modality Representation Learning. CoRR abs/2203.01311 (2022)
- [i12] Shentong Mo, Daizong Liu, Wei Hu: Multi-Scale Self-Contrastive Learning with Hard Negative Mining for Weakly-Supervised Query-based Video Grounding. CoRR abs/2203.03838 (2022)
- [i11] Shentong Mo, Pedro Morgado: Localizing Visual Sounds the Easy Way. CoRR abs/2203.09324 (2022)
- [i10] Shentong Mo, Jingfei Xia, Xiaoqing Tan, Bhiksha Raj: Point3D: tracking actions as moving points with 3D CNNs. CoRR abs/2203.10584 (2022)
- [i9] Fangyi Chen, Han Zhang, Zaiwang Li, Jiachen Dou, Shentong Mo, Hao Chen, Yongxin Zhang, Uzair Ahmed, Chenchen Zhu, Marios Savvides: Unitail: Detecting, Reading, and Matching in Retail Scene. CoRR abs/2204.00298 (2022)
- [i8] Jiantao Wu, Shentong Mo: Object-wise Masked Autoencoders for Fast Pre-training. CoRR abs/2205.14338 (2022)
- [i7] Shentong Mo, Zhun Sun, Chao Li: Siamese Prototypical Contrastive Learning. CoRR abs/2208.08819 (2022)
- [i6] Shentong Mo, Pedro Morgado: A Closer Look at Weakly-Supervised Audio-Visual Source Localization. CoRR abs/2209.09634 (2022)
- [i5] Shentong Mo, Zhun Sun, Chao Li: Rethinking Prototypical Contrastive Learning through Alignment, Uniformity and Correlation. CoRR abs/2210.10194 (2022)
- 2021
- [c6] Shentong Mo, Zhun Sun, Chao Li: Siamese Prototypical Contrastive Learning. BMVC 2021: 185
- [c5] Shentong Mo, Jingfei Xia, Xiaoqing Tan, Bhiksha Raj: Point3D: tracking actions as moving points with 3D CNNs. BMVC 2021: 259
- [c4] Miao Xin, Shentong Mo, Yuanze Lin: EVA-GCN: Head Pose Estimation Based on Graph Convolutional Networks. CVPR Workshops 2021: 1462-1471
- [c3] Shentong Mo, Miao Xin: Long-Term Head Pose Forecasting Conditioned on the Gaze-Guiding Prior. CVPR Workshops 2021: 2239-2248
- [c2] Jiantao Wu, Shentong Mo, Lin Wang: An Empirical Study of Uncertainty Gap for Disentangling Factors. Trustworthy AI @ ACM Multimedia 2021: 1-8
- [c1] Shentong Mo, Xin Miao: OsGG-Net: One-step Graph Generation Network for Unbiased Head Pose Estimation. ACM Multimedia 2021: 2465-2473
- [i4] Shentong Mo, Pengtao Xie: Learning by Examples Based on Multi-level Optimization. CoRR abs/2109.10824 (2021)
- [i3] Shentong Mo, Xi Fu, Chenyang Hong, Yizhen Chen, Yuxuan Zheng, Xiangru Tang, Zhiqiang Shen, Eric P. Xing, Yanyan Lan: Multi-modal Self-supervised Pre-training for Regulatory Genome Across Cell Types. CoRR abs/2110.05231 (2021)
- 2020
- [i2] Shentong Mo, Haofan Wang, Pinxu Ren, Ta-Chung Chi: Automatic Speech Verification Spoofing Detection. CoRR abs/2012.08095 (2020)
- [i1] Shentong Mo, Xiaoqing Tan, Jingfei Xia, Pinxu Ren: Towards Improving Spatiotemporal Action Recognition in Videos. CoRR abs/2012.08097 (2020)
2010 – 2019
- 2018
- [d1] Qiang Zou, Shentong Mo, Xiaochang Pei: SERS spectrum of RHB solution measured on different patterns. IEEE DataPort, 2018
last updated on 2024-11-11 21:28 CET by the dblp team
all metadata released as open data under CC0 1.0 license