Andros Tjandra
2020 – today
- 2024
- [j7] Vineel Pratap, Andros Tjandra, Bowen Shi, Paden Tomasello, Arun Babu, Sayani Kundu, Ali Elkahky, Zhaoheng Ni, Apoorv Vyas, Maryam Fazel-Zarandi, Alexei Baevski, Yossi Adi, Xiaohui Zhang, Wei-Ning Hsu, Alexis Conneau, Michael Auli:
Scaling Speech Technology to 1,000+ Languages. J. Mach. Learn. Res. 25: 97:1-97:52 (2024)
- [c38] Jiamin Xie, Ke Li, Jinxi Guo, Andros Tjandra, Yuan Shangguan, Leda Sari, Chunyang Wu, Junteng Jia, Jay Mahadeokar, Ozlem Kalinli:
Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of a Multilingual ASR Model. ICASSP 2024: 12201-12205
- [c37] Alexander H. Liu, Matthew Le, Apoorv Vyas, Bowen Shi, Andros Tjandra, Wei-Ning Hsu:
Generative Pre-training for Speech with Flow Matching. ICLR 2024
- [c36] K. R. Prajwal, Bowen Shi, Matthew Le, Apoorv Vyas, Andros Tjandra, Mahi Luthra, Baishan Guo, Huiyu Wang, Triantafyllos Afouras, David Kant, Wei-Ning Hsu:
MusicFlow: Cascaded Flow Matching for Text Guided Music Generation. ICML 2024
- [i33] Chung-Ming Chien, Andros Tjandra, Apoorv Vyas, Matt Le, Bowen Shi, Wei-Ning Hsu:
Learning Fine-Grained Controllability on Speech Generation via Efficient Fine-Tuning. CoRR abs/2406.06251 (2024)
- 2023
- [c35] Mumin Jin, Prashant Serai, Jilong Wu, Andros Tjandra, Vimal Manohar, Qing He:
Voice-Preserving Zero-Shot Multiple Accent Conversion. ICASSP 2023: 1-5
- [c34] Andros Tjandra, Nayan Singhal, David Zhang, Ozlem Kalinli, Abdelrahman Mohamed, Duc Le, Michael L. Seltzer:
Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities. ICASSP 2023: 1-5
- [c33] Mu Yang, Andros Tjandra, Chunxi Liu, David Zhang, Duc Le, Ozlem Kalinli:
Learning ASR Pathways: A Sparse Multilingual ASR Model. ICASSP 2023: 1-5
- [i32] Heli Qi, Sashi Novitasari, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
SpeeChain: A Speech Toolkit for Large-Scale Machine Speech Chain. CoRR abs/2301.02966 (2023)
- [i31] Vineel Pratap, Andros Tjandra, Bowen Shi, Paden Tomasello, Arun Babu, Sayani Kundu, Ali Elkahky, Zhaoheng Ni, Apoorv Vyas, Maryam Fazel-Zarandi, Alexei Baevski, Yossi Adi, Xiaohui Zhang, Wei-Ning Hsu, Alexis Conneau, Michael Auli:
Scaling Speech Technology to 1,000+ Languages. CoRR abs/2305.13516 (2023)
- [i30] Jiamin Xie, Ke Li, Jinxi Guo, Andros Tjandra, Yuan Shangguan, Leda Sari, Chunyang Wu, Junteng Jia, Jay Mahadeokar, Ozlem Kalinli:
Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model. CoRR abs/2309.13018 (2023)
- [i29] Alexander H. Liu, Matt Le, Apoorv Vyas, Bowen Shi, Andros Tjandra, Wei-Ning Hsu:
Generative Pre-training for Speech with Flow Matching. CoRR abs/2310.16338 (2023)
- [i28] Apoorv Vyas, Bowen Shi, Matthew Le, Andros Tjandra, Yi-Chiao Wu, Baishan Guo, Jiemin Zhang, Xinyue Zhang, Robert Adkins, William Ngan, Jeff Wang, Ivan Cruz, Bapi Akula, Akinniyi Akinyemi, Brian Ellis, Rashel Moritz, Yael Yungster, Alice Rakotoarison, Liang Tan, Chris Summers, Carleigh Wood, Joshua Lane, Mary Williamson, Wei-Ning Hsu:
Audiobox: Unified Audio Generation with Natural Language Prompts. CoRR abs/2312.15821 (2023)
- 2022
- [c32] Andros Tjandra, Diptanu Gon Choudhury, Frank Zhang, Kritika Singh, Alexis Conneau, Alexei Baevski, Assaf Sela, Yatharth Saraf, Michael Auli:
Improved Language Identification Through Cross-Lingual Self-Supervised Learning. ICASSP 2022: 6877-6881
- [c31] Sangeeta Srivastava, Yun Wang, Andros Tjandra, Anurag Kumar, Chunxi Liu, Kritika Singh, Yatharth Saraf:
Conformer-Based Self-Supervised Learning For Non-Speech Audio Tasks. ICASSP 2022: 8862-8866
- [c30] Arun Babu, Changhan Wang, Andros Tjandra, Kushal Lakhotia, Qiantong Xu, Naman Goyal, Kritika Singh, Patrick von Platen, Yatharth Saraf, Juan Pino, Alexei Baevski, Alexis Conneau, Michael Auli:
XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale. INTERSPEECH 2022: 2278-2282
- [c29] Rendi Chevi, Radityo Eko Prasojo, Alham Fikri Aji, Andros Tjandra, Sakriani Sakti:
NIX-TTS: Lightweight and End-to-End Text-to-Speech Via Module-Wise Distillation. SLT 2022: 970-976
- [i27] Mu Yang, Andros Tjandra, Chunxi Liu, David Zhang, Duc Le, John H. L. Hansen, Ozlem Kalinli:
Learning ASR pathways: A sparse multilingual ASR model. CoRR abs/2209.05735 (2022)
- [i26] Andros Tjandra, Nayan Singhal, David Zhang, Ozlem Kalinli, Abdelrahman Mohamed, Duc Le, Michael L. Seltzer:
Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities. CoRR abs/2211.05756 (2022)
- [i25] Mumin Jin, Prashant Serai, Jilong Wu, Andros Tjandra, Vimal Manohar, Qing He:
Voice-preserving Zero-shot Multiple Accent Conversion. CoRR abs/2211.13282 (2022)
- 2021
- [j6] Johanes Effendi, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Multimodal Chain: Cross-Modal Collaboration Through Listening, Speaking, and Visualizing. IEEE Access 9: 70286-70299 (2021)
- [j5] Sahoko Nakayama, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Code-Switching ASR and TTS Using Semisupervised Learning with Machine Speech Chain. IEICE Trans. Inf. Syst. 104-D(10): 1661-1677 (2021)
- [c28] Andros Tjandra, Ruoming Pang, Yu Zhang, Shigeki Karita:
Unsupervised Learning of Disentangled Speech Content and Style Representation. Interspeech 2021: 4089-4093
- [i24] Andros Tjandra, Diptanu Gon Choudhury, Frank Zhang, Kritika Singh, Alexei Baevski, Assaf Sela, Yatharth Saraf, Michael Auli:
Improved Language Identification Through Cross-Lingual Self-Supervised Learning. CoRR abs/2107.04082 (2021)
- [i23] Sangeeta Srivastava, Yun Wang, Andros Tjandra, Anurag Kumar, Chunxi Liu, Kritika Singh, Yatharth Saraf:
Conformer-Based Self-Supervised Learning for Non-Speech Audio Tasks. CoRR abs/2110.07313 (2021)
- [i22] Arun Babu, Changhan Wang, Andros Tjandra, Kushal Lakhotia, Qiantong Xu, Naman Goyal, Kritika Singh, Patrick von Platen, Yatharth Saraf, Juan Pino, Alexei Baevski, Alexis Conneau, Michael Auli:
XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale. CoRR abs/2111.09296 (2021)
- 2020
- [j4] Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Recurrent Neural Network Compression Based on Low-Rank Tensor Representation. IEICE Trans. Inf. Syst. 103-D(2): 435-449 (2020)
- [j3] Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Machine Speech Chain. IEEE ACM Trans. Audio Speech Lang. Process. 28: 976-989 (2020)
- [j2] Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Corrections to "Machine Speech Chain". IEEE ACM Trans. Audio Speech Lang. Process. 28: 1706 (2020)
- [c27] Yongqiang Wang, Abdelrahman Mohamed, Duc Le, Chunxi Liu, Alex Xiao, Jay Mahadeokar, Hongzhao Huang, Andros Tjandra, Xiaohui Zhang, Frank Zhang, Christian Fuegen, Geoffrey Zweig, Michael L. Seltzer:
Transformer-Based Acoustic Modeling for Hybrid Speech Recognition. ICASSP 2020: 6874-6878
- [c26] Andros Tjandra, Chunxi Liu, Frank Zhang, Xiaohui Zhang, Yongqiang Wang, Gabriel Synnaeve, Satoshi Nakamura, Geoffrey Zweig:
DEJA-VU: Double Feature Presentation and Iterated Loss in Deep Transformer Networks. ICASSP 2020: 6899-6903
- [c25] Sashi Novitasari, Andros Tjandra, Tomoya Yanagita, Sakriani Sakti, Satoshi Nakamura:
Incremental Machine Speech Chain Towards Enabling Listening While Speaking in Real-Time. INTERSPEECH 2020: 4372-4376
- [c24] Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Transformer VQ-VAE for Unsupervised Unit Discovery and Speech Synthesis: ZeroSpeech 2020 Challenge. INTERSPEECH 2020: 4851-4855
- [c23] Johanes Effendi, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Augmenting Images for ASR and TTS Through Single-Loop and Dual-Loop Multimodal Chain Framework. INTERSPEECH 2020: 4901-4905
- [c22] Sashi Novitasari, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Cross-Lingual Machine Speech Chain for Javanese, Sundanese, Balinese, and Bataks Speech Recognition and Synthesis. SLTU-CCURL@LREC 2020: 131-138
- [i21] Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Transformer VQ-VAE for Unsupervised Unit Discovery and Speech Synthesis: ZeroSpeech 2020 Challenge. CoRR abs/2005.11676 (2020)
- [i20] Andros Tjandra, Ruoming Pang, Yu Zhang, Shigeki Karita:
Unsupervised Learning of Disentangled Speech Content and Style Representation. CoRR abs/2010.12973 (2020)
- [i19] Johanes Effendi, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Augmenting Images for ASR and TTS through Single-loop and Dual-loop Multimodal Chain Framework. CoRR abs/2011.02099 (2020)
- [i18] Sashi Novitasari, Andros Tjandra, Tomoya Yanagita, Sakriani Sakti, Satoshi Nakamura:
Incremental Machine Speech Chain Towards Enabling Listening while Speaking in Real-time. CoRR abs/2011.02126 (2020)
- [i17] Sashi Novitasari, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Sequence-to-Sequence Learning via Attention Transfer for Incremental Speech Recognition. CoRR abs/2011.02127 (2020)
- [i16] Sashi Novitasari, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Cross-Lingual Machine Speech Chain for Javanese, Sundanese, Balinese, and Bataks Speech Recognition and Synthesis. CoRR abs/2011.02128 (2020)
2010 – 2019
- 2019
- [j1] Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
End-to-End Speech Recognition Sequence Training With Reinforcement Learning. IEEE Access 7: 79758-79769 (2019)
- [c21] Johanes Effendi, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Listening While Speaking and Visualizing: Improving ASR Through Multimodal Chain. ASRU 2019: 471-478
- [c20] Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Speech-to-Speech Translation Between Untranscribed Unknown Languages. ASRU 2019: 593-600
- [c19] Sahoko Nakayama, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Zero-Shot Code-Switching ASR and TTS with Multilingual Machine Speech Chain. ASRU 2019: 964-971
- [c18] Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
End-to-end Feedback Loss in Speech Chain Framework via Straight-through Estimator. ICASSP 2019: 6281-6285
- [c17] Andros Tjandra, Berrak Sisman, Mingyang Zhang, Sakriani Sakti, Haizhou Li, Satoshi Nakamura:
VQVAE Unsupervised Unit Discovery and Multi-Scale Code2Spec Inverter for Zerospeech Challenge 2019. INTERSPEECH 2019: 1118-1122
- [c16] Sashi Novitasari, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Sequence-to-Sequence Learning via Attention Transfer for Incremental Speech Recognition. INTERSPEECH 2019: 3835-3839
- [c15] Sahoko Nakayama, Takatomo Kano, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Recognition and translation of code-switching speech utterances. O-COCOSDA 2019: 1-6
- [i15] Andros Tjandra, Berrak Sisman, Mingyang Zhang, Sakriani Sakti, Haizhou Li, Satoshi Nakamura:
VQVAE Unsupervised Unit Discovery and Multi-scale Code2Spec Inverter for Zerospeech Challenge 2019. CoRR abs/1905.11449 (2019)
- [i14] Johanes Effendi, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
From Speech Chain to Multimodal Chain: Leveraging Cross-modal Data Augmentation for Semi-supervised Learning. CoRR abs/1906.00579 (2019)
- [i13] Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Speech-to-speech Translation between Untranscribed Unknown Languages. CoRR abs/1910.00795 (2019)
- [i12] Yongqiang Wang, Abdelrahman Mohamed, Duc Le, Chunxi Liu, Alex Xiao, Jay Mahadeokar, Hongzhao Huang, Andros Tjandra, Xiaohui Zhang, Frank Zhang, Christian Fuegen, Geoffrey Zweig, Michael L. Seltzer:
Transformer-based Acoustic Modeling for Hybrid Speech Recognition. CoRR abs/1910.09799 (2019)
- [i11] Andros Tjandra, Chunxi Liu, Frank Zhang, Xiaohui Zhang, Yongqiang Wang, Gabriel Synnaeve, Satoshi Nakamura, Geoffrey Zweig:
Deja-vu: Double Feature Presentation in Deep Transformer Networks. CoRR abs/1910.10324 (2019)
- 2018
- [c14] Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Sequence-to-Sequence Asr Optimization Via Reinforcement Learning. ICASSP 2018: 5829-5833
- [c13] Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Tensor Decomposition for Compressing Recurrent Neural Network. IJCNN 2018: 1-8
- [c12] Takuma Mori, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Compressing End-to-end ASR Networks by Tensor-Train Decomposition. INTERSPEECH 2018: 806-810
- [c11] Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Machine Speech Chain with One-shot Speaker Adaptation. INTERSPEECH 2018: 887-891
- [c10] Sahoko Nakayama, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Speech Chain for Semi-Supervised Learning of Japanese-English Code-Switching ASR and TTS. SLT 2018: 182-189
- [c9] Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Multi-Scale Alignment and Contextual History for Attention Mechanism in Sequence-to-Sequence Model. SLT 2018: 648-655
- [i10] Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Tensor Decomposition for Compressing Recurrent Neural Network. CoRR abs/1802.10410 (2018)
- [i9] Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Machine Speech Chain with One-shot Speaker Adaptation. CoRR abs/1803.10525 (2018)
- [i8] Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Multi-scale Alignment and Contextual History for Attention Mechanism in Sequence-to-sequence Model. CoRR abs/1807.08280 (2018)
- [i7] Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
End-to-End Feedback Loss in Speech Chain Framework via Straight-Through Estimator. CoRR abs/1810.13107 (2018)
- 2017
- [c8] Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Listening while speaking: Speech chain by deep learning. ASRU 2017: 301-308
- [c7] Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Attention-based Wav2Text with feature transfer learning. ASRU 2017: 309-315
- [c6] Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Local Monotonic Attention Mechanism for End-to-End Speech And Language Processing. IJCNLP(1) 2017: 431-440
- [c5] Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Compressing recurrent neural network with tensor train. IJCNN 2017: 4451-4458
- [c4] Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Speech recognition features based on deep latent Gaussian models. MLSP 2017: 1-6
- [i6] Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Compressing Recurrent Neural Network with Tensor Train. CoRR abs/1705.08052 (2017)
- [i5] Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Local Monotonic Attention Mechanism for End-to-End Speech Recognition. CoRR abs/1705.08091 (2017)
- [i4] Andros Tjandra, Sakriani Sakti, Ruli Manurung, Mirna Adriani, Satoshi Nakamura:
Gated Recurrent Neural Tensor Network. CoRR abs/1706.02222 (2017)
- [i3] Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Listening while Speaking: Speech Chain by Deep Learning. CoRR abs/1707.04879 (2017)
- [i2] Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Attention-based Wav2Text with Feature Transfer Learning. CoRR abs/1709.07814 (2017)
- [i1] Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Sequence-to-Sequence ASR Optimization via Reinforcement Learning. CoRR abs/1710.10774 (2017)
- 2016
- [c3] Andros Tjandra, Sakriani Sakti, Ruli Manurung, Mirna Adriani, Satoshi Nakamura:
Gated Recurrent Neural Tensor Network. IJCNN 2016: 448-455
- 2015
- [c2] Andros Tjandra, Sakriani Sakti, Satoshi Nakamura, Mirna Adriani:
Stochastic Gradient Variational Bayes for deep learning-based ASR. ASRU 2015: 175-180
- [c1] Andros Tjandra, Sakriani Sakti, Graham Neubig, Tomoki Toda, Mirna Adriani, Satoshi Nakamura:
Combination of two-dimensional cochleogram and spectrogram features for deep learning-based ASR. ICASSP 2015: 4525-4529
last updated on 2024-10-07 21:15 CEST by the dblp team
all metadata released as open data under CC0 1.0 license