Computer Science > Computation and Language

arXiv:2304.00171 (cs)

[Submitted on 31 Mar 2023]

Title: Practical Conformer: Optimizing size, speed and flops of Conformer for on-Device and cloud ASR

Authors: Rami Botros, Anmol Gulati, Tara N. Sainath, Krzysztof Choromanski, Ruoming Pang, Trevor Strohman, Weiran Wang, Jiahui Yu

Abstract: Conformer models maintain a large number of internal states, the vast majority of which are associated with self-attention layers. With limited memory bandwidth, reading these states from memory at each inference step can slow down inference. In this paper, we design an optimized conformer that is small enough to meet on-device restrictions and has fast inference on TPUs. We explore various ideas to improve the execution speed, including replacing lower conformer blocks with convolution-only blocks, strategically downsizing the architecture, and utilizing an RNNAttention-Performer. Our optimized conformer can be readily incorporated into a cascaded-encoder setting, allowing a second-pass decoder to operate on its output and improve the accuracy whenever more resources are available. Altogether, we find that these optimizations can reduce latency by a factor of 6.8, and come at a reasonable trade-off in quality. With the cascaded second pass, we show that the recognition accuracy is completely recoverable. Thus, our proposed encoder can double as a strong standalone on-device encoder and as the first part of a high-performance ASR pipeline.
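
As a rough illustration of why a recurrent, Performer-style attention can shrink the state that must be read at each inference step, the following is a minimal sketch of causal linear attention written in two equivalent ways. It is not the paper's implementation: the feature map elu(x) + 1 is a simple stand-in assumed here in place of Performer's random features, and all function names are hypothetical. The recurrent form keeps only a fixed-size state (a d_k x d_v matrix and a d_k vector), while the reference form re-reads every past key and value.

```python
import math
import random

def feature_map(x):
    """Positive feature map elu(x) + 1 (an assumption; Performer uses random features)."""
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]

def causal_linear_attention_recurrent(queries, keys, values):
    """Process frames one at a time, carrying only a fixed-size state (S, z)."""
    d_k, d_v = len(keys[0]), len(values[0])
    S = [[0.0] * d_v for _ in range(d_k)]  # running sum of phi(k) v^T
    z = [0.0] * d_k                        # running sum of phi(k)
    outputs = []
    for q, k, v in zip(queries, keys, values):
        phi_k = feature_map(k)
        for i in range(d_k):
            z[i] += phi_k[i]
            for j in range(d_v):
                S[i][j] += phi_k[i] * v[j]
        phi_q = feature_map(q)
        denom = sum(phi_q[i] * z[i] for i in range(d_k))
        outputs.append(
            [sum(phi_q[i] * S[i][j] for i in range(d_k)) / denom for j in range(d_v)]
        )
    return outputs

def causal_linear_attention_quadratic(queries, keys, values):
    """Reference: the same output, computed by re-reading all past keys and values."""
    d_v = len(values[0])
    outputs = []
    for t, q in enumerate(queries):
        phi_q = feature_map(q)
        weights = [
            sum(a * b for a, b in zip(phi_q, feature_map(keys[s])))
            for s in range(t + 1)
        ]
        total = sum(weights)
        outputs.append(
            [sum(w * values[s][j] for s, w in enumerate(weights)) / total for j in range(d_v)]
        )
    return outputs

def random_frames(n, d, rng):
    """Hypothetical helper: n frames of d random features each."""
    return [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(n)]

rng = random.Random(0)
q, k, v = (random_frames(6, 4, rng) for _ in range(3))
fast = causal_linear_attention_recurrent(q, k, v)
slow = causal_linear_attention_quadratic(q, k, v)
max_diff = max(abs(a - b) for ra, rb in zip(fast, slow) for a, b in zip(ra, rb))
```

The two functions agree up to floating-point rounding. In the recurrent form the per-step memory traffic does not grow with the number of past frames, which is the kind of saving the abstract attributes to reducing attention-related state.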

Subjects:	Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2304.00171 [cs.CL]
	(or arXiv:2304.00171v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2304.00171

Submission history

From: Rami Botros
[v1] Fri, 31 Mar 2023 23:30:48 UTC (290 KB)
