Abstract
The paper explores the application of speech-to-speech interpretation systems (SISs) to support multilingual communication in humanitarian contexts related to forced migration. The study compares the capacity of several systems to process non-native speakers’ speech in order to map the major challenges to using such systems in these settings. The paper presents interim results of a pilot study with respect to the research sample (non-native speakers and interpreters), the selected language pair, and the list of platforms compared. The research comprises a review of the relevant literature, a comparative analysis of SIS outputs for non-native speakers’ accented speech interpreted from English into Russian, and surveys of interpreters on the results of that analysis. The technology under study included Google Translate, Microsoft Translator, and Yandex. The research participants included refugees from different countries and professional interpreters. The methodology combined comparative qualitative multidimensional analysis; integrated content-based selection of academic sources and their theoretical analysis; descriptive empirical analysis of language errors made by SISs; an open-ended questionnaire survey of interpreters; and factor, cluster, and content analysis of their replies. The results map the features of language and communicative context that should be considered in the further tuning of digital interpreting systems as multilingual instruments. They also set out the tasks of developing a methodology for further studies, customizing the technology to specific communicative settings, and training specialists to use it in socially critical contexts.
APPENDIX. Links to Videos, Length of Recordings, Number of Syntagms (Last Accessed January 10, 2021)
Video 1. Refugee crisis: huge queues on the Serbian-Macedonian border (2015, November 11). https://www.youtube.com/watch?v=B43F3ZH-TTw (1.37–2.11; 2.44–3.25 min, 30 syntagms).
Video 2. Refugees face increasingly perilous journeys to Europe (2017, December 5). https://www.youtube.com/watch?v=Yma4OhnLLPw (1.46–2.12; 2.40–2.50 min, 10 syntagms).
Video 3. Rescued African migrants say they are fleeing slavery (2017, June 28). https://www.youtube.com/watch?v=lnSgWGUJ3jE (5.54–5.70; 6.23–6.44; 10.13–10.37 min, 24 syntagms).
Video 4. Surviving One of the Deadliest Routes to Europe: Refugees at Sea (2016, January 11). https://www.youtube.com/watch?v=nPelTu3iupc (3.36–5.12 min, 61 syntagms).
Video 5. Tensions between Afghan and Syrian refugees on the Greek island of Lesbos (2015, November 21). https://www.theguardian.com/world/video/2015/nov/21/tensions-between-afghan-and-syrian-refugees-on-the-greek-island-of-lesbos-video (0.41–0.52; 1.32–1.38; 1.50–3.03; 2.18–2.48; 3.53–4.13; 4.23–4.44; 5.32–5.48 min, 42 syntagms).
Video 6. Afghan refugees describe treacherous journeys to Turkey (2018, April 12). https://www.youtube.com/watch?v=yZACCMy2go8 (0.45–1; 1.07–1.20 min, 21 syntagms).
Video 7. Niger refugees: Hundreds hope for a new life in Europe (2019, April 16). https://www.aljazeera.com/news/2019/04/niger-refugees-hundreds-hope-life-europe-190416104509532.html (0.17–0.43; 0.49–0.53 min, 55 syntagms).
Video 8. Unaccompanied refugee children share their dreams and despair … (2019, August 29). https://www.unicef.org/eca/stories-region/unaccompanied-refugee-children-share-their-dreams-and-despair-they-await-uncertain (0.46–1.17; 2.50–3.23; 4.02–4.59; 5.01–5.32; 5.44–7.31; 7.46–8.10; 9.40–9.59; 11.03–11.23 min, 106 syntagms).
Video 9. Rohingya English club conversation (2017, February 16). https://www.youtube.com/watch?v=G3rBY8N-3wE (0–2.45 min, 38 syntagms).
Video 10. She tells her story on why she fled away from North Korea (2017, January 13). https://www.youtube.com/watch?v=EKbnyLKLHbo (6.37 min, 296 syntagms).
Video 11. From Myanmar to India (2019, December 3). https://thediplomat.com/2019/12/from-myanmar-to-india-refugees-lives-matter/ (6.19 min, 291 syntagms).
Copyright information
© 2022 Springer Nature Switzerland AG
Cite this paper
Atabekova, A. (2022). Communication with Non-native Speakers Through the Service of Speech-To-Speech Interpreting Systems: Testing Technology Capacity and Exploring Specialists’ Views. In: Serhani, M.A., Zhang, LJ. (eds) Services – SERVICES 2021. SERVICES 2021. Lecture Notes in Computer Science, vol 12996. Springer, Cham. https://doi.org/10.1007/978-3-030-96585-3_1
DOI: https://doi.org/10.1007/978-3-030-96585-3_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-96584-6
Online ISBN: 978-3-030-96585-3
eBook Packages: Computer Science, Computer Science (R0)