New Year’s Day Speeches of Czech Presidents: Phonetic Analysis and Text Analysis

Milan Jičínský & Jaroslav Marek

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10244)

Included in the following conference series:

IFIP International Conference on Computer Information Systems and Industrial Management

Abstract

The aim of our study is to verify our implemented phonetic-analysis algorithms on real data and to confirm that they work as intended. Recordings of the New Year’s Day speeches of Czech and Czechoslovak presidents are well suited for this testing. The earliest available recording of a presidential speech dates from 1935. All transcripts and recordings of the last 87 speeches are available on the website www.rozhlas.cz. The primary goal of this paper is to analyze the voice characteristics of the speaker (log energy, speech velocity and zero crossing rate). In particular, we identify and list the words spoken with the greatest energy. The most interesting results are presented graphically. Using text-analysis software, we compute transcript characteristics such as the most frequent words, word length, the total number of words and the number of different words, and we present the most frequent words. Political speeches are often the subject of various analyses, and our calculations offer a new perspective on them. It is interesting to compare the most frequent semantic words with the words of greatest energy. The results may be of historical importance. The approach extracts new information from available data and offers a scientifically different view of it.

1 Introduction

Speech analysis is very popular nowadays. It began in the second half of the 20th century, when the basic signal characteristics were discovered. The fundamentals are well described in [1,2,3]. The field has evolved very quickly, so there are many applications, ranging from keyword detection [4] through transcription of fluent speech [5] to speaker recognition [6]. This article is intended for readers who want to study phonetic analysis and its fundamentals. Historians of the 20th century in the Czech Republic may also appreciate the most frequent words and the words with the highest energy. Linguists may be interested in the changes across individual speeches, whether written or spoken.

Many authors focus on the linguistic analysis of political speeches. For example, end-of-year speeches of Italian presidents and inaugural speeches of US presidents were studied in [7, 8]. The relationship between ideology and language, together with the thematic concentration of Czechoslovak New Year’s Day speeches, is analyzed in [9]. It is of course very interesting to study the influence of ideology, the originality of the author and his ability to depart from uniformity. The most frequent words can provide information about the years in question because they reflect the most important events. Some of these words will be listed.

The main goal of this paper is to present the words that were spoken with the greatest energy. These words make it possible to track what the president emphasized while reading the speech. We expect the speaker’s emphasis to fall mostly on positive words; the only exception could be wartime. Ideological words may be emphasized in some speeches. One more characteristic is calculated for each speech: speech velocity.

2 Voice Characteristics

This section introduces the analysis of the recordings of the New Year’s Day speeches. The intensity of voice (energy) and the speech velocity represent the voice characteristics of the speaker. Energy indicates how much emphasis the speaker uses, and speech velocity shows how fast the speaker talks. These variables can be influenced, for example, by age or illness. The words with the greatest energy can then be found, and it may be interesting to compare them with the most frequent thematic words. The president was not necessarily the author of the written text, but he could highlight any words he wanted (Fig. 1), depending on what he considered important. This is an example of individuality. Finally, the ZCR (zero crossing rate) characteristics are shown.

2.1 Obtaining Data and Its Processing

The source data were obtained from the website www.rozhlas.cz. The speeches were recorded with the Audacity software. The sampling rate of each recording is 8 kHz. Each recording was trimmed because the originals contain music before the speech starts. The voice parameters were calculated in MATLAB (Fig. 2). The processing scheme is simplified in Fig. 3.

Segmentation means that the recording is divided into frames of equal length (typically 20 ms). In our case, consecutive frames overlap by half. Overlapping is recommended because the parameters can change abruptly. It captures such sudden changes better and describes changes near a frame edge without loss of useful data. Segmentation is followed by parameterization, during which the feature vector of each frame is computed. Features can be divided into basic, spectral, cepstral and dynamic ones. ZCR and energy rank among the basic features. Feature extraction and segmentation are also discussed in [2].
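The segmentation step can be illustrated with a short Python sketch. It is not the authors' MATLAB code; the function name `split_into_frames` and the decision to drop trailing samples that do not fill a whole frame are assumptions made only for this example. The default values (8 kHz, 20 ms, half overlap) follow the text.

```python
def split_into_frames(samples, sample_rate=8000, frame_ms=20, overlap=0.5):
    """Split a sequence of samples into equal-length, overlapping frames.

    Assumption: trailing samples that do not fill a whole frame are dropped.
    """
    frame_len = int(sample_rate * frame_ms / 1000)   # 160 samples at 8 kHz
    hop = max(1, int(frame_len * (1 - overlap)))     # 80 samples for half overlap
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(samples[start:start + frame_len])
    return frames


# Toy "recording" of 400 samples (50 ms at 8 kHz).
signal = list(range(400))
frames = split_into_frames(signal)
print(len(frames), len(frames[0]))   # number of frames, samples per frame
```

With half overlap, every sample (except those at the very beginning and end) belongs to two frames, which is what lets a change near a frame edge show up clearly in the neighbouring frame.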

2.2 Intensity of Voice

The intensity of voice is characterized by energy, which is therefore the key parameter. Energy is defined as the sum of the squared sample values within one frame. The logarithm is applied to obtain a more convenient range of values. In our case, the log energy of ordinary noise is around 5. Frames that contain speech have greater energy values; the log energy of a speaking person can typically reach a value of 15, depending on how loudly the speaker talks. Log energy is defined as

$$ E = \log \left( \sum_{n=0}^{L-1} x^{2}(n) \right), \tag{1} $$

where L is the frame length, i.e. the number of samples in the frame, and x(n) is the value of the n-th sample.
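Equation (1) can be sketched in Python as follows. The text does not state the base of the logarithm, so the natural logarithm is an assumption here, as is the small `floor` value that guards against taking the logarithm of zero for an all-silent frame. The function name `log_energy` is chosen only for illustration.

```python
import math


def log_energy(frame, floor=1e-12):
    """Log energy of one frame: log of the sum of squared samples (Eq. 1).

    Assumptions: natural logarithm; `floor` avoids log(0) for silent frames.
    """
    energy = sum(x * x for x in frame)
    return math.log(max(energy, floor))


quiet_frame = [0.1, -0.1, 0.1, -0.1]
loud_frame = [10.0, -12.0, 11.0, -9.0]
print(log_energy(quiet_frame) < log_energy(loud_frame))   # louder frame has higher log energy
```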

Comparing all presidents, it is evident that president Hácha spoke less loudly than the others and with no emphasis. This could be caused by the political situation. Hácha was president during the hardest period of Czechoslovak history, the helpless president of a protectorate state. The only thing he could do was to try to make people feel calm and safe, even if that was not possible. As for president Husák, a very significant decrease in energy was observed between 1978 and 1979 (Fig. 4).

2.3 ZCR

The zero crossing rate characterizes how often the signal changes sign, from negative to positive or back. It is related to the frequency of the signal. As with energy, there is one ZCR value for each frame. The principle is easily explained with Fig. 5: the ZCR value equals the number of dots, which mark the points where the signal crosses the x axis and changes sign.

ZCR is often used for voice activity detection [11], i.e. to determine whether human speech is present in a recording. For voiced signals, ZCR values are typically low; noise and unvoiced signals have higher values. The method is sensitive to noise and to shifts of the direct (DC) component. It even allows us to determine whether a particular phoneme is voiced (b, d, g, z, v, h, …) or unvoiced (p, t, k, s, f, ch, c, …). Sibilants in particular (s, c, š, č, …) have higher ZCR values (Fig. 6). For one frame, ZCR is computed as

$$ ZCR = \frac{1}{2} \sum_{n=1}^{L-1} \left| \operatorname{sgn} x(n) - \operatorname{sgn} x(n-1) \right|. \tag{2} $$

The variability of the data is relatively high, so the mean ZCR value does not represent an individual speaker well. Better results can be obtained by using ZCR dynamically: the ZCR of each frame is kept, and changes between frames are examined instead of treating ZCR as a single static value.
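A minimal Python sketch of Eq. (2) is given below. The convention sgn(0) = 0 is an assumption, since the text does not say how zero-valued samples are treated, and the function names are illustrative only. The second example shows the sensitivity to direct component shifts mentioned above: adding a constant offset to an alternating signal removes all sign changes.

```python
def sgn(x):
    """Sign of x: 1, 0 or -1 (assumption: sgn(0) = 0)."""
    return (x > 0) - (x < 0)


def zero_crossing_rate(frame):
    """ZCR of one frame following Eq. (2)."""
    return 0.5 * sum(
        abs(sgn(frame[n]) - sgn(frame[n - 1])) for n in range(1, len(frame))
    )


alternating = [1, -1, 1, -1]              # sign changes between every pair of samples
shifted = [x + 2 for x in alternating]    # same shape with a DC offset of +2

print(zero_crossing_rate(alternating))    # 3.0
print(zero_crossing_rate(shifted))        # 0.0
```

Applying `zero_crossing_rate` to every frame of a recording gives the per-frame sequence that the dynamic use of ZCR relies on.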

2.4 Speech Velocity

For the purposes of this article, we introduce a parameter that links the results of the text and voice analyses into a single value characterizing the speaker. We call it speech velocity. It is a mean value expressing how many words the speaker pronounces per second. The 1989 speech of president Husák is markedly the slowest. President Hácha also speaks relatively slowly. On the other hand, the fastest tempo is found in the speeches from 1938, 1943 (Beneš), 1959 (Novotný) and 1996 (Havel) (Fig. 7).
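Speech velocity as described above can be sketched in a few lines of Python. The sketch assumes that words are the whitespace-separated tokens of the transcript and that the duration is the total length of the trimmed recording, pauses included; the text does not specify either detail. The function name and the toy inputs are hypothetical.

```python
def speech_velocity(transcript, duration_seconds):
    """Mean number of words pronounced per second.

    Assumptions: words are whitespace-separated tokens; the duration is the
    full length of the recording, including pauses.
    """
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return len(transcript.split()) / duration_seconds


# Toy input: a four-word transcript spoken in two seconds.
print(speech_velocity("one two three four", 2.0))   # 2.0
```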

3 Characteristics of Written Text

All analyses cover 87 speeches of Czechoslovak presidents, Czech presidents or Czechoslovak prime ministers. World War II created a unique situation: the Czechoslovak Republic had two presidents. President Beneš left the country and went into exile in Great Britain, but he remained very politically active. Hácha was then chosen as president of the protectorate. Therefore, both groups of speeches between 1940 and 1945 were analyzed. Nevertheless, we consider the president in exile the more important subject of our analysis.

We can expect the use of words of different lengths to change over the long history of New Year’s Day speeches. Therefore, the first aim of our calculation is to determine the average word length. The text parameters are calculated with Java-based software [10] called “Statistika v lexikální analýze” (“Statistics in Lexical Analysis”). This GUI (graphical user interface) application was created as part of a diploma thesis and makes the whole text processing easier. The software can analyze frequent letters and words, word lengths, aggregation, alliteration and some other features. Its original purpose was the analysis of poems and their translations, as in [12].

3.1 Mean of Word Length

Figure 8 shows the mean word lengths of the analyzed texts. The words used by the communist presidents (Zápotocký, Novotný, Husák) are much longer than those used by recent presidents (Havel, Klaus, Zeman). President Beneš used the shortest words. The greatest variability is seen for Svoboda and Hácha.

The estimate of the expected value is given by

$$ \bar{x} = \frac{1}{n} \sum_{i=1}^{k} i f_{i}, \tag{3} $$

where i = 1, …, k is the word length, k is the length of the longest word, f_i is the number of words of length i and n is the total number of words.
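The estimate in Eq. (3) can be sketched in Python as shown below. This is not the Java software used in the paper: tokenizing with the regular expression `\w+` (which in Python 3 also matches letters with diacritics) is an assumption, and the function names are illustrative.

```python
import re
from collections import Counter


def word_length_distribution(text):
    """Return f_i: how many words of each length i occur in the text.

    Assumption: a word is a maximal run of word characters (regex \\w+).
    """
    words = re.findall(r"\w+", text)
    return Counter(len(w) for w in words)


def mean_word_length(freq):
    """Mean word length per Eq. (3): (1/n) * sum of i * f_i."""
    n = sum(freq.values())
    if n == 0:
        raise ValueError("text contains no words")
    return sum(i * f for i, f in freq.items()) / n


freq = word_length_distribution("a bb ccc bb")
print(mean_word_length(freq))   # 2.0
```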

3.2 The Most Frequent Words

Conjunctions and prepositions naturally rank among the most frequent words. The conjunction “a” is the most frequent word in all speeches except Novotný (1964) with “v”, Svoboda (1973) with “v” and Hácha (1944) with “se”. Figure 10 lists the conjunctions and prepositions ranked first to fourth, where the common words can be seen. The most frequent words with meaning are presented in Fig. 10 too. These words differ much more than the prepositions: presidents react to current political events such as crisis, protectorate, war or the return of democracy. Words with meaning can thus provide a quick preview of the content. A comparison with the inaugural speeches of US presidents can be interesting. As for words with meaning, Roosevelt (1933), for example, used the words HAVE and NATIONAL, and Truman (1949) used the words WORD, HAVE, NATIONS, PEACE, FREEDOM, PEOPLE, FREE, UNITED, MORE, SECURITY and DEMOCRACY [7]. See Fig. 9 or 10.

The figures are divided into subsections. As mentioned above, the Czechoslovak Republic had two presidents between 1940 and 1945. A second anomaly appeared in 1993, when the Czechoslovak republic ceased to exist and was divided into two smaller independent countries: the Czech Republic and the Slovak Republic. President Václav Havel therefore gave no speech in 1993; the prime ministers spoke to their nations instead of the president.

Rows are sorted by year. Each row is colored according to the president or prime minister, with the same colors used in all figures. The first column contains the four most frequent conjunctions and prepositions. The next three columns contain the most frequent words in order of frequency. The last column shows the word with the highest energy; these words are written in uppercase.
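The frequency ranking described in this section can be sketched in Python as follows. The paper does not say how conjunctions and prepositions were separated from words with meaning, so the `stop_words` set below is a hypothetical stand-in for that step, and the sample sentence is invented purely for illustration.

```python
import re
from collections import Counter


def most_frequent_words(text, top=4, stop_words=frozenset()):
    """Return the `top` most frequent words as (word, count) pairs.

    Words listed in `stop_words` are skipped (hypothetical way of
    separating function words from words with meaning).
    """
    words = re.findall(r"\w+", text.lower())
    counts = Counter(w for w in words if w not in stop_words)
    return counts.most_common(top)


sample = "a v a se a v lid mir lid"
print(most_frequent_words(sample, top=2))
print(most_frequent_words(sample, top=1, stop_words={"a", "v", "se"}))
```

Calling the function once without and once with the stop-word set gives the two kinds of columns discussed above: function words first, then words with meaning.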

3.3 Number of Words

A scatter chart is used to demonstrate vocabulary richness. The x coordinate is the total number of words in a speech and the y coordinate is the number of different words. The functional dependency can be modeled by a Gompertz curve. Presidents with values above the curve have a greater ratio of different words to total words than the other presidents, whereas the language richness of speeches below the curve can be considered lower. In [9] the author notes that the thematic concentration of president Havel is surprisingly low. This claim does not seem to correspond with language richness: according to Fig. 10, the ratio of the number of different words to the total number of words is greater for Havel. The discrepancy could be caused by the different methods of evaluating language richness (Fig. 11).
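The two coordinates of the scatter chart, and the comparison of a speech with a Gompertz curve, can be sketched in Python as below. The paper gives neither the parametrization nor the fitted parameters, so the common form y = a·exp(−b·exp(−c·x)) and the values of `a`, `b` and `c` used here are assumptions for illustration only; fitting the curve to real data is not shown.

```python
import math
import re


def token_and_type_counts(text):
    """Return (total number of words, number of different words)."""
    words = re.findall(r"\w+", text.lower())
    return len(words), len(set(words))


def gompertz(x, a, b, c):
    """Gompertz curve in the form a * exp(-b * exp(-c * x)) (assumed form)."""
    return a * math.exp(-b * math.exp(-c * x))


def is_above_curve(total, different, a, b, c):
    """True if the speech has more different words than the curve predicts."""
    return different > gompertz(total, a, b, c)


total, different = token_and_type_counts("a b a c")
print(total, different)   # 4 3

# Hypothetical curve parameters, chosen only to make the example run.
print(is_above_curve(total, different, a=10.0, b=1.0, c=1.0))
```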

4 Conclusions

The goal of this article is to present the results of our research and to show that existing data can be processed in a different way. Information extraction is much discussed nowadays. The main purpose of the research is to find the words with the greatest energy: they have historical importance, can be used as keywords and even characterize the speaker.

The scope of this publication does not allow a detailed commentary on, or description of, the algorithms used. Many hours of machine time were needed to calculate the phonetic parameters of the speeches. The archive [13] contains 74 speeches, which amounts to more than 19 h of recordings to be analyzed. Before the mean values of ZCR and log energy were calculated, there was an extensive table for each speech. The presented parameters were obtained by reducing this table, which contains millions of per-frame parameter values, to a single mean value. The unreduced data may be used for further analysis.

Comparing the table of the most frequent thematic words with the table of words pronounced with the greatest energy yields almost no match. The speakers did not emphasize the most frequent words but chose to highlight other words. For example, Masaryk talked about the economy, Beneš emphasized war and humankind, Novotný insisted on hard work and on improving the communist country, and Havel emphasized the very first words: “Dear fellow citizens.” Such words can provide some information without listening to the whole speech. They even characterize the president himself and his era (the most important events, the standard of living, the relationship between president and citizens).

References

1. Liang, B., et al.: Feature analysis and extraction for audio automatic classification. In: 2005 IEEE International Conference on Systems, Man and Cybernetics, pp. 767–772. IEEE (2005)
2. Liu, Z., Wang, Y., Chen, T.: Audio feature extraction and analysis for scene segmentation and classification. J. VLSI Signal Process. 20(1), 61–79 (1998)
3. Bhattacharjee, R.: Short Term Time Domain Processing of Speech (Theory): Speech Signal Processing Laboratory. IIT GUWAHATI Virtual Lab, Guwahati (2016). http://iitg.vlab.co.in/?sub=59&brch=164&sim=857&cnt=1. Accessed 5 Dec 2016
4. Kanda, N., et al.: Open-vocabulary keyword detection from super-large scale speech database. In: 2008 IEEE 10th Workshop on Multimedia Signal Processing, pp. 939–944. IEEE (2008)
5. Nouza, J., et al.: Very large vocabulary speech recognition system for automatic transcription of Czech broadcast programs. In: INTERSPEECH (2004)
6. Wasson, D., Donaldson, R.: Speech amplitude and zero crossings for automated identification of human speakers. IEEE Trans. Acoust. Speech Signal Process. 23(4), 390–392 (1975)
7. Kubát, M., Čech, R.: Quantitative analysis of US Presidential Inaugural Addresses. Glottometrics 34, 14–27 (2016)
8. Tuzzi, A., Popescu, I.-I., Altmann, G.: Quantitative Analysis of Italian Texts. RAM-Verlag, Lüdenscheid (2010)
9. Čech, R.: Language and ideology: quantitative thematic analysis of New Year speeches given by Czechoslovak and Czech presidents (1949–2011). Qual. Quant. 48, 899–910 (2014)
10. Šlahora, J.: Matematická lingvistika a překlady básně E.A. Poea Havran. Master thesis. University Pardubice, Faculty of electrical engineering and informatics, Pardubice (2015). https://dk.upce.cz/bitstream/handle/10195/60403/SlahoraJ_MatematickaLingvistika_JM_2015.zip?sequence=1&isAllowed=y. Accessed 5 Dec 2016
11. Bachu, R., Kopparthi, S., Adapa, B., Barkana, B.: Separation of voiced and unvoiced using zero crossing rate and energy of the speech signal. In: American Society for Engineering Education (ASEE) Zone Conference Proceedings, pp. 1–7 (2008)
12. Marek, J., Šlahora, J.: Měření podobnosti překladů básně Havran. Forum Statisticum Slovacum, vol. X(5), pp. 90–95 (2013). ISSN: 1336-7420
13. URL: http://www.rozhlas.cz/zpravy/data/_zprava/od-tgm-k-zemanovi-poslechnete-si-vanocni-a-novorocni-projevy-vsech-prezidentu--1436738. Accessed 17 Dec 2016

Author information

Authors and Affiliations

Fakulta elektrotechniky a informatiky, Univerzita Pardubice, Studentská 95, 530 02, Pardubice I, Czech Republic
Milan Jičínský & Jaroslav Marek

Corresponding author

Correspondence to Milan Jičínský.

Editor information

Editors and Affiliations

Bialystok University of Technology, Bialystok, Poland
Khalid Saeed
Warsaw University of Technology, Warsaw, Poland
Władysław Homenda
A.K. Choudhury School of Information Technology, University of Calcutta, Kolkata, West Bengal, India
Rituparna Chaki

About this paper

Cite this paper

Jičínský, M., Marek, J. (2017). New Year’s Day Speeches of Czech Presidents: Phonetic Analysis and Text Analysis. In: Saeed, K., Homenda, W., Chaki, R. (eds) Computer Information Systems and Industrial Management. CISIM 2017. Lecture Notes in Computer Science, vol 10244. Springer, Cham. https://doi.org/10.1007/978-3-319-59105-6_10

DOI: https://doi.org/10.1007/978-3-319-59105-6_10
Published: 17 May 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59104-9
Online ISBN: 978-3-319-59105-6
eBook Packages: Computer Science, Computer Science (R0)
