PonDeFlick: A Japanese Text Entry on Smartwatch Commonalizing Flick Operation with Smartphone Interface

Authors:

Kai Akamine,

Ryotaro Tsuchida,

Tsuneo Kato,

Akihiro Tamura

CHI '24: Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems

Article No.: 941, Pages 1 - 11

https://doi.org/10.1145/3613904.3642569

Published: 11 May 2024

Abstract

While the QWERTY keyboard is the standard text entry method for Latin script languages on smart devices, this is not always the case for non-Latin script languages. In Japanese, the most popular text entry method on smartphones is a flick-based interface that systematically assigns more than fifty kana characters to twelve keys of a numeric keypad in combination with flick directions. Under these circumstances, studies on Japanese text entry on smartwatches have focused on efficient interface designs that take advantage of the regularity of the kana consonant and vowel structure, but have overlooked commonality with familiar interfaces. Thus, we propose PonDeFlick, a Japanese text entry that commonalizes the flick directions with the familiar smartphone interface while providing the entire touchscreen for gestural operation. A ten-day user study showed that PonDeFlick reached a text-entry speed of 57.7 characters per minute, significantly faster than the numeric-keypad-based interface and a modification of PonDeFlick without the commonality.

Figure 1:

1 Introduction

Smartwatches are spreading as wearable devices mainly for health monitoring and receiving notifications. They are expected to be used not only passively but also actively if their text entry becomes easier to use. In interacting with the small touchscreen, a menu must be concisely allocated with visual information, and the “fat finger” and occlusion problems [28] must be considered. Efficient menu interfaces with minimal occlusion [20, 21] and those for eyes-free interaction [5, 24] have been recently proposed. In implementing text entries, the problems have been mainly tackled with three approaches. The first one involves focusing on a part of a keyboard by zooming in [22], splitting [4, 11], or showing a call-out [17]. The second approach involves assigning multiple characters to a reduced number of keys and making these selectable [14, 15, 27]. The third approach involves statistical decoding using probabilistic touch and language models [8, 31, 32]. Statistical decoding enables precise tap and gestural typing on a full QWERTY keyboard on a smartwatch. The QWERTY layout serves as a standard virtual keyboard on smartwatches in Latin script languages.

However, this is not always the case for non-Latin script languages. In Japanese, text entry has additional challenges, such as having more than fifty syllabic characters, kana, to enter and subsequent kana-kanji conversion, which converts a sequence of kana characters into a standard Japanese text style with a mixture of kanji and kana. Figure 2 shows a basic set of kana syllabary; a kana is composed of a consonant and a vowel (CV) (or a single vowel). More than fifty kana are made up with additional symbols for voiced consonants, contracted sounds, and double consonants. Before smartphones prevailed, Japanese users became familiar with a Japanese text entry based on a numeric keypad on feature phones. This text entry interface assigns five kana of a common consonant (i.e., characters in a row in Figure 2) to a numeric key and the symbols and marks to the remaining two keys; thus, five kana can be toggled through multiple taps. This interface was inherited by smartphones, as shown in Figure 3. The virtual numeric keypad introduced flicking in addition to multiple tapping for kana selection. Specifically, simple tapping selects a kana with the vowel ’a’ and flicking in four directions corresponds to four characters with the vowels ’i’, ’u’, ’e’, and ’o’. Most Japanese users use the virtual numeric-keypad-based text entry on smartphones.

Under these circumstances, smartwatches have no standard Japanese text entry interface. Although the keys of the numeric keypad are larger than those of a QWERTY keyboard, simple porting of the numeric keypad onto a smartwatch makes the key size and spacing very small. Flick operation generally requires a wider area than tapping or swiping. Tojo et al. proposed a Japanese text entry with flick operation based on kana keys allocated in an annular layout [30]. In light of the examples, we consider the essential factors to be 1) a space-efficient key layout with big key size and spacing, 2) minimal operations to enter a character, and 3) the regularity of operations that is simple enough to remember. Previous studies have focused on interface design that efficiently allocates the keys on a small touchscreen and takes advantage of the regularity of the kana CV structure in kana selection, but overlooked commonality regarding flick directions with familiar smartphone interfaces. Thus, we propose PonDeFlick, an annular-layout-based Japanese kana text entry that provides the entire touchscreen for gestural operation and simplifies the gestural operation by commonalizing the flick directions with familiar smartphone interfaces.

We conducted a ten-day user study comparing PonDeFlick with a miniature numeric-keypad-based flick text entry and a modification of PonDeFlick that keeps the regularity of the kana CV structure but does not commonalize the flick directions. The user study revealed the effectiveness of commonality in flick operation.

Figure 2:

Figure 3:

2 Related Studies

When implementing a keyboard on a smartwatch, the small form factor makes precise typing difficult. Various solutions to the "fat finger" and occlusion problems have been proposed. The solutions can be roughly classified into three groups.

The first category is zooming a part of the keyboard [3, 4, 11, 17, 22]. ZoomBoard [22] zooms in on a small QWERTY keyboard by taps, and an additional tap specifies the desired key. Swipeboard [4] divides a QWERTY keyboard into nine regions, allowing users to swipe twice to enter a character. The first swipe specifies the area where the character is located, and the second swipe specifies the desired character. SplitBoard [11] displays half of a QWERTY keyboard on a small touchscreen, enabling a user to switch to the other half and an additional page for numbers and special characters by horizontal flicking. Virtual Sliding QWERTY [3] enables a user to move an oversized keyboard to a desired position by tap-and-drag operation. ZShift keyboard [17] displays a call-out showing a zoomed-in image of the touched area in a non-occluded area [33], enabling a user to change the character if needed by shifting the touching finger slightly. A systematic literature review [18] covered early works before 2018. These interfaces require additional taps or swipes to manipulate the display.

The second category is keyboards with a reduced number of keys, which assign multiple characters to each key and provide a mechanism for selecting among them [10, 14, 15, 27]. SwipeKey [27] is a Latin script text entry that determines a letter based on tapping and flick directions on a reduced number of square keys in a tiled layout. The work tested various key sizes and numbers of flick directions and optimized the configuration in a 25 mm x 15 mm rectangular keyboard on a smartwatch. The user study showed that six keys of 7.5 mm x 7.5 mm squares with five gestures (tapping and four flick directions) recorded the fastest text-entry speed and the lowest error rate. This configuration obtained the lowest difficulty score and the highest preference in subjective evaluation. The work also reported that the error rate increased drastically if the key size was smaller than 5.7 mm x 5.7 mm. DualKey [10] assigns two adjacent letters in the QWERTY keyboard to a key and makes the two selectable by finger identification between the index and middle fingers. UOIT keyboard [14] divides 26 English letters into 13 frequent one-keystroke letters and the other 13 two-keystroke letters and defines an easy-to-learn rule that maps them to 13 frequent letters (’u’, ’o’, ’i’, ’t’ etc. for the one-keystroke letters) and pairs of the 13 letters for the two-keystroke letters. Meanwhile, ambiguous keyboards [6, 15] reduce the load of specifying letters. Komninos’s ambiguous keyboard provides context-based word suggestions, word completion, and next-word suggestions on a six-key keypad in an alphabetical layout [15]. WrisText [6] enables one-handed text entry by whirling the wrist of the watch hand toward six directions of an annular ambiguous keyboard.

The third category is statistical decoding [7, 8, 31, 32] using probabilistic touch and language models. The models enable precise detection of key touches and accurate prediction of the next words. Google’s WatchWriter [8] provided precise tap typing and gesture typing on a miniature QWERTY keyboard based on their Smart Touch Keyboard and Smart Gesture Keyboard techniques developed on smartphones [25]. VelociTap [32] achieved a text-entry speed of 41 words per minute on a 40-mm-wide keyboard with a sentence-based decoder incorporating a probabilistic touch model, a 12-gram character language model, and a 4-gram word language model. While the statistical decoding enables fast text entry with suggestion, auto-completion, and auto-correction, it sometimes causes errors, especially in entering rare words such as proper nouns. VelociWatch [31] tackled a challenging text input task with error avoidance functions such as letter locking and selection slots. The statistical decoding techniques strongly support efficient text entry regardless of keyboard type.

Various virtual keyboards on smart devices have been proposed for non-Latin script languages. The circumstances of Korean text entry are similar to those of Japanese. The most popular virtual keyboard on smartphones is a QWERTY-like Korean keyboard. Some manufacturers released original text entries on the numeric keypad of feature phones, and smartphones inherited them. Ilinkin et al. [12] implemented four popular types of Korean text entries on smartwatches and conducted a comparative evaluation. Although the QWERTY-like Korean keyboard is preferred for two-thumb typing on smartphones, the three numeric-keypad-based text entry interfaces performed better than the QWERTY-like Korean keyboard on smartwatches. These numeric-keypad-based interfaces were based on tapping. Flick-based interfaces generally require a wider area. As a Japanese text entry on smartwatches, BubbleFlick [30] rearranged the twelve keys of the numeric keypad in an annular layout, providing the widest area possible for flick operation while also leaving an area for editing text. Though it opened up the entire touchscreen for flick operation, the flick directions changed depending on the key, which made the interface hard to learn even after 30 days of use.

In menu interface research, the Marking Menu [16], which enables command selection by directional gesture, has been influential: it facilitates users’ smooth transition from a novice mode to an expert mode and has inspired many variations. For instance, FlowMenu [9] enables consecutive command selection by combining the marking menu with Quikwriting [23] based on an octant with a rest area in the center. Text entry and command selection face a similar challenge in that many commands must be grouped clearly and be easily selectable. Zone menu [34], which increases the number of commands by setting multiple zones to start directional gestures, forms groups of commands. The distance extended marking menu [19] effectively increases the hierarchical levels of the Marking Menu. The hierarchical gestures suggest solutions for two-step text entry [13, 27, 30] to meet the three requirements in the introduction. The marking menu is further extended to those initiating from a bezel, such as Bezel Menus [13], which proposes a Latin alphabet text entry on smartphones similar to our PonDeFlick, Bezel-Tap Gestures [26] on tablets, Bezel-to-bezel interaction for eyes-free interaction on smartwatches [24], and bezel-based selection interfaces for minimal-occlusion interaction [20, 21]. Menu interfaces and text entry also differ. In menu selection, users usually select commands intermittently and memorize only the frequently used ones, whereas in text entry they have to select a variety of commands in succession. The gestural operation should therefore impose a light cognitive load. In other words, the operation should be more reflexive for text entry.

3 PonDeFlick

PonDeFlick is an interface that allocates necessary keys and a text-editing area efficiently in a small touchscreen while providing the entire area for flick operation that has commonality with the popular Japanese text entry on smartphones. Figure 1 shows screenshots of PonDeFlick. Panel (a) is the initial screen. Ten keys of representative kana, which are the heading characters of the rows circled in the kana syllabary table (Figure 2), and two keys for symbols and marks, which make up twelve in total, are arranged in an annular layout. The size of a key is 6.82 mm in diameter, which is greater than 5.7 mm specified in [27], and the spacing between the keys is 0.64 mm. The text editing area in the center shows three lines of entered text with seven kana per line.

Forty-six kana are systematically assigned to the ten keys in combination with tapping and four flick directions. The leftmost of the two bottom keys is for adding a voiced sound mark or modifying a kana to a double consonant or contracted sound. The rightmost one is for adding punctuation, i.e., point, comma, question, or exclamation point. An additional key inside the ring is a completion key. A leftward flick in the text editing area works as a backspace. Vertical flicking is used for scrolling up and down the entered text. Panel (b) illustrates the entering of ’na’, which comprises the consonant ’n’ and vowel ’a’. The bold yellow shades illustrate trajectories of gestural strokes. A finger touches down on the ’na’ key and then touches up on the key. A flick guide indicating the flick direction is displayed 0.3 seconds after the touchdown. Panel (c) illustrates the entering of ’ne’, which comprises the consonant ’n’ and vowel ’e’. A finger touches down on the ’na’ key, slides slightly towards the center, and then flicks rightward. One of four kana characters with vowels ’i’, ’u’, ’e’, and ’o’ is determined by the flick direction, i.e., leftward for ’i’, upward for ’u’, rightward for ’e’, and downward for ’o’. The correspondence between the flicking direction and the vowel is shown in Figure 4. This correspondence is common with that for selecting a kana on the numeric-keypad-based Japanese text entry on smartphones.
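The kana selection rule described above can be sketched as a small lookup table. This is only an illustrative sketch, not the authors' implementation: the names `KANA_ROWS`, `GESTURE_TO_VOWEL_INDEX`, and `select_kana` are hypothetical, and only three of the ten kana rows are included.

```python
# Illustrative subset of the kana rows (hypothetical data structure).
# Each string lists the kana of one key in vowel order a, i, u, e, o.
KANA_ROWS = {
    "a": "あいうえお",
    "ka": "かきくけこ",
    "na": "なにぬねの",
}

# Gesture-to-vowel correspondence stated in the text:
# tap -> 'a', leftward -> 'i', upward -> 'u', rightward -> 'e', downward -> 'o'.
GESTURE_TO_VOWEL_INDEX = {"tap": 0, "left": 1, "up": 2, "right": 3, "down": 4}


def select_kana(key: str, gesture: str) -> str:
    """Return the kana for a key (named by its representative kana) and a gesture."""
    return KANA_ROWS[key][GESTURE_TO_VOWEL_INDEX[gesture]]


if __name__ == "__main__":
    print(select_kana("na", "tap"))    # the 'na' example of Panel (b)
    print(select_kana("na", "right"))  # the 'ne' example of Panel (c)
```

Because the gesture table is independent of the key table, the same correspondence can be reused for any key layout, which is the commonality the interface relies on.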

In operating PonDeFlick, a finger stroke changes its direction. To detect flicking and recognize its direction anywhere on the surface, we developed an algorithm to search for the final inflection point, which is considered the starting point of the flicking. Figure 5 illustrates the algorithm.

Figure 4:

Figure 5:

Let the sequence of points in the stroke be denoted as \(P_0 = (x_0, y_0)^T\), \(P_1 = (x_1, y_1)^T\), \(\cdots\), \(P_N = (x_N, y_N)^T\), where \(P_0\) and \(P_N\) are the touch-down and touch-up points, respectively. Let the inflection point be denoted as \(S\), which is initialized with \(P_0\). We set two thresholds: a minimal travel distance \(D\) for flick detection and an angular threshold \(\theta\) for inflection point detection.

After touch-down, touched positions are sampled continuously at intervals of a few milliseconds as \(P_0, P_1, \cdots, P_N\). If a touch-up is detected on one of the keys with a travel distance from \(P_0\) below \(D\), as shown in Panel (b) of Figure 1, the kana with the vowel ’a’ is selected. Once the travel distance exceeds \(D\), a search for a new inflection point runs each time a new touched position \(P_n\) is obtained. The new inflection point candidate is \(P_{n-k}\) (\(k \ge 1\)), the point closest to \(P_n\) among the earlier points whose travel distance to \(P_n\) exceeds \(D\); it is \(P_{n-2}\) in Figure 5. The displacement vector from \(P_{n-k}\) to \(P_n\) is \(\overrightarrow{P_{n-k}P_{n}}\), and the vector from the current \(S\) to \(P_{n-k}\) is \(\overrightarrow{SP_{n-k}}\). The angle \(\alpha\) formed between the two vectors is measured. If \(\alpha \ge \theta\), the inflection point is updated to \(P_{n-k}\) (\(S = P_{n-k}\)). \(D\) and \(\theta\) are set to 30 dp (approximately 4.8 mm) and 70 degrees, respectively.
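The inflection point search can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' code: "travel distance" is taken to be the straight-line distance between two samples, candidates for which the angle is undefined (a zero-length vector) are skipped, coordinates are in dp, and the function names are hypothetical.

```python
import math

D_DP = 30.0        # minimal travel distance D (30 dp, from the text)
THETA_DEG = 70.0   # angular threshold theta (70 degrees, from the text)


def _angle_deg(u, v):
    """Angle between 2-D vectors u and v in degrees, or None if undefined."""
    nu, nv = math.hypot(*u), math.hypot(*v)
    if nu == 0.0 or nv == 0.0:
        return None
    cos = (u[0] * v[0] + u[1] * v[1]) / (nu * nv)
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


def final_inflection_point(points, d=D_DP, theta=THETA_DEG):
    """Return S, the last inflection point of a stroke P_0 .. P_N."""
    s = points[0]  # S is initialized with the touch-down point P_0
    for n in range(1, len(points)):
        p_n = points[n]
        # Candidate P_{n-k}: nearest earlier sample farther than d from P_n
        # (assumption: straight-line distance).
        cand = None
        for k in range(1, n + 1):
            if math.dist(points[n - k], p_n) > d:
                cand = points[n - k]
                break
        if cand is None:
            continue  # the stroke has not yet travelled far enough
        s_to_cand = (cand[0] - s[0], cand[1] - s[1])
        cand_to_pn = (p_n[0] - cand[0], p_n[1] - cand[1])
        alpha = _angle_deg(s_to_cand, cand_to_pn)
        if alpha is not None and alpha >= theta:
            s = cand  # the stroke turned sharply: update S
    return s


if __name__ == "__main__":
    # Slide 40 dp down (toward the center), then 50 dp to the right.
    stroke = [(0, 0), (0, 10), (0, 20), (0, 30), (0, 40),
              (10, 40), (20, 40), (30, 40), (40, 40), (50, 40)]
    print(final_inflection_point(stroke))
```

For the example stroke, the returned point is the corner where the downward slide turns into the rightward flick, so the final displacement vector points rightward.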

When a touch-up is detected, the final displacement vector \(\overrightarrow{SP_N}\) determines the flick direction. Among the unit vectors \(\vec{d_i}=(-1, 0)^T\), \(\vec{d_u}=(0, -1)^T\), \(\vec{d_e}=(1, 0)^T\), and \(\vec{d_o}=(0, 1)^T\), the one that gives the largest inner product with \(\overrightarrow{SP_N}\) determines the kana to enter.
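The direction decision can be sketched as an argmax over four inner products. The function name `flick_vowel` is hypothetical, and the sketch assumes screen coordinates with the y axis pointing downward, which is what \(\vec{d_u}=(0, -1)^T\) for the upward direction implies.

```python
# Unit vectors from the text, keyed by the vowel they select.
DIRECTIONS = {
    "i": (-1, 0),  # leftward
    "u": (0, -1),  # upward (y axis points down on the screen)
    "e": (1, 0),   # rightward
    "o": (0, 1),   # downward
}


def flick_vowel(s, p_n):
    """Return the vowel whose direction best matches the vector from S to P_N."""
    vx, vy = p_n[0] - s[0], p_n[1] - s[1]
    return max(DIRECTIONS, key=lambda v: vx * DIRECTIONS[v][0] + vy * DIRECTIONS[v][1])


if __name__ == "__main__":
    print(flick_vowel((0, 40), (50, 40)))  # a rightward flick
```

Taking the maximum inner product splits the plane into four 90-degree sectors, so a flick does not need to be exactly horizontal or vertical to be recognized.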

Figure 6:

4 User Study

4.1 Design

We conducted a ten-day continuous user study. We compared the performance of PonDeFlick with two other interfaces. One is PonDeSlide, a modification of PonDeFlick, which has regularity in selecting one among five kana, but does not commonalize the flick directions with those of the familiar smartphone interface. In an internal pilot study, we had verified that PonDeSlide performed better than BubbleFlick [30] in both objective and subjective metrics (see Appendix A for details). The other is a numeric-keypad-based Japanese text entry interface that Google released in 2017. We call this interface "KeypadFlick" hereafter. KeypadFlick provides a baseline to evaluate how fast and how much users learn our original interfaces. The comparison between PonDeFlick and PonDeSlide reveals the effect of commonalizing flick directions. PonDeSlide and KeypadFlick are described in the following subsections. We recruited thirty-four participants and assigned each participant one of the three interfaces.

4.2 Participants

Thirty-four undergraduate and graduate students (19 men and 15 women with ages ranging from 18 to 30) participated in the user study. All participants used a numeric-keypad-based Japanese text entry interface on their smartphones every day.

To assign the interfaces, we had all participants enter a set of five Japanese sentences with a numeric-keypad-based text entry on their smartphones. We measured the text-entry speed and assigned the interfaces so that the groups would have comparable average speeds. The PonDeFlick, PonDeSlide, and KeypadFlick groups had 12, 11, and 11 participants, respectively. We did not inform the participants of any interface other than the one assigned to them.

4.3 Apparatus

The smartwatch model we used was Google Pixel Watch. It has a watch face with a diameter of 41.0 mm and a resolution of 320 ppi.

4.4 Phrase Sets

We composed ten sets of fifteen to eighteen short Japanese sentences using only basic words to present the participants with a new set each day. Figure 7 shows a sample set of sentences. Each set of sentences was designed to form a kana pangram and amounted to 268 kana on average. Figure 6 shows the relative frequency distributions of kana included in all sentence sets and in a corpus of 15k sentences written on a smartphone-based Japanese SNS. The distribution of the sentence sets had a Pearson’s correlation coefficient of 0.70 with that of the 15k-sentence corpus. No Latin script or Arabic numerals were included in any set.
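A pangram check of this kind can be sketched as follows. This is a hedged illustration, not the authors' tooling: it assumes that a pangram means covering the 46 basic hiragana, that voiced and semi-voiced kana count as their base kana, and that small kana are ignored. The names `BASIC_KANA` and `missing_kana` are hypothetical.

```python
import unicodedata

# The 46 basic hiragana (assumed definition of the pangram target).
BASIC_KANA = set(
    "あいうえお" "かきくけこ" "さしすせそ" "たちつてと" "なにぬねの"
    "はひふへほ" "まみむめも" "やゆよ" "らりるれろ" "わをん"
)

# Combining voiced (U+3099) and semi-voiced (U+309A) sound marks.
_SOUND_MARKS = "\u3099\u309a"


def _strip_sound_marks(text: str) -> str:
    """Map voiced kana to their base kana via canonical decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if ch not in _SOUND_MARKS)


def missing_kana(sentences):
    """Return the basic kana that do not occur in the given sentences."""
    seen = set(_strip_sound_marks("".join(sentences)))
    return BASIC_KANA - seen


if __name__ == "__main__":
    # A set is a pangram under these assumptions when nothing is missing.
    print(sorted(missing_kana(["あいうえお", "がぎぐげご"])))
```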

We prepared another set of five sentences for measuring a text-entry speed on a smartphone. Each sentence was composed of 25 kana, which came up to 125 in total.

Figure 7:

4.5 Procedure

We had the participants enter a daily-changing set of sentences over 10 consecutive days. On the first day, an experimenter lent the participants a smartwatch with a briefing, measured a text-entry speed on their smartphone with the numeric-keypad-based flick text entry, and assigned a smartwatch interface. The participants were instructed to enter all text in hiragana, correct errors as best they could by using backspacing, and not to use the kana to kanji conversion or the predictive conversion functions available in KeypadFlick. They were also instructed not to use the text entry for extra hours. The participants wore the smartwatch on the non-dominant wrist and operated it with the dominant hand’s index finger.

On days 2 to 10, the participants entered a specified set of sentences at home. They had a single opportunity to enter each sentence, without practice in advance. The sentences were presented by using Google Forms. All entered texts were recorded in a log file.

4.6 Metrics

Performance was objectively measured on the basis of text-entry speed in characters per minute (CPM) and total kana error rate, which counts both corrected and uncorrected errors [29], in errors per character (EPC) based on the log file. The text-entry speed was influenced by the time required to correct errors.
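The two objective metrics can be sketched as below. The error-rate part assumes one commonly used formulation in text entry research, in which entered characters are classified as correct (C), incorrect but fixed (IF), or incorrect and not fixed (INF); whether the study computed its rates in exactly this way is an assumption, and the function names are hypothetical.

```python
def chars_per_minute(num_chars: int, seconds: float) -> float:
    """Text-entry speed in characters per minute (CPM)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return num_chars * 60.0 / seconds


def error_rates(correct: int, incorrect_fixed: int, incorrect_not_fixed: int):
    """Return (total error rate, not-corrected error rate) as fractions.

    Assumed formulation:
        total         = (IF + INF) / (C + IF + INF)
        not-corrected = INF / (C + IF + INF)
    """
    total = correct + incorrect_fixed + incorrect_not_fixed
    if total == 0:
        raise ValueError("no characters were entered")
    ter = (incorrect_fixed + incorrect_not_fixed) / total
    ncer = incorrect_not_fixed / total
    return ter, ncer


if __name__ == "__main__":
    # Hypothetical log summary, not data from the study.
    print(chars_per_minute(268, 300.0))
    print(error_rates(correct=250, incorrect_fixed=20, incorrect_not_fixed=5))
```

Because the elapsed time includes the time spent backspacing and retyping, a higher corrected-error count lowers the CPM even when the final text is correct.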

We measured participants’ subjective assessment on the System Usability Scale (SUS) [2] and conducted interviews after they completed the user study.

Figure 8:

4.7 PonDeSlide

PonDeSlide is a modification of PonDeFlick. The key layout and kana-character assignment are the same as PonDeFlick. Figure 8 shows screenshots: entering ’na’ and ’ne’, respectively. When a finger touches down on a key, five kana characters are displayed in a line toward the center, aligned in the order of vowels ’a’, ’i’, ’u’, ’e’, and ’o’. This interface is simple enough for users to easily select a kana by where to release the finger while sliding. Backspacing is implemented in the same way as PonDeFlick, by flicking leftward in the lower half of the central area.

4.8 Japanese Text Entry on Numeric Keypad

Google’s Japanese text entry for Android smartwatches, released in 2017, is a miniature of the popular Japanese text entry on smartphones, based on flick operation on a numeric keypad. Figure 9 shows its screenshots: the initial screen and the screen shown when a finger touches down on the ’na’ key. The keypad in the lower center has a 3x4 key layout with 10 kana keys and two keys for symbols and marks. When a finger touches down on a key, a flick guide is displayed over the keypad, indicating five kana with flick directions. The seven keys surrounding the lower side of the keypad are, from left to right, 1) move the cursor leftward, 2) switch script types, 3) enter symbols, 4) space, 5) delete, 6) enter, and 7) move the cursor rightward. Predicted word candidates are displayed over the keypad. The size of a key is 5.0 mm in width and 2.5 mm in height.

Figure 9:

Text entry interface	Number of participants	Mean CPM [char/min], Day 1	Mean CPM [char/min], Day 5	Mean CPM [char/min], Day 10	Mean TER (NCER) [%], Day 1	Mean TER (NCER) [%], Day 5	Mean TER (NCER) [%], Day 10
PonDeFlick	12	27.2	46.3	57.7	22.6 (3.0)	12.6 (3.0)	10.4 (3.6)
PonDeSlide	11	25.5	38.9	42.3	16.0 (3.3)	10.3 (1.5)	11.5 (2.1)
KeypadFlick	11	33.8	47.6	49.2	28.0 (3.0)	21.6 (1.5)	19.6 (1.3)

Table 1: Mean text-entry speeds in CPM, and total error rates (TER) with not-corrected error rates (NCER) in parentheses, in EPC expressed as percentages, on the first, fifth, and final days for the three participant groups.

5 Results of User Study

Table 1 lists the mean text-entry speeds (CPMs) and error rates (EPCs) on the first, fifth, and final days for the three groups.

Figure 10:

5.1 Text-entry Speed

Figure 10 shows the daily mean CPMs. The CPMs on the first day were 27.2, 25.5, and 33.8 for the PonDeFlick, PonDeSlide, and KeypadFlick groups, respectively. Those on the final day (day 10) reached 57.7, 42.3, and 49.2, respectively. We conducted ANOVA tests on the daily CPMs, with the interface as a factor. There were significant differences on days 1 to 3 and 8 to 10 at a significance level of 0.05. Bonferroni’s multiple comparisons showed significant differences between KeypadFlick and PonDeSlide on days 1 to 3, between PonDeFlick and PonDeSlide on days 8 to 10 (p=0.018, 0.004, <0.001), and between PonDeFlick and KeypadFlick on day 10 (p=0.018). The KeypadFlick group was the fastest on the first day, probably because the participants were familiar with the interface from their smartphones. However, the CPM of the PonDeFlick group increased rapidly, caught up with that of the KeypadFlick group, and surpassed it on day 10. The CPM of the PonDeFlick group appears to be still increasing on day 10, while those of the other groups appear to have stabilized. The mean CPM of the PonDeSlide group was more than 5 CPM lower than those of the other groups after day 7.

Figure 11:

5.2 Correlation Between Text-entry Speeds on Smartwatch and Smartphone

The effectiveness of commonalizing flick directions with the smartphone interface can be verified from the correlation between the CPMs on smartwatches and smartphones. Figure 11 shows a scatter plot of all participants’ CPMs on smartwatches and smartphones. The horizontal axis represents the CPM on a smartphone measured on the first day, and the vertical axis represents the CPM on a smartwatch averaged over days 6 to 10. The distributions of the smartphone CPMs were not biased among the three groups: the mean smartphone CPMs were 115.5, 113.9, and 116.6 for the PonDeFlick, PonDeSlide, and KeypadFlick groups, respectively. Correlations were observed between the CPMs on smartphones and smartwatches for PonDeFlick and KeypadFlick, whereas no correlation was observed for PonDeSlide. Pearson’s correlation coefficients were 0.87, 0.79, and -0.09 for PonDeFlick, KeypadFlick, and PonDeSlide, respectively. The three regression lines show that the participants who typed faster on a smartphone benefited more from PonDeFlick and KeypadFlick (i.e., from the commonality in flick directions) than those who typed slower.
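For reference, Pearson’s correlation coefficient between paired smartphone and smartwatch speeds can be computed as in the sketch below. The function name `pearson_r` is hypothetical, and the sample values are made up for illustration; they are not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical per-participant CPMs (smartphone, smartwatch).
    phone_cpm = [95.0, 105.0, 118.0, 126.0, 140.0]
    watch_cpm = [44.0, 50.0, 53.0, 60.0, 66.0]
    print(round(pearson_r(phone_cpm, watch_cpm), 2))
```

A coefficient near 1 for a group means that participants who are fast on the smartphone also tend to be fast on the smartwatch, which is the pattern reported for PonDeFlick and KeypadFlick.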

5.3 Error Rate

Figure 12 shows the daily mean total error rates and not-corrected error rates in EPC. We refer to the total error rates as EPCs hereafter. The EPCs on the first day were 22.6, 16.0, and 28.0% for the PonDeFlick, PonDeSlide, and KeypadFlick groups, respectively. We conducted ANOVA tests on the daily EPCs, with the interface as a factor. There were significant differences on days 1, 4, 5, 7, 9, and 10 at a significance level of 0.05. Bonferroni’s multiple comparisons showed significant differences between KeypadFlick and PonDeFlick on days 4, 5, 7, 9, and 10, and between KeypadFlick and PonDeSlide on days 1, 4, 5, 7, and 9. The KeypadFlick group showed the highest EPCs, probably due to the small key size. The PonDeSlide group showed the lowest EPC on the first day, and its EPC gradually decreased to around 12%. The PonDeFlick group ranked between the KeypadFlick and PonDeSlide groups on the first day, but its EPC decreased to the same level as that of the PonDeSlide group on the final day. Not-corrected errors accounted for 12.7% of all errors. Note that the PonDeFlick group showed higher not-corrected error rates because the group included one participant who did not correct errors carefully.

Figure 12:

5.4 System Usability Scale Scores

Table 2 shows the mean SUS scores of the three groups. PonDeFlick, PonDeSlide, and KeypadFlick scored 84.4, 67.5, and 68.6, respectively. SUS scores of 80 and 70 correspond to adjective ratings of "excellent" and "good," respectively [1]. We conducted an ANOVA test on the SUS scores. There was a significant difference (F_{(2, 31)} = 4.03, p = 0.028), and Bonferroni’s multiple comparisons showed marginally significant differences between PonDeFlick and PonDeSlide (p = 0.051) and between PonDeFlick and KeypadFlick (p = 0.076).

Text entry interface	Mean SUS score	SD
PonDeFlick	84.4	11.1
PonDeSlide	67.5	19.1
KeypadFlick	68.6	17.2

Table 2: Mean and standard deviation (SD) of the SUS scores for the three groups.

5.5 Interview Survey

The experimenter interviewed the participants after they completed the user study. The major pros and cons collected from the comments are listed below, with the number of comments in parentheses.

(1) PonDeFlick

Pros

• I could type smoothly once I learned the key arrangement. (5)
• I could type as I did on my smartphone every day. (2)
• The keys were large enough to type stress-freely.

Cons

• It would be nice to have a function to enter kana with a vowel ’a’ after carelessly starting flicking. (1)
• Backspacing was difficult. (1)

(2) PonDeSlide

Pros

• I could type smoothly once I learned the key arrangement. (3)

Cons

• The finger occluded the slider. (1)
• Careful operation was needed on where to release the finger. (1)

(3) KeypadFlick

Pros

• I could type smoothly once I got used to the small keys. (2)

Cons

• The whole text could not be viewed when typing long. (5)
• I often typed unintended characters due to the small keys. (3)

PonDeFlick received favorable comments from more than half of the participants on the commonality with their familiar smartphone interface. The comments on PonDeSlide imply that it requires care about where to release the finger. The KeypadFlick group could be divided into two types: those who could type smoothly and those who struggled with typing on the small screen.

6 Discussion

Considering the CPMs, EPCs, SUS scores, and subjective comments, the keys of KeypadFlick on smartwatches were too small to operate comfortably, though its precise touch detection enabled fast typing. PonDeFlick and PonDeSlide had larger keys in a wider area, making their operation easier. The results with the different key sizes of KeypadFlick and our original interfaces are consistent with the finding of the SwipeKey work that the error rate increases drastically if the button size is smaller than 5.7 mm x 5.7 mm [27]. Comparing the two variations, we consider that PonDeSlide’s slide gesture, which determines a kana by where the finger is released, forced the user to operate it more carefully than PonDeFlick. PonDeFlick’s flick gesture alleviated the need for such care and enabled faster typing than PonDeSlide.

PonDeFlick can be viewed as an application of the marking menu with four flick directions to an annular key layout, whereas PonDeSlide is a linear menu. Regarding the comparison of marking and linear menus, a previous study reported experiments on learning with a grid-based marking menu (M3 Gesture Menu), a multi-stroke marking menu, and a linear menu on a smartphone [35]. The results showed that the users of the two marking menus became much faster after three ten-minute practice sessions, whereas those of the linear menu did not. Our experimental results showed the same tendency on a small smartwatch touchscreen. In our experiment, PonDeFlick showed that the speed of flicking made up for the extra time needed to slide toward the center. This finding might apply to more general menu interfaces on a small surface. Commonalizing flick operations with an interface familiar to users can make operation easier, even if the key layout differs. In other words, the flick operations can be designed separately from the key layout to some extent.

One limitation of this study is that the CPM of the PonDeFlick group may not have reached stable performance within the 10-day experiment. A longer experimental period is needed to assess performance in the long run. Another limitation is that we have not proposed a solution for square-face smartwatches. However, even if the key layout differs from that of PonDeFlick, it is probably effective to commonalize the flick directions with the numeric-keypad-based text entry on smartphones.

7 Conclusions

We proposed PonDeFlick, a Japanese text entry for smartwatches that commonalizes flick directions with the familiar smartphone interface and provides the entire touchscreen for the flick operation. The ten-day user study showed that the PonDeFlick group reached a mean text-entry speed of 57.7 CPM, which exceeded those of the PonDeSlide and KeypadFlick groups and demonstrated the effectiveness of commonalizing the flick directions. The correlation between all participants’ text-entry speeds on smartwatches and smartphones showed that the participants who typed faster on a smartphone benefited more from PonDeFlick than those who typed slower on a smartphone. PonDeFlick reached a mean EPC of around 11% after 8 days, about half that of KeypadFlick. The SUS scores and interview survey showed the advantages of PonDeFlick in usability.

A PonDeSlide Vs. BubbleFlick

Text entry interface	Number of participants	Mean CPM [char/min] Day 1	Mean CPM [char/min] Day 10	Mean CPM [char/min] Day 20	Mean EPC [error/char] Day 1	Mean EPC [error/char] Day 10	Mean EPC [error/char] Day 20
PonDeSlide	11	28.1	40.4	47.2	10.5	8.9	8.1
BubbleFlick	11	27.0	39.7	43.7	10.1	6.4	7.4
KeypadFlick	5	31.3	45.7	49.2	13.3	5.4	7.6

Table 3: Mean text-entry speeds in CPM and total error rates in EPC on first, tenth, and final days for PonDeSlide, BubbleFlick, and KeypadFlick groups.

Figure 13:

A.1 Overview of Pilot Study

Because users of BubbleFlick struggled to learn flick directions that changed depending on the key, we developed PonDeSlide, which improved the regularity of the kana selection operation, before developing PonDeFlick. We conducted a 20-day pilot study comparing the performance of PonDeSlide, BubbleFlick, and KeypadFlick. We recruited twenty-seven participants and assigned each of them to one of three groups according to the interface used: PonDeSlide, BubbleFlick, or KeypadFlick.

A.1.1 Participants.

Twenty-seven undergraduate and graduate students aged 21 to 24 participated in the pilot study. All participants had been using smartphones and a numeric-keypad-based Japanese text entry for three to eight years. Three of them had worn a smartwatch for less than a year but had never entered text with it.

A.1.2 Apparatus.

The smartwatch used in the pilot study was a Moto 360 2nd Gen, which has a round watch face with a diameter of 34.8 mm and a resolution of 263 ppi.

A.1.3 Phrase Sets.

We composed twenty sets of eighteen short Japanese sentences so that the participants were presented with a new set each day. Each set formed a kana pangram of about 300 kana characters.
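The pangram property of a phrase set can be checked mechanically. The following Python sketch assumes that "kana pangram" means every one of the 46 basic hiragana appears at least once; whether the authors also required voiced or small kana is not stated here, so that choice and the function names are illustrative only.

```python
# The 46 basic hiragana (assumption: this is the set a "kana pangram" must cover).
# Voiced kana such as が are distinct characters and do not count as か.
BASIC_HIRAGANA = frozenset(
    "あいうえお" "かきくけこ" "さしすせそ" "たちつてと" "なにぬねの"
    "はひふへほ" "まみむめも" "やゆよ" "らりるれろ" "わをん"
)


def missing_kana(sentences, required=BASIC_HIRAGANA):
    """Return the required kana that never occur in the given sentences."""
    used = set("".join(sentences))
    return required - used


def is_kana_pangram(sentences, required=BASIC_HIRAGANA):
    """True if the sentences together contain every required kana."""
    return not missing_kana(sentences, required)


# Usage: the classical Iroha poem covers every basic hiragana except ん.
iroha = [
    "いろはにほへと", "ちりぬるを", "わかよたれそ", "つねならむ",
    "うゐのおくやま", "けふこえて", "あさきゆめみし", "ゑひもせす",
]
print(missing_kana(iroha))
```

Reporting the missing characters, rather than only a yes or no answer, makes it easier to revise a draft phrase set until it is complete.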

A.1.4 Metrics.

The objective measurements were text-entry speed in CPM and total kana error rate in EPC. Subjective usability was assessed with the System Usability Scale (SUS). An interview was conducted after the study was completed.
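As a rough illustration of the two objective metrics, the Python sketch below computes CPM as entered kana per minute and EPC as erroneous kana per entered kana, expressed as a percentage. The EPC formula is an assumption made for illustration (corrected plus uncorrected errors divided by the number of kana); the paper's exact definition is given in its Metrics section and may differ. The function names are hypothetical.

```python
def chars_per_minute(num_chars, elapsed_seconds):
    """Text-entry speed: characters entered per minute."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return num_chars * 60.0 / elapsed_seconds


def errors_per_char_percent(num_corrected, num_uncorrected, num_chars):
    """Assumed total error rate: all erroneous kana per kana, in percent."""
    if num_chars <= 0:
        raise ValueError("num_chars must be positive")
    return 100.0 * (num_corrected + num_uncorrected) / num_chars


# Usage with made-up numbers: 300 kana entered in 6 minutes,
# with 20 corrected and 10 uncorrected errors.
speed = chars_per_minute(300, 360)
error_rate = errors_per_char_percent(20, 10, 300)
```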

A.1.5 Procedure.

We had the participants enter a daily-changing set of sentences over 20 consecutive days. On the first day, an experimenter briefed the participants, lent each of them a smartwatch, and randomly assigned a smartwatch interface. The participants were instructed to enter all text in hiragana, to correct errors as best they could using backspace, and not to use the kana-to-kanji conversion or the predictive conversion function available in KeypadFlick. They were also instructed not to use the text entry for extra hours.

Figure 14:

A.2 Results of Pilot Study

Table 3 lists the mean text-entry speeds (CPMs) and total error rates (EPCs) on the first, tenth, and final days for the three groups. Table 4 shows the mean SUS scores. Because PonDeSlide was faster in CPM and gained a higher SUS score than BubbleFlick, although the differences were not statistically significant, we selected PonDeSlide as a baseline.

Text entry interface	SUS score (Mean)	SUS score (SD)
PonDeSlide	78.3	11.0
BubbleFlick	67.5	16.0
KeypadFlick	61.0	7.6

Table 4: SUS scores of PonDeSlide, BubbleFlick, and KeypadFlick.

A.2.1 Text-entry Speed.

Figure 13 shows the daily mean CPMs of the three groups. We conducted ANOVA tests on the daily CPMs, with the interface as a factor. No significant difference was observed at a significance level of 0.05. The PonDeSlide and BubbleFlick groups started with similar mean CPMs, and the PonDeSlide group exhibited higher mean CPMs than the BubbleFlick group over time, although the differences were not statistically significant. The KeypadFlick group showed higher mean CPMs than the other two groups.
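For readers unfamiliar with the test, the Python sketch below computes the F statistic of a one-way ANOVA with the interface as the single factor. It is a generic textbook computation, not the authors' analysis script, and the sample values are invented. Obtaining a p-value additionally requires the F distribution (for example, `scipy.stats.f_oneway` returns both), which is left out to keep the sketch dependency-free.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of sample lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Usage with invented CPM values for three small groups.
f_stat, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

With group sizes of 11, 11, and 5, as in the pilot study, the degrees of freedom are 2 and 24.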

A.2.2 Error Rate.

Figure 14 shows the daily mean EPCs of the three groups. We conducted ANOVA tests on the daily EPCs, with the interface as a factor. There were significant differences on days 2, 3, 5, 6, and 7 at a significance level of 0.05. Bonferroni’s multiple comparisons showed significant differences between KeypadFlick and BubbleFlick on days 2, 5, 6, and 7 and between KeypadFlick and PonDeSlide on day 6. The KeypadFlick group initially showed higher EPCs than the other two groups, but its EPCs decreased to the same level over time. While the CPMs of PonDeSlide and KeypadFlick in Figure 13 were similar to those in Figure 10, the EPCs in Figure 14 were smaller than those in Figure 12. This is probably due to the difference in smartwatch models: the active touchscreen area of the Google Pixel Watch is smaller than that of the Moto 360 2nd Gen.
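The Bonferroni correction used for the pairwise comparisons can be sketched in a few lines of Python. Each raw p-value is multiplied by the number of comparisons and capped at 1 before being compared with the significance level. The p-values in the usage example are invented for illustration and are not results from the study; the function names are hypothetical.

```python
def bonferroni_adjust(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def significant_pairs(pairwise_p, alpha=0.05):
    """Return the pairs whose Bonferroni-adjusted p-value is below alpha."""
    adjusted = bonferroni_adjust(list(pairwise_p.values()))
    return [pair for pair, p in zip(pairwise_p, adjusted) if p < alpha]


# Usage with invented raw p-values for the three pairwise comparisons.
raw_p = {
    ("KeypadFlick", "BubbleFlick"): 0.004,
    ("KeypadFlick", "PonDeSlide"): 0.03,
    ("PonDeSlide", "BubbleFlick"): 0.6,
}
print(significant_pairs(raw_p))
```

In this made-up example only the first pair remains significant, because 0.03 becomes 0.09 once three comparisons are taken into account.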

A.2.3 System Usability Scale Scores.

PonDeSlide, BubbleFlick, and KeypadFlick scored 78.3, 67.5, and 61.0, respectively, as shown in Table 4. We conducted an ANOVA test on the SUS scores. There was a significant difference (F(2, 24) = 3.83, p = 0.036), and Bonferroni’s multiple comparisons showed a significant difference between PonDeSlide and KeypadFlick (p = 0.049).
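For reference, an individual SUS score on the 0 to 100 scale is derived from ten questionnaire items rated 1 to 5. The Python sketch below follows the standard SUS scoring rule (odd-numbered items contribute the rating minus 1, even-numbered items contribute 5 minus the rating, and the sum is multiplied by 2.5); the function name and the example responses are illustrative.

```python
def sus_score(responses):
    """Compute a SUS score (0-100) from ten ratings on a 1-5 scale."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly ten responses")
    total = 0
    for item, rating in enumerate(responses, start=1):
        if not 1 <= rating <= 5:
            raise ValueError("each rating must be between 1 and 5")
        # Odd items are positively worded, even items negatively worded.
        total += (rating - 1) if item % 2 == 1 else (5 - rating)
    return total * 2.5


# Usage: a participant who answers neutrally (3) on every item scores 50.
neutral = sus_score([3] * 10)
```

The group means in Table 4 would then be averages of such per-participant scores.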

A.2.4 Interview Survey.

We asked all participants whether they had memorized the arrangement of keys and kana. More than half of the PonDeSlide group answered that they had memorized the key layout and the operations for selecting a kana, whereas none of the BubbleFlick group answered yes. Participants in the PonDeSlide group who answered yes commented, “I developed a sense of how much to move my fingers as I got used to it” and “The arrangement of keys and kana is very simple.”

References

[1] A. Bangor, P. Kortum, and J. Miller. 2009. Determining what individual SUS scores mean: Adding an adjective rating scale. Journal of Usability Studies 4, 3 (2009), 114–123.
