DOI: 10.1145/3613904.3642569 · CHI conference proceedings · Research article · Open access

PonDeFlick: A Japanese Text Entry on Smartwatch Commonalizing Flick Operation with Smartphone Interface

Published: 11 May 2024

Abstract

While the QWERTY keyboard is the standard text entry method for Latin script languages on smart devices, this is not always the case for non-Latin script languages. In Japanese, the most popular text entry method on smartphones is a flick-based interface that systematically assigns more than fifty kana characters to twelve keys of a numeric keypad in combination with flick directions. Under these circumstances, studies on Japanese text entry on smartwatches have focused on efficient interface designs that take advantage of the regularity of the kana consonant and vowel structure, but have overlooked commonality with familiar interfaces. Thus, we propose PonDeFlick, a Japanese text entry method that commonalizes the flick directions with the familiar smartphone interface while providing the entire touchscreen for gestural operation. A ten-day user study showed that PonDeFlick reached a text-entry speed of 57.7 characters per minute, significantly faster than the numeric-keypad-based interface and a modification of PonDeFlick without the commonality.
Figure 1: PonDeFlick. Bold yellow lines show gestural strokes. (a) Initial screen. Entered text is displayed in center. (b) Entering kana character ’na’. Touching down and up (i.e. tapping) on ’na’ key. (c) Entering kana character ’ne’. Touching down on ’na’ key, swiping to center, and then flicking rightward.

1 Introduction

Smartwatches are spreading as wearable devices mainly for health monitoring and receiving notifications. They are expected to be used not only passively but also actively if their text entry becomes easier to use. In interacting with the small touchscreen, a menu must be concisely allocated with visual information, and the “fat finger” and occlusion problems [28] must be considered. Efficient menu interfaces with minimal occlusion [20, 21] and those for eyes-free interaction [5, 24] have been recently proposed. In implementing text entries, the problems have been mainly tackled with three approaches. The first one involves focusing on a part of a keyboard by zooming in [22], splitting [4, 11], or showing a call-out [17]. The second approach involves assigning multiple characters to a reduced number of keys and making these selectable [14, 15, 27]. The third approach involves statistical decoding using probabilistic touch and language models [8, 31, 32]. Statistical decoding enables precise tap and gestural typing on a full QWERTY keyboard on a smartwatch. The QWERTY layout serves as a standard virtual keyboard on smartwatches in Latin script languages.
However, this is not always the case for non-Latin script languages. In Japanese, text entry has additional challenges, such as having more than fifty syllabic characters, kana, to enter and subsequent kana-kanji conversion, which converts a sequence of kana characters into a standard Japanese text style with a mixture of kanji and kana. Figure 2 shows a basic set of kana syllabary; a kana is composed of a consonant and a vowel (CV) (or a single vowel). More than fifty kana are made up with additional symbols for voiced consonants, contracted sounds, and double consonants. Before smartphones prevailed, Japanese users became familiar with a Japanese text entry based on a numeric keypad on feature phones. This text entry interface assigns five kana of a common consonant (i.e., characters in a row in Figure 2) to a numeric key and the symbols and marks to the remaining two keys; thus, five kana can be toggled through multiple taps. This interface was inherited by smartphones, as shown in Figure 3. The virtual numeric keypad introduced flicking in addition to multiple tapping for kana selection. Specifically, simple tapping selects a kana with the vowel ’a’ and flicking in four directions corresponds to four characters with the vowels ’i’, ’u’, ’e’, and ’o’. Most Japanese users use the virtual numeric-keypad-based text entry on smartphones.
Under these circumstances, smartwatches have no standard Japanese text entry interface. Although the keys of the numeric keypad are larger than those of a QWERTY keyboard, simple porting of the numeric keypad onto a smartwatch makes the key size and spacing very small. Flick operation generally requires a wider area than tapping or swiping. Tojo et al. proposed a Japanese text entry with flick operation based on kana keys allocated in an annular layout [30]. In light of the examples, we consider the essential factors to be 1) a space-efficient key layout with big key size and spacing, 2) minimal operations to enter a character, and 3) the regularity of operations that is simple enough to remember. Previous studies have focused on interface design that efficiently allocates the keys on a small touchscreen and takes advantage of the regularity of the kana CV structure in kana selection, but overlooked commonality regarding flick directions with familiar smartphone interfaces. Thus, we propose PonDeFlick, an annular-layout-based Japanese kana text entry that provides the entire touchscreen for gestural operation and simplifies the gestural operation by commonalizing the flick directions with familiar smartphone interfaces.
We conducted a ten-day user study comparing PonDeFlick with a miniature numeric-keypad-based flick text entry and a modification of PonDeFlick that keeps the regularity of the kana CV structure but does not commonalize the flick directions. The user study revealed the effectiveness of commonality in flick operation.
Figure 2: Japanese kana syllabary (hiragana) table. Rows and columns represent consonants and vowels, respectively. Latin script of each character is shown below kana. Heading character of each row is circled as representative kana.
Figure 3: Numeric-keypad-based Japanese text entry on smartphones. (a) Initial screen. (b) Flick guide indicating flick directions appears over keypad on touch-down on ’na’ key.

2 Related Studies

When implementing a keyboard on a smartwatch, the small form factor makes precise typing difficult. Various solutions to the "fat finger" and occlusion problems have been proposed. The solutions can be roughly classified into three groups.
The first category is zooming a part of the keyboard [3, 4, 11, 17, 22]. ZoomBoard [22] zooms in on a small QWERTY keyboard by taps, and an additional tap specifies the desired key. Swipeboard [4] divides a QWERTY keyboard into nine regions, allowing users to swipe twice to enter a character. The first swipe specifies the area where the character is located, and the second swipe specifies the desired character. SplitBoard [11] displays half of a QWERTY keyboard on a small touchscreen, enabling a user to switch to the other half and an additional page for numbers and special characters by horizontal flicking. Virtual Sliding QWERTY [3] enables a user to move an oversized keyboard to a desired position by tap-and-drag operation. ZShift keyboard [17] displays a call-out showing a zoomed-in image of the touched area in a non-occluded area [33], enabling a user to change the character if needed by shifting the touching finger slightly. A systematic literature review [18] covered early works before 2018. These interfaces require additional taps or swipes to manipulate the display.
The second category is keyboards with a reduced number of keys, which assigns multiple characters to each of the keys and makes the multi-characters selectable somehow [10, 14, 15, 27]. SwipeKey [27] is a Latin script text entry that determines a letter based on tapping and flick directions on a reduced number of square keys in a tiled layout. The work tested various key sizes and numbers of flick directions and optimized the configuration in a 25 mm x 15 mm rectangular keyboard on a smartwatch. The user study showed that six keys of 7.5 mm x 7.5 mm squares with five flick directions (tapping and four directions) recorded the fastest text-entry speed and the lowest error rate. This configuration obtained the lowest difficulty score and the highest preference in subjective evaluation. The work also reported that the error rate increased drastically if the key size was smaller than 5.7 mm x 5.7 mm. DualKey [10] assigns two adjacent letters in the QWERTY keyboard to a key and makes the two selectable by finger identification between the index and middle fingers. UOIT keyboard [14] divides 26 English letters into 13 frequent one-keystroke letters and the other 13 two-keystroke letters and defines an easy-to-learn rule that maps them to 13 frequent letters (’u’, ’o’, ’i’, ’t’ etc. for the one-keystroke letters) and pairs of the 13 letters for the two-keystroke letters. Meanwhile, ambiguous keyboards [6, 15] reduce the load of specifying letters. Komninos’s ambiguous keyboard provides context-based word suggestions, word completion, and next-word suggestions on a six-key keypad in an alphabetical layout [15]. WrisText [6] enables one-handed text entry by whirling the wrist of the watch hand toward six directions of an annular ambiguous keyboard.
The third category is statistical decoding [7, 8, 31, 32] using probabilistic touch and language models. The models enable precise detection of key touches and accurate prediction of the next words. Google’s WatchWriter [8] provided precise tap typing and gesture typing on a miniature QWERTY keyboard based on their Smart Touch Keyboard and Smart Gesture Keyboard techniques developed on smartphones [25]. VelociTap [32] achieved a text-entry speed of 41 words per minute on a 40-mm-wide keyboard with a sentence-based decoder incorporating a probabilistic touch model, a 12-gram character language model, and a 4-gram word language model. While the statistical decoding enables fast text entry with suggestion, auto-completion, and auto-correction, it sometimes causes errors, especially in entering rare words such as proper nouns. VelociWatch [31] tackled a challenging text input task with error avoidance functions such as letter locking and selection slots. The statistical decoding techniques strongly support efficient text entry regardless of keyboard type.
Various virtual keyboards on smart devices have been proposed for non-Latin script languages. The circumstances of Korean text entry are similar to those of Japanese. The most popular virtual keyboard on smartphones is a QWERTY-like Korean keyboard. Some manufacturers released original text entries on the numeric keypad of feature phones, and smartphones inherited them. Ilinkin et al. [12] implemented four popular types of Korean text entries on smartwatches and conducted a comparative evaluation. Although the QWERTY-like Korean keyboard is preferred for two-thumb typing on smartphones, the three numeric-keypad-based text entry interfaces performed better than the QWERTY-like Korean keyboard on smartwatches. These numeric-keypad-based interfaces were based on tapping. Flick-based interfaces generally require a wider area. As a Japanese text entry on smartwatches, BubbleFlick [30] provided the widest area possible for flick operation while also leaving an area for editing text by rearranging the twelve keys of the numeric keypad in an annular layout. Though it opened up the entire touchscreen for flick operation, it left the issue that the flick directions changed depending on the key, making it hard to learn even after 30 days of use.
In menu interface research, Marking Menu [16], which enables command selection by directional gesture, is an influential one that facilitates users’ smooth transition from a novice mode to an expert mode and has given inspiration to many variations. For instance, FlowMenu [9] enables consecutive command selection by combining the marking menu with Quikwriting [23] based on an octant with a rest area in the center. Text entry and command selection face a similar challenge in that many commands must be grouped clearly and easily selectable. Zone menu [34], which increases the number of commands by setting multiple zones to start directional gestures, forms groups of commands. The hierarchical levels of the Marking Menu are effectively increased by distance extended marking menu [19]. The hierarchical gestures suggest solutions for two-step text entry [13, 27, 30] to meet the three requirements in the introduction. The marking menu is further extended to those initiating from a bezel, such as Bezel Menus [13], which proposes a Latin alphabet text-entry similar to our PonDeFlick on smartphones, Bezel-Tap Gestures [26] on tablets, Bezel-to-bezel interaction for eyes-free interaction on smartwatches [24], and bezel-based selection interfaces for minimal-occlusion interaction [20, 21]. The menu interface and text entry have differences as well. While users usually select commands inconsecutively and memorize only frequently used ones for menu selection, they have to select a variety of commands successively for text entry. So, the gestural operation should be light in cognitive load. In other words, the operation should be more reflexive for text entry.

3 PonDeFlick

PonDeFlick is an interface that allocates necessary keys and a text-editing area efficiently in a small touchscreen while providing the entire area for flick operation that has commonality with the popular Japanese text entry on smartphones. Figure 1 shows screenshots of PonDeFlick. Panel (a) is the initial screen. Ten keys of representative kana, which are the heading characters of the rows circled in the kana syllabary table (Figure 2), and two keys for symbols and marks, which make up twelve in total, are arranged in an annular layout. The size of a key is 6.82 mm in diameter, which is greater than 5.7 mm specified in [27], and the spacing between the keys is 0.64 mm. The text editing area in the center shows three lines of entered text with seven kana per line.
Forty-six kana are systematically assigned to the ten keys in combination with tapping and four flick directions. The leftmost of the two bottom keys is for adding a voiced sound mark or modifying a kana to a double consonant or contracted sound. The rightmost one is for adding punctuation, i.e., point, comma, question, or exclamation point. An additional key inside the ring is a completion key. A leftward flick in the text editing area works as a backspace. Vertical flicking is used for scrolling up and down the entered text. Panel (b) illustrates the entering of ’na’, which comprises the consonant ’n’ and vowel ’a’. The bold yellow shades illustrate trajectories of gestural strokes. A finger touches down on the ’na’ key and then touches up on the key. A flick guide indicating the flick direction is displayed 0.3 seconds after the touchdown. Panel (c) illustrates the entering of ’ne’, which comprises the consonant ’n’ and vowel ’e’. A finger touches down on the ’na’ key, slides slightly towards the center, and then flicks rightward. One of four kana characters with vowels ’i’, ’u’, ’e’, and ’o’ is determined by the flick direction, i.e., leftward for ’i’, upward for ’u’, rightward for ’e’, and downward for ’o’. The correspondence between the flicking direction and the vowel is shown in Figure 4. This correspondence is common with that for selecting a kana on the numeric-keypad-based Japanese text entry on smartphones.
In operating PonDeFlick, a finger stroke changes its direction. To detect flicking and recognize its direction anywhere on the surface, we developed an algorithm to search for the final inflection point, which is considered the starting point of the flicking. Figure 5 illustrates the algorithm.
Figure 4: Correspondence between flick directions and vowels. Diagonal dashed lines represent boundaries.
Figure 5: Search of inflection point to determine flick direction.
Let the sequence of points in the stroke be denoted as \(P_0 = (x_0, y_0)^T\), \(P_1 = (x_1, y_1)^T\), \(\cdots\), \(P_N = (x_N, y_N)^T\), where \(P_0\) and \(P_N\) are the touch-down and touch-up points, respectively. Let the inflection point be denoted as \(S\), which is initialized with \(P_0\). We set two thresholds: a minimal travel distance \(D\) for flick detection and an angular threshold \(\theta\) for inflection point detection.
After touch-down, touched positions are continuously detected at intervals of a few milliseconds as \(P_0, P_1, \cdots, P_N\). If a touch-up is detected with a travel distance below \(D\) from \(P_0\) on one of the keys, as shown in Panel (b) of Figure 1, a kana with the vowel ’a’ is selected. If the travel distance exceeds \(D\), a search for a new inflection point runs each time a new touched position \(P_n\) is obtained. The closest point to \(P_n\) with a travel distance over \(D\), \(P_{n-k}\) \((k \ge 1)\), is a new inflection point candidate. It is \(P_{n-2}\) in Figure 5. The displacement vector from \(P_{n-k}\) to \(P_n\) is \(\overrightarrow{P_{n-k}P_{n}}\), and the vector from the current \(S\) to \(P_{n-k}\) is \(\overrightarrow{SP_{n-k}}\). The angle \(\alpha\) formed between the two vectors is measured. If \(\alpha > \theta\), the inflection point is updated to \(P_{n-k}\) (\(S = P_{n-k}\)). Note that \(D\) and \(\theta\) are set to 30 dp (approximately 4.8 mm) and 70 degrees, respectively.
When detecting a touch-up, the final displacement vector \(\overrightarrow{SP_N}\) determines the flick direction. The maximum of inner products with \(\vec{d_i}=(-1, 0)^T\), \(\vec{d_u}=(0, -1)^T\), \(\vec{d_e}=(1, 0)^T\), and \(\vec{d_o}=(0, 1)^T\) determines a kana to enter.
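The inflection-point search and the direction decision described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `classify_stroke` and `angle_deg` are hypothetical, "travel distance" is read as the straight-line distance between sampled points, a stroke counts as a tap when no sample moves farther than D from the touch-down point, and the update condition is taken to be α > θ. Coordinates are in dp with y increasing downward, matching the direction vectors in the text.

```python
import math

# Unit vectors for the four vowels in screen coordinates (y grows downward).
DIRECTIONS = {"i": (-1.0, 0.0), "u": (0.0, -1.0), "e": (1.0, 0.0), "o": (0.0, 1.0)}


def angle_deg(v, w):
    """Angle between two 2-D vectors in degrees, or None if either has zero length."""
    nv, nw = math.hypot(*v), math.hypot(*w)
    if nv == 0.0 or nw == 0.0:
        return None
    cos_a = (v[0] * w[0] + v[1] * w[1]) / (nv * nw)
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_a))))


def classify_stroke(points, d=30.0, theta=70.0):
    """Return the vowel ('a', 'i', 'u', 'e', 'o') selected by a non-empty stroke.

    points: list of (x, y) samples from touch-down to touch-up.
    d, theta: the thresholds D (in dp) and theta (in degrees) from the text.
    """
    p0 = points[0]
    s = p0            # current inflection point S, initialised with P0
    moved = False     # assumption: becomes True once a sample is farther than D from P0
    for n in range(1, len(points)):
        pn = points[n]
        if math.dist(p0, pn) > d:
            moved = True
        if not moved:
            continue
        # Candidate P_{n-k}: the nearest earlier sample farther than D from P_n
        # (assumption: straight-line distance).
        cand = None
        for k in range(1, n + 1):
            if math.dist(points[n - k], pn) > d:
                cand = points[n - k]
                break
        if cand is None:
            continue
        alpha = angle_deg((cand[0] - s[0], cand[1] - s[1]),
                          (pn[0] - cand[0], pn[1] - cand[1]))
        # Assumption: the inflection point is updated when alpha exceeds theta.
        if alpha is not None and alpha > theta:
            s = cand
    if not moved:
        return "a"    # tap: the stroke never travelled farther than D
    last = points[-1]
    v = (last[0] - s[0], last[1] - s[1])
    # The direction with the largest inner product with S -> P_N wins.
    return max(DIRECTIONS, key=lambda name: v[0] * DIRECTIONS[name][0] + v[1] * DIRECTIONS[name][1])
```

For a stroke that slides 100 dp downward and then moves 50 dp to the left, this sketch moves S to the corner of the stroke and selects ’i’; using the raw touch-down point instead would give a mostly downward vector and select ’o’.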
Figure 6: Relative frequency distribution of kana characters included in all sentence sets.

4 User Study

4.1 Design

We conducted a ten-day continuous user study. We compared the performance of PonDeFlick with two other interfaces. One is PonDeSlide, a modification of PonDeFlick, which has regularity in selecting one among five kana, but does not commonalize the flick directions with those of the familiar smartphone interface. In an internal pilot study, we had verified that PonDeSlide performed better than BubbleFlick [30] in both objective and subjective metrics (see the Appendix for details). The other is a numeric-keypad-based Japanese text entry interface that Google released in 2017. We call this interface "KeypadFlick" hereafter. KeypadFlick provides a baseline to evaluate how fast and how much users learn our original interfaces. The comparison between PonDeFlick and PonDeSlide reveals the effect of commonalizing flick directions. PonDeSlide and KeypadFlick are described in the following subsections. We recruited thirty-four participants and assigned each participant one of the three interfaces.

4.2 Participants

Thirty-four undergraduate and graduate students (19 men and 15 women with ages ranging from 18 to 30) participated in the user study. All participants used a numeric-keypad-based Japanese text entry interface on their smartphones every day.
To assign interfaces, we had all participants enter a set of five Japanese sentences with the numeric-keypad-based text entry on their smartphones. We measured the text-entry speed and assigned interfaces so that the groups would have comparable average speeds. The PonDeFlick, PonDeSlide, and KeypadFlick groups had 12, 11, and 11 participants, respectively. We did not inform participants of any interface other than the one assigned to them.

4.3 Apparatus

The smartwatch model we used was Google Pixel Watch. It has a watch face with a diameter of 41.0 mm and a resolution of 320 ppi.

4.4 Phrase Sets

We composed ten sets of fifteen to eighteen short Japanese sentences using only basic words to present the participants with a new set each day. Figure 7 shows a sample set of sentences. Each set of sentences was designed to form a kana pangram and contained 268 kana on average. Figure 6 shows the relative frequency distributions of kana included in all sentence sets and in a corpus of 15k sentences written on a smartphone-based Japanese SNS. The distribution of the sentence sets had a Pearson’s correlation coefficient of 0.70 with that of the 15k-sentence corpus. No Latin scripts or Arabic numerals were included in any set.
We prepared another set of five sentences for measuring text-entry speed on a smartphone. Each sentence was composed of 25 kana, for 125 kana in total.
Figure 7: Example of sentence set.

4.5 Procedure

We had the participants enter a daily-changing set of sentences over 10 consecutive days. On the first day, an experimenter lent the participants a smartwatch with a briefing, measured their text-entry speed on their smartphone with the numeric-keypad-based flick text entry, and assigned a smartwatch interface. The participants were instructed to enter all text in hiragana, to correct errors as best they could by using backspacing, and not to use the kana-kanji conversion or the predictive conversion functions available in KeypadFlick. They were also instructed not to spend extra hours using the text entry. The participants wore the smartwatch on the non-dominant wrist and operated it with the dominant hand’s index finger.
On days 2 to 10, the participants entered a specified set of sentences at home. They had a single opportunity to enter each sentence, without practice in advance. The sentences were presented by using Google Forms. All entered texts were recorded in a log file.

4.6 Metrics

Performance was objectively measured on the basis of text-entry speed in characters per minute (CPM) and total kana error rate, which counts both corrected and uncorrected errors [29], in errors per character (EPC) based on the log file. The text-entry speed was influenced by the time required to correct errors.
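As a rough illustration of these two metrics, the sketch below computes CPM and the error rates from per-sentence counts. It assumes the widely used keystroke-classification definitions, in which characters are counted as correct (C), incorrect but fixed (IF), or incorrect and not fixed (INF); the paper does not spell out its formulas, so the function names `cpm` and `error_rates` and the exact definitions are assumptions, not the authors' code.

```python
def cpm(num_chars, seconds):
    """Characters per minute for num_chars entered in the given number of seconds."""
    return num_chars / (seconds / 60.0)


def error_rates(correct, incorrect_fixed, incorrect_not_fixed):
    """Return (total error rate, not-corrected error rate) as fractions.

    Assumed definitions (hypothetical reading of the metrics in the text):
      total error rate         = (IF + INF) / (C + IF + INF)
      not-corrected error rate = INF / (C + IF + INF)
    """
    total = correct + incorrect_fixed + incorrect_not_fixed
    if total == 0:
        raise ValueError("no characters were entered")
    ter = (incorrect_fixed + incorrect_not_fixed) / total
    ncer = incorrect_not_fixed / total
    return ter, ncer
```

Because the time spent backspacing is included in `seconds`, corrected errors lower the CPM even though they do not appear in the final text, which is consistent with the remark above that error correction influenced the text-entry speed.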
We measured participants’ subjective assessment on the System Usability Scale (SUS) [2] and conducted interviews after they completed the user study.
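For reference, standard SUS scoring of one ten-item questionnaire can be sketched as follows. Each item is answered on a 1 to 5 scale; odd-numbered items contribute (response - 1), even-numbered items contribute (5 - response), and the sum is multiplied by 2.5 to give a score from 0 to 100. The function name `sus_score` is hypothetical, and the sketch assumes the unmodified ten-item questionnaire was used.

```python
def sus_score(responses):
    """Compute the SUS score (0-100) from ten responses on a 1-5 scale."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly ten responses")
    total = 0
    for item, response in enumerate(responses, start=1):
        if not 1 <= response <= 5:
            raise ValueError("responses must be between 1 and 5")
        # Odd items are positively worded, even items negatively worded.
        total += (response - 1) if item % 2 == 1 else (5 - response)
    return total * 2.5
```

A participant who answers 3 on every item obtains a score of 50, and group-level scores are the mean of the individual scores.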
Figure 8: PonDeSlide. Left: Entering ’na’. Touch down and up on ’na’ key. Right: Entering ’ne’. Touch down on ’na’, slide, and touch up on ’ne’.

4.7 PonDeSlide

PonDeSlide is a modification of PonDeFlick. The key layout and kana-character assignment are the same as PonDeFlick. Figure 8 shows screenshots: entering ’na’ and ’ne’, respectively. When a finger touches down on a key, five kana characters are displayed in a line toward the center, aligned in the order of vowels ’a’, ’i’, ’u’, ’e’, and ’o’. This interface is simple enough for users to easily select a kana by where to release the finger while sliding. Backspacing is implemented in the same way as PonDeFlick, by flicking leftward in the lower half of the central area.

4.8 Japanese Text Entry on Numeric Keypad

Google’s Japanese text entry for the Android smartwatches released in 2017 is a miniature of the popular Japanese text entry on smartphones, based on flick operation on a numeric keypad. Figure 9 shows its screenshots: an initial screen and a screen touching down on ’na’ key. The keypad in the lower center has a 3x4 key layout with 10 kana keys and two keys for symbols and marks. When a finger touches down on a key, a flick guide is displayed over the keypad, indicating five kana with flick directions. Seven keys surrounding the lower side of the keypad are 1) move the cursor leftward, 2) switch script types, 3) enter symbols, 4) space, 5) delete, 6) enter, and 7) move the cursor rightward, from left to right. Predicted word candidates are displayed over the keypad. The size of a key is 5.0 mm in width and 2.5 mm in height.
Figure 9: Google’s Japanese text entry on smartwatches released in 2017 (KeypadFlick). Left: Initial screen. Right: Touching down on ’na’ key.
| Text entry interface | Number of participants | Mean CPM [char/min], Day 1 | Mean CPM, Day 5 | Mean CPM, Day 10 | Mean TER (NCER) in EPC [%], Day 1 | Mean TER (NCER), Day 5 | Mean TER (NCER), Day 10 |
|---|---|---|---|---|---|---|---|
| PonDeFlick | 12 | 27.2 | 46.3 | 57.7 | 22.6 (3.0) | 12.6 (3.0) | 10.4 (3.6) |
| PonDeSlide | 11 | 25.5 | 38.9 | 42.3 | 16.0 (3.3) | 10.3 (1.5) | 11.5 (2.1) |
| KeypadFlick | 11 | 33.8 | 47.6 | 49.2 | 28.0 (3.0) | 21.6 (1.5) | 19.6 (1.3) |

Table 1: Mean text-entry speeds in CPM, and total error rates (TER) with not-corrected error rates (NCER) in parentheses in EPC, on the first, fifth, and final days for the three participant groups.

5 Results of User Study

Table 1 lists the mean text-entry speeds (CPMs) and error rates (EPCs) on the first, fifth, and final days for the three groups.
Figure 10: Daily text-entry speeds in CPM for three groups. Error bars show 95% confidence intervals.

5.1 Text-entry Speed

Figure 10 shows daily mean CPMs. The CPMs on the first day were 27.2, 25.5, and 33.8 for the PonDeFlick, PonDeSlide, and KeypadFlick groups, respectively. Those on the final day (day 10) reached 57.7, 42.3, and 49.2, respectively. We conducted ANOVA tests on the daily CPMs, with the interface as a factor. There were significant differences on days 1 to 3 and 8 to 10 at a significance level of 0.05. Bonferroni’s multiple comparisons showed significant differences between KeypadFlick and PonDeSlide on days 1 to 3, between PonDeFlick and PonDeSlide on days 8 to 10 (p=0.018, 0.004, <0.001), and between PonDeFlick and KeypadFlick on day 10 (p=0.018). The KeypadFlick group recorded the fastest speed on the first day, probably because the participants were familiar with the interface on their smartphones. However, the CPM of the PonDeFlick group increased rapidly, caught up with that of the KeypadFlick group, and surpassed it on day 10. The CPM of the PonDeFlick group still appeared to be increasing on day 10, while the others appeared to have reached stable performance. The mean CPM of the PonDeSlide group was more than 5 CPM slower than those of the other groups after day 7.
Figure 11: Scatter plot of individual participants’ text-entry speeds on smartphones and smartwatches.

5.2 Correlation Between Text-entry Speeds on Smartwatch and Smartphone

The effectiveness of commonalizing flick directions with the smartphone interface can be verified from the correlation between the CPMs on smartwatches and smartphones. Figure 11 shows a scatter plot of all participants’ CPMs on smartwatches and smartphones. The horizontal and vertical axes represent, respectively, the CPM on a smartphone measured on the first day and the CPM on a smartwatch averaged over the period from day 6 to day 10. To begin with, the distributions of the smartphone CPMs were not biased among the three groups. The mean smartphone CPMs were 115.5, 113.9, and 116.6 for the PonDeFlick, PonDeSlide, and KeypadFlick groups, respectively. Correlations were observed between the CPMs on smartphones and smartwatches for PonDeFlick and KeypadFlick, whereas no correlation was observed for PonDeSlide. Pearson’s correlation coefficients were 0.87, 0.79, and -0.09 for PonDeFlick, KeypadFlick, and PonDeSlide, respectively. The three regression lines show that the participants who typed faster on a smartphone benefited more from PonDeFlick and KeypadFlick (i.e., the commonality in flick directions) than those who typed slower.
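The per-group coefficients above are ordinary Pearson correlations between each participant's smartphone CPM and smartwatch CPM. A minimal sketch of that computation is given below; the function name `pearson_r` is hypothetical, and any input data would be the per-participant values, which are not reproduced here.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)
```

Here `xs` would hold the smartphone CPMs of one group and `ys` the corresponding smartwatch CPMs averaged over days 6 to 10; a value near 1 means that faster smartphone typists were also faster on the smartwatch.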

5.3 Error Rate

Figure 12 shows daily mean total error rates and not-corrected error rates in EPC. We refer to the total error rates as EPCs hereafter. The EPCs on the first day were 22.6, 16.0, and 28.0% for the PonDeFlick, PonDeSlide, and KeypadFlick groups, respectively. We conducted ANOVA tests on the daily EPCs, with the interface as a factor. There were significant differences on days 1, 4, 5, 7, 9, and 10 at a significance level of 0.05. Bonferroni’s multiple comparisons showed significant differences between KeypadFlick and PonDeFlick on days 4, 5, 7, 9, and 10, and between KeypadFlick and PonDeSlide on days 1, 4, 5, 7, and 9. The KeypadFlick group showed the highest EPCs, probably due to the small key size. The PonDeSlide group showed the lowest EPC on the first day, and its EPC gradually decreased to around 12%. The PonDeFlick group ranked between the KeypadFlick and PonDeSlide groups on the first day but decreased to the same level as the PonDeSlide group on the final day. Not-corrected errors accounted for 12.7% of all errors. Note that the PonDeFlick group showed higher not-corrected error rates because the group included one participant who did not correct errors carefully.
Figure 12: Daily total error rate and not-corrected error rate in EPC for three groups. Error bars show 95% confidence intervals.

5.4 System Usability Scale Scores

Table 2 shows the mean SUS scores of the three groups. PonDeFlick, PonDeSlide, and KeypadFlick scored 84.4, 67.5, and 68.6, respectively. SUS scores of 80 and 70 correspond to adjective ratings of "excellent" and "good," respectively [1]. We conducted an ANOVA test on the SUS scores. There was a significant difference (F(2, 31) = 4.03, p = 0.028), and Bonferroni’s multiple comparisons showed marginally significant differences between PonDeFlick and PonDeSlide (p = 0.051) and between PonDeFlick and KeypadFlick (p = 0.076).
| Interface | SUS score, mean | SUS score, SD |
|---|---|---|
| PonDeFlick | 84.4 | 11.1 |
| PonDeSlide | 67.5 | 19.1 |
| KeypadFlick | 68.6 | 17.2 |

Table 2: SUS scores

5.5 Interview Survey

The experimenter interviewed the participants after they completed the user study. The major pros and cons collected from the comments are listed below, with counts in parentheses.
(1) PonDeFlick
Pros
- I could type smoothly once I learned the key arrangement. (5)
- I could type as I did on my smartphone every day. (2)
- The keys were large enough to type stress-freely.
Cons
- It would be nice to have a function to enter kana with a vowel ’a’ after carelessly starting flicking. (1)
- Backspacing was difficult. (1)

(2) PonDeSlide
Pros
- I could type smoothly once I learned the key arrangement. (3)
Cons
- The finger occluded the slider. (1)
- Careful operation was needed on where to release the finger. (1)

(3) KeypadFlick
Pros
- I could type smoothly once I got used to the small keys. (2)
Cons
- The whole text could not be viewed when typing long. (5)
- I often typed unintended characters due to the small keys. (3)

PonDeFlick got favorable comments from more than half of the participants on the commonality with their familiar smartphone interface. The comments on PonDeSlide imply it requires carefulness on where to release the finger. The KeypadFlick group could be divided into two types: those who could type smoothly and those who struggled with typing on the small screen.

6 Discussion

Considering the CPMs, EPCs, SUS scores, and subjective comments, KeypadFlick on smartwatches had too small keys to operate, though its precise touch detection enabled fast typing. PonDeFlick and PonDeSlide had larger keys in a wider area, making their operation easier. The results with the different key sizes of KeypadFlick and our original interfaces match the finding of the SwipeKey work that the error rate increases drastically if the button size is smaller than 5.7 mm x 5.7 mm [27]. Comparing the two variations, we consider that PonDeSlide’s slide gesture that determines a kana by where to touch up forced the user to operate it more carefully than PonDeFlick. PonDeFlick’s flick gesture alleviated the need for carefulness and enabled faster typing than PonDeSlide.
PonDeFlick can be viewed as an application of the marking menu with four flick directions to an annular key layout, whereas PonDeSlide is a linear menu. Regarding comparing the marking and linear menus, a paper reported experiments on learning with a grid-based marking menu (M3 Gesture Menu), a multi-stroke marking menu, and a linear menu on a smartphone [35]. The experimental results exhibited that the users of the two marking menus got much faster after three ten-minute practice sessions, whereas those of the linear menu did not. Our experimental results showed the same tendency on a small smartwatch touchscreen. In the experiment, PonDeFlick showed that the speed of flicking made up for the extra time to slide toward the center. This finding might apply to a more general menu interface on a small surface. Commonalizing flick operations with users’ familiar interface can benefit easy operation, even if the key layout differs. In other words, the flick operations can be designed separately from the key layout to some extent.
A limitation of this study is that the CPM of PonDeFlick might not have reached stable performance within the 10-day experiment. A longer experimental period is needed to observe long-run performance. Another limitation is that we have not proposed a solution for square-face smartwatches. However, even if the key layout differs from that of PonDeFlick, commonalizing the flick directions with the numeric-keypad-based text entry on smartphones is probably still effective.

7 Conclusions

We proposed a smartwatch Japanese text entry that commonalizes flick directions with the familiar smartphone interface and provides the entire touchscreen for the flick operation. The ten-day user study showed that the PonDeFlick group reached a mean text-entry speed of 57.7 CPM, which exceeded those of the PonDeSlide and KeypadFlick groups and demonstrated the effectiveness of commonalizing the flick directions. The correlation between all participants’ text-entry speeds on smartwatches and smartphones showed that the participants who typed faster on a smartphone benefited more from PonDeFlick than those who typed slower. PonDeFlick reached a mean EPC of around 11% after 8 days, about half that of KeypadFlick. The SUS scores and interview survey showed the advantages of PonDeFlick in usability.

A PonDeSlide vs. BubbleFlick

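| Text entry interface | Number of participants | Mean CPM [char/min], Day 1 | Mean CPM [char/min], Day 10 | Mean CPM [char/min], Day 20 | Mean EPC [error/char], Day 1 | Mean EPC [error/char], Day 10 | Mean EPC [error/char], Day 20 |
|---|---|---|---|---|---|---|---|
| PonDeSlide | 11 | 28.1 | 40.4 | 47.2 | 10.5 | 8.9 | 8.1 |
| BubbleFlick | 11 | 27.0 | 39.7 | 43.7 | 10.1 | 6.4 | 7.4 |
| KeypadFlick | 5 | 31.3 | 45.7 | 49.2 | 13.3 | 5.4 | 7.6 |

Table 3: Mean text-entry speeds in CPM and total error rates in EPC on first, tenth, and final days for PonDeSlide, BubbleFlick, and KeypadFlick groups.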
Figure 13: Daily text-entry speeds in CPM for PonDeSlide, BubbleFlick, and KeypadFlick groups. Error bars show 95% confidence intervals.

A.1 Overview of Pilot Study

Because users of BubbleFlick struggled to learn flick directions that changed depending on the key, we developed PonDeSlide, which improved the regularity of the kana selection operation, before developing PonDeFlick. We conducted a 20-day pilot study comparing the performance of PonDeSlide, BubbleFlick, and KeypadFlick. We recruited twenty-seven participants and assigned each to one of the three interfaces, forming three groups: PonDeSlide, BubbleFlick, and KeypadFlick.

A.1.1 Participants.

Twenty-seven undergraduate and graduate students aged 21 to 24 participated in the pilot study. All participants had been using smartphones and a numeric-keypad-based Japanese text entry for three to eight years. Three of them had worn a smartwatch for less than a year but had never entered text with it.

A.1.2 Apparatus.

The smartwatch model we used for the pilot study was the Moto 360 2nd Gen, which has a round watch face with a diameter of 34.8 mm and a resolution of 263 ppi.
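
As a worked conversion, the sketch below turns physical lengths into pixels at the stated 263 ppi, using only the definition of pixels per inch. It converts the 34.8 mm face diameter and, for comparison, the 5.7 mm button size cited from the SwipeKey work [27] in the Discussion. The function name `mm_to_px` is hypothetical.

```python
MM_PER_INCH = 25.4


def mm_to_px(length_mm: float, ppi: float) -> float:
    """Convert a physical length in millimetres to pixels at a given density."""
    return length_mm / MM_PER_INCH * ppi


face_px = mm_to_px(34.8, 263)     # watch-face diameter
min_key_px = mm_to_px(5.7, 263)   # 5.7 mm button size from [27]
print(round(face_px), round(min_key_px))  # prints "360 59"
```

On this display, a key narrower than roughly 59 pixels would fall below the 5.7 mm size, which is about one sixth of the face diameter.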

A.1.3 Phrase Sets.

We composed twenty sets of eighteen short Japanese sentences so that the participants were presented with a new set each day. Each set formed a kana pangram of about 300 kana.
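
A pangram property of this kind can be checked mechanically. The sketch below is a minimal illustration that assumes a pangram means every one of the 46 basic hiragana appears at least once; the exact kana inventory used for the phrase sets is not stated here, so this set, the function names, and the treatment of voiced and small kana as separate characters are all assumptions.

```python
# The 46 basic hiragana (assumed inventory; voiced and small kana are not folded in).
BASIC_HIRAGANA = set(
    "あいうえおかきくけこさしすせそたちつてと"
    "なにぬねのはひふへほまみむめもやゆよらりるれろわをん"
)


def missing_kana(sentences):
    """Return the basic hiragana that never occur in the given sentences."""
    used = set("".join(sentences))
    return BASIC_HIRAGANA - used


def is_kana_pangram(sentences):
    """True if every basic hiragana occurs at least once across the sentences."""
    return not missing_kana(sentences)
```

Calling `missing_kana` on a draft phrase set lists the kana that still need to be worked into a sentence.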

A.1.4 Metrics.

The objective measurements were text-entry speed in CPM and total kana error rate in EPC. Subjective assessment was measured with the SUS. An interview was conducted after the participants completed the study.
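
The two objective metrics can be sketched as simple ratios. This is a simplified reading, assuming that CPM is the number of kana transcribed divided by the elapsed time in minutes and that EPC is the number of erroneous kana (corrected and uncorrected) divided by the number of kana transcribed. The function names and the sample numbers are hypothetical and are not measured values.

```python
def chars_per_minute(num_chars: int, elapsed_seconds: float) -> float:
    """Text-entry speed: characters transcribed per minute."""
    return num_chars / elapsed_seconds * 60.0


def errors_per_char(num_errors: int, num_chars: int) -> float:
    """Total error rate: erroneous characters per transcribed character."""
    return num_errors / num_chars


# Made-up session: 300 kana entered in 600 seconds with 27 erroneous kana.
speed = chars_per_minute(300, 600.0)
error_rate = errors_per_char(27, 300)
print(f"{speed:.1f} CPM, {error_rate:.1%} EPC")
```

Multiplying the EPC ratio by 100 gives the percentage form used when an error rate is quoted in percent.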

A.1.5 Procedure.

We had the participants enter a daily-changing set of sentences over 20 consecutive days. On the first day, an experimenter briefed the participants, lent each of them a smartwatch, and randomly assigned a smartwatch interface. The participants were instructed to enter all text in hiragana, to correct errors as best they could using backspace, and not to use the kana-to-kanji conversion or the predictive conversion function available in KeypadFlick. They were also instructed not to use the text entry beyond the daily task.
Figure 14: Daily total error rate in EPC for PonDeSlide, BubbleFlick, and KeypadFlick groups. Error bars show 95% confidence intervals.

A.2 Results of Pilot Study

Table 3 lists the mean text-entry speeds (CPMs) and total error rates (EPCs) on the first, tenth, and final days for the three groups. Table 4 shows the mean SUS scores. Because PonDeSlide was faster in CPM and gained a higher SUS score than BubbleFlick, although the differences were not statistically significant, we selected PonDeSlide as a baseline.
| Text entry interface | SUS score, Mean | SUS score, SD |
|---|---|---|
| PonDeSlide | 78.3 | 11.0 |
| BubbleFlick | 67.5 | 16.0 |
| KeypadFlick | 61.0 | 7.6 |

Table 4: SUS scores of PonDeSlide, BubbleFlick, and KeypadFlick.

A.2.1 Text-entry Speed.

Figure 13 shows the daily mean CPMs of the three groups. We conducted ANOVA tests on the daily CPMs with the interface as the factor. No significant difference was observed at a significance level of 0.05. The PonDeSlide and BubbleFlick groups started with similar mean CPMs, and the PonDeSlide group exhibited higher mean CPMs than the BubbleFlick group over time, though the differences were not statistically significant. The KeypadFlick group showed higher mean CPMs than the other two groups.
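
For readers unfamiliar with the test, the following is a minimal sketch of the one-way ANOVA F statistic with the interface as the single between-subjects factor. The statistical software actually used is not stated, so the function name is hypothetical, and the sample values are made up purely to show the arithmetic; they are not study data.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of independent samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of each observation around its own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up toy samples for three groups (not measurements from the study).
toy = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(one_way_anova_f(toy))  # (3.0, 2, 6)
```

With group sizes of 11, 11, and 5 as in Table 3, the same formula gives 2 and 24 degrees of freedom. The p-value then comes from the F distribution with those degrees of freedom, which this sketch leaves to a statistics library.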

A.2.2 Error Rate.

Figure 14 shows the daily mean EPCs of the three groups. We conducted ANOVA tests on the daily EPCs with the interface as the factor. There were significant differences on days 2, 3, 5, 6, and 7 at a significance level of 0.05. Bonferroni’s multiple comparisons showed significant differences between KeypadFlick and BubbleFlick on days 2, 5, 6, and 7 and between KeypadFlick and PonDeSlide on day 6. The KeypadFlick group initially showed higher EPCs than the other two groups, but its EPCs decreased to the same level over time. While the CPMs of PonDeSlide and KeypadFlick in Figure 13 were similar to those in Figure 10, the EPCs in Figure 14 were smaller than those in Figure 12. This is probably due to the difference in smartwatch models: the active touchscreen area of the Google Pixel Watch is smaller than that of the Moto 360 2nd Gen.
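
The Bonferroni correction behind such multiple comparisons can be sketched in a few lines. With three interfaces there are three pairwise comparisons, so each raw p-value is multiplied by three and capped at 1 before being compared with 0.05. The function name and the raw p-values below are hypothetical and are not results from the study.

```python
def bonferroni_adjust(p_values):
    """Multiply each raw p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Made-up raw p-values for the three pairwise comparisons among three interfaces.
raw = [0.01, 0.02, 0.5]
adjusted = bonferroni_adjust(raw)
significant = [p < 0.05 for p in adjusted]
print(significant)  # [True, False, False]
```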

A.2.3 System Usability Scale Scores.

PonDeSlide, BubbleFlick, and KeypadFlick scored 78.3, 67.5, and 61.0, respectively, as shown in Table 4. We conducted an ANOVA test on the SUS scores. There was a significant difference (F(2, 24) = 3.83, p = 0.036), and Bonferroni’s multiple comparisons showed a significant difference between PonDeSlide and KeypadFlick (p = 0.049).
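
For reference, an individual SUS score is derived from the ten questionnaire items with the standard scoring rule [2]: each odd-numbered item contributes its response minus 1, each even-numbered item contributes 5 minus its response, and the sum is multiplied by 2.5 to give a score from 0 to 100. The sketch below implements that rule; the function name is hypothetical and the responses are made up.

```python
def sus_score(responses):
    """Compute a SUS score (0-100) from ten responses on a 1-5 scale."""
    if len(responses) != 10 or any(r < 1 or r > 5 for r in responses):
        raise ValueError("SUS needs ten responses, each between 1 and 5")
    total = 0
    for index, response in enumerate(responses):
        if index % 2 == 0:          # items 1, 3, 5, 7, 9 (positively worded)
            total += response - 1
        else:                       # items 2, 4, 6, 8, 10 (negatively worded)
            total += 5 - response
    return total * 2.5


print(sus_score([3] * 10))                          # 50.0
print(sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]))    # 100.0
```

The group means in Table 4 are averages of such per-participant scores.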

A.2.4 Interview Survey.

We asked all participants whether they had memorized the arrangement of keys and kana. More than half of the PonDeSlide group answered that they had memorized the key layout and the operations for selecting a kana, whereas none of the BubbleFlick group answered yes. Participants in the PonDeSlide group who answered yes commented, “I developed a sense of how much to move my fingers as I got used to it” and “The arrangement of keys and kana is very simple.”

References

[1]
A. Bangor, P. Kortum, and J. Miller. 2009. Determining what individual SUS scores mean: adding an adjective rating scale. Journal of Usability Studies 4, 3 (2009), 114–123.
[2]
J. Brooke. 1996. SUS - a quick and dirty usability scale. In Usability Evaluation in Industry. CRC Press, Boca Raton, FL, USA, 189–194.
[3]
J. Cha, E. Choi, and J. Lim. 2015. Virtual Sliding QWERTY: A new text entry method for smartwatches using Tap-N-Drag. Applied Ergonomics 51 (2015), 263–272.
[4]
X. A. Chen, T. Grossman, and G. Fitzmaurice. 2014. SwipeBoard: A Text Entry Technique for Ultra-small Interfaces That Supports Novice to Expert Transitions. In Proceedings of the Annual ACM Symposium on User Interface Software and Technology (Honolulu, HI, USA) (UIST ’14). ACM, New York, NY, USA, 615–620. http://doi.acm.org/10.1145/2642918.2647354
[5]
B. Fruchard, E. Lecolinet, and O. Chapuis. 2020. Side Crossing Menus: Enabling Large Sets of Gestures for Small Surfaces. Proceedings of the ACM on Human-Computer Interaction 4, ISS (2020), 189: 1–19.
[6]
J. Gong, Z. Xu, Q. Guo, T. Seyed, X. Chen, X. Bi, and X. Yang. 2018. WrisText: One-handed Text Entry on Smartwatch using Wrist Gestures. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Montréal, QC, Canada) (CHI ’18). ACM, New York, NY, USA, 181: 1–14. https://dl.acm.org/doi/10.1145/3173574.3173755
[7]
J. Goodman, G. Venolia, K. Steury, and C. Parker. 2002. Language Modeling for Soft Keyboard. In Proceedings of 7th International Conference on Intelligent User Interface (San Francisco, CA, USA) (IUI ’02). ACM, New York, NY, USA, 194–195. https://doi.org/10.1145/502716.502753
[8]
M. Gordon, T. Ouyang, and S. Zhai. 2016. WatchWriter: Tap and Gesture Typing on a Smartwatch Miniature Keyboard with Statistical Decoding. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (San Jose, CA, USA) (CHI ’16). ACM, New York, NY, USA, 3817–3821. http://doi.acm.org/10.1145/2858036.2858242
[9]
F. Guimbretière and T. Winograd. 2000. FlowMenu: Combining Command, Text, and Data Entry. In Proceedings of the Annual Symposium on User Interface Software and Technology (San Diego, CA, USA) (UIST ’00). ACM, New York, NY, USA, 213–216. https://doi.org/10.1145/354401.354778
[10]
A. Gupta and R. Balakrishnan. 2016. DualKey: Miniature Screen Text Entry via Finger Identification. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (San Jose, CA, USA) (CHI ’16). ACM, New York, NY, USA, 59–70. http://doi.acm.org/10.1145/2858036.2858052
[11]
J. Hong, S. Heo, P. Isokoski, and G. Lee. 2015. SplitBoard: A Simple Split Soft Keyboard for Wristwatch-sized Touch Screens. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Seoul, Korea) (CHI ’15). ACM, New York, NY, USA, 1233–1236. http://doi.acm.org/10.1145/2702123.2702273
[12]
I. Ilinkin and S. Kim. 2017. Evaluation of Korean Text Entry Methods for Smartwatches. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Denver, CO, USA) (CHI ’17). ACM, New York, NY, USA, 722–726. http://doi.acm.org/10.1145/3025453.3025657
[13]
M. Jain and R. Balakrishnan. 2012. User Learning and Performance with Bezel Menus. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Austin, TX, USA) (CHI ’12). ACM, New York, NY, USA, 2221–2230. https://doi.org/10.1145/2207676.2208376
[14]
R. Jang, C. Jung, D. Mohaisen, K. Lee, and D. Nyang. 2022. A One-Page Text Entry Method Optimized for Rectangle Smartwatches. IEEE Transactions on Mobile Computing 21, 10 (2022), 3443–3453.
[15]
A. Komninos and M. Dunlop. 2014. Text Input on a Smart Watch. IEEE Pervasive Computing 13, 4 (2014), 50–58.
[16]
G. Kurtenbach and W. Buxton. 1993. The Limits of Expert Performance Using Hierarchic Marking Menus. In Proceedings of the INTERCHI ’93 and CHI ’93 Conference on Human Factors in Computing Systems (Amsterdam, Netherlands) (CHI ’93). ACM, New York, NY, USA, 482–487. http://dx.doi.org/10.1145/169059.169426
[17]
L. A. Leiva, A. Sahami, A. Catala, N. Henze, and A. Schmidt. 2015. Text Entry on Tiny QWERTY Soft Keyboards. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Seoul, Korea) (CHI ’15). ACM, New York, NY, USA, 669–678. https://doi.acm.org/10.1145/27022123.2702388
[18]
M. Luna, F. Soares, H. Nascimento, J. Siqueira, T. Nascimento, E. Souza, and R. da Costa. 2018. Text Entry on Smartwatches: A Systematic Review of Literature. In Proceedings of 42nd IEEE International Conference on Computer Software and Applications (Tokyo, Japan) (COMPSAC ’18). IEEE, 272–277. https://ieeexplore.ieee.org/document/8377870
[19]
M. Nancel and M. Beaudouin-Lafon. 2009. Extending Marking Menus with Integral Dimensions: Application to the Dartboard Menu. https://inria.hal.science/hal-01523310/.
[20]
A. Neshati, B. Rey, S. A. Faleel, S. Bardot, C. Latulipe, and P. Irani. 2021. BezelGlide: Interacting with Graphs on Smartwatches with Minimal Screen Occlusion. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Yokohama, Japan) (CHI ’21). ACM, New York, NY, USA, 501: 1–13. https://doi.org/10.1145/3411764.3445201
[21]
A. Neshati, A. Salo, S. A. Faleel, Z. Li, H. Liang, C. Latulipe, and P. Irani. 2022. EdgeSelect: Smartwatch Data Interaction with Minimal Screen Occlusion. In Proceedings of International Conference on Multimodal Interaction (Bengaluru, India) (ICMI ’22). ACM, New York, NY, USA, 288–. https://doi.org/10.1145/3536221.3556586
[22]
S. Oney, C. Harrison, A. Ogan, and J. Wiese. 2013. ZoomBoard: A Diminutive QWERTY Soft Keyboard Using Iterative Zooming for Ultra-small Devices. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Paris, France) (CHI ’13). ACM, New York, NY, USA, 2799–2802. http://doi.acm.org/10.1145/2858036.2858242
[23]
K. Perlin. 1998. Quikwriting: Continuous Stylus-Based Text Entry. In Proceedings of the Annual Symposium on User Interface Software and Technology (San Francisco, CA, USA) (UIST ’98). ACM, New York, NY, USA, 215–216. https://doi.org/10.1145/288392.288613
[24]
B. Rey, K. Zhu, S. T. Perrault, S. Bardot, A. Neshati, and P. Irani. 2022. Understanding and Adapting Bezel-to-Bezel Interactions for Circular Smartwatches in Mobile and Encumbered Scenarios. Proceedings of the ACM on Human-Computer Interaction 6, MHCI (2022), 201: 1–28.
[25]
S. Reyal, S. Zhai, and P. O. Kristensson. 2015. Performance and User Experience of Touchscreen and Gesture Keyboards in a Lab Setting and in the Wild. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Seoul, Korea) (CHI ’15). ACM, New York, NY, USA, 679–688. http://doi.acm.org/10.1145/2702123.2702597
[26]
M. Serrano, E. Lecolinet, and Y. Guiard. 2013. Bezel-Tap Gestures: Quick Activation of Commands from Sleep Mode on Tablets. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Paris, France) (CHI ’13). ACM, New York, NY, USA, 3027–3036. https://doi.org/10.1145/2470654.2481421
[27]
Y. F. Shao, M. C. Ogimoto, R. Pointner, Y. C. Ling, C. T. Wu, and M. Chen. 2016. SwipeKey: A Swipe-based Keyboard Design for Smartwatches. In Proceedings of the International Conference on Human-Computer Interaction with Mobile Devices and Services (Florence, Italy) (MobileHCI ’16). ACM, New York, NY, USA, 60–71. http://doi.acm.org/10.1145/2935334.2935336
[28]
K. A. Siek, Y. Rogers, and K. H. Connelly. 2005. Fat Finger Worries: How Older and Younger Users Physically Interact with PDAs. In Proceedings of the IFIP International Conference on Human-Computer Interaction (Rome, Italy) (INTERACT ’05). Springer-Verlag, Berlin, Heidelberg, 267–280. https://doi.org/10.1007/11555261_24
[29]
R. W. Soukoreff and I. S. MacKenzie. 2003. Metrics for Text Entry Research: An Evaluation of MSD and KSPC, and a New Unified Error Metric. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Ft. Lauderdale, Florida, USA) (CHI ’03). ACM, New York, NY, USA, 113–120. http://doi.acm.org/10.1145/642611.642632
[30]
T. Tojo, T. Kato, and S. Yamamoto. 2018. BubbleFlick: Investigating Effective Interface for Japanese Text Entry on Smartwatches. In Proceedings of the International Conference on Human-Computer Interaction with Mobile Devices and Services (Barcelona, Spain) (MobileHCI ’18). ACM, New York, NY, USA, 44:1–12. http://doi.org/10.1145/3229434.3229455
[31]
K. Vertanen, C. Fletcher, A. Stanage, R. Watling, and P. O. Kristensson. 2019. VelociWatch: Designing and Evaluating a Virtual Keyboard for the Input of Challenging Text. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Glasgow, Scotland, UK) (CHI ’19). ACM, New York, NY, USA, 591:1–14. http://doi.org/10.1145/3290605.3300821
[32]
K. Vertanen, H. Memmi, J. Emge, S. Reyal, and P. O. Kristensson. 2015. Investigating Fast Mobile Text Entry Using Sentence-Based Decoding of Touchscreen Keyboard Input. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Seoul, Korea) (CHI ’15). ACM, New York, NY, USA, 659–668. http://doi.acm.org/10.1145/2702123.2702135
[33]
D. Vogel and P. Baudisch. 2007. Shift: A Technique for Operating Pen-Based Interfaces Using Touch. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (San Jose, CA, USA) (CHI 2007). ACM, New York, NY, USA, 657–666. https://doi.org/10.1145/1240624.1240727
[34]
S. Zhao, M. Agrawala, and K. Hinckley. 2006. Zone and Polygon Menus: Using Relative Position to Increase the Breadth of Multi-Stroke Marking Menus. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Montréal, QC, Canada) (CHI 2006). ACM, New York, NY, USA, 1077–1086. https://doi.org/10.1145/1124772.1124933
[35]
J. Zheng, X. Bi, Y. Li, and S. Zhai. 2018. M3 Gesture Menu: Design and Experimental Analyses of Marking Menus for Touchscreen Mobile Interaction. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Montréal, QC, Canada) (CHI ’18). ACM, New York, NY, USA, 249: 1–14. https://dl.acm.org/doi/10.1145/3173574.3173823

    Published In

    CHI '24: Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems
    May 2024
    18961 pages
    ISBN:9798400703300
    DOI:10.1145/3613904
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Japanese kana
    2. PonDeFlick
    3. smartwatch
    4. software keyboard
    5. text entry
