
Enhancing Robot Programming with Visual Feedback and Augmented Reality

Stéphane Magnenat
Disney Research Zurich, Switzerland
stephane@magnenat.net

Mordechai Ben-Ari
Dept. of Science Teaching, Weizmann Inst. Sci., Israel
moti.ben-ari@weizmann.ac.il

Severin Klingler
Computer Graphics Lab., ETH Zurich, Switzerland
severin.klingler@inf.ethz.ch

Robert W. Sumner
Disney Research Zurich, ETH Zurich, Switzerland
sumner@disneyresearch.com

ABSTRACT

In our previous research, we showed that students using the educational robot Thymio and its visual programming environment were able to learn the important computer-science concept of event-handling. This paper extends that work by integrating augmented reality (ar) into the activities. Students used a tablet that displays in real time the event executed on the robot. The event is overlaid on the tablet over the image from a camera, which shows the location of the robot when the event was executed. In addition, visual feedback (fb) was implemented in the software. We developed a novel video questionnaire to investigate the performance of the students on robotics tasks. Data were collected comparing four groups: ar+fb, ar+non-fb, non-ar+fb, non-ar+non-fb. The results showed that students receiving feedback made significantly fewer errors on the tasks. Students using ar also made fewer errors, but this improvement was not statistically significant. Technical problems with the ar hardware and software showed where improvements are needed.

Categories and Subject Descriptors

K.3.2 [Computers & Education]: Computer and Information Science Education - Computer Science Education; I.2.9 [Robotics]

General Terms

Human Factors

Keywords

robotics in education; Thymio; Aseba; VPL; augmented reality; event-actions pair

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
ITiCSE'15, July 04–08, 2015, Vilnius, Lithuania.
Copyright is held by the owner/author(s). Publication rights licensed to ACM.
ACM 978-1-4503-3440-2/15/07 ...$15.00.
http://dx.doi.org/10.1145/2729094.2742585

1. INTRODUCTION

Robotics activities are widely used to introduce students to science, technology, engineering, and mathematics (stem) in general and to computer science in particular [6]. Robotics activities are exciting and fun, but we are also interested in investigating whether the activities lead to learning of stem subjects. In a previous paper [9], we described research conducted during an outreach program using the Thymio II education robot and its Visual Programming Language (vpl). We showed that students successfully learned the important computer-science concept of event-handling.

However, while students were able to comprehend behaviors consisting of independent events, they had trouble with sequences of events. This paper explores two independent ways of improving their understanding of robotics programming: visual feedback (fb), which shows which event handler is currently being executed, and augmented reality (ar), as originally suggested by the first author [7].

The research methodology was also improved. In [9], learning was measured by administering a textual questionnaire containing exercises about vpl programs and the behaviors of the robot that could be observed when the programs were run. We observed that some young students found the textual questionnaire difficult to understand. Therefore, we implemented a new type of research instrument, a video questionnaire, in which the students were given a multiple choice among several short video clips.

The performance of the students was measured in a 2×2 experimental setup: treatment groups that used ar compared with control groups that did not, and treatment groups that received fb compared with those that did not.

Section 2 describes the robot and the software environment, while Section 3 discusses previous work on ar in education and the ar system that we developed. The research methodology and the design of the video questionnaire are presented in Section 4. The results of the analysis, the discussion and the limitations of the research appear in Sections 5–7. Section 8 describes our plans for the future.
2. THYMIO II AND ASEBA

The Thymio II robot [11] (Figure 1) and its Aseba software were created at the Swiss Federal Institutes of Technology (epfl and ethz) and ecal (University of Arts and Design). Both the hardware design and the software are open-source. The robot is small (11 × 11 × 5 cm), self-contained and robust, with two independently-driven wheels for differential drive. It has five proximity sensors on the front and two on the back, and two sensors on the bottom. There are five buttons on the top, a three-axis accelerometer, a microphone, an infrared sensor for receiving signals from a remote control, and a thermometer. For output, there are rgb leds at the top and bottom of the robot, as well as mono-colored leds next to the sensors, and a sound synthesizer. A printed image was attached to the top of the robot so that the camera could recognize the robot when ar was used (Figure 1).

Figure 1: The Thymio II robot with a top image for tracking by the camera of the tablet.
The Aseba programming environment [8] uses the construct onevent to create event handlers for the sensors. vpl is a component of Aseba for visual programming.1 Figure 2 shows a vpl program for following a line of black tape on a white floor. On the left is a column of event blocks and on the right is a column of action blocks. By dragging and dropping one event block and one or more action blocks to the center pane, an event-actions pair is created. Both event and action blocks are parametrized, enabling the user to create many programs from a small number of blocks.

Figure 2: The Aseba/VPL environment.

1 A reference manual and a tutorial are available at http://aseba.wikidot.com/en:thymioprogram.

The robotics activities reported here used a development version of vpl; the most important improvement is that several actions can be attached to a single event.2

2 This is the reason we now use the term event-actions pair.

Visual feedback is implemented by causing an event-actions pair to blink whenever it is executed. This facilitates understanding the temporal relation between the pairs and the spatial relation between the robot and its environment.
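To make these semantics concrete, the following listing models an event-actions program in plain Python. It is an illustration only, not the actual Aseba/VPL implementation; the data structure, the sensor-event dictionary and the on_pair_executed callback (standing in for the blinking visual feedback) are assumptions made for the example.

    # A program is a list of event-actions pairs; when a sensor event matches a
    # pair, its actions run and the pair is reported so the editor can blink it.
    from dataclasses import dataclass
    from typing import Callable, List

    @dataclass
    class EventActionsPair:
        matches: Callable[[dict], bool]    # does this sensor event trigger the pair?
        actions: List[Callable[[], None]]  # one or more actions per event

    def dispatch(pairs, sensor_event, on_pair_executed):
        for index, pair in enumerate(pairs):
            if pair.matches(sensor_event):
                for action in pair.actions:
                    action()
                on_pair_executed(index)    # hook for the visual feedback

    # Toy line-following behaviour in the spirit of Figure 2.
    program = [
        EventActionsPair(lambda e: e["ground_left"] == "black",
                         [lambda: print("steer left")]),
        EventActionsPair(lambda e: e["ground_right"] == "black",
                         [lambda: print("steer right")]),
    ]
    dispatch(program, {"ground_left": "black", "ground_right": "white"},
             on_pair_executed=lambda i: print(f"pair {i} executed (blink it)"))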

3. AUGMENTED REALITY

3.1 Background

Visual programming languages have been used extensively [1] and event-based programming is claimed to be an effective approach for teaching introductory programming [5]. However, we found that the asynchronous nature of visual event-based programming renders the understanding and tracing of their execution difficult [9]. Nevertheless, when a visual language is used, we can perceive a relation between the spatiotemporal location of the robot and the execution of the program. Building on the neurological evidence on grounded cognition [3], we propose to make this relation explicit and available to the student. Our hypothesis is that this will allow the students to understand better what their program is doing and lead them to learn faster. We propose to use a tablet to provide a "window" into the "live mind" of the robot, localizing the robot using ar. The resulting live inspection system uses both the spatiality and the temporality of event execution to help students understand what their program is doing.

Previous work on ar in education [2, 13] has highlighted that ar systems have a cost in terms of weight and bulkiness; furthermore, little quantitative comparison of their effectiveness against non-ar solutions for the same problem has been conducted [13]. Moreover, there has been little use of ar in computer-science education. Some work has explored how to input programs using physical artifacts [4], but none has used ar to provide facilities for tracing and debugging.

3.2 The augmented reality system

The ar system consists of an Android or iOS application, which runs on a tablet and connects to the computer running vpl. The application finds the position of the tablet by detecting a ground image using the tablet's camera (Figure 3), and the position of the robot by detecting the printed image on its top (Figure 1). The camera's image is shown on the screen of the tablet and is overlaid with the event-actions pairs at the physical locations where they were executed by the robot (Figure 4). At the bottom of the tablet screen, the execution times of these events are shown on a local timeline that can be scrolled and zoomed using drag and pinch gestures. Augmented pairs can be selected by touching them on screen, and the corresponding time in the local timeline will be highlighted. A global timeline indicates which part of the total recording the local timeline shows.
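A minimal sketch of the data the app could keep per executed pair, and of how a local timeline window could be selected from the full recording, follows. The record fields and the window function are illustrative assumptions; the paper does not specify the app's internal data structures.

    # Each overlay icon corresponds to one execution of an event-actions pair,
    # remembered with its time and the robot position estimated by AR tracking.
    from dataclasses import dataclass
    from typing import List, Tuple

    @dataclass
    class AugmentedPair:
        pair_index: int                        # which event-actions pair was executed
        timestamp: float                       # seconds since the start of the recording
        robot_position: Tuple[float, float]    # 2D position on the ground plane

    def local_timeline(recording: List[AugmentedPair], start: float, end: float):
        """Return the executions visible in the scrolled/zoomed local timeline;
        the global timeline simply spans the whole recording."""
        return [p for p in recording if start <= p.timestamp <= end]

    recording = [
        AugmentedPair(0, 1.2, (0.10, 0.05)),
        AugmentedPair(1, 3.8, (0.22, 0.07)),
        AugmentedPair(0, 7.5, (0.31, 0.12)),
    ]
    print(local_timeline(recording, 3.0, 8.0))   # the two later executions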
Figure 3: The concept of the VPL AR App.

Figure 4: The GUI of the VPL AR App (callouts in the original figure: an AR event-actions pair icon is generated when the robot executes this pair; local timeline; global timeline).

The ar system was implemented using the Vuforia library3 for tracking the image of the ground and the top of the robot, and its plugin for the Unity framework.4 It communicates with vpl through tcp/ip.

3 http://www.qualcomm.com/products/vuforia
4 http://www.unity3d.com
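The paper states only that the tablet application talks to vpl over tcp/ip; the wire format is not documented here. The sketch below therefore shows one plausible way such a link could be consumed, with an assumed one-JSON-object-per-line protocol and assumed field names, purely for illustration.

    # Hypothetical client that receives "pair executed" notifications from the
    # computer running vpl and tags them with a pose supplied by the AR tracker.
    import json
    import socket
    import time

    def collect_executions(host: str, port: int, get_robot_pose):
        log = []
        with socket.create_connection((host, port)) as connection:
            for line in connection.makefile():
                message = json.loads(line)       # assumed format: {"pair": <index>}
                log.append({
                    "pair": message["pair"],
                    "time": time.time(),         # feeds the local/global timelines
                    "pose": get_robot_pose(),    # 2D robot position from AR tracking
                })
        return log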

4. RESEARCH METHODOLOGY

4.1 Population

The workshops consisted of 14 sessions of 75 minutes. Two sessions were run in parallel. The workshops took place in Lugano, Switzerland on October 16–17, 2014. There were 10–18 high-school students per session, from high schools in the Swiss canton of Ticino. The median age of the students was 16 (low/high quartiles: 16/17). Consent forms were required and participants were allowed to opt out of the study.

There was one robot and tablet per 2 or 3 students. Various models of iOS and Android tablets (7–10") were used. There were four teaching assistants, two per room, who were students at USI (Università della Svizzera italiana). The same two assistants were always paired together. After every two sessions, the assistants exchanged rooms to prevent bias due to a specific pair of assistants.
4.2 Experimental setup

There were two independent variables in our research:

• Augmented reality: ar was used by the students in room 1, while in room 2 vpl was used without ar.

• Visual feedback: The first session in each room used fb from vpl and the ar system, whereas fb was not used during the second session.

The first 15 minutes of each session were devoted to introducing the robot and its built-in behaviors. During the next 15 minutes, the students learned about vpl; this was followed by 30 minutes devoted to solving increasingly challenging tasks.5 During the final 15 minutes the students answered the video questionnaire.

5 The tasks were the same as those used in [9]. Some tasks were very simple, like changing colors when a button is pressed; others were more difficult, like following a track or navigating a labyrinth.

We collected usage data from the vpl editor: addition and deletion of blocks, change of parameters, and clicks on buttons. For the ar system, we collected usage data from the tablet: its position in 3D, the position of the robot in 2D, and the state of the application.6

6 The raw data is available at https://aseba.wikidot.com/en:thymiopaper-vpl-iticse2015.
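As an illustration of how such editor logs feed the analysis in Section 5.2, the sketch below derives, from a toy log, the two quantities compared there: the time between consecutive clicks on the run button and the number of editing actions between runs. The record format is an assumption, not the actual log format of vpl.

    # Toy editor log; each entry is one logged user action with a timestamp (s).
    editor_log = [
        {"t": 12, "kind": "add_block"},
        {"t": 16, "kind": "change_parameter"},
        {"t": 21, "kind": "run"},
        {"t": 30, "kind": "add_block"},
        {"t": 48, "kind": "run"},
    ]

    def usage_metrics(log):
        """Return (times between consecutive runs, action counts between runs)."""
        times, counts = [], []
        last_run, actions = None, 0
        for entry in log:
            if entry["kind"] == "run":
                if last_run is not None:
                    times.append(entry["t"] - last_run)
                    counts.append(actions)
                last_run, actions = entry["t"], 0
            else:
                actions += 1
        return times, counts

    print(usage_metrics(editor_log))   # ([27], [1])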
4.3 The questionnaire

In our previous research [9], the questions consisted of an image of a vpl program, together with multiple-choice responses that were textual descriptions of the behavior of the robot when executing the program.7 The students were asked to choose the description that correctly described the behavior. We found that some students had difficulty understanding the textual descriptions.

7 http://thymio.org/en:thymiopaper-vpl-iticse2014

For the current research we used a novel type of questionnaire based upon video clips.8 There were eight questions, four of each of the following types:

• The student is shown a video of the behavior of a robot and then asked to select a program (one out of four) that causes the behavior.

• The student is shown a program and four videos and then asked to select the video that demonstrates the behavior of the robot running the program.

8 http://thymio.org/en:thymiopaper-vpl-iticse2015

The questionnaire was constructed using the Forms facility of Google Drive. In Forms, videos can be included by uploading them to YouTube and providing the URL.

The questionnaire in [9] was constructed using the taxonomy in [10] that combines the Bloom and SOLO taxonomies. For this research, we limited the questions to those at the applying level of the taxonomy, because a student who is able to track the execution of a program can be assumed to understand what the individual instructions do and how they work together, whereas at the understanding level, rote learning might be sufficient to select a correct answer. Evaluating higher cognitive levels such as creating will be done in later phases of the research. In vpl, every program has both an event and an action, so we did not make the unistructural / multistructural distinction from the SOLO taxonomy.

5. RESULTS

5.1 The questionnaire

To compare the treatment vs. the control groups, we counted the number of mistakes for every participant. Table 1 shows the mean mistake count and the p-value of Pearson's chi-square test of its histograms, for the null hypothesis of no effect. We used Laplace smoothing (adding 1 to each bin) to apply the chi-square test even when the control group has 0 entries for a given mistake count. We see that using fb decreases the mistake count significantly, while using ar is not significant.

                   treatment        control          p-value
  Feedback         0.81 (n = 47)    1.74 (n = 34)    0.003
  Augm. Reality    1.00 (n = 41)    1.40 (n = 40)    0.10

Table 1: The mean mistake count and the p-value of Pearson's chi-square test for different conditions.
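One way to reproduce this kind of test on illustrative data (not the data of the study) is sketched below with scipy: build the histogram of mistake counts for each group, add 1 to every bin (Laplace smoothing), and run Pearson's chi-square test on the resulting 2×k table.

    import numpy as np
    from scipy.stats import chi2_contingency

    # Hypothetical mistake counts per participant; NOT the collected data.
    treatment = np.array([0, 1, 0, 2, 1, 0, 0, 1])
    control   = np.array([2, 3, 1, 0, 2, 4, 1, 2])

    edges = np.arange(0, max(treatment.max(), control.max()) + 2)
    hist_treatment, _ = np.histogram(treatment, bins=edges)
    hist_control, _   = np.histogram(control, bins=edges)

    # Laplace smoothing: add 1 to each bin so that empty bins do not break the test.
    table = np.vstack([hist_treatment + 1, hist_control + 1])
    chi2, p, dof, expected = chi2_contingency(table)
    print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.3f}")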
Table 2 shows the error rate of the answers for the four setups in the experiment: AF = ar and fb, AN = ar with no fb, NF = no ar but with fb, NN = neither ar nor fb. We see that some questions were answered correctly by almost all students, while other questions were more difficult and more than 30% of the students gave wrong answers. We also see that the error rate depends on the setup.

        Q1      Q2     Q3     Q4      Q5      Q6      Q7      Q8
  AF    0.0     0.0    4.0    32.0    4.0     24.0    0.0     4.0
  AN    18.8    0.0    0.0    37.5    18.8    25.0    18.8    31.2
  NF    0.0     0.0    0.0    22.7    18.2    18.2    18.2    18.2
  NN    16.7    5.6    5.6    38.9    16.7    44.4    33.3    33.3

Table 2: The error rate (%) of the questionnaire answers. n = AF:25, AN:16, NF:22, NN:18

We see from Table 3 that the error rate is always lower with fb than without, and that this difference is significant for Q1 and Q8, and borderline significant for Q7.

        Q1      Q2      Q3      Q4      Q5      Q6      Q7      Q8
  F     0.0     0.0     2.1     27.7    10.6    21.3    8.5     10.6
  N     17.6    2.9     2.9     38.2    17.6    35.3    26.5    32.4
  p     0.01    0.87    0.62    0.44    0.56    0.25    0.06    0.03

Table 3: The error rate (%) for fb/non-fb; p-values of Pearson's chi-square test. n = F:47, N:34

Q1 showed a video of the robot turning right when presented with an object in front of its left sensor. The students had to select one of four programs, which differed in inverting the left/right sensors and the direction of movement. The correct program was: [vpl program screenshot not reproduced here]

Q8 showed the following program: [vpl program screenshot not reproduced here] The students had to select one of four video clips that varied in the condition that could cause the robot's leds to become red. The correct video showed red when either the front button was pressed or an obstacle was placed in front of the robot. Although these questions are relatively simple, the students must reason on the spatial and logical relations between sensing and acting. The significant improvement when using fb probably indicates that the fb caused the students to become more aware of these relations while experimenting with the robot.

Table 4 shows that the error rate is generally lower with ar than without, but the difference is not significant except for Q7, whose significance is borderline.

        Q1      Q2      Q3      Q4      Q5      Q6      Q7      Q8
  A     7.3     0.0     2.4     34.1    9.8     24.4    7.3     14.6
  N     7.5     2.5     2.5     30.0    17.5    30.0    25.0    25.0
  p     0.69    0.99    0.48    0.87    0.49    0.75    0.06    0.37

Table 4: The error rate (%) for AR/non-AR; p-values of Pearson's chi-square test. n = A:41, N:40

In Q7, the students had to select one of four video clips that varied in the behavior of the robot when an object was placed in front of its sensors. The program caused the robot to turn right or left when an object was detected in front of the left or right sensors, respectively; when the object was detected by the center sensor, the robot moved forward.9 This question required understanding the relation between two event-actions pairs in sequence and the specific sensor events. We believe that seeing the execution of the event-actions pairs in context improved the understanding of these relations.

9 The program and videos can be examined at http://thymio.org/en:thymiopaper-vpl-iticse2015.
5.2 The usage data

To better understand the differences between treatment and control groups for the different conditions, we investigated the usage data that we collected during the study. Figure 5 compares the median time between consecutive clicks on the run button in the vpl environment with the median number of actions between two consecutive runs, when using ar or not and when using fb or not. For ar, there is a significant difference between the treatment group and the control group. With ar, there were significantly fewer actions between the runs and significantly less time between clicks (Mann-Whitney U test, p < 0.001). When fb is given, there is no significant difference in the usage data of the treatment and control groups.

Figure 5: Comparison of usage behavior of students using AR (circles) or not (squares). [Scatter plot: median time between runs (s), 0–100, on the horizontal axis; median number of actions between runs, 0–14, on the vertical axis.]

                                treatment        control          p-value
  Time between runs (s)
    Feedback                    23.5 (n = 47)    24.8 (n = 34)    0.39
    Augm. Reality               19.0 (n = 41)    36.0 (n = 40)    < 0.001
  Action count between runs
    Feedback                    4.0 (n = 47)     4.8 (n = 34)     0.45
    Augm. Reality               2.0 (n = 41)     7.9 (n = 40)     < 0.001

Table 5: Median statistics of usage behavior and the p-value of the Mann-Whitney U test.

A possible explanation for this difference is that using ar enabled students to identify possible errors in their programs more quickly and more precisely, so they had a better understanding of the necessary steps and were addressing smaller problems at a time. Therefore, the more advanced learning environment led to the reduced reaction times and fewer actions between runs. Conversely, this difference could be interpreted as follows: the additional complexity introduced by the ar system caused stress for the students and prevented them from focusing on the programming task. In turn, this could have led to a trial-and-error behavior where the students tested different programs at random without understanding the underlying concepts. Further research is necessary to show which of these two hypotheses is correct, or to find another reason.
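The group comparisons in Table 5 use the Mann-Whitney U test; a minimal sketch of such a comparison, on made-up per-student values rather than the study data, is:

    from scipy.stats import mannwhitneyu

    # Hypothetical median time between runs (s) per student, for illustration only.
    ar_group     = [15.2, 18.7, 19.0, 22.4, 17.9, 21.3]
    non_ar_group = [30.1, 41.6, 36.0, 28.8, 39.5, 33.2]

    statistic, p_value = mannwhitneyu(ar_group, non_ar_group, alternative="two-sided")
    print(f"U = {statistic}, p = {p_value:.4f}")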
5.3 The observations

In this section we present some issues that we observed during the sessions.

5.3.1 Design and implementation of the AR

• The groups with ar required intensive support because ar significantly increased the complexity of the setup. By intensive support we mean that students had more than two questions per hour on average, and that answering these questions required more than one minute.

• Some students found using the tablet as a debugger to be unintuitive.

• Several students tended to keep the tablet too close to the robot and therefore the tablet did not see the ground image.

• The use of the tablet was not uniform: some students did not use the tablet at all, while others seemed lost in contemplating reality through the tablet.

• The students did not always realize whether the tablet was tracking the ground or not.

• The software did not work uniformly well on different devices, especially those with different screen sizes.

• Energy consumption was a problem.

• The current setup of a computer running vpl while a separate tablet runs the ar is cumbersome.

• Some tablets have poor focusing abilities and sometimes stayed out of focus for several minutes.

• The ar system sometimes lost track of the robot.

These problems point out the technical and pedagogical difficulties of deploying ar in an educational context. Several technological difficulties can be solved by investing more effort in development. For example, vpl should run on the tablet so that a computer is not needed, energy-saving algorithms should be implemented, as should algorithms for robust localization [12, Chapter 5]. The pedagogical difficulties point to the need for careful instruction on how to use ar and how to debug programs.

5.3.2 Implementation of the questionnaire

Feedback from colleagues and observations from a pilot use of the questionnaire led us to re-do the video clips. The original clips were taken with the robot facing the user and the camera. This makes it easy to see the horizontal proximity sensors, which were widely used in the questionnaire's programs, but it required mental effort to interpret the directions "right" and "left" when they referred to the body of the robot. The video clips were photographed again, this time from the back of the robot. The advantage is that there is no need to mentally translate the directions; the disadvantage is that the sensors cannot be seen.

We found Google Forms easy to use, but there were two disadvantages: there is little flexibility in the format of the questions, and YouTube shows unrelated videos after each clip, which proved to be distracting to some students.

6. DISCUSSION

For all the questions in the questionnaire, students who received fb achieved lower error rates than those who did not receive fb. Similarly, students who used ar achieved lower error rates in seven out of the eight questions. However, the improved performance was only significant in a few cases. The results are therefore more encouraging than conclusive. They do seem to indicate that visualization such as that provided both by fb and ar can improve students' spatial and temporal understanding of programs in the context of robotics.

The observations in Section 5.3 show that implementing ar is a difficult technical challenge. One has to take into account physical aspects such as the weight and position of the tablet, as well as algorithmic aspects such as the localization, and interaction design aspects such as the user interface. It is also not surprising that intensive support and explicit instruction are needed if students are to obtain the maximum benefit from a sophisticated technology like ar.

We found the video-based questionnaire to be very successful; it allowed the students to answer the questionnaire in less time than with the textual questionnaire of [9]. A video questionnaire is more appropriate when studying young children as it does not require a high level of linguistic capabilities. Even when language is not a problem, we believe that video questionnaires should be used when asking about the physical behavior of robots. Free and easily available tools (Google Forms and YouTube) enabled us to quickly construct an adequate questionnaire, although more flexible software support is needed for an optimal experience.

7. LIMITATIONS OF THE RESEARCH

The large population ensures that the results are reliable, but the experiment was carried out in one location, at one time and using a specific robot and ar system, so it may not be generalizable.

While the technical difficulties described in Section 5.3 were not surprising in a first attempt to use ar in this context, they did cause the activities to be sub-optimal and possibly prevented ar from realizing its full potential.

The limitation of the questions to those at the applying level of the Bloom taxonomy means that the research focused on only one form of learning.

8. CONCLUSIONS

We carried out a quantitative study of the effect of visual fb and ar on the learning of a cs concept using a mobile robot. Visual fb had a significant positive effect on some questions, while ar improved the students' performance, but the improvement was not statistically significant. Together with our previous study [9], this research supports the claim that robotics activities are effective for teaching introductory cs at the K-12 level.

We described a new research tool, the video-based questionnaire, that is appropriate for investigating learning in young students.

We believe that this paper is the first report of a study of using ar to improve learning in cs education, and one of the small number of quantitative studies on the use of ar in education in general [13]. We described some of the difficulties of using ar in the context of educational robotics activities. As a next step, the ar hardware and software need to be made more robust and easy to use, and learning materials designed for ar must be developed. Then further research is needed to accurately characterize when fb and ar improve learning.

9. ACKNOWLEDGEMENTS

We thank Alessia Marra, Maurizio Nitti, Maria Beltran and Manon Briod for their artistic contributions to vpl and its augmented reality version. We also thank Prof. Matthias Hauswirth for allowing us to run the USI workshop, and Elisa Larghi for her help in organizing it. We thank the assistants of the USI workshop, Christian Vuerich, Matteo Morisoli, Gabriele Cerfoglio and Filippo Ferrario, for their dedication in running the workshops. We thank the students attending the workshops for their cooperation. Finally, we thank the anonymous reviewers who provided useful feedback that improved the article. The research leading to these results has received funding from the European Union's Seventh Framework Programme under grant agreement no 603662.

10. REFERENCES

[1] A. L. Ambler, T. Green, T. D. Kumura, A. Repenning, and T. Smedley. 1997 visual programming challenge summary. In Proceedings IEEE Symposium on Visual Languages, pages 11–18, 1997.
[2] T. N. Arvanitis, A. Petrou, J. F. Knight, S. Savas, S. Sotiriou, M. Gargalakos, and E. Gialouri. Human factors and qualitative pedagogical evaluation of a mobile augmented reality system for science education used by learners with physical disabilities. Personal and Ubiquitous Computing, 13(3):243–250, 2009.
[3] L. W. Barsalou. Grounded cognition. Annual Review of Psychology, 59(1):617–645, 2008.
[4] M. U. Bers, L. Flannery, E. R. Kazakoff, and A. Sullivan. Computational thinking and tinkering: Exploration of an early childhood robotics curriculum. Computers & Education, 72:145–157, 2014.
[5] K. Bruce, A. Danyluk, and M. Thomas. Java: An Eventful Approach. Prentice Hall, 2006.
[6] K. P. King and M. Gura, editors. Classroom Robotics: Case Stories of 21st Century Instruction for Millennial Students. Information Age Publishing, Charlotte, NC, 2007.
[7] S. Magnenat and F. Mondada. Improving the Thymio Visual Programming Language experience through augmented reality. Technical Report EPFL-200462, ETH Zürich and EPFL, March 2014. http://infoscience.epfl.ch/record/200462 (last accessed 16 November 2014).
[8] S. Magnenat, P. Rétornaz, M. Bonani, V. Longchamp, and F. Mondada. ASEBA: A modular architecture for event-based control of complex robots. IEEE/ASME Transactions on Mechatronics, PP(99):1–9, 2010.
[9] S. Magnenat, J. Shin, F. Riedo, R. Siegwart, and M. Ben-Ari. Teaching a core CS concept through robotics. In Proceedings of the Nineteenth Annual Conference on Innovation & Technology in Computer Science Education, pages 315–320, Uppsala, Sweden, 2014.
[10] O. Meerbaum-Salant, M. Armoni, and M. Ben-Ari. Learning computer science concepts with Scratch. Computer Science Education, 23(3):239–264, 2013.
[11] F. Riedo, M. Chevalier, S. Magnenat, and F. Mondada. Thymio II, a robot that grows wiser with children. In IEEE Workshop on Advanced Robotics and its Social Impacts (ARSO), 2013.
[12] R. Siegwart, I. R. Nourbakhsh, and D. Scaramuzza. Introduction to Autonomous Mobile Robots (Second Edition). MIT Press, Cambridge, MA, 2011.
[13] P. Sommerauer and O. Müller. Augmented reality in informal learning environments: A field experiment in a mathematics exhibition. Computers & Education, 79:59–68, 2014.