
1 Introduction

Despite expert warnings, many drivers around the world use cell phones or talk to other passengers while driving. Studies clearly indicate that such behavior degrades drivers’ ability to drive safely.

While video calling on mobile phones remains a potential distraction for today’s drivers, in this paper we examine whether replacing conversations with passengers by video calls shown on augmented reality (AR) displays could improve the user experience without raising the risk level.

AR displays project images into the user’s visual scene so that those images appear to be part of the natural scene. AR devices such as the HoloLens have the potential to reduce driver distraction by presenting visual information close to the driver’s visual focus while still allowing the driver to view the driving environment. However, the HoloLens is a powerful computer, and we can expect drivers to use it as such, even for video calls. How distracting this is remains unknown.

Therefore, in this paper we assess the effects of an on-screen video call that simulates an AR display, and contrast it with a speech-only conversation. We conducted a study in which participants controlled a simulated vehicle while engaging in a secondary word-guessing task. Based on prior work on video calling while driving, our hypothesis is that on straight roads drivers’ visual attention to the road ahead will be reduced when they can see the passenger, compared to when they can only hear the passenger. A likely reason is that nonverbal cues such as facial expressions, gestures, and posture are not available when conversing only through the rear-view mirror.

2 Methodology

2.1 Tasks

The participants engaged in two tasks in parallel: a driving task and a spoken task. We define the driving task as the primary task and the spoken task as the secondary task. The driver was instructed to give the primary task high priority and to perform the secondary task with the spare capacity. We used a common car-following task as the driving task. The scenario was a three-lane straight highway with a target speed of 75 km/h. The driver had to follow a blue passenger car at an appropriate distance. Apart from the lead vehicle there was no other traffic. During the experiment, the lead car could change lanes suddenly because of traffic cones placed on the road. The participants had to follow the lead car’s lane change and avoid a crash. As a result, a participant who devoted too much attention to the secondary task would miss the lead car’s sudden maneuver and become involved in an accident.
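The car-following behavior described above can be illustrated with a minimal sketch. This is not the actual SimCreator vehicle model; the function name, the proportional gain, and the 2-second time headway are assumptions for illustration only.

```python
# Illustrative sketch only: a simple proportional car-following controller.
# The gain and desired headway are assumed values, not SimCreator parameters.

def follow_speed(gap_m, lead_speed_mps, desired_headway_s=2.0, k=0.3):
    """Adjust own speed to keep roughly a constant time gap to the lead car."""
    desired_gap = desired_headway_s * lead_speed_mps
    # Speed up when the gap is too large, slow down when it is too small.
    return lead_speed_mps + k * (gap_m - desired_gap)

# The lead car travels at 75 km/h (about 20.8 m/s), as in the scenario.
lead = 75 / 3.6
print(follow_speed(gap_m=60.0, lead_speed_mps=lead))  # gap too wide: faster than lead
print(follow_speed(gap_m=30.0, lead_speed_mps=lead))  # gap too tight: slower than lead
```

A driver absorbed in the secondary task effectively stops closing this control loop, which is why the sudden lane change of the lead car is the critical event in the scenario.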

While performing the primary task, the participant also engaged in a spoken task. The driver and the passenger played a series of word-guessing games designed for two players. One player was given a target word and had to describe its meaning without saying the word itself; the driver had to guess the word. The purpose of this game was to give the driver the feeling of talking with a passenger.

The spoken task was carried out in two ways: a traditional method and a visible method. In the traditional method the driver can only hear the passenger and must respond based on vocal information alone. The visible method lets the driver also use the visual channel by indirectly seeing the passenger’s facial expressions and gestures through an AR-like display. We used a WeChat video call to transmit the real-time image of the passenger to the driver; the video window was placed in the top-right corner of the driving scene. Although we initially intended to use the HoloLens to provide the AR experience, the first-generation HoloLens no longer supports Skype, so we used a WeChat video call as a compromise. In practice the two presentations are similar, because the driving task was conducted in a simulator that provides only a 2D image, whereas a real driving scene is 3D. For each participant the same experimenter acted as the passenger, sitting behind the driver.

2.2 Design

We conducted a one-factor within-subjects experiment comparing two conditions. In the speech-only condition the driver could only hear the passenger (the experimenter). In the video-call condition the driver could additionally see the passenger. In both conditions the passenger could see the driving scene and the state of the driver, as when actually sitting behind the driver. We counterbalanced the presentation order of the two conditions, and the word-guessing items were the same for each participant.

2.3 Participants

Six student participants took part in the experiment to test the proposed prototype. All of them were students in the HCI course. Each participant received a gift for taking part.

2.4 Driving Scenario Building

The driving scenario was built with SimVista, the built-in development kit of SimCreator. The total length of the road is 20 km, and the speed of the lead car is set to 75 km/h. We designed two different roads with different placements of the traffic cones. The detailed settings are shown in Table 1.

Table 1. Configuration of the traffic cones (positions)

2.5 Objective Measurement

We collected several measures, including driving performance, eye movements, and game performance. The detailed descriptions are as follows:

  1) Percent dwell time on different AOIs (areas of interest). We have three AOIs: the road ahead, the video-call interface, and the speed dashboard. A higher percentage of dwell time indicates that more visual resources are devoted to that area, directly showing the participant’s allocation of visual attention.

  2) Standard deviation of lane position (SDLP), as defined in SAE J2944. Increased SDLP can indicate worse driving performance.

  3) Number of missed words in the game, used as an indicator of game performance.

2.6 Subjective Measurement

  1) A usability questionnaire.

  2) NASA TLX.

3 Results

3.1 Design of the In-Vehicle Communication System

The motivation for designing the in-vehicle communication system arose from personal experience. Normally, when we talk to an acquaintance, we tend to look at him or her; eye contact helps us understand the person’s intentions and judge their emotional state. When we talk with a driver, however, we can no longer look at them directly; we can only listen, which makes the communication feel rather strange. We therefore wanted to design a system that improves the quality of communication even while driving.

Traditionally, we hold the view that while driving you must concentrate all your effort on the driving task. However, that is merely a regulation or an ideal. In reality, drivers do make phone calls or check notifications on their phones. If we simply assume that all drivers will obey the rules and regulations, we are avoiding the problem. A question arises naturally: can we help drivers who want to consult other information that may shift their visual attention? Or can we weaken the influence of such visual attention shifts? Among existing technologies, augmented reality is the most promising option.

Following this path, we propose a solution to enrich in-vehicle communication. The first thing to clarify is that the comparison is between an AR-based display and a traditional dashboard-based display. We should not assume that all drivers will obey traffic rules and refrain from using cell phones while driving. Our aim is to reduce the effect of multimedia information on driving performance.

The design of the communication system is simple: two cameras capture the driver’s and the passenger’s images. The real-time image of the passenger is shown in a window fixed at the top-right corner of the driving scene, so the driver can switch quickly between the road and the video-call interface. All glance and fixation behavior is recorded by an SMI head-worn eye tracker.
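The picture-in-picture composition described above amounts to pasting one frame into the corner of another. The sketch below shows this with NumPy arrays of the kind OpenCV would supply; the function name and margin are assumptions, not part of the actual system.

```python
# Minimal sketch: composite the passenger's camera frame into the
# top-right corner of the driving-scene frame (H x W x 3 uint8 arrays).
import numpy as np

def overlay_top_right(scene, pip, margin=10):
    """Return a copy of `scene` with `pip` pasted in its top-right corner."""
    out = scene.copy()
    h, w = pip.shape[:2]
    out[margin:margin + h, -margin - w:-margin] = pip
    return out

scene = np.zeros((480, 640, 3), dtype=np.uint8)        # stand-in driving scene
pip = np.full((120, 160, 3), 255, dtype=np.uint8)      # stand-in passenger frame
framed = overlay_top_right(scene, pip)
```

In the real system each frame pair would come from the two cameras per refresh, with the same composition applied before display.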

The primary task is a simple car-following task and the secondary task is a word-guessing game. The word-guessing task simulates a communication environment for the driver.

3.2 Test Results

We invited six users to test our prototype. The whole procedure lasted about 45 min. Each participant tried both conditions and filled in a questionnaire reporting their feelings about the system. The test results are analyzed from two aspects: objective measures and subjective measures.

Objective Measurement

Eye Tracker Data

The eye-tracker data were analyzed with the BeGaze software. The segmentation of the AOIs (areas of interest) is shown in Fig. 1.

Fig. 1. The segmentation of the AOIs (color figure online)

Figure 1 was captured by the head-worn eye tracker. The red rectangle marks the video-call interface, the green rectangle the road, and the yellow rectangle the speed dashboard.

First, we collected several eye-movement parameters for the three areas.

We also visualized the time allocation across the three areas; the dwell time on each is shown in Fig. 2. The overall fixation and glance data support the hypothesis that drivers use the in-vehicle communication system quite frequently: the total dwell time on the road is about 35% and on the video interface about 29%.

Fig. 2. The dwell time on the three AOIs

The numbers of glances and fixations on the road are far higher than on the video-call interface, indicating that drivers still direct most of their attention to the driving task. To examine glance and fixation behavior in more detail, we also calculated the average fixation duration and the average pupil size.

Driving Performance

The drivers’ performance on the secondary task reflects the quality and efficiency of the communication. We use the number of missed words as the indicator.

We first conducted the Shapiro-Wilk normality test for both conditions; the data passed the test.

We then conducted a paired t-test on the performance under the two conditions. The difference was marginally significant: performance under the video-call condition (M = 2.83, SD = 3.37) was better than under the voice-only condition (M = 5.83, SD = 18.97), t(5) = −2.1958, p = 0.079. Given that we had only six participants, recruiting more participants would make the result more persuasive.
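The normality check and paired t-test above can be sketched as follows with scipy. The missed-word counts here are placeholders, not the actual per-participant data.

```python
# Sketch of the analysis pipeline with placeholder data.
from scipy import stats

video_call = [1, 2, 0, 3, 8, 3]    # hypothetical missed-word counts
voice_only = [4, 6, 2, 5, 12, 6]

# Shapiro-Wilk normality test for each condition
print(stats.shapiro(video_call))
print(stats.shapiro(voice_only))

# Paired (repeated-measures) t-test across the two within-subject conditions
t, p = stats.ttest_rel(video_call, voice_only)
print(t, p)
```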

By examining the items that participants failed to guess within 60 s, we found that the video call helps the driver better understand the passenger through facial expressions and gestures. This finding supports our hypothesis that introducing the video call improves the quality and performance of the conversation.

The driver-assistance system used in our research is essentially a video-image feedback channel, and our original intention was to provide a good communication experience between driver and passenger. Of course, the premise is that the system must not affect the driver’s normal, safe driving behavior; this is what we mainly examined in the experiment.

Our experimental data were analyzed as a single-factor design. We applied Welch’s test and the Brown-Forsythe test to the speed and lane-deviation (off-center) data; the significance values are much larger than 0.05, so we conclude that there is no significant difference in driver behavior with or without the assistance system. These two measures mainly reflect the stability of vehicle speed and driving path.
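One way to run such unequal-variance checks is sketched below with scipy, using Welch’s t-test and Levene’s test with median centring (the Brown-Forsythe variant). The per-participant mean speeds are placeholder values, not the experimental data.

```python
# Sketch with placeholder data: Welch's t-test and the Brown-Forsythe test
# on per-participant mean speed under the two conditions.
from scipy import stats

speed_with_system = [74.8, 75.3, 74.9, 75.1, 75.0, 74.7]  # hypothetical km/h
speed_without     = [75.0, 74.9, 75.2, 74.8, 75.1, 74.9]

# Welch's t-test: does not assume equal variances between conditions
t, p_welch = stats.ttest_ind(speed_with_system, speed_without, equal_var=False)

# Brown-Forsythe test: Levene's test centred on the median
bf_stat, p_bf = stats.levene(speed_with_system, speed_without, center='median')

print(p_welch, p_bf)  # for these illustrative values, p_welch is well above 0.05
```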

These analysis results are important for our next steps. Through the experimental design and data analysis we have confirmed that the video-assisted communication system in our study does not affect the driver’s driving activities, which further suggests that the research has value.

Subjective Measurement

User Experience Evaluation

We also asked about learning and using the system, and whether it helps driving behavior. This information was collected mainly through questionnaires, divided into four categories: usefulness, ease of use, ease of learning, and satisfaction.

In the data-processing stage, we converted the answers into scores from 1 to 5 according to the answer options, with full agreement scored as 5, and so on.
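This scoring step can be sketched as a simple mapping. The option labels below are assumptions, since the questionnaire wording is not reproduced here.

```python
# Sketch of the Likert scoring step: map five answer options onto 1-5 points,
# with 5 meaning full agreement. Option labels are assumed, not the actual items.
SCORES = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}

answers = ["agree", "strongly agree", "neutral", "agree"]  # one participant
scores = [SCORES[a] for a in answers]
print(sum(scores) / len(scores))  # mean item score for this participant
```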

The visualization of the results gives us an understanding of the user experience. The figure shows the specific scores of the six participants in each category. In terms of usefulness the split is 3:3; that is, half of the participants found the system useful while the other half felt it did not help. In terms of ease of learning and ease of use, all participants’ scores are above average, indicating that everyone found the system easy to learn and use; the participants did not need to spend much time studying the system, and simple instruction was enough. In terms of satisfaction, only one participant scored below average; most participants were satisfied with the system, mainly with its stability and the comfort it provides while driving.

NASA TLX Subjective Evaluation

The mental workload that users experience when using a product directly affects their subjective satisfaction. To increase the reliability of the questionnaire results, each participant completed the NASA TLX scale after the experiment, rating the experience on its six dimensions.

The mental workload scores of the six participants were 70.67, 82.67, 84.67, 80.00, 52.33, and 64.33. Overall the workload scores are relatively high, which is unsurprising given the dual-task setting; nevertheless, the assistance system itself did not appear to add substantially to the participants’ load, a result consistent with the driving-behavior analysis reported above.
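The overall scores above can be summarized as sketched below. We assume here a raw (unweighted) TLX, i.e. the mean of the six subscale ratings; the weighted pairwise-comparison variant would differ.

```python
# Sketch: raw (unweighted) NASA-TLX score as the mean of six subscale ratings,
# and the group mean of the six overall scores reported above.

def raw_tlx(ratings):
    """ratings: six subscale values (mental, physical, temporal demand,
    performance, effort, frustration), each on a 0-100 scale."""
    assert len(ratings) == 6
    return sum(ratings) / 6

# Hypothetical subscale ratings for one participant
print(raw_tlx([60, 70, 80, 50, 90, 40]))

# Group mean over the six participants' overall scores from the experiment
overall = [70.67, 82.67, 84.67, 80.00, 52.33, 64.33]
print(sum(overall) / len(overall))
```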

4 Discussion

In this research we carried out the following work and obtained the corresponding results:

  1) Through a literature review, this study surveyed the application scenarios of in-car driver-assistance systems and finally settled on the communication problem between driver and passenger, addressed through a video-visualization function. While driving, the driver’s main job is to drive the vehicle safely. According to our survey of driving habits, most people unconsciously shift their attention toward a conversation partner: instinctively, especially during an interesting conversation, one wants to observe the other person’s facial expressions, and such frequent movements may threaten safe driving. On the one hand, this study realizes driver-passenger interaction through a video window projected into the driving view during normal driving. On the other hand, the analysis of the experimental data leads to an important conclusion: with the system, watching the video window does not impair normal driving. On the contrary, it can improve the driver’s comfort and safety, and the approach is particularly relevant for deaf users and others who rely on gesture communication, for whom it makes information exchange more convenient.

  2) This study verified the system’s effectiveness and reliability through experiments. The experiment evaluated how well the driver received communication information with and without the assistance system. The main results are as follows:

    a) With the assistance system, participants completed more word-guessing rounds than without it (equivalently, completing the same number of rounds took less time). This shows that expressions and gestures play a key role in communication; relying on speech alone can sometimes hinder understanding.

    b) In both conditions there was no significant difference in how participants avoided obstacles or maintained vehicle speed. In other words, after adding the video-assistance system, watching the video did not affect normal driving, because drivers do not stare at the video feedback continuously. Its presence is more like the car’s dashboard or the rear-view mirror: you do not attend to it constantly, but glance at the information it feeds back at a certain frequency. This matches our design goal, and the comparison verifies the system’s safety and reliability.

  3) The system can be installed in all existing car models. The video-assistance setup used in the experiment places no requirements on the type of car, is relatively simple to build, and is relatively cheap. A survey of most models on the market shows that no vehicle currently offers a similar aid for in-car communication, so this research provides a new idea for car-assistance systems, and further functions could be developed on this basis. For example, for public transport such as taxis, hand-movement capture and facial-feature capture could be added, because the driver’s own safety is very important during public-transport driving. Without such a system, the driver can only observe the rear through the cab’s rear-view mirror, which has many blind spots: the driver may see only the face of the rear passenger and not their hand movements. A system that analyzes facial features and hand movements could perform a timely risk assessment; if dangerous behavior such as robbery or an attack on the driver occurred, a voice prompt could inform the driver of the potential risk so that the driver could respond appropriately. This is just one function that could be developed; many related functions could be built on our system, and this study thus provides a valuable reference for related research on safe driving.