Abstract
Today's traffic environment, such as traffic and information signs, road markings, and vehicles, is
designed for human visual perception (even though first approaches for automatic evaluation by electronic
sensor systems in the vehicle exist – see chapter "▶ Intersection Assistance"). This design relies on distinct
shapes, colors, or a temporal change of the signals.
It is therefore a good choice to use a system similar to the human eye for machine perception of the
environment. Camera systems are ideal candidates as they offer a comparable spectral, spatial, and
temporal resolution. In addition to the “replica” of human vision, specific camera systems can provide
other functions, including imaging in infrared spectral regions for night vision or a direct distance
measurement.
This chapter covers details on specific applications of camera-based driver assistance systems and the
resulting technical needs for the camera system. Use cases covering the outside and inside of the vehicle
are shown. The basis of every camera system is the camera module with its main parts – the lens system
and the image sensor. The underlying technology is described, and the formation of the camera image is
discussed. Moving to the system level, basic camera architectures including mono and stereo systems are
analyzed. The chapter is completed with a discussion of the calibration of camera systems.
1 Applications
Due to their versatility, camera systems in automobiles are used both for monitoring the interior of the
car and for observing its surroundings (Loce et al. 2013). The following section discusses these applications and
explains the specifics of using camera systems for them.
One of the first driver assistance systems using a camera was the so-called rear view camera. The driver is
assisted by the display of a live video stream on a monitor system. More advanced functions employ computer
vision, for example the high-beam assist function. In these systems, the video image is not
displayed; instead, a specific function is derived directly from the camera image.
In addition, cameras are used in the interior of the vehicle. Here, two main functions are of importance:
first, driver monitoring for detecting the state of the driver and the driver's intention and, second, the
use of camera systems as part of an advanced man–machine interface for controlling functions, for
example, by means of gesture and gaze control.
Fig. 1 Field of view of an interior camera for (a) driver monitoring and (b) hand gesture recognition
For interior cameras, active illumination in the near-infrared range is typically
selected, because it is invisible to the driver and can ensure a high image quality at nighttime and in
quickly changing light conditions.
Figure 2 shows such a situation. It depicts the fields of view of various sensors on the car. The sensors
are designed such that the fields of view overlap and different detection ranges can be achieved. In this
example, the detection by camera systems as well as by short- and long-range radar sensors is shown.
Today, different types of sensors can be integrated in a single housing. An example is the SRLCam
(Fig. 3) of Continental, where a short-range Lidar (SRL) is combined with a multifunction camera. This
enables a compact and inexpensive sensor. The performance of the emergency braking function is
increased since the systems work with sensor fusion (see also chapter "▶ Data Fusion of
Environment Detection Sensors for ADAS"). Another example of such a sensor combination is the radar and camera
system (RACAM) from Delphi (Delphi 2014).
Cameras in the Visible Spectral Range The majority of the systems currently in use operate in the
visible spectral region and are thus similar to the human eye. As previously explained, this enables the
camera systems to detect all the traffic signs, etc., that are also relevant for humans.
The high-beam assist function is used to automatically turn on and off the high beams of the car. More
advanced systems also have a variable range control and independently dim certain areas. Such systems
are enabled by segmented LED (light-emitting diode) headlamps. The various light functions are
controlled by the camera, which analyzes the oncoming traffic. Important for this camera function is
the ability to distinguish at least the color of the tail and front lights of a vehicle. Therefore, color-sensitive
camera systems are used (details in Sect. 3.3.4). Another requirement for this function is the need for a
high dynamic range of the camera system, arising from the large differences of light intensities that occur
at night (see also Sect. 2.1.4).
Traffic sign recognition detects all relevant traffic signs (e.g., speed limits, one-way street labels) and
the information is made available to the driver. The traffic sign recognition feature requires a high-
performance camera system (Stein et al. 2008). The traffic signs must be recorded with a high resolution
(>15 pixels per degree) so that the character recognition functions correctly. Since most traffic signs are
placed at the edge of the road and the automobile is moving at high speed, a short (<30 ms) exposure time
is necessary to avoid strong motion blur.
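As a plausibility check of these two figures, the following sketch (Python, with assumed geometry values that are not from the text) estimates the motion blur of a roadside sign in pixels:

```python
import math

def blur_pixels(speed_mps, long_dist_m, lat_offset_m, exposure_s, px_per_deg):
    """Approximate motion blur (in pixels) of a static roadside object.

    For an object at longitudinal distance d and lateral offset r, the
    bearing angle is atan(r/d); its rate of change while approaching at
    speed v is v*r / (d^2 + r^2) (rad/s).
    """
    rate_rad = speed_mps * lat_offset_m / (long_dist_m**2 + lat_offset_m**2)
    return math.degrees(rate_rad) * exposure_s * px_per_deg

# Assumed example: 100 km/h, sign 30 m ahead and 5 m to the side,
# combined with the 30 ms exposure and 15 px/deg quoted above.
print(f"blur ≈ {blur_pixels(100 / 3.6, 30.0, 5.0, 0.030, 15.0):.1f} px")
```

With these assumptions, a 30 ms exposure already smears the sign over roughly four pixels, which illustrates why shorter exposure times are preferred as the sign is passed.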
For safety reasons, many vehicles are equipped with a lane detection feature (see also chapter “▶ Lane
Change Assistance”). Important for this function is a very high recognition rate even at nighttime and with
poor road conditions. It is advantageous if the camera system can distinguish colors, since this enables
the detection of differently colored markings on the street, for example, at construction sites (see also Sect. 3.3.4).
In order to respond to other road users (e.g., vehicles, pedestrians, cyclists), a robust object recognition
is necessary. Here, a variety of aspects are relevant for the camera system (Raphael et al. 2011). For a large
detection range, e.g., for vehicle detection on highways, a high resolution is necessary. Pedestrian
detection benefits from a large field of view. In general, a high sensitivity of the camera system is very
important.
In particular, for object recognition, stereo camera systems have many advantages. By generating a
depth map, various objects can be detected, and the distance to the vehicle is measured directly. In
addition, the so-called free space detection is possible, which identifies drivable areas on the street.
Another function which can be implemented using a stereo camera system is road condition
recognition. In this way, the vehicle can adjust to bad road conditions, etc. (Daimler 2014).
Cameras in the Infrared Spectral Range A disadvantage of cameras operating in the visible spectral
range is the insufficient sensitivity at very low light conditions. A possible alternative or add-on is the
usage of cameras operating in the infrared spectral range (see also chapter “▶ Visibility Improvement
Systems for Passenger Cars”). Here, mainly two approaches are used. In the first approach, the traffic
scene is illuminated using special infrared headlights (LED or halogen). The camera system is equipped
with spectral filters, so that the camera sensor is only sensitive to the IR wavelength. Another possibility is
the use of special cameras that are sensitive in the far infrared (FIR). These cameras can directly detect
heat radiation of pedestrians and animals and can thus trigger an assistance function. A major drawback is
the high system cost, since these camera systems are not based on conventional image sensors
(Källhammer 2006).
Rearview Cameras The camera module of rear view cameras is usually integrated into the tailgate of the
car (e.g., close to the license plate). The video stream is then displayed on a monitor on the instrument
panel. First, accidents caused by running over people behind the car can be avoided. Second,
advanced systems support the driver during parking by displaying a graphical overlay (see chapter
"▶ Parking Assistance").
Surround View Cameras Surround view systems are equipped with four or more cameras around the
vehicle. The video information of the cameras is transmitted to a central processing unit. Camera
modules for such systems are usually equipped with so-called fish-eye lenses, which allow a
horizontal field of view of more than 180°. From the camera images, a 360° view of the environment is
generated and provided to the driver as a parking aid on a monitor. In the future, the camera images will be
used not only for parking support but also for object detection and general environment
detection in addition to the front view camera.
Mirror Replacement An approach that is likely to play a significant role in the future is the replacement of
conventional exterior mirrors with camera systems. This is beneficial for fuel consumption (less drag) and opens
up completely new design possibilities. Similar to surround view systems, a high dynamic range and a
good color reproduction of the camera image are necessary for a good quality of the displayed video. Such
camera systems are addressed in the international standard ISO/DIS 16505 (ISO/DIS 16505).
(Figure: processing chain of a camera system – image sensor, image processing, display)
The optical image is converted by the image sensor into digital values. Therefore, the design and
adaptation of the optical system to the image sensor are crucial to image quality. The sensor mainly
influences the resolution (number of pixels), field of view (number and arrangement of pixels), dynamic
range, color reproduction, and especially the sensitivity (see Sect. 3.3).
In the next step of the processing chain, the image quality is affected by the image processing steps in
the processor. In addition, the performance of the computer vision algorithms crucially depends on the
performance of the processing unit.
For a camera mounted at height h, the vertical angle α at which a point on the road at distance d appears follows as

$$\alpha = \tan^{-1}(h/d)$$
In surround view systems, almost all camera modules use a horizontal field of view of more than
180°. This is due to the desired 360° representation of the vehicle environment. In order to compute a
single picture from the individual frames, an overlap between the FOVs of the different cameras is required.
In the area of driver monitoring, the imaging of the driver's head is important. Different anatomical
requirements and installation situations result in a field of view of about 40–50°. For gesture recognition,
for example, using a camera module in the functional unit in the roof, larger FOVs (>50°) are usually
selected. This provides the driver with more freedom in the gesture operation.
Fig. 5 Vertical (a) and horizontal (b) field of view of a front view camera
Fig. 6 Effect of decreasing resolution using the example of a traffic sign (480 × 650 / 72 × 96 / 36 × 48 / 24 × 32 / 18 × 24 / 12 × 16 pixels)
Typical high-contrast situations include garages or oncoming vehicles at night. A pavement marking at night can exhibit a luminance L of less than 10
cd/m², while the headlights of a vehicle in the same scene can have a luminance of up to 100,000 cd/m²
(Hertel 2010). For this reason, a very high dynamic range of the camera system is necessary. With an
appropriate design (see also Sect. 3.3), one can achieve more than 120 dB dynamic range within imaging
systems. The dynamic range is defined as follows:
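The formula itself falls on a page break; the usual definition in imaging, consistent with the decibel figures quoted in this chapter, relates the largest and smallest resolvable luminance (or signal):

$$\mathrm{DR} = 20 \cdot \log_{10}\!\left(\frac{L_{\max}}{L_{\min}}\right)\ \mathrm{dB}$$

For the luminances of the example above (10 cd/m² and 100,000 cd/m²), the scene alone spans 20 · log₁₀(10⁴) = 80 dB.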
3 Camera Module
Camera modules can be very different in design. A camera module is defined here as the combination of
lens, image sensor, electronics, and packaging technology. It is of course possible to accommodate more
components within the camera module assembly, such as image processors.
3.1.1 Electronics
Image sensors have a wide variety of analog and digital inputs and outputs. The most important are the
power supply for the analog and digital parts of the image sensor, external timing signal (clock), digital
control inputs and outputs, and interfaces for the configuration and image data transmission to the
computing unit.
The transfer of settings (e.g., integration time) is realized via the configuration bus system with a low
bandwidth. The transmission of the image data can be achieved via a parallel or serial interface. As data
rates increase significantly with higher pixel counts and refresh rates, there is a trend toward high-speed
serial interfaces such as the Camera Serial Interface (CSI) (MIPI 2014).
3.2 Optics
The camera lens typically comprises a stack of single lens elements, optical filter elements, and the overall
lens housing. The applied lens designs in the automotive industry are a trade-off between optical
performance, costs, and durability. This trade-off strongly influences the choice of materials of the single
lens elements, as well as their number, and the choice of lens housing materials. An example of such durability
and robustness requirements is the stability of the focal point over temperatures of up to
100 °C.
Lens Design for Front View Cameras Low light conditions and the demand for sufficient frame rates
require a short exposure time in a front view camera, which is compensated for in terms of light sensitivity
by a low f-number, i.e., a large aperture and/or a small focal length. F-numbers < 2 are in use. Large
apertures introduce image aberrations to sharpness and color, which need to be removed by additional
single lens elements, resulting in multiple single lens elements in an automotive camera lens.
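For reference, the underlying standard relation (general optics, not specific to this chapter): the f-number N is the ratio of the focal length f to the aperture diameter D, and the image-plane illuminance E scales roughly with the inverse square of the f-number:

$$N = \frac{f}{D}, \qquad E \propto \frac{1}{N^2}$$

Reducing the f-number from 2.8 to 2 therefore roughly doubles the collected light.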
The ability of a camera lens to produce a sharp image is expressed by the modulation transfer function
(MTF). The MTF describes how well real-world contrasts at different spatial frequencies are reproduced
at the image plane.
Camera lenses become more expensive with better performance, because more single lens elements are
required. Therefore, an overshoot in performance has to be avoided for cost reasons, and it
is thus important to know the minimum image quality required for the driver assistance function to run properly.
Image quality is degraded by the usual optical aberrations, which are accumulated and expressed in the
so-called point–spread function (PSF). It describes how a real-world point source (or point in the scene) is
reproduced on the image plane, where it is typically a spatial intensity distribution and not a point
anymore (Sinha 2012). A practical example is a headlight at night in the far distance. Another image-
forming influence is the sampling nature of the image sensor and the discrete spatial distribution of the
pixels. A general design rule for lenses is to keep the PSF close to the size of a pixel, because PSFs smaller
than the pixel size (provided by better and more expensive lenses) cannot be resolved at all.
Image sharpness varies along the optical axis, i.e., the position of the image sensor is important. This
so-called depth of focus provides a narrow region of sufficient sharpness and is smaller at lower
f-numbers. By means of focusing close to the hyperfocal distance, it is nevertheless possible to image
large portions of the real-world scene with high sharpness and leave margins for temperature-induced
focal point shifts.
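The hyperfocal distance itself is not given in the text; the common photographic approximation, with f-number N and acceptable circle of confusion c (here on the order of the pixel pitch), is

$$d_H \approx \frac{f^2}{N \cdot c}$$

Focusing at $d_H$ renders everything from about $d_H/2$ to infinity acceptably sharp. With the illustrative values f = 5 mm, N = 2, and c = 3.75 µm (the pixel pitch used in the stereo example in Sect. 4.3.3), $d_H \approx 3.3$ m.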
Image aberrations typically found in automotive camera lens designs are spherical aberration, chro-
matic aberration, coma, astigmatism, and field curvature and act on the intensity distribution of the PSF
(Hecht 1998). Chromatic aberration, for example, results in wrong reproduction of colors. All image
aberrations can be reduced to negligible amounts by using more single lens elements, aspherical lens
elements, or expensive lens materials (Sinha 2012). Other image artifacts, for example, lens flare and
ghost images, can appear due to multiple internal reflections and scattering on optical and mechanical
surfaces (Reinhard et al. 2008).
Image distortions of a few percent are tolerable in lens design, because they can be corrected by
software algorithms. Therefore, designing an expensive and distortion-free camera lens is not required.
Especially for finding correspondences in stereo camera images, the distortion correction has to work accurately,
and the lens distortion is therefore measured during fabrication (see Sect. 5).
Lenses for Surround View Cameras Surround view cameras typically require a large field of view. As a
result, for moderately priced lenses a large distortion is observed toward the edges of the image. The high
distortion needs to be corrected very accurately, because images from more than one camera are stitched
together to generate the surround view, and finding correspondences in single images is the basis of image
stitching. Also, the strong brightness reduction toward the edges (optical vignetting) needs to be considered and
corrected. Additionally, the outermost lens element is usually exposed to the environment and has to be designed in a
robust way (e.g., glass instead of plastic).
Lenses for Interior Cameras Lenses for interior cameras need to monitor the driver at a range of
40–100 cm. Typical f-numbers for such lenses are f > 2, because a relatively large depth of focus is
required to match the abovementioned range.
3.3.2 Resolution
The quality of digital video is influenced not only by the spatial resolution but also by the contrast and
temporal resolution. These quantities describe the number of pixels on which an object is imaged, the
number of gray levels into which a scene can be resolved, and the time interval between two images (Fiete 2010).
Spatial Resolution To reconstruct a structure by a digital image sensor, it must be mapped to multiple
pixels. The required number of pixels per angle is therefore determined by the smallest structure one
wants to resolve. The total number of pixels is determined by the FOV and the desired resolution in pixels
per angle (see Sect. 2.1.2).
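A minimal sketch of this sizing rule in Python; the field-of-view value is an assumption for illustration, while the 15 px/deg stems from the traffic sign recognition requirement in Sect. 1:

```python
import math

def required_pixels(fov_deg: float, px_per_deg: float) -> int:
    """Pixels needed along one image axis for a given field of view
    and angular resolution (pixels per degree)."""
    return math.ceil(fov_deg * px_per_deg)

# Assumed example: a front view camera with a 50 deg horizontal FOV.
print(required_pixels(50, 15))  # -> 750 horizontal pixels
```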
To realize a high number of pixels on an image sensor, there are two options. The first is to keep the
pixel pitch (the distance between the centers of neighboring pixels) and increase the die size. The second
approach is to lower the pixel pitch (and thus the pixel size) while keeping the die size. For cost reasons,
the latter approach is chosen in most cases. From the perspective of the signal-to-noise ratio, keeping the
pixel pitch is preferable, because with smaller pixels the fill factor decreases further. Furthermore,
although a smaller pixel volume generates less temporal noise, the noise does not decrease proportionally
(Fiete 2010; Theuwissen 2008). Care must be taken to ensure that the disadvantages of decreasing the
pixel size can be compensated by, e.g., new pixel designs and improved manufacturing processes.
Contrast Resolution For the detection of objects, a fine differentiation of object brightness is advan-
tageous, in order to also detect, for example, a darkly clothed person at night. This is achieved by an A/D conversion
of the signal with a bit depth of 8–10 bit in simple systems and of 12 bit and more in more sophisticated
systems. HDR (high dynamic range) sensors internally often work with much higher bit depths, which are
then compressed to a bit depth of typically 10–14 bit for easier data transfer.
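For orientation, a standard relation (not from the text): an n-bit linear quantization spans a signal ratio of $2^n$, i.e.,

$$20 \cdot \log_{10}(2^{n}) \approx 6.02 \cdot n\ \mathrm{dB},$$

so 12 bit corresponds to roughly 72 dB. A 120 dB scene therefore cannot be represented linearly at these bit depths, which is why the HDR pixel architectures described below combine a higher internal dynamic range with companding to 10–14 bit.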
Temporal Resolution The frame rate determines the time interval between two recordings. A low frame rate
involves the risk that the system's reaction to an event occurs too late or that the event is missed entirely, and it
furthermore complicates the tracking of objects. A high frame rate increases the demands on the interface
and the further image processing. Typical values are about 30 frames per second.
Lateral Overflow This concept is based on analog resets which do not fully reset the pixel but allow a
certain level of charges to remain (partial saturation). Several such partial resets are performed during the
integration time, each allowing an increasing amount of charges to remain while the time intervals
between the partial resets get shorter (Darmont 2012). Capturing moving objects may result in
some motion artifacts if partial saturation levels are reached, because the repeated partial integrations are
superimposed into a single image.
An advantage of the method is that the high dynamic range information is directly available, and hence no
information needs to be cached, which makes the approach well suited for global shutter sensors (see
Sect. 3.3.5).
Multi-exposure In this concept, several individual captures with different sensitivities (varied, e.g., via
integration time or gain) are taken in succession, and this information is combined into a high dynamic range image.
Moving objects are captured at several points in time and thus at different positions within the frame,
causing noticeable motion artifacts. Therefore, a small time gap between two single integrations is very
important, as is an appropriate combination scheme for the transformation into an HDR image (Solhusvik
et al. 2009).
The benefits of the multi-exposure method are its SNR performance, as CDS and other corrective
measures can be performed individually for each of the single integrations.
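As an illustration of such a combination scheme, here is a minimal sketch assuming a linear sensor response and an idealized, noise-free short exposure; real implementations use more elaborate blending (Solhusvik et al. 2009), and all names and values are illustrative:

```python
import numpy as np

def combine_two_exposures(long_img, short_img, ratio, sat_level=4095.0):
    """Merge a long and a short exposure (linear raw values) into one HDR
    image: where the long exposure saturates, use the short exposure
    scaled by the exposure ratio instead."""
    long_f = long_img.astype(np.float64)
    short_f = short_img.astype(np.float64) * ratio  # bring to common scale
    return np.where(long_f >= sat_level, short_f, long_f)

# Illustrative use with 12-bit raw data and a 16x exposure ratio.
rng = np.random.default_rng(0)
long_img = rng.integers(0, 4096, size=(4, 4))
short_img = long_img // 16  # idealized, noise-free short exposure
hdr = combine_two_exposures(long_img, short_img, ratio=16.0)
```

A smooth blend around the saturation level, rather than the hard switch shown here, would reduce visible seams in the transition region.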
Split Pixels In the split pixel concept, every pixel is divided into two or more sub-pixels. Different
sensitivities of the sub-pixels are achieved by differently sized photosensitive areas (typical ratios of
about 1:4 to 1:8), different gain stages, or the choice of different integration times.
The great advantage of this approach is the time-parallel capture of the scene in different dynamic
ranges, resulting in lower motion artifacts than with the lateral overflow or multi-exposure approaches.
The disadvantage is that the dynamic range is generated from only two single integrations, which results
either in a lower overall dynamic range or in a strong compression of the dynamic range (Solhusvik et al. 2009;
Solhusvik et al. 2013). Furthermore, the fill factor is somewhat poorer, since two photodiodes and their
circuits must be accommodated within the pixel pitch.
Fig. 11 Difference between electronic rolling shutter (ERS) and global shutter (GS), using the example of a cube moving to
the right. ERS records moving objects at different rows at different points in time; GS records all rows at the same time
With a global shutter, a moving object therefore retains its shape in the captured image (see Fig. 11, lower illustration). Also, randomly pulsed light sources such as
LED vehicle lighting or active traffic signs produce a homogeneous result, as long as the pulses occur
during the integration time.
A disadvantage of the global shutter is the need to store the image information until it is read out. This
results in the need for additional transistors per pixel and analog memory cell arrays (sample-and-hold
circuits). This leads to parasitic effects, and additional chip area is required. The resulting poorer fill
factor results in a higher noise level compared to the electronic rolling shutter, where the information is
output directly (Yadid-Pecht and Etienne-Cummings 2004).
In an electronic rolling shutter (ERS) imager, the integration of each pixel is individually started, offset
by one clock cycle, and also individually stopped after the integration time (Yadid-Pecht and
Etienne-Cummings 2004; Baxter 2013). The shutter "rolls" across the single pixels over the entire active
pixel array. At each clock cycle, the information of the single pixel that has just stopped its integration is
available and is read out directly. This eliminates the need for a sample-and-hold circuit, which is why ERS
has clear advantages in SNR compared to GS. Furthermore, CDS is easier to integrate with ERS
due to the lower number of pixel transistors (Yadid-Pecht and Etienne-Cummings 2004).
A disadvantage of sequentially integrating the pixels is the time offset of the integration between the image
lines. The upper part of an object that moves horizontally through the image is thus captured at an earlier point
in time than the lower part, which leads to the well-known effect of distorted or "leaning" objects.
Recording pulsed light sources (active variable traffic signs) results in short single pulses being
seen only by those few lines that are integrating photons at that time.
Figure 11 shows the position of a moving cube and the lines (light gray) whose integration was started at
different times while the ERS image (top) was captured. When the lines are assembled, the result is
a distorted representation of the cube. With GS (bottom), the integration of all pixels happens at the same
time, resulting in an undistorted representation of the cube.
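The geometric effect can be reproduced in a few lines: the sketch below samples each image row of a synthetic scene at its own integration start time (all parameters are assumed for illustration):

```python
import numpy as np

def rolling_shutter(frame_fn, n_rows, n_cols, line_time_s):
    """Simulate an ERS capture: row r samples the scene at time
    r * line_time_s; frame_fn(t) returns the full scene at time t."""
    img = np.zeros((n_rows, n_cols), dtype=np.uint8)
    for r in range(n_rows):
        img[r, :] = frame_fn(r * line_time_s)[r, :]
    return img

def moving_square(t, speed_px_s=10000.0):
    """Scene: a bright 40 px square moving to the right."""
    scene = np.zeros((120, 120), dtype=np.uint8)
    x = int(10 + speed_px_s * t)
    scene[40:80, x:x + 40] = 255
    return scene

ers_img = rolling_shutter(moving_square, 120, 120, line_time_s=20e-6)
gs_img = moving_square(0.0)  # global shutter: whole frame at one instant
```

With the assumed line time of 20 µs and object speed, the square in ers_img is sheared by about 8 pixels between its top and bottom rows, while gs_img shows it undistorted.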
4 System Architecture
To meet the demands of all required functions, the system architecture requires the correct design of
hardware and software components as well as the image processing algorithms. In addition, the mechan-
ical design of the system and the mechanical and electronic connections with the vehicle are of
importance.
(Figure: building blocks of a camera system – lens, electronics, mechanics, and image control)
Since camera systems for driver assistance functions are safety-related vehicle components (e.g., through
braking interventions), the system must comply with the ISO standard 26262 ("Road vehicles – Functional
safety") (ISO 26262). Depending on the function, different automotive safety integrity levels (ASIL)
have to be implemented.
Depending on the intended use, the raw image data results after the transformation step in different output values. The purpose of the operation is either an
adaptation to a particular display system or improved image processing. In the case of a representation on a display, a
noise reduction, an edge enhancement, and a color correction are typically performed (Reinhard et al. 2008;
Nakamura 2006).
If the image data is used for machine vision tasks, often a distortion correction, a calculation of the
optical flow, and, in the case of a stereo camera, an image rectification and a creation of the disparity map
are performed. From the preprocessed images, the desired information is extracted in the main image
processing step (see also chapters “▶ Fundamentals of Machine Vision” and “▶ Stereovision for
ADAS”).
4.1.4 Communication
The data exchange with other controllers in the vehicle is realized via the communication interface of the
camera system. Common vehicle bus systems are the controller area network (CAN) bus, the FlexRay
bus, and the Ethernet standard. While CAN and FlexRay bus systems are only used to control the camera
system and to transfer the output data in the form of, e.g., object lists, a transmission of raw image data is
possible when using Ethernet because of the higher data rates.
4.1.5 Electronics
The design of the electronics follows the high standards in the automotive industry, among others, with
regard to durability and electromagnetic immunity and compatibility. The large amounts of data that must
be processed in real time using complex algorithms lead to a system design with several processors or
multi-core processors (Stein et al. 2008). The resulting amount of heat to be dissipated is a challenge in the
vehicle installation space and has to be considered already in the electronics and housing design.
4.1.6 Mechanics
The housing of the camera system is the interface between the electronics and the camera module to the
vehicle. It must be thermally stable and easy to assemble. Moreover, the housing usually forms the
shielding of the electronics for better electromagnetic compatibility. In the case of the front view camera,
the housing is located behind the windshield. To avoid reflections between the windshield and the
housing, a so-called stray light cover is often used. In camera systems where the camera modules are in
direct contact with the environment (e.g., surround view), the housing must also be sealed against
humidity.
(Figure: block diagrams of camera system architectures – camera module(s), DSP/FPGA image processing (IPC), microcontroller, heating, control bus, data interface, and vehicle bus)
4.3.3 Design
An important criterion for the use of a stereo camera system is the accuracy of the depth estimation. Only
with a precise depth estimation can functions such as a camera-based emergency brake assist or an automatic
distance control be realized reliably. These functions rely on precise information about the distance from
the vehicle to the object.
Important parameters in a stereo system are the paraxial focal length f of the cameras and the distance
between the cameras (base width b). The so-called disparity d is the offset between the pixel positions at
which the same image point P is projected onto the two image sensors.
With the size of a pixel on the image sensor $s_P$, the distance to the object $z_C$ can be calculated:

$$z_C = \frac{b \cdot f}{d \cdot s_P}$$

The error $\Delta z$ of the distance measurement caused by a disparity error $\Delta d$ follows as

$$\Delta z = \frac{\Delta d \cdot s_P \cdot z_C^2}{b \cdot f}$$
Accuracies $\Delta d$ of less than one pixel can be achieved to minimize the error of the disparity estimation.
An example calculation for a camera system with a focal length of 5 mm, a pixel size of 3.75 µm, a base
width of 200 mm, and a disparity error of 0.25 pixels results in an error of the distance measurement of
2.3 m (4.7 %) at an object distance of 50 m.
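The example can be checked directly against the two formulas above (a small sketch; all quantities in meters):

```python
def stereo_depth_error(f_m, pixel_m, base_m, disp_err_px, z_m):
    """Depth error dz = dd * s_P * z^2 / (b * f) for a stereo camera."""
    return disp_err_px * pixel_m * z_m**2 / (base_m * f_m)

dz = stereo_depth_error(f_m=5e-3, pixel_m=3.75e-6,
                        base_m=0.200, disp_err_px=0.25, z_m=50.0)
print(f"{dz:.2f} m ({100 * dz / 50.0:.1f} %)")  # -> 2.34 m (4.7 %)
```

The quadratic growth of the error with distance $z_C$ is the key limitation of stereo ranging at long distances.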
Improvements are possible by increasing the resolution of the image sensor using a smaller pixel size, a
larger base width, or a larger focal length. Changing these parameters requires, however, a change in the
system design. The field of view of the camera would be limited by a longer focal length. The base width
should be kept as small as possible because of the requirements of the vehicle design. A smaller pixel size
can lead to a lower sensitivity of the system. In addition, the required computing power increases
significantly with higher resolutions.
5 Calibration
To ensure that driver assistance functions like sign recognition, lane keeping assist, or head light control
work in a reliable and correct manner, it is important to interpret images of the used camera system
correctly. To this end, additional information about the images is necessary, which can be determined by
calibration algorithms. Such information, which we denote as calibration parameters, can not only be used
for better interpretation of the images but also for a compensation of deviations from a defined norm. One
example might be the linearization of an image sensor’s response curve.
This section provides a discussion about the calibration parameters that are typically determined for
driver assistance systems. Furthermore, the section describes where a calibration normally takes place and
how it can be conducted.
Typical calibration parameters are the intrinsic parameters of the camera:
– Principal point
– Focal length
– Image distortion
– Pixel scale factors
and the extrinsic parameters:
– Camera position
– Camera orientation
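As an illustration of how the intrinsic parameters can be determined in practice, the following sketch uses OpenCV's standard checkerboard calibration; the image location and board geometry are assumptions:

```python
import glob
import cv2
import numpy as np

# Checkerboard with 9x6 inner corners, square size 25 mm (assumed).
pattern = (9, 6)
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * 0.025

obj_points, img_points = [], []
for path in glob.glob("calib/*.png"):  # assumed image location
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# K contains the focal length, principal point, and pixel scale factors;
# dist holds the lens distortion coefficients.
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
print("reprojection RMS:", rms)
```

The extrinsic parameters (camera position and orientation relative to the calibration target) are returned per view in tvecs and rvecs; determining the camera pose relative to the vehicle requires an additional target with a known vehicle-referenced position.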
6 Outlook
The technology in the field of camera systems is developing rapidly. One driver is the area of consumer
electronics with its demand for ever more powerful and less expensive image sensors, camera lenses, and
computing platforms. These advances will in the future also be used in the vehicle environment. Greater
computing power enables both advances in image processing and the usage of larger resolutions of
the image sensors. Higher refresh rates and improved sensitivities are additional developments in the field
of camera sensors. With these technological advances, camera systems will be used in the future in various
applications in the vehicle.
References
Brainard DH (1994) Bayesian method for reconstructing color images from trichromatic samples. In:
Proceedings of the IS&T 47th annual meeting, Rochester, NY, pp 375–380
Baxter D (2013) A line based HDR sensor simulator for motion artifact prediction. Proc SPIE
8653:86530F
Civera J, Bueno DR, Davison AJ, Montiel JMM (2009) Camera self-calibration for sequential Bayesian
structure from motion. In: Proceedings of the 2009 IEEE international conference on Robotics and
Automation, Kobe, Japan
Daimler (2014) Homepage Daimler. http://www.mercedes-benz.com. Accessed 10 Jan 2014
Darmont A (2012) High dynamic range imaging, sensors and architectures. SPIE Press, Washington
Delphi (2014) Homepage Delphi. http://delphi.com/manufacturers/auto/safety/active/racam/. Accessed
10 Jan 2014
El Gamal A, Eltoukhy H (2005) CMOS image sensors. IEEE Circuits Devices Mag 21(3):6
Fiete RD (2010) Modelling the imaging chain of digital cameras. SPIE Press, Bellingham
Fischer RE (2008) Optical system design. McGraw-Hill, New York
Gentex (2014) Homepage Gentex. http://www.gentex.com/automotive/products/forward-driving-assist.
Accessed 10 Jan 2014
Hecht E (1998) Optics. Addison Wesley Longman, New York
Hertel D (2010) Extended use of incremental signal-to-noise ratio as reliability criterion for multiple-slope
wide-dynamic-range image capture. J Electron Imaging 19(1):011007
Holst GC, Lomheim TS (2011) CMOS/CCD sensors and camera systems. SPIE Press, Washington
ISO 26262: Road vehicles – functional safety
ISO/DIS 16505: Road vehicles – ergonomic and performance aspects of camera-monitor
systems – requirements and test procedures
Källhammer J (2006) Night vision: requirements and possible roadmap for FIR and NIR systems. Proc
SPIE 6198:61980F
Loce RP, Berna I, Wu W, Bala R (2013) Computer vision in roadway transportation systems: a survey.
J Electron Imaging 22(4):041121
Miller JWY, Murphey YL, Khairallah F (2004) Camera performance considerations for automotive
applications. Proc SPIE 5265:163
MIPI (2014) Homepage MIPI-Alliance. http://www.mipi.org/specifications/camera-interface. Accessed
10 Jan 2014
Nakamura J (2006) Image sensors and signal processing for digital still cameras. CRC Press, Boca Raton
Raphael E, Kiefer R, Reisman P, Hayon G (2011) Development of a camera-based forward collision alert
system. SAE Int J Passenger Cars Mech Syst 4(1):467
Reinhard E, Khan EA, Akyüz AO, Johnson GM (2008) Color imaging. A.K. Peters, Wellesley
Sinha PK (2012) Image acquisition and preprocessing for machine vision systems. SPIE Press,
Washington
Solhusvik J, Yaghmai S, Kimmels A, Stephansen C, Storm A, Olsson J, Rosnes A, Martinussen T,
Willassen T, Pahr PO, Eikedal S, Shaw S, Bhamra R, Velichko S, Pates D, Datar S, Smith S, Jiang L,
Wing D, Chilumula A (2009) A 1280 × 960 3.75 µm pixel CMOS imager with triple exposure
HDR. In: Proceedings of 2009 international image sensor workshop, Utah, USA
Solhusvik J, Kuang J, Lin Z, Manabe S, Lyu J, Rhodes H (2013) A comparison of high dynamic range CIS
technologies for automotive applications. In: Proceedings of 2013 international image sensor work-
shop, Utah, USA
Stein GP, Gat I, Hayon G (2008) Challenges and solutions for bundling multiple DAS applications on a
single hardware platform. Israel computer vision day
Theuwissen AJP (2008) Course “digital camera systems” – hand out. CEI.se, Finspong
Tsai RY (1987) A versatile camera calibration technique for high-accuracy 3D machine vision metrology
using off-the-shelf TV cameras and lenses. IEEE J Robot Autom 3(4):323
Yadid-Pecht O, Etienne-Cummings R (2004) CMOS imagers: from phototransduction to image
processing. Kluwer, Dordrecht
Zhang Z (1998) Determining the epipolar geometry and its uncertainty: a review. Int J Comput Vis
27(2):161