Abstract
Today's traffic environment, such as traffic and information signs, road markings, and vehicles, is
designed for human visual perception (even though first approaches for automatic evaluation by electronic
sensor systems in the vehicle exist – see chapter "▶ Intersection Assistance"). This design relies on distinct
shapes, colors, or a temporal change of the signals.
It is therefore a good choice to use a system similar to the human eye for machine perception of the
environment. Camera systems are ideal candidates as they offer a comparable spectral, spatial, and
temporal resolution. In addition to the “replica” of human vision, specific camera systems can provide
other functions, including imaging in infrared spectral regions for night vision or a direct distance
measurement.
This chapter covers details on specific applications of camera-based driver assistance systems and the
resulting technical needs for the camera system. Use cases covering the outside and inside of the vehicle
are shown. The basis of every camera system is the camera module with its main parts – the lens system
and the image sensor. The underlying technology is described, and the formation of the camera image is
discussed. Moving to the system level, basic camera architectures including mono and stereo systems are
analyzed. The chapter is completed with a discussion of the calibration of camera systems.
1 Applications
Due to their versatility, camera systems in automobiles are used both for monitoring the interior of the
car and for observing its surroundings (Loce et al. 2013). The following section discusses these applications and
explains the specifics of using camera systems for them.
One of the first driver assistance systems using a camera was the so-called rear view camera. The driver is
assisted by the display of a live video stream on a monitor system. More advanced functions employ computer
vision, for example the high-beam assist function. In these systems, the video image is not
displayed; instead, a specific function is derived directly from the camera image.
In addition, cameras are used in the interior of the vehicle. Here, two main functions are of importance:
first, driver monitoring for detecting the state of the driver and the driver's intention and, second, the
use of camera systems as part of an advanced man–machine interface for controlling functions, for
example, by means of gesture and gaze control.
Fig. 1 Field of view of an interior camera for (a) driver monitoring and (b) hand gesture recognition
For interior cameras, active illumination in the near-infrared range is typically
selected, because it is invisible to the driver and can ensure a high image quality at nighttime and in
quickly changing light conditions.
Figure 2 shows such a situation. It depicts the fields of view of various sensors on the car. The sensors
are designed such that the fields of view overlap and different detection ranges can be achieved. In this
example, the detection by camera systems as well as by short- and long-range radar sensors is shown.
Today, different types of sensors can be integrated in a single housing. An example is the SRLCam
(Fig. 3) of Continental, where a short-range Lidar (SRL) is combined with a multifunction camera. This
enables a compact and inexpensive sensor. The performance of the emergency braking function is
increased since the systems work with sensor fusion (see also chapter "▶ Data Fusion of
Environment Detection Sensors for ADAS"). Another example of such a sensor combination is the radar and camera
system (RACAM) from Delphi (Delphi 2014).
Cameras in the Visible Spectral Range The majority of the systems currently in use operate in the
visible spectral region and are thus similar to the human eye. As previously explained, this enables the
camera systems to detect all the traffic signs, etc., that are also relevant for humans.
The high-beam assist function is used to automatically turn on and off the high beams of the car. More
advanced systems also have a variable range control and independently dim certain areas. Such systems
are enabled by segmented LED (light-emitting diode) headlamps. The various light functions are
controlled by the camera, which analyzes the oncoming traffic. Important for this camera function is
the ability to distinguish at least the color of the tail and front lights of a vehicle. Therefore, color-sensitive
camera systems are used (details in Sect. 3.3.4). Another requirement for this function is the need for a
high dynamic range of the camera system, arising from the large differences of light intensities that occur
at night (see also Sect. 2.1.4).
Traffic sign recognition detects all relevant traffic signs (e.g., speed limits, one-way street labels) and
the information is made available to the driver. The traffic sign recognition feature requires a high-
performance camera system (Stein et al. 2008). The traffic signs must be recorded with a high resolution
(>15 pixels per degree) so that the character recognition functions correctly. Since most traffic signs are
placed at the edge of the road and the automobile is moving at high speed, a short (<30 ms) exposure time
is necessary to avoid strong motion blur.
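As a plausibility check of these two figures, the following sketch (Python, with assumed geometry values that are not from the text) estimates the motion blur of a roadside sign in pixels:

```python
import math

def blur_pixels(speed_mps, long_dist_m, lat_offset_m, exposure_s, px_per_deg):
    """Approximate motion blur (in pixels) of a static roadside object.

    For an object at longitudinal distance d and lateral offset r, the
    bearing angle is atan(r/d); its rate of change while approaching at
    speed v is v*r / (d^2 + r^2) (rad/s).
    """
    rate_rad = speed_mps * lat_offset_m / (long_dist_m**2 + lat_offset_m**2)
    return math.degrees(rate_rad) * exposure_s * px_per_deg

# Assumed example: 100 km/h, sign 30 m ahead and 5 m to the side,
# combined with the 30 ms exposure and 15 px/deg quoted above.
print(f"blur ≈ {blur_pixels(100 / 3.6, 30.0, 5.0, 0.030, 15.0):.1f} px")
```

With these assumptions, a 30 ms exposure already smears the sign over roughly four pixels, which illustrates why shorter exposure times are preferred as the sign is passed.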
For safety reasons, many vehicles are equipped with a lane detection feature (see also chapter “▶ Lane
Change Assistance”). Important for this function is a very high recognition rate even at nighttime and with
poor road conditions. It is advantageous if the camera system can distinguish colors, since this enables
the detection of differently colored markings on the street, for example, at construction sites (see also Sect. 3.3.4).
In order to respond to other road users (e.g., vehicles, pedestrians, cyclists), a robust object recognition
is necessary. Here, a variety of aspects are relevant for the camera system (Raphael et al. 2011). For a large
detection range, e.g., for vehicle detection on highways, a high resolution is necessary. Pedestrian
detection benefits from a large field of view. In general, a high sensitivity of the camera system is very
important.
In particular, for object recognition, stereo camera systems have many advantages. By generating a
depth map, various objects can be detected, and the distance to the vehicle is measured directly. In
addition, the so-called free space detection is possible, which identifies drivable areas on the street.
Another function which can be implemented using a stereo camera system is road condition
recognition. In this way, the vehicle can adjust to bad road conditions, etc. (Daimler 2014).
Cameras in the Infrared Spectral Range A disadvantage of cameras operating in the visible spectral
range is the insufficient sensitivity at very low light conditions. A possible alternative or add-on is the
usage of cameras operating in the infrared spectral range (see also chapter “▶ Visibility Improvement
Systems for Passenger Cars”). Here, mainly two approaches are used. In the first approach, the traffic
scene is illuminated using special infrared headlights (LED or halogen). The camera system is equipped
with spectral filters, so that the camera sensor is only sensitive to the IR wavelength. Another possibility is
the use of special cameras that are sensitive in the far infrared (FIR). These cameras can directly detect
heat radiation of pedestrians and animals and can thus trigger an assistance function. A major drawback is
the high system cost, since these camera systems are not based on conventional image sensors
(Källhammer 2006).
Rearview Cameras The camera module of rear view cameras is usually integrated into the tailgate of the
car (e.g., close to the license plate). The video stream is then displayed on a monitor on the instrument
panel. First, accidents caused by running over people behind the car can be avoided. Second,
advanced systems support the driver during parking by displaying a graphical overlay (see chapter
"▶ Parking Assistance").
Surround View Cameras Surround view systems are equipped with four or more cameras around the
vehicle. The video information of the cameras is transmitted to a central processing unit. Camera
modules for such systems are usually equipped with so-called fish-eye lenses, which allow a
horizontal field of view of more than 180°. From the camera images, a 360° view of the environment is
generated and provided to the driver as a parking aid on a monitor. In the future, the camera images will be
used not only for parking support but also for object detection and general environment
detection in addition to the front view camera.
Mirror Replacement An approach that is likely to play a significant role in the future is the replacement of
conventional exterior mirrors with camera systems. This is beneficial for fuel consumption (less drag) and opens
up completely new design possibilities. Similar to surround view systems, a high dynamic range and a
good color reproduction of the camera image are necessary for a good quality of the displayed video. Such
camera systems are addressed in the international standard ISO/DIS 16505 (ISO/DIS 16505).
(Figure: processing chain of a camera system – image sensor, image processing, display)
The optical image is converted by the image sensor into digital values. Therefore, the design and
adaptation of the optical system to the image sensor are crucial to image quality. The sensor mainly
influences the resolution (number of pixels), field of view (number and arrangement of pixels), dynamic
range, color reproduction, and especially the sensitivity (see Sect. 3.3).
In the next step of the processing chain, the image quality is affected by the image processing steps in
the processor. In addition, the performance of the computer vision algorithms crucially depends on the
performance of the processing unit.
For a camera mounted at height h, the vertical angle α at which a point on the road at distance d appears follows as

$$\alpha = \tan^{-1}(h/d)$$
In surround view systems, almost all camera modules use a horizontal field of view of more than
180°. This is due to the desired 360° representation of the vehicle environment. In order to compute a
single picture from the individual frames, an overlap between the FOVs of the different cameras is required.
In the area of driver monitoring, the imaging of the driver's head is important. Different anatomical
requirements and installation situations result in a field of view of about 40–50°. For gesture recognition,
for example, using a camera module in the functional unit in the roof, larger FOVs (>50°) are usually
selected. This provides the driver with more freedom in the gesture operation.
Fig. 5 Vertical (a) and horizontal (b) field of view of a front view camera
Fig. 6 Effect of decreasing resolution using the example of a traffic sign (480 × 650 / 72 × 96 / 36 × 48 / 24 × 32 / 18 × 24 / 12 × 16 pixels)
Typical high-contrast situations include garages or oncoming vehicles at night. A pavement marking at night can exhibit a luminance L of less than 10
cd/m², while the headlights of a vehicle in the same scene can have a luminance of up to 100,000 cd/m²
(Hertel 2010). For this reason, a very high dynamic range of the camera system is necessary. With an
appropriate design (see also Sect. 3.3), one can achieve more than 120 dB dynamic range within imaging
systems. The dynamic range is defined as follows:
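The formula itself falls on a page break; the usual definition in imaging, consistent with the decibel figures quoted in this chapter, relates the largest and smallest resolvable luminance (or signal):

$$\mathrm{DR} = 20 \cdot \log_{10}\!\left(\frac{L_{\max}}{L_{\min}}\right)\ \mathrm{dB}$$

For the luminances of the example above (10 cd/m² and 100,000 cd/m²), the scene alone spans 20 · log₁₀(10⁴) = 80 dB.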
3 Camera Module
Camera modules can be very different in design. A camera module is defined here as the combination of
lens, image sensor, electronics, and packaging technology. It is of course possible to accommodate more
components within the camera module assembly, such as image processors.
3.1.1 Electronics
Image sensors have a wide variety of analog and digital inputs and outputs. The most important are the
power supply for the analog and digital parts of the image sensor, external timing signal (clock), digital
control inputs and outputs, and interfaces for the configuration and image data transmission to the
computing unit.
The transfer of settings (e.g., integration time) is realized via the configuration bus system with a low
bandwidth. The transmission of the image data can be achieved via a parallel or serial interface. As data
rates increase significantly with higher pixel counts and refresh rates, there is a trend toward high-speed
serial interfaces such as the Camera Serial Interface (CSI) (MIPI 2014).
3.2 Optics
The camera lens typically comprises a stack of single lens elements, optical filter elements, and the overall
lens housing. The applied lens designs in the automotive industry are a trade-off between optical
performance, costs, and durability. This trade-off strongly influences the choice of materials of the single
lens elements, as well as their number, and the choice of lens housing materials. An example of such durability
and robustness requirements is the stability of the focal point over temperatures of up to
100 °C.
Lens Design for Front View Cameras Low light conditions and the demand for sufficient frame rates
require a short exposure time in a front view camera, which is compensated for in terms of light sensitivity
by a low f-number, i.e., a large aperture and/or a small focal length. F-numbers < 2 are in use. Large
apertures introduce image aberrations to sharpness and color, which need to be removed by additional
single lens elements, resulting in multiple single lens elements in an automotive camera lens.
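For reference, the underlying standard relation (general optics, not specific to this chapter): the f-number N is the ratio of the focal length f to the aperture diameter D, and the image-plane illuminance E scales roughly with the inverse square of the f-number:

$$N = \frac{f}{D}, \qquad E \propto \frac{1}{N^2}$$

Reducing the f-number from 2.8 to 2 therefore roughly doubles the collected light.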
The ability of a camera lens to produce a sharp image is expressed by the modulation transfer function
(MTF). The MTF describes how well real-world contrasts at different spatial frequencies are reproduced
at the image plane.
Camera lenses become more expensive with better performance, because more single lens elements are
required. Therefore, an overshoot in performance has to be avoided for cost reasons, and it
is thus important to know the minimum image quality required for the driver assistance function to run properly.
Image quality is degraded by the usual optical aberrations, which are accumulated and expressed in the
so-called point–spread function (PSF). It describes how a real-world point source (or point in the scene) is
reproduced on the image plane, where it is typically a spatial intensity distribution and not a point
anymore (Sinha 2012). A practical example is a headlight at night in the far distance. Another image-
forming influence is the sampling nature of the image sensor and the discrete spatial distribution of the
pixels. A general design rule for lenses is to keep the PSF close to the size of a pixel, because PSFs smaller
than the pixel size (provided by better and more expensive lenses) cannot be resolved at all.
Image sharpness varies along the optical axis, i.e., the position of the image sensor is important. This
so-called depth of focus provides a narrow region of sufficient sharpness and is smaller at lower
f-numbers. By means of focusing close to the hyperfocal distance, it is nevertheless possible to image
large portions of the real-world scene with high sharpness and leave margins for temperature-induced
focal point shifts.
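The hyperfocal distance itself is not given in the text; the common photographic approximation, with f-number N and acceptable circle of confusion c (here on the order of the pixel pitch), is

$$d_H \approx \frac{f^2}{N \cdot c}$$

Focusing at $d_H$ renders everything from about $d_H/2$ to infinity acceptably sharp. With the illustrative values f = 5 mm, N = 2, and c = 3.75 µm (the pixel pitch used in the stereo example in Sect. 4.3.3), $d_H \approx 3.3$ m.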
Image aberrations typically found in automotive camera lens designs are spherical aberration, chro-
matic aberration, coma, astigmatism, and field curvature and act on the intensity distribution of the PSF
(Hecht 1998). Chromatic aberration, for example, results in wrong reproduction of colors. All image
aberrations can be reduced to negligible amounts by using more single lens elements, aspherical lens
elements, or expensive lens materials (Sinha 2012). Other image artifacts, for example, lens flare and
ghost images, can appear due to multiple internal reflections and scattering on optical and mechanical
surfaces (Reinhard et al. 2008).
Image distortions of a few percent are tolerable in lens design, because they can be corrected by
software algorithms. Therefore, designing an expensive and distortion-free camera lens is not required.
Especially for finding correspondences in stereo camera images, the distortion correction has to work accurately,
and the lens distortion is therefore measured during fabrication (see Sect. 5).
Lenses for Surround View Cameras Surround view cameras typically require a large field of view. As a
result, for moderately priced lenses a large distortion is observed toward the edges of the image. The high
distortion needs to be corrected very accurately, because images from more than one camera are stitched
together to generate the surround view, and finding correspondences in single images is the basis of image
stitching. Also, the strong brightness reduction toward the edges (optical vignetting) needs to be considered and
corrected. Additionally, the outermost lens element is usually exposed to the environment and has to be designed in a
robust way (e.g., glass instead of plastic).
Lenses for Interior Cameras Lenses for interior cameras need to monitor the driver at a range of
40–100 cm. Typical f-numbers for such lenses are f > 2, because a relatively large depth of focus is
required to match the abovementioned range.
3.3.2 Resolution
The quality of digital video is influenced not only by the spatial resolution but also by the contrast and
temporal resolution. These quantities describe the number of pixels on which an object is imaged, the
number of gray levels into which a scene can be resolved, and the time interval between two images (Fiete 2010).
Spatial Resolution To reconstruct a structure by a digital image sensor, it must be mapped to multiple
pixels. The required number of pixels per angle is therefore determined by the smallest structure one
wants to resolve. The total number of pixels is determined by the FOV and the desired resolution in pixels
per angle (see Sect. 2.1.2).
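A minimal sketch of this sizing rule in Python; the field-of-view value is an assumption for illustration, while the 15 px/deg stems from the traffic sign recognition requirement in Sect. 1:

```python
import math

def required_pixels(fov_deg: float, px_per_deg: float) -> int:
    """Pixels needed along one image axis for a given field of view
    and angular resolution (pixels per degree)."""
    return math.ceil(fov_deg * px_per_deg)

# Assumed example: a front view camera with a 50 deg horizontal FOV.
print(required_pixels(50, 15))  # -> 750 horizontal pixels
```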
To realize a high number of pixels on an image sensor, there are two options. The first is to keep the
pixel pitch (the distance between the centers of neighboring pixels) and increase the die size. The second
approach is to lower the pixel pitch (and thus the pixel size) while keeping the die size. For cost reasons,
the latter approach is chosen in most cases. From the perspective of the signal-to-noise ratio, keeping the
pixel pitch is preferable, because with smaller pixels the fill factor decreases further. Furthermore,
although a smaller pixel volume generates less temporal noise, the noise does not decrease proportionally
(Fiete 2010; Theuwissen 2008). Care must be taken to ensure that the disadvantages of decreasing the
pixel size can be compensated by, e.g., new pixel designs and improved manufacturing processes.
Contrast Resolution For the detection of objects, a fine differentiation of object brightness is advan-
tageous, in order to also detect, for example, a darkly clothed person at night. This is achieved by an A/D conversion
of the signal with a bit depth of 8–10 bit in simple systems and of 12 bit and more in more sophisticated
systems. HDR (high dynamic range) sensors internally often work with much higher bit depths, which are
then compressed to a bit depth of typically 10–14 bit for easier data transfer.
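For orientation, a standard relation (not from the text): an n-bit linear quantization spans a signal ratio of $2^n$, i.e.,

$$20 \cdot \log_{10}(2^{n}) \approx 6.02 \cdot n\ \mathrm{dB},$$

so 12 bit corresponds to roughly 72 dB. A 120 dB scene therefore cannot be represented linearly at these bit depths, which is why the HDR pixel architectures described below combine a higher internal dynamic range with companding to 10–14 bit.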
Temporal Resolution The frame rate determines the time interval between two recordings. A low frame rate
involves the risk that the system's reaction to an event occurs too late or that the event is missed entirely, and it
furthermore complicates the tracking of objects. A high frame rate increases the demands on the interface
and the further image processing. Typical values are about 30 frames per second.
Lateral Overflow This concept is based on analog resets which do not fully reset the pixel but allow a
certain level of charges to remain (partial saturation). Several such partial resets are performed during the
integration time, each allowing an increasing amount of charges to remain while the time intervals
between the partial resets get shorter (Darmont 2012). Capturing moving objects may result in
some motion artifacts if partial saturation levels are reached, because the repeated partial integrations are
superimposed into a single image.
An advantage of the method is that the high dynamic range information is directly available, and hence no
information needs to be cached, which makes the approach well suited for global shutter sensors (see
Sect. 3.3.5).
Multi-exposure In this concept, several individual captures with different sensitivities (varied, e.g., via
integration time or gain) are taken in succession, and this information is combined into a high dynamic range image.
Moving objects are captured at several points in time and thus at different positions within the frame,
causing noticeable motion artifacts. Therefore, a small time gap between two single integrations is very
important, as is an appropriate combination scheme for the transformation into an HDR image (Solhusvik
et al. 2009).
The benefits of the multi-exposure method are its SNR performance, as CDS and other corrective
measures can be performed individually for each of the single integrations.
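As an illustration of such a combination scheme, here is a minimal sketch assuming a linear sensor response and an idealized, noise-free short exposure; real implementations use more elaborate blending (Solhusvik et al. 2009), and all names and values are illustrative:

```python
import numpy as np

def combine_two_exposures(long_img, short_img, ratio, sat_level=4095.0):
    """Merge a long and a short exposure (linear raw values) into one HDR
    image: where the long exposure saturates, use the short exposure
    scaled by the exposure ratio instead."""
    long_f = long_img.astype(np.float64)
    short_f = short_img.astype(np.float64) * ratio  # bring to common scale
    return np.where(long_f >= sat_level, short_f, long_f)

# Illustrative use with 12-bit raw data and a 16x exposure ratio.
rng = np.random.default_rng(0)
long_img = rng.integers(0, 4096, size=(4, 4))
short_img = long_img // 16  # idealized, noise-free short exposure
hdr = combine_two_exposures(long_img, short_img, ratio=16.0)
```

A smooth blend around the saturation level, rather than the hard switch shown here, would reduce visible seams in the transition region.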
Split Pixels In the split pixel concept, every pixel is divided into two or more sub-pixels. Different
sensitivities of the sub-pixels are achieved by differently sized photosensitive areas (typical ratios of
about 1:4 to 1:8), different gain stages, or the choice of different integration times.
The great advantage of this approach is the time-parallel capture of the scene in different dynamic
ranges, resulting in lower motion artifacts than with the lateral overflow or multi-exposure approaches.
The disadvantage is that the dynamic range is generated from only two single integrations, which results
either in a lower overall dynamic range or in a strong compression of the dynamic range (Solhusvik et al. 2009;
Solhusvik et al. 2013). Furthermore, the fill factor is somewhat poorer, since two photodiodes and their
circuits must be accommodated within the pixel pitch.
Fig. 11 Difference between electronic rolling shutter (ERS) and global shutter (GS), using the example of a cube moving to
the right. ERS records moving objects at different rows at different points in time; GS records all rows at the same time
With a global shutter, a moving object therefore retains its shape in the captured image (see Fig. 11, lower illustration). Also, randomly pulsed light sources such as
LED vehicle lighting or active traffic signs produce a homogeneous result, as long as the pulses occur
during the integration time.
A disadvantage of the global shutter is the need to store the image information until it is read out. This
results in the need for additional transistors per pixel and analog memory cell arrays (sample-and-hold
circuits). This leads to parasitic effects, and additional chip area is required. The resulting poorer fill
factor results in a higher noise level compared to the electronic rolling shutter, where the information is
output directly (Yadid-Pecht and Etienne-Cummings 2004).
In an electronic rolling shutter (ERS) imager, the integration of each pixel is individually started, offset
by one clock cycle, and also individually stopped after the integration time (Yadid-Pecht and
Etienne-Cummings 2004; Baxter 2013). The shutter "rolls" across the single pixels over the entire active
pixel array. At each clock cycle, the information of the single pixel that has just stopped its integration is
available and is read out directly. This eliminates the need for a sample-and-hold circuit, which is why ERS
has clear advantages in SNR compared to GS. Furthermore, CDS is easier to integrate with ERS
due to the lower number of pixel transistors (Yadid-Pecht and Etienne-Cummings 2004).
A disadvantage of sequentially integrating the pixels is the time offset of the integration between the image
lines. The upper part of an object that moves horizontally through the image is thus captured at an earlier point
in time than the lower part, which leads to the well-known effect of distorted or "leaning" objects.
Recording pulsed light sources (active variable traffic signs) results in short single pulses being
seen only by those few lines that are integrating photons at that time.
Figure 11 shows the position of a moving cube and the lines (light gray) whose integration was started at
different times while the ERS image (top) was captured. When the lines are assembled, the result is
a distorted representation of the cube. With GS (bottom), the integration of all pixels happens at the same
time, resulting in an undistorted representation of the cube.
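The geometric effect can be reproduced in a few lines: the sketch below samples each image row of a synthetic scene at its own integration start time (all parameters are assumed for illustration):

```python
import numpy as np

def rolling_shutter(frame_fn, n_rows, n_cols, line_time_s):
    """Simulate an ERS capture: row r samples the scene at time
    r * line_time_s; frame_fn(t) returns the full scene at time t."""
    img = np.zeros((n_rows, n_cols), dtype=np.uint8)
    for r in range(n_rows):
        img[r, :] = frame_fn(r * line_time_s)[r, :]
    return img

def moving_square(t, speed_px_s=10000.0):
    """Scene: a bright 40 px square moving to the right."""
    scene = np.zeros((120, 120), dtype=np.uint8)
    x = int(10 + speed_px_s * t)
    scene[40:80, x:x + 40] = 255
    return scene

ers_img = rolling_shutter(moving_square, 120, 120, line_time_s=20e-6)
gs_img = moving_square(0.0)  # global shutter: whole frame at one instant
```

With the assumed line time of 20 µs and object speed, the square in ers_img is sheared by about 8 pixels between its top and bottom rows, while gs_img shows it undistorted.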
4 System Architecture
To meet the demands of all required functions, the system architecture requires the correct design of
hardware and software components as well as the image processing algorithms. In addition, the mechan-
ical design of the system and the mechanical and electronic connections with the vehicle are of
importance.
(Figure: building blocks of a camera system – lens, electronics, mechanics, and image control)
Since camera systems for driver assistance functions are safety-related vehicle components (e.g., through
braking interventions), the system must comply with the ISO standard 26262 ("Road vehicles – Functional
safety") (ISO 26262). Depending on the function, different automotive safety integrity levels (ASIL)
have to be implemented.
Depending on the intended use, the raw image data results after the transformation step in different output values. The purpose of the operation is either an
adaptation to a particular display system or improved image processing. In the case of a representation on a display, a
noise reduction, an edge enhancement, and a color correction are typically performed (Reinhard et al. 2008;
Nakamura 2006).
If the image data is used for machine vision tasks, often a distortion correction, a calculation of the
optical flow, and, in the case of a stereo camera, an image rectification and a creation of the disparity map
are performed. From the preprocessed images, the desired information is extracted in the main image
processing step (see also chapters “▶ Fundamentals of Machine Vision” and “▶ Stereovision for
ADAS”).
4.1.4 Communication
The data exchange with other controllers in the vehicle is realized via the communication interface of the
camera system. Common vehicle bus systems are the controller area network (CAN) bus, the FlexRay
bus, and the Ethernet standard. While CAN and FlexRay bus systems are only used to control the camera
system and to transfer the output data in the form of, e.g., object lists, a transmission of raw image data is
possible when using Ethernet because of the higher data rates.
4.1.5 Electronics
The design of the electronics follows the high standards in the automotive industry, among others, with
regard to durability and electromagnetic immunity and compatibility. The large amounts of data that must
be processed in real time using complex algorithms lead to a system design with several processors or
multi-core processors (Stein et al. 2008). The resulting amount of heat to be dissipated is a challenge in the
vehicle installation space and has to be considered already in the electronics and housing design.
4.1.6 Mechanics
The housing of the camera system is the interface between the electronics and the camera module to the
vehicle. It must be thermally stable and easy to assemble. Moreover, the housing usually forms the
shielding of the electronics for better electromagnetic compatibility. In the case of the front view camera,
the housing is located behind the windshield. To avoid reflections between the windshield and the
housing, a so-called stray light cover is often used. In camera systems where the camera modules are in
direct contact with the environment (e.g., surround view), the housing must also be sealed against
humidity.
(Figure: block diagrams of camera system architectures – camera module(s), DSP/FPGA image processing (IPC), microcontroller, heating, control bus, data interface, and vehicle bus)
4.3.3 Design
An important criterion for the use of a stereo camera system is the accuracy of the depth estimation. Only
with a precise depth estimation can functions such as a camera-based emergency brake assist or an automatic
distance control be realized reliably. These functions rely on precise information about the distance from
the vehicle to the object.
Important parameters in a stereo system are the paraxial focal length f of the cameras and the distance
between the cameras (base width b). The so-called disparity d is the offset between the pixel positions at
which the same image point P is projected onto the two image sensors.
With the size of a pixel on the image sensor $s_P$, the distance to the object $z_C$ can be calculated:

$$z_C = \frac{b \cdot f}{d \cdot s_P}$$

The error $\Delta z$ of the distance measurement caused by a disparity error $\Delta d$ follows as

$$\Delta z = \frac{\Delta d \cdot s_P \cdot z_C^2}{b \cdot f}$$
Accuracies $\Delta d$ of less than one pixel can be achieved to minimize the error of the disparity estimation.
An example calculation for a camera system with a focal length of 5 mm, a pixel size of 3.75 µm, a base
width of 200 mm, and a disparity error of 0.25 pixels results in an error of the distance measurement of
2.3 m (4.7 %) at an object distance of 50 m.
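The example can be checked directly against the two formulas above (a small sketch; all quantities in meters):

```python
def stereo_depth_error(f_m, pixel_m, base_m, disp_err_px, z_m):
    """Depth error dz = dd * s_P * z^2 / (b * f) for a stereo camera."""
    return disp_err_px * pixel_m * z_m**2 / (base_m * f_m)

dz = stereo_depth_error(f_m=5e-3, pixel_m=3.75e-6,
                        base_m=0.200, disp_err_px=0.25, z_m=50.0)
print(f"{dz:.2f} m ({100 * dz / 50.0:.1f} %)")  # -> 2.34 m (4.7 %)
```

The quadratic growth of the error with distance $z_C$ is the key limitation of stereo ranging at long distances.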
Improvements are possible by increasing the resolution of the image sensor using a smaller pixel size, a
larger base width, or a larger focal length. Changing these parameters requires, however, a change in the
system design. The field of view of the camera would be limited by a longer focal length. The base width
should be kept as small as possible because of the requirements of the vehicle design. A smaller pixel size
can lead to a lower sensitivity of the system. In addition, the required computing power increases
significantly with higher resolutions.
5 Calibration
To ensure that driver assistance functions like sign recognition, lane keeping assist, or head light control
work in a reliable and correct manner, it is important to interpret images of the used camera system
correctly. To this end, additional information about the images is necessary, which can be determined by
calibration algorithms. Such information, which we denote as calibration parameters, can not only be used
for better interpretation of the images but also for a compensation of deviations from a defined norm. One
example might be the linearization of an image sensor’s response curve.
This section provides a discussion about the calibration parameters that are typically determined for
driver assistance systems. Furthermore, the section describes where a calibration normally takes place and
how it can be conducted.
Typical calibration parameters are the intrinsic parameters of the camera:
– Principal point
– Focal length
– Image distortion
– Pixel scale factors
and the extrinsic parameters:
– Camera position
– Camera orientation
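As an illustration of how the intrinsic parameters can be determined in practice, the following sketch uses OpenCV's standard checkerboard calibration; the image location and board geometry are assumptions:

```python
import glob
import cv2
import numpy as np

# Checkerboard with 9x6 inner corners, square size 25 mm (assumed).
pattern = (9, 6)
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * 0.025

obj_points, img_points = [], []
for path in glob.glob("calib/*.png"):  # assumed image location
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# K contains the focal length, principal point, and pixel scale factors;
# dist holds the lens distortion coefficients.
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
print("reprojection RMS:", rms)
```

The extrinsic parameters (camera position and orientation relative to the calibration target) are returned per view in tvecs and rvecs; determining the camera pose relative to the vehicle requires an additional target with a known vehicle-referenced position.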
6 Outlook
The technology in the field of camera systems is developing rapidly. One driver is the area of consumer
electronics with its demand for ever more powerful and less expensive image sensors, camera lenses, and
computing platforms. These advances will in the future also be used in the vehicle environment. Greater
computing power enables both advances in image processing and the usage of larger resolutions of
the image sensors. Higher refresh rates and improved sensitivities are additional developments in the field
of camera sensors. With these technological advances, camera systems will be used in the future in various
applications in the vehicle.
References
Brainard DH (1994) Bayesian method for reconstructing color images from trichromatic samples. In:
Proceedings of the IS&T 47th annual meeting, Rochester, NY, pp 375–380
Baxter D (2013) A line based HDR sensor simulator for motion artifact prediction. Proc SPIE
8653:86530F
Civera J, Bueno DR, Davison AJ, Montiel JMM (2009) Camera self-calibration for sequential Bayesian
structure from motion. In: Proceedings of the 2009 IEEE international conference on Robotics and
Automation, Kobe, Japan
Daimler (2014) Homepage Daimler. http://www.mercedes-benz.com. Accessed 10 Jan 2014
Darmont A (2012) High dynamic range imaging, sensors and architectures. SPIE Press, Washington
Delphi (2014) Homepage Delphi. http://delphi.com/manufacturers/auto/safety/active/racam/. Accessed
10 Jan 2014
El Gamal A, Eltoukhy H (2005) CMOS image sensors. IEEE Circuits Devices Mag 21(3):6
Fiete RD (2010) Modelling the imaging chain of digital cameras. SPIE Press, Bellingham
Fischer RE (2008) Optical system design. McGraw-Hill, New York
Gentex (2014) Homepage Gentex. http://www.gentex.com/automotive/products/forward-driving-assist.
Accessed 10 Jan 2014
Hecht E (1998) Optics. Addison Wesley Longman, New York
Hertel D (2010) Extended use of incremental signal-to-noise ratio as reliability criterion for multiple-slope
wide-dynamic-range image capture. J Electron Imaging 19(1):011007
Holst GC, Lomheim TS (2011) CMOS/CCD sensors and camera systems. SPIE Press, Washington
ISO 26262: Road vehicles – functional safety
ISO/DIS 16505: Road vehicles – ergonomic and performance aspects of camera-monitor
systems – requirements and test procedures
Källhammer J (2006) Night vision: requirements and possible roadmap for FIR and NIR systems. Proc
SPIE 6198:61980F
Loce RP, Berna I, Wu W, Bala R (2013) Computer vision in roadway transportation systems: a survey.
J Electron Imaging 22(4):041121
Miller JWY, Murphey YL, Khairallah F (2004) Camera performance considerations for automotive
applications. Proc SPIE 5265:163
MIPI (2014) Homepage MIPI-Alliance. http://www.mipi.org/specifications/camera-interface. Accessed
10 Jan 2014
Nakamura J (2006) Image sensors and signal processing for digital still cameras. CRC Press, Boca Raton
Raphael E, Kiefer R, Reisman P, Hayon G (2011) Development of a camera-based forward collision alert
system. SAE Int J Passenger Cars Mech Syst 4(1):467
Reinhard E, Khan EA, Akyüz AO, Johnson GM (2008) Color imaging. A.K. Peters, Wellesley
Sinha PK (2012) Image acquisition and preprocessing for machine vision systems. SPIE Press,
Washington
Solhusvik J, Yaghmai S, Kimmels A, Stephansen C, Storm A, Olsson J, Rosnes A, Martinussen T,
Willassen T, Pahr PO, Eikedal S, Shaw S, Bhamra R, Velichko S, Pates D, Datar S, Smith S, Jiang L,
Wing D, Chilumula A (2009) A 1280 × 960 3.75 µm pixel CMOS imager with triple exposure
HDR. In: Proceedings of 2009 international image sensor workshop, Utah, USA
Solhusvik J, Kuang J, Lin Z, Manabe S, Lyu J, Rhodes H (2013) A comparison of high dynamic range CIS
technologies for automotive applications. In: Proceedings of 2013 international image sensor work-
shop, Utah, USA
Stein GP, Gat I, Hayon G (2008) Challenges and solutions for bundling multiple DAS applications on a
single hardware platform. Israel computer vision day
Theuwissen AJP (2008) Course “digital camera systems” – hand out. CEI.se, Finspong
Tsai RY (1987) A versatile camera calibration technique for high-accuracy 3D machine vision metrology
using off-the-shelf TV cameras and lenses. IEEE J Robot Autom 3(4):323
Yadid-Pecht O, Etienne-Cummings R (2004) CMOS imagers: from phototransduction to image
processing. Kluwer, Dordrecht
Zhang Z (1998) Determining the epipolar geometry and its uncertainty: a review. Int J Comput Vis
27(2):161