
Monocular Camera based Computer Vision System for Cost Effective Autonomous Vehicle
Ashish Pidurkar
Electronics and Telecommunication, Govt. College of Engineering Pune, Pune, India
ashishpidurkar7@gmail.com

Ranjit Sadakale
Electronics and Telecommunication, Govt. College of Engineering Pune, Pune, India
sadakaler.extc@coep.ac.in

Dr. A.K. Prakash
Advance Engineering, Hella India Automotive Pvt. Ltd., Pune, India
Ak.Prakash@hella.com

Abstract— According to the World Health Organization, there were 1.25 million road accident deaths globally in 2013. To decrease this count, it has been said that autonomous vehicles can play a major role. The autonomous vehicle uses different sensors such as LIDAR, RADAR, ultrasonic sensors, and cameras for the detection of lane lines, objects, traffic signs, traffic lights, and the driving path. There are several successful projects on autonomous vehicles, but the main limitations are high costs and regulations. The camera can be a cost-effective solution compared to LIDAR for all the basic detection needed for autonomous driving. The proper placement of sensors on the vehicle plays an important role in efficient design and operation and in reducing the cost of the overall autonomous vehicle system. This paper discusses the parameters considered while deciding the positioning of the camera, and the geometric calculation for the best positioning of a monocular camera on an autonomous vehicle.

Keywords—autonomous vehicle, camera position, processing, objects, hardware, algorithm.

I. INTRODUCTION
The research in the field of the autonomous vehicle started with the aim of passenger safety and better road transport. High-level autonomous vehicles can drive themselves safely on the road without any intervention of the driver. An autonomous vehicle consists of a variety of sensors like ultrasonic sensors, RADAR, LIDAR, infrared, and camera sensors, used for object detection around the vehicle. It is equally important to select the proper sensors based on the scope of the application to make the system highly efficient and cost-effective. Globally there are many success stories of autonomous vehicles, but the world is still waiting to see them on the roads on a regular basis. The reasons behind this are vehicle cost and government regulations. This paper discusses the variety of sensors used in an autonomous vehicle, the effectiveness of the camera, and the role of system parameters such as the maximum speed of the vehicle and the processing capability of the hardware in deciding the positioning of the camera.

II. RELATED WORKS
The architecture design for the autonomous vehicle is the first step of the design, and it is important to have a good architecture design to reach good results with smooth and error-free progress throughout the project development. The design parameters considered are the sensor architecture plan, the software architecture, simultaneous localization of the vehicle and trajectory planning, the object detection algorithms, and the hardware architecture[1][17]. Different sensor architectures are discussed for different applications with their advantages and disadvantages, but the sensor placement locations and their deciding parameters are not discussed. The different types of sensors like LIDAR, Radar, camera, ultrasonic sensors, and IMU, and the techniques for sensor fusion like vision-LiDAR/Radar, GPS-IMU, and RSSI-IMU, are discussed by the author in detail[2]. It is important to get maximum information which is highly accurate and reliable, for which sensor fusion can help a lot. The sensors are important to locate the vehicle in space, and the location and placement of the sensors on the vehicle should be properly analyzed and optimized. A single sensor may not work properly in adverse weather and surrounding conditions, so sensor fusion can help in such cases. The positioning of the sensors is discussed based on the field of view, range, and direction of the sensors [3]. The object distance is estimated by calibrating multiple LIDAR sensors placed on the vehicle based on their relative positions and the angle of the servo motor at the time the object is detected[4]. That work uses three LIDAR sensors to estimate the object distance, but because of the high cost of LIDAR sensors the system is costly. Using a monocular camera sensor and a mapping of the real world to image pixels, the approximate object distance can be estimated; this technique can also be utilized for vehicle parking. After the detection of an object, the object distance is estimated using triangle symmetry and the focal length of the camera [5]. By detecting the lane lines, the distance of the vehicle from both lines is calculated to decide the position of the vehicle; the camera calibration this requires is considered a drawback of the method. The bird's-eye view method is also used for more accurate object distance estimation [6][7]. In today's world, the automotive industry is moving towards smarter and more luxurious vehicles that have multiple functionalities of actuation, safety, entertainment, luxury, etc., so the number of microcontrollers in the system is increasing day by day. The communication between the controllers must be fast and reliable. CAN, LIN, and FlexRay are the communication protocols which satisfy the need for communication in high-end smart cars, and a gateway embedded system has been developed for intercommunication among the LIN, CAN, and FlexRay protocols [8][9]. A lane line detection technique is explained with the help of feature extraction and polynomial fitting of the detected lane pixels [10][11]. HOG and SVM are used to detect and recognize objects like pedestrians, traffic signs, and vehicles: HOG is used for feature extraction and an SVM classifier is trained on datasets for the detection of the desired objects[12]. The accuracy of this method is good, but the false positive rate must still be reduced, and it also requires more parameters to be tuned.
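As a rough illustration of the HOG + SVM pipeline cited above [12], the sketch below uses OpenCV's built-in HOG descriptor with its pretrained pedestrian SVM; the image file name and the detection parameters are placeholders and are not taken from the paper.

    import cv2

    # Minimal HOG + SVM pedestrian detection sketch (assumes OpenCV is installed;
    # "road_scene.jpg" is a placeholder image name, not from the paper).
    hog = cv2.HOGDescriptor()
    hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

    frame = cv2.imread("road_scene.jpg")
    boxes, weights = hog.detectMultiScale(frame, winStride=(8, 8), scale=1.05)
    for (x, y, w, h) in boxes:
        # Draw a bounding box around each detected pedestrian.
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)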


As per several reports on road transport safety, there are millions of road accident deaths every year, and the development of the autonomous vehicle has started in order to improve the safety of passengers. The major problem in opting for an autonomous vehicle today is its high cost. This paper focuses on low-cost technology for the autonomous vehicle.

This work presents a detailed method for deciding the mounting position of the monocular camera sensor of an autonomous vehicle. Many parameters are taken into consideration for deciding the optimum camera mounting position on the vehicle, which reduces the cost of the overall system by reducing the number of sensors and the processing power required. Many researchers discuss parameters like sensor FOV, sensor direction, and sensor range, which are not enough to get the optimized location of the sensor[8][9]. For camera position selection we consider parameters like the processing speed of the hardware, the braking time of the vehicle, the camera sensor specifications, the communication medium delay, the maximum speed of the vehicle, the maximum height of the objects to be detected by the camera, the path planning requirement, and blind spot detection.

III. MAJOR SENSORS AND THEIR USE
The autonomous vehicle uses sensors like ultrasonic, Radar, Lidar, infrared, IMU, camera, etc. [1]. They are used to collect the surrounding information accurately, based on which the vehicle trajectory is planned. The sensors of interest here are the Lidar and the monocular camera.

Most of the development done in the field of the autonomous vehicle has been based on the Lidar sensor. The Lidar is used for object detection and gives a 360-degree 3D view of the surrounding environment[4]. But when the need comes to recognize specific objects like cars, pedestrians, road signs, or lane markings, the Lidar system fails to satisfy it, and this is where the camera system comes into the picture. Monocular camera sensors are much less expensive than Lidar. The camera reads the 2D surrounding view in electronic pixel format, on which further processing can be done to recognize the specific objects.

A. LIDAR Sensor
Lidar has high accuracy, precision, and speed, but it has a high cost and low availability. It works on the concept of time of flight (TOF): it sends out a laser light pulse and measures the time until the pulse is received back. It can create a 360-degree 3D picture of the surroundings and can be used to measure the object size and its distance from the source vehicle[2]. It has a high distance measurement range of around 300 meters with good accuracy and precision.

B. Camera Sensors
CMOS technology cameras are mostly used in autonomous vehicles. The reflected light rays are captured on each pixel of the camera sensor chip, and the corresponding signals generated at the output of each pixel are amplified for further processing to realize the actual picture of the view. The camera has the ability to see and recognize colors and textures[3]. These advantages allow the autonomous vehicle to identify different road signs, lane markings, vehicles, pedestrians, traffic signs, etc. The cost of the camera is much less than that of the Lidar sensor. It requires high-processing-power hardware, but the combined cost of the camera and the processing hardware is still less than that of the Lidar sensor. Like other sensors, the camera also has a dark side: parameters like bad weather conditions and the light intensity can affect the performance of the camera sensor. The potential of the camera system can be increased if it is used along with other sensors like Radar or Lidar. A stereo camera can measure object depth more accurately, but it has drawbacks such as high cost, low detection range, and difficulty of implementation. The monocular camera has the parameters listed below:
1) Field of view (FOV): It is the maximum area covered by the vision of a camera sensor. A fixed-focus camera cannot change its FOV, while a variable-focus camera has a provision to change it.
2) Frames per second (FPS): The maximum rate at which the camera can capture images is known as frames per second. It depends on the encoding and decoding speed of the camera sensor.
3) Video capture ratio: It is the ratio of the number of pixels in the width to the number of pixels in the height of the images in the video.
4) Focal length: It is the distance between the image sensor and the lens of the camera.

IV. CAMERA CALIBRATION
In camera calibration we estimate the camera parameters listed below and use them to correct the lens distortion and to measure the size and distance of an object in 3D world units.

A. Camera Intrinsic parameters
Focal length, principal point, camera skew coefficient.

B. Camera Extrinsic parameters
The extrinsic parameters represent the location of the camera in the 3-D scene: the rotation matrix and the translation vector.

C. Camera distortion coefficients
Radial distortion coefficients (K1, K2, K3) and tangential distortion coefficients (P1, P2).

We calculate the 3D world points of the chessboard cross points and obtain their respective 2D image points using the function cv2.findChessboardCorners() of the OpenCV library. Here we input the 3D world points, e.g. (0, 0, 0), (25, 0, 0), (50, 0, 0), …, (25, 25, 0), (25, 50, 0), …, etc. (where 25 mm is the distance between two chessboard cross points), considering that the chessboard is moved on the x-y plane only (z = 0) of the world coordinate system[13], and we get their respective 2D pixel coordinates.

Fig. 1: Camera calibration and pixels mapping [13]
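A minimal sketch of this corner-detection step is given below, assuming a chessboard with 9x6 inner corners and 25 mm squares as in the example above; the pattern size and the image file pattern are assumptions, not values stated in the paper.

    import glob
    import cv2
    import numpy as np

    pattern_size = (9, 6)      # assumed inner-corner grid of the chessboard
    square_size = 25.0         # mm between neighbouring cross points, as above

    # 3D world points of the cross points on the z = 0 plane:
    # (0,0,0), (25,0,0), (50,0,0), ..., (0,25,0), (25,25,0), ...
    objp = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2)
    objp *= square_size

    obj_points, img_points = [], []      # 3D points and matching 2D pixel points
    for fname in glob.glob("calib_*.jpg"):               # placeholder file names
        gray = cv2.cvtColor(cv2.imread(fname), cv2.COLOR_BGR2GRAY)
        found, corners = cv2.findChessboardCorners(gray, pattern_size)
        if found:
            obj_points.append(objp)
            img_points.append(corners)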


Now we use these 2D points and the corresponding 3D points as input to the function cv2.calibrateCamera() of OpenCV, which returns all the required camera parameters, i.e. the intrinsic matrix, the extrinsic parameters (rotation matrix and translation vector), and the distortion coefficients. With this, the camera calibration is complete.

The extrinsic and intrinsic parameters together form the camera matrix, using which we can calculate the 2D pixel coordinates of the 3D world points of any object detected in an image captured by the same camera[13][16]. These image points can also be used to remove the distortion in the image. If we know the distance of the calibration chessboard images from the camera, then using these image points (x, y) and triangle symmetry we can calculate the real size and distance of a detected object in front of the camera.
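Continuing the corner-detection sketch above, the calibration call and the undistortion step might look roughly as follows (obj_points, img_points and gray come from that sketch; the road image is again a placeholder).

    import cv2

    # cv2.calibrateCamera() returns the intrinsic matrix, the distortion
    # coefficients and the per-image rotation/translation vectors (extrinsics).
    ret, camera_matrix, dist_coeffs, rvecs, tvecs = cv2.calibrateCamera(
        obj_points, img_points, gray.shape[::-1], None, None)

    # Remove the lens distortion from a frame captured by the same camera.
    frame = cv2.imread("road_scene.jpg")
    undistorted = cv2.undistort(frame, camera_matrix, dist_coeffs)

    # Focal length (in pixels) and principal point, read from the intrinsic matrix.
    fx, fy = camera_matrix[0, 0], camera_matrix[1, 1]
    u0, v0 = camera_matrix[0, 2], camera_matrix[1, 2]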
V. OBJECT DISTANCE ESTIMATION WITH TRIANGLE SYMMETRY
Using a monocular camera, it is possible to estimate the distance of objects present in front of the autonomous vehicle. The captured image is processed with image processing and computer vision methods to get the information about the desired objects present in the image. For the detected objects, bounding boxes and the pixel coordinates of their corners are calculated [12]. In the camera calibration process, objects of known size are kept in front of the camera, and by relating their real size with their size in image pixels, the focal length of the camera is calculated [5][6].

Fig. 2: Camera calibration to estimate the focal length [5].

Now, using the triangle symmetry method, it is possible to calculate the object distance from the source vehicle. The method used is explained below.

Fig. 3: Object distance estimation using triangle symmetry [6].

In the figure above, let the object be standing at location 'O', and let the camera image plane be centered at the origin of the 2D coordinate system. Then, with the triangle symmetry calculations below, we can estimate the object location in 3D world points with good accuracy.

Let d be the estimated distance (mm), dx, dy the pixel length in mm, and X0 = Y0 = 0. Then

X = (u - u0) · dx,  Y = (v - v0) · dy                                [1]

By solving the geometry, we get

d = h / tan(α + tan⁻¹((y - y0) / f))                                  [2]

d = h / tan(α + tan⁻¹((v - v0) / (f / dy)))                           [3]
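As a small worked sketch of equation [3], the function below estimates the ground distance from the image row of a detected object; the camera height h, the camera tilt α and the focal length in pixels (f/dy) are assumed to be known from mounting and calibration, and the sample numbers are illustrative only, not measurements from the paper.

    import math

    def object_distance(v, v0, f_pixels, h_mm, alpha_rad):
        """Equation [3]: d = h / tan(alpha + atan((v - v0) / (f / dy))),
        with f_pixels = f / dy (focal length expressed in pixels)."""
        return h_mm / math.tan(alpha_rad + math.atan((v - v0) / f_pixels))

    # Illustrative values: camera 1.2 m high, tilted 10 degrees downwards,
    # f/dy = 1000 px, principal row v0 = 240, object bottom row v = 300.
    d = object_distance(v=300, v0=240, f_pixels=1000.0, h_mm=1200.0,
                        alpha_rad=math.radians(10))
    print(round(d), "mm")      # roughly 5 m for these placeholder numbers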
VI. PARAMETERS IN CONSIDERATION
To decide the optimum location of the monocular camera for an autonomous vehicle, the following parameters should be taken into consideration.

A. Hardware Image Processing time (Tp)
In the camera system of an autonomous vehicle, the processing hardware is used to process the captured images, detect the desired objects such as pedestrians, cars, and lane lines, and produce the required outputs. High-processing-power GPUs, such as Nvidia GPUs, are used to get high-speed processing performance[15]. Based on the experimental results, the Nvidia hardware could process a maximum of 12 frames per second of size 720x480 to detect lane lines, pedestrians, and cars.

Fig. 4: Detection results with processing speed.

B. Camera specifications
1) Field of view (FOV): It is defined by the horizontal and vertical (α) angles of the camera with which it can capture the view.
2) Frames per second (FPS): It is the speed at which the camera sensor can capture images.

C. Communication Medium delay (Tc)
There is a large network inside the autonomous vehicle system for communication among the processing hardware, the controlling hardware, and the I/O devices. The communication protocols can be LIN, CAN, FlexRay, SPI, etc. [9]. The CAN protocol has a data rate of around 1 Mbps.

Fig. 5: Autonomous vehicle hardware network.


D. Trajectory planning time (St)
The processed image outputs and the other sensor outputs are fed to the trajectory planning algorithm. Based on these inputs and the localization of the vehicle, the trajectory path is planned by the planning algorithm. This planned path is followed by the vehicle to move forward safely.

E. Actuation control hardware processing time (Tac)
It is the time required to process the output of the trajectory planning and produce the actuation control signals.

F. Actuation and control distance or braking distance (Db)
Based on the planned trajectory, it is the responsibility of the actuation control system to control the motors and actuators so that the vehicle follows the planned trajectory at its maximum allowed velocity. This distance can be calculated using the trial and error method or by considering the inertia of the vehicle, and can be taken as the maximum braking distance.

G. Minimum space to avoid the object (Da)
The minimum space required to avoid the object can be calculated from the width and length of the source vehicle and the assumed maximum dimensions of the objects the camera is going to detect.

H. Maximum speed of vehicle (Sv)
The maximum speed at which the vehicle is allowed to run in all favorable conditions is an important factor to be considered while deciding the position of the camera sensor.

I. Maximum blind space allowed (Bd)
When the camera is fixed at a specific height, the maximum range of the camera and the blind space are decided by the tilt angle of the camera. If we increase the range, the blind space near the vehicle increases, and vice versa. This is to be selected based on the complete sensor architecture of the vehicle.

J. Minimum acceptable resolution
We can achieve the maximum detection range as well as less blind space by increasing the height of the camera sensor on the vehicle. But the image resolution decreases as we increase the height of the camera, and due to bad image quality the object detection can be wrong.

VII. CAMERA POSITION ESTIMATION
All the above parameters are important to be considered while deciding the camera position for autonomous vehicles. The camera position will be selected such that it best satisfies all the above parameters. Our target here is to calculate the height (h) and tilt angle (θ) of the camera.

Let the maximum height of the objects to be detected by the camera be 'Ho'. The reason for keeping the top of the camera FOV exactly at 'Ho' is to capture exactly the information we need for detection and to neglect unnecessary information; it also helps to cover more of the blind spot by keeping the camera at the minimum height, which preserves the information in the image with better resolution.

Fig. 6: Geometry for camera positioning.

The time taken for object detection and producing the actuation signals is

t1 = Tp + St + Tac + Tc                                               [4]

The vehicle travelling distance for time t1 is

d1 = Sv / t1                                                          [5]

The minimum distance the vehicle requires for braking and taking a safe path, or to tackle an object of the maximum defined size, is

d2 = Da + Db                                                          [6]

Once the objects are detected in a frame, the required decisions are taken by the system. As the objects are not stationary and the other environmental conditions keep changing, the decisions of the vehicle keep changing with respect to successive frames. So before the next frame is captured, the vehicle might have traveled some distance, and it is necessary to take this into consideration.

Let the camera speed be Cs fps.                                       [8]

Then the distance travelled between each frame is

d3 = Sv (m/s) / Cs                                                    [9]

By observing the geometry, we can write

tan θ = (h - Ho) / (d1 + d2 + d3)                                     [10]

tan((90 - α) - θ) = Bd / h                                            [11]

Equation [11] can be written as

(tan(90 - α) - tan θ) / (1 + tan(90 - α) · tan θ) = Bd / h            [12]

By substituting the known values and solving these equations, we get the optimum position of the camera in the form of its height (h) and tilt angle (θ), which best suits the requirements for positioning the camera at its optimum position.
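To make the procedure concrete, the sketch below solves equations [10] and [11] numerically for h and θ under placeholder inputs (not the paper's experimental values); here α is read as the full vertical FOV and θ as the depression of the upper FOV ray, which is how the two equations above are interpreted in this sketch.

    import math

    def solve_camera_pose(D, Ho, Bd, alpha_deg, steps=2000):
        """Solve, for the camera height h and tilt angle theta,
            tan(theta) = (h - Ho) / D            [10]
            tan((90 - alpha) - theta) = Bd / h   [11]
        with D = d1 + d2 + d3, Ho = max object height, Bd = allowed blind space,
        alpha = vertical FOV. Both equations are merged into
        g(theta) = Bd*tan(alpha + theta) - (Ho + D*tan(theta)) and the smallest
        root is taken, which corresponds to the minimum camera height."""
        def g(t):
            return (Bd * math.tan(math.radians(alpha_deg + t))
                    - (Ho + D * math.tan(math.radians(t))))

        t_max = 90.0 - alpha_deg - 0.01          # keep alpha + theta below 90 deg
        prev_t, prev_g = 0.0, g(0.0)
        for i in range(1, steps + 1):
            t = t_max * i / steps
            cur_g = g(t)
            if prev_g * cur_g < 0:               # bracketed a sign change
                lo, hi = prev_t, t
                for _ in range(60):              # bisection refinement
                    mid = 0.5 * (lo + hi)
                    if (g(mid) > 0) == (g(lo) > 0):
                        lo = mid
                    else:
                        hi = mid
                theta = 0.5 * (lo + hi)
                h = Ho + D * math.tan(math.radians(theta))   # back-substitute [10]
                return h, theta
            prev_t, prev_g = t, cur_g
        raise ValueError("no feasible camera pose for these parameters")

    # Placeholder inputs: 50 m look-ahead (d1 + d2 + d3), 1.8 m max object height,
    # 1.5 m allowed blind space, 55 degree vertical FOV.
    h, theta = solve_camera_pose(D=50.0, Ho=1.8, Bd=1.5, alpha_deg=55.0)
    print("h = %.2f m, theta = %.2f deg" % (h, theta))

With these placeholder numbers the solver returns a mounting height of roughly 2.2 m with an almost horizontal upper FOV ray; the paper's own measured parameter values would be substituted in the same way.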
VIII. CONCLUSION
It is experimentally proved that a camera mounted at the calculated position gives improved results. The optimized position of the camera covers the required region of detection efficiently and reduces the number of sensors required, so the data processing to be done and the cost of the overall system are decreased.


The camera was able to estimate the object's distance more accurately. By fusing the camera system output with the other sensors, the results can be improved further. The monocular camera has some limitations: its FOV is not very wide, which may restrict the camera position to a higher minimum height.

REFERENCES
[1] Zong, W., Zhang, C., Wang, Z., Zhu, J. and Chen, Q., 2018. Architecture design and implementation of an autonomous vehicle. IEEE Access, 6, pp.21956-21970.
[2] Campbell, S., O'Mahony, N., Krpalcova, L., Riordan, D., Walsh, J., Murphy, A. and Ryan, C., 2018, June. Sensor technology in autonomous vehicles: A review. In 2018 29th Irish Signals and Systems Conference (ISSC) (pp. 1-4). IEEE.
[3] Taraba, M., Adamec, J., Danko, M. and Drgona, P., 2018, May. Utilization of modern sensors in autonomous vehicles. In 2018 ELEKTRO (pp. 1-5). IEEE.
[4] Singh, S.A.A.K., Negi, A. and Mudali, S., 2017, January. Analysis of automatic sensing model of an autonomous vehicle. In 2017 International Conference on Inventive Systems and Control (ICISC) (pp. 1-5). IEEE.
[5] Megalingam, R.K., Shriram, V., Likhith, B., Rajesh, G. and Ghanta, S., 2016, January. Monocular distance estimation using pinhole camera approximation to avoid vehicle crash and back-over accidents. In 2016 10th International Conference on Intelligent Systems and Control (ISCO) (pp. 1-5). IEEE.
[6] Ali, A.A. and Hussein, H.A., 2016, May. Distance estimation and vehicle position detection based on monocular camera. In 2016 Al-Sadeq International Conference on Multidisciplinary in IT and Communication Science and Applications (AIC-MITCSA) (pp. 1-4). IEEE.
[7] Rezaei, M., Terauchi, M. and Klette, R., 2015. Robust vehicle detection and distance estimation under challenging lighting conditions. IEEE Transactions on Intelligent Transportation Systems, 16(5), pp.2723-2743.
[8] Kim, S.H., Seo, S.H., Kim, J.H., Moon, T.M., Son, C.W., Hwang, S.H. and Jeon, J.W., 2008, July. A gateway system for an automotive system: LIN, CAN, and FlexRay. In 2008 6th IEEE International Conference on Industrial Informatics (pp. 967-972). IEEE.
[9] Talbot, S.C. and Ren, S., 2009, June. Comparison of fieldbus systems CAN, TTCAN, FlexRay and LIN in passenger vehicles. In 2009 29th IEEE International Conference on Distributed Computing Systems Workshops (pp. 26-31). IEEE.
[10] Hernández, D.C., Filonenko, A., Shahbaz, A. and Jo, K.H., 2017, July. Lane marking detection using image features and line fitting model. In 2017 10th International Conference on Human System Interactions (HSI) (pp. 234-238). IEEE.
[11] Bottazzi, V.S., Borges, P.V. and Jo, J., 2013, June. A vision-based lane detection system combining appearance segmentation and tracking of salient points. In 2013 IEEE Intelligent Vehicles Symposium (IV) (pp. 443-448). IEEE.
[12] Pang, Y., Zhang, K., Yuan, Y. and Wang, K., 2014. Distributed object detection with linear SVMs. IEEE Transactions on Cybernetics, 44(11), pp.2122-2133.
[13] MathWorks India. What is camera calibration? - MATLAB & Simulink. [online] Available at: https://in.mathworks.com/help/vision/ug/camera-calibration.html [Accessed 21 April 2019].
[14] World Health Organization, 2019. Number of road traffic deaths. [online] Available at: https://www.who.int/gho/road_safety/mortality/number_text/en/ [Accessed 15 March 2019].
[15] NVIDIA, 2019. Self-driving cars technology & solutions from NVIDIA Automotive. [online] Available at: https://www.nvidia.com/en-us/self-driving-cars/ [Accessed 7 April 2019].
[16] Ramalingam, S. and Sturm, P., 2017. A unifying model for camera calibration. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(7), pp.1309-1319.
[17] Jo, K., Kim, J., Kim, D., Jang, C. and Sunwoo, M., 2014. Development of autonomous car - Part I: Distributed system architecture and development process. IEEE Transactions on Industrial Electronics, 61(12), pp.7131-7140.
[18] Liu, L., Li, H., Dai, Y. and Pan, Q., 2018. Robust and efficient relative pose with a multi-camera system for autonomous driving in highly dynamic environments. IEEE Transactions on Intelligent Transportation Systems, 19(8), pp.2432-2444.
[19] Jeon, J., Hwang, S.H. and Moon, H., 2016, August. Monocular vision-based object recognition for autonomous vehicle driving in a real driving environment. In 2016 13th International Conference on Ubiquitous Robots and Ambient Intelligence (URAI) (pp. 393-399). IEEE.
[20] Gu, S., Lu, T., Zhang, Y., Alvarez, J.M., Yang, J. and Kong, H., 2018. 3-D LiDAR + monocular camera: An inverse-depth-induced fusion framework for urban road detection. IEEE Transactions on Intelligent Vehicles, 3(3), pp.351-360.
[21] Dalal, N. and Triggs, B., 2005, June. Histograms of oriented gradients for human detection. In 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) (Vol. 1, pp. 886-893). IEEE Computer Society.
[22] Bilal, M. and Hanif, M.S., 2019. Benchmark revision for HOG-SVM pedestrian detector through reinvigorated training and evaluation methodologies. IEEE Transactions on Intelligent Transportation Systems.