
CN112529011B - Target detection method and related device - Google Patents

Target detection method and related device

Info

Publication number
CN112529011B
CN112529011B
Authority
CN
China
Prior art keywords
data
point cloud
cloud data
feature
image data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011455234.7A
Other languages
Chinese (zh)
Other versions
CN112529011A (en)
Inventor
欧勇盛
熊荣
江国来
王志扬
瞿炀炀
徐升
赛高乐
刘超
吴新宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Institute of Advanced Technology of CAS filed Critical Shenzhen Institute of Advanced Technology of CAS
Priority to CN202011455234.7A priority Critical patent/CN112529011B/en
Publication of CN112529011A publication Critical patent/CN112529011A/en
Application granted granted Critical
Publication of CN112529011B publication Critical patent/CN112529011B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/56Extraction of image or video features relating to colour
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/80Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06T7/85Stereo camera calibration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a target detection method and a related device. The target detection method comprises the following steps: acquiring image data and point cloud data; respectively carrying out feature extraction on the image data and the point cloud data to obtain first feature data corresponding to the image data and second feature data corresponding to the point cloud data; fusing the first characteristic data and the second characteristic data to obtain third characteristic data; and obtaining a target detection result according to the third characteristic data. The method can meet all-weather requirements, and the cost is greatly reduced.

Description

Target detection method and related device
Technical Field
The invention relates to the technical field of robots, in particular to a target detection method and a related device.
Background
During the running of a train, obstacles ahead are one of the main factors affecting running safety; for this reason, how to detect obstacles in front of the train has received much attention.
At present, an image recognition sensing system or a radar sensing system carried by the train is generally used to sense the surrounding environment so as to detect obstacles in front of the train. However, the vision-based image recognition sensing system has low robustness when the train is under weak illumination, so it cannot meet all-weather detection requirements; and when obstacles are detected with a radar sensing system, a high-beam (multi-line) laser radar is costly, while a low-beam laser radar has a resolution and refresh rate too low to meet the requirements.
Disclosure of Invention
The application provides a target detection method and a target detection device; the target detection method can solve the problems of existing methods that an image recognition sensing system cannot meet all-weather detection requirements and that a radar sensing system is costly.
In order to solve the technical problems, the application adopts a technical scheme that: a target detection method is provided. The method comprises the following steps: acquiring image data and point cloud data; respectively carrying out feature extraction on the image data and the point cloud data to obtain first feature data corresponding to the image data and second feature data corresponding to the point cloud data; fusing the first characteristic data and the second characteristic data to obtain third characteristic data; and obtaining a target detection result according to the third characteristic data.
The method for extracting the characteristics of the image data and the point cloud data respectively to obtain first characteristic data corresponding to the image data and second characteristic data corresponding to the point cloud data comprises the following steps: inputting the image data and the point cloud data into a trained network model, carrying out feature extraction on the image data by utilizing a first feature extraction module in the network model to obtain first feature data of corresponding image data, and carrying out feature extraction on the point cloud data by utilizing a second feature extraction module in the network model to obtain second feature data of corresponding point cloud data; fusing the first characteristic data and the second characteristic data to obtain third characteristic data, wherein the method comprises the following steps: and fusing the first characteristic data and the second characteristic data by utilizing a characteristic fusion module in the network model to obtain third characteristic data.
The first feature extraction module and/or the second feature extraction module has a ResNet structure.
The first feature extraction module and/or the second feature extraction module comprises a plurality of convolution layers, and each convolution layer comprises a BN layer connected with the convolution layer.
The method comprises the steps of utilizing a feature fusion module in a network model to fuse the first feature data and the second feature data to obtain third feature data, and further comprising: creating a fusion loss function; fitting the third characteristic data by using a fusion loss function to obtain a loss value; the loss value is constrained to correct the third characteristic data.
Wherein the method further comprises: constructing a network model; inputting a preset data set into the network model, and training the network model for a plurality of times to correct parameters of the network model.
Wherein after the image data and the point cloud data are acquired, the method further comprises the steps of: and carrying out coordinate transformation on the point cloud data so as to enable the coordinate systems of the point cloud data and the image data to be consistent.
The coordinate transformation of the point cloud data comprises the following steps: converting the world coordinate system of the point cloud data into a camera coordinate system; converting the camera coordinate system into an image coordinate system; and translating the image coordinate system so as to enable the coordinate system of the point cloud data to be consistent with the coordinate system of the image data.
After the coordinate transformation is performed on the point cloud data, the method further comprises the following steps: and filtering out data outside the common field of view of the point cloud data and the image data.
Wherein after the image data and the point cloud data are acquired, the method further comprises the steps of: and filtering the ground point cloud in the point cloud data, wherein the ground point cloud is a set of points with a distance smaller than a preset distance from the ground.
Wherein filtering the ground point cloud in the point cloud data comprises: obtaining a ground model; determining a set of points with the distance smaller than a preset distance from the ground model in the point cloud data as a ground point cloud; the ground point cloud in the point cloud data is filtered.
Wherein, acquire ground model, include: randomly selecting 3 points from the point cloud data in each iteration to establish a plane model; according to the distance between each point in the point cloud data and the plane model, determining an inner point and an outer point; when the number of the inner points meets the preset requirement, a ground model is built by utilizing the inner points meeting the preset requirement.
In order to solve the technical problems, the application adopts another technical scheme that: an object detection device is provided. The object detection device comprises a memory and a processor which are connected with each other; wherein the memory is used for storing program instructions for realizing the target detection method; the processor is configured to execute the program instructions stored in the memory.
In order to solve the technical problems, the application adopts another technical scheme that: a computer-readable storage medium is provided. The computer readable storage medium stores a program file executable by a processor to implement the above-mentioned object detection method.
According to the target detection method and the related device, image data and point cloud data are acquired, and feature extraction is performed on the image data and the point cloud data respectively to obtain first characteristic data corresponding to the image data and second characteristic data corresponding to the point cloud data; the first characteristic data and the second characteristic data are then fused to obtain third characteristic data; finally, a target detection result is obtained according to the third characteristic data. Because the method detects the target based on both the image data and the point cloud data, under weak illumination the target can still be detected from point cloud data acquired by a high-beam laser radar, so the method can meet all-weather requirements; and under good illumination the target is detected from the image data together with point cloud data acquired by a low-beam laser radar, so that, compared with a scheme that detects the target with a high-beam laser radar in all weather conditions, the cost is greatly reduced.
Drawings
In order to more clearly illustrate the technical solutions of the present application, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a target detection method according to an embodiment of the present application;
FIG. 2 is a flowchart of a method for filtering a ground point cloud in point cloud data according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of a network architecture according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a target detection apparatus according to an embodiment of the present application;
Fig. 5 is a schematic structural diagram of a computer readable storage medium according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The terms "first," "second," "third," and the like in this disclosure are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first", "a second", and "a third" may explicitly or implicitly include at least one such feature. In the description of the present application, the meaning of "plurality" means at least two, for example, two, three, etc., unless specifically defined otherwise. All directional indications (such as up, down, left, right, front, back … …) in the embodiments of the present application are merely used to explain the relative positional relationship, movement, etc. between the components in a particular gesture (as shown in the drawings), and if the particular gesture changes, the directional indication changes accordingly. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
The present application will be described in detail with reference to the accompanying drawings and examples.
Referring to fig. 1, fig. 1 is a flowchart of a target detection method according to an embodiment of the application; in this embodiment, a target detection method is provided, which is particularly useful for detecting an obstacle in front of a running vehicle, so as to reduce traffic accidents caused by obstacles ahead.
Specifically, the method comprises the following steps:
step S11: image data and point cloud data are acquired.
Specifically, the image data of the environment to be detected can be obtained through the vision sensor, and meanwhile, the point cloud data of the environment to be detected can be obtained through the laser radar; the environment to be measured may be the front and/or surrounding environment during the running of the vehicle.
Each point of the laser point cloud data consists of the three-dimensional coordinates (XYZ) of a target point, and the points together describe the environment around the laser radar and the contour of the target in the three-dimensional coordinate system. In contrast, the image data acquired by a vision sensor, such as a color camera, lies in a two-dimensional plane in which each point includes R, G and B components to provide the color information of objects in the camera's field of view. In order to make the dimensions of the acquired data consistent, the method further includes, after step S11, transforming the coordinates of the point cloud data so that the coordinate systems of the point cloud data and the image data are consistent. Specifically, the laser radar and the vision sensor can first be calibrated to obtain the intrinsic and extrinsic parameters, and the coordinates of the point cloud data are then transformed based on the obtained intrinsic and extrinsic parameters. This projects the point cloud data from the world coordinate system to the pixel coordinate system through three transformations, namely a rigid transformation, a perspective projection and a translation: the world coordinate system of the point cloud data is converted into the camera coordinate system, the camera coordinate system is converted into the image coordinate system, and the image coordinate system is then translated so that the coordinate system of the point cloud data is consistent with that of the image data.
In order to make the fields of view of the point cloud data and the image data consistent, after transforming the coordinates, the data outside the common field of view of the point cloud data and the image data can be further filtered out, so as to obtain a projection of the point cloud onto the RGB image.
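For illustration only, the following Python sketch shows one possible implementation of the three transformations and the common-field-of-view filtering described above; the function names, and the assumption that calibration yields a rotation R, a translation t and a camera intrinsic matrix K, are illustrative and not taken from the patent.

import numpy as np

def project_points_to_image(points_xyz, R, t, K):
    # Rigid transformation: lidar/world frame -> camera frame.
    pts_cam = points_xyz @ R.T + t
    # Perspective projection; the translation to the pixel origin is carried
    # by the principal point inside the intrinsic matrix K.
    depth = pts_cam[:, 2:3]
    pixels = (pts_cam @ K.T) / depth
    return pixels[:, :2], depth[:, 0]

def filter_to_common_fov(pixels, depth, width, height):
    # Keep only points in front of the camera and inside the image bounds,
    # i.e. inside the common field of view of the lidar and the camera.
    mask = (depth > 0) \
        & (pixels[:, 0] >= 0) & (pixels[:, 0] < width) \
        & (pixels[:, 1] >= 0) & (pixels[:, 1] < height)
    return pixels[mask], mask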
Specifically, in an embodiment, because each frame of point cloud data collected by the laser radar contains a large number of points, directly processing the raw data is complex. In order to simplify the processing of the point cloud data, after the point cloud data is obtained, the ground point cloud in the point cloud data can be further filtered out, so that an intelligent vehicle, a robot or the like only identifies targets near its driving path during driving; this reduces the scanning range of the laser radar, reduces the number of points to be processed, and avoids the influence of the ground point cloud. The coordinates of the point cloud data are then transformed. The ground point cloud is specifically the set of points whose distance from the ground is smaller than a preset distance; the preset distance may specifically be greater than 0.2 meters, for example 0.3 meters.
Referring to fig. 2, fig. 2 is a flowchart of a method for filtering ground point clouds in point cloud data according to an embodiment of the present application; specifically, the method for filtering the ground point cloud in the point cloud data may include:
Step S21: a ground model is obtained.
Specifically, a three-point RANSAC algorithm can be adopted to obtain the ground model. The RANSAC algorithm proceeds as follows: 1) randomly select a small set of points and assume they are inliers, then fit a model to them; the model fits these hypothetical inliers, and all unknown parameters can be computed from them; 2) test all other data against this model, and if a point fits the fitted model, consider it an inlier as well, thereby expanding the inlier set; 3) if enough points are classified as inliers, the estimated model is considered reasonable; 4) re-estimate the model with all assumed inliers to update it; 5) finally, evaluate the model by the error of the inliers with respect to the model.
In a specific embodiment, step S21 may specifically include randomly selecting 3 points from the point cloud data in each iteration to build a plane model, and then determining inliers and outliers according to the distance between each point in the point cloud data and the plane model. Specifically, the plane parameters can be calculated with the equation ax + by + cz + d = 0 as the plane model; assuming the plane is perpendicular to the Z axis, the number of iterations is set to n and the fitting precision to a, that is, points whose distance from the obtained plane is less than or equal to a are regarded as inliers, and points whose distance is greater than a are regarded as outliers. When the number of inliers meets the preset requirement, the ground model is built using the inliers that meet the preset requirement.
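As a non-authoritative illustration of the three-point RANSAC ground fitting described above, the following Python sketch fits a plane ax + by + cz + d = 0 and classifies inliers by point-to-plane distance; the iteration count and thresholds are assumed example values, not parameters from the patent.

import numpy as np

def fit_ground_plane_ransac(points, n_iters=100, inlier_thresh=0.2):
    # points: (N, 3) lidar points; returns (a, b, c, d) and a boolean inlier mask.
    best_plane, best_mask, best_count = None, None, 0
    for _ in range(n_iters):
        # 1) Randomly pick 3 points and build a candidate plane through them.
        p1, p2, p3 = points[np.random.choice(len(points), 3, replace=False)]
        normal = np.cross(p2 - p1, p3 - p1)
        norm = np.linalg.norm(normal)
        if norm < 1e-6:                      # degenerate (collinear) sample, skip
            continue
        normal = normal / norm
        d = -normal @ p1
        # 2) Inliers are points whose distance to the plane is <= the precision a.
        dist = np.abs(points @ normal + d)
        mask = dist <= inlier_thresh
        # 3) Keep the plane supported by the largest number of inliers.
        if mask.sum() > best_count:
            best_plane, best_mask, best_count = np.append(normal, d), mask, mask.sum()
    return best_plane, best_mask

# Ground removal: drop the points within the preset distance of the fitted plane.
# plane, ground_mask = fit_ground_plane_ransac(cloud)
# cloud_without_ground = cloud[~ground_mask]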
Step S22: and determining a set of points with the distance smaller than a preset distance from the ground model in the point cloud data as a ground point cloud.
Step S23: the ground point cloud in the point cloud data is filtered.
Specifically, the implementation of step S23 may refer to existing methods for filtering the ground point cloud from point cloud data and can achieve the same or similar technical effects; details are not described herein.
Step S12: and respectively carrying out feature extraction on the image data and the point cloud data to obtain first feature data corresponding to the image data and second feature data corresponding to the point cloud data.
Specifically, the method further includes pre-training the network model before step S12 is executed; the learning rate and the Batch Size are extremely important parameters in the network model training process. In a specific implementation, pre-training the network model includes constructing the network model, inputting a preset data set into the network model, and training the network model several times to correct its parameters, so that the parameters of the network model are reasonably selected and optimized. The preset data set can be a plurality of samples acquired in advance, each sample including image data and point cloud data, and the parameters of the network model include at least the learning rate and the Batch Size. The learning rate directly influences the convergence of the network model: if the learning rate is too large, the network model converges prematurely, making it difficult to obtain a globally optimal result; if the learning rate is too small, the network model converges slowly, which greatly increases the training cost. The Batch Size affects the generalization performance of the network model; it is the amount of data required for one training iteration, and its size is limited by the memory of the training platform and related to the size of the network input window. Therefore, by optimizing these parameters, the network model obtains strong robustness and high accuracy. Specifically, the Batch Size is generally set to an integer multiple of 4, and the larger the Batch Size, the higher the initial learning rate, so as to ensure the model effect.
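A minimal training sketch in Python/PyTorch is given below to illustrate how the Batch Size and the initial learning rate could be chosen together; the concrete values, the linear learning-rate scaling and the optimizer choice are assumptions for illustration, not settings specified in the patent.

import torch
from torch.utils.data import DataLoader

# Assumed example values: Batch Size as an integer multiple of 4, with the
# initial learning rate scaled up for a larger batch (the scaling rule itself
# is an assumption, not a value taken from the patent).
BATCH_SIZE = 8
INITIAL_LR = 1e-3 * (BATCH_SIZE / 4)

def pretrain(model, dataset, loss_fn, epochs=50):
    # Each sample of the preset data set pairs image data with point cloud data.
    loader = DataLoader(dataset, batch_size=BATCH_SIZE, shuffle=True)
    optimizer = torch.optim.SGD(model.parameters(), lr=INITIAL_LR, momentum=0.9)
    for _ in range(epochs):
        for image, cloud, target in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(image, cloud), target)   # correct model parameters
            loss.backward()
            optimizer.step()
    return model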
Specifically, referring to fig. 3, fig. 3 is a schematic structural diagram of a network architecture according to an embodiment of the present application; in a specific embodiment, after the processing in step S11, the image data and the point cloud data are input into a trained network model, and the image data and the point cloud data are preprocessed so that the formats of the image data and the point cloud data are the same; specifically, the format of the preprocessed image data and the point cloud data may be 160×640×3.
Then, the first feature extraction module 31 in the network model is utilized to perform feature extraction on the image data to obtain first feature data corresponding to the image data, and the second feature extraction module 32 in the network model is utilized to perform feature extraction on the point cloud data to obtain second feature data corresponding to the point cloud data. The first feature extraction module 31 and the second feature extraction module 32 are adopted to extract features of the image data and the point cloud data respectively, so that the processes of extracting the two features are not interfered with each other, and the effectiveness of extracting features of different data is effectively ensured.
In a specific embodiment, in order to avoid the phenomenon of gradient disappearance or gradient explosion in training caused by excessive number of convolution layers, each convolution layer may include a BN (Batch Normalization) layer connected to the convolution layer, so as to ensure stable training.
Specifically, in view of the limited computing performance of the autonomous driving platform, the first feature extraction module 31 and the second feature extraction module 32 may use a lightweight convolutional network as the feature extraction network. Specifically, the first feature extraction module 31 and/or the second feature extraction module 32 may have a ResNet structure, which effectively increases the depth of the network and enhances its feature extraction capability, so as to better extract more hidden features.
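The following PyTorch sketch illustrates two independent feature-extraction branches built from convolution layers each followed by a BN layer, with ResNet-style identity shortcuts; the channel widths, depths and block layout are illustrative assumptions and not the patented network.

import torch
import torch.nn as nn

class ConvBN(nn.Module):
    # One convolution layer with its BN layer, as described above.
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.bn(self.conv(x)))

class ResidualBlock(nn.Module):
    # Lightweight ResNet-style block with an identity shortcut.
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(ConvBN(channels, channels), ConvBN(channels, channels))

    def forward(self, x):
        return x + self.body(x)

def make_branch(in_ch=3, width=32, depth=2):
    # One feature extraction branch; the image branch and the point cloud
    # branch each get their own instance so the two extractions do not interfere.
    layers = [ConvBN(in_ch, width, stride=2)]
    layers += [ResidualBlock(width) for _ in range(depth)]
    return nn.Sequential(*layers)

# Two independent branches over inputs preprocessed to the same 160x640x3 format.
image_branch = make_branch()   # first feature extraction module
cloud_branch = make_branch()   # second feature extraction module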
Step S13: and fusing the first characteristic data and the second characteristic data to obtain third characteristic data.
Specifically, the first feature data and the second feature data are fused by using a feature fusion module 33 in the network model, so as to obtain third feature data.
Specifically, in an embodiment, in order to obtain a better feature data fusion effect, the method further includes, after step S13, creating a fusion loss function, then fitting the third characteristic data with the fusion loss function to obtain a loss value, and then constraining the loss value to correct the third characteristic data. The fusion loss function may specifically be the MSE loss function, i.e., the mean square error loss function; with this loss constraint, the two kinds of feature data can be better fused together.
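The following PyTorch sketch shows one possible feature fusion module and an MSE-based fusion loss; the concatenation-plus-1x1-convolution fusion and the choice of fitting targets for the loss are assumptions for illustration only, not necessarily the patented design.

import torch
import torch.nn as nn

class FusionModule(nn.Module):
    # Fuses the first and second feature data into the third feature data.
    def __init__(self, channels):
        super().__init__()
        self.mix = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, feat_img, feat_cloud):
        # Channel-wise concatenation followed by a 1x1 convolution (one simple choice).
        return self.mix(torch.cat([feat_img, feat_cloud], dim=1))

mse = nn.MSELoss()   # mean square error loss used as the fusion loss

def fusion_loss(fused, feat_img, feat_cloud):
    # Constrain the fused (third) feature data to stay close to both
    # single-modality feature maps; this choice of targets is an assumption.
    return 0.5 * (mse(fused, feat_img) + mse(fused, feat_cloud))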
It will be appreciated that the network model has two inputs and one output: the inputs are the RGB image data obtained by the camera and the XYZ point cloud data encoded as an image, and the output is the third characteristic data.
Step S14: and obtaining a target detection result according to the third characteristic data.
Specifically, the third characteristic data is processed to obtain targets such as people or vehicles in the environment to be detected, and then a target detection result is obtained.
It should be noted that the training and testing data used in the present application is based on the laser point cloud and left color camera data in the KITTI target detection dataset.
Specifically, a network model that performs feature-level fusion of laser point cloud data and visual image data is designed based on a deep convolutional neural network, so that the trained network model has strong robustness and high accuracy. Therefore, under good lighting conditions, the color image data and depth image acquired by an RGB camera can be used to judge the type and the distance of the obstacle respectively, while under special conditions such as low illumination or rain and fog, the laser radar can be used for real-time obstacle avoidance. Furthermore, the method can detect targets such as people and vehicles on the road surface under extreme weather or large illumination changes, effectively avoiding traffic accidents and other incidents.
According to the target detection method provided by this embodiment, image data and point cloud data are acquired, and feature extraction is performed on the image data and the point cloud data respectively to obtain first characteristic data corresponding to the image data and second characteristic data corresponding to the point cloud data; the first characteristic data and the second characteristic data are then fused to obtain third characteristic data; finally, a target detection result is obtained according to the third characteristic data. Because the method detects the target based on both the image data and the point cloud data, under weak illumination the target can still be detected from point cloud data acquired by a high-beam laser radar, so the method can meet all-weather requirements; and under good illumination the target is detected from the image data together with point cloud data acquired by a low-beam laser radar, so that, compared with a scheme that detects the target with a high-beam laser radar in all weather conditions, the cost is greatly reduced.
Referring to fig. 4, fig. 4 is a schematic structural diagram of an object detection device according to an embodiment of the application; in the present embodiment, there is provided an object detection apparatus 500, the object detection apparatus 500 including a memory 501 and a processor 502 connected to each other; wherein the memory 501 is used to store program instructions for implementing the object detection method according to the above-described embodiments; the processor 502 is configured to execute program instructions stored in the memory 501.
The processor 502 may also be referred to as a CPU (Central Processing Unit ). The processor 502 may be an integrated circuit chip with signal processing capabilities. The processor 502 may also be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 501 may be a memory bank, a TF card, or the like, and may store all information in the object detection apparatus 500, including input raw data, a computer program, intermediate operation results, and final operation results, which are stored in the memory 501. It stores and retrieves information according to the location specified by the controller. With the memory 501, the object detection device 500 has a memory function, so that normal operation can be ensured. The memory 501 in the object detection device 500 may be classified into a main memory (memory) and an auxiliary memory (external memory) according to the purpose, and may be classified into an external memory and an internal memory. The external memory is usually a magnetic medium, an optical disk, or the like, and can store information for a long period of time. The memory refers to a storage component on the motherboard for storing data and programs currently being executed, but is only used for temporarily storing programs and data, and the data is lost when the power supply is turned off or the power is turned off.
The object detection device 500 further includes other components, which are the same as other components and functions of the object detection device in the prior art, and will not be described herein.
Referring to fig. 5, fig. 5 is a schematic structural diagram of a computer readable storage medium according to an embodiment of the application. In the present embodiment, a computer-readable storage medium storing a program file 600 is provided, the program file 600 being executable by a processor to implement the target detection method according to the above-described embodiment.
The program file 600 may be stored in the form of a software product in the computer-readable storage medium, and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor (processor) to perform all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage device includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, an optical disk, or other various media capable of storing program codes, or a terminal device such as a computer, a server, a mobile phone, a tablet, or the like.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of elements is merely a logical functional division, and there may be additional divisions of actual implementation, e.g., multiple elements or components may be combined or integrated into another system, or some features may be omitted, or not performed.
The foregoing is only the embodiments of the present application, and therefore, the patent scope of the application is not limited thereto, and all equivalent structures or equivalent processes using the descriptions of the present application and the accompanying drawings, or direct or indirect application in other related technical fields, are included in the scope of the application.

Claims (11)

1. A method of target detection, the method comprising:
Acquiring image data and point cloud data;
Respectively carrying out feature extraction on the image data and the point cloud data to obtain first feature data corresponding to the image data and second feature data corresponding to the point cloud data;
fusing the first characteristic data and the second characteristic data to obtain third characteristic data;
obtaining a target detection result according to the third characteristic data;
The feature extraction is performed on the image data and the point cloud data respectively to obtain first feature data corresponding to the image data and second feature data corresponding to the point cloud data, and the feature extraction comprises the following steps:
inputting the image data and the point cloud data into a trained network model, so as to perform feature extraction on the image data by using a first feature extraction module in the network model to obtain first feature data corresponding to the image data, and performing feature extraction on the point cloud data by using a second feature extraction module in the network model to obtain second feature data corresponding to the point cloud data;
The fusing the first feature data and the second feature data to obtain third feature data includes:
Fusing the first characteristic data and the second characteristic data by utilizing a characteristic fusion module in the network model to obtain third characteristic data;
The first feature extraction module and/or the second feature extraction module comprises a plurality of convolution layers, and each convolution layer comprises a BN layer connected with the convolution layer;
The method for fusing the first feature data and the second feature data by using the feature fusion module in the network model further comprises the following steps of:
Creating a fusion loss function;
fitting the third characteristic data by using the fusion loss function to obtain a loss value;
And constraining the loss value to correct the third characteristic data.
2. The method according to claim 1, wherein
The first feature extraction module and/or the second feature extraction module has a ResNet structure.
3. The method according to claim 1, wherein
The method further comprises the steps of:
Constructing a network model;
and inputting a preset data set into the network model, and training the network model for a plurality of times to correct parameters of the network model.
4. The method according to claim 1, wherein
After the image data and the point cloud data are acquired, the method further comprises the following steps:
and carrying out coordinate transformation on the point cloud data so as to enable the coordinate systems of the point cloud data and the image data to be consistent.
5. The method according to claim 4, wherein
The coordinate transformation of the point cloud data comprises the following steps:
Converting the world coordinate system of the point cloud data into a camera coordinate system;
converting the camera coordinate system into an image coordinate system;
and translating the image coordinate system so as to enable the coordinate system of the point cloud data to be consistent with the coordinate system of the image data.
6. The method according to claim 4, wherein
After the coordinate transformation is performed on the point cloud data, the method further comprises the following steps:
And filtering out data outside the common field of view of the point cloud data and the image data.
7. The method according to claim 1, wherein
After the image data and the point cloud data are acquired, the method further comprises the following steps:
and filtering the ground point cloud in the point cloud data, wherein the ground point cloud is a set of points with a distance smaller than a preset distance from the ground.
8. The method according to claim 7, wherein
The filtering the ground point cloud in the point cloud data comprises the following steps:
obtaining a ground model;
determining a set of points, of which the distance from the point cloud data to the ground model is smaller than the preset distance, as a ground point cloud;
and filtering the ground point cloud in the point cloud data.
9. The method according to claim 8, wherein
The obtaining the ground model comprises the following steps:
Randomly selecting 3 points from the point cloud data in each iteration to establish a plane model;
determining an inner point and an outer point according to the distance between each point in the point cloud data and the plane model;
when the number of the inner points meets the preset requirement, building a ground model by utilizing the inner points meeting the preset requirement.
10. An object detection device, comprising a memory and a processor connected to each other; wherein the memory is configured to store program instructions for implementing the object detection method according to any one of claims 1 to 9; the processor is configured to execute the program instructions stored in the memory.
11. A computer readable storage medium, characterized in that a program file is stored, which program file is executable by a processor to implement the object detection method according to any of claims 1-9.
CN202011455234.7A 2020-12-10 2020-12-10 Target detection method and related device Active CN112529011B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011455234.7A CN112529011B (en) 2020-12-10 2020-12-10 Target detection method and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011455234.7A CN112529011B (en) 2020-12-10 2020-12-10 Target detection method and related device

Publications (2)

Publication Number Publication Date
CN112529011A (en) 2021-03-19
CN112529011B true CN112529011B (en) 2024-09-06

Family

ID=74998977

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011455234.7A Active CN112529011B (en) 2020-12-10 2020-12-10 Target detection method and related device

Country Status (1)

Country Link
CN (1) CN112529011B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115703234B (en) * 2021-08-03 2024-01-30 北京小米移动软件有限公司 Robot control method, device, robot and storage medium
CN114445415B (en) * 2021-12-14 2024-09-27 中国科学院深圳先进技术研究院 Method for dividing drivable region and related device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110765894A (en) * 2019-09-30 2020-02-07 杭州飞步科技有限公司 Target detection method, device, equipment and computer readable storage medium
CN111340797A (en) * 2020-03-10 2020-06-26 山东大学 Laser radar and binocular camera data fusion detection method and system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109345510A (en) * 2018-09-07 2019-02-15 百度在线网络技术(北京)有限公司 Object detecting method, device, equipment, storage medium and vehicle
CN111950426A (en) * 2020-08-06 2020-11-17 东软睿驰汽车技术(沈阳)有限公司 Target detection method and device and delivery vehicle

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110765894A (en) * 2019-09-30 2020-02-07 杭州飞步科技有限公司 Target detection method, device, equipment and computer readable storage medium
CN111340797A (en) * 2020-03-10 2020-06-26 山东大学 Laser radar and binocular camera data fusion detection method and system

Also Published As

Publication number Publication date
CN112529011A (en) 2021-03-19

Similar Documents

Publication Publication Date Title
US10929694B1 (en) Lane detection method and system based on vision and lidar multi-level fusion
WO2021233029A1 (en) Simultaneous localization and mapping method, device, system and storage medium
CN111027401B (en) End-to-end target detection method with integration of camera and laser radar
US11527077B2 (en) Advanced driver assist system, method of calibrating the same, and method of detecting object in the same
CN110414418B (en) Road detection method for multi-scale fusion of image-laser radar image data
CN112740225B (en) Method and device for determining road surface elements
CN116229408A (en) Target identification method for fusing image information and laser radar point cloud information
CN115393680B (en) 3D target detection method and system for multi-mode information space-time fusion in foggy weather scene
CN112529011B (en) Target detection method and related device
CN114495064A (en) Monocular depth estimation-based vehicle surrounding obstacle early warning method
CN110992424B (en) Positioning method and system based on binocular vision
CN115372990A (en) High-precision semantic map building method and device and unmanned vehicle
CN112683228A (en) Monocular camera ranging method and device
CN115327572A (en) Method for detecting obstacle in front of vehicle
CN114295139A (en) Cooperative sensing positioning method and system
CN115100741B (en) Point cloud pedestrian distance risk detection method, system, equipment and medium
CN117111055A (en) Vehicle state sensing method based on thunder fusion
CN116778262B (en) Three-dimensional target detection method and system based on virtual point cloud
Wang et al. Pedestrian detection based on YOLOv3 multimodal data fusion
CN111833443A (en) Landmark position reconstruction in autonomous machine applications
CN118038411A (en) Remote obstacle detection method and system based on laser radar and camera
CN107862873B (en) A kind of vehicle count method and device based on relevant matches and state machine
CN112733678A (en) Ranging method, ranging device, computer equipment and storage medium
CN114384486A (en) Data processing method and device
CN114648639B (en) Target vehicle detection method, system and device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant