
WO2020243962A1 - Object detection method, electronic device and movable platform - Google Patents

Object detection method, electronic device and movable platform (物体检测方法、电子设备和可移动平台)

Info

Publication number
WO2020243962A1
Authority
WO
WIPO (PCT)
Prior art keywords
dimensional
compensation value
pixel
information
candidate
Prior art date
Application number
PCT/CN2019/090393
Other languages
English (en)
French (fr)
Inventor
张磊杰
陈晓智
徐斌
Original Assignee
深圳市大疆创新科技有限公司
Priority date
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司
Priority to PCT/CN2019/090393
Priority to CN201980012209.0A (CN111712828A)
Publication of WO2020243962A1


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/50: Context or environment of the image
    • G06V20/56: Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58: Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G06V20/584: Recognition of vehicle lights or traffic lights
    • G06V20/60: Type of objects
    • G06V20/64: Three-dimensional objects
    • G06V20/647: Three-dimensional objects by matching two-dimensional images to three-dimensional objects
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103: Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G06V2201/00: Indexing scheme relating to image or video recognition or understanding
    • G06V2201/08: Detecting or categorising vehicles

Definitions

  • the embodiments of the present application relate to the technical field of movable platforms, and in particular, to an object detection method, electronic equipment, and a movable platform.
  • Obstacle detection is one of the key technologies of an automatic driving system. It uses cameras, lidar, millimeter-wave radar, and other sensors mounted on the vehicle to detect obstacles in the road scene, such as vehicles and pedestrians.
  • In the autonomous driving scene, the autonomous driving system not only needs to obtain the position of an obstacle on the image, but also needs to predict the three-dimensional positioning information of the obstacle. The accuracy of the three-dimensional positioning of obstacles directly affects the safety and reliability of autonomous vehicles.
  • the embodiments of the present application provide an object detection method, electronic equipment and a movable platform, which can reduce the cost of object detection.
  • an object detection method including:
  • an electronic device including:
  • Memory used to store computer programs
  • the processor is configured to execute the computer program, specifically:
  • an embodiment of the present application provides a movable platform, including: the electronic device provided in the second aspect of the embodiment of the present application.
  • an embodiment of the present application provides a computer storage medium in which a computer program is stored, and the computer program, when executed, implements the object detection method provided in the first aspect.
  • the embodiments of the present application provide an object detection method, electronic equipment and a movable platform.
  • the sparse point cloud data and the image are projected into a target coordinate system to obtain the data to be processed, and three-dimensional detection is performed on the data to be processed to obtain detection results of objects included in the scene to be detected. Since only sparse point cloud data needs to be acquired, the density of the point cloud data is reduced, which reduces the complexity of and requirements on the electronic equipment, and thus reduces the cost of object detection.
  • FIG. 1 is a schematic diagram of an application scenario involved in an embodiment of this application
  • FIG. 2 is a flowchart of an object detection method provided by an embodiment of the application
  • FIG. 3 is another flowchart of an object detection method provided by an embodiment of the application.
  • FIG. 4 is another flowchart of an object detection method provided by an embodiment of the application.
  • FIG. 5 is another flowchart of an object detection method provided by an embodiment of the application.
  • FIG. 6 is another flowchart of an object detection method provided by an embodiment of the application.
  • FIG. 7 is a schematic structural diagram of an electronic device provided by an embodiment of the application.
  • FIG. 8 is a schematic structural diagram of a lidar provided by an embodiment of the application.
  • words such as "first" and "second" are used to distinguish identical or similar items with substantially the same function and effect. Those skilled in the art can understand that words such as "first" and "second" do not limit the quantity or order of execution, and do not require that the items be different.
  • the embodiments of the present application can be applied to any field that needs to detect objects.
  • it is applied to the field of intelligent driving such as automatic driving and assisted driving, which can detect obstacles such as vehicles and pedestrians in road scenes.
  • it can be applied to the field of drones, and can detect obstacles in the drone flight scene.
  • Another example is the security field, which detects objects entering a designated area.
  • the object detection method provided in the embodiments of the present application can use a low-complexity neural network and, while ensuring the accuracy of object detection, makes the detection scheme applicable across multiple platforms.
  • FIG. 1 is a schematic diagram of an application scenario involved in an embodiment of this application.
  • a smart driving vehicle includes a detection device.
  • while the vehicle is driving, the detection equipment can identify and detect objects in the lane ahead (such as falling rocks, spilled cargo, dead branches, pedestrians, and vehicles), obtain detection information such as the three-dimensional position, orientation, and three-dimensional size of each object, and plan the driving state based on this detection information, for example changing lanes, decelerating, or stopping.
  • the detection equipment may include radar, ultrasonic detection equipment, Time Of Flight (TOF) ranging detection equipment, visual detection equipment, laser detection equipment, image sensors, etc., and combinations thereof.
  • the image sensor may be a camera, video camera, etc.
  • the radar may be a general-purpose lidar or a specific lidar that meets the requirements of a specific scenario, such as a rotating scanning multi-line lidar with multiple transmitters and multiple receivers, etc.
  • FIG. 1 is a schematic diagram of an application scenario of this application, and the application scenario of the embodiment of this application includes but is not limited to that shown in FIG. 1.
  • FIG. 2 is a flowchart of an object detection method provided by an embodiment of the application.
  • the execution subject may be an electronic device.
  • the object detection method provided in this embodiment includes:
  • S201 Acquire sparse point cloud data and images of the scene to be detected.
  • point cloud data refers to a collection of point data on the surface of an object obtained by a measuring device.
  • point cloud data can be divided into sparse point cloud data and dense point cloud data. For example, it can be divided according to the distance between points and the number of points. When the distance between points is relatively large and the number of points is relatively small, it can be called sparse point cloud data. When the distance between points is relatively small and the number of points is relatively large, it can be called dense point cloud data or high-density point cloud data.
  • the acquisition of dense point clouds requires a lidar with a high beam count scanning at a relatively high frequency.
  • such high-beam-count lidar entails higher usage costs, and continuously running the lidar at a high scanning frequency shortens its service life.
  • other methods of obtaining dense point clouds, such as stitching the point clouds of multiple single-line lidars, require complex algorithms, and the robustness of such systems is relatively low.
  • since only sparse point cloud data needs to be obtained, compared with obtaining high-density point cloud data, the difficulty of acquiring the point cloud data is reduced, as are the requirements on the equipment and the cost of the equipment. Therefore, in everyday application scenarios, sparse point clouds have better practical value than dense point clouds.
  • this embodiment does not limit the scene to be detected, and it may be different according to the type of electronic device and different application scenarios.
  • the scene to be detected may be the road ahead of the vehicle.
  • the scene to be detected may be the flight environment when the drone is flying.
  • acquiring sparse point cloud data and images of the scene to be detected may include:
  • the sparse point cloud data is acquired by a radar sensor, and the image is acquired by an image sensor.
  • the image sensor can be a camera, a video camera, and so on.
  • the number of image sensors can be one.
  • the number of radar sensors can be one or more than one.
  • acquiring sparse point cloud data through radar sensors may include:
  • corresponding first sparse point cloud data is acquired through each radar sensor, and the first sparse point cloud data corresponding to each radar sensor is projected into the target radar coordinate system according to the external parameters of the at least one radar sensor, to obtain the sparse point cloud data.
  • the radar coordinate systems corresponding to multiple radar sensors have a certain conversion relationship, and the conversion relationship may be determined by the external parameters of the radar sensor, which is not limited in this embodiment.
  • the external parameters of the radar sensor include but are not limited to the arrangement of the radar sensor, position, orientation angle, carrier speed, acceleration, etc.
  • the first sparse point cloud data collected by each radar sensor can be projected into the target radar coordinate system to obtain the sparse point cloud data in the target radar coordinate system.
  • the target radar coordinate system may be a radar coordinate system corresponding to any one of the multiple radar sensors.
  • the target radar coordinate system is another radar coordinate system, and the target radar coordinate system has a certain conversion relationship with the radar coordinate system corresponding to each radar sensor.
  • the target radar coordinate system may be the radar coordinate system 1 corresponding to the radar sensor 1.
  • the sparse point cloud data may include: a collection of the sparse point cloud data collected by the radar sensor 2 into the radar coordinate system 1 and the sparse point cloud data collected by the radar sensor 1.
  • if overlapping point cloud data exists after the first sparse point cloud data corresponding to each radar sensor is projected into the target radar coordinate system, data deduplication processing is performed.
  • the sparse point cloud data may include the three-dimensional position coordinates of each point, which may be marked as (x, y, z).
  • the sparse point cloud data may also include the laser reflection intensity value of each point.
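  • As a hedged illustration of the multi-radar case described above, the sketch below fuses the first sparse point clouds of several radar sensors into one target radar coordinate system using per-sensor extrinsic (external-parameter) transforms, and then removes near-duplicate points. The 4x4 extrinsic matrices, the voxel size used for deduplication, and the N x 4 array layout (x, y, z plus laser reflection intensity) are illustrative assumptions, not details taken from the patent.

```python
import numpy as np

def fuse_sparse_clouds(clouds, extrinsics, dedup_voxel=0.05):
    """Project per-sensor sparse clouds into a target radar frame and deduplicate.

    clouds:      list of (N_i, 4) arrays, columns = x, y, z, reflection intensity.
    extrinsics:  list of (4, 4) matrices mapping each sensor frame to the target frame.
    dedup_voxel: points falling into the same voxel of this size are treated as duplicates.
    """
    fused = []
    for cloud, T in zip(clouds, extrinsics):
        xyz1 = np.hstack([cloud[:, :3], np.ones((cloud.shape[0], 1))])  # homogeneous coords
        xyz_target = (T @ xyz1.T).T[:, :3]                               # into target radar frame
        fused.append(np.hstack([xyz_target, cloud[:, 3:4]]))
    fused = np.vstack(fused)

    # Simple deduplication: keep one point per occupied voxel.
    keys = np.floor(fused[:, :3] / dedup_voxel).astype(np.int64)
    _, keep = np.unique(keys, axis=0, return_index=True)
    return fused[np.sort(keep)]
```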
  • S202 Project the sparse point cloud data and the image into the target coordinate system, and obtain the data to be processed.
  • the target coordinate system may be an image coordinate system corresponding to the image sensor.
  • the target coordinate system may also be other coordinate systems, which is not limited in this embodiment.
  • projecting sparse point cloud data and images into the target coordinate system to obtain the data to be processed may include:
  • the sparse point cloud data and image are projected into the image coordinate system to obtain the data to be processed.
  • the target coordinate system is the image coordinate system corresponding to the image sensor.
  • the sparse point cloud data can be accurately mapped and matched with some pixels in the image, and the sparse point cloud data outside the image coordinate system can be filtered out. For example, suppose the length of the image is H and the width is W. Then, by projecting the sparse point cloud data and the image into the image coordinate system, the sparse point cloud data outside the H×W range can be filtered out to obtain the data to be processed.
  • the data to be processed may include: the coordinate value and reflectance of each point in the target coordinate system where the sparse point cloud data is projected, and the coordinate value of the pixel in the image in the target coordinate system.
  • it should be noted that the points in the sparse point cloud data and the pixels in the image may not be in complete one-to-one correspondence.
  • for a pixel that matches a projected point, the reflectivity of the corresponding point in the target coordinate system can be set to the laser reflection intensity value of that point in the sparse point cloud data.
  • for a pixel that does not match any projected point, the reflectivity of the corresponding position in the target coordinate system can be set to zero.
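  • To make the projection step concrete, here is a minimal sketch that projects the fused points into the image plane with a pinhole camera model, discards points outside the H x W image, and fills a sparse reflectance channel aligned with the image pixels. The intrinsic matrix K, the lidar-to-camera extrinsic T, and the choice of a dense H x W reflectance map are illustrative assumptions; the patent only specifies that out-of-range points are filtered and unmatched pixels get zero reflectivity.

```python
import numpy as np

def project_to_image(points, K, T_lidar_to_cam, H, W):
    """Return pixel coordinates with depth and an H x W reflectance map.

    points: (N, 4) array in the target radar frame, columns = x, y, z, reflection intensity.
    K:      (3, 3) camera intrinsic matrix.
    T_lidar_to_cam: (4, 4) extrinsic transform from the radar frame to the camera frame.
    """
    xyz1 = np.hstack([points[:, :3], np.ones((len(points), 1))])
    cam = (T_lidar_to_cam @ xyz1.T).T[:, :3]
    in_front = cam[:, 2] > 0                               # keep points in front of the camera
    cam, intensity = cam[in_front], points[in_front, 3]

    uv = (K @ cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]                            # perspective divide -> pixel coordinates
    u, v = uv[:, 0].astype(int), uv[:, 1].astype(int)

    inside = (u >= 0) & (u < W) & (v >= 0) & (v < H)       # filter points outside the H x W image
    u, v = u[inside], v[inside]
    depth, intensity = cam[inside, 2], intensity[inside]

    reflectance = np.zeros((H, W), dtype=np.float32)       # unmatched pixels keep reflectivity 0
    reflectance[v, u] = intensity
    return np.stack([u, v, depth], axis=1), reflectance
```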
  • S203 Perform three-dimensional detection on the data to be processed, and obtain a detection result of objects included in the scene to be detected.
  • the object information may include at least one of the following: three-dimensional position information, orientation information, three-dimensional size information, and depth value of the object.
  • the object detection method provided in this embodiment can obtain detection results of objects included in the scene to be detected based on the sparse point cloud data and images by acquiring sparse point cloud data and images of the scene to be detected. Since only sparse point cloud data needs to be acquired, the density of the point cloud data is reduced, thus reducing the complexity and requirements of electronic equipment, and reducing the cost of object detection.
  • FIG. 3 is another flowchart of the object detection method provided by the embodiment of the application. As shown in FIG. 3, in the above S203, performing three-dimensional detection on the data to be processed and obtaining the detection result of the objects included in the scene to be detected may include:
  • the basic network model may be pre-trained and used to output feature maps according to the data to be processed. It should be noted that this embodiment does not limit the implementation of the basic network model, and different neural network models may be used according to actual needs, for example, a convolutional neural network model.
  • the basic network model can include several layers of convolution and pooling operations according to actual needs, and finally output a feature map.
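  • The patent leaves the basic network model open ("several layers of convolution and pooling"). A minimal convolutional backbone in that spirit might look like the PyTorch sketch below; the channel counts, the number of stages, and the use of an RGB image plus one reflectance channel as input are illustrative assumptions rather than details from the disclosure.

```python
import torch
import torch.nn as nn

class BasicBackbone(nn.Module):
    """Toy basic network model: a few conv + pooling stages that output a feature map."""
    def __init__(self, in_channels=4, base=32):       # 4 = RGB + projected reflectance (assumed)
        super().__init__()
        layers, c = [], in_channels
        for out_c in (base, base * 2, base * 4):
            layers += [
                nn.Conv2d(c, out_c, kernel_size=3, padding=1),
                nn.BatchNorm2d(out_c),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(2),                       # halve the spatial resolution
            ]
            c = out_c
        self.features = nn.Sequential(*layers)

    def forward(self, x):                              # x: (B, 4, H, W) data to be processed
        return self.features(x)                        # feature map: (B, 128, H/8, W/8)
```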
  • S302 Input the feature map into the candidate area network model, and obtain a two-dimensional frame of the candidate object.
  • the candidate area network model may be pre-trained and used to output the two-dimensional frames of candidate objects according to the feature map. It should be noted that this embodiment does not limit the implementation of the candidate area network model, and different neural network models may be used according to actual needs, for example, a convolutional neural network model.
  • the two-dimensional frames of the candidate objects correspond to the objects included in the scene to be detected. Each object included in the scene to be detected may correspond to the two-dimensional frames of multiple candidate objects.
  • at this stage, the specific types of the objects are not distinguished. For example, assuming that the objects included in the scene to be detected are two vehicles and one pedestrian, the number of two-dimensional frames of candidate objects obtained may be 100; the two vehicles and the pedestrian jointly correspond to these 100 candidate two-dimensional frames. In the subsequent steps, it can be determined which object each of the 100 candidate two-dimensional frames corresponds to.
  • S303 Determine objects included in the scene to be detected according to the two-dimensional frame of the candidate object, and obtain a compensation value of the information of the object.
  • the object included in the scene to be detected can be determined according to the two-dimensional frame of the candidate object, and the compensation value of the object information can be obtained.
  • the compensation value of the object information may include, but is not limited to, at least one of the following: the compensation value of the orientation of the object, the compensation value of the three-dimensional position information of the object, the compensation value of the three-dimensional size of the object, and the compensation value of the two-dimensional frame of the object.
  • the compensation value of the orientation of the object is the difference between the actual value of the orientation of the object and the preset orientation.
  • the compensation value of the three-dimensional position information of the object is the difference between the actual value of the three-dimensional position of the object and the preset three-dimensional position.
  • the compensation value of the three-dimensional size of the object is the difference between the actual value of the three-dimensional size of the object and the preset three-dimensional size.
  • the compensation value of the two-dimensional frame of the object is the difference between the actual value of the two-dimensional frame of the object and the preset value.
  • this embodiment does not limit the specific values of the preset orientation, preset three-dimensional position, preset three-dimensional size, and preset values of the two-dimensional frame of the object.
  • the preset three-dimensional position may be the three-dimensional position of the center point of the chassis of the vehicle.
  • the preset three-dimensional size can be different according to different vehicle models.
  • S304 Acquire the information of the object according to the compensation value of the information of the object.
  • the data to be processed is sequentially input into the basic network model and the candidate area network model to obtain the two-dimensional frames of the candidate objects; the objects included in the scene to be detected and the compensation values of the object information are then determined according to the two-dimensional frames of the candidate objects, and the object information is obtained according to the compensation values of the object information.
  • in this way, first obtaining the compensation value of the object information is easy to implement and yields higher accuracy, which improves the accuracy of object detection.
  • FIG. 4 is another flowchart of the object detection method provided by the embodiment of the application.
  • inputting the feature map into the candidate area network model to obtain the two-dimensional frame of the candidate object may include:
  • S401 Acquire the probability that each pixel in the image belongs to the object according to the feature map.
  • S402 For each pixel that is determined, according to its probability, to belong to an object (referred to as a first pixel), acquire the two-dimensional frame of the object corresponding to the first pixel.
  • S403 Obtain the two-dimensional frames of the candidate objects according to the probability that each first pixel belongs to an object and the two-dimensional frame of the object corresponding to each first pixel.
  • the resolution of the image is 100*50, that is, there are 5000 pixels.
  • the probability of each of the 5000 pixels belonging to the object can be obtained.
  • the probability that the pixel 1 belongs to the object is P1
  • the pixel 1 is determined to belong to the object according to the probability P1.
  • the pixel 1 can be called the first pixel, and the two-dimensional frame 1 of the object corresponding to the pixel 1 can be obtained.
  • the probability that the pixel 2 belongs to the object is P2, and it is determined that the pixel 2 does not belong to the object according to the probability P2.
  • the probability that the pixel 3 belongs to the object is P3. According to the probability P3, it can be determined that the pixel 3 belongs to the object.
  • the pixel 3 can be called the first pixel, and the two-dimensional frame 3 of the object corresponding to the pixel 3 can be obtained. Assuming that according to the probability that 5000 pixels belong to the object, it is determined that the first pixel is 200, then a two-dimensional frame of 200 objects can be obtained. After that, it is further screened according to the probability that the 200 first pixels belong to the object and the two-dimensional frame of the object corresponding to the 200 first pixels, and the two-dimensional frame of the candidate object is obtained from the two-dimensional frame of the 200 objects. For example, there may be 50 two-dimensional boxes of candidate objects finally obtained.
  • in this way, the two-dimensional frames of the objects corresponding to a subset of the pixels are obtained first, and these two-dimensional frames are then screened again to determine the two-dimensional frames of the candidate objects, which improves the accuracy of obtaining the two-dimensional frames of the candidate objects.
  • determining whether the pixel belongs to the object according to the probability that the pixel belongs to the object may include:
  • the probability that the pixel belongs to the object is greater than or equal to the preset value, it is determined that the pixel belongs to the object.
  • the probability that the pixel belongs to the object is less than the preset value, it is determined that the pixel does not belong to the object.
  • This embodiment does not limit the specific value of the preset value.
  • acquiring the two-dimensional frame of the candidate object according to the probability that the first pixel belongs to the object and the two-dimensional frame of the object corresponding to the first pixel may include:
  • the first pixel to be processed is obtained from the first set composed of a plurality of first pixels, and the first pixel to be processed is deleted from the first set to obtain the updated first set.
  • the first pixel to be processed is the first pixel with the highest probability of belonging to the object in the first set.
  • the associated value between each first pixel and the first pixel to be processed is obtained.
  • the associated value is used to indicate the degree of overlap between the two-dimensional frame of the object corresponding to each first pixel and the two-dimensional frame of the object corresponding to the first pixel to be processed.
  • For example, assume that pixels 1 to 4 are first pixels and form the initial first set.
  • the probability P2 that pixel 2 belongs to the object in the first set is the largest.
  • Pixel 2 can be called the first pixel to be processed, and update the first set to ⁇ pixel 1, pixel 3, pixel 4 ⁇ .
  • assume that the two-dimensional frame of the object corresponding to pixel 1 has a high degree of overlap with the two-dimensional frame of the object corresponding to pixel 2, and that the two-dimensional frame of the object corresponding to pixel 4 also has a high degree of overlap with the two-dimensional frame of the object corresponding to pixel 2. Therefore, pixel 1 and pixel 4 can be deleted from the first set, which completes the deduplication with respect to the two-dimensional frame of the object corresponding to pixel 2.
  • the first set only includes pixel 3.
  • the pixel 3 is acquired from the first set again, and the pixel 3 may be referred to as the first pixel to be processed.
  • the first set does not include the first pixel.
  • the two-dimensional frames of the objects corresponding to pixel 2 and pixel 3 may then be determined as the two-dimensional frames of the candidate objects; that is, there are two candidate two-dimensional frames.
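  • The selection procedure in S403 is essentially a greedy, score-ordered suppression of overlapping boxes. The sketch below implements it with intersection-over-union as the "associated value"; the IoU formulation and the threshold name are assumptions, since the patent only requires some measure of overlap compared against a preset value.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all given as (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter + 1e-9)

def select_candidate_boxes(boxes, scores, overlap_thresh=0.5):
    """Greedy suppression: repeatedly keep the highest-probability first pixel's box
    and drop boxes whose overlap with it exceeds the preset value."""
    order = np.argsort(scores)[::-1]           # first set, sorted by probability
    keep = []
    while order.size:
        best = order[0]
        keep.append(best)
        rest = order[1:]
        order = rest[iou(boxes[best], boxes[rest]) <= overlap_thresh]
    return boxes[keep]                          # two-dimensional frames of the candidate objects
```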
  • FIG. 5 is another flowchart of the object detection method provided by an embodiment of the application. As shown in FIG. 5, in the foregoing S303, determining the objects included in the scene to be detected according to the two-dimensional frame of the candidate object may include:
  • S501 Input the two-dimensional frame of the candidate object into the first three-dimensional detection network model, and obtain the probability that the candidate object belongs to each of the preset objects.
  • the first three-dimensional detection network model may be pre-trained and used to output the probability that the candidate object belongs to each of the preset objects according to the two-dimensional frame of the candidate object. It should be noted that this embodiment does not limit the implementation of the first three-dimensional detection network model, and different neural network models may be used according to actual needs, for example, a convolutional neural network model. This embodiment does not limit the specific categories of the preset objects.
  • the preset objects may include, but are not limited to, vehicles, bicycles, and pedestrians.
  • the candidate objects can be labeled as candidate objects 1 to 3
  • the two-dimensional boxes of candidate objects can be labeled as two-dimensional boxes 1 to 3.
  • the preset objects include vehicles and pedestrians.
  • the probability that candidate object 1 belongs to a vehicle is P11, and the probability that it belongs to a pedestrian is P12.
  • the probability that candidate object 2 belongs to a vehicle is P21, and the probability that it belongs to a pedestrian is P22.
  • the probability that candidate object 3 belongs to a vehicle is P31, and the probability that it belongs to a pedestrian is P32.
  • S502 If the probability that the candidate object belongs to a first object among the preset objects is greater than the preset threshold corresponding to the first object, it is determined that the candidate object is an object included in the scene to be detected.
  • the example in S501 is also used for description.
  • the preset threshold corresponding to the vehicle is Q1
  • the preset threshold corresponding to the pedestrian is Q2.
  • if P11 > Q1, P21 < Q1, and P31 > Q1, it can be determined that candidate object 1 and candidate object 3 are objects included in the scene to be detected, and both are vehicles.
  • if P12 < Q2, P22 < Q2, and P32 < Q2, it means that no pedestrians are included in the scene to be detected.
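  • A hedged sketch of this per-class thresholding step: each candidate box receives class probabilities from the first three-dimensional detection network model, and a candidate is kept only for classes whose probability exceeds that class's preset threshold. The class names and threshold values below are placeholders chosen for illustration.

```python
def filter_candidates(class_probs, thresholds):
    """class_probs: list of per-candidate dicts, e.g. {"vehicle": 0.92, "pedestrian": 0.10}.
    thresholds:  dict of per-class preset thresholds, e.g. {"vehicle": 0.6, "pedestrian": 0.7}.
    Returns (candidate index, class) pairs that are objects included in the scene."""
    detected = []
    for i, probs in enumerate(class_probs):
        for cls, p in probs.items():
            if p > thresholds[cls]:             # probability greater than the preset threshold
                detected.append((i, cls))
    return detected

# Mirrors the example above: candidates at indices 0 and 2 (candidate objects 1 and 3)
# exceed the vehicle threshold, and none exceed the pedestrian threshold.
probs = [{"vehicle": 0.92, "pedestrian": 0.10},
         {"vehicle": 0.40, "pedestrian": 0.20},
         {"vehicle": 0.88, "pedestrian": 0.15}]
print(filter_candidates(probs, {"vehicle": 0.6, "pedestrian": 0.7}))  # [(0, 'vehicle'), (2, 'vehicle')]
```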
  • obtaining the compensation value of the object information may include:
  • when the two-dimensional frame of the candidate object is input into the first three-dimensional detection network model, at least one of the following compensation values is also obtained: the compensation value of the orientation of the candidate object, the compensation value of the three-dimensional position information of the candidate object, the compensation value of the two-dimensional frame of the candidate object, and the compensation value of the three-dimensional size of the candidate object.
  • the compensation value corresponding to the candidate object is determined as the compensation value of the object information.
  • the first three-dimensional detection network model can not only output the probability that the candidate object belongs to each of the preset objects according to the two-dimensional frame of the candidate object, but also output the compensation value of the candidate object information at the same time.
  • when the candidate object is determined to be an object included in the scene to be detected according to the probability that it belongs to each of the preset objects (see the description of S502 for details), the compensation value corresponding to that candidate object can be determined as the compensation value of the object information.
  • the example in S501 is also used for description. Since candidate object 1 and candidate object 3 are vehicles included in the scene to be detected, the compensation values respectively corresponding to candidate object 1 and candidate object 3 may be determined as compensation values of the vehicle information included in the scene to be detected.
  • [-180°, 180°] can be equally divided into multiple intervals, and the center of each interval is set as the preset orientation.
  • the preset orientation may be -150°.
  • the interval to which the orientation of the candidate object belongs and the compensation value of the orientation of the candidate object can be output through the first three-dimensional detection network model.
  • the compensation value is the difference between the actual value of the orientation of the candidate object and the center of the interval to which it belongs.
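  • As a worked illustration of this orientation encoding, [-180°, 180°] can be split into equal bins whose centers serve as the preset orientations; the network predicts the bin and an offset (the compensation value), and the decoded orientation is the bin center plus the offset. Six 60° bins are assumed here purely so that one bin center lands at -150°, matching the example above; the actual bin count is not fixed by the patent.

```python
import numpy as np

NUM_BINS = 6                                    # assumed; gives centers -150, -90, ..., 150 degrees
BIN_WIDTH = 360.0 / NUM_BINS
BIN_CENTERS = -180.0 + BIN_WIDTH / 2 + BIN_WIDTH * np.arange(NUM_BINS)

def encode_orientation(theta_deg):
    """Return (bin index, compensation value) for a ground-truth orientation in degrees."""
    bin_idx = int((theta_deg + 180.0) // BIN_WIDTH) % NUM_BINS
    return bin_idx, theta_deg - BIN_CENTERS[bin_idx]

def decode_orientation(bin_idx, compensation):
    """Recover the orientation from the predicted bin and compensation value."""
    return BIN_CENTERS[bin_idx] + compensation

bin_idx, comp = encode_orientation(-138.0)      # falls in the bin centered at -150 degrees
print(bin_idx, comp)                            # 0, 12.0
print(decode_orientation(bin_idx, comp))        # -138.0
```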
  • in this way, the two-dimensional frame of the candidate object is input into the first three-dimensional detection network model, and both the probability that the candidate object belongs to each of the preset objects and the compensation value of the candidate object's information can be obtained; thus, when the candidate object is determined to be an object included in the scene to be detected, the compensation value of the object information is obtained at the same time.
  • FIG. 6 is another flowchart of the object detection method provided by the embodiment of the application. As shown in FIG. 6, in the foregoing S303, determining the objects included in the scene to be detected according to the two-dimensional frame of the candidate object may include:
  • S601 Input the two-dimensional frame of the candidate object into the semantic prediction network model, and obtain the probability that the candidate object belongs to each of the preset objects.
  • S602 Determine the objects included in the scene to be detected according to the probability that the candidate object belongs to each of the preset objects.
  • the semantic prediction network model may be pre-trained and used to output the probability that the candidate object belongs to each of the preset objects according to the two-dimensional frame of the candidate object. It should be noted that this embodiment does not limit the implementation of the semantic prediction network model, and different neural network models can be used according to actual needs, for example, a convolutional neural network model. This embodiment does not limit the specific categories of the preset objects.
  • the preset objects may include, but are not limited to, vehicles, bicycles, and pedestrians.
  • obtaining the compensation value of the object information may include:
  • the two-dimensional frame of the object included in the scene to be detected is input into the second three-dimensional detection network model, and the compensation value of the object information is obtained.
  • the compensation value includes at least one of the following: the compensation value of the orientation of the object, the compensation value of the three-dimensional position information of the object, the compensation value of the two-dimensional frame of the object, and the compensation value of the three-dimensional size of the object.
  • the second three-dimensional detection network model may be pre-trained and used to output the compensation value of the object information according to the two-dimensional frame of the object. It should be noted that this embodiment does not limit the implementation of the second three-dimensional detection network model, and different neural network models may be used according to actual needs, for example, a convolutional neural network model.
  • the difference between this embodiment and the embodiment shown in FIG. 5 lies in that: in this embodiment, two models of the semantic prediction network model and the second three-dimensional detection network model are involved.
  • the output of the semantic prediction network model is the probability that the candidate object belongs to each of the preset objects.
  • the compensation value of the object information is output through the second three-dimensional detection network model.
  • in the embodiment shown in FIG. 5, only the first three-dimensional detection network model is involved.
  • the first three-dimensional detection network model can simultaneously output the probability that the candidate object belongs to each of the preset objects and the compensation value of the candidate object's information.
  • the compensation value of the object information includes the compensation value of the orientation of the object.
  • obtaining the information of the object according to the compensation value of the information of the object may include:
  • in this way, the orientation information of the object can be obtained according to the compensation value of the orientation of the object and the preset orientation.
  • the compensation value of the object information includes the compensation value of the three-dimensional position information of the object.
  • obtaining the information of the object according to the compensation value of the information of the object may include:
  • the three-dimensional position information of the object is acquired according to the compensation value of the three-dimensional position information of the object and the three-dimensional position information of the reference point of the object.
  • the three-dimensional position information of the object can be obtained according to the compensation value of the three-dimensional position information of the object.
  • the compensation value of the object information includes the compensation value of the three-dimensional size of the object.
  • obtaining the information of the object according to the compensation value of the information of the object may include:
  • the three-dimensional size information of the object is obtained according to the compensation value of the three-dimensional size of the object and the reference value of the three-dimensional size corresponding to the object.
  • the three-dimensional size information of the object can be obtained according to the compensation value of the three-dimensional size of the object.
  • the compensation value of the object information includes the compensation value of the two-dimensional frame of the object.
  • obtaining the information of the object according to the compensation value of the information of the object may include:
  • the position information of the two-dimensional frame of the object is acquired according to the compensation value of the two-dimensional frame of the object.
  • the depth value of the object can be obtained according to the compensation value of the two-dimensional frame of the object.
  • depending on the content of the compensation value of the object information, the various implementations described above can be combined with each other to obtain at least one of the following pieces of object information: the orientation information of the object, the three-dimensional position information of the object, the three-dimensional size information of the object, and the depth value of the object.
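  • Putting the above together, the final object information is obtained by adding each compensation value back to its reference: the orientation bin center, the three-dimensional position of the reference point, the reference three-dimensional size, and the candidate two-dimensional frame. The field names and data layout below are illustrative assumptions; only the "reference plus compensation" pattern comes from the description.

```python
from dataclasses import dataclass

@dataclass
class Detection3D:
    orientation: float      # degrees
    position: tuple         # (x, y, z)
    size: tuple             # (length, width, height)
    box2d: tuple            # (x1, y1, x2, y2)

def apply_compensation(ref_orientation, ref_position, ref_size, candidate_box,
                       d_orient, d_pos, d_size, d_box):
    """Each piece of object information = its reference value + the predicted compensation."""
    position = tuple(r + d for r, d in zip(ref_position, d_pos))
    size = tuple(r + d for r, d in zip(ref_size, d_size))
    box2d = tuple(r + d for r, d in zip(candidate_box, d_box))
    return Detection3D(ref_orientation + d_orient, position, size, box2d)
```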
  • obtaining the depth value of the object according to the position information of the two-dimensional frame of the object may include:
  • the position information of the two-dimensional frame of the object is input into the first region segmentation network model to obtain sparse point cloud data on the surface of the object.
  • the sparse point cloud data on the surface of the object is clustered and segmented to obtain the sparse point cloud data of the target point on the surface of the object.
  • the first region segmentation network model may be pre-trained and used to output sparse point cloud data on the surface of the object according to the position information of the two-dimensional frame of the object. It should be noted that this embodiment does not limit the implementation of the first region segmentation network model, and different neural network models may be used according to actual needs, for example, a convolutional neural network model.
  • the target point on the vehicle may be a raised point on the rear of the vehicle.
  • the target point of the pedestrian can be a point on the pedestrian's head, and so on.
  • obtaining the depth value of the object according to the position information of the two-dimensional frame of the object may include:
  • the position information of the two-dimensional frame of the object is input into the second region segmentation network model to obtain sparse point cloud data of the target surface on the surface of the object.
  • the second region segmentation network model may be pre-trained and used to output sparse point cloud data of the target surface on the surface of the object according to the position information of the two-dimensional frame of the object. It should be noted that this embodiment does not limit the implementation of the second region segmentation network model, and different neural network models can be used according to actual needs, for example, a convolutional neural network model.
  • sparse point cloud data of the target surface on the surface of the object can be obtained, so that the depth value of the object can be determined.
  • this embodiment does not limit the position of the target surface.
  • if the direction of travel of the vehicle is the same as the direction of movement of the electronic device, the target surface on the vehicle may be the rear of the vehicle; if the direction of travel of the vehicle is opposite to the direction of movement of the electronic device, the target surface on the vehicle may be the front of the vehicle.
  • the target surface of the pedestrian can be the head of the pedestrian, and so on.
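  • One plausible way to turn the segmented points into a depth value, consistent with the description above, is to cluster the sparse points that fall inside the refined two-dimensional frame and read the depth of the target cluster (for example its median). The simple gap-based 1-D clustering on depth and the median statistic are assumptions made for this sketch.

```python
import numpy as np

def depth_from_box(points_uvz, box2d, gap=1.0):
    """Estimate the object's depth from sparse points inside its two-dimensional frame.

    points_uvz: (N, 3) array of projected points, columns = u, v, depth (from the projection step).
    box2d:      (x1, y1, x2, y2) two-dimensional frame of the object.
    gap:        depth gap (in metres) separating clusters, e.g. object surface vs. background.
    """
    x1, y1, x2, y2 = box2d
    u, v, z = points_uvz[:, 0], points_uvz[:, 1], points_uvz[:, 2]
    inside = (u >= x1) & (u <= x2) & (v >= y1) & (v <= y2)
    depths = np.sort(z[inside])
    if depths.size == 0:
        return None

    # Split sorted depths wherever consecutive values differ by more than `gap`,
    # then take the nearest (front-most) cluster as the target surface.
    splits = np.where(np.diff(depths) > gap)[0] + 1
    clusters = np.split(depths, splits)
    return float(np.median(clusters[0]))
```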
  • FIG. 7 is a schematic structural diagram of an electronic device provided by an embodiment of the application.
  • the electronic device provided in this embodiment is used to implement the object detection method provided in any implementation manner of FIG. 2 to FIG. 6.
  • the electronic device provided in this embodiment may include:
  • the memory 12 is used to store computer programs
  • the processor 11 is configured to execute the computer program, specifically:
  • the processor 11 is specifically configured to:
  • the information of the object is acquired according to the compensation value of the information of the object.
  • the processor 11 is specifically configured to:
  • the processor 11 is specifically configured to:
  • the first pixel to be processed is the first pixel with the greatest probability of belonging to an object in the first set;
  • for each first pixel in the updated first set, obtain the associated value between the first pixel and the first pixel to be processed; the associated value is used to indicate the degree of overlap between the two-dimensional frame of the object corresponding to the first pixel and the two-dimensional frame of the object corresponding to the first pixel to be processed;
  • Delete the first pixel whose associated value is greater than the preset value from the updated first set and re-execute the steps of obtaining the first pixel to be processed and updating the first set until the first set does not include the first pixel.
  • the two-dimensional frames of the objects corresponding to all of the first pixels to be processed are determined as the two-dimensional frames of the candidate objects.
  • the processor 11 is specifically configured to:
  • the objects included in the scene to be detected are acquired.
  • the processor 11 is specifically configured to:
  • At least one of the following compensation values is also obtained: the compensation value of the orientation of the candidate object, the compensation value of the three-dimensional position information of the candidate object, the compensation value of the two-dimensional frame of the candidate object, and the compensation value of the three-dimensional size of the candidate object;
  • the compensation value corresponding to the candidate object is determined as the compensation value of the information of the object.
  • the processor 11 is specifically configured to:
  • the objects included in the scene to be detected are determined.
  • the processor 11 is specifically configured to:
  • the two-dimensional frame of the object included in the scene to be detected is input into the second three-dimensional detection network model, and the compensation value of the information of the object is obtained.
  • the compensation value includes at least one of the following: the compensation value of the orientation of the object, the compensation value of the three-dimensional position information of the object, the compensation value of the two-dimensional frame of the object, and the compensation value of the three-dimensional size of the object.
  • the compensation value of the object information includes the compensation value of the orientation of the object
  • the processor 11 is specifically configured to:
  • the compensation value of the object information includes the compensation value of the three-dimensional position information of the object
  • the processor 11 is specifically configured to:
  • the compensation value of the object information includes the compensation value of the three-dimensional size of the object
  • the processor 11 is specifically configured to:
  • the three-dimensional size information of the object is acquired according to the compensation value of the three-dimensional size of the object and the reference value of the three-dimensional size of the object corresponding to the object.
  • the compensation value of the information of the object includes the compensation value of the two-dimensional frame of the object
  • the processor 11 is specifically configured to:
  • the processor 11 is specifically configured to:
  • the depth value of the object is determined according to the sparse point cloud data of the target point.
  • the processor 11 is specifically configured to:
  • the information of the object includes at least one of the following: three-dimensional position information, orientation information, three-dimensional size information, and a depth value of the object.
  • the processor 11 is specifically configured to:
  • the sparse point cloud data is acquired through at least one radar sensor, and the image is acquired through an image sensor.
  • the number of the radar sensors is greater than one; the processor 11 is specifically configured to:
  • the first sparse point cloud data corresponding to each radar sensor is projected into the target radar coordinate system to acquire the sparse point cloud data.
  • the processor 11 is specifically configured to:
  • the sparse point cloud data and the image are projected into the camera coordinate system to obtain the data to be processed.
  • the data to be processed includes: the coordinate value and reflectivity of each point in the target coordinate system projected from the sparse point cloud data, and the pixels in the image are in the target coordinate system The coordinate value.
  • the electronic device may further include a radar sensor 13 and an image sensor 14.
  • This embodiment does not limit the number and installation positions of the radar sensor 13 and the image sensor 14.
  • the electronic device provided in this embodiment is used to implement the object detection method provided in any implementation manner of FIG. 2 to FIG. 6.
  • the technical solution and the technical effect are similar, and the details are not repeated here.
  • the embodiment of the present application also provides a movable platform, which may include the electronic device provided in the embodiment shown in FIG. 7. It should be noted that this embodiment does not limit the type of the movable platform, and it can be any device that needs to perform object detection. For example, it can be a drone, a vehicle, or other means of transportation.
  • the ranging device 200 includes a ranging module 210, which includes a transmitter 203 (for example, a transmitting circuit), a collimating element 204, a detector 205 (which may include, for example, a receiving circuit, a sampling circuit, and an arithmetic circuit), and an optical path changing element 206.
  • the ranging module 210 is used to emit a light beam, receive the return light, and convert the return light into an electrical signal.
  • the transmitter 203 can be used to emit a light pulse sequence.
  • the transmitter 203 may emit a sequence of laser pulses.
  • the laser beam emitted by the transmitter 203 is a narrow-bandwidth beam with a wavelength outside the visible light range.
  • the collimating element 204 is arranged on the exit light path of the emitter 203 and is used to collimate the light beam emitted from the emitter 203 into parallel light and output it to the scanning module.
  • the collimating element 204 is also used to converge at least a part of the return light reflected by the object to be detected.
  • the collimating element 204 may be a collimating lens or other elements capable of collimating a light beam.
  • the transmitting light path and the receiving light path in the distance measuring device are combined before the collimating element 204 through the optical path changing element 206, so that the transmitting light path and the receiving light path can share the same collimating element, making the optical path more compact.
  • the transmitter 203 and the detector 205 may respectively use their own collimating elements, and the optical path changing element 206 is arranged on the optical path behind the collimating element.
  • the optical path changing element can use a small-area mirror to combine the transmitting light path and the receiving light path.
  • the optical path changing element may also use a reflector with a through hole, where the through hole is used to transmit the emitted light of the emitter 203, and the reflector is used to reflect the return light to the detector 205. In this way, the blocking of the return light by the bracket of the small mirror, which occurs when a small-area mirror is used, can be reduced.
  • the optical path changing element deviates from the optical axis of the collimating element 204.
  • the optical path changing element may also be located on the optical axis of the collimating element 204.
  • the distance measuring device 200 further includes a scanning module 202.
  • the scanning module 202 is placed on the exit light path of the distance measuring module 210.
  • the scanning module 202 is used to change the transmission direction of the collimated beam 219 emitted by the collimating element 204 and project it to the external environment, and project the return light to the collimating element 204 .
  • the returned light is collected on the detector 205 via the collimating element 204.
  • the scanning module 202 may include at least one optical element for changing the propagation path of the light beam, wherein the optical element may change the propagation path of the light beam by reflecting, refracting, or diffracting the light beam.
  • the scanning module 202 includes a lens, a mirror, a prism, a galvanometer, a grating, a liquid crystal, an optical phased array (Optical Phased Array), or any combination of the foregoing optical elements.
  • at least part of the optical elements are movable.
  • a driving module is used to drive the at least part of the optical elements to move.
  • the moving optical elements can reflect, refract, or diffract the light beam to different directions at different times.
  • the multiple optical elements of the scanning module 202 may rotate or vibrate around a common axis 209, and each rotating or vibrating optical element is used to continuously change the propagation direction of the incident light beam.
  • the multiple optical elements of the scanning module 202 may rotate at different speeds or vibrate at different speeds.
  • at least part of the optical elements of the scanning module 202 may rotate at substantially the same rotation speed.
  • the multiple optical elements of the scanning module may also rotate around different axes.
  • the multiple optical elements of the scanning module may also rotate in the same direction or in different directions; or vibrate in the same direction, or vibrate in different directions, which is not limited herein.
  • the scanning module 202 includes a first optical element 214 and a driver 216 connected to the first optical element 214.
  • the driver 216 is used to drive the first optical element 214 to rotate around the rotation axis 209, so that the first optical element 214 changes the direction of the collimated beam 219.
  • the first optical element 214 projects the collimated light beam 219 to different directions.
  • the angle between the direction of the collimated beam 219 changed by the first optical element and the rotation axis 209 changes as the first optical element 214 rotates.
  • the first optical element 214 includes a pair of opposed non-parallel surfaces through which the collimated light beam 219 passes.
  • the first optical element 214 includes a prism whose thickness varies in at least one radial direction.
  • the first optical element 214 includes a wedge prism, and the collimated beam 219 is refracted.
  • the scanning module 202 further includes a second optical element 215, the second optical element 215 rotates around the rotation axis 209, and the rotation speed of the second optical element 215 is different from the rotation speed of the first optical element 214.
  • the second optical element 215 is used to change the direction of the light beam projected by the first optical element 214.
  • the second optical element 215 is connected to another driver 217, and the driver 217 drives the second optical element 215 to rotate.
  • the first optical element 214 and the second optical element 215 can be driven by the same or different drivers, so that the rotation speed and/or rotation of the first optical element 214 and the second optical element 215 are different, so as to project the collimated light beam 219 to the outside space.
  • the controller 218 controls the drivers 216 and 217 to drive the first optical element 214 and the second optical element 215, respectively.
  • the rotational speeds of the first optical element 214 and the second optical element 215 may be determined according to the area and pattern expected to be scanned in actual applications.
  • the drivers 216 and 217 may include motors or other drivers.
  • the second optical element 215 includes a pair of opposite non-parallel surfaces through which the light beam passes. In one embodiment, the second optical element 215 includes a prism whose thickness varies in at least one radial direction. In one embodiment, the second optical element 215 includes a wedge prism.
  • the scanning module 202 further includes a third optical element (not shown) and a driver for driving the third optical element to move.
  • the third optical element includes a pair of opposite non-parallel surfaces, and the light beam passes through the pair of surfaces.
  • the third optical element includes a prism whose thickness varies in at least one radial direction.
  • the third optical element includes a wedge prism. At least two of the first, second, and third optical elements rotate at different rotation speeds and/or rotation directions.
  • each optical element in the scanning module 202 can project light to different directions, such as directions 211 and 213, so that the space around the distance measuring device 200 is scanned.
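  • For intuition about how two wedge prisms rotating at different speeds sweep the beam over the surrounding space, the sketch below traces an approximate scan pattern using the small-angle (thin-prism) deviation of roughly (n - 1) times the wedge angle for each prism, and simply sums the two deviations as rotating vectors. The refractive index, wedge angles, and rotation speeds are arbitrary illustrative values, and the paraxial model is a simplification of the real optics rather than the patent's own derivation.

```python
import numpy as np

def prism_scan_pattern(n=1.5, alpha1_deg=10.0, alpha2_deg=10.0,
                       w1_hz=30.0, w2_hz=27.0, duration_s=1.0, steps=5000):
    """Approximate (x, y) angular scan pattern produced by two rotating wedge prisms.

    Each prism deflects the beam by roughly (n - 1) * wedge_angle; the deflection
    directions rotate with the prisms, and the total deflection is their vector sum.
    """
    t = np.linspace(0.0, duration_s, steps)
    d1 = (n - 1) * np.radians(alpha1_deg)           # deviation magnitude of the first prism (214)
    d2 = (n - 1) * np.radians(alpha2_deg)           # deviation magnitude of the second prism (215)
    phi1 = 2 * np.pi * w1_hz * t                    # rotation angle of the first prism
    phi2 = 2 * np.pi * w2_hz * t                    # rotation angle of the second prism (different speed)
    x = d1 * np.cos(phi1) + d2 * np.cos(phi2)       # summed deflection, x component (radians)
    y = d1 * np.sin(phi1) + d2 * np.sin(phi2)       # summed deflection, y component (radians)
    return x, y
```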
  • the return light 212 reflected by the object to be detected 201 is incident on the collimating element 204 after passing through the scanning module 202.
  • the detector 205 and the transmitter 203 are placed on the same side of the collimating element 204, and the detector 205 is used to convert at least part of the return light passing through the collimating element 204 into an electrical signal.
  • an anti-reflection film is plated on each optical element.
  • the thickness of the antireflection coating is equal to or close to the wavelength of the light beam emitted by the emitter 203, which can increase the intensity of the transmitted light beam.
  • a filter layer is plated on the surface of an element located on the beam propagation path in the distance measuring device, or a filter is provided on the beam propagation path, for transmitting at least the wavelength band of the beam emitted by the transmitter 203 and reflecting other wavelength bands, so as to reduce the noise caused by ambient light at the receiver.
  • the transmitter 203 may include a laser diode through which nanosecond laser pulses are emitted.
  • the laser pulse receiving time can be determined, for example, by detecting the rising edge time and/or the falling edge time of the electrical signal pulse. In this way, the distance measuring device 200 can calculate the time of flight (TOF) using the pulse receiving time information and the pulse sending time information, so as to determine the distance between the object to be detected 201 and the distance measuring device 200.
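  • The distance follows directly from the time of flight: distance = c x (t_receive - t_send) / 2, where the factor of two accounts for the round trip of the pulse. A minimal illustration with assumed timestamps:

```python
C = 299_792_458.0                      # speed of light in m/s

def tof_distance(t_send_s, t_receive_s):
    """Distance between the ranging device and the detected object from pulse timestamps."""
    return C * (t_receive_s - t_send_s) / 2.0

print(tof_distance(0.0, 400e-9))       # a 400 ns round trip corresponds to about 59.96 m
```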
  • the embodiments of the present application also provide a computer storage medium.
  • the computer storage medium is used to store computer software instructions for the object detection described above; when they run on a computer, the computer can perform the various possible object detection methods in the above method embodiments. When the computer-executable instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application can be produced in whole or in part.
  • the computer instructions can be stored in a computer storage medium, or transmitted from one computer storage medium to another computer storage medium; for example, the transmission can be carried out wirelessly (such as by cellular communication, infrared, short-range wireless, or microwave) to another website, computer, server, or data center.
  • the computer storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server or data center integrated with one or more available media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, an SSD).
  • a person of ordinary skill in the art can understand that all or part of the steps in the above method embodiments can be implemented by a program instructing relevant hardware.
  • the foregoing program can be stored in a computer-readable storage medium; when the program is executed, the steps of the foregoing method embodiments are performed. The foregoing storage medium includes media that can store program code, such as read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), magnetic disks, or optical disks.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

An object detection method, an electronic device, and a movable platform. The object detection method includes: acquiring sparse point cloud data and an image of a scene to be detected (S201); projecting the sparse point cloud data and the image into a target coordinate system to obtain data to be processed (S202); and performing three-dimensional detection on the data to be processed to obtain detection results of objects included in the scene to be detected (S203). Object detection is achieved by acquiring sparse point cloud data and an image, which reduces the density of the point cloud data and therefore reduces the cost of object detection.

Description

物体检测方法、电子设备和可移动平台 技术领域
本申请实施例涉及可移动平台技术领域,尤其涉及一种物体检测方法、电子设备和可移动平台。
背景技术
障碍物检测是自动驾驶系统的关键技术之一,其利用车辆搭载的相机、激光雷达、毫米波雷达等传感器来检测道路场景中的障碍物,比如,车辆、行人,等。在自动驾驶场景中,自动驾驶系统不仅需要获得障碍物在图像上的位置,还需要预测出障碍物的三维定位信息。障碍物三维定位的精度直接影响了自动驾驶车辆的安全性和可靠性。
目前,障碍物检测定位方法通常依赖于准确的深度测量传感器,可以获得高密度点云数据,比如激光雷达传感器等。但是,准确的深度测量传感器成本高昂,导致自动驾驶系统的成本高昂。
发明内容
本申请实施例提供一种物体检测方法、电子设备和可移动平台,可以降低物体检测的成本。
第一方面,本申请实施例提供一种物体检测方法,包括:
获取待检测场景的稀疏点云数据和图像;
将所述稀疏点云数据和所述图像投影到目标坐标系中,获取待处理数据;
对所述待处理数据进行三维检测,获取所述待检测场景包括的物体的检测结果。
第二方面,本申请实施例提供一种电子设备,包括:
存储器,用于存储计算机程序;
处理器,用于执行所述计算机程序,具体用于:
获取待检测场景的稀疏点云数据和图像;
将所述稀疏点云数据和所述图像投影到目标坐标系中,获取待处理数据;
对所述待处理数据进行三维检测,获取所述待检测场景包括的物体的检测结果。
第三方面,本申请实施例提供一种可移动平台,包括:本申请实施例第二方面提供的电子设备。
第四方面,本申请实施例一种计算机存储介质,所述存储介质中存储计算机程序,所述计算机程序在执行时实现如第一方面提供的物体检测方法。
本申请实施例提供一种物体检测方法、电子设备和可移动平台,通过获取待检测场景的稀疏点云数据和图像,将稀疏点云数据和图像投影到目标坐标系中,获取待处理数据,对待处理数据进行三维检测,获取待检测场景包括的物体的检测结果。由于只需要获取稀疏点云数据,降低了点云数据的密度,因此降低了电子设备的复杂度和要求,降低了物体检测的成本。
附图说明
为了更清楚地说明本发明实施例或现有技术中的技术方案,下面将对实施例或现有技术描述中所需要使用的附图作一简单地介绍,显而易见地,下面描述中的附图是本发明的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动性的前提下,还可以根据这些附图获得其他的附图。
图1为本申请实施例涉及的一种应用场景示意图;
图2为本申请实施例提供的物体检测方法的一种流程图;
图3为本申请实施例提供的物体检测方法的另一种流程图;
图4为本申请实施例提供的物体检测方法的另一种流程图;
图5为本申请实施例提供的物体检测方法的另一种流程图;
图6为本申请实施例提供的物体检测方法的另一种流程图;
图7为本申请实施例提供的电子设备的结构示意图;
图8为本申请实施例提供的激光雷达的结构示意图。
具体实施方式
为使本发明实施例的目的、技术方案和优点更加清楚,下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例是本发明一部分实施例,而不是全部的实施例。基于 本发明中的实施例,本领域普通技术人员在没有作出创造性劳动前提下所获得的所有其他实施例,都属于本发明保护的范围。
在本申请的描述中,“多个”是指两个或多于两个。
另外,为了便于清楚描述本申请实施例的技术方案,在本申请的实施例中,采用了“第一”、“第二”等字样对功能和作用基本相同的相同项或相似项进行区分。本领域技术人员可以理解“第一”、“第二”等字样并不对数量和执行次序进行限定,并且“第一”、“第二”等字样也并不限定一定不同。
本申请实施例可以应用于任何需要对物体进行检测的领域。例如应用于自动驾驶、辅助驾驶等智能驾驶领域,可以检测道路场景中的车辆、行人等障碍物的检测。又例如,可以应用于无人机领域,可以检测无人机飞行场景中的障碍物的检测。再例如,安防领域,对进入指定区域的物体进行检测。本申请实施例提供的物体检测方法,可以适用低复杂度神经网络,在确保物体检测准确率的基础上,使检测方案在多平台上具有通用性。
示例性的,图1为本申请实施例涉及的一种应用场景示意图,如图1所示,智能驾驶车辆包括探测设备。智能驾驶车辆在行驶过程中,探测设备可以对前方车道的物体(如落石、遗撒物、枯枝、行人、车辆等)进行识别和检测,获得物体的三维位置、姿态朝向和三维尺寸等检测信息,并根据这些检测信息来规划智能驾驶的状态,例如为变道、减速或者停车等。
可选的,探测设备可以包括雷达、超声波探测设备、飞行时间测距法(Time Of Flight,TOF)测距探测设备、视觉探测设备、激光探测设备、图像传感器等及其组合。本申请实施例对于传感器的数量和实现类型不做限定。例如,图像传感器可以为相机、摄像机,等。雷达可以为通用激光雷达或者满足特定场景需求的特定激光雷达,例如具有多发多收传感器的旋转扫描式多线激光雷达,等。
需要说明的是,图1为本申请的一种应用场景示意图,本申请实施例的应用场景包括但不限于图1所示。
下面以具体地实施例对本发明的技术方案进行详细说明。下面这几个具体的实施例可以相互结合,对于相同或相似的概念或过程可能在某些实施例不再赘述。
图2为本申请实施例提供的物体检测方法的一种流程图。本实施例提供的物体检测方法,执行主体可以为电子设备。如图2所示,本实施例提供的物体检测方法,包括:
S201、获取待检测场景的稀疏点云数据和图像。
其中,点云数据是指通过测量设备得到的物体外观表面的点数据集合。根据不同的条件,点云数据可以分为稀疏点云数据和密集点云数据。例如,可以根据点与点之间的间距以及点的数量进行划分。当点与点之间的间距比较大、点数量比较少时,可以称为稀疏点云数据。当点与点之间的间距比较小、点数量比较大时,可以称为密集点云数据或者高密度点云数据。
密集点云的获取需要高线束激光雷达,在较高频率下进行扫描。而高线束激光雷达造成了较高的使用成本,激光雷达持续在较高的扫描频率会降低激光雷达的使用寿命。其它密集点云的获取方式,例如多单线激光雷达的点云拼接则需要复杂的算法,系统鲁棒性比较低。
由于只需要获取稀疏点云数据,相比于获取高密度点云数据,降低了获取点云数据的难度,降低了对设备的要求和设备的成本。因此,在日常应用场景下,稀疏点云会比利用密集点云具有更好的利用价值。
需要说明的是,本实施例对于待检测场景不做限定,根据电子设备的类型以及应用场景的不同可以有所不同。例如,当电子设备应用于自动驾驶车辆时,所述待检测场景可以为车辆行驶前方的道路。当电子设备应用于无人机时,所述待检测场景可以为无人机飞行时的飞行环境。
可选的,获取待检测场景的稀疏点云数据和图像,可以包括:
通过雷达传感器获取所述稀疏点云数据,通过图像传感器获取所述图像。
本实施例对于雷达传感器和图像传感器的数量、安装位置和实现方式不做限定。例如,图像传感器可以为相机、摄像头,等等。图像传感器的数量可以为一个。雷达传感器的数量可以为一个或者多于一个。
可选的,当雷达传感器的数量大于1时,通过雷达传感器获取稀疏点云数据,可以包括:
分别通过每个雷达传感器获取对应的第一稀疏点云数据。
根据至少一个雷达传感器的外参,将每个雷达传感器分别对应的第一稀疏点云数据投影到目标雷达坐标系中,获取稀疏点云数据。
具体的,雷达传感器为多个。多个雷达传感器对应的雷达坐标系之间,具有一定的转换关系,该转换关系可以通过雷达传感器的外参确定,本实施例对此不做限定。雷达传感器的外参包括但是不限于雷达传感器的布置方式,位置,朝向角度,载体速度,加速度等等。可以将每个雷达传感器分别采集到的第一稀疏点云数据投影到目标雷达坐标系中,获取目标雷达坐标系中的稀疏点云数据。可选的,目标雷达坐标系可以为多个雷达传感器中的任意一个雷达传感器对应的雷达坐标系。或者,目标雷达坐标系为其他的雷达坐标系,所述目标雷达坐标系与每个雷达传感器对应的雷达坐标系具有一定的转换关系。
下面通过示例进行说明。
假设雷达传感器为2个,分别称为雷达传感器1和雷达传感器2。目标雷达坐标系可以为雷达传感器1对应的雷达坐标系1。稀疏点云数据可以包括:将雷达传感器2采集到的稀疏点云数据投影到雷达坐标系1中后与雷达传感器1采集到的稀疏点云数据的集合。
可选的,若将每个雷达传感器分别对应的第一稀疏点云数据投影到目标雷达坐标系后存在重叠的点云数据,则进行数据去重处理。
通过数据去重处理,提升了获取的稀疏点云数据的有效性,从而提升了物体检测的准确性。
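As a rough sketch of the multi-sensor projection and deduplication described above (not code from this application; the 4x4 extrinsic transforms, the numpy arrays, and the voxel-based deduplication tolerance are assumptions made for illustration), fusing the clouds into a target radar coordinate system could look like this:

```python
import numpy as np

def fuse_sparse_clouds(clouds, extrinsics, dedup_resolution=0.05):
    """Project each sensor's sparse cloud into the target radar frame and deduplicate.

    clouds:      list of (N_i, 3) arrays of x, y, z points, one per radar sensor
    extrinsics:  list of (4, 4) transforms from each sensor frame to the target frame
    """
    fused = []
    for pts, T in zip(clouds, extrinsics):
        homo = np.hstack([pts, np.ones((pts.shape[0], 1))])   # to homogeneous coordinates
        fused.append((homo @ T.T)[:, :3])                      # into the target radar frame
    fused = np.vstack(fused)

    # Treat points that fall into the same small voxel as overlapping and keep one of each.
    keys = np.round(fused / dedup_resolution).astype(np.int64)
    _, unique_idx = np.unique(keys, axis=0, return_index=True)
    return fused[np.sort(unique_idx)]
```

Here each entry of extrinsics plays the role of one radar sensor's extrinsic parameters relative to the target radar coordinate system.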
可选的,稀疏点云数据可以包括每个点的三维位置坐标,可以标记为(x,y,z)。可选的,稀疏点云数据还可以包括每个点的激光反射强度值。
S202、将稀疏点云数据和图像投影到目标坐标系中,获取待处理数据。
具体的,将稀疏点云数据和图像投影到目标坐标系中,可以实现稀疏点云数据与图像中像素的匹配,提升了待处理数据的有效性。可选的,目标坐标系可以为图像传感器对应的图像坐标系。目标坐标系还可以为其他坐标系,本实施例对此不做限定。
可选的,将稀疏点云数据和图像投影到目标坐标系中,获取待处理数据,可以包括:
通过雷达传感器与图像传感器的外参,将稀疏点云数据和图像投影到图像坐标系中,获取待处理数据。
具体的,雷达传感器对应的雷达坐标系与图像传感器对应的图像坐标系 之间具有一定的转换关系,该转换关系可以通过雷达传感器和图像传感器的外参确定,本实施例对此不做限定。在本实现方式中,目标坐标系为图像传感器对应的图像坐标系。通过将稀疏点云数据和图像投影到图像坐标系中,稀疏点云数据可以准确的与图像中的部分像素进行映射匹配,图像坐标系外的稀疏点云数据可以进行过滤处理。举例说明,假设图像的长为H、宽为W。那么,通过将稀疏点云数据和图像投影到图像坐标系中,可以将H×W范围外的稀疏点云数据过滤掉,获得待处理数据。
可选的,待处理数据可以包括:稀疏点云数据投影到目标坐标系中每个点的坐标值和反射率,以及图像中的像素点在目标坐标系中的坐标值。
需要说明的是,将稀疏点云数据和图像投影到目标坐标系中,稀疏点云数据中的每个点与图像中的每个像素点并不一定完全映射匹配上。当可以匹配上时,目标坐标系中对应的点的反射率可以为稀疏点云数据中点的激光反射强度值。当匹配不上时,目标坐标系中对应的点的反射率可以置为0。
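To make the projection into the image coordinate system concrete, the following sketch is my own illustration rather than the application's implementation; the camera intrinsic matrix K, the lidar-to-camera extrinsic transform, and the per-pixel channel layout are assumptions. It projects the sparse points into an H x W image, filters out points that fall outside the image, and leaves the reflectivity of unmatched pixels at 0, as described above:

```python
import numpy as np

def build_pending_data(points, reflectivity, K, T_cam_from_lidar, H, W):
    """points: (N, 3) lidar points; reflectivity: (N,); K: (3, 3) camera intrinsics;
    T_cam_from_lidar: (4, 4) lidar-to-camera extrinsic transform."""
    homo = np.hstack([points, np.ones((len(points), 1))])
    cam = (homo @ T_cam_from_lidar.T)[:, :3]          # points in the camera frame
    in_front = cam[:, 2] > 0
    cam, refl = cam[in_front], reflectivity[in_front]

    uv = cam @ K.T
    uv = uv[:, :2] / uv[:, 2:3]                       # pixel coordinates
    u, v = uv[:, 0].astype(int), uv[:, 1].astype(int)

    inside = (u >= 0) & (u < W) & (v >= 0) & (v < H)  # drop points outside the H x W image
    u, v, cam, refl = u[inside], v[inside], cam[inside], refl[inside]

    pending = np.zeros((H, W, 4), dtype=np.float32)   # x, y, z, reflectivity per pixel
    pending[v, u, :3] = cam
    pending[v, u, 3] = refl                           # unmatched pixels keep reflectivity 0
    return pending
```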
S203、对待处理数据进行三维检测,获取待检测场景包括的物体的检测结果。
可选的,物体的信息可以包括下列中的至少一项:三维位置信息、朝向信息、三维尺寸信息和物体的深度值。
可见,本实施例提供的物体检测方法,通过获取待检测场景的稀疏点云数据和图像,可以基于稀疏点云数据和图像获取待检测场景包括的物体的检测结果。由于只需要获取稀疏点云数据,降低了点云数据的密度,因此降低了电子设备的复杂度和要求,降低了物体检测的成本。
图3为本申请实施例提供的物体检测方法的另一种流程图。如图3所示,上述S203中,对待处理数据进行三维检测,获取待检测场景包括的物体的检测结果,可以包括:
S301、将待处理数据输入基础网络模型,获取特征图。
其中,基础网络模型可以为预先训练的、用于根据待处理数据输出特征图。需要说明的是,本实施例对于基础网络模型的实现方式不做限定,根据实际需求可以采用不同的神经网络模型,例如,卷积神经网络模型。所述基础网络模型根据实际需求可以包含若干层卷积和池化的操作,并最终输出一 个特征图。
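The application does not fix the architecture of the base network, so the following is only a non-authoritative sketch (PyTorch is an assumed framework, and the layer counts and channel widths are arbitrary) of a backbone with a few convolution and pooling stages that outputs a feature map:

```python
import torch.nn as nn

class BaseNetwork(nn.Module):
    """Toy backbone: a few conv + pooling stages that turn the fused per-pixel
    input (e.g. x, y, z, reflectivity channels) into a downsampled feature map."""
    def __init__(self, in_channels=4, out_channels=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(64, out_channels, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x):          # x: (B, C, H, W)
        return self.features(x)    # feature map: (B, out_channels, H/4, W/4)
```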
S302、将特征图输入候选区域网络模型,获取候选物体的二维框。
其中,候选区域网络模型可以为预先训练的、用于根据特征图输出候选物体的二维框。需要说明的是,本实施例对于候选区域网络模型的实现方式不做限定,根据实际需求可以采用不同的神经网络模型,例如,卷积神经网络模型。所述候选物体的二维框,与待检测场景中包括的物体相对应。待检测场景中包括的每个物体可以对应有多个候选物体的二维框。
需要说明的是,在本步骤中,并不区分物体的具体类型。举例说明。假设,待检测场景中包括的物体为2个车辆和1个行人,获取的候选物体的二维框可以为100个。那么,2个车辆和1个行人共同对应这100个候选物体的二维框。在后续的步骤中,可以确定这100个候选物体的二维框分别对应哪个物体。
S303、根据候选物体的二维框确定待检测场景包括的物体,并获取物体的信息的补偿值。
具体的,可以根据候选物体的二维框确定所述待检测场景中包括的物体是什么,并获取物体的信息的补偿值。
可选的,物体的信息的补偿值可以包括但不限于下列中的至少一种:物体的朝向的补偿值、物体的三维位置信息的补偿值、物体的三维尺寸的补偿值和物体的二维框的补偿值。
其中,物体的朝向的补偿值为物体的朝向的实际值与预设朝向之间的差值。物体的三维位置信息的补偿值为物体的三维位置的实际值与预设三维位置之间的差值。物体的三维尺寸的补偿值为物体的三维尺寸的实际值与预设三维尺寸之间的差值。物体的二维框的补偿值为物体的二维框的实际值与预设值之间的差值。需要说明的是,本实施例对于预设朝向、预设三维位置、预设三维尺寸和物体二维框的预设值的具体取值不做限定。例如,对于车辆来说,预设三维位置可以为车辆的底盘的中心点的三维位置。预设三维尺寸可以根据车辆型号的不同有所不同。
S304、根据物体的信息的补偿值获取物体的信息。
本实施例提供的物体检测方法,将待处理数据依次输入基础网络模型和候选区域网络模型,可以获取候选物体的二维框,进而根据候选物体的二维 框确定待检测场景中包括的物体以及物体的信息的补偿值,并根据物体的信息的补偿值获取物体的信息。相比于直接获取物体的信息,先获取物体的信息的补偿值,易于实现且准确性较高,提升了物体检测的准确性。
图4为本申请实施例提供的物体检测方法的另一种流程图。如图4所示,上述S302中,将特征图输入候选区域网络模型,获取候选物体的二维框,可以包括:
S401、根据特征图获取图像中每个像素点属于物体的概率。
S402、若根据每个像素点属于物体的概率确定第一像素属于物体,则获取第一像素对应的物体的二维框。
S403、根据第一像素属于物体的概率和第一像素对应的物体的二维框,获取候选物体的二维框。
下面结合示例进行说明。
假设,图像的分辨率为100*50,即有5000个像素点。根据特征图可以获取这5000个像素点中每个像素点属于物体的概率。假设,像素点1属于物体的概率为P1,根据概率P1确定像素点1属于物体,像素点1可以称为第一像素点,可以获取像素点1对应的物体的二维框1。像素点2属于物体的概率为P2,根据概率P2确定像素点2不属于物体。像素点3属于物体的概率为P3,根据概率P3可以确定像素点3属于物体,像素点3可以称为第一像素点,可以获取像素点3对应的物体的二维框3。假设,根据5000个像素点分别属于物体的概率确定第一像素为200个,那么,可以获取200个物体的二维框。之后,根据200个第一像素属于物体的概率和200个第一像素分别对应的物体的二维框进一步筛选,从这200个物体的二维框中获取候选物体的二维框。例如,最终获取的候选物体的二维框可以为50个。
需要说明的是,上述数字仅是示例,本实施例对此不作限定。
本实施例提供的物体检测方法,根据图像中像素点属于物体的概率,可以先获取到一部分像素对应的物体的二维框,进一步,在这些物体的二维框中再次进行刷选,确定候选物体的二维框,提升了获取候选物体的二维框的准确性。
可选的,S402中,根据像素点属于物体的概率确定该像素点是否属于物 体,可以包括:
若像素点属于物体的概率大于或等于预设值,则确定该像素点属于物体。
若像素点属于物体的概率小于预设值,则确定该像素点不属于物体。
本实施例对于预设值的具体取值不做限定。
可选的,S403中,根据第一像素属于物体的概率和第一像素对应的物体的二维框,获取候选物体的二维框,可以包括:
从多个第一像素组成的第一集合中获取待处理的第一像素,并将待处理的第一像素从第一集合中删除,获取更新后的第一集合。待处理的第一像素为第一集合中属于物体的概率最大的第一像素。
对于更新后的第一集合中的每个第一像素,获取每个第一像素分别与待处理的第一像素之间的关联值。关联值用于指示每个第一像素对应的物体的二维框与待处理的第一像素对应的物体的二维框的重合程度。
将关联值大于预设值的第一像素从更新后的第一集合中删除,并重新执行上述获取待处理的第一像素和更新第一集合的步骤,直至第一集合不包括第一像素为止,将所有待处理的第一像素对应的物体的二维框确定为候选物体的二维框。
下面结合示例进行说明。
假设,第一像素的数量为4个,标记为像素1~4。像素1~4属于物体的概率分别为P1~P4。其中,P2>P3>P1>P4。像素1~4组成初始的第一集合。首先,第一集合中像素2属于物体的概率P2最大,从第一集合中获取像素2,像素2可以称为待处理的第一像素,并更新第一集合为{像素1,像素3,像素4}。获取第一集合中像素1、像素3、像素4分别与像素2之间的关联值,依次标记为Q12、Q32、Q42。假设,Q12>预设值Q,并且,Q42>预设值Q,说明像素1对应的物体的二维框与像素2对应的物体的二维框重合度较高,像素4对应的物体的二维框与像素2对应的物体的二维框重合度也较高,因此,可以将像素1和像素4从第一集合中删除,完成针对像素2对应的物体的二维框的去重操作。此时,第一集合仅包括像素3。再次从第一集合中获取像素3,像素3可以称为待处理的第一像素。将像素3从第一集合中删除后,第一集合不包括第一像素。可以将像素2和像素3对应的物体的二维框确定为所述候选物体的二维框。候选物体的二维框为2个。
可见,通过在第一像素对应的物体的二维框中实现去重操作,大大减少了获得的候选物体的二维框的数量,提升了候选物体的二维框的有效性,有利于提升物体检测的准确性。
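The deduplication procedure above is, in effect, greedy non-maximum suppression over the boxes of the first pixels. A minimal sketch of that procedure follows; the IoU overlap measure and the 0.7 threshold are assumptions, since the text only requires an association value larger than a preset value:

```python
import numpy as np

def iou(box, boxes):
    """box: (4,) as [x1, y1, x2, y2]; boxes: (M, 4). Returns the overlap ratio with each box."""
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter + 1e-9)

def select_candidate_boxes(boxes, probs, overlap_thresh=0.7):
    """Keep the highest-probability first pixel's box, delete first pixels whose boxes
    overlap it too much, and repeat until the set is empty."""
    order = np.argsort(-probs)                 # first set, sorted by objectness probability
    keep = []
    while order.size > 0:
        best = order[0]                        # pending first pixel: highest probability
        keep.append(best)
        rest = order[1:]
        overlaps = iou(boxes[best], boxes[rest])
        order = rest[overlaps <= overlap_thresh]   # drop heavily overlapping first pixels
    return boxes[keep]                         # candidate 2D boxes
```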
图5为本申请实施例提供的物体检测方法的另一种流程图。如图5所示,上述S303中,根据候选物体的二维框确定待检测场景包括的物体,可以包括:
S501、将候选物体的二维框输入第一三维检测网络模型,获取候选物体属于预设物体中每个物体的概率。
其中,第一三维检测网络模型可以为预先训练的、用于根据候选物体的二维框输出候选物体属于预设物体中每个物体的概率。需要说明的是,本实施例对于第一三维检测网络模型的实现方式不做限定,根据实际需求可以采用不同的神经网络模型,例如,卷积神经网络模型。本实施例对于预设物体的具体类别不做限定,例如,预设物体可以包括但不限于车辆、自行车、行人。
下面通过示例进行说明。
假设,候选物体的二维框的数量为3个,候选物体可以标记为候选物体1~3,候选物体的二维框可以标记为二维框1~3。预设物体包括车辆和行人。那么,将候选物体的二维框输入第一三维检测网络模型后,输出结果如下:
候选物体1:属于车辆的概率为P11,属于行人的概率为P12。
候选物体2:属于车辆的概率为P21,属于行人的概率为P22。
候选物体3:属于车辆的概率为P31,属于行人的概率为P32。
S502、根据候选物体属于预设物体中每个物体的概率,获取待检测场景包括的物体。
可选的,在一种实现方式中,若候选物体属于预设物体中第一物体的概率大于所述第一物体对应的预设阈值,则确定所述候选物体为待检测场景包括的物体。还以S501中的示例进行说明。假设,车辆对应的预设阈值为Q1,行人对应的预设阈值为Q2。假设,P11>Q1,P21<Q1,P31>Q1,则可以确定待检测场景中包括候选物体1和候选物体3,且均为车辆。假设,P12<Q2,P22<Q2,P32<Q2,则说明待检测场景中不包括行人。
需要说明的是,S502中,根据候选物体属于预设物体中每个物体的概率, 获取待检测场景包括的物体,还可以具有其他的实现方式,本实施例对此不作限定。
可选的,S303中,获取物体的信息的补偿值,可以包括:
通过将候选物体的二维框输入第一三维检测网络模型,还获取下列补偿值中的至少一项:候选物体的朝向的补偿值、候选物体的三维位置信息的补偿值、候选物体的二维框的补偿值和候选物体的三维尺寸的补偿值。
若根据候选物体属于预设物体中每个物体的概率确定候选物体为待检测场景包括的物体,则将候选物体对应的补偿值确定为物体的信息的补偿值。
具体的,通过第一三维检测网络模型,不仅可以根据候选物体的二维框输出候选物体属于预设物体中每个物体的概率,还可以同时输出候选物体的信息的补偿值。如果根据候选物体属于预设物体中每个物体的概率确定候选物体为待检测场景包括的物体(具体参见S502的相关描述),则可以将候选物体对应的补偿值确定为物体的信息的补偿值。还以S501中的示例进行说明。由于候选物体1和候选物体3为待检测场景中包括的车辆,那么,可以将候选物体1和候选物体3分别对应的补偿值确定为待检测场景中包括的车辆的信息的补偿值。
下面,以候选物体的朝向的补偿值为例,对补偿值进行说明。
对于候选物体的朝向的补偿值,可以将[-180°,180°]等分成多个区间,将每个区间的中心设置为预设朝向。例如,对于区间[-160°,-140°]而言,预设朝向可以为-150°。对于每个候选物体的二维框,通过第一三维检测网络模型都可以输出候选物体的朝向所属的区间以及候选物体的朝向的补偿值,该补偿值即为候选物体的朝向的实际值与所属区间的中心之间的差值。
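As an illustration of this binning scheme (the 20-degree bin width follows the [-160°, -140°] example above, and the exact partition used in practice is not fixed by the application), the orientation can be encoded as a bin index plus an offset from the bin center and decoded back as follows:

```python
import numpy as np

BIN_WIDTH = 20.0                      # [-180, 180] split into 18 equal intervals

def encode_orientation(yaw_deg):
    """Return (bin index, offset of the actual orientation from the bin center)."""
    idx = int(np.floor((yaw_deg + 180.0) / BIN_WIDTH))
    idx = min(idx, int(360.0 / BIN_WIDTH) - 1)      # clamp +180 into the last bin
    center = -180.0 + (idx + 0.5) * BIN_WIDTH       # e.g. bin [-160, -140] -> center -150
    return idx, yaw_deg - center

def decode_orientation(idx, offset):
    """Actual orientation = bin center (preset orientation) + predicted compensation value."""
    center = -180.0 + (idx + 0.5) * BIN_WIDTH
    return center + offset
```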
本实施例提供的物体检测方法,将候选物体的二维框输入第一三维检测网络模型,可以获取候选物体属于预设物体中每个物体的概率以及候选物体的信息的补偿值,从而在确定候选物体为待检测场景中包括的物体时可以同时获取物体的信息的补偿值。
图6为本申请实施例提供的物体检测方法的另一种流程图。如图6所示,上述S303中,根据候选物体的二维框确定待检测场景包括的物体,可以包括:
S601、将候选物体的二维框输入语义预测网络模型,获取候选物体属于 预设物体中每个物体的概率。
S602、根据候选物体属于预设物体中每个物体的概率,确定待检测场景包括的物体。
其中,语义预测网络模型可以为预先训练的、用于根据候选物体的二维框输出候选物体属于预设物体中每个物体的概率。需要说明的是,本实施例对于语义预测网络模型的实现方式不做限定,根据实际需求可以采用不同的神经网络模型,例如,卷积神经网络模型。本实施例对于预设物体的具体类别不做限定,例如,预设物体可以包括但不限于车辆、自行车、行人。
可选的,S303中,获取物体的信息的补偿值,可以包括:
将待检测场景包括的物体的二维框输入第二三维检测网络模型,获取物体的信息的补偿值,补偿值包括下列中的至少一项:物体的朝向的补偿值、物体的三维位置信息的补偿值、物体的二维框的补偿值和物体的三维尺寸的补偿值。
其中,第二三维检测网络模型可以为预先训练的、用于根据物体的二维框输出物体的信息的补偿值。需要说明的是,本实施例对于第二三维检测网络模型的实现方式不做限定,根据实际需求可以采用不同的神经网络模型,例如,卷积神经网络模型。
本实施例与图5所示实施例的区别在于:在本实施例中,涉及语义预测网络模型和第二三维检测网络模型这两个模型。语义预测网络模型的输出为候选物体属于预设物体中每个物体的概率。在根据所述概率确定候选物体为待检测场景包括的物体后,再通过第二三维检测网络模型输出物体的信息的补偿值。而在图5所示实施例中,涉及第一三维检测网络模型。通过第一三维检测网络模型可以同时输出候选物体属于预设物体中每个物体的概率以及候选物体的信息的补偿值。
在上述实施例的基础上,在一种可能的实现方式中,物体的信息的补偿值包括物体的朝向的补偿值。S304中,根据物体的信息的补偿值获取物体的信息,可以包括:
获取物体所属的预设朝向区间的中心角度。
根据物体的朝向的补偿值和物体所属的预设朝向区间的中心角度,获取 物体的朝向信息。
在该种实现方式中,可以根据物体的朝向的补偿值获取物体的朝向信息。
可选的,在另一种可能的实现方式中,物体的信息的补偿值包括物体的三维位置信息的补偿值。S304中,根据物体的信息的补偿值获取物体的信息,可以包括:
获取物体的参考点的三维位置信息。
根据物体的三维位置信息的补偿值和物体的参考点的三维位置信息,获取物体的三维位置信息。
在该种实现方式中,可以根据物体的三维位置信息的补偿值获取物体的三维位置信息。
可选的,在另一种可能的实现方式中,物体的信息的补偿值包括物体的三维尺寸的补偿值。S304中,根据物体的信息的补偿值获取物体的信息,可以包括:
获取物体对应的物体的三维尺寸的参考值。
根据物体的三维尺寸的补偿值和物体对应的物体的三维尺寸的参考值,获取物体的三维尺寸信息。
在该种实现方式中,可以根据物体的三维尺寸的补偿值获取物体的三维尺寸信息。
可选的,在另一种可能的实现方式中,物体的信息的补偿值包括物体的二维框的补偿值。S304中,根据物体的信息的补偿值获取物体的信息,可以包括:
获取物体对应的二维框的参考值。
根据物体的二维框的补偿值和物体对应的二维框的参考值,获取物体的二维框的位置信息。
根据物体的二维框的位置信息获取物体的深度值。
在该种实现方式中,可以根据物体的二维框的补偿值获取物体的深度值。
需要说明的是,物体的信息的补偿值包括的内容不同,上述各种实现方式可以相互结合,从而可以获取物体的下列信息中的至少一项:物体的朝向信息、物体的三维位置信息、物体的三维尺寸信息和物体的深度值。
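Putting the decoding rules above together, the sketch below recovers the 3D position and 3D size from their compensation values; the reference point and reference size are hypothetical template values, and orientation decoding was sketched earlier:

```python
import numpy as np

def decode_object_info(pos_offset, size_offset, ref_point, ref_size):
    """Actual value = reference (preset) value + predicted compensation value.

    pos_offset, ref_point: (3,) offset and reference 3D position
    size_offset, ref_size: (3,) offset and reference length/width/height
    """
    position = np.asarray(ref_point) + np.asarray(pos_offset)
    size = np.asarray(ref_size) + np.asarray(size_offset)
    return position, size

# e.g. a hypothetical car template: chassis-center reference point and a typical size
position, size = decode_object_info(
    pos_offset=[0.3, -0.1, 0.05], size_offset=[0.12, -0.04, 0.02],
    ref_point=[12.0, 1.5, 0.0], ref_size=[4.5, 1.8, 1.5],
)
```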
可选的,在一种实现方式中,根据物体的二维框的位置信息获取物体的 深度值,可以包括:
将物体的二维框的位置信息输入第一区域分割网络模型,获取物体的表面上的稀疏点云数据。
对物体的表面上的稀疏点云数据进行聚类分割,获取物体表面上的目标点的稀疏点云数据。
根据目标点的稀疏点云数据确定物体的深度值。
其中,第一区域分割网络模型可以为预先训练的、用于根据物体的二维框的位置信息输出物体的表面上的稀疏点云数据。需要说明的是,本实施例对于第一区域分割网络模型的实现方式不做限定,根据实际需求可以采用不同的神经网络模型,例如,卷积神经网络模型。
在该种实现方式中,通过第一区域分割网络模型,可以获取到准确的物体表面上的稀疏点云数据。进而,对物体表面上的稀疏点云数据进行聚类分割,可以获取到物体表面上目标点的稀疏点云数据,并最终确定物体的深度值。
需要说明的是,本实施例对于目标点的位置不做限定。例如,车辆上的目标点,可以为车辆尾部上凸起的点。行人的目标点,可以为行人头部上的点,等等。
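One simple way to realize the cluster-then-measure depth estimate described above is sketched below. It is my own simplification rather than the first region segmentation network itself, and the depth-histogram clustering and its bin size are assumptions:

```python
import numpy as np

def object_depth_from_surface_points(surface_points, bin_size=0.5):
    """surface_points: (N, 3) sparse points on the object's surface, with z as depth.

    Cluster the points along depth with a coarse histogram, take the densest
    cluster as the target points, and return their median depth."""
    depths = surface_points[:, 2]
    bins = np.floor(depths / bin_size).astype(np.int64)
    values, counts = np.unique(bins, return_counts=True)
    target_bin = values[np.argmax(counts)]            # densest depth cluster
    target_depths = depths[bins == target_bin]
    return float(np.median(target_depths))
```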
可选的,在另一种实现方式中,根据物体的二维框的位置信息获取物体的深度值,可以包括:
将物体的二维框的位置信息输入第二区域分割网络模型,获取物体的表面上的目标面的稀疏点云数据。
根据目标面的稀疏点云数据获取物体的深度值。
其中,第二区域分割网络模型可以为预先训练的、用于根据物体的二维框的位置信息输出物体的表面上的目标面的稀疏点云数据。需要说明的是,本实施例对于第二区域分割网络模型的实现方式不做限定,根据实际需求可以采用不同的神经网络模型,例如,卷积神经网络模型。
在该种实现方式中,通过第二区域分割网络模型,可以获取到物体表面上的目标面的稀疏点云数据,从而可以确定物体的深度值。
需要说明的是,本实施例对于目标面的位置不做限定。例如,如果车辆的行驶方向与电子设备的移动方向一致,车辆上的目标面可以为车辆的尾部。 如果车辆的行驶方向与电子设备的移动方向相反,车辆上的目标面可以为车辆的前部。行人的目标面,可以为行人的头部,等等。
图7为本申请实施例提供的电子设备的结构示意图。本实施例提供的电子设备,用于执行图2~图6任一实现方式提供的物体检测方法。如图7所示,本实施例提供的电子设备,可以包括:
存储器12,用于存储计算机程序;
处理器11,用于执行所述计算机程序,具体用于:
获取待检测场景的稀疏点云数据和图像;
将所述稀疏点云数据和所述图像投影到目标坐标系中,获取待处理数据;
对所述待处理数据进行三维检测,获取所述待检测场景包括的物体的检测结果。
可选的,所述处理器11具体用于:
将所述待处理数据输入基础网络模型,获取特征图;
将所述特征图输入候选区域网络模型,获取候选物体的二维框;
根据所述候选物体的二维框确定所述待检测场景包括的物体,并获取所述物体的信息的补偿值;
根据所述物体的信息的补偿值获取所述物体的信息。
可选的,所述处理器11具体用于:
根据所述特征图获取所述图像中每个像素点属于物体的概率;
若根据所述每个像素点属于物体的概率确定第一像素属于物体,则获取所述第一像素对应的物体的二维框;
根据所述第一像素属于物体的概率和所述第一像素对应的物体的二维框,获取所述候选物体的二维框。
可选的,所述处理器11具体用于:
从多个所述第一像素组成的第一集合中获取待处理的第一像素,并将所述待处理的第一像素从所述第一集合中删除,获取更新后的第一集合;所述待处理的第一像素为所述第一集合中属于物体的概率最大的第一像素;
对于所述更新后的第一集合中的每个第一像素,获取每个第一像素分别与所述待处理的第一像素之间的关联值;所述关联值用于指示每个第一像素 对应的物体的二维框与所述待处理的第一像素对应的物体的二维框的重合程度;
将关联值大于预设值的第一像素从所述更新后的第一集合中删除,并重新执行上述获取待处理的第一像素和更新第一集合的步骤,直至第一集合不包括第一像素为止,将所有所述待处理的第一像素确定为所述候选物体的二维框。
可选的,所述处理器11具体用于:
将所述候选物体的二维框输入第一三维检测网络模型,获取所述候选物体属于预设物体中每个物体的概率;
根据所述候选物体属于预设物体中每个物体的概率,获取所述待检测场景包括的物体。
可选的,所述处理器11具体用于:
通过将所述候选物体的二维框输入第一三维检测网络模型,还获取下列补偿值中的至少一项:候选物体的朝向的补偿值、候选物体的三维位置信息的补偿值、候选物体的二维框的补偿值和候选物体的三维尺寸的补偿值;
若根据所述候选物体属于预设物体中每个物体的概率确定所述候选物体为所述待检测场景包括的物体,则将所述候选物体对应的补偿值确定为所述物体的信息的补偿值。
可选的,所述处理器11具体用于:
将所述候选物体的二维框输入语义预测网络模型,获取所述候选物体属于预设物体中每个物体的概率;
根据所述候选物体属于预设物体中每个物体的概率,确定所述待检测场景包括的物体。
可选的,所述处理器11具体用于:
将所述待检测场景包括的物体的二维框输入第二三维检测网络模型,获取所述物体的信息的补偿值,所述补偿值包括下列中的至少一项:物体的朝向的补偿值、物体的三维位置信息的补偿值、物体的二维框的补偿值和物体的三维尺寸的补偿值。
可选的,所述物体的信息的补偿值包括所述物体的朝向的补偿值,所述处理器11具体用于:
获取所述物体所属的预设朝向区间的中心角度;
根据所述物体的朝向的补偿值和所述物体所属的预设朝向区间的中心角度,获取所述物体的朝向信息。
可选的,所述物体的信息的补偿值包括所述物体的三维位置信息的补偿值,所述处理器11具体用于:
获取所述物体的参考点的三维位置信息;
根据所述物体的三维位置信息的补偿值和所述物体的参考点的三维位置信息,获取所述物体的三维位置信息。
可选的,所述物体的信息的补偿值包括所述物体的三维尺寸的补偿值,所述处理器11具体用于:
获取所述物体对应的物体的三维尺寸的参考值;
根据所述物体的三维尺寸的补偿值和所述物体对应的物体的三维尺寸的参考值,获取所述物体的三维尺寸信息。
可选的,所述物体的信息的补偿值包括所述物体的二维框的补偿值,所述处理器11具体用于:
获取所述物体对应的二维框的参考值;
根据所述物体的二维框的补偿值和所述物体对应的二维框的参考值,获取所述物体的二维框的位置信息;
根据所述物体的二维框的位置信息获取所述物体的深度值。
可选的,所述处理器11具体用于:
将所述物体的二维框的位置信息输入第一区域分割网络模型,获取所述物体的表面上的稀疏点云数据;
对所述物体的表面上的稀疏点云数据进行聚类分割,获取所述物体表面上的目标点的稀疏点云数据;
根据所述目标点的稀疏点云数据确定所述物体的深度值。
可选的,所述处理器11具体用于:
将所述物体的二维框的位置信息输入第二区域分割网络模型,获取所述物体的表面上的目标面的稀疏点云数据;
根据所述目标面的稀疏点云数据获取所述物体的深度值。
可选的,所述物体的信息包括下列中的至少一项:三维位置信息、朝向 信息、三维尺寸信息和所述物体的深度值。
可选的,所述处理器11具体用于:
通过至少一个雷达传感器获取所述稀疏点云数据,通过图像传感器获取所述图像。
可选的,所述雷达传感器的数量大于1;所述处理器11具体用于:
分别通过每个所述雷达传感器获取对应的第一稀疏点云数据;
根据所述至少一个雷达传感器的外参,将每个所述雷达传感器分别对应的第一稀疏点云数据投影到目标雷达坐标系中,获取所述稀疏点云数据。
可选的,所述处理器11具体用于:
通过所述雷达传感器与所述图像传感器之间的外参矩阵,将所述稀疏点云数据和所述图像投影到相机坐标系中,获取所述待处理数据。
可选的,所述待处理数据包括:所述稀疏点云数据投影到所述目标坐标系中每个点的坐标值和反射率,以及所述图像中的像素点在所述目标坐标系中的坐标值。
可选的,电子设备还可以包括雷达传感器13和图像传感器14。本实施例对于雷达传感器13和图像传感器14的数量、安装位置不做限定。
本实施例提供的电子设备，用于执行图2~图6任一实现方式提供的物体检测方法，技术方案和技术效果相似，此处不再赘述。
本申请实施例还提供一种可移动平台,可以包括图7所示实施例提供的电子设备。需要说明的是,本实施例对于可移动平台的类型不做限定,可以为任意一种需要进行物体检测的设备。例如,可以为无人机、车辆或者其他交通工具。
如图8所示,在一个可选的实施例中。测距装置200包括测距模块210,测距模块210包括发射器203(例如,发射电路)、准直元件204、探测器205(例如,可以包括接收电路、采样电路和运算电路)和光路改变元件206。测距模块210用于发射光束,且接收回光,将回光转换为电信号。其中,发射器203可以用于发射光脉冲序列。在一个实施例中,发射器203可以发射激光脉冲序列。可选的,发射器203发射出的激光束为波长在可见光范围之外的窄带宽光束。准直元件204设置于发射器203的出射光路上,用于准直从发射器203发出的光束,将发射器203发出的光束准直为平行光出射至扫 描模块。准直元件204还用于会聚经探测物反射的回光的至少一部分。该准直元件204可以是准直透镜或者是其他能够准直光束的元件。
在图8所示实施例中,通过光路改变元件206来将测距装置内的发射光路和接收光路在准直元件204之前合并,使得发射光路和接收光路可以共用同一个准直元件,使得光路更加紧凑。在其他的一些实现方式中,也可以是发射器203和探测器205分别使用各自的准直元件,将光路改变元件206设置在准直元件之后的光路上。
在图8所示实施例中,由于发射器203出射的光束的光束孔径较小,测距装置所接收到的回光的光束孔径较大,所以光路改变元件可以采用小面积的反射镜来将发射光路和接收光路合并。在其他的一些实现方式中,光路改变元件也可以采用带通孔的反射镜,其中该通孔用于透射发射器203的出射光,反射镜用于将回光反射至探测器205。这样可以减小采用小反射镜的情况中小反射镜的支架会对回光的遮挡。
在图8所示实施例中,光路改变元件偏离了准直元件204的光轴。在其他的一些实现方式中,光路改变元件也可以位于准直元件204的光轴上。
测距装置200还包括扫描模块202。扫描模块202放置于测距模块210的出射光路上,扫描模块202用于改变经准直元件204出射的准直光束219的传输方向并投射至外界环境,并将回光投射至准直元件204。回光经准直元件204汇聚到探测器205上。
在一个实施例中,扫描模块202可以包括至少一个光学元件,用于改变光束的传播路径,其中,该光学元件可以通过对光束进行反射、折射、衍射等等方式来改变光束传播路径。例如,扫描模块202包括透镜、反射镜、棱镜、振镜、光栅、液晶、光学相控阵(Optical Phased Array)或上述光学元件的任意组合。一个示例中,至少部分光学元件是运动的,例如通过驱动模块来驱动该至少部分光学元件进行运动,该运动的光学元件可以在不同时刻将光束反射、折射或衍射至不同的方向。在一些实施例中,扫描模块202的多个光学元件可以绕共同的轴209旋转或振动,每个旋转或振动的光学元件用于不断改变入射光束的传播方向。在一个实施例中,扫描模块202的多个光学元件可以以不同的转速旋转,或以不同的速度振动。在另一个实施例中,扫描模块202的至少部分光学元件可以以基本相同的转速旋转。在一些实施 例中,扫描模块的多个光学元件也可以是绕不同的轴旋转。在一些实施例中,扫描模块的多个光学元件也可以是以相同的方向旋转,或以不同的方向旋转;或者沿相同的方向振动,或者沿不同的方向振动,在此不作限制。
在一个实施例中,扫描模块202包括第一光学元件214和与第一光学元件214连接的驱动器216,驱动器216用于驱动第一光学元件214绕转动轴209转动,使第一光学元件214改变准直光束219的方向。第一光学元件214将准直光束219投射至不同的方向。在一个实施例中,准直光束219经第一光学元件改变后的方向与转动轴209的夹角随着第一光学元件214的转动而变化。在一个实施例中,第一光学元件214包括相对的非平行的一对表面,准直光束219穿过该对表面。在一个实施例中,第一光学元件214包括厚度沿至少一个径向变化的棱镜。在一个实施例中,第一光学元件214包括楔角棱镜,对准直光束219进行折射。
在一个实施例中,扫描模块202还包括第二光学元件215,第二光学元件215绕转动轴209转动,第二光学元件215的转动速度与第一光学元件214的转动速度不同。第二光学元件215用于改变第一光学元件214投射的光束的方向。在一个实施例中,第二光学元件215与另一驱动器217连接,驱动器217驱动第二光学元件215转动。第一光学元件214和第二光学元件215可以由相同或不同的驱动器驱动,使第一光学元件214和第二光学元件215的转速和/或转向不同,从而将准直光束219投射至外界空间不同的方向,可以扫描较大的空间范围。在一个实施例中,控制器218控制驱动器216和217,分别驱动第一光学元件214和第二光学元件215。第一光学元件214和第二光学元件215的转速可以根据实际应用中预期扫描的区域和样式确定。驱动器216和217可以包括电机或其他驱动器。
在一个实施例中,第二光学元件215包括相对的非平行的一对表面,光束穿过该对表面。在一个实施例中,第二光学元件215包括厚度沿至少一个径向变化的棱镜。在一个实施例中,第二光学元件215包括楔角棱镜。
一个实施例中,扫描模块202还包括第三光学元件(图未示)和用于驱动第三光学元件运动的驱动器。可选地,该第三光学元件包括相对的非平行的一对表面,光束穿过该对表面。在一个实施例中,第三光学元件包括厚度沿至少一个径向变化的棱镜。在一个实施例中,第三光学元件包括楔角棱镜。 第一、第二和第三光学元件中的至少两个光学元件以不同的转速和/或转向转动。
扫描模块202中的各光学元件旋转可以将光投射至不同的方向,例如方向211和213,如此对测距装置200周围的空间进行扫描。当扫描模块202投射出的光211打到探测物201时,一部分光被探测物201沿与投射的光211相反的方向反射至测距装置200。探测物201反射的回光212经过扫描模块202后入射至准直元件204。
探测器205与发射器203放置于准直元件204的同一侧,探测器205用于将穿过准直元件204的至少部分回光转换为电信号。
一个实施例中,各光学元件上镀有增透膜。可选的,增透膜的厚度与发射器203发射出的光束的波长相等或接近,能够增加透射光束的强度。
一个实施例中,测距装置中位于光束传播路径上的一个元件表面上镀有滤光层,或者在光束传播路径上设置有滤光器,用于至少透射发射器203所出射的光束所在波段,反射其他波段,以减少环境光给接收器带来的噪音。
在一些实施例中,发射器203可以包括激光二极管,通过激光二极管发射纳秒级别的激光脉冲。进一步地,可以确定激光脉冲接收时间,例如,通过探测电信号脉冲的上升沿时间和/或下降沿时间确定激光脉冲接收时间。如此,测距装置200可以利用脉冲接收时间信息和脉冲发出时间信息计算TOF,从而确定探测物201到测距装置200的距离。
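As a concrete worked example of this TOF computation (a simple sketch; the timestamps are assumed to come from the rising-edge or falling-edge detection mentioned above), the distance is half the round-trip time multiplied by the speed of light:

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s

def tof_distance(pulse_emit_time, pulse_receive_time):
    """Distance to the detected object from the pulse emission and reception times (seconds)."""
    tof = pulse_receive_time - pulse_emit_time      # round-trip time of flight
    return SPEED_OF_LIGHT * tof / 2.0               # half the round trip

# e.g. a 200 ns round trip corresponds to roughly 30 m
print(tof_distance(0.0, 200e-9))   # ~= 29.98
```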
根据上述实施例提供的激光雷达,可以实现对激光雷达点云数据的获取。
本申请实施例还提供一种计算机存储介质,计算机存储介质用于储存为上述物体检测的计算机软件指令,当其在计算机上运行时,使得计算机可以执行上述方法实施例中各种可能的物体检测方法。在计算机上加载和执行所述计算机执行指令时,可全部或部分地产生按照本申请实施例所述的流程或功能。所述计算机指令可以存储在计算机存储介质中,或者从一个计算机存储介质向另一个计算机存储介质传输,所述传输可以通过无线(例如蜂窝通信、红外、短距离无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。所述计算机存储介质可以是计算机能够存取的任何可用介质或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储 设备。所述可用介质可以是磁性介质,(例如,软盘、硬盘、磁带)、光介质(例如,DVD)、或者半导体介质(例如SSD)等。
本领域普通技术人员可以理解:实现上述方法实施例的全部或部分步骤可以通过程序指令相关的硬件来完成,前述的程序可以存储于一计算机可读取存储介质中,该程序在执行时,执行包括上述方法实施例的步骤;而前述的存储介质包括:只读内存(Read-Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
最后应说明的是:以上各实施例仅用以说明本发明的技术方案,而非对其限制;尽管参照前述各实施例对本发明进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分或者全部技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本发明各实施例技术方案的范围。

Claims (41)

  1. 一种物体检测方法,其特征在于,包括:
    获取待检测场景的稀疏点云数据和图像;
    将所述稀疏点云数据和所述图像投影到目标坐标系中,获取待处理数据;
    对所述待处理数据进行三维检测,获取所述待检测场景包括的物体的检测结果。
  2. 根据权利要求1所述的方法,其特征在于,所述对所述待处理数据进行三维检测,获取所述待检测场景包括的物体的检测结果,包括:
    将所述待处理数据输入基础网络模型,获取特征图;
    将所述特征图输入候选区域网络模型,获取候选物体的二维框;
    根据所述候选物体的二维框确定所述待检测场景包括的物体,并获取所述物体的信息的补偿值;
    根据所述物体的信息的补偿值获取所述物体的信息。
  3. 根据权利要求2所述的方法,其特征在于,所述将所述特征图输入候选区域网络模型,获取候选物体的二维框,包括:
    根据所述特征图获取所述图像中每个像素点属于物体的概率;
    若根据所述每个像素点属于物体的概率确定第一像素属于物体,则获取所述第一像素对应的物体的二维框;
    根据所述第一像素属于物体的概率和所述第一像素对应的物体的二维框,获取所述候选物体的二维框。
  4. 根据权利要求3所述的方法,其特征在于,所述根据所述第一像素属于物体的概率和所述第一像素对应的物体的二维框,获取所述候选物体的二维框,包括:
    从多个所述第一像素组成的第一集合中获取待处理的第一像素,并将所述待处理的第一像素从所述第一集合中删除,获取更新后的第一集合;所述待处理的第一像素为所述第一集合中属于物体的概率最大的第一像素;
    对于所述更新后的第一集合中的每个第一像素,获取每个第一像素分别与所述待处理的第一像素之间的关联值;所述关联值用于指示每个第一像素对应的物体的二维框与所述待处理的第一像素对应的物体的二维框的重合程度;
    将关联值大于预设值的第一像素从所述更新后的第一集合中删除,并重新执行上述获取待处理的第一像素和更新第一集合的步骤,直至第一集合不包括第一像素为止,将所有所述待处理的第一像素对应的物体的二维框确定为所述候选物体的二维框。
  5. 根据权利要求2所述的方法,其特征在于,所述根据所述候选物体的二维框确定所述待检测场景包括的物体,包括:
    将所述候选物体的二维框输入第一三维检测网络模型,获取所述候选物体属于预设物体中每个物体的概率;
    根据所述候选物体属于预设物体中每个物体的概率,获取所述待检测场景包括的物体。
  6. 根据权利要求5所述的方法,其特征在于,所述获取所述物体的信息的补偿值,包括:
    通过将所述候选物体的二维框输入第一三维检测网络模型,还获取下列补偿值中的至少一项:候选物体的朝向的补偿值、候选物体的三维位置信息的补偿值、候选物体的二维框的补偿值和候选物体的三维尺寸的补偿值;
    若根据所述候选物体属于预设物体中每个物体的概率确定所述候选物体为所述待检测场景包括的物体,则将所述候选物体对应的补偿值确定为所述物体的信息的补偿值。
  7. 根据权利要求2所述的方法,其特征在于,所述根据所述候选物体的二维框确定所述待检测场景包括的物体,包括:
    将所述候选物体的二维框输入语义预测网络模型,获取所述候选物体属于预设物体中每个物体的概率;
    根据所述候选物体属于预设物体中每个物体的概率,确定所述待检测场景包括的物体。
  8. 根据权利要求7所述的方法,其特征在于,所述获取所述物体的信息的补偿值,包括:
    将所述待检测场景包括的物体的二维框输入第二三维检测网络模型,获取所述物体的信息的补偿值,所述补偿值包括下列中的至少一项:物体的朝向的补偿值、物体的三维位置信息的补偿值、物体的二维框的补偿值和物体的三维尺寸的补偿值。
  9. 根据权利要求5至8任一项所述的方法,其特征在于,所述物体的信息的补偿值包括所述物体的朝向的补偿值,所述根据所述物体的信息的补偿值获取所述物体的信息,包括:
    获取所述物体所属的预设朝向区间的中心角度;
    根据所述物体的朝向的补偿值和所述物体所属的预设朝向区间的中心角度,获取所述物体的朝向信息。
  10. 根据权利要求5至8任一项所述的方法,其特征在于,所述物体的信息的补偿值包括所述物体的三维位置信息的补偿值,所述根据所述物体的信息的补偿值获取所述物体的信息,包括:
    获取所述物体的参考点的三维位置信息;
    根据所述物体的三维位置信息的补偿值和所述物体的参考点的三维位置信息,获取所述物体的三维位置信息。
  11. 根据权利要求5至8任一项所述的方法,其特征在于,所述物体的信息的补偿值包括所述物体的三维尺寸的补偿值,所述根据所述物体的信息的补偿值获取所述物体的信息,包括:
    获取所述物体对应的物体的三维尺寸的参考值;
    根据所述物体的三维尺寸的补偿值和所述物体对应的物体的三维尺寸的参考值,获取所述物体的三维尺寸信息。
  12. 根据权利要求5至8任一项所述的方法,其特征在于,所述物体的信息的补偿值包括所述物体的二维框的补偿值,所述根据所述物体的信息的补偿值获取所述物体的信息,包括:
    获取所述物体对应的二维框的参考值;
    根据所述物体的二维框的补偿值和所述物体对应的二维框的参考值,获取所述物体的二维框的位置信息;
    根据所述物体的二维框的位置信息获取所述物体的深度值。
  13. 根据权利要求12所述的方法,其特征在于,所述根据所述物体的二维框的位置信息获取所述物体的深度值,包括:
    将所述物体的二维框的位置信息输入第一区域分割网络模型,获取所述物体的表面上的稀疏点云数据;
    对所述物体的表面上的稀疏点云数据进行聚类分割,获取所述物体表面 上的目标点的稀疏点云数据;
    根据所述目标点的稀疏点云数据确定所述物体的深度值。
  14. 根据权利要求12所述的方法,其特征在于,所述根据所述物体的二维框的位置信息获取所述物体的深度值,包括:
    将所述物体的二维框的位置信息输入第二区域分割网络模型,获取所述物体的表面上的目标面的稀疏点云数据;
    根据所述目标面的稀疏点云数据获取所述物体的深度值。
  15. 根据权利要求1至14任一项所述的方法,其特征在于,所述物体的信息包括下列中的至少一项:三维位置信息、朝向信息、三维尺寸信息和所述物体的深度值。
  16. 根据权利要求1至14任一项所述的方法,其特征在于,所述获取待检测场景的稀疏点云数据和图像,包括:
    通过至少一个雷达传感器获取所述稀疏点云数据,通过图像传感器获取所述图像。
  17. 根据权利要求16所述的方法,其特征在于,所述雷达传感器的数量大于1;所述通过至少一个雷达传感器获取所述稀疏点云数据,包括:
    分别通过每个所述雷达传感器获取对应的第一稀疏点云数据;
    根据所述至少一个雷达传感器的外参,将每个所述雷达传感器分别对应的第一稀疏点云数据投影到目标雷达坐标系中,获取所述稀疏点云数据。
  18. 根据权利要求16所述的方法,其特征在于,所述将所述稀疏点云数据和所述图像投影到目标坐标系中,获取待处理数据,包括:
    通过所述雷达传感器与所述图像传感器的外参,将所述稀疏点云数据和所述图像投影到图像坐标系中,获取所述待处理数据。
  19. 根据权利要求1至14任一项所述的方法,其特征在于,所述待处理数据包括:所述稀疏点云数据投影到所述目标坐标系中每个点的坐标值和反射率,以及所述图像中的像素点在所述目标坐标系中的坐标值。
  20. 一种电子设备,其特征在于,包括:
    存储器,用于存储计算机程序;
    处理器,用于执行所述计算机程序,具体用于:
    获取待检测场景的稀疏点云数据和图像;
    将所述稀疏点云数据和所述图像投影到目标坐标系中,获取待处理数据;
    对所述待处理数据进行三维检测,获取所述待检测场景包括的物体的检测结果。
  21. 根据权利要求20所述的电子设备,其特征在于,所述处理器具体用于:
    将所述待处理数据输入基础网络模型,获取特征图;
    将所述特征图输入候选区域网络模型,获取候选物体的二维框;
    根据所述候选物体的二维框确定所述待检测场景包括的物体,并获取所述物体的信息的补偿值;
    根据所述物体的信息的补偿值获取所述物体的信息。
  22. 根据权利要求21所述的电子设备,其特征在于,所述处理器具体用于:
    根据所述特征图获取所述图像中每个像素点属于物体的概率;
    若根据所述每个像素点属于物体的概率确定第一像素属于物体,则获取所述第一像素对应的物体的二维框;
    根据所述第一像素属于物体的概率和所述第一像素对应的物体的二维框,获取所述候选物体的二维框。
  23. 根据权利要求22所述的电子设备,其特征在于,所述处理器具体用于:
    从多个所述第一像素组成的第一集合中获取待处理的第一像素,并将所述待处理的第一像素从所述第一集合中删除,获取更新后的第一集合;所述待处理的第一像素为所述第一集合中属于物体的概率最大的第一像素;
    对于所述更新后的第一集合中的每个第一像素,获取每个第一像素分别与所述待处理的第一像素之间的关联值;所述关联值用于指示每个第一像素对应的物体的二维框与所述待处理的第一像素对应的物体的二维框的重合程度;
    将关联值大于预设值的第一像素从所述更新后的第一集合中删除,并重新执行上述获取待处理的第一像素和更新第一集合的步骤,直至第一集合不包括第一像素为止,将所有所述待处理的第一像素确定为所述候选物体的二维框。
  24. 根据权利要求21所述的电子设备,其特征在于,所述处理器具体用于:
    将所述候选物体的二维框输入第一三维检测网络模型,获取所述候选物体属于预设物体中每个物体的概率;
    根据所述候选物体属于预设物体中每个物体的概率,获取所述待检测场景包括的物体。
  25. 根据权利要求24所述的电子设备,其特征在于,所述处理器具体用于:
    通过将所述候选物体的二维框输入第一三维检测网络模型,还获取下列补偿值中的至少一项:候选物体的朝向的补偿值、候选物体的三维位置信息的补偿值、候选物体的二维框的补偿值和候选物体的三维尺寸的补偿值;
    若根据所述候选物体属于预设物体中每个物体的概率确定所述候选物体为所述待检测场景包括的物体,则将所述候选物体对应的补偿值确定为所述物体的信息的补偿值。
  26. 根据权利要求21所述的电子设备,其特征在于,所述处理器具体用于:
    将所述候选物体的二维框输入语义预测网络模型,获取所述候选物体属于预设物体中每个物体的概率;
    根据所述候选物体属于预设物体中每个物体的概率,确定所述待检测场景包括的物体。
  27. 根据权利要求26所述的电子设备,其特征在于,所述处理器具体用于:
    将所述待检测场景包括的物体的二维框输入第二三维检测网络模型,获取所述物体的信息的补偿值,所述补偿值包括下列中的至少一项:物体的朝向的补偿值、物体的三维位置信息的补偿值、物体的二维框的补偿值和物体的三维尺寸的补偿值。
  28. 根据权利要求24至27任一项所述的电子设备,其特征在于,所述物体的信息的补偿值包括所述物体的朝向的补偿值,所述处理器具体用于:
    获取所述物体所属的预设朝向区间的中心角度;
    根据所述物体的朝向的补偿值和所述物体所属的预设朝向区间的中心角 度,获取所述物体的朝向信息。
  29. 根据权利要求24至27任一项所述的电子设备,其特征在于,所述物体的信息的补偿值包括所述物体的三维位置信息的补偿值,所述处理器具体用于:
    获取所述物体的参考点的三维位置信息;
    根据所述物体的三维位置信息的补偿值和所述物体的参考点的三维位置信息,获取所述物体的三维位置信息。
  30. 根据权利要求24至27任一项所述的电子设备,其特征在于,所述物体的信息的补偿值包括所述物体的三维尺寸的补偿值,所述处理器具体用于:
    获取所述物体对应的物体的三维尺寸的参考值;
    根据所述物体的三维尺寸的补偿值和所述物体对应的物体的三维尺寸的参考值,获取所述物体的三维尺寸信息。
  31. 根据权利要求24至27任一项所述的电子设备,其特征在于,所述物体的信息的补偿值包括所述物体的二维框的补偿值,所述处理器具体用于:
    获取所述物体对应的二维框的参考值;
    根据所述物体的二维框的补偿值和所述物体对应的二维框的参考值,获取所述物体的二维框的位置信息;
    根据所述物体的二维框的位置信息获取所述物体的深度值。
  32. 根据权利要求31所述的电子设备,其特征在于,所述处理器具体用于:
    将所述物体的二维框的位置信息输入第一区域分割网络模型,获取所述物体的表面上的稀疏点云数据;
    对所述物体的表面上的稀疏点云数据进行聚类分割,获取所述物体表面上的目标点的稀疏点云数据;
    根据所述目标点的稀疏点云数据确定所述物体的深度值。
  33. 根据权利要求31所述的电子设备,其特征在于,所述处理器具体用于:
    将所述物体的二维框的位置信息输入第二区域分割网络模型,获取所述物体的表面上的目标面的稀疏点云数据;
    根据所述目标面的稀疏点云数据获取所述物体的深度值。
  34. 根据权利要求20至33任一项所述的电子设备,其特征在于,所述物体的信息包括下列中的至少一项:三维位置信息、朝向信息、三维尺寸信息和所述物体的深度值。
  35. 根据权利要求20至33任一项所述的电子设备,其特征在于,所述处理器具体用于:
    通过至少一个雷达传感器获取所述稀疏点云数据,通过图像传感器获取所述图像。
  36. 根据权利要求35所述的电子设备,其特征在于,所述雷达传感器的数量大于1;所述处理器具体用于:
    分别通过每个所述雷达传感器获取对应的第一稀疏点云数据;
    根据所述至少一个雷达传感器的外参,将每个所述雷达传感器分别对应的第一稀疏点云数据投影到目标雷达坐标系中,获取所述稀疏点云数据。
  37. 根据权利要求35所述的电子设备,其特征在于,所述处理器具体用于:
    通过所述雷达传感器与所述图像传感器之间的外参矩阵,将所述稀疏点云数据和所述图像投影到相机坐标系中,获取所述待处理数据。
  38. 根据权利要求20至23任一项所述的电子设备,其特征在于,所述待处理数据包括:所述稀疏点云数据投影到所述目标坐标系中每个点的坐标值和反射率,以及所述图像中的像素点在所述目标坐标系中的坐标值。
  39. 一种可移动平台,其特征在于,包括:如权利要求20至38任一项所述的电子设备。
  40. 根据权利要求39所述的可移动平台,其特征在于,所述可移动平台为车辆或无人机。
  41. 一种计算机存储介质,其特征在于,所述存储介质中存储计算机程序,所述计算机程序在执行时实现如权利要求1-19中任一项所述的物体检测方法。
PCT/CN2019/090393 2019-06-06 2019-06-06 物体检测方法、电子设备和可移动平台 WO2020243962A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2019/090393 WO2020243962A1 (zh) 2019-06-06 2019-06-06 物体检测方法、电子设备和可移动平台
CN201980012209.0A CN111712828A (zh) 2019-06-06 2019-06-06 物体检测方法、电子设备和可移动平台

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/090393 WO2020243962A1 (zh) 2019-06-06 2019-06-06 物体检测方法、电子设备和可移动平台

Publications (1)

Publication Number Publication Date
WO2020243962A1 true WO2020243962A1 (zh) 2020-12-10

Family

ID=72536815

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/090393 WO2020243962A1 (zh) 2019-06-06 2019-06-06 物体检测方法、电子设备和可移动平台

Country Status (2)

Country Link
CN (1) CN111712828A (zh)
WO (1) WO2020243962A1 (zh)

