Disclosure of Invention
Embodiments of the present disclosure provide at least a positioning method, a driving control method, corresponding apparatuses, a computer device, and a storage medium.
In a first aspect, an embodiment of the present disclosure provides a positioning method, including: acquiring a target image obtained by capturing a target scene; determining, based on the target image, a first two-dimensional geometric feature of a first target object included in the target image; performing position matching on the first target object and a second target object in the target scene based on a three-dimensional geometric feature of the second target object in a scene coordinate system corresponding to the target scene, the first two-dimensional geometric feature of the first target object, and a correspondence between the first target object and the second target object; and determining target pose information of an acquisition device that acquires the target image based on the result of the position matching.
In this embodiment, the geometric features of the target objects are used to perform position matching on the first target object and the second target object; the matching process is simple, the amount of data to be processed is small, and the processing speed is high, so the efficiency of determining the target pose information of the device that acquires the target image can be improved.
In an alternative embodiment, in the case that the first target object comprises a first road marker object having a rectangular contour, the first two-dimensional geometric feature of the first target object comprises: a vertex of the first target object; in the case that the first target object comprises a second road sign object having an irregular contour, the first two-dimensional geometric feature of the first target object comprises: contour lines and/or corner points of the first target object; in the case that the first target object comprises a line-type third road marker object, the first two-dimensional geometric feature of the first target object comprises: a target line segment that belongs to the image area where the first target object is located and lies on the center line of the first target object.
In the embodiment, the corresponding first two-dimensional geometric features of different first target objects are determined in a targeted manner, so that the first two-dimensional geometric features of different first target objects can be easily identified, and the actual positions of the first target objects in the target scene can be more accurately represented.
In an optional embodiment, the determining, based on the target image, a first two-dimensional geometric feature of a first target object included in the target image includes: performing semantic segmentation processing on the target image, and determining semantic segmentation results corresponding to a plurality of pixel points in the target image respectively; and determining a first two-dimensional geometric feature of the first target object in the target image based on semantic segmentation results corresponding to a plurality of pixel points respectively and positions of the pixel points in the target image respectively.
In this embodiment, when performing semantic segmentation processing on the target image, the semantic segmentation results corresponding to the plurality of pixel points in the target image can be determined relatively accurately, so that when determining the positions of the plurality of pixel points in the target image by using the semantic segmentation results corresponding to the plurality of pixel points, the obtained first two-dimensional geometric feature is more accurate.
In an optional embodiment, in a case that the first target object includes a first road marker object having a rectangular contour, the determining, based on semantic segmentation results corresponding to a plurality of pixel points respectively and positions of the plurality of pixel points in the target image, a first two-dimensional geometric feature of the first target object in the target image includes: determining pixel points belonging to the contour of the first target object from the target image based on the semantic segmentation result; fitting to obtain a corresponding bounding box of the first target object in the target image based on pixel points belonging to the contour of the first target object; determining a first two-dimensional geometric feature of the first target object in the target image based on vertices of the bounding box.
In an optional embodiment, in a case that the first target object includes a second road sign object having an irregular contour, the determining, based on semantic segmentation results corresponding to a plurality of pixel points respectively and positions of the plurality of pixel points in the target image, a first two-dimensional geometric feature of the first target object in the target image includes: determining pixel points belonging to the outline of the first target object from the target image based on the semantic segmentation result; obtaining a contour line of the first target object based on the position of a pixel point belonging to the contour of the first target object in the target image; determining a first two-dimensional geometric feature of the first target object in the target image based on a contour line of the first target object.
In the embodiment, the contour line of the first target object can be determined by determining the pixel points belonging to the contour of the first target object from the target image, so that the vertex identification difficulty caused by irregular contour line edges determined based on the pixel points is reduced, the expression of the first two-dimensional geometric features of the first target object is simplified to a greater extent, and the matching between the same target objects is facilitated.
In an optional embodiment, in a case that the first target object includes a third road marker object of a line type, the determining, based on semantic segmentation results corresponding to a plurality of pixel points respectively and positions of the plurality of pixel points in the target image, a first two-dimensional geometric feature of the first target object in the target image includes: fitting to obtain a central line of the first target object based on the semantic segmentation result; determining a target line segment which belongs to the image area where the first target object is located and is located on the central line based on a two-dimensional coordinate value of a pixel point which is located on the central line and belongs to the image area where the first target object is located in the target image; obtaining a first two-dimensional geometric feature of the first target object based on the target line segment.
In this embodiment, the first two-dimensional geometric feature of the first target object is obtained from the determined target line segment of the first target object. This addresses the problem that, because stop lines and solid lane lines appear continuously along the road, determining only their contour lines does not allow the pose of an autonomous vehicle to be solved accurately; the first two-dimensional geometric feature of the first target object can therefore be determined more accurately.
In an optional embodiment, the method further comprises: generating the corresponding relation of a first target object and a second target object based on a three-dimensional geometrical feature of the second target object in the target scene and a first two-dimensional geometrical feature of the first target object.
In this embodiment, since the three-dimensional geometric feature of the second target object in the target scene and the first two-dimensional geometric feature of the first target object are both relatively simple geometric features, when the corresponding relationship between the first target object and the second target object is generated by using the three-dimensional geometric feature and the first two-dimensional geometric feature, the calculation amount is relatively small, the processing speed is relatively high, and therefore, the processing efficiency can be improved.
In an optional embodiment, the generating the corresponding relationship between the first target object and the second target object based on a three-dimensional geometric feature of a second target object in the target scene and a first two-dimensional geometric feature of the first target object includes: based on the initial pose information of the acquisition equipment for acquiring the target image and the three-dimensional geometrical characteristics of the second target object in the target scene, projecting the second target object into an image coordinate system of the target image to obtain a first projection geometrical characteristic of the second target object in the image coordinate system; and matching the first target object and the second target object based on the first two-dimensional geometric feature of the first target object in the image coordinate system and the first projection geometric feature of the second target object in the image coordinate system to obtain the corresponding relation between the first target object and the second target object.
In this embodiment, the determined first projection geometric feature is matched with the first two-dimensional geometric feature. On one hand, because both are simple geometric features, the amount of computation during matching is small and the efficiency is high; on the other hand, matching can be performed easily in the same image coordinate system, so the correspondence between the first target object and the second target object can be readily determined.
In an optional embodiment, the generating the corresponding relationship between the first target object and the second target object based on a three-dimensional geometric feature of a second target object in the target scene and a first two-dimensional geometric feature of the first target object includes: based on a homography matrix between target planes of the target image and the second target object, projecting a first target object in the target image to the target plane to obtain a second projection geometric feature of the first target object in the target plane; matching the first target object and the second target object based on a second projection geometric feature of the first target object in the target plane and a geometric feature of the second target object in the target plane to obtain a corresponding relation between the first target object and the second target object; wherein the geometric feature of the second target object in the target plane is determined based on the three-dimensional geometric feature of the second target object in the scene coordinate system.
In this embodiment, because the homography matrix is a transformation matrix that projects the first target object into the target plane relatively accurately, the second projection geometric feature obtained by the projection is also relatively accurate. Therefore, when the first target object is matched with the second target object, the correspondence between them can be obtained more accurately by using the second projection geometric feature of the first target object in the target plane and the geometric feature of the second target object in the target plane.
In an optional embodiment, the matching the first target object and the second target object based on the second projection geometric feature of the first target object in the target plane and the geometric feature of the second target object in the target plane to obtain the correspondence between the first target object and the second target object includes: in the case that the first target object comprises a first road sign object with a rectangular contour, matching the vertex features of the first target object in the second projection geometric feature against the vertex features determined for the second target object in its geometric feature in the target plane, to obtain the correspondence between the first target object and the second target object; in the case that the first target object comprises a second road sign object with an irregular contour, matching the contour-line and/or corner-point features of the first target object in the second projection geometric feature against the contour-line and/or corner-point features determined for the second target object in its geometric feature in the target plane, to obtain the correspondence between the first target object and the second target object; and in the case that the first target object comprises a line-type third road marker object, performing maximum graph matching on the second projection geometric feature and the geometric feature of the second target object in the target plane, to obtain the correspondence between the first target object and the second target object.
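By way of illustration only, the following Python sketch shows one way the plane-based matching described above could look. The homography H is assumed to be given, and the greedy nearest-centroid matcher is a hypothetical simplification (the disclosure also allows maximum graph matching for line-type objects).

```python
import numpy as np

def project_to_plane(points_2d, H):
    """Project image points of a first target object into the target plane.

    points_2d: (N, 2) pixel coordinates.
    H:         (3, 3) homography from the image plane to the target plane
               (assumed given; the disclosure does not fix how it is obtained).
    Returns (N, 2) coordinates in the target plane.
    """
    pts = np.hstack([points_2d, np.ones((len(points_2d), 1))])  # homogeneous
    proj = (H @ pts.T).T
    return proj[:, :2] / proj[:, 2:3]                            # perspective division

def match_by_nearest_centroid(proj_features, map_features, max_dist=1.0):
    """Greedy matching between projected first target objects and second
    target objects in the target plane (illustrative only)."""
    pairs = []
    for i, p in enumerate(proj_features):        # each entry: (N, 2) vertices
        dists = [np.linalg.norm(p.mean(0) - m.mean(0)) for m in map_features]
        j = int(np.argmin(dists))
        if dists[j] < max_dist:                  # assumed distance gate
            pairs.append((i, j))
    return pairs
```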
In an optional embodiment, the determining, based on the result of the position matching, target pose information of an acquisition device that acquires the target image includes: determining a position matching error based on the result of the position matching; and determining the target pose information of the acquisition device based on the position matching error and initial pose information of the acquisition device.
In this embodiment, the position matching error reflects the accuracy of the pose information of the device that acquires the target image, and the pose information is continuously optimized based on this matching error, so the accuracy of the target pose information is improved.
In an optional embodiment, the determining target pose information of an acquisition device acquiring the target image based on the position matching error and initial pose information of the acquisition device acquiring the target image includes: detecting whether a preset iteration stop condition is met; determining the initial pose information obtained by the last iteration as the target pose information under the condition of meeting the iteration stop condition; and under the condition that the iteration stop condition is not met, determining new initial pose information based on the position matching error and initial pose information in the last iteration process, and returning to the step of carrying out position matching on a first target object and a second target object based on the three-dimensional geometrical characteristics of the second target object in the target scene in a scene coordinate system corresponding to the target scene, the first two-dimensional geometrical characteristics of the first target object and the corresponding relation between the first target object and the second target object.
In this embodiment, by setting a preset iteration stop condition, the obtained target pose information can reach a higher confidence level, that is, the obtained target pose information is more accurate.
In an alternative embodiment, the iteration stop condition includes any one of: the iteration times are greater than a preset iteration time threshold; the position matching error of the first target object and the second target object is smaller than a preset loss threshold.
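A minimal sketch of this iterative optimization is shown below, assuming hypothetical helpers `match_error_fn` and `update_fn` (for example, one Gauss-Newton step) and assumed threshold values; the disclosure leaves these details unspecified.

```python
MAX_ITERS = 50         # preset iteration-count threshold (assumed value)
LOSS_THRESHOLD = 1e-3  # preset loss threshold (assumed value)

def optimize_pose(initial_pose, match_error_fn, update_fn):
    """Iterate until either stop condition above is met.

    match_error_fn(pose) -> scalar position matching error
    update_fn(pose, error) -> refined pose
    Both callables are placeholders for the steps described in the text.
    """
    pose = initial_pose
    for _ in range(MAX_ITERS):                  # stop: iteration count exceeded
        error = match_error_fn(pose)
        if error < LOSS_THRESHOLD:              # stop: matching error small enough
            break
        pose = update_fn(pose, error)           # re-match with the refined pose
    return pose                                 # the last pose is the target pose
```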
In an optional embodiment, the performing, based on a three-dimensional geometric feature of a second target object in the target scene in a scene coordinate system corresponding to the target scene, a first two-dimensional geometric feature of the first target object, and a correspondence between the first target object and the second target object, position matching on the first target object and the second target object includes: under the condition that the first target object comprises a second road sign object with an irregular contour, carrying out interpolation processing on the first two-dimensional geometric characteristic of the first target object to obtain a second two-dimensional geometric characteristic of the first target object; wherein the second two-dimensional geometric feature comprises: a plurality of vertices and a plurality of interpolation points; and performing point-to-point position matching on the first target object and the second target object based on the second two-dimensional geometric feature, the three-dimensional geometric feature and the corresponding relation between the first target object and the second target object.
In this embodiment, by performing interpolation processing on the first two-dimensional geometric feature of the first target object, weights between different semantics can be balanced, and the problem of poor matching when position matching processing is performed using fewer vertices can be alleviated.
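For illustration, one simple way to obtain the second two-dimensional geometric feature (vertices plus interpolation points) might look as follows; the interpolation density is an assumed tuning value, not something the disclosure specifies.

```python
import numpy as np

def densify_polygon(vertices, points_per_edge=10):
    """Insert evenly spaced interpolation points along each polygon edge of
    an irregular-contour object, yielding vertices plus interpolation points.
    points_per_edge is an assumed value."""
    dense = []
    n = len(vertices)
    for k in range(n):
        a = np.asarray(vertices[k], dtype=float)
        b = np.asarray(vertices[(k + 1) % n], dtype=float)
        for t in np.linspace(0.0, 1.0, points_per_edge, endpoint=False):
            dense.append(a + t * (b - a))       # point interpolated along edge a->b
    return np.array(dense)
```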
In an optional embodiment, the performing, based on a three-dimensional geometric feature of a second target object in the target scene in a scene coordinate system corresponding to the target scene, a first two-dimensional geometric feature of the first target object, and a correspondence between the first target object and the second target object, position matching on the first target object and the second target object includes: based on initial pose information of the acquisition equipment for acquiring the target image and three-dimensional geometric features of a second target object in the target scene in a scene coordinate system corresponding to the target scene, projecting the second target object into the image coordinate system of the target image to obtain third projection geometric features of the second target object in the image coordinate system; and performing position matching on the first target object and the second target object which have the corresponding relation based on the third projection geometric feature of the second target object in the image coordinate system and the first two-dimensional geometric feature of the first target object.
In a second aspect, an embodiment of the present disclosure provides a driving control method for an intelligent driving device, including: acquiring video frame data acquired by an intelligent driving device in the driving process; processing the video frame data by using the positioning method in the first aspect or any optional implementation manner of the first aspect, and detecting a target object in the video frame data; and controlling the intelligent driving device based on the detected target object.
In this embodiment, the positioning method provided by the embodiments of the present disclosure can determine the target pose information more efficiently, which makes the method better suited to deployment in an intelligent driving device, improves safety during automatic driving control, and better meets the requirements of the automatic driving field.
In a third aspect, an embodiment of the present disclosure further provides a positioning apparatus, including: the first acquisition module is used for acquiring a target image acquired by acquiring a target scene; a first determining module, configured to determine, based on the target image, a first two-dimensional geometric feature of a first target object included in the target image; a matching module, configured to perform position matching on a first target object and a second target object in the target scene based on a three-dimensional geometric feature of the second target object in a scene coordinate system corresponding to the target scene, a first two-dimensional geometric feature of the first target object, and a correspondence between the first target object and the second target object; and the second determination module is used for determining the target pose information of the acquisition equipment for acquiring the target image based on the position matching result.
In a fourth aspect, an embodiment of the present disclosure further provides a driving control device of an intelligent driving device, including:
the second acquisition module is used for acquiring video frame data acquired by the intelligent driving device in the driving process;
a detection module, configured to process the video frame data by using the positioning method in the first aspect or any optional implementation manner of the first aspect, and detect a target object in the video frame data;
and the control module is used for controlling the intelligent driving device based on the detected target object.
In a fifth aspect, the present disclosure provides a computer device including a processor and a memory, where the memory stores machine-readable instructions executable by the processor, and the processor is configured to execute the machine-readable instructions stored in the memory; when executed by the processor, the machine-readable instructions cause the processor to perform the steps in any one of the possible implementations of the first aspect or the second aspect.
In a sixth aspect, alternative implementations of the present disclosure further provide a computer-readable storage medium having a computer program stored thereon, where the computer program, when executed, performs the steps in any of the possible implementations of the first or second aspect.
For the description of the effects of the above apparatus, computer device, and computer-readable storage medium, reference is made to the description of the corresponding method, which is not repeated herein.
In order to make the aforementioned objects, features and advantages of the present disclosure more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Detailed Description
In order to make the objects, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions of the embodiments will be described clearly and completely with reference to the accompanying drawings; obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure. The components of the embodiments of the present disclosure, as generally described and illustrated herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments is not intended to limit the scope of the disclosure as claimed, but merely represents selected embodiments. All other embodiments obtained by a person skilled in the art from the embodiments of the present disclosure without creative effort shall fall within the protection scope of the present disclosure.
Research shows that, in vision-based positioning methods, a scene image is obtained by photographing a target scene, feature points of the scene are extracted, and the extracted feature points are matched with feature points in a three-dimensional scene model pre-built for the target scene, thereby obtaining the pose information, in the target scene, of the image acquisition device that captured the scene image. For an autonomous vehicle moving at high speed, its pose information must be determined in real time, efficiently, and accurately in order to ensure safety; in current vision-based positioning methods, however, solving the pose information from the feature-point matching relationship consumes considerable time, so the efficiency is low and the requirements of the automatic driving field cannot be met.
Based on this research, the present disclosure provides a positioning method, a driving control method, an apparatus, a computer device, and a storage medium, in which the geometric features of target objects are used to perform position matching on a first target object and a second target object, and the target pose information of the device that acquired the target image is obtained based on the result of the position matching.
The above drawbacks were identified by the inventor through practice and careful study; therefore, the discovery of the above problems, and the solutions to them proposed by the present disclosure hereinafter, should both be regarded as contributions of the inventor to the present disclosure.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
To facilitate understanding of the present embodiment, a positioning method disclosed in the embodiments of the present disclosure is first described in detail. The execution subject of the positioning method provided in the embodiments of the present disclosure is generally a computer device with certain computing capability, for example: a terminal device, which may be User Equipment (UE), a mobile device, a user terminal, a cellular phone, a cordless phone, a Personal Digital Assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, or a wearable device; or a server or other processing device. In some possible implementations, the positioning method may be implemented by a processor calling computer-readable instructions stored in a memory.
The following describes a positioning method provided in the embodiments of the present disclosure.
Referring to fig. 1, a flowchart of a positioning method provided in the embodiment of the present disclosure is shown, where the method includes steps S101 to S104, where:
S101: acquiring a target image obtained by capturing a target scene;
S102: determining a first two-dimensional geometric feature of a first target object included in the target image based on the target image;
S103: performing position matching on the first target object and a second target object based on a three-dimensional geometric feature of the second target object in the target scene in a scene coordinate system corresponding to the target scene, the first two-dimensional geometric feature of the first target object, and a correspondence between the first target object and the second target object;
S104: determining target pose information of the acquisition device that acquires the target image based on the result of the position matching.
According to the method and the device, a target image obtained by capturing a target scene is acquired; after the two-dimensional geometric feature of a first target object in the target image is determined, position matching is performed on the first target object and a second target object in the target scene by combining the three-dimensional geometric feature of the second target object in the target scene, and the target pose information of the device that acquired the target image is determined based on the matching result.
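Purely as an illustrative summary, the flow of S101 to S104 can be sketched as follows; every function name here is hypothetical, and the four callables stand in for the steps detailed in the remainder of this description.

```python
def locate(target_image, scene_map, initial_pose,
           extract_2d_features, build_correspondence,
           position_match, solve_pose):
    """Top-level flow of S101-S104 (all callables are placeholders)."""
    features_2d = extract_2d_features(target_image)                   # S102
    corr = build_correspondence(features_2d, scene_map, initial_pose)
    match_result = position_match(features_2d, scene_map, corr,
                                  initial_pose)                       # S103
    return solve_pose(match_result, initial_pose)                     # S104
```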
The following describes the details of S101 to S104.
For the above S101, the positioning method provided by the embodiment of the present disclosure may be applied to various fields, such as the field of automatic driving and the field of smart warehousing; in the case of application to the field of automatic driving, the target scene includes, for example, an automatic driving road scene; in this case, the target objects in the target scene may comprise road sign objects, for example comprising at least one of: guideboards, traffic lights, zebra crossings, road signs, stop lines, solid lane lines, and dashed lane lines, to name a few.
In addition, in the case of applying the positioning method to the field of automatic driving, the target scene may further include a parking lot scene; target objects within the target scene include, for example: the system comprises a monitoring camera, an induction card reader, a vehicle detection line, a road pointing sign, a stop line and the like.
In the case of application in the field of smart warehousing, the target scenario includes, for example, a warehouse; the target objects in the target scene include, for example, shelves in a warehouse, indicating landmarks, and the like.
Here, the driving vehicle and the corresponding target scene may be determined according to actual conditions, and are not limited herein.
In the embodiment of the disclosure, a target scene is taken as an example of a road on which an autonomous vehicle runs, and when a target image of the target scene is acquired, for example, an image acquisition device may be installed on the autonomous vehicle; the image acquisition equipment can scan a target scene in real time to obtain a target image of the target scene.
In addition, in the example of the present disclosure, other devices such as a radar, a depth camera, and a distance sensor may also be installed on the autonomous vehicle, and the autonomous vehicle may also perform positioning based on data acquired by the other devices, and obtain a more accurate positioning result by combining a positioning result determined by using the positioning method provided by the embodiment of the present disclosure and a positioning result determined by using detection data acquired by the other devices.
For the above step S102, after the target image is obtained, the first target object included in the target image and the first two-dimensional geometric feature of the first target object in the target image may be determined by using the target image. The first target object may include at least one of a road sign, a traffic light, a zebra crossing, a road surface pointing sign, a prohibition sign, a stop line, a lane solid line, and a lane dotted line in the target image, for example; the target object is a relatively universal symbolic element in the target scene. Since different target objects appear as different graphic features when shown in the target image, the corresponding two-dimensional geometric information can be determined according to the different target objects in the target image. Wherein:
(1): When the first target object comprises at least one of a guideboard, a traffic light, a zebra crossing, and a dashed lane line, the graphical feature of the first target object includes, for example, having a rectangular contour. In this case, the first target object is a first road marker object having a rectangular contour, and its first two-dimensional geometric feature comprises the vertices of the first target object. For example, when the first target object is a zebra crossing, its contour may be shown by a plurality of consecutive rectangles, and the corresponding first two-dimensional geometric feature may include, for example, the vertices of one of the rectangles, or the vertices of each of the plurality of consecutively shown rectangles.
(2): When the first target object comprises at least one of a road surface pointing sign and a prohibition sign, the graphical feature of the first target object includes, for example, having an irregular contour. In this case, the first target object is a second road sign object having an irregular contour, and its first two-dimensional geometric feature comprises contour lines and/or corner points of the first target object. For example, when the first target object includes a right-turn direction sign among the road surface pointing signs, its contour cannot be composed directly of basic contour shapes such as rectangles or circles, so the corresponding first two-dimensional geometric feature corresponds directly to its irregular contour line.
(3): when the first target object includes at least one of a stop-line and a lane-solid line, the graphical feature of the first target object includes, for example, being shown in a line shape. At this time, the first target object is a third road sign including a line type. The first two-dimensional geometric feature of the first target object comprises: target line segments on the center lines of two adjacent first target objects. For example, when the first target object includes a lane solid line, since the lane solid line is long, only one end of the lane solid line may be shown in the target image, or neither end of the lane solid line may be shown, and the geometric feature corresponding to the corresponding first two-dimensional geometric feature corresponds to a target line segment that belongs to the image area where the first target object is located and is located on the center line of the first target object.
Here, the first two-dimensional geometric feature of the first target object in the target image may be represented as, for example, two-dimensional coordinate values of the first target object in an image coordinate system corresponding to the target image.
For example, the vertex of the first target object may be represented as a two-dimensional coordinate value of the vertex in the image coordinate system; the contour line may be represented, for example, as a two-dimensional coordinate value of an end point of the contour line in the image coordinate system; the target line segment may be represented as two-dimensional coordinate values of the end points of the line segment in the image coordinate system.
An image coordinate system corresponding to the target image may be determined according to any pixel point in the target image, for example, an image coordinate system may be determined by using any pixel point as a coordinate origin; specifically, the determination may be based on the installation position of the image acquisition device in the autonomous vehicle; for example, if the image capturing device is installed at a higher position and the field of view is also higher, the image coordinate system may be determined by using the pixel point at the lower position in the target image as the origin. If the image acquisition equipment is installed at a lower position and the visual field is also lower, the pixel point with the higher position in the target image can be used as the origin to determine the image coordinate system. In addition, an image coordinate system can also be established by taking a projection pixel point of an optical axis of the image acquisition equipment in the target image as an origin. The specific determination may be determined according to actual situations, and details are not described herein.
Referring to fig. 2, a specific method, provided by an embodiment of the present disclosure, for determining, based on the target image, a first two-dimensional geometric feature of a first target object included in the target image includes:
s201: and performing semantic segmentation processing on the target image, and determining semantic segmentation results corresponding to a plurality of pixel points in the target image respectively.
In this embodiment, when performing semantic segmentation processing on the target image, at least one of the following methods may be used, for example: Fully Convolutional Networks (FCN), Convolutional Neural Networks combined with Conditional Random Fields (CNN-CRF), encoder-decoder models, and Feature Pyramid Networks (FPN).
After the semantic segmentation processing is performed on the target image, semantic segmentation results corresponding to a plurality of pixel points in the target image can be determined. Referring to fig. 3, a schematic diagram of a semantic segmentation map obtained by performing semantic segmentation on a target image according to an embodiment of the present disclosure is shown. In fig. 3, the area indicated by 31 represents a traffic light, the area indicated by 32 represents a road solid line, the area indicated by 33 represents a road surface directional sign, and the area indicated by 34 represents a road dotted line.
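As a hedged illustration, semantic segmentation of the target image could be run with an off-the-shelf FCN as below; the pretrained model and its class set are stand-ins, since the disclosure does not mandate a particular network and road-marking classes would require a suitably trained model.

```python
import torch
from torchvision import transforms
from torchvision.models.segmentation import fcn_resnet50

# Pretrained FCN used only as a stand-in for whichever segmentation
# network is actually deployed.
model = fcn_resnet50(weights="DEFAULT").eval()

preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def segment(image):
    """Return a per-pixel class map for the target image (PIL image in)."""
    x = preprocess(image).unsqueeze(0)
    with torch.no_grad():
        logits = model(x)["out"]          # (1, num_classes, H, W)
    return logits.argmax(dim=1)[0]        # (H, W) semantic labels per pixel
```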
S202: and determining a first two-dimensional geometric feature of the first target object in the target image based on semantic segmentation results corresponding to the plurality of pixel points respectively and positions of the plurality of pixel points in the target image respectively.
Here, when the first two-dimensional geometric feature of the first target object in the target image is determined according to the semantic segmentation result corresponding to each of the different pixel points and the positions of the multiple pixel points in the target image, for different first target objects, the corresponding method for determining the two-dimensional geometric feature of the first target object in the target image is also different.
In particular, when determining a first two-dimensional geometric feature of a first target object in a target image:
a: for the case that the first target object comprises a first road marker object having a rectangular contour, the first two-dimensional geometric feature of the first target object in the target image may be determined, for example, in the following manner as shown in fig. 4:
S401: determining pixel points belonging to the contour of the first target object from the target image based on the semantic segmentation result;
S402: fitting to obtain a corresponding bounding box of the first target object in the target image based on the pixel points belonging to the contour of the first target object;
S403: determining a first two-dimensional geometric feature of the first target object in the target image based on the vertices of the bounding box.
Illustratively, when determining the pixel points belonging to the contour of the first target object based on the semantic segmentation result, for example, based on the semantic segmentation result of each pixel point in the target image, a region formed by all the pixel points of the first target object is determined from the target image, and then the pixel points at the edge of the region are determined as the pixel points belonging to the contour of the first target object.
After the pixel points belonging to the contour of the first target object are determined, the contour line they form is usually a wavy or jagged line with small-amplitude fluctuation. On one hand, because the edge of such a contour line is irregular, vertices determined from it are difficult to identify when used to represent the geometric feature of the first target object; on the other hand, representing the geometric feature by the vertices determined from the contour line involves a large amount of data, which is unfavorable for matching between the same target objects. Therefore, in the embodiments of the present disclosure, the bounding box of the first target object in the target image is obtained by fitting based on the pixel points belonging to the contour of the first target object, and the first two-dimensional geometric feature of the first target object is formed from the vertices of the bounding box. This greatly simplifies the expression of the first two-dimensional geometric feature, facilitates matching between the same target objects, and reduces the identification difficulty.
When the corresponding bounding box of the first target object in the target image is obtained through fitting based on the pixel points belonging to the contour of the first target object, for example, a plurality of straight lines may be determined based on the pixel points belonging to the contour of the first target object, and the bounding box of the first target object may be obtained through fitting the plurality of straight lines.
In addition, since the actual size of the first target object in the target scene is large, and deformation caused by an excessively large pitch angle during shooting is less likely to occur, the obtained bounding box in the target image is approximately rectangular.
For example, the bounding box obtained by fitting based on the pixel points belonging to the contour of the first target object may partially overlap the contour line formed by those pixel points, or may bound the region where the first target object is located. The bounding box encloses the pixel points corresponding to the first target object in the target image, so the first two-dimensional geometric feature of the first target object can be represented relatively accurately by the vertices of the bounding box, and the specific position of the first target object in the target image can be represented by the coordinate values of those vertices in the image coordinate system corresponding to the target image.
For example, after the bounding box is determined, the position of the rectangular box in the target image can be determined from the two-dimensional coordinate values of a pair of opposite corners of the rectangular box; for example, the two-dimensional coordinate values of the top-left and bottom-right vertices of the bounding box, or of its top-right and bottom-left vertices, in the target image can be used as the specific position of the first target object in the target image. In this way, the geometric feature of the first target object can be expressed accurately while reducing the amount of data in subsequent processing (such as position matching of the first target object and the second target object).
Alternatively, for example, the two-dimensional coordinate values of the four corners of the rectangular frame may be directly determined as the representation of the specific position of the first target object in the target image. The specific position of the first target object in the target image obtained by the method is processed, so that the first two-dimensional geometric feature of the target object has higher readability.
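A minimal OpenCV sketch of this fitting step is given below, assuming a binary mask for one first road marker object has been extracted from the semantic segmentation result; the fitting strategy (a minimum-area rotated rectangle) is one plausible choice, not the only one the disclosure covers.

```python
import cv2
import numpy as np

def fit_bounding_box(mask):
    """Fit the bounding box of a first road marker object from its
    semantic-segmentation mask (uint8, nonzero inside the object).

    Returns the four box vertices; per the text above, a pair of opposite
    corners may suffice to represent the position.
    """
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    contour = max(contours, key=cv2.contourArea)   # largest region's outline
    rect = cv2.minAreaRect(contour)                # rotated-rectangle fit
    return cv2.boxPoints(rect)                     # (4, 2) vertex array
```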
Referring to fig. 5, a schematic diagram of determining a first two-dimensional geometric feature of a traffic light based on a semantic segmentation result of the traffic light according to an embodiment of the present disclosure is shown; wherein, a in fig. 5 represents a schematic diagram of a semantic segmentation result of the first target object in the target image when the first target object includes a traffic light; in fig. 5, b represents a schematic diagram of a bounding box corresponding to the first target object, and 51 and 52 represent two vertices corresponding to the bounding box.
B: for the case that the first target object comprises a second road-marking object having an irregular contour, the first two-dimensional geometrical feature of the first target object in the target image may be determined, for example, in the manner shown in fig. 6 below:
s601: determining pixel points belonging to the outline of the first target object from the target image based on the semantic segmentation result;
s602: obtaining a contour line of the first target object based on the position of a pixel point belonging to the contour of the first target object in the target image;
s603: based on the contour line of the first target object, a first two-dimensional geometric feature of the first target object in the target image is determined.
In a specific implementation, in the case that the first target object includes the road surface directional marker, since the shape of the first target object is irregular, when determining the first two-dimensional geometric feature of the first target object, the first two-dimensional geometric feature of the first target object needs to be determined based on pixel points belonging to the contour of the first target object.
Taking the road surface pointing sign as an example, the vertices used to determine the first two-dimensional geometric feature may, for example, be chosen as the turning points with large angular change along the edge surrounding the first target object. Because such turning points in the contour line of the first target object correspond to large actual changes in position (fluctuations, turns, and the like), even small-amplitude fluctuation of the contour line interferes little with vertex identification; the contour line determined based on the semantic segmentation result is therefore easy to identify when determining the vertices, which facilitates determining the specific position of the first target object in the target image.
Illustratively, referring to fig. 7, a schematic diagram is shown for determining the first two-dimensional geometric feature of a straight-ahead pointing sign based on its semantic segmentation result, as provided by an embodiment of the present disclosure. In fig. 7, a shows the semantic segmentation map corresponding to the straight-ahead pointing sign, and b shows the contour line of the first target object, which is an irregular arrow shape. In fig. 7, b also marks a plurality of vertices 71, 72, and 73, obtained by recognition, which specify the contour of the first target object in the target image.
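For illustration, contour and corner extraction for an irregular object could be sketched with OpenCV as follows; the polygon-approximation tolerance is an assumed value, and `cv2.approxPolyDP` is used here as one way of keeping only the large-angle turning points mentioned above.

```python
import cv2

def contour_and_corners(mask, epsilon_ratio=0.01):
    """Extract the contour of an irregular road sign object and a reduced
    set of corner points (epsilon_ratio is an assumed tolerance)."""
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)
    contour = max(contours, key=cv2.contourArea)
    eps = epsilon_ratio * cv2.arcLength(contour, True)
    corners = cv2.approxPolyDP(contour, eps, True)  # keeps sharp turning points
    return contour.reshape(-1, 2), corners.reshape(-1, 2)
```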
C: for the case that the first target object comprises a linear third road marker object, the first two-dimensional geometric feature of the first target object in the target image may be determined, for example, in the following manner as shown in fig. 8:
S801: fitting to obtain a center line of the first target object based on the semantic segmentation result;
S802: determining a target line segment that belongs to the image area where the first target object is located and lies on the center line, based on the two-dimensional coordinate values, in the target image, of the pixel points that are located on the center line and belong to the image area where the first target object is located;
S803: obtaining a first two-dimensional geometric feature of the first target object based on the target line segment.
In a specific implementation, when the first target object includes a target object at least one end of whose contour is not shown in the target image, the following applies: because stop lines and solid lane lines appear continuously on the road during normal driving, determining only their contour lines does not allow the pose of the autonomous vehicle to be solved accurately; therefore, target line segments of the stop line and the solid lane line are selected as the first two-dimensional geometric features of the first target object.
The target line segment may be represented as a two-dimensional coordinate value of an end point of the target line segment in the target image.
For example, when determining the target line segment of the first target object, the center line of the first target object may be determined first. When extracting the center line based on the semantic segmentation result of the first target object, at least one of the following methods may be employed, for example: topology-refinement-based methods, distance-transformation-based methods, path-planning-based methods, and tracking-based methods. For example, when the center line is extracted by a topology-refinement-based method, the semantic segmentation map corresponding to the first target object may be determined, and its boundary may then be iteratively eroded according to morphological principles until the center line corresponding to the first target object in the semantic segmentation map is obtained. Because the different center-line extraction methods suit different scenes and place different requirements on image quality, the choice of method and its specific implementation may be determined according to the actual situation and are not described here again.
After determining the centerline of the first target object, two points may also be determined on the centerline as end points of the target line segment.
Specifically, when determining the end points of the target line segment on the centerline, for example, a point-by-point search method may be adopted to determine one by one whether the point on the centerline has a corresponding pixel point in the image region corresponding to the first target object; or determining pixel points located on the central line in the image region corresponding to the first target object. At this time, all pixel points which belong to the image area where the first target object is located and are located on the central line can be determined, and the two pixel points with the farthest distance are used as the end points of the target line segment. Then, the target line segment determined by the two pixel points with the farthest distance is used as the first two-dimensional geometric feature of the first target object.
Referring to fig. 9, a schematic diagram of determining a first two-dimensional geometric feature of a lane solid line based on a semantic segmentation result of the lane solid line provided by an embodiment of the present disclosure is shown. In fig. 9, a represents a semantic segmentation diagram of a lane solid line, and b represents a target line segment corresponding to the lane solid line in fig. 9.
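A sketch of the center-line and target-line-segment extraction is shown below, assuming scikit-image is available; the brute-force search for the farthest pair of center-line pixels stands in for the point-by-point search described above.

```python
import numpy as np
from skimage.morphology import skeletonize

def target_line_segment(mask):
    """Approximate the target line segment of a line-type marker: thin the
    mask to a center line, then take the two center-line pixels farthest
    apart as the segment endpoints."""
    skeleton = skeletonize(mask > 0)               # topology-refined center line
    ys, xs = np.nonzero(skeleton)
    pts = np.stack([xs, ys], axis=1).astype(float)
    # Brute-force farthest pair; adequate for the modest number of
    # center-line pixels in a cropped image region.
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    i, j = np.unravel_index(np.argmax(d), d.shape)
    return pts[i], pts[j]                          # two segment endpoints
```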
At this time, the first two-dimensional geometric feature of the first target object may be determined upon completion of at least one of A, B, and C. In an implementation, when the first target object comprises at least one of a first road sign object and a second road sign object, its first two-dimensional geometric feature may be represented as p_j, where j denotes the j-th first target object among the plurality of first target objects. When the first target object comprises a third road marker object, its first two-dimensional geometric feature may be represented as l_i, where i denotes the i-th first target object among the plurality of first target objects.
For the above S103, the second target object corresponds to the first target object, and may include at least one of a guideboard, a traffic light, a zebra crossing, a solid lane line, a road surface indicator, a prohibition sign, a stop line, and a dashed lane line, for example. The scene coordinate system corresponding to the target scene where the second target object is located may include, for example, a scene coordinate system established in advance. The scene coordinate system is a three-dimensional coordinate system established for the target scene. Specifically, a world coordinate system can be directly selected as a scene coordinate system, or a scene coordinate system can be established by taking any position point in a target scene as an origin. The specific scene coordinate system may be determined according to actual conditions, and is not described herein again.
In the case that the scene coordinate system of the target scene is determined, the three-dimensional geometric features of the second target object in the target scene in the scene coordinate system corresponding to the target scene may be determined. The three-dimensional geometric feature of the second target object may be predetermined, for example by Simultaneous Localization and Mapping (SLAM) modeling or Structure-from-Motion (SFM) modeling. The method for specifically determining the three-dimensional geometric feature of the second target object may be chosen according to the actual situation and is not described here again.
In a specific implementation, the three-dimensional geometrical feature of the second target object comprises, for example, respective vertices in the second target object, in case the second target object comprises at least one of a first road sign object and a second road sign object. In addition, for the case that the second target object comprises a second road sign object having an irregular contour, the corresponding three-dimensional geometrical feature may for example also comprise corner points of the second target object.
Wherein the three-dimensional geometric feature of the second target object can be represented by the three-dimensional coordinate values of the respective vertices in the three-dimensional coordinate system.
In the case that the second target object comprises a third road marker object, its three-dimensional geometric feature comprises, for example, a line segment that lies on the center line of the second target object and belongs to the second target object. Here, the three-dimensional geometric feature of the second target object can be represented by the three-dimensional coordinate values of the endpoints of the line segment in the three-dimensional coordinate system.
In a specific implementation, when the second target object comprises at least one of a target object whose contour is shown by at least one rectangle and a target object shown by an irregular figure, its three-dimensional geometric feature may be represented as P_j, where j denotes the j-th second target object among the plurality of second target objects. When the second target object comprises a target object at least one end of whose contour is not shown in the target image, its three-dimensional geometric feature may be represented as L_i, where i denotes the i-th second target object among the plurality of second target objects.
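Illustratively, the three-dimensional geometric features P_j and L_i could be stored in structures like the following; this is only a sketch, as the disclosure does not prescribe any data layout.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class PlanarMapObject:          # P_j: guideboards, traffic lights, signs, ...
    object_id: int
    vertices_3d: np.ndarray     # (N, 3) scene-coordinate vertices (and, for
                                # irregular contours, corner points)

@dataclass
class LineMapObject:            # L_i: stop lines, solid lane lines
    object_id: int
    endpoints_3d: np.ndarray    # (2, 3) endpoints of the center-line segment
```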
At this time, the corresponding relationship between the first target object and the second target object may also be generated based on the three-dimensional geometric feature of the second target object in the target scene and the first two-dimensional geometric feature of the first target object.
The embodiment of the present disclosure further provides a specific method for generating the correspondence between the first target object and the second target object, including: generating the correspondence between the first target object and the second target object based on the three-dimensional geometric feature of the second target object in the target scene and the first two-dimensional geometric feature of the first target object.
Illustratively, in the case that the second target object includes at least one of a target object whose contour is shown by at least one rectangle and a target object shown by an irregular figure, the correspondence between the first target object and the second target object may be generated in the following manner:
based on initial pose information of acquisition equipment for acquiring a target image and three-dimensional geometric characteristics of a second target object in a target scene, projecting the second target object into an image coordinate system of the target image to obtain first projection geometric characteristics of the second target object in the image coordinate system;
matching the first target object and the second target object based on the first two-dimensional geometric feature of the first target object in the image coordinate system and the first projection geometric feature of the second target object in the image coordinate system, to obtain the correspondence between the first target object and the second target object.
In a specific implementation, when acquiring the initial pose information of the acquisition device that acquires the target image, since the pose of an autonomous vehicle generally does not change to a large extent while it travels normally on a road in the target scene, previously determined pose information of the autonomous vehicle on the road may, for example, be acquired in advance as the initial pose information. Alternatively, since information about the road in the target scene is easily obtained, the position information of the road may be determined using, for example, a Global Positioning System (GPS), and the initial pose information of the autonomous vehicle may then be estimated based on the position information of the road in the target scene. The initial pose information may be represented as T_0. The specific method of determining the initial pose information may be chosen according to the actual situation, and details are not described herein again.
At this time, since the target image may be obtained based on, for example, an image acquisition device mounted on the autonomous vehicle, the pose information of the acquisition device that acquires the target image is associated with the pose information of the autonomous vehicle. The pose information of the autonomous vehicle may be determined from the pose information of the acquisition device while ignoring the relative pose relationship between the image acquisition device and the autonomous vehicle, or may be determined in another pose resolution manner, for example, based on the relative pose relationship between the image acquisition device and the autonomous vehicle.
After determining the initial pose information of the acquisition equipment for acquiring the target image and the three-dimensional geometric features of the second target object in the target scene, the first projection geometric features of the second target object in the image coordinate system can be obtained by projecting the second target object into the image coordinate system of the target image.
In a specific implementation, when the second target object is projected into the image coordinate system of the target image, that is, the three-dimensional geometric feature of the second target object is projected into the image coordinate system corresponding to the target image, for example, the following method may be adopted: converting the three-dimensional geometrical characteristics of the second target object from a scene coordinate system to a world coordinate system to complete model space conversion from a scene space coordinate to a world space coordinate; and then converting the three-dimensional geometric characteristics converted to the world coordinate system from the world coordinate system to an image coordinate system based on the initial pose information of the acquisition equipment for acquiring the target image, and completing the observation space conversion from the world space coordinate to the camera space coordinate. At this time, the first projection geometric feature of the second target object in the image coordinate system can be obtained. The method for specifically determining the first projection geometric feature of the second target object may be determined according to an actual situation, and is not described herein again.
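For illustration only, the projection chain described above (scene coordinates → world coordinates → camera coordinates → image coordinates) may be sketched as follows; a pinhole camera model with intrinsic matrix K is assumed, and all function and variable names are introduced here for illustration rather than taken from the disclosure:

```python
import numpy as np

def project_to_image(points_scene, T_scene_to_world, T_world_to_camera, K):
    """Project (N, 3) scene-coordinate points into the image coordinate system.

    points_scene:      (N, 3) vertices of the second target object.
    T_scene_to_world:  (4, 4) model-space transform (scene -> world).
    T_world_to_camera: (4, 4) view transform built from the initial pose T_0.
    K:                 (3, 3) camera intrinsic matrix (assumed pinhole model).
    """
    n = points_scene.shape[0]
    homo = np.hstack([points_scene, np.ones((n, 1))])           # (N, 4) homogeneous
    cam = (T_world_to_camera @ T_scene_to_world @ homo.T)[:3]   # (3, N) camera coords
    uv = K @ cam                                                # perspective projection
    return (uv[:2] / uv[2]).T                                   # (N, 2) pixel coordinates
```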
At this time, since there is a corresponding relationship between the first target object in the target image and the second target object in the target scene, there is also a corresponding relationship between the obtained first projection geometric feature of the second target object in the image coordinate system and the first two-dimensional geometric feature of the first target object in the image coordinate system, so that the first target object and the second target object can be matched based on the first two-dimensional geometric feature of the first target object in the image coordinate system and the first projection geometric feature of the second target object in the image coordinate system to obtain a corresponding relationship between the first target object and the second target object.
When the first target object and the second target object are matched, for example, K-Nearest Neighbor (KNN) matching or another matching method may be used to determine the correspondence between the first target object and the second target object; the specific matching process is not described herein again.
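As a non-limiting sketch of such nearest-neighbor matching (the use of representative points such as centroids, and the pixel distance threshold, are assumptions made for illustration):

```python
import numpy as np
from scipy.spatial import cKDTree

def nearest_neighbor_match(first_pts, projected_pts, max_dist=50.0):
    """Match each first target object (by a representative point, e.g. its
    centroid) to the nearest projected second target object in the image
    coordinate system; pairs farther apart than max_dist pixels are treated
    as unmatched."""
    tree = cKDTree(projected_pts)
    dists, idx = tree.query(first_pts)   # nearest projected feature per point
    return [(i, int(j)) for i, (d, j) in enumerate(zip(dists, idx))
            if d <= max_dist]
```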
At this time, the corresponding relationship between the first target object and the second target object when the second target object includes at least one of a target object whose outline is shown by at least one rectangle or a target object shown by an irregular figure may be determined.
When the second target object includes at least one of a target object whose outline is shown by at least one rectangle or a target object shown by an irregular figure, a specific method for generating a corresponding relationship between the first target object and the second target object includes:
based on the homography matrix between the target image and the target plane where the second target object is located, projecting the first target object in the target image into the target plane to obtain a second projection geometric feature of the first target object in the target plane; and matching the first target object and the second target object based on the second projection geometric feature of the first target object in the target plane and the geometric feature of the second target object in the target plane, to obtain the correspondence between the first target object and the second target object. The geometric feature of the second target object in the target plane is determined based on the three-dimensional geometric feature of the second target object in the scene coordinate system.
Here, the homography matrix between the target image and the target plane where the second target object is located is used to project the first target object in the target image into the target plane.
For example, when acquiring the homography matrix between the target image and the target plane where the second target object is located, coordinate values of pixel points of the second target object on the target plane may be acquired; the geometric feature of the second target object in the target plane is determined based on the three-dimensional geometric feature of the second target object in the scene coordinate system and may be represented, for example, as O_1 = (x_1, y_1, z_1). Coordinate values O_2 = (x_2, y_2, z_2) of the corresponding positions, on the target image, of at least two pixel points included in the second target object may also be obtained. The homography matrix, which may be denoted by H, can then be determined, for example, according to the following formula (1):
O_2 = H · O_1    (1)
Here, z_1 in the coordinate value O_1 and z_2 in O_2 are 0 within the respective planes; therefore, z_1 = 1 and z_2 = 1 may be set when computing the homography matrix. The specific process of obtaining the homography matrix H is not described herein again.
After determining the homography H, the first target object may be projected into the target plane, for example, to obtain a second projection geometry of the first target object in the target plane.
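The two steps above may be sketched, for example, with OpenCV, where findHomography solves formula (1) from point correspondences and perspectiveTransform applies H⁻¹ to project image points of the first target object into the target plane; the RANSAC option and the (N, 2) point layout are illustrative assumptions:

```python
import numpy as np
import cv2

def plane_to_image_homography(plane_pts, image_pts):
    """Solve O_2 = H * O_1 from (N, 2) plane points O_1 and their (N, 2)
    image points O_2 (N >= 4); RANSAC rejects outlier correspondences."""
    H, _ = cv2.findHomography(np.float64(plane_pts), np.float64(image_pts),
                              cv2.RANSAC)
    return H

def project_image_to_plane(H, image_pts):
    """Apply H^-1 to project first-target-object pixels into the target
    plane, yielding the second projection geometric feature."""
    pts = np.float64(image_pts).reshape(-1, 1, 2)
    return cv2.perspectiveTransform(pts, np.linalg.inv(H)).reshape(-1, 2)
```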
After the second projection geometric feature is determined, the first target object and the second target object may be matched based on the second projection geometric feature of the first target object in the target plane and the geometric feature of the second target object in the target plane, so as to obtain the correspondence between the first target object and the second target object.
In a specific implementation, in the case that the first target object comprises a first road sign object having a rectangular contour, the features characterizing the vertices of the first target object in the second projection geometric feature are matched against the features characterizing the vertices of the second target object in the geometric feature of the second target object in the target plane, so as to obtain the correspondence between the first target object and the second target object.
Here, since the first target object includes the first road sign having the rectangular contour, the correspondence between the first target object and the second target object can be determined more easily by matching the feature representing the vertex of the first target object in the second projection geometric feature and the feature representing the vertex determined by the second target object in the geometric feature of the second target object in the target plane. By adopting the way of matching points in the geometric features, the accuracy is higher while the computation amount is small for the first target object including the first road landmark object with the rectangular outline.
Similarly, when the first target object includes the second road sign object having an irregular contour, the feature representing the contour line and/or the corner of the first target object in the second projection geometric feature and the feature representing the contour line and/or the corner determined by the second target object in the geometric feature of the second target object in the target plane are used for matching, so as to obtain the corresponding relationship between the first target object and the second target object.
Here, since the first target object includes the second road sign object having an irregular contour, the corresponding manner of matching the features characterizing the contour lines and/or the corners of the first target object in the second projection geometric features and the features characterizing the contour lines and/or the corners determined by the second target object in the geometric features of the second target object in the target plane may be similar to the corresponding manner of matching the first road sign object, and the matching may be performed by using the contour line vertices in the contour lines and/or directly using the determined corners. In this way, the accuracy of the matching can be correspondingly improved for a second road sign object that is more complex graphically than the first road sign object.
And under the condition that the first target object comprises a third road marker object of a linear type, performing maximum graph matching on the second projection geometric characteristic and the geometric characteristic of the second target object in the target plane to obtain the corresponding relation between the first target object and the second target object.
Specifically, for the line-type third road sign object, in the process of matching the first target object and the second target object using the maximum graph matching algorithm, a plurality of candidate matching pairs may first be constructed in an arbitrary matching manner; each candidate matching pair includes one first target object and one second target object, each first target object is included in only one candidate matching pair, and each second target object is included in only one candidate matching pair. The line distance between the first target object and the second target object in each candidate matching pair may be calculated, and the average value of the distances over the candidate matching pairs determined; matching configurations whose average value is too large are removed during matching, thereby obtaining the matching relationship between the first target object and the second target object. In this way, the matching process can be completed even when the number of first target objects to be matched differs from the number of second target objects, yielding a matching result between the smaller set of first target objects and the larger set of second target objects. Line-type third road sign objects missed in the target image are thus identified automatically through maximum graph matching, and a more accurate matching result can be obtained.
Here, since the first target object comprises a line-type third road marker object, it cannot, after projection, show information on all of the points it comprises, in particular all of the points characterizing its actual position; the above-described way of matching the first road marker object and/or the second road marker object by means of at least one of vertices, contour lines, and corner points is therefore not applicable. Through maximum graph matching, the second projection geometric feature can be matched directly against the geometric feature of the second target object in the target plane, without having to acquire the third road marker object again, thereby ensuring positioning efficiency.
Exemplarily, when matching the first target object and the second target object, because the initial pose information of the target image is inaccurate and second target objects on a road characteristically appear in continuous succession, a correspondence obtained with nearest-neighbor matching is not accurate; missed-detection lines easily appear during matching, making the determined correspondence between the first target object and the second target object inaccurate. Therefore, an optimal matching algorithm (Kuhn-Munkres, KM) may, for example, be selected to match the first target object and the second target object: the reciprocal of the distance between two target line segments is used as the weight value of the candidate pair formed by those line segments, and candidate matches whose line-segment distance is too far above the average are removed, so as to obtain the correspondence between the first target object and the second target object.
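A minimal sketch of this optimal matching step, using SciPy's Hungarian solver (an implementation of the Kuhn-Munkres algorithm); the rejection factor applied to over-large distances is an assumption for illustration:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def km_match_lines(dist, reject_factor=2.0):
    """dist[i, j]: distance between the i-th target line segment (first
    target object) and the j-th projected line segment (second target
    object). Selects the assignment maximizing total reciprocal distance,
    then removes matches whose distance is far above the average."""
    dist = np.asarray(dist, dtype=float)
    weight = 1.0 / (dist + 1e-6)                      # reciprocal-distance weights
    rows, cols = linear_sum_assignment(weight, maximize=True)
    mean_d = dist[rows, cols].mean()
    return [(i, j) for i, j in zip(rows, cols)
            if dist[i, j] <= reject_factor * mean_d]  # drop over-large matches
```

Because linear_sum_assignment accepts a rectangular weight matrix, this also covers the case where the numbers of first and second target objects differ, as described above.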
Referring to fig. 10, a schematic diagram for determining the correspondence between a first target object and a second target object is provided in an embodiment of the present disclosure, wherein 11 represents a first target object, 12 represents a second target object, 13 represents a first target object for which line matching by the maximum graph matching algorithm found no matching second target object, that is, a line missed in detection in the target image, and 14 represents a group of corresponding first and second target objects. When such a missed-detection line exists, it may, for example, simply be left unprocessed.
After the corresponding relationship between the first target object and the second target object is determined, the first target object and the second target object may be subjected to position matching based on the three-dimensional geometric feature of the second target object in the target scene in the scene coordinate system corresponding to the target scene and the first two-dimensional geometric feature of the first target object.
When position matching is performed on the first target object and the second target object, with the correspondence between them determined, the three-dimensional geometric feature of the second target object in the scene coordinate system is position-matched against the first two-dimensional geometric feature of the first target object. The matching loss caused by the deviation between the initial pose information and the actual pose information can thus be determined, and the target pose information of the acquisition device that acquires the target image is determined based on the determined matching loss.
Specifically, the second target object is projected into the image coordinate system based on the initial pose information of the acquisition device that acquires the target image and the three-dimensional geometric feature of the second target object in the scene coordinate system corresponding to the target scene, so as to obtain a third projection geometric feature of the second target object in the image coordinate system. Position matching is then performed on the first target object and the second target object having the correspondence, based on the third projection geometric feature of the second target object in the image coordinate system, the first two-dimensional geometric feature of the first target object, and the correspondence between the first target object and the second target object.
In a specific implementation, the method for determining the third projection geometric feature of the second target object in the image coordinate system is similar to the above-described manner of determining the first projection geometric feature, and is not repeated herein. In the case that the second target object comprises a target object of which at least one end of the contour is not shown in the target image, the determined third projection geometric feature may be denoted, for example, as π(L_i, T_0), where π is a projection function that projects the three-dimensional geometric feature L_i of the second target object into the image coordinate system according to the initial pose information T_0. In the case that the second target object comprises at least one of a target object whose contour is shown with at least one rectangle or a target object shown with an irregular figure, the determined third projection geometric feature may be denoted, for example, as π(P_j, T_0), where the projection function π projects the three-dimensional geometric feature P_j of the second target object into the image coordinate system according to the initial pose information T_0.
At this time, for example, the projection vertex or projection target line segment of the second target object corresponding to the vertex or target line segment of the first target object may also be determined. The method for determining the projection vertex or projection target line segment is similar to the method for determining the vertex or target line segment of the first target object, and is not described herein again.
In the case that the first target object comprises a target object of which at least one end of the contour is not shown in the target image, the first two-dimensional geometric feature of the first target object may be position-matched, according to the correspondence between the first target object and the second target object, against the third projection geometric feature of the corresponding second target object in the image coordinate system, thereby completing the position matching between the first target object and the second target object.
At this time, since the first two-dimensional geometric feature of the first target object is determined by the endpoint coordinate values of the target line segment of the first target object, only the endpoints need to be position-matched when determining the correspondence; the amount of computation is smaller and the position matching is more efficient.
In the case that the first target object includes a second road sign object having an irregular contour, a specific method provided by the embodiments of the present disclosure for position matching the first target object and the second target object includes: performing interpolation processing on the first two-dimensional geometric feature of the first target object to obtain a second two-dimensional geometric feature of the first target object, wherein the second two-dimensional geometric feature comprises coordinate values of a plurality of vertices in the target image and coordinate values of a plurality of interpolation points in the target image; and performing point-to-point position matching on the first target object and the second target object based on the second two-dimensional geometric feature, the three-dimensional geometric feature, and the correspondence between the first target object and the second target object.
In a specific implementation, in the case that the first target object includes a target object whose contour is shown with at least one rectangle, the first two-dimensional geometric feature of the first target object may be obtained based on the coordinate values of only two or four vertices of the first target object, so the vertices may be few. By performing interpolation between the vertices, the determined interpolation points fill in the otherwise sparse contour of the first target object, making the number of points along the two coordinate-axis directions of the image coordinate system less unbalanced; this balances the weights between different semantics and also alleviates the poor matching that occurs when position matching is performed with only a few vertices.
When performing interpolation processing on the vertices, for example, at least one of the following methods may be used: Taylor interpolation, Lagrange interpolation, Newton interpolation, and Hermite interpolation. The specific interpolation method may be selected according to the actual situation and is not described herein again.
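As an illustrative sketch, the following densifies a closed contour by simple linear interpolation between adjacent vertices; linear interpolation and the number of samples per edge are assumptions introduced here, and any of the interpolation schemes listed above could be substituted:

```python
import numpy as np

def densify_contour(vertices, samples_per_edge=10):
    """Insert interpolation points between adjacent contour vertices so the
    second two-dimensional geometric feature contains both the original
    vertices and the interpolated points (contour assumed closed)."""
    pts = []
    n = len(vertices)
    for k in range(n):
        a = np.asarray(vertices[k], dtype=float)
        b = np.asarray(vertices[(k + 1) % n], dtype=float)
        t = np.linspace(0.0, 1.0, samples_per_edge, endpoint=False)
        pts.append(a + t[:, None] * (b - a))   # linear interpolation along edge
    return np.vstack(pts)                      # (n * samples_per_edge, 2)
```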
With respect to S104 described above, when determining the target pose information of the capturing device that captures the target image based on the position matching result, for example, a position matching error may be determined based on the result of the position matching, and the target pose information of the capturing device that captures the target image may be determined based on the position matching error and the initial pose information of the capturing device that captures the target image.
In a specific implementation, after the third projection geometric features π(L_i, T_0) and π(P_j, T_0) are determined, the corresponding position matching errors may be determined. The position matching error corresponding to the third projection geometric feature π(L_i, T_0) may be expressed, for example, as D_l(π(L_i, T_0), l_i), where D_l represents a residual term for the distance from the endpoints of the projection target line segment to the center line of the target line segment. The position matching error corresponding to the third projection geometric feature π(P_j, T_0) may be expressed, for example, as D_p(π(P_j, T_0), p_j), where D_p represents the reprojection error between the vertex and the projection vertex.
At this time, since there may be a plurality of first target objects, the following formula (2) may be adopted, for example, when determining the position matching error:

error = Σ_{j=1}^{Q} D_p(π(P_j, T_0), p_j) + Σ_{i=1}^{P} D_l(π(L_i, T_0), l_i)    (2)

wherein Q represents the total number of target objects, included in the first target object, whose contours are shown with at least one rectangle or with an irregular figure; P represents the total number of target objects, included in the first target object, of which at least one end of the contour is not shown in the target image; and error represents the determined matching loss.
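Assuming the projections π(P_j, T_0) and π(L_i, T_0) have already been computed, formula (2) may be evaluated, for example, as in the following sketch, where the Euclidean vertex distance stands in for D_p and the point-to-line distance stands in for D_l (the function names are illustrative):

```python
import numpy as np

def point_to_line_dist(p, a, b):
    """Distance from 2D point p to the infinite line through a and b."""
    d, v = b - a, p - a
    return abs(d[0] * v[1] - d[1] * v[0]) / (np.linalg.norm(d) + 1e-12)

def matching_loss(proj_vertices, vertices, proj_endpoints, segments):
    """error = sum_j D_p(pi(P_j, T0), p_j) + sum_i D_l(pi(L_i, T0), l_i)."""
    # D_p: reprojection error between each vertex and its projection vertex.
    e_p = sum(np.linalg.norm(pv - v) for pv, v in zip(proj_vertices, vertices))
    # D_l: distance from each projected endpoint to the target segment's line.
    e_l = sum(point_to_line_dist(e, seg[0], seg[1])
              for ends, seg in zip(proj_endpoints, segments) for e in ends)
    return e_p + e_l
```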
Here, the smaller the matching loss error, the more accurate the pose information of the acquisition device that acquires the target image; the larger the matching loss error, the larger the difference between the pose information of the acquisition device and the actual pose information, in which case the pose information needs to be further refined so as to reduce the matching loss error.
In the case of determining the matching loss, when determining the target pose information of the capturing device that captures the target image based on the matching loss, for example, the following method may be employed: detecting whether a preset iteration stop condition is met or not; determining initial pose information obtained by the last iteration as target pose information under the condition of meeting the iteration stop condition; and under the condition that the iteration stop condition is not met, determining new initial pose information based on the position matching error and the initial pose information in the last iteration process, and returning to the step of carrying out position matching on the first target object and the second target object based on the three-dimensional geometric characteristics of the second target object in the target scene in a scene coordinate system corresponding to the target scene, the first two-dimensional geometric characteristics of the first target object and the corresponding relation between the first target object and the second target object.
Wherein the iteration stop condition comprises at least one of: the number of iterations is greater than a preset iteration number threshold; the position matching error between the first target object and the second target object is smaller than a preset loss threshold. In the case that the selected iteration stop condition is that the number of iterations is greater than the preset iteration number threshold, the threshold may be determined based on experience, for example 6 or 8 iterations, so that the matching loss is small after a sufficient number of iterations. In the case that the selected iteration stop condition is that the position matching error between the first target object and the second target object is smaller than the preset loss threshold, a small loss threshold may be set so that the obtained target pose information has a higher confidence. The specific choice of iteration stop condition may be determined according to the actual situation and is not described herein again.
In the case that the iteration stop condition is not satisfied, the direction of iteration is determined, based on the currently determined position matching error, as the direction in which the position matching error decreases; the initial pose information determined in the latest iteration process is taken as the new initial pose information, the procedure returns to the step of position matching the first target object and the second target object, and the matching loss is determined again according to the new initial pose information, until the position matching error satisfies the iteration stop condition. In the case that the iteration stop condition is satisfied, the initial pose information at that time may be determined as the target pose information, which may be represented, for example, as T_aim.
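The iteration described above may be sketched as follows; the numerical-gradient pose update is an illustrative stand-in for whatever optimizer is actually used, and loss_fn is assumed to evaluate formula (2) for a given pose vector (e.g., 6-DoF):

```python
import numpy as np

def refine_pose(T0, loss_fn, lr=1e-3, max_iters=8, loss_thresh=1.0, eps=1e-5):
    """Iterate from the initial pose T0 until an iteration stop condition
    holds: the iteration count exceeds max_iters, or the position matching
    error of formula (2) drops below loss_thresh."""
    T = np.asarray(T0, dtype=float)
    for _ in range(max_iters):
        err = loss_fn(T)
        if err < loss_thresh:
            break                               # stop condition: small error
        grad = np.zeros_like(T)
        for k in range(T.size):                 # numerical gradient of the loss
            dT = np.zeros_like(T)
            dT[k] = eps
            grad[k] = (loss_fn(T + dT) - err) / eps
        T = T - lr * grad                       # step toward smaller error
    return T                                    # target pose information T_aim
```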
At this time, the target pose information T_aim, that is, the target pose information T_aim of the target image, can be determined.
Based on the same inventive concept, the embodiment of the disclosure also provides a driving control method of the intelligent driving device.
Referring to fig. 11, which is a flowchart of a driving control method for an intelligent driving device according to an embodiment of the present disclosure, the driving control method of the intelligent driving device includes steps S1101 to S1103, wherein:
S1101: acquiring video frame data acquired by the intelligent driving device during driving;
S1102: processing the video frame data by using the positioning method provided by the embodiments of the present disclosure, and detecting a target object in the video frame data;
S1103: controlling the intelligent driving device based on the detected target object.
In a specific implementation, the intelligent driving device is, for example, but not limited to, any one of the following: an autonomous vehicle, a vehicle equipped with an Advanced Driver Assistance System (ADAS), a robot, or the like.
Controlling the intelligent driving device includes, for example, controlling it to accelerate, decelerate, steer, or brake; alternatively, voice prompts may be played to prompt the driver to control the vehicle to accelerate, decelerate, steer, or brake.
The positioning method provided by the embodiments of the present disclosure can determine the target pose information more efficiently, and is therefore easier to deploy in an intelligent driving device, improving safety during automatic driving control and better meeting the requirements of the autonomous driving field.
It will be understood by those skilled in the art that, in the above methods of the specific embodiments, the order in which the steps are written does not imply a strict execution order or impose any limitation on the implementation; the specific execution order of the steps should be determined by their functions and possible internal logic.
Based on the same inventive concept, a positioning apparatus corresponding to the positioning method is also provided in the embodiments of the present disclosure, and since the principle of solving the problem of the apparatus in the embodiments of the present disclosure is similar to the positioning method described above in the embodiments of the present disclosure, the implementation of the apparatus may refer to the implementation of the method, and the repeated parts are not described again.
Referring to fig. 12, a schematic diagram of a positioning apparatus provided in an embodiment of the present disclosure is shown, the apparatus includes: a first obtaining module 121, a first determining module 122, a matching module 123, and a second determining module 124; wherein,
a first obtaining module 121, configured to obtain a target image obtained by collecting a target scene; a first determining module 122, configured to determine, based on the target image, a first two-dimensional geometric feature of a first target object included in the target image; a matching module 123, configured to perform position matching on a first target object and a second target object in the target scene based on a three-dimensional geometric feature of the second target object in a scene coordinate system corresponding to the target scene, a first two-dimensional geometric feature of the first target object, and a corresponding relationship between the first target object and the second target object; and a second determining module 124, configured to determine, based on a result of the position matching, target pose information of an acquisition device acquiring the target image.
In an alternative embodiment, in case the first target object comprises a first road marker object having a rectangular contour, the first two-dimensional geometric feature of the first target object comprises: a vertex of the first target object; in case the first target object comprises a second road-marking object having an irregular contour, the first two-dimensional geometrical feature of the first target object comprises: contour lines and/or corner points of the first target object; in case the first target object comprises a third road marker object of a line type, the first two-dimensional geometrical feature of the first target object comprises: and the target line segment belongs to the image area where the first target object is located and is positioned on the central line of the first target object.
In an alternative embodiment, the first determining module 122, when determining the first two-dimensional geometric feature of the first target object included in the target image based on the target image, is configured to: performing semantic segmentation processing on the target image, and determining semantic segmentation results corresponding to a plurality of pixel points in the target image respectively; and determining a first two-dimensional geometric feature of the first target object in the target image based on semantic segmentation results corresponding to a plurality of pixel points respectively and positions of the pixel points in the target image respectively.
In an optional embodiment, in a case that the first target object includes a first road landmark object having a rectangular contour, the first determining module 122, when determining the first two-dimensional geometric feature of the first target object in the target image based on the semantic segmentation result corresponding to each of a plurality of pixel points and the position of each of the plurality of pixel points in the target image, is configured to: determining pixel points belonging to the contour of the first target object from the target image based on the semantic segmentation result; fitting to obtain a corresponding bounding box of the first target object in the target image based on pixel points belonging to the contour of the first target object; determining a first two-dimensional geometric feature of the first target object in the target image based on vertices of the bounding box.
In an optional embodiment, in a case that the first target object includes a second road sign object with an irregular contour, the first determining module 122, when determining the first two-dimensional geometric feature of the first target object in the target image based on the semantic segmentation result corresponding to each of a plurality of pixel points and the position of each of the plurality of pixel points in the target image, is configured to: determining pixel points belonging to the contour of the first target object from the target image based on the semantic segmentation result; obtaining the contour line of the first target object based on the position of the pixel point belonging to the contour of the first target object in the target image; determining a first two-dimensional geometric feature of the first target object in the target image based on a contour line of the first target object.
In an optional embodiment, in a case that the first target object includes a third road marker object of a line type, the first determining module 122, when determining the first two-dimensional geometric feature of the first target object in the target image based on the semantic segmentation results corresponding to a plurality of pixels, respectively, and the positions of the plurality of pixels, respectively, in the target image, is configured to: based on the semantic segmentation result, fitting to obtain a central line of the first target object; determining a target line segment which belongs to the image area where the first target object is located and is located on the central line based on a two-dimensional coordinate value of a pixel point which is located on the central line and belongs to the image area where the first target object is located in the target image; obtaining a first two-dimensional geometric feature of the first target object based on the target line segment.
In an optional embodiment, the apparatus further includes a generating module 125, configured to: generate the correspondence between a first target object and a second target object based on a three-dimensional geometric feature of the second target object in the target scene and a first two-dimensional geometric feature of the first target object.
In an optional embodiment, the generating module 125, when generating the corresponding relationship between the first target object and the second target object based on a three-dimensional geometric feature of a second target object in the target scene and a first two-dimensional geometric feature of the first target object, is configured to: based on the initial pose information of the acquisition equipment for acquiring the target image and the three-dimensional geometrical characteristics of the second target object in the target scene, projecting the second target object into an image coordinate system of the target image to obtain a first projection geometrical characteristic of the second target object in the image coordinate system; and matching the first target object and the second target object based on the first two-dimensional geometric feature of the first target object in the image coordinate system and the first projection geometric feature of the second target object in the image coordinate system to obtain the corresponding relation between the first target object and the second target object.
In an optional embodiment, the generating module 125, when generating the corresponding relationship between the first target object and the second target object based on a three-dimensional geometric feature of a second target object in the target scene and a first two-dimensional geometric feature of the first target object, is configured to: based on a homography matrix between target planes of the target image and the second target object, projecting a first target object in the target image to the target plane to obtain a second projection geometric feature of the first target object in the target plane; matching the first target object and the second target object based on a second projection geometric feature of the first target object in the target plane and a geometric feature of the second target object in the target plane to obtain a corresponding relation between the first target object and the second target object; wherein the geometric feature of the second target object in the target plane is determined based on the three-dimensional geometric feature of the second target object in the scene coordinate system.
In an optional embodiment, when matching the first target object and the second target object based on the second projection geometric feature of the first target object in the target plane and the geometric feature of the second target object in the target plane to obtain a corresponding relationship between the first target object and the second target object, the generating module 125 is configured to: in the case that the first target object comprises a first road sign object having a rectangular contour, match the features characterizing the vertices of the first target object in the second projection geometric feature against the features characterizing the vertices of the second target object in the geometric feature of the second target object in the target plane, to obtain the correspondence between the first target object and the second target object; in the case that the first target object comprises a second road sign object having an irregular contour, match the features characterizing the contour lines and/or corner points of the first target object in the second projection geometric feature against the features characterizing the contour lines and/or corner points of the second target object in the geometric feature of the second target object in the target plane, to obtain the correspondence between the first target object and the second target object; and in the case that the first target object comprises a line-type third road marker object, perform maximum graph matching on the second projection geometric feature and the geometric feature of the second target object in the target plane, to obtain the correspondence between the first target object and the second target object.
In an alternative embodiment, the second determining module 124, when determining the target pose information of the capturing device capturing the target image based on the result of the position matching, is configured to: determining a position matching error based on a result of the position matching; and determining the target pose information of the acquisition equipment for acquiring the target image based on the position matching error and the initial pose information of the acquisition equipment for acquiring the target image.
In an optional implementation, the second determining module 124, when determining the target pose information of the capturing device capturing the target image based on the position matching error and the initial pose information of the capturing device capturing the target image, is configured to: detecting whether a preset iteration stop condition is met or not; determining the initial pose information obtained by the last iteration as the target pose information under the condition of meeting the iteration stop condition; and under the condition that the iteration stop condition is not met, determining new initial pose information based on the position matching error and initial pose information in the last iteration process, and returning to the step of carrying out position matching on the first target object and the second target object based on the three-dimensional geometric feature of the second target object in the target scene in a scene coordinate system corresponding to the target scene, the first two-dimensional geometric feature of the first target object and the corresponding relation between the first target object and the second target object.
In an alternative embodiment, the iteration stop condition includes any one of: the iteration times are greater than a preset iteration time threshold; the position matching error of the first target object and the second target object is smaller than a preset loss threshold.
In an optional embodiment, the matching module 123, when performing position matching on a first target object and a second target object in the target scene based on a three-dimensional geometric feature of the second target object in a scene coordinate system corresponding to the target scene, a first two-dimensional geometric feature of the first target object, and a corresponding relationship between the first target object and the second target object, is configured to: under the condition that the first target object comprises a second road sign object with an irregular contour, carrying out interpolation processing on the first two-dimensional geometrical characteristic of the first target object to obtain a second two-dimensional geometrical characteristic of the first target object; wherein the second two-dimensional geometric feature comprises: a plurality of vertices and a plurality of interpolation points; and performing point-to-point position matching on the first target object and the second target object based on the second two-dimensional geometric feature, the three-dimensional geometric feature and the corresponding relation between the first target object and the second target object.
In an optional embodiment, the matching module 123, when performing position matching on a first target object and a second target object in the target scene based on a three-dimensional geometric feature of the second target object in a scene coordinate system corresponding to the target scene, a first two-dimensional geometric feature of the first target object, and a corresponding relationship between the first target object and the second target object, is configured to: based on initial pose information of acquisition equipment for acquiring the target image and three-dimensional geometric features of a second target object in the target scene in a scene coordinate system corresponding to the target scene, projecting the second target object into an image coordinate system of the target image to obtain third projection geometric features of the second target object in the image coordinate system; and performing position matching on the first target object and the second target object which have the corresponding relation based on the third projection geometrical feature of the second target object in the image coordinate system and the first two-dimensional geometrical feature of the first target object.
The description of the processing flow of each module in the positioning apparatus and the interaction flow between each module may refer to the relevant description in the above embodiment of the positioning method, and is not described in detail here.
Based on the same inventive concept, an embodiment of the present disclosure further provides a driving control device of an intelligent driving device, corresponding to the above driving control method of the intelligent driving device; since the principle by which the device solves the problem in the embodiments of the present disclosure is similar to that of the method described above, the implementation of the device may refer to the implementation of the method, and repeated details are not repeated.
Referring to fig. 13, a schematic diagram of a driving control device of an intelligent driving device according to an embodiment of the present disclosure is shown, where the device includes: a second acquisition module 131, a detection module 132, and a control module 133; wherein,
the second obtaining module 131 is configured to obtain video frame data acquired by the intelligent driving device in the driving process;
a detection module 132, configured to process the video frame data by using any one of the positioning methods provided based on the embodiments of the present disclosure, and detect a target object in the video frame data;
and a control module 133 for controlling the intelligent driving apparatus based on the detected target object.
For the description of the processing flow of each module in the driving control device of the intelligent driving device and the interaction flow between the modules, reference may be made to the related description in the above embodiment of the driving control method of the intelligent driving device, which will not be described in detail here.
An embodiment of the present disclosure further provides a computer device, as shown in fig. 14, which is a schematic structural diagram of the computer device provided in the embodiment of the present disclosure, and includes:
a processor 141 and a memory 142; the memory 142 stores machine-readable instructions executable by the processor 141, the processor 141 is configured to execute the machine-readable instructions stored in the memory 142, when the machine-readable instructions are executed by the processor 141, the processor 141 performs the following steps:
acquiring a target image acquired by acquiring a target scene; determining, based on the target image, a first two-dimensional geometric feature of a first target object included in the target image; performing position matching on a first target object and a second target object in the target scene based on a three-dimensional geometric feature of the second target object in a scene coordinate system corresponding to the target scene, a first two-dimensional geometric feature of the first target object and a corresponding relation between the first target object and the second target object; and determining target pose information of acquisition equipment for acquiring the target image based on the position matching result.
Alternatively, processor 141 performs the following steps:
acquiring video frame data acquired by an intelligent driving device in the driving process; processing the video frame data by using any one of the positioning methods provided based on the embodiments of the present disclosure, and detecting a target object in the video frame data; and controlling the intelligent driving device based on the detected target object.
The memory 142 includes a memory 1421 and an external memory 1422; the memory 1421 is also referred to as an internal memory, and temporarily stores operation data in the processor 141 and data exchanged with the external memory 1422 such as a hard disk, and the processor 141 exchanges data with the external memory 1422 via the memory 1421.
For the specific execution process of the instruction, reference may be made to the positioning method described in the embodiment of the present disclosure or the steps of the driving control method of the intelligent driving device, which are not described herein again.
The disclosed embodiments also provide a computer-readable storage medium having a computer program stored thereon, where the computer program is executed by a processor to perform the positioning method described in the above method embodiments or the steps of the driving control method of the intelligent driving apparatus. The storage medium may be a volatile or non-volatile computer-readable storage medium.
The embodiments of the present disclosure further provide a computer program product, where the computer program product bears a program code, and instructions included in the program code may be used to execute the positioning method described in the foregoing method embodiments or steps of the driving control method of the intelligent driving apparatus, which may be specifically referred to the foregoing method embodiments and are not described herein again.
The computer program product may be implemented by hardware, software or a combination thereof. In an alternative embodiment, the computer program product is embodied in a computer storage medium, and in another alternative embodiment, the computer program product is embodied in a Software product, such as a Software Development Kit (SDK), or the like.
It can be clearly understood by those skilled in the art that, for convenience and simplicity of description, the specific working process of the system and the apparatus described above may refer to the corresponding process in the foregoing method embodiment, and details are not described herein again. In the several embodiments provided in the present disclosure, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present disclosure. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that: the above-mentioned embodiments are merely specific embodiments of the present disclosure, which are used to illustrate the technical solutions of the present disclosure, but not to limit the technical solutions, and the scope of the present disclosure is not limited thereto, and although the present disclosure is described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that: any person skilled in the art can modify or easily conceive of the technical solutions described in the foregoing embodiments or equivalent technical features thereof within the technical scope of the present disclosure; such modifications, changes and substitutions do not depart from the spirit and scope of the embodiments disclosed herein, and they should be construed as being included therein. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.