CN110530371B - Indoor map matching method based on deep reinforcement learning - Google Patents
- Publication number
- CN110530371B (application CN201910840334.2A)
- Authority
- CN
- China
- Prior art keywords
- coordinates
- corrected
- map
- network
- reinforcement learning
- Prior art date
- Legal status
- Active
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01C—MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
- G01C21/00—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
- G01C21/005—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 with correlation of navigation data from several sources, e.g. map or contour matching
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01C—MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
- G01C21/00—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
- G01C21/10—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration
- G01C21/12—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning
- G01C21/16—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation
- G01C21/165—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation combined with non-inertial navigation instruments
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01C—MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
- G01C21/00—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
- G01C21/20—Instruments for performing navigational calculations
- G01C21/206—Instruments for performing navigational calculations specially adapted for indoor navigation
Landscapes
- Engineering & Computer Science (AREA)
- Radar, Positioning & Navigation (AREA)
- Remote Sensing (AREA)
- Automation & Control Theory (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Navigation (AREA)
Abstract
The invention discloses an indoor map matching method based on deep reinforcement learning, which comprises the following steps: S1, acquiring data from the pedestrian inertial navigation module and preprocessing the data to obtain pixel coordinates related to a map; S2, constructing a local map generation module according to the pixel coordinates obtained in step S1; S3, defining the corrected coordinate generated once the correction code for the current state is obtained; S4, jointly representing the pixel coordinate information to be corrected and the local map as the state of the current position; S5, designing a reward mechanism according to the consistency of the corrected single-point coordinate with the label coordinate and the similarity of the corrected track with the standard path; S6, constructing a double-network model of a target value network and a current value network, and taking the MSE between the target value network output and the current value network output as the loss function; and S7, outputting the positioning coordinates corrected by the reinforcement learning model.
Description
Technical Field
The invention belongs to the technical field of indoor positioning, and particularly relates to an indoor map matching method based on deep reinforcement learning.
Background
In the age of the rapid development of Internet of Things technology, most applications are associated to some degree with location services. For moving objects the need for positioning is even more evident, and positioning technology has therefore received wide attention. However, the precision and the cost of a positioning technique have always been in tension: if the cost is too high, most Internet of Things applications are priced out; if a low-cost scheme is adopted, the positioning accuracy is unsatisfactory. From the perspective of market demand, the higher the positioning accuracy the better, so all positioning technologies keep improving in accuracy while costs gradually fall with industrial scale, and a "high-accuracy, low-cost" positioning solution is undoubtedly the trend of the future market. At present GNSS positioning is widely available, but a major drawback of GNSS is that it cannot cover indoor environments; in fact, about 80% of people's daily activities take place indoors, so the importance of indoor positioning technology is self-evident.
Conventional indoor positioning methods can be divided into deployment-dependent and deployment-independent techniques. Deployment-dependent techniques include Wi-Fi positioning, Bluetooth positioning, UWB positioning, RFID positioning, and the like; deployment-independent techniques include geomagnetic positioning, inertial sensor positioning, and the like. Unlike Wi-Fi positioning and similar technologies, inertial sensor positioning requires no advance deployment and therefore suits more demanding scenarios such as counter-terrorism and rescue; at the same time, inertial sensors are inexpensive, which favours large-scale adoption. How to realize indoor positioning relying on inertial sensors thus becomes an urgent problem to be solved.
A literature search shows that the paper "Fusion indoor positioning based on particle filtering and map matching" (Zhouyi, Luhang, Lushuai, et al., Journal of University of Electronic Science and Technology of China, 2018, v.47(03): 97-102) provides a fused indoor positioning method based on particle filtering and map matching. The technique combines WiFi fingerprint positioning and pedestrian dead reckoning (PDR) through particle filtering and corrects the positioning result by matching it against an indoor map. WiFi positioning is realized with a two-stage scheme combining SVC and SVR; the PDR obtains the user's step count, step length and heading from the accelerometer and magnetometer, which are used to model user behaviour in the particle filter; finally, the information of the two parts is fused with the map information to obtain the final position. However, the technique relies on WiFi information, so the positioning area must be deployed in advance and a corresponding WiFi fingerprint map must be constructed, and the resulting tracks still cross through walls. The reason is that existing map matching methods such as particle filtering focus on optimizing the local track and do not optimize the track from a global perspective.
Disclosure of Invention
The present invention aims to provide an indoor map matching method based on deep reinforcement learning, so as to solve or alleviate the above-mentioned problems.
In order to achieve the purpose, the invention adopts the technical scheme that:
an indoor map matching method based on deep reinforcement learning, comprising:
s1, acquiring data of the pedestrian inertial navigation module and preprocessing the data to obtain pixel coordinates related to a map;
s2, constructing a local map generation module according to the pixel coordinates obtained in the step S1;
s3, defining the corrected coordinate generated once the correction code for the current state is obtained;
s4, jointly representing the pixel coordinate information to be corrected and the local map as the state of the current position;
s5, designing a reward mechanism according to the consistency of the corrected coordinates of the single point and the label coordinates and the similarity of the corrected track and the standard path;
s6, constructing a double-network model of a target value network and a current value network, and taking the MSE between the target value network output and the current value network output as the loss function;
and S7, outputting the positioning coordinates corrected by the reinforcement learning model.
Preferably, in step S1, the relative geodetic location coordinates during pedestrian travel are collected, and the geodetic location coordinates are subjected to coordinate conversion to generate pixel coordinates related to the map.
Preferably, the map is cut according to the pixel coordinates generated in step S1, and a local map related to the pixel coordinates is generated.
Preferably, the reward mechanism of step S5 is: comprehensively considering the consistency of the corrected single-point coordinate with the label coordinate and the similarity of the corrected track with the standard path, and returning a quantitative value.
Preferably, the reward is graded according to the design of the action space and the Euclidean distance from the truth data: if the number output by the model is correct, the reward is 1; otherwise the reward decays level by level, each level being 0.75 times the previous value.
Preferably, in step S6 the current value network quantizes the states, actions and reward values of the above steps into corresponding Q values through a value-iteration network based on the Bellman equation; the target value network and the current value network have the same network structure, except that the network parameters need to be copied over at fixed time steps.
The indoor map matching method based on deep reinforcement learning provided by the invention has the following beneficial effects:
according to the method, a deep reinforcement learning model is designed and built according to inertial navigation data and map data, data fusion of map information and inertial navigation track information is completed, and map matching is achieved.
In addition, the method abandons traditional image processing techniques and extracts local map features with a neural network, which greatly improves computation speed. Secondly, in view of the wall-crossing problem of map-matched tracks that conventional techniques cannot solve, a method that fuses map and track information through reinforcement learning is proposed for the first time to eliminate wall crossing and complete map matching. Finally, once the deep reinforcement learning model has been fully trained, the saved model can be used directly to complete the track correction task.
In conclusion, the method has the advantages of strong global optimization capability, low technical complexity, strong generalization and the like, and is particularly suitable for severe environments in which the positioning device cannot be deployed in advance.
Drawings
Fig. 1 is a schematic diagram of an overall network according to an embodiment.
Fig. 2 is a partial data presentation diagram of an embodiment.
Fig. 3 is a partial map generation result diagram of the embodiment.
Fig. 4 is a diagram illustrating an operation space control mode according to the embodiment.
Fig. 5 is a diagram of a state data structure of an embodiment.
FIG. 6 is a diagram illustrating a status feature correction process according to an embodiment.
Fig. 7 is an illustration of a reward mechanism according to an embodiment.
Fig. 8 is a flowchart of the dual network model structure according to the embodiment.
FIG. 9 is a diagram showing the comparative effect of the coarse positioning trace and the true trace according to the embodiment.
FIG. 10 is a graph showing the comparison between the rough positioning trajectory and the reinforcement learning correction trajectory according to the embodiment.
Detailed Description
The following description of the embodiments of the present invention is provided to facilitate understanding of the invention by those skilled in the art. It should be understood, however, that the invention is not limited to the scope of the embodiments; to those skilled in the art, various changes within the spirit and scope of the invention as defined in the appended claims will be apparent, and everything produced using the inventive concept is protected.
According to one embodiment of the application, the indoor map matching method based on the deep reinforcement learning of the scheme comprises the following steps:
s1, acquiring data of the pedestrian inertial navigation module and preprocessing the data to obtain pixel coordinates related to a map;
s2, constructing a local map generation module according to the pixel coordinates obtained in the step S1;
s3, defining the corrected coordinate generated once the correction code for the current state is obtained;
s4, jointly representing the pixel coordinate information to be corrected and the local map as the state of the current position;
s5, designing a reward mechanism according to the consistency of the corrected coordinates of the single point and the label coordinates and the similarity of the corrected track and the standard path;
s6, constructing a double-network model of a target value network and a current value network, and taking the MSE between the target value network output and the current value network output as the loss function;
and S7, outputting the positioning coordinates corrected by the reinforcement learning model.
Referring to FIG. 1, in one embodiment of the present application the environment generates an initial state s_t, the current value network feeds back an output action, and the two interact with the environment continuously. Taking one moment as an example: the interaction generates the corresponding current state s_t, current action a_t and immediate reward r_t, and transitions to the state s_{t+1} at the next moment; these are recorded as the quadruple (s_t, a_t, r_t, s_{t+1}) and saved to the replay memory unit. After a certain number of time steps, (s_t, a_t) and s_{t+1} are drawn at random from the replay memory unit and fed to the current value network and the target value network respectively; combined with the corresponding r_t, they constitute the final loss function, against which the network parameters are optimized.
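As an illustration of the bookkeeping just described, the following Python sketch shows a minimal replay memory that stores the quadruples (s_t, a_t, r_t, s_{t+1}) and samples random minibatches; the class name, capacity and batch handling are illustrative assumptions, not taken from the patent.

```python
import random
from collections import deque

class ReplayMemory:
    """Stores interaction quadruples (s_t, a_t, r_t, s_t1) and samples random minibatches."""

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)   # oldest quadruples are discarded automatically

    def push(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # random draws break the temporal correlation between consecutive steps
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states = zip(*batch)
        return list(states), list(actions), list(rewards), list(next_states)

    def __len__(self):
        return len(self.buffer)
```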
The coordinate to be corrected is recorded as ins_ori, the coordinate used as a label is recorded as ins_label, the side length of the coding map used for encoding is recorded as map_len_big, the side length of the state map used to represent the state is recorded as map_len_small, and the size of the pixel group controlling the error range is recorded as pixel_group_len.
According to an embodiment of the present application, the steps S1 to S7 are described in detail below.
Step S1, acquiring data and preprocessing the inertial navigation module of the pedestrian;
Researchers wear the inertial navigation equipment and walk along corridors and rooms, and the acquired data are transmitted to a terminal for storage. The data used in the study are the relative geodetic coordinates of the walker during travel, as shown in FIG. 2, where ins denotes the inertial navigation data and label denotes the truth data. Since the scene currently considered is map matching on a flat map, only two-dimensional data, i.e. the x and y directions, are considered. In addition, because the data have been scale-converted, they are expressed in pixel units.
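As a concrete illustration of the scale conversion mentioned above, the following Python sketch maps relative coordinates (in metres) onto map pixel coordinates; the metres-per-pixel factor and the origin offset are illustrative assumptions that depend on the particular floor plan.

```python
import numpy as np

def to_pixel_coords(relative_xy_m, metres_per_pixel=0.05, origin_px=(0.0, 0.0)):
    """Convert relative positions (metres, x/y only) into map pixel coordinates.

    metres_per_pixel and origin_px are illustrative; they come from the scale and
    alignment of the particular floor-plan image (the image y axis may point
    downwards, in which case the y component would additionally be flipped).
    """
    xy = np.asarray(relative_xy_m, dtype=float)            # shape (N, 2): x, y in metres
    return xy / metres_per_pixel + np.asarray(origin_px)   # pixel units, same shape
```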
Step S2, constructing a local map generation module;
and the local map generation module is determined according to the inertial navigation position at the previous moment and the inertial navigation position at the current moment. Specifically, the original map is subjected to directional cutting by taking the pixel coordinate of the current moment as a center and taking the position included angle between the current moment and the previous moment as a direction, so that a final local map is obtained. Obviously, such a local map is not only related to the current position, but also to the position at the previous moment. The final local map reflects the location information together with the constraint information of the map.
As shown in fig. 3, the left side is the original map used in the experiment and the right side is the local cut map containing history information, centred on the pixel coordinate [552.00, 420.00], oriented along the angle towards the pixel coordinate [420.00, 300.00], with a side length of 500 pixels.
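The directional cutting described above can be sketched in Python as follows, assuming the floor plan is available as a 2-D numpy array and using scipy for the rotation; the padding value, rotation sign convention and function name are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage

def local_map(floor_plan, cur_xy, prev_xy, side=500):
    """Cut a side x side local map centred on cur_xy, oriented along the step prev_xy -> cur_xy.

    floor_plan : 2-D numpy array (floor-plan image), indexed [row, col] = [y, x].
    cur_xy, prev_xy : pixel coordinates (x, y) at the current and previous moments.
    """
    cx, cy = int(round(cur_xy[0])), int(round(cur_xy[1]))
    # walking direction of the last step, in degrees (sign convention depends on the image axes)
    heading = np.degrees(np.arctan2(cur_xy[1] - prev_xy[1], cur_xy[0] - prev_xy[0]))

    half = side // 2
    half_big = int(np.ceil(half * np.sqrt(2))) + 1    # enlarged window so the rotated crop stays inside

    # pad so windows near the map border remain valid (0 = "outside the map")
    padded = np.pad(floor_plan, half_big, mode="constant", constant_values=0)
    py, px = cy + half_big, cx + half_big             # centre position inside the padded image
    window = padded[py - half_big: py + half_big, px - half_big: px + half_big]

    # rotate so the walking direction points along a fixed reference axis, then crop the centre
    rotated = ndimage.rotate(window, heading, reshape=False, order=0, mode="constant", cval=0)
    c = half_big
    return rotated[c - half: c + half, c - half: c + half]
```

Under these assumptions, local_map(plan, (552.0, 420.0), (420.0, 300.0)) would correspond to the 500-pixel directional cut shown on the right of fig. 3.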
Step S3, designing an action space control mode;
the regression problem of the track correction is converted into a classification problem through a self-defined coding mode, and the track can be corrected by solving a proper correction code in the current state.
Taking ins_ori as the centre and map_len_big as the side length, a local map about the position can be obtained; it is called the coding map because its main function is to encode the action.
As shown in fig. 4, ins_ori is always located at the centre of the coding map, i.e. cell No. 12, and the number of the position to be corrected is calculated from the position of ins_label. For example, when ins_label lies two cells to the right of ins_ori and one cell up, the corresponding action code is 19, i.e. action_number = 19.
The algorithm is as follows:
1) Calculate the difference between ins_label and ins_ori in the x and y directions, quantised in units of one pixel group:
diff_x = round((x_label − x_ori) / pixel_group_len), diff_y = round((y_label − y_ori) / pixel_group_len)
2) Calculate the number of the centre point of the coding map. With n = map_len_big / pixel_group_len cells per side, the centre number is (n² − 1) / 2, i.e. 12 for n = 5.
3) Calculate the final action code:
action_number = (n² − 1) / 2 + diff_x + n · diff_y
When an out-of-range condition occurs, e.g. |diff_x| is greater than (n − 1) / 2, diff_x is forced to the maximum boundary value; the same applies to diff_y.
By this, the process of action numbering ends.
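The encoding and its inverse can be sketched in Python as follows; the default values map_len_big = 250 and pixel_group_len = 50 (giving a 5 x 5 coding map) and the numbering convention (centre cell 12, +1 per cell to the right, +n per cell up) are assumptions chosen to reproduce the example above, where two cells right and one cell up yields code 19.

```python
def encode_action(ins_ori, ins_label, map_len_big=250, pixel_group_len=50):
    """Encode the label position as a cell number on an n x n coding map centred on ins_ori."""
    n = map_len_big // pixel_group_len          # cells per side (5 in the example)
    half = (n - 1) // 2
    dx = round((ins_label[0] - ins_ori[0]) / pixel_group_len)
    dy = round((ins_label[1] - ins_ori[1]) / pixel_group_len)
    dx = max(-half, min(half, dx))              # clamp out-of-range offsets to the boundary
    dy = max(-half, min(half, dy))
    centre = (n * n - 1) // 2                   # 12 when n = 5
    return centre + dx + n * dy

def decode_action(action_number, map_len_big=250, pixel_group_len=50):
    """Invert encode_action: recover the (dx, dy) cell offsets from the action number."""
    n = map_len_big // pixel_group_len
    half = (n - 1) // 2
    row, col = divmod(action_number, n)
    return col - half, row - half               # (dx, dy) in cells
```

Under these assumptions, encode_action((552, 420), (652, 470)) returns 19, and decode_action(19) returns the offsets (2, 1).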
Step S4, designing a state space conversion module;
and retrieving the track coordinates to be corrected through an index rule, and correcting the track coordinates iteratively.
A data structure is designed to represent the current coordinate information and, at the same time, to store the current action information. As shown in FIG. 5, the first two columns hold two sets of indices into the given track, from which the original coordinate to be corrected, denoted ins_ori, can be retrieved; the last column holds the action code action_number. In other words, given only a coordinate to be corrected, the corrected coordinate can be computed in reverse from a correct action code. This corrected coordinate is denoted ins_reverse.
The specific correction process is shown in fig. 6: starting from the initial state (0, 0), the coordinate to be corrected is located through the data index and its components (x_ori, y_ori) are extracted; the action code 19 corresponding to the current state is obtained from the environment and decoded according to the coding rule designed in fig. 4, giving the cell offsets (diff_x, diff_y).
The corrected coordinate components are then:
x_reverse = x_ori + diff_x · pixel_group_len, y_reverse = y_ori + diff_y · pixel_group_len
After the state transition, the subsequent coordinates in the sequence to be corrected are corrected in the same way.
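Continuing the sketch above, the iterative correction of a coordinate sequence can be written as follows; the function reuses decode_action from the previous sketch, and the loop structure is an illustrative assumption.

```python
def correct_trajectory(ins_ori_seq, action_codes, map_len_big=250, pixel_group_len=50):
    """Apply one action code per point to obtain the corrected coordinates ins_reverse."""
    corrected = []
    for (x_ori, y_ori), code in zip(ins_ori_seq, action_codes):
        dx, dy = decode_action(code, map_len_big, pixel_group_len)   # cell offsets from the code
        corrected.append((x_ori + dx * pixel_group_len,              # shift by one pixel group per cell
                          y_ori + dy * pixel_group_len))
    return corrected
```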
Step S5, designing a reward mechanism;
The reward is graded according to the design of the action space and the Euclidean distance from the truth data: if the number output by the model is correct, the reward is 1; otherwise the reward decays level by level, each level being 0.75 times the previous value.
As shown in fig. 7, if the action number corresponding to the current label is 19, then, taking 19 as the centre, the farther a number is from 19 the smaller its reward value (the colour gradually deepens). Specifically, if the model output is 19, the reward is 1; if the network outputs 13, 14, 18, 23 or 24, the reward value is 0.75 (a decay of one quarter), and so on. Of course, if greater discrimination is desired, the decay can be made stronger.
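A minimal sketch of this graded reward, assuming the level of an action is its Chebyshev ring distance from the label cell on the coding grid (which reproduces the fig. 7 example, where cells 13, 14, 18, 23 and 24 all receive 0.75 when the label is 19); a Euclidean-distance grading could be substituted without changing the structure.

```python
def reward(predicted_code, label_code, n=5, decay=0.75):
    """1.0 for the correct cell; multiplied by `decay` for every ring farther from it."""
    pr, pc = divmod(predicted_code, n)
    lr, lc = divmod(label_code, n)
    ring = max(abs(pr - lr), abs(pc - lc))     # Chebyshev ring distance on the n x n coding grid
    return decay ** ring

# reward(19, 19) -> 1.0; reward(13, 19) or reward(24, 19) -> 0.75; two rings away -> 0.5625
```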
Step S6, building a double-network model;
as shown in fig. 8, the present model has two CNN networks in the design process. The network structures of the two CNN networks are completely identical, and the parameters of the CNN network 2 are duplicated to the CNN network 1 at a fixed time step.
Specifically, for the action A and the state s_t at time t, feedback from the environment yields an immediate reward R and a next state s_{t+1}. The two states pass through the networks on the two sides respectively. Because the state s_t by itself contains only a coordinate and no map information, the states on both sides first go through map clipping, i.e. the map is cut out centred on the coordinate. The output parts of network 1 and network 2 differ; they pass through Process 1 and Process 2 respectively.
Wherein:
Process 1: the action number is converted into a one-hot code, denoted A_hot; A_hot is multiplied with the last layer of the network to obtain the Q_value corresponding to that number. In other words, Process 1 selects the Q value corresponding to action A.
Process 2: the purpose of this operation is to obtain Q_target. Process 2 first takes the maximum over the output of CNN network 2 and then adds the immediate reward R to obtain the final Q_target.
The final loss function expression is obtained:
Loss = MSE(Q_target, Q_value) = (Q_target − Q_value)², with Q_value = Q₁(s_t, A) from Process 1 and Q_target = R + γ · max_a Q₂(s_{t+1}, a) from Process 2, where γ is the discount factor of the Bellman equation.
For each coordinate to be corrected, the forward pass described above is executed and a loss is obtained. As in conventional supervised learning, the network parameters are continuously optimized against this loss by back propagation.
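A minimal PyTorch sketch of the double-network update described in step S6 follows; the toy CNN architecture, the discount factor gamma and the direction of the periodic parameter copy (current network to target network, as in standard DQN) are illustrative assumptions rather than the exact networks of the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class QNet(nn.Module):
    """Toy CNN over the local-map crop; outputs one Q value per action code (25 for a 5x5 grid)."""
    def __init__(self, n_actions=25):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 8, kernel_size=5, stride=2)
        self.conv2 = nn.Conv2d(8, 16, kernel_size=5, stride=2)
        self.head = nn.Linear(16, n_actions)

    def forward(self, x):                                 # x: (batch, 1, H, W) local-map crops
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = F.adaptive_avg_pool2d(x, 1).flatten(1)        # (batch, 16)
        return self.head(x)                               # (batch, n_actions)

current_net, target_net = QNet(), QNet()
target_net.load_state_dict(current_net.state_dict())     # identical structure, copied parameters
optimizer = torch.optim.Adam(current_net.parameters(), lr=1e-3)

def update(states, actions, rewards, next_states, gamma=0.9):
    """One gradient step: MSE between Q_value (current net, Process 1) and Q_target (target net, Process 2)."""
    q_value = current_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)    # Q of the taken action
    with torch.no_grad():
        q_target = rewards + gamma * target_net(next_states).max(dim=1).values  # R + gamma * max_a Q'
    loss = F.mse_loss(q_value, q_target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# every fixed number of updates, repeat the parameter copy:
# target_net.load_state_dict(current_net.state_dict())
```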
Step S7, outputting the positioning coordinate corrected by the reinforcement learning model;
FIG. 9 compares the coarse positioning track with the ground-truth track: the black dot track is the inertial navigation output, which suffers from wall crossing, deviation at turning moments and similar problems; the dashed track is the ground truth.
Fig. 10 compares the coarse positioning track with the reinforcement-learning corrected track: the black dot track is the inertial navigation output and the dashed track is the corrected track after reinforcement learning has completed map matching. It can be seen that the inertial navigation track without map matching suffers from wall crossing and deviation at turning moments, whereas after map matching through reinforcement learning the track essentially coincides with the ground truth and obvious problems such as wall crossing no longer occur.
The invention has the following beneficial effects:
according to the method, a deep reinforcement learning model is designed and built according to inertial navigation data and map data, data fusion of map information and inertial navigation track information is completed, and map matching is achieved.
In addition, the method abandons traditional image processing techniques and extracts local map features with a neural network, which greatly improves computation speed. Secondly, in view of the wall-crossing problem of map-matched tracks that conventional techniques cannot solve, a method that fuses map and track information through reinforcement learning is proposed for the first time to eliminate wall crossing and complete map matching. Finally, once the deep reinforcement learning model has been fully trained, the saved model can be used directly to complete the track correction task.
In conclusion, the method has the advantages of strong global optimization capability, low technical complexity, strong generalization and the like, and is particularly suitable for severe environments in which the positioning device cannot be deployed in advance.
While the embodiments of the invention have been described in detail in connection with the accompanying drawings, it is not intended to limit the scope of the invention. Various modifications and changes may be made by those skilled in the art without inventive step within the scope of the appended claims.
Claims (5)
1. An indoor map matching method based on deep reinforcement learning is characterized by comprising the following steps:
s1, acquiring data of the pedestrian inertial navigation module and preprocessing the data to obtain pixel coordinates related to a map;
s2, constructing a local map generation module according to the pixel coordinates obtained in the step S1;
s3, defining that once the correction code for the current state is obtained, the corresponding corrected coordinate can be generated, including: converting the regression problem of track correction into a classification problem through a self-defined coding mode, and correcting the track by solving a suitable correction code for the current state;
s4, jointly representing the pixel coordinate information to be corrected and the local map as the state of the current position;
s5, designing a reward mechanism according to the consistency of the corrected coordinates of the single point and the label coordinates and the similarity of the corrected track and the standard path, wherein the reward mechanism is as follows: comprehensively considering the consistency of the corrected coordinates of the single point and the label coordinates and the similarity of the corrected track and the standard path, and returning a quantitative numerical value;
s6, constructing a double-network model of a target value network and a current value network, and taking the MSE between the target value network output and the current value network output as the loss function;
and S7, outputting the positioning coordinates corrected by the reinforcement learning model.
2. The deep reinforcement learning-based indoor map matching method according to claim 1, wherein: in step S1, the relative geodetic location coordinates of the pedestrian during traveling are collected, and the geodetic location coordinates are subjected to coordinate conversion, so as to generate pixel coordinates related to the map.
3. The deep reinforcement learning-based indoor map matching method according to claim 1, wherein: the map is cut according to the pixel coordinates generated in step S1, and a local map associated with the pixel coordinates is generated.
4. The deep reinforcement learning-based indoor map matching method according to claim 1, wherein the reward is graded according to the design of the action space and the Euclidean distance from the truth data: if the number output by the model is correct, the reward is 1; otherwise the reward decays level by level, each level being 0.75 times the previous value.
5. The deep reinforcement learning-based indoor map matching method according to claim 4, wherein the current value network quantizes the state, the action and the reward value to corresponding Q values through a value iteration network based on a Bellman equation in step S6; the target value network and the current value network have the same network structure, except that the network parameters need to be copied at certain time steps.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910840334.2A CN110530371B (en) | 2019-09-06 | 2019-09-06 | Indoor map matching method based on deep reinforcement learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910840334.2A CN110530371B (en) | 2019-09-06 | 2019-09-06 | Indoor map matching method based on deep reinforcement learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110530371A CN110530371A (en) | 2019-12-03 |
CN110530371B true CN110530371B (en) | 2021-05-18 |
Family
ID=68667273
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910840334.2A Active CN110530371B (en) | 2019-09-06 | 2019-09-06 | Indoor map matching method based on deep reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110530371B (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111061277B (en) * | 2019-12-31 | 2022-04-05 | 歌尔股份有限公司 | Unmanned vehicle global path planning method and device |
CN112146660B (en) * | 2020-09-25 | 2022-05-03 | 电子科技大学 | Indoor map positioning method based on dynamic word vector |
CN113008226B (en) * | 2021-02-09 | 2022-04-01 | 杭州电子科技大学 | Geomagnetic indoor positioning method based on gated cyclic neural network and particle filtering |
CN114001736A (en) * | 2021-11-09 | 2022-02-01 | Oppo广东移动通信有限公司 | Positioning method, positioning device, storage medium and electronic equipment |
CN114858158A (en) * | 2022-04-26 | 2022-08-05 | 河南省吉立达机器人有限公司 | Mobile robot repositioning method based on deep learning |
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105137967A (en) * | 2015-07-16 | 2015-12-09 | 北京工业大学 | Mobile robot path planning method with combination of depth automatic encoder and Q-learning algorithm |
CN106709449A (en) * | 2016-12-22 | 2017-05-24 | 深圳市深网视界科技有限公司 | Pedestrian re-recognition method and system based on deep learning and reinforcement learning |
CN108255182A (en) * | 2018-01-30 | 2018-07-06 | 上海交通大学 | A kind of service robot pedestrian based on deeply study perceives barrier-avoiding method |
CN108680174A (en) * | 2018-05-10 | 2018-10-19 | 长安大学 | A method of map match abnormal point is improved based on machine learning algorithm |
CN109059939A (en) * | 2018-06-27 | 2018-12-21 | 湖南智慧畅行交通科技有限公司 | Map-matching algorithm based on Hidden Markov Model |
CN109407676A (en) * | 2018-12-20 | 2019-03-01 | 哈尔滨工业大学 | The moving robot obstacle avoiding method learnt based on DoubleDQN network and deeply |
CN109871010A (en) * | 2018-12-25 | 2019-06-11 | 南方科技大学 | method and system based on reinforcement learning |
CN109855616A (en) * | 2019-01-16 | 2019-06-07 | 电子科技大学 | A kind of multiple sensor robot air navigation aid based on virtual environment and intensified learning |
Non-Patent Citations (1)
Title |
---|
A low-frequency trajectory data matching algorithm combining historical data and reinforcement learning; Sun Wenbin et al.; Acta Geodaetica et Cartographica Sinica; 2016-11-15 (No. 11); pp. 1328-1344 *
Also Published As
Publication number | Publication date |
---|---|
CN110530371A (en) | 2019-12-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110530371B (en) | Indoor map matching method based on deep reinforcement learning | |
Qingyun et al. | Cross-modality fusion transformer for multispectral object detection | |
Zhou et al. | To learn or not to learn: Visual localization from essential matrices | |
Mahjourian et al. | Geometry-based next frame prediction from monocular video | |
CN108491763B (en) | Unsupervised training method and device for three-dimensional scene recognition network and storage medium | |
US12008762B2 (en) | Systems and methods for generating a road surface semantic segmentation map from a sequence of point clouds | |
CN109272493A (en) | A kind of monocular vision odometer method based on recursive convolution neural network | |
CN112989220A (en) | Motion trajectory processing method, medium, device and equipment | |
CN115071762A (en) | Pedestrian trajectory prediction method, model and storage medium oriented to urban scene | |
Gilitschenski et al. | Deep context maps: Agent trajectory prediction using location-specific latent maps | |
CN112288776A (en) | Target tracking method based on multi-time step pyramid codec | |
Yao et al. | Goal-lbp: Goal-based local behavior guided trajectory prediction for autonomous driving | |
Radwan | Leveraging sparse and dense features for reliable state estimation in urban environments | |
Hou et al. | Fe-fusion-vpr: Attention-based multi-scale network architecture for visual place recognition by fusing frames and events | |
Sharjeel et al. | Real time drone detection by moving camera using COROLA and CNN algorithm | |
CN116486489A (en) | Three-dimensional hand object posture estimation method and system based on semantic perception graph convolution | |
CN111738092A (en) | Method for recovering shielded human body posture sequence based on deep learning | |
CN116309705A (en) | Satellite video single-target tracking method and system based on feature interaction | |
Kang et al. | ETLi: Efficiently annotated traffic LiDAR dataset using incremental and suggestive annotation | |
CN117058474B (en) | Depth estimation method and system based on multi-sensor fusion | |
Wang et al. | EFRNet-VL: An end-to-end feature refinement network for monocular visual localization in dynamic environments | |
Yao et al. | MLP-based Efficient Convolutional Neural Network for Lane Detection | |
Jeong et al. | Fast and Lite Point Cloud Semantic Segmentation for Autonomous Driving Utilizing LiDAR Synthetic Training Data | |
CN117570960A (en) | Indoor positioning navigation system and method for blind guiding robot | |
Cui et al. | Ellipse loss for scene-compliant motion prediction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |