CN112700049A - Order distribution method and device - Google Patents
Order distribution method and device
- Publication number
- CN112700049A (application number CN202011643182.6A)
- Authority
- CN
- China
- Prior art keywords
- driver
- value
- information
- order
- passenger
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The embodiment of the invention provides an order dispatching method and device. Real-time driver-passenger information of each vehicle is acquired, wherein the driver-passenger information comprises driver information and passenger order information. The real-time driver-passenger information is input into a trained neural network model, which outputs the driver-passenger pair value corresponding to each piece of driver-passenger information; the trained neural network model is obtained by training on a sample set comprising historical driver-passenger information. According to the driver-passenger pair values, driver-passenger matching is performed on all drivers and passengers by adopting the optimal bipartite-graph matching (KM) algorithm to obtain the driver with the highest matching degree for each passenger order, and the order is dispatched to the vehicle where that driver is located. The method solves the technical problems in the related art that driver-passenger matching is based only on the order price and is performed over all vehicles within a certain area around the pick-up position, so that the order completion efficiency of the whole dispatching platform is low and the income of the whole platform is affected.
Description
Technical Field
The invention relates to the technical field of communication, in particular to an order dispatching method and device.
Background
With the rapid development of mobile communication systems and the Global Positioning System (GPS), vehicle sharing platforms have grown rapidly and provide convenient travel services.
In a vehicle sharing platform, a passenger who wants to share a vehicle generally places an order; the platform then collects passenger orders and performs driver-passenger matching based only on the order price, over all vehicles within a certain area around the passenger's pick-up position, and determines the vehicle with the highest matching degree so as to dispatch the passenger's order.
Therefore, in the related art, driver-passenger matching is based only on the order price and is performed over all vehicles within a certain area around the pick-up position, so that the order completion efficiency of the whole dispatching platform is low and the income of the whole platform is affected.
Disclosure of Invention
The embodiment of the invention aims to provide an order dispatching method and device, so as to solve the technical problems in the related art that driver-passenger matching is based only on the order price and is performed over all vehicles within a certain area around the pick-up position, which makes the order completion efficiency of the whole dispatching platform low and affects the income of the whole platform. The specific technical scheme is as follows:
in a first aspect, an embodiment of the present invention provides an order dispatching method, including:
acquiring real-time driver-passenger information of each vehicle; wherein the driver-passenger information includes driver information and passenger order information; the driver information includes order pick-up distance information, vehicle position information and driver service level information; the passenger order information includes order starting position information, order ending position information and order price information;
inputting the real-time driver-passenger information into a trained neural network model, and outputting the driver-passenger pair value corresponding to each piece of driver-passenger information; wherein the trained neural network model is trained based on a sample set, and the sample set includes historical driver-passenger information;
and according to each driver-passenger pair value, performing driver-passenger matching on all drivers and passengers by adopting the optimal bipartite-graph matching (KM) algorithm to obtain the driver with the highest matching degree for the passenger order, and dispatching the order to the vehicle where the driver with the highest matching degree for the passenger order is located.
Further, the trained neural network model is obtained by training the following steps:
acquiring historical driver-passenger information of each vehicle in a service area;
inputting the historical driver-passenger information into a neural network to be trained, and outputting the driver-passenger pair value estimate corresponding to each piece of historical driver-passenger information;
constructing a loss function with minimizing the difference between the driver-passenger pair value estimate and the driver-passenger pair value target as the training objective, wherein the driver-passenger pair value target is calculated from the pick-up distance value, the order price value, the vehicle demand at the order destination and the driver service level value;
judging whether the loss function is lower than a preset threshold value or not;
if the loss function is not lower than the preset threshold value, adjusting parameters of the neural network to be trained to obtain an adjusted neural network;
updating the neural network to be trained with the adjusted neural network, and returning to the step of inputting the historical driver-passenger information into the neural network to be trained and outputting the driver-passenger pair value estimate corresponding to each piece of historical driver-passenger information, until the loss function is lower than the preset threshold value, so as to obtain the trained neural network model;
wherein the driver-passenger pair value target is determined by the following formula:
y = R_p + λ1·R_d + λ2·R_h + λ3·R_v
wherein y is the driver-passenger pair value target, R_p is the reward value corresponding to the order price, R_d is the reward value corresponding to the order pick-up distance, R_h is the reward value for the vehicle demand at the order destination, and R_v is the reward value for the driver service level; R_d is the reciprocal of the pick-up distance, and R_h is proportional to the vehicle demand at the order destination; λ1 is the weight of the reward value corresponding to the pick-up distance, λ2 is the weight of the reward value for the vehicle demand at the order destination, and λ3 is the weight of the driver service level reward value.
Further, after the historical driver-passenger information is input into the neural network to be trained and the driver-passenger pair value estimate corresponding to each piece of historical driver-passenger information is output, the method further includes:
storing, as experience samples in a sample pool, the difference between the driver-passenger pair value estimate and the driver-passenger pair value target, the historical driver-passenger information, and each driver-passenger pair value estimate;
extracting a batch of experience samples from the sample pool;
Further, after the step of adjusting the parameters of the neural network to be trained to obtain the adjusted neural network when the loss function is not lower than the preset threshold value, the method further includes:
updating the neural network to be trained with the adjusted neural network, updating the historical driver-passenger information with the batch of experience samples, and returning to the step of inputting the historical driver-passenger information into the neural network to be trained and outputting the driver-passenger pair value estimate corresponding to each piece of historical driver-passenger information, until the loss function is lower than the preset threshold value, so as to obtain the trained neural network model.
Further, the extracting a batch of experience samples from the sample pool includes:
calculating a weight probability density function over the experience samples according to the difference between the driver-passenger pair value estimate and the driver-passenger pair value target;
and performing batch sampling on the experience samples in the sample pool according to the weight probability density function to obtain a batch of samples.
Further, the inputting the real-time driver-passenger information into the trained neural network model and outputting each driver-passenger pair value includes:
inputting the driver-passenger information into the trained neural network model, wherein the trained neural network model outputs, through its output-layer neuron, the driver-passenger pair value corresponding to the driver-passenger information according to the order pick-up distance information, the order price information, the order ending position information and the driver service level information in the driver-passenger information; the number of neurons in the output layer of the neural network model is one, and the number of neurons in the input layer of the neural network model is determined by the dimension of the driver-passenger information.
Further, the step of inputting the real-time driver-passenger information into the trained neural network model and outputting the driver-passenger pair value corresponding to each piece of driver-passenger information includes:
searching, based on the real-time driver-passenger information, for all orders within a predetermined range of each of the drivers, and taking these as valid orders;
calculating, through the trained neural network model, the driver-passenger pair values between each driver and all of that driver's valid orders;
the step of performing driver-passenger matching on all drivers and passengers by adopting the optimal bipartite-graph matching (KM) algorithm according to each driver-passenger pair value to obtain the driver with the highest matching degree for the passenger, so as to dispatch the order to the vehicle where that driver is located, includes:
taking the driver-passenger pair values of all drivers and their valid orders as the input of the KM algorithm, performing driver-passenger matching on all drivers and passengers by using the KM algorithm, determining the matching that maximizes the total value of all driver-passenger pairs, and dispatching each order to the unique driver determined by the matching that maximizes the total value of all driver-passenger pairs.
In a second aspect, an embodiment of the present invention provides an order dispatching device, including:
the acquisition module is used for acquiring the real-time driver-passenger information of each vehicle; wherein the driver-passenger information includes driver information and passenger order information; the driver information includes order pick-up distance information, vehicle position information and driver service level information; the passenger order information includes order starting position information, order ending position information and order price information;
the first processing module is used for inputting the real-time driver-passenger information into the trained neural network model and outputting the driver-passenger pair value corresponding to each piece of driver-passenger information; wherein the trained neural network model is trained based on a sample set, and the sample set includes historical driver-passenger information;
and the matching module is used for performing driver-passenger matching on all drivers and passengers by adopting the optimal bipartite-graph matching (KM) algorithm according to each driver-passenger pair value to obtain the driver with the highest matching degree for the passenger order, so as to dispatch the order to the vehicle where the driver with the highest matching degree for the passenger order is located.
Further, the apparatus further comprises: the second processing module is used for training to obtain a trained neural network model by adopting the following steps:
acquiring historical driver-passenger information of each vehicle in a service area;
inputting the historical driver-passenger information into a neural network to be trained, and outputting the driver-passenger pair value estimate corresponding to each piece of historical driver-passenger information;
constructing a loss function with minimizing the difference between the driver-passenger pair value estimate and the driver-passenger pair value target as the training objective, wherein the driver-passenger pair value target is calculated from the pick-up distance value, the order price value, the vehicle demand at the order destination and the driver service level value;
judging whether the loss function is lower than a preset threshold value or not;
if the loss function is not lower than the preset threshold value, adjusting parameters of the neural network to be trained to obtain an adjusted neural network;
updating the neural network to be trained with the adjusted neural network, and returning to the step of inputting the historical driver-passenger information into the neural network to be trained and outputting the driver-passenger pair value estimate corresponding to each piece of historical driver-passenger information, until the loss function is lower than the preset threshold value, so as to obtain the trained neural network model;
wherein the driver-passenger pair value target is determined by the following formula:
y = R_p + λ1·R_d + λ2·R_h + λ3·R_v
wherein y is the driver-passenger pair value target, R_p is the reward value corresponding to the order price, R_d is the reward value corresponding to the order pick-up distance, R_h is the reward value for the vehicle demand at the order destination, and R_v is the reward value for the driver service level; R_d is the reciprocal of the pick-up distance, and R_h is proportional to the vehicle demand at the order destination; λ1 is the weight of the reward value corresponding to the pick-up distance, λ2 is the weight of the reward value for the vehicle demand at the order destination, and λ3 is the weight of the driver service level reward value.
Further, the apparatus further comprises:
the storage module is used for, after the historical driver-passenger information is input into the neural network to be trained and the driver-passenger pair value estimate corresponding to each piece of historical driver-passenger information is output, storing, as experience samples in a sample pool, the difference between the driver-passenger pair value estimate and the driver-passenger pair value target, the historical driver-passenger information, and each driver-passenger pair value estimate;
the sampling module is used for extracting a batch of experience samples from the sample pool;
the device further comprises:
and a third processing module, configured to, after the parameters of the neural network to be trained are adjusted to obtain the adjusted neural network when the loss function is not lower than the preset threshold value, update the neural network to be trained with the adjusted neural network, update the historical driver-passenger information with the batch of experience samples, and return to the step of inputting the historical driver-passenger information into the neural network to be trained and outputting the driver-passenger pair value estimate corresponding to each piece of historical driver-passenger information, until the loss function is lower than the preset threshold value, so as to obtain the trained neural network model.
In a third aspect, an embodiment of the present invention provides an electronic device, including a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;
a memory for storing a computer program;
a processor for implementing the steps of the method of any one of the first aspect when executing a program stored in the memory.
In a fourth aspect, the present invention provides a computer-readable storage medium, which stores instructions that, when executed on a computer, cause the computer to perform the method of any one of the above first aspects.
The embodiment of the invention has the following beneficial effects:
the order dispatching method and the order dispatching device provided by the embodiment of the invention are used for independently dispatching orders for each vehicle. In the process of individually dispatching orders, the dispatching of orders is related to driver information and passenger order information, so that the embodiment of the invention considers order starting position information, order ending position information and driver information of each vehicle, such as vehicle position information and driver service level information, in addition to order price information, and carries out driving and taking matching on all vehicles to obtain the driver with the highest matching degree with the passenger. Therefore, the dispatched orders are more matched to dispatch the orders to the vehicle where the driver with the highest matching degree with the passenger is located, the time wasted when a single vehicle is not matched with the dispatched orders is reduced, the efficiency of dispatching the orders is improved, and the completion efficiency of the orders is further improved; meanwhile, when each vehicle is matched with the dispatched order, the completion efficiency of each order is improved, so that the order completion efficiency of the whole dispatching platform is improved, and the income of the whole dispatching platform is improved.
Of course, not all of the advantages described above need to be achieved at the same time in the practice of any one product or method of the invention.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a flow chart illustrating an order dispatching method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a first process for obtaining a trained neural network model according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a second process for obtaining a trained neural network model according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of an internal structure of a neural network model according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an order dispatch device according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
With the rapid development of mobile communication systems and the Global Positioning System (GPS), vehicle sharing platforms have grown rapidly and provide convenient travel services.
In a vehicle sharing platform, a passenger who wants to share a vehicle generally places an order; the platform then collects passenger orders and performs driver-passenger matching based only on the order price, over all vehicles within a certain area around the passenger's pick-up position, and determines the vehicle with the highest matching degree so as to dispatch the passenger's order.
Therefore, in the related art, driver-passenger matching is based only on the order price and is performed over all vehicles within a certain area around the pick-up position, so that the order completion efficiency of the whole dispatching platform is low and the income of the whole platform is affected.
Therefore, in order to solve the technical problems that driver-passenger matching is based only on the order price and is performed over all vehicles within a certain area around the pick-up position, so that the order completion efficiency of the whole dispatching platform is low and the income of the whole platform is affected, the inventor considered two points. First, when driver-passenger matching is based only on the order price, only the current benefit is considered rather than the long-term benefit, which causes a loss of part of the order taking rate and of the platform benefit; therefore, the inventor considered that, besides the order price, the vehicle demand of the order destination area (namely, the area within a predetermined range of the order ending position) should also be taken into account during matching. Second, for driver-passenger matching to improve both the driver-passenger experience and the platform income, the fundamental problem of mismatched traffic supply and demand must be addressed, namely lengthening the online time of drivers and ensuring that more orders are served, which calls for a sound driver income mechanism that encourages drivers to increase their online time and service quality.
In view of the above, embodiments of the present invention provide an order dispatching method and apparatus. In order dispatching, all vehicles in an area are not dispatched to other places with higher vehicle demand; instead, an order is dispatched for each vehicle individually. In the process of individual order dispatch, the dispatch result depends on both the driver information and the passenger order information; therefore, in addition to the order price information, the embodiment of the invention also considers the order starting position information, the order ending position information and the driver information of each vehicle, such as the vehicle position information and the driver service level information, and performs driver-passenger matching on all vehicles to obtain the driver with the highest matching degree for the passenger. Since the dispatched order is better matched to the vehicle where that driver is located, the time wasted when a single vehicle is poorly matched with a dispatched order is reduced, the dispatching efficiency is improved, and the order completion efficiency is further improved; meanwhile, when every vehicle is well matched with its dispatched order, the completion efficiency of each order is improved, so that the order completion efficiency of the whole dispatching platform is improved, and the income of the whole dispatching platform is increased.
First, an order dispatching method provided by the embodiment of the invention is described below.
The order dispatching method provided by the embodiment of the invention is applied to an order dispatching platform, wherein the order dispatching platform can be used for distributing orders for vehicles, so that the vehicles can provide traffic service. Such a dispatch platform, which may also be referred to as a vehicle reservation or vehicle dispatch platform, is accessible through an electronic device such as a mobile terminal installed with the platform application.
First, through the platform application, the passenger can transmit a request for matching vehicles to the order platform with passenger order information. Wherein the passenger order information includes: order starting position information, order ending position information and order price information. For clarity of layout, the passenger order information is described in detail later.
Secondly, the order platform can acquire the driver information of each vehicle, and then generate the driver-passenger information of each vehicle one by one from the driver information and the passenger order information, wherein the driver information includes order pick-up distance information, vehicle position information and driver service level information. One driver's vehicle paired with one passenger's order information constitutes one piece of driver-passenger information; for example, 100 vehicles and 2 passenger orders generate 200 pieces of driver-passenger information. For clarity of layout, the driver information is described in detail later.
Finally, the order platform performs driver-passenger matching based on the driver-passenger information to obtain the driver with the highest matching degree for the passenger order, so as to dispatch the order to the vehicle where that driver is located.
With reference to the above description, an order dispatching method according to an embodiment of the present invention is described in detail below.
As shown in fig. 1, an order dispatching method provided by the embodiment of the present invention may include the following steps:
Step 110, acquiring real-time driver-passenger information of each vehicle. In order to accomplish real-time driver-passenger matching, the drivers' vehicle information and the passenger order information need to be obtained in real time. In order to subsequently perform driver-passenger matching on the drivers' vehicles and the passenger order information piece by piece during real-time matching, so as to obtain a more accurate matching result, in the embodiment of the invention, acquiring the real-time driver-passenger information of each vehicle further includes: generating the real-time driver-passenger information of each vehicle one by one from the vehicles of all drivers and the passenger order information acquired in real time, namely, for the vehicles of all drivers and all pieces of passenger order information, each driver's vehicle corresponds to one piece of passenger order information.
Of course, in order to limit the scope of real-time driver-passenger matching, acquiring the real-time driver-passenger information of each vehicle further includes: acquiring the vehicles of all drivers in a service area and the passenger order information in real time, and generating the real-time driver-passenger information of each vehicle in the service area one by one, namely, for the vehicles of all drivers in the service area and all pieces of passenger order information, each driver's vehicle corresponds to one piece of passenger order information.
The order pick-up distance information is the distance between the order starting position and the vehicle position. The order starting position information, order ending position information, order price information, vehicle position information and driver service level information can all be obtained from the order and the driver's vehicle.
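As an illustration only (not part of the patent text), the following Python sketch shows one way the per-pair driver-passenger feature vector described above could be assembled; all function names, field names and sample values are assumptions.

    import math

    def haversine_km(lat1, lon1, lat2, lon2):
        # great-circle distance in kilometres between two (lat, lon) points
        r = 6371.0
        p1, p2 = math.radians(lat1), math.radians(lat2)
        dp = math.radians(lat2 - lat1)
        dl = math.radians(lon2 - lon1)
        a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
        return 2 * r * math.asin(math.sqrt(a))

    def build_pair_features(driver, order):
        # one feature vector per (driver, order) pair; its length determines the
        # number of input-layer neurons of the value network
        pickup_km = haversine_km(driver['lat'], driver['lon'],
                                 order['start_lat'], order['start_lon'])
        return [pickup_km,
                order['start_lat'], order['start_lon'],
                order['end_lat'], order['end_lon'],
                order['price'],
                driver['service_level']]

    # hypothetical data: m drivers and n orders yield m x n pieces of
    # driver-passenger information (e.g. 100 x 2 = 200)
    drivers = [{'lat': 31.23, 'lon': 121.47, 'service_level': 1}]
    orders = [{'start_lat': 31.24, 'start_lon': 121.48,
               'end_lat': 31.30, 'end_lon': 121.52, 'price': 25.0}]
    pairs = [(d, o, build_pair_features(d, o)) for d in drivers for o in orders]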
Step 120, inputting the real-time driver-passenger information into the trained neural network model, and outputting the driver-passenger pair value corresponding to each piece of driver-passenger information. In the step 120, each driver-passenger pair value corresponds to one specific driver and one specific order; that is, the input of the trained neural network model is the features of the driver and the order, and the output of the trained neural network model is the value corresponding to that driver-order pair.
In order to subsequently maximize the total value of all driver-order pairs, in the step 120, the real-time driver-passenger information is input into the trained neural network model piece by piece, so that the output of the trained neural network model is the driver-passenger pair value corresponding to each piece of driver-passenger information. For clarity of layout, the driver-passenger pair values corresponding to the driver-passenger information are described in detail later.
The historical driver-passenger information includes historical driver information and historical passenger order information;
the historical driver information includes historical vehicle position information and historical driver service level information; the historical passenger order information includes historical order starting position information, historical order ending position information and historical order price information.
The historical vehicle position information and the historical order starting position information are used by the neural network model to estimate the order pick-up distance, so that the pick-up distance is reduced and the passenger experience is improved. The historical order ending position information is used by the neural network to estimate the vehicle demand at the order destination, so that vehicles are dispatched toward areas with higher demand, which improves the order response rate and the platform income of the vehicle sharing platform. The historical order price information is used by the neural network to estimate the order price, so that orders with higher prices are served preferentially and the income of the vehicle sharing platform is improved. The driver service level information is used by the neural network model to distinguish the service levels of drivers and to ensure that drivers with higher service levels obtain better income, namely "good drivers earn well"; the service level of a driver is evaluated comprehensively from the driver's online duration and service quality. In detail, in order to fundamentally address the mismatch between traffic supply and demand, a sound driver income mechanism is provided to encourage drivers to increase their online time and service quality, thereby ensuring that more orders are served and that the vehicle sharing platform develops well. The better a driver's service quality and the longer the online time, the higher the driver service level, i.e., the better the driver service level information.
Step 130, performing driver-passenger matching on all drivers and passengers by adopting the optimal bipartite-graph matching (KM) algorithm according to each driver-passenger pair value, to obtain the driver with the highest matching degree for the passenger order, and dispatching the order to the vehicle where that driver is located.
In the step 130, driver-passenger matching is performed according to the driver-passenger pair values output by the neural network model, using the optimal bipartite-graph matching algorithm (Kuhn-Munkres, KM for short), so as to maximize the total value of all driver-passenger pairs.
Specifically, based on the step 120, the method further includes: searching, based on the real-time driver-passenger information, for all orders within a predetermined range of each driver and taking these as valid orders; and calculating, through the trained neural network model, the driver-passenger pair values between each driver and all of that driver's valid orders. The step 130 further includes: taking the driver-passenger pair values of all drivers and their valid orders as the input of the KM algorithm, performing driver-passenger matching on all drivers and passengers by using the KM algorithm, determining the matching that maximizes the total value of all driver-passenger pairs, and dispatching each order to the unique driver determined by that matching.
As a specific example, assume that the predetermined range of each driver may be, but is not limited to, a radius of 3 km. The order starting position information, order ending position information, order pick-up distance information and other information of each order can be obtained from the real-time driver-passenger information, so that all orders within 3 km of each driver can be found among all orders; these are called valid orders.
The driver-passenger pair values between each driver and all of that driver's valid orders are then calculated through the trained neural network model.
The driver-passenger pair values of all drivers and their valid orders are taken as the input of the KM algorithm; the KM algorithm performs driver-passenger matching on all drivers and passengers, determines the matching that maximizes the total value of all driver-passenger pairs, and each order is dispatched to the unique driver determined by that matching. In this way, each order is matched to a unique driver by maximizing the total value of all driver-passenger pairs.
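As an illustration only (not described in the patent text itself), the following Python sketch shows how such a value-maximizing assignment could be computed with SciPy's linear_sum_assignment, which implements the Hungarian/Kuhn-Munkres assignment; the 3 km validity radius and all variable names are assumptions.

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def dispatch(drivers, orders, pair_value, pickup_km, max_km=3.0):
        # pair_value[i][j]: value of pairing driver i with order j (from the trained model)
        # pickup_km[i][j]:  pick-up distance between driver i and order j
        value = np.array(pair_value, dtype=float)
        # pairs outside the predetermined radius are invalid: give them a very low value
        value[np.array(pickup_km) > max_km] = -1e9
        # KM / Hungarian assignment that maximizes the total driver-passenger pair value
        rows, cols = linear_sum_assignment(value, maximize=True)
        assignment = {}
        for i, j in zip(rows, cols):
            if value[i, j] > -1e9:                    # skip invalid pairings
                assignment[orders[j]] = drivers[i]    # order j dispatched to driver i
        return assignment

    # hypothetical data: 3 drivers, 2 orders
    values = [[5.0, 1.2], [3.1, 4.4], [0.5, 2.0]]
    dists = [[1.0, 2.5], [2.0, 1.5], [4.0, 2.8]]
    print(dispatch(['d1', 'd2', 'd3'], ['o1', 'o2'], values, dists))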
In vehicle order dispatching, all vehicles in an area are not dispatched to other places with higher vehicle demand; instead, an order is dispatched for each vehicle individually. In the process of individual order dispatch, the dispatch result depends on both the driver information and the passenger order information; therefore, in addition to the order price information, the embodiment of the invention also considers the order starting position information, the order ending position information and the driver information of each vehicle, such as the vehicle position information and the driver service level information, and performs driver-passenger matching on all vehicles to obtain the driver with the highest matching degree for the passenger. Since the dispatched order is better matched to the vehicle where that driver is located, the time wasted when a single vehicle is poorly matched with a dispatched order is reduced, the dispatching efficiency is improved, and the order completion efficiency is further improved; meanwhile, when every vehicle is well matched with its dispatched order, the completion efficiency of each order is improved, so that the order completion efficiency of the whole dispatching platform is improved, and the income of the whole dispatching platform is increased.
In order to maximize the total value of all driver-order pairs when matching is performed on the driver-passenger pair values, the accuracy of each driver-passenger pair value must be ensured. In one possible implementation, the neural network to be trained is therefore trained first, so as to obtain a more accurate trained neural network model. As shown in fig. 2, the specific implementation process is as follows:
and step 121, obtaining historical riding information of each vehicle in the service area. The historical ride information refers to ride information in a time period before the current time (i.e., real time), and the time period before the current time is referred to as a historical time period compared with the real-time ride information.
In order to obtain the trained neural network model more accurately, so that during historical driver-passenger matching the drivers' vehicles and the passenger order information can be matched piece by piece to obtain a more accurate matching result, in the embodiment of the invention, the step of acquiring the historical driver-passenger information of each vehicle in the service area further includes: generating the historical driver-passenger information of each vehicle one by one from the vehicles of all drivers in the historical time period in the service area and the passenger order information, namely, for the vehicles of all drivers and all pieces of passenger order information in the historical time period in the service area, each driver's vehicle corresponds to one piece of passenger order information.
Step 122, inputting the historical driver-passenger information into the neural network to be trained, and outputting the driver-passenger pair value estimate corresponding to each piece of historical driver-passenger information.
In the step 122, the historical driver-passenger information is input into the neural network to be trained piece by piece, and the neural network to be trained estimates the value of each driver-passenger pair according to the order pick-up distance information, order price information, order ending position information and driver service level information obtained from the historical driver-passenger information. Subsequently, in order to accelerate the convergence of the neural network model to be trained and to improve the utilization rate of the experience samples, training is performed with optimized sample sampling; therefore, the driver-passenger information and the driver-passenger pair value estimate are first combined into one piece of sample information, namely <driver-passenger information, driver-passenger pair value estimate>, which is stored as an experience sample in a sample pool; a batch of experience samples is then drawn from the sample pool, and the neural network model is trained with it until the model converges. For clarity of layout, the details are described later.
The order pick-up distance information, order price information, order ending position information and driver service level information determine the output value of the neural network model to be trained; the weights of that output (namely, the weight of the reward value corresponding to the pick-up distance, the weight of the reward value for the vehicle demand at the order destination, and the weight of the driver service level reward value) are updated automatically according to batch gradient descent (BGD) over each extracted batch, where |B| denotes the number of samples in the extracted batch and Σ denotes summation over that batch. Notably, the order pick-up distance information ensures that the pick-up distance is reduced so as to improve the passenger experience; the order price information ensures that orders with higher prices are served preferentially so as to improve the income of the vehicle sharing platform; the order ending position information ensures that, after driver-passenger matching, vehicles travel toward areas with higher demand; and the driver service level information ensures that "good drivers earn well", so as to promote the healthy ecological development of the vehicle sharing platform.
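The automatic-update formula itself is not reproduced in the text above; purely as an illustration of batch-gradient-descent-updated weights (an assumption, not the patent's exact formula), λ1, λ2 and λ3 could be treated as trainable parameters and updated from the batch loss, for example:

    import torch

    # lambda weights treated as trainable parameters (illustrative assumption)
    lambdas = torch.nn.Parameter(torch.ones(3))
    optimizer = torch.optim.SGD([lambdas], lr=1e-3)

    def batch_update(R_p, R_d, R_h, R_v, q_estimates):
        # each argument is a tensor of length |B| (one entry per sample in the batch)
        y = R_p + lambdas[0] * R_d + lambdas[1] * R_h + lambdas[2] * R_v
        loss = ((y - q_estimates) ** 2).mean()   # averaged over the |B| batch samples
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()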
Step 123, constructing a loss function with minimizing the difference between the driver-passenger pair value estimate and the driver-passenger pair value target as the training objective. The driver-passenger pair value target in this step is calculated from the pick-up distance value, the order price value, the vehicle demand at the order destination (also referred to as the order destination heat value), and the driver service level value; for example, the service level value of a high-quality driver is 1 and that of an ordinary driver is 0. The specific determination of the driver-passenger pair value target is described in detail below and is not repeated here.
In order to train the neural network, the loss function is constructed by the following steps:
In the first step, the driver-passenger pair value target is calculated as y = R_p + λ1·R_d + λ2·R_h + λ3·R_v.
In the second step, the driver-passenger information is input into the neural network to be trained to obtain the driver-passenger pair value estimate Q(s, a; θ), where s represents the driver information, a represents the passenger order information, and θ represents the neural network parameters; in the notation (s, a; θ), the semicolon distinguishes the inputs s and a from the parameter variable θ.
In the third step, the loss function is obtained from the driver-passenger pair value estimate and the driver-passenger pair value target by the following formula:
J(θ) = (R_p + λ1·R_d + λ2·R_h + λ3·R_v - Q(s, a; θ))^2.
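As an illustration only (an assumption, not code from the patent), this loss can be written directly in PyTorch; the network q_net and all tensor names are hypothetical.

    import torch

    def pair_value_loss(q_net, s, a, R_p, R_d, R_h, R_v, lam1, lam2, lam3):
        # driver-passenger pair value target: y = R_p + λ1·R_d + λ2·R_h + λ3·R_v
        y = R_p + lam1 * R_d + lam2 * R_h + lam3 * R_v
        # driver-passenger pair value estimate Q(s, a; θ) from the network
        q = q_net(torch.cat([s, a], dim=-1)).squeeze(-1)
        # J(θ) = (y - Q(s, a; θ))^2, averaged over the batch
        return ((y - q) ** 2).mean()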
Step 124, judging whether the loss function is lower than the preset threshold value; if not, executing step 125; if so, executing step 127.
Step 125, adjusting the parameters of the neural network to be trained to obtain the adjusted neural network.
Step 126, updating the neural network to be trained with the adjusted neural network, and returning to step 122 until the loss function is lower than the preset threshold value, then executing step 127.
Step 127, obtaining the trained neural network model.
In order to accelerate the convergence of the neural network model to be trained and to improve the utilization rate of the experience samples, neural network training is performed with optimized sample sampling. The further training process is shown in fig. 3 and is specifically implemented as follows:
and step 121, obtaining historical riding information of each vehicle in the service area. The historical ride information refers to ride information in a time period before the current time (i.e., real time), and the time period before the current time is referred to as a historical time period compared with the real-time ride information.
Step 122, inputting the historical driver-passenger information into the neural network to be trained, and outputting the driver-passenger pair value estimate corresponding to each piece of historical driver-passenger information.
Through the above steps 121 and 122, the historical driver-passenger information of each vehicle in the service area is collected. Next, the historical driver-passenger data is input into the neural network model piece by piece. Then, the experience sample <driver-passenger information, driver-passenger pair value estimate> is stored in the sample pool, which is used to train the neural network to be trained subsequently and to provide batches of experience samples. Furthermore, in order to accelerate the convergence speed of the neural network model and the utilization rate of the experience samples, a curiosity-driven mechanism is provided: when samples are drawn from the sample pool, samples that have been drawn fewer times have a higher probability of being drawn, which accelerates the convergence of the neural network and increases the utilization rate of the samples. A weight probability density function is calculated over all experience samples in the sample pool. The details are described below.
Step 1221, storing, as experience samples in the sample pool, the difference between the driver-passenger pair value estimate and the driver-passenger pair value target, the historical driver-passenger information, and each driver-passenger pair value estimate.
Step 1222 further includes: first, calculating a weight probability density function over the experience samples according to the difference between the driver-passenger pair value estimate and the driver-passenger pair value target, where w_i denotes the probability of drawing the i-th experience sample, l_i denotes the difference between the driver-passenger pair value estimate and the driver-passenger pair value target in the i-th experience sample, and N denotes the total number of experience samples in the sample pool; and then, performing batch sampling on the experience samples in the sample pool according to the weight probability density function to obtain a batch of samples. Meanwhile, the loss function is constructed with minimizing the difference between the driver-passenger pair value estimate and the driver-passenger pair value target as the training objective.
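The exact weight formula is not reproduced in the text above; as an illustration only (an assumption, in the spirit of prioritized experience replay), one choice consistent with the variable definitions is to normalize the per-sample differences, w_i = l_i / Σ_j l_j, and draw the batch with those probabilities:

    import numpy as np

    def sample_batch(pool, batch_size):
        # pool: list of experience samples, each with an 'l' field holding the
        #       absolute difference between pair-value estimate and target
        l = np.array([abs(sample['l']) for sample in pool]) + 1e-8  # avoid zero weights
        w = l / l.sum()                      # weight probability density over the pool
        idx = np.random.choice(len(pool), size=batch_size, replace=False, p=w)
        return [pool[i] for i in idx]        # requires batch_size <= len(pool)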
The driver-passenger pair value target is determined by the following formula:
y = R_p + λ1·R_d + λ2·R_h + λ3·R_v
wherein y is the driver-passenger pair value target, R_p is the reward value corresponding to the order price, R_d is the reward value corresponding to the order pick-up distance, R_h is the reward value for the vehicle demand at the order destination, and R_v is the reward value for the driver service level; R_d is the reciprocal of the pick-up distance, and R_h is proportional to the vehicle demand at the order destination; λ1 is the weight of the reward value corresponding to the pick-up distance, λ2 is the weight of the reward value for the vehicle demand at the order destination, and λ3 is the weight of the driver service level reward value. A high-quality driver is a driver whose service level is the highest or among the highest, namely within a first preset number of positions below the maximum service level; a low-quality driver is a driver whose service level is the lowest or among the lowest, namely within a first preset number of positions above the minimum service level. The first preset number can be set according to user requirements. The higher the driver service level, the higher the reward value; the shorter the pick-up distance, the higher the reward value; the greater the vehicle demand at the order destination, the higher the reward value.
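As an illustration only (the demand normalization, the default weights and all numbers are assumptions, not values from the patent), the target could be computed as follows:

    def pair_value_target(price, pickup_km, dest_demand, is_high_quality_driver,
                          lam1=1.0, lam2=1.0, lam3=1.0):
        R_p = price                          # reward from the order price
        R_d = 1.0 / max(pickup_km, 1e-6)     # reciprocal of the pick-up distance
        R_h = dest_demand                    # proportional to destination vehicle demand
        R_v = 1.0 if is_high_quality_driver else 0.0   # driver service level reward
        # y = R_p + λ1·R_d + λ2·R_h + λ3·R_v
        return R_p + lam1 * R_d + lam2 * R_h + lam3 * R_v

    # e.g. a 25-yuan order, 1.5 km pick-up, destination demand 0.8, high-quality driver
    y = pair_value_target(25.0, 1.5, 0.8, True)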
Step 123, constructing the loss function with minimizing the difference between the driver-passenger pair value estimate and the driver-passenger pair value target as the training objective.
Of course, the execution order of steps 1221 and 1222 relative to step 123 is not limited.
Step 124, judging whether the loss function is lower than the preset threshold value; if not, executing step 125; if so, executing step 127.
Step 125, adjusting the parameters of the neural network to be trained to obtain the adjusted neural network.
Step 1261, updating the neural network to be trained with the adjusted neural network, updating the historical driver-passenger information with the batch of experience samples, and returning to step 122 until the loss function is lower than the preset threshold value, then executing step 127.
Step 127, obtaining the trained neural network model.
In order to determine the accuracy of the trained neural network model, the trained neural network model may be tested, and the testing process may be implemented as follows.
First, driver-passenger information in the service area is acquired;
second, the driver-passenger information is input into the trained neural network model, and the driver-passenger pair value corresponding to each piece of driver-passenger information is calculated;
third, driver-passenger matching is performed based on each driver-passenger pair value so as to determine the matching that maximizes the total value of all driver-passenger pairs.
In order to subsequently maximize the total value of all driver-order pairs and to obtain the driver-passenger pair value corresponding to each piece of driver-passenger information, the step 120 further includes:
inputting the driver-passenger information into the trained neural network model, wherein the trained neural network model outputs, through its output-layer neuron, the driver-passenger pair value corresponding to the driver-passenger information according to the order pick-up distance information, the order price information, the order ending position information and the driver service level information in the driver-passenger information; the number of neurons in the output layer of the neural network model is one, and the number of neurons in the input layer of the neural network model is determined by the dimension of the driver-passenger information.
In the above step, the trained neural network model outputs, through its output-layer neuron, the driver-passenger pair value corresponding to the driver-passenger information according to the order pick-up distance information, order price information, order ending position information and driver service level information obtained from the driver-passenger information.
Further, in order to show the working principle of the neural network model more clearly, the internal structure of the neural network model is shown in fig. 4. The neural network model specifically includes an estimation Q network and a sample pool. In each training step, a piece of historical driver-passenger information is input into the estimation Q network; the estimation Q network estimates the corresponding driver-passenger pair value to obtain the driver-passenger pair value estimate, and the experience sample, namely <driver-passenger information, driver-passenger pair value estimate>, is stored in the sample pool; then a batch of samples is drawn from the sample pool through the curiosity-driven mechanism, and the parameters of the estimation Q network are updated until the estimation Q network converges.
The estimation Q network is a three-layer fully-connected neural network; its input is (s, a), namely the driver-passenger information, and its output is Q(s, a; θ), namely the driver-passenger pair value.
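As an illustration only (layer widths, the hypothetical feature dimension of 7 and all names are assumptions, not specified in the patent text), such a three-layer fully-connected estimation Q network could be written as:

    import torch
    import torch.nn as nn

    class EstimationQNetwork(nn.Module):
        # input: concatenated driver-passenger feature vector (s, a)
        # output: a single neuron giving the driver-passenger pair value Q(s, a; θ)
        def __init__(self, input_dim, hidden_dim=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(input_dim, hidden_dim), nn.ReLU(),
                nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
                nn.Linear(hidden_dim, 1),
            )

        def forward(self, sa):
            return self.net(sa)

    # the input-layer size is determined by the dimension of the driver-passenger information
    q_net = EstimationQNetwork(input_dim=7)
    pair_value = q_net(torch.randn(1, 7))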
The curiosity-driven mechanism is mainly embodied by the weight probability density function of the experience samples described above, which is used to calculate the weight of each sample, i.e., the probability with which each sample in the sample pool is drawn.
In the embodiment of the invention, driver-passenger matching considers not only the current benefit but also the future benefit. Meanwhile, "good drivers earn well" is taken into account, which encourages drivers to increase their online duration and service quality and thereby promotes the healthy development of the vehicle sharing platform. In addition, in order to accelerate the convergence speed of the neural network model and the utilization rate of the experience samples, a curiosity-driven mechanism is provided for batch sampling of the experience samples. Therefore, the order response rate and the platform income of the vehicle sharing platform can be effectively improved.
The following provides a description of an order dispatching device according to an embodiment of the present invention.
Referring to fig. 5, fig. 5 is a first structural schematic diagram of an order dispatching device according to an embodiment of the invention. The order distribution device provided by the embodiment of the invention can comprise the following modules:
the acquisition module 21 is used for acquiring the real-time driver-passenger information of each vehicle; wherein the driver-passenger information includes driver information and passenger order information; the driver information includes order pick-up distance information, vehicle position information and driver service level information; the passenger order information includes order starting position information, order ending position information and order price information;
the first processing module 22 is configured to input the real-time driver-passenger information into the trained neural network model and to output the driver-passenger pair value corresponding to each piece of driver-passenger information; wherein the trained neural network model is trained based on a sample set, and the sample set includes historical driver-passenger information;
and the matching module 23 is configured to perform driver-passenger matching on all drivers and passengers by adopting the optimal bipartite-graph matching (KM) algorithm according to each driver-passenger pair value to obtain the driver with the highest matching degree for the passenger order, so as to dispatch the order to the vehicle where the driver with the highest matching degree for the passenger order is located.
In one possible implementation, the apparatus further includes: the second processing module is used for training to obtain a trained neural network model by adopting the following steps:
acquiring historical driver-passenger information of each vehicle in a service area;
inputting the historical driver-passenger information into a neural network to be trained, and outputting the driver-passenger pair value estimate corresponding to each piece of historical driver-passenger information;
constructing a loss function with minimizing the difference between the driver-passenger pair value estimate and the driver-passenger pair value target as the training objective, wherein the driver-passenger pair value target is calculated from the pick-up distance value, the order price value, the vehicle demand at the order destination and the driver service level value;
judging whether the loss function is lower than a preset threshold value or not;
if the loss function is not lower than the preset threshold value, adjusting parameters of the neural network to be trained to obtain an adjusted neural network;
updating the neural network to be trained with the adjusted neural network, and returning to the step of inputting the historical driver-passenger information into the neural network to be trained and outputting the driver-passenger pair value estimate corresponding to each piece of historical driver-passenger information, until the loss function is lower than the preset threshold value, so as to obtain the trained neural network model;
wherein the driver-passenger pair value target is determined by the following formula:
y = R_p + λ1·R_d + λ2·R_h + λ3·R_v
wherein y is the driver-passenger pair value target, R_p is the reward value corresponding to the order price, R_d is the reward value corresponding to the order pick-up distance, R_h is the reward value for the vehicle demand at the order destination, and R_v is the reward value for the driver service level; R_d is the reciprocal of the pick-up distance, and R_h is proportional to the vehicle demand at the order destination; λ1 is the weight of the reward value corresponding to the pick-up distance, λ2 is the weight of the reward value for the vehicle demand at the order destination, and λ3 is the weight of the driver service level reward value.
In one possible implementation, the apparatus further includes:
the storage module is used for, after the historical driver-passenger information is input into the neural network to be trained and the driver-passenger pair value estimate corresponding to each piece of historical driver-passenger information is output, storing, as experience samples in a sample pool, the difference between the driver-passenger pair value estimate and the driver-passenger pair value target, the historical driver-passenger information, and each driver-passenger pair value estimate;
the sampling module is used for extracting a batch of experience samples from the sample pool;
the device further comprises:
and a third processing module, configured to, after the parameters of the neural network to be trained are adjusted to obtain the adjusted neural network when the loss function is not lower than the preset threshold value, update the neural network to be trained with the adjusted neural network, update the historical driver-passenger information with the batch of experience samples, and return to the step of inputting the historical driver-passenger information into the neural network to be trained and outputting the driver-passenger pair value estimate corresponding to each piece of historical driver-passenger information, until the loss function is lower than the preset threshold value, so as to obtain the trained neural network model.
In one possible implementation, the sampling module is configured to:
calculating a weight probability density function over the experience samples according to the differences between the driver-passenger pair value estimates and the driver-passenger pair value target values;
and sampling the experience samples in the sample pool in batches according to the weight probability density function to obtain the batch of samples.
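One way to read this is as a prioritized sampling scheme: samples whose estimates deviate most from their targets are drawn more often. The sketch below assumes the pool stores (state, target, absolute error) tuples and uses NumPy; both choices are assumptions for illustration.

```python
# Draw a batch with probabilities proportional to the estimate-to-target differences.
import numpy as np

def sample_batch_by_error(pool, batch_size=64, eps=1e-6):
    # pool: list of (state, target, abs_error) experience samples
    errors = np.array([abs_err for _, _, abs_err in pool]) + eps
    probs = errors / errors.sum()                 # weight probability density over the pool
    idx = np.random.choice(len(pool), size=batch_size, p=probs, replace=False)
    return [pool[i] for i in idx]
```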
In a possible implementation manner, the first processing module is configured to:
inputting the driver and passenger information into the trained neural network model, which outputs, through its output layer neuron, each driver-passenger pair value corresponding to the driver and passenger information according to the order-receiving distance information, the order price information, the order end point position information and the driver service level information in the driver and passenger information; the output layer of the neural network model has a single neuron, and the number of neurons in its input layer is determined by the dimension of the driver and passenger information.
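The shape constraint above (input width equal to the dimension of the driver and passenger information, a single output neuron) can be illustrated with a small network definition. The hidden layer sizes and activation choice are assumptions; only the input and output dimensions reflect the text.

```python
# Minimal value-network sketch: input neurons = dimension of the driver and passenger
# information vector, one output neuron producing the driver-passenger pair value.
import torch.nn as nn

class PairValueNet(nn.Module):
    def __init__(self, ride_info_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(ride_info_dim, hidden),  # input layer sized by the info dimension
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),              # single output neuron: the pair value
        )

    def forward(self, x):
        return self.net(x)
```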
In a possible implementation manner, the first processing module is configured to search, among all drivers and based on the real-time driver and passenger information, for all orders within a predetermined range of each driver, and to take these orders as valid orders;
and to calculate, through the trained neural network model, the driver-passenger pair values of all drivers and all of their valid orders;
the matching module is configured to:
taking the driver-passenger pair values of all drivers and their valid orders as the input of the KM algorithm, performing driver-passenger matching on all drivers with the KM algorithm, determining the matching that maximizes the total value of all driver-passenger pairs, and dispatching each order to the unique driver assigned to it in that matching.
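The dispatch step can be pictured as building a driver-by-order value matrix over the valid pairs and solving a maximum-weight bipartite assignment. The sketch below uses SciPy's Hungarian solver as a stand-in for the KM algorithm (both solve the same assignment problem); the 5 km range, the callable names and the large negative placeholder for invalid pairs are assumptions.

```python
# Build the value matrix from the trained network, mask out-of-range pairs, and assign.
import numpy as np
from scipy.optimize import linear_sum_assignment

def dispatch(drivers, orders, pair_value, pickup_distance, max_km=5.0):
    values = np.full((len(drivers), len(orders)), -1e9)
    for i, d in enumerate(drivers):
        for j, o in enumerate(orders):
            if pickup_distance(d, o) <= max_km:      # only orders within the predetermined range
                values[i, j] = pair_value(d, o)      # pair value from the trained network
    rows, cols = linear_sum_assignment(values, maximize=True)
    return [(drivers[i], orders[j]) for i, j in zip(rows, cols) if values[i, j] > -1e9]
```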
The electronic device provided by the embodiment of the present invention is described below.
Referring to fig. 6, fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. The embodiment of the present invention further provides an electronic device, which includes a processor 31, a communication interface 32, a memory 33 and a communication bus 34, wherein the processor 31, the communication interface 32 and the memory 33 communicate with one another through the communication bus 34;
a memory 33 for storing a computer program;
the processor 31 is configured to implement the steps of the order dispatching method when executing the program stored in the memory 33; in a possible implementation manner of the present invention, the following steps may be implemented:
acquiring real-time driver and passenger information of each vehicle; wherein the driver and passenger information includes: driver information and passenger order information; the driver information includes: order-receiving distance information, vehicle position information and driver service level information; the passenger order information includes: order starting position information, order end point position information and order price information;
inputting the real-time driver and passenger information into the trained neural network model, and outputting each driver-passenger pair value corresponding to the driver and passenger information; wherein the trained neural network model is trained based on a sample set, the sample set including: historical driver and passenger information;
and according to each driver-passenger pair value, performing driver-passenger matching on all drivers and passengers by adopting the optimal matching KM algorithm of a bipartite graph to obtain the driver with the highest matching degree with the passenger order, and dispatching the order to the vehicle where that driver is located.
The communication bus mentioned in the electronic device may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic equipment and other equipment.
The Memory may include a RAM (Random Access Memory) or an NVM (Non-Volatile Memory), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The method provided by the embodiment of the invention can be applied to electronic equipment. Specifically, the electronic device may be: desktop computers, laptop computers, intelligent mobile terminals, servers, and the like. Without limitation, any electronic device that can implement the embodiments of the present invention is within the scope of the present invention.
The embodiment of the invention provides a computer-readable storage medium, wherein a computer program is stored in the storage medium, and when being executed by a processor, the computer program realizes the steps of the order dispatching method.
Embodiments of the present invention provide a computer program product comprising instructions which, when run on a computer, cause the computer to perform the steps of the order distribution method described above.
Embodiments of the present invention provide a computer program, which when run on a computer, causes the computer to perform the steps of the order distribution method described above.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus/server/electronic device/storage medium/computer program product/computer program embodiment comprising instructions, the description is relatively simple as it is substantially similar to the method embodiment, and reference may be made to some descriptions of the method embodiment for relevant points.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.
Claims (10)
1. An order dispatching method, comprising:
acquiring real-time driver and passenger information of each vehicle; wherein the driver and passenger information includes: driver information and passenger order information; the driver information includes: order-receiving distance information, vehicle position information and driver service level information; the passenger order information includes: order starting position information, order end point position information and order price information;
inputting the real-time driver and passenger information into a trained neural network model, and outputting each driver-passenger pair value corresponding to the driver and passenger information; wherein the trained neural network model is trained based on a sample set, the sample set including: historical driver and passenger information;
and according to each driver-passenger pair value, performing driver-passenger matching on all drivers and passengers by adopting the optimal matching KM algorithm of a bipartite graph to obtain the driver with the highest matching degree with the passenger order, and dispatching the order to the vehicle where that driver is located.
2. The method of claim 1, wherein the trained neural network model is trained using the steps of:
acquiring historical driver and passenger information of each vehicle in a service area;
inputting the historical driver and passenger information into a neural network to be trained, and outputting each driver-passenger pair value estimate corresponding to the historical driver and passenger information;
constructing a loss function with minimizing the difference between the driver-passenger pair value estimate and the driver-passenger pair value target value as the training objective, wherein the driver-passenger pair value target value is calculated from a pick-up distance value, an order price value, a vehicle demand at the order end point and a driver service level value;
judging whether the loss function is lower than a preset threshold value or not;
if the loss function is not lower than the preset threshold value, adjusting parameters of the neural network to be trained to obtain an adjusted neural network;
and updating the neural network to be trained with the adjusted neural network, and returning to the step of inputting the historical driver and passenger information into the neural network to be trained and outputting each driver-passenger pair value estimate, until the loss function is lower than the preset threshold value, so as to obtain the trained neural network model;
wherein the driver-passenger pair value target value is determined by the following formula:

y = R_p + λ1·R_d + λ2·R_h + λ3·R_v

where y is the driver-passenger pair value target value; R_p is the reward value corresponding to the order price; R_d is the reward value corresponding to the pick-up distance and is taken as the inverse of the pick-up distance; R_h is the reward value corresponding to the vehicle demand at the order end point and is proportional to that demand; R_v is the reward value corresponding to the driver service level; λ1 is the weight of the pick-up distance reward value, λ2 is the weight of the order end-point vehicle demand reward value, and λ3 is the weight of the driver service level reward value.
3. The method of claim 1, wherein after the historical driver and passenger information is input into the neural network to be trained and each driver-passenger pair value estimate corresponding to the historical driver and passenger information is output, the method further comprises:
taking the difference between the driver-passenger pair value estimate and the driver-passenger pair value target value, the historical driver and passenger information and each driver-passenger pair value estimate as experience samples and storing the experience samples into a sample pool;
extracting a batch of experience samples from the sample pool;
and after the step of adjusting the parameters of the neural network to be trained to obtain the adjusted neural network when the loss function is not lower than the preset threshold value, the method further comprises:
updating the neural network to be trained with the adjusted neural network, updating the historical driver and passenger information with the batch of experience samples, and returning to the step of inputting the historical driver and passenger information into the neural network to be trained and outputting each driver-passenger pair value estimate, until the loss function is lower than the preset threshold value, so as to obtain the trained neural network model.
4. The method of claim 3, wherein said extracting a batch of empirical samples from said sample pool comprises:
calculating a weight probability density function over the experience samples according to the differences between the driver-passenger pair value estimates and the driver-passenger pair value target values;
and sampling the experience samples in the sample pool in batches according to the weight probability density function to obtain the batch of samples.
5. The method of any one of claims 1 to 4, wherein inputting the real-time driver and passenger information into the trained neural network model and outputting each driver-passenger pair value comprises:
inputting the driver and passenger information into the trained neural network model, which outputs, through its output layer neuron, each driver-passenger pair value corresponding to the driver and passenger information according to the order-receiving distance information, the order price information, the order end point position information and the driver service level information in the driver and passenger information; the output layer of the neural network model has a single neuron, and the number of neurons in its input layer is determined by the dimension of the driver and passenger information.
6. The method according to any one of claims 1 to 4, wherein inputting the real-time driver and passenger information into the trained neural network model and outputting each driver-passenger pair value corresponding to the driver and passenger information comprises:
searching, among all drivers and based on the real-time driver and passenger information, for all orders within a predetermined range of each driver, and taking these orders as valid orders;
calculating, through the trained neural network model, the driver-passenger pair values of all drivers and all of their valid orders;
and the step of performing driver-passenger matching on all drivers and passengers by adopting the optimal matching KM algorithm of a bipartite graph according to each driver-passenger pair value to obtain the driver with the highest matching degree with the passenger order, so as to dispatch the order to the vehicle where that driver is located, comprises:
taking the driver-passenger pair values of all drivers and their valid orders as the input of the KM algorithm, performing driver-passenger matching on all drivers with the KM algorithm, determining the matching that maximizes the total value of all driver-passenger pairs, and dispatching each order to the unique driver assigned to it in that matching.
7. An order dispatch device, comprising:
the acquisition module is used for acquiring real-time driver and passenger information of each vehicle; wherein the driver and passenger information includes: driver information and passenger order information; the driver information includes: order-receiving distance information, vehicle position information and driver service level information; the passenger order information includes: order starting position information, order end point position information and order price information;
the first processing module is used for inputting the real-time driver and passenger information into a trained neural network model and outputting each driver-passenger pair value corresponding to the driver and passenger information; wherein the trained neural network model is trained based on a sample set, the sample set including: historical driver and passenger information;
and the matching module is used for performing driver-passenger matching on all drivers and passengers by adopting the optimal matching KM algorithm of a bipartite graph according to each driver-passenger pair value, so as to obtain the driver with the highest matching degree with the passenger order and dispatch the order to the vehicle where that driver is located.
8. The apparatus of claim 7, wherein the apparatus further comprises a second processing module, configured to obtain the trained neural network model through the following training steps:
acquiring historical driver and passenger information of each vehicle in a service area;
inputting the historical driver and passenger information into a neural network to be trained, and outputting each driver-passenger pair value estimate corresponding to the historical driver and passenger information;
constructing a loss function with minimizing the difference between the driver-passenger pair value estimate and the driver-passenger pair value target value as the training objective, wherein the driver-passenger pair value target value is calculated from a pick-up distance value, an order price value, a vehicle demand at the order end point and a driver service level value;
judging whether the loss function is lower than a preset threshold value or not;
if the loss function is not lower than the preset threshold value, adjusting parameters of the neural network to be trained to obtain an adjusted neural network;
and updating the neural network to be trained with the adjusted neural network, and returning to the step of inputting the historical driver and passenger information into the neural network to be trained and outputting each driver-passenger pair value estimate, until the loss function is lower than the preset threshold value, so as to obtain the trained neural network model;
wherein the driver-passenger pair value target value is determined by the following formula:

y = R_p + λ1·R_d + λ2·R_h + λ3·R_v

where y is the driver-passenger pair value target value; R_p is the reward value corresponding to the order price; R_d is the reward value corresponding to the pick-up distance and is taken as the inverse of the pick-up distance; R_h is the reward value corresponding to the vehicle demand at the order end point and is proportional to that demand; R_v is the reward value corresponding to the driver service level; λ1 is the weight of the pick-up distance reward value, λ2 is the weight of the order end-point vehicle demand reward value, and λ3 is the weight of the driver service level reward value.
9. The apparatus of claim 7, wherein the apparatus further comprises:
the storage module is used for, after the historical driver and passenger information is input into the neural network to be trained and each driver-passenger pair value estimate corresponding to the historical driver and passenger information is output, taking the difference between the driver-passenger pair value estimate and the driver-passenger pair value target value, the historical driver and passenger information and each driver-passenger pair value estimate as experience samples and storing the experience samples into a sample pool;
the sampling module is used for extracting a batch of experience samples from the sample pool;
the device further comprises:
and a third processing module, configured to, after the parameters of the neural network to be trained are adjusted to obtain the adjusted neural network when the loss function is not lower than the preset threshold value, update the neural network to be trained with the adjusted neural network, update the historical driver and passenger information with the batch of experience samples, and return to the step of inputting the historical driver and passenger information into the neural network to be trained and outputting each driver-passenger pair value estimate, until the loss function is lower than the preset threshold value, so as to obtain the trained neural network model.
10. An electronic device, comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other via the communication bus;
the memory is used for storing a computer program;
the processor, when executing the program stored in the memory, implementing the method of any of claims 1-6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011643182.6A CN112700049A (en) | 2020-12-30 | 2020-12-30 | Order distribution method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112700049A true CN112700049A (en) | 2021-04-23 |
Family
ID=75514191
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011643182.6A Pending CN112700049A (en) | 2020-12-30 | 2020-12-30 | Order distribution method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112700049A (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106971279A (en) * | 2017-05-03 | 2017-07-21 | 百度在线网络技术(北京)有限公司 | Estimate method, device, equipment and the storage medium of driver's order behavior |
CN108182524A (en) * | 2017-12-26 | 2018-06-19 | 北京三快在线科技有限公司 | A kind of order allocation method and device, electronic equipment |
WO2019232693A1 (en) * | 2018-06-05 | 2019-12-12 | Beijing Didi Infinity Technology And Development Co., Ltd. | System and method for ride order dispatching |
CN110580575A (en) * | 2019-08-27 | 2019-12-17 | 南京领行科技股份有限公司 | Order distribution method and device |
CN111882109A (en) * | 2020-06-22 | 2020-11-03 | 北京嘀嘀无限科技发展有限公司 | Order allocation method and system |
Non-Patent Citations (1)
Title |
---|
田佳琪 (Tian Jiaqi): "Research on supply and demand optimization of online ride-hailing taxis based on pair sets" (基于对集的网约出租车供需优化研究), 《电子测试》 (Electronic Test) *
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113222448A (en) * | 2021-05-27 | 2021-08-06 | 云度新能源汽车有限公司 | Online appointment vehicle recommendation method and system based on driving behaviors |
CN113240339A (en) * | 2021-06-09 | 2021-08-10 | 北京航空航天大学 | Task matching fairness method for large-scale taxi taking platform |
CN113240339B (en) * | 2021-06-09 | 2022-08-30 | 北京航空航天大学 | Task matching fairness method for large-scale taxi taking platform |
CN113793242A (en) * | 2021-09-13 | 2021-12-14 | 首约科技(北京)有限公司 | Airport different-ground network car booking management method |
CN114331026A (en) * | 2021-12-06 | 2022-04-12 | 中国科学院深圳先进技术研究院 | Car pooling order distribution method, system, terminal and storage medium |
CN114548682A (en) * | 2022-01-19 | 2022-05-27 | 浙江吉利控股集团有限公司 | Order distribution method, order distribution device and computer readable storage medium |
CN116828000A (en) * | 2023-08-28 | 2023-09-29 | 山东未来互联科技有限公司 | Bus order processing system and method based on deterministic network and SDN network |
CN116828000B (en) * | 2023-08-28 | 2023-11-17 | 山东未来互联科技有限公司 | Bus order processing system and method based on deterministic network and SDN network |
CN117610694A (en) * | 2024-01-23 | 2024-02-27 | 北京白龙马云行科技有限公司 | Multi-tenant order distribution method, device, equipment and storage medium |
CN117610694B (en) * | 2024-01-23 | 2024-04-09 | 北京白龙马云行科技有限公司 | Multi-tenant order distribution method, device, equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112700049A (en) | Order distribution method and device | |
US20210232984A1 (en) | Order allocation system and method | |
US20200265502A1 (en) | Methods and systems for order processing | |
CN110400015B (en) | Time estimation method, device and equipment thereof | |
GB2547395A (en) | User maintenance system and method | |
WO2021121354A1 (en) | Model-based deep reinforcement learning for dynamic pricing in online ride-hailing platform | |
CN111353092B (en) | Service pushing method, device, server and readable storage medium | |
CN109308538B (en) | Method and device for predicting transaction conversion rate | |
WO2021243568A1 (en) | Multi-objective distributional reinforcement learning for large-scale order dispatching | |
CN111104585B (en) | Question recommending method and device | |
CN108564391A (en) | A kind of shared electric vehicle needing forecasting method and system considering subjective and objective information | |
CN110910180A (en) | Information pushing method and device, electronic equipment and storage medium | |
CN109685411A (en) | A kind of share-car matching process of logistics shipping platform | |
US11507896B2 (en) | Method and system for spatial-temporal carpool dual-pricing in ridesharing | |
CN110458664B (en) | User travel information prediction method, device, equipment and storage medium | |
CN111259119B (en) | Question recommending method and device | |
JP2019074988A (en) | Room charge setting device, room charge setting method, and program | |
CN114596111A (en) | Risk identification model generation method, device, equipment and storage medium | |
CN116910373B (en) | House source recommendation method and device, electronic equipment and storage medium | |
CN111259229B (en) | Question recommending method and device | |
CN112949854A (en) | Multi-channel user car purchasing intention evaluation model training method, device and equipment | |
CN110852833A (en) | Taxi booking order processing method and device | |
CN113822455B (en) | Time prediction method, device, server and storage medium | |
CN111489171B (en) | Riding travel matching method and device based on two-dimensional code, electronic equipment and medium | |
CN109191159A (en) | Data orientation method, device, computer equipment and computer readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20210423 |