Mobility trajectory generation: a survey

Xiangjie Kong¹,
Qiao Chen²,
Mingliang Hou²,
Hui Wang² &
…
Feng Xia³


Abstract

Mobility trajectory data is of great significance for mobility pattern study, urban computing, and city science. Self-driving, traffic prediction, environment estimation, and many other applications require large-scale mobility trajectory datasets. However, mobility trajectory data acquisition is challenging due to privacy concerns, commercial considerations, missing values, and expensive deployment costs. Nowadays, mobility trajectory data generation has become an emerging trend in reducing the difficulty of mobility trajectory data acquisition by generating principled data. Despite the popularity of mobility trajectory data generation, literature surveys on this topic are rare. In this paper, we present a survey for mobility trajectory generation by artificial intelligence from knowledge-driven and data-driven views. Specifically, we will give a taxonomy of the literature of mobility trajectory data generation, examine mainstream theories and techniques as well as application scenarios for generating mobility trajectory data, and discuss some critical challenges facing this area.


1 Introduction

The mobility trajectory dataset includes a wide range of information generated by diverse moving objects, consisting of a sequence of ordered points (Kong et al. 2018a). This data holds significant importance as it provides valuable insights into movement patterns and behaviors. In the field of urban computing, trajectory data enables the development of intelligent transportation systems, optimization of traffic flow, and prediction of congestion (Yan et al. 2014; Wang et al. 2019; Kong et al. 2022). In the area of city science, mobility trajectory data aids researchers in understanding urban dynamics, identifying activity hotspots, and improving resource allocation and public services (Halim et al. 2022; Bao et al. 2020; Han et al. 2020; Zhao et al. 2021a). Additionally, in the context of self-driving cars and intelligent transportation systems, mobility trajectory data is essential. It assists in training algorithms, allowing autonomous vehicles to navigate complex urban environments and make informed decisions (Kong et al. 2017; Waqas et al. 2020; Benko Loknar et al. 2023).

Currently, large-scale mobility trajectory data has been extensively utilized in practical applications. For instance, the study conducted by Hu et al. (2023) demonstrates the utilization of historical trajectory datasets and road networks for traffic predictions, thereby mitigating potential threats stemming from abrupt surges in traffic volume and ensuring the safety of public transportation. The analysis of mobile phone data conducted by Fan et al. (2021) and Li and Mostafavi (2022) improves the general public’s capacity to respond effectively to natural disasters. Furthermore, Wang et al. (2017) investigate taxi trajectory recognition to discern trip purposes and offer insights for smart city planning.

Although a large amount of mobility trajectory data is collected through sensors and various applications, there are challenges in direct utilization of this data in practice due to privacy concerns, commercial considerations, missing values, and expensive deployment costs. Firstly, there are privacy issues associated with mobility trajectory data, as it involves sensitive information about individuals’ activities and behaviors (Kong et al. 2018; Gursoy et al. 2019; Romero-Tris and Megías 2018). Secondly, there are commercial considerations as mobility trajectory data holds commercial value, but data sharing can be challenging due to conflicts of interest (Pan et al. 2019; Wang et al. 2020). Thirdly, data may contain missing values. In real-world mobility trajectory datasets, it is common to encounter corrupted or missing values due to sensor failures, communication loss, and data transmission issues (Ren et al. 2021; Hou et al. 2023). Finally, obtaining high-quality mobility trajectory data can be costly in terms of deployment. Setting up and maintaining sensors, data collection infrastructure, and computational resources require substantial investment (Halim et al. 2016; Zhang et al. 2020b; Kanaya et al. 2012). These factors mentioned above limit the accessibility and availability of mobility trajectory data. Therefore, trajectory data generation addresses the challenges of privacy protection, commercial considerations, missing values, and high investment costs faced in data collection. It helps professionals such as traffic managers, urban planners, and decision-makers optimize traffic systems, predict congestion, evaluate urban policies, and improve resource allocation.

The research topic of mobility trajectory data generation has attracted sustained attention in recent years, and many impressive models and methods have been proposed. Some of these recast the generation problem as predicting the origin–destination matrix with spatial interaction theory (Roy and Thill 2003; Yan et al. 2017; Yan and Zhou 2019). These works model mobility patterns based on gravity theory (Odlyzko 2015), the Weber–Fechner law (Slovic et al. 1977), intervening opportunities (Stouffer 1940), and game theory (Su et al. 2007) to estimate coarse-grained mobility preferences between two urban regions. The generation or simulation process is then carried out by microscopic traffic simulation engines such as VISSIM (Fellendorf and Vortisch 2010) and SUMO (Simulation of Urban Mobility) (Brockfeld et al. 2001). With the development of artificial intelligence, related technologies have been adopted in many fields. Among the available mobility generation methods, deep neural networks stand out (Liu et al. 2020; Park et al. 2018; Zang et al. 2021; Zhang et al. 2020b; Bao et al. 2022). The idea behind these works is to learn the nonlinear spatio-temporal correlations preserved in traffic datasets by leveraging the strong approximation ability of deep neural networks.

The increasing popularity of mobility trajectory data generation has led to numerous publications in interdisciplinary fields. For example, in transportation and operational research, traffic patterns are simulated or modeled using knowledge or theories of human mobility. However, most existing research generates data by estimating the possible distributions from an existing dataset and is therefore incapable of generating trajectories across different types of moving objects. For example, the patterns learned from taxi trajectories cannot be directly applied to generating trajectories of private cars. Therefore, theory-guided or knowledge-based models also play an important role in mobility trajectory data generation.

In this paper, we attempt to address this issue by presenting a comprehensive survey of mobility trajectory data generation. The main audience of this survey is practitioners interested in studying mobility trajectory data generation from different research perspectives. We will first outline the problem of mobility trajectory data generation and introduce some related fundamentals. Then, the framework of this survey is given, and the categorization is discussed. Afterward, based on our categorization, we will elaborate on 55 mobility trajectory data generation papers. These papers mainly cover work in the field of transportation, but we also cover several publications from the data science and deep learning fields. Finally, we will discuss the current and future challenges of mobility trajectory data generation. The insights readers can extract from this survey are:

Comprehensive definitions of mobility trajectory data generation in different application scenarios.
The strengths and weaknesses of different categories of methods and models in mobility trajectory data generation.
Commonly used open datasets in mobility trajectory data generation and the associated open code.
Future challenges facing mobility trajectory data generation and possible opportunities to deal with these challenges.

Comparison to other survey papers. There are some previously published works focusing on the topic of mobility trajectory data generation. One of the early surveys on this topic is Harri et al. (2009). This work mainly presents a framework to introduce vehicle mobility models, which can be used to generate realistic vehicular motion patterns based on Vehicular Ad Hoc Networks (VANETs). This survey mainly focuses on introducing the knowledge-based models, resulting in neglecting the deep learning-related work. Our survey aims to provide a more comprehensive view in reviewing the work in mobility trajectory data generation.

Recently, Shin et al. (2020) provided a survey of mobility trace generation. This survey focuses on synthesizing user mobility traces with Generative Adversarial Networks (GANs) and categorizes the reviewed papers according to different types of GAN. However, it concentrates on GAN techniques without much focus on the domain knowledge used in trace generation. Furthermore, it offers limited insights into future challenges, which may not be sufficient to inspire readers who are dedicated to the generation of mobility trajectory data. Our work provides several in-depth discussions of the challenges and future directions in Sect. 8.

The work of Gao et al. (2020) provides another survey of spatio-temporal data mining. This survey presents a detailed categorization based on different application scenarios of GAN in spatio-temporal modeling. However, this survey focuses on spatio-temporal data mining without consideration of the mobility trajectory generation work. Our work provides a deep and comprehensive survey of mobility trajectory data generation.

To the best of our knowledge, this is the first survey to organize and introduce mobility trajectory data generation from the perspectives of two paradigms: knowledge-driven and data-driven. In this survey, we first provide a deep insight into these two paradigms and introduce the categorization and framework of our survey. Then, we give a detailed definition of mobility trajectory generation according to different scenarios. Moreover, we elaborate on the fundamentals (theories and techniques) commonly used in knowledge-driven and data-driven methods. We review each specific work based on the scenarios we presented and the fundamentals we discussed. Finally, we provide future challenges and possible trends in mobility trajectory data generation.

The rest of this paper is organized as follows. In Sect. 2, we introduce the detailed methodology that explains how we conducted the literature survey and identified the articles to be included in the study. In Sect. 3, we discuss the taxonomy of this survey. In Sect. 4, fundamentals and comprehensive definitions of mobility trajectory data generation are given. The core of our work is Sect. 5, which we split into two subsections: Sect. 5.1 discusses the knowledge-driven methods, while Sect. 5.2 elaborates on the data-driven methods. In Sect. 6, we introduce the evaluation metrics commonly used in mobility trajectory data generation. Then, in Sect. 7, we summarize the existing resources for mobility trajectory data generation, including datasets, simulation tools, and related open code. Section 8 describes the challenges and future opportunities in mobility trajectory data generation research. Finally, we summarize our work in Sect. 9.

2 Methodology

In the initial stage of the study, in accordance with the recommendations by Wohlin (2014), we utilized Google Scholar to conduct a literature search by employing diverse keywords, thereby mitigating potential publisher bias. The search was carried out in March 2020 without specifying a particular time frame. Duplicate papers and non-English articles were excluded, while all relevant journal articles, conference papers, and book sections pertaining to mobile trajectory data were included. Subsequently, a snowballing approach was employed on the identified papers. Firstly, the reference lists of each paper were scrutinized to identify potentially relevant new publications pertaining to the research topic. Subsequently, papers were selected or excluded based on the aforementioned criteria, and the process was concluded when no further relevant papers were discovered. Overall, 55 papers were utilized in this study.

3 Taxonomy

From the model’s perspective, we categorize the mobility trajectory generation works into knowledge-driven and data-driven. From application scenarios, we divide mobility trajectory generation into three scenarios. Figure 1 shows the categories of mobility trajectory generation. We will make a detailed discussion about our categorization.

In the early stage, hypotheses or theories are proposed by researchers. Then, various collected datasets are used to confirm or refute these hypotheses or theories, e.g., the gravity model in traffic flow estimation. However, it must be acknowledged that data mining and deep learning techniques have become the mainstream paradigm in current mobility trajectory generation research. Some researchers even propose that the rise of data science is the end of theory (Karpatne et al. 2017). The underlying idea is to leverage abundant data to construct models by optimizing a loss function, without relying on scientific theories.

Nevertheless, black-box deep learning methods have many limitations in applications. Firstly, deep learning methods rely largely on high-quality training samples. However, it is not easy to collect representative labeled data involving many complex physical variables, so generalization has become a major problem that plagues deep learning methods. The second limitation is the interpretability of deep learning methods. Although an ‘end-to-end’ or a ‘task-specific’ method may achieve impressive performance on real-world datasets or application tasks, the process of knowledge discovery in the scientific domain does not end there. Interpretable models or methods are based on explainable theories, which helps prevent the acquisition of erroneous patterns from noisy data. This ensures the model’s capacity for generalization.

Methods of mobility trajectory generation can be categorized into two classes from the macroscopic view. Some works designed their models based on theories or hypotheses, while others learned the mobility patterns from a large number of datasets. In this survey, we aim to introduce the mobility trajectory generation methods from these two paradigms. We hope that readers can get more in-depth insights or inspirations from the advantages and disadvantages of these two classes of methods we reviewed.

We divide the reviewed literature into two categories: knowledge-driven and data-driven. Moreover, we classify the data-driven methods, based on the specific techniques they use, into Recurrent Neural Network (RNN)-based approaches and GAN-based approaches.

Table 1 The description of notations


4 Definitions and fundamentals

In this section, we first give definitions of mobility trajectory and mobility trajectory generation as shown in Table 1. We introduce three common application scenarios of mobility trajectory generation. Then, we give a detailed discussion about the fundamentals used in existing mobility trajectory generation work.

4.1 Definitions

Mobility trajectory: A mobility trajectory is defined as a set of sequential spatio-temporal moving records ${\mathcal {S}}=\{x_{1}, x_{2},\ldots , x_{n} \}\in {\mathbb {R}}^{N\times 2}$, where the $i$th element is a record defined as a tuple $(l_{i}, t_{i})$. $l_{i}$ denotes the spatial information, which can be GPS coordinates (longitude, latitude) or a region ID. $t_{i}$ represents the temporal information, such as the timestamp of the $i$th record.

Figure 2 shows an example of the mobility trajectories of two objects. The top mobility trajectory is recorded as GPS locations, which is the most common form of mobility trajectory data. The bottom mobility trajectory is obtained by transforming the GPS coordinates into another representation, such as region IDs, to help model the latent semantic information in trajectories.
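As a minimal illustrative sketch (not taken from the surveyed works), the two representations can be written as follows. The `Record` type, the uniform grid, and the parameters `lon_min`, `lat_min`, `cell_deg`, and `n_cols` are assumptions made purely for illustration; real systems may use road segments, administrative zones, or other partitions as region IDs.

```python
import math
from typing import List, NamedTuple, Tuple


class Record(NamedTuple):
    """One spatio-temporal record x_i = (l_i, t_i) with GPS coordinates."""
    lon: float
    lat: float
    t: float  # timestamp, e.g. seconds since the start of the trip


def to_region_id(rec: Record, lon_min: float, lat_min: float,
                 cell_deg: float, n_cols: int) -> int:
    """Map a GPS record to the ID of the grid cell that contains it."""
    col = math.floor((rec.lon - lon_min) / cell_deg)
    row = math.floor((rec.lat - lat_min) / cell_deg)
    return row * n_cols + col


def discretize(traj: List[Record], lon_min: float, lat_min: float,
               cell_deg: float, n_cols: int) -> List[Tuple[int, float]]:
    """Turn a GPS trajectory into a (region ID, timestamp) trajectory."""
    return [(to_region_id(r, lon_min, lat_min, cell_deg, n_cols), r.t)
            for r in traj]


# A short GPS trajectory (hypothetical coordinates).
gps_traj = [Record(116.34, 39.92, 0.0),
            Record(116.36, 39.94, 60.0),
            Record(116.47, 39.96, 120.0)]

# Hypothetical 10 x 10 grid with 0.1-degree cells anchored at (116.0, 39.0).
region_traj = discretize(gps_traj, lon_min=116.0, lat_min=39.0,
                         cell_deg=0.1, n_cols=10)
print(region_traj)
```

Both forms keep the timestamps; only the spatial component $l_{i}$ changes from coordinates to a discrete ID.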

Domain knowledge: Domain knowledge is a set ${\mathcal {K}}$ that contains the information related to the trajectories or mobility patterns.

In this paper, we will mainly introduce four types of domain knowledge information that are commonly used in existing mobility trajectory generation work.

Report information: Every year, governments publish various reports on transportation, urbanization, and mobility analysis. The information contained in these reports reflects mobility or transportation conditions from a macroscopic view and assists in generating trajectories. For example, Kong et al. (2018) generated trajectories of social cars by estimating the parameters from the 2015 Beijing Transport Annual Report.^{Footnote 1}
Demographic information: The size of the population directly affects the formulation and improvement of policies for employment, elderly care, medical care, and social security. It also affects the distribution of education and medical institutions in the area where citizens are located, the construction of service facilities, the distribution of commercial service outlets, the supply of urban housing, and the construction of urban roads. Demographic information is related to travel demand and shapes the mobility patterns in a city. Researchers use it to compute the demand and then provide a scheme to solve urban problems such as traffic congestion (Kong et al. 2018).
Spatial information: Spatial information can be categorized into two classes. The first class is road network information. The road network is composed of points, lines, and planes. It shows the basic spatial structure of a city and contains a large amount of information, since different road networks represent different roads, hierarchies, and path structures. The second class is the Point of Interest (POI). POIs contain text descriptions of spatial entities and can be utilized to extract the latent semantic information preserved in trajectories. A mobility trajectory can be transformed into mobility activities between POIs, and the mobility patterns can be extracted by learning the relationships among POIs (Yao et al. 2018). Common sources of spatial information are Google Maps,^{Footnote 2} AMAP,^{Footnote 3} and Open Street Map (OSM).^{Footnote 4}
Demand information: Demand information can be seen as hybrid, fine-grained information affected by various factors such as demographic, economic, and environmental information. To simplify the discussion and help readers build a clear understanding, we list demand information as one of the types of information to be reviewed in the following discussions. Demand information determines the flow from origin to destination. It is structured as an Origin–Destination (OD) matrix, which can be converted into the individual trips of vehicles. Thus, the OD matrix describes each vehicle’s departure and arrival place in a specific region during the simulation.

It should be noted that domain-specific knowledge is varied, and within this survey, we have selected four frequently employed sources of information in works on generating mobility trajectories for inclusion.
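To make the role of demand information concrete, the following sketch expands a small OD matrix into individual trips. It is an illustrative assumption rather than the procedure of any particular tool: departure times are drawn uniformly over the simulation horizon, and the function name `od_to_trips` and the zone indices are hypothetical.

```python
import random
from typing import List, Tuple


def od_to_trips(od: List[List[int]], horizon_s: float,
                seed: int = 0) -> List[Tuple[float, int, int]]:
    """Expand an OD matrix into (departure time, origin, destination) trips.

    od[o][d] is the number of vehicles travelling from zone o to zone d.
    Departure times are sampled uniformly in [0, horizon_s] (an assumption).
    """
    rng = random.Random(seed)
    trips = []
    for origin, row in enumerate(od):
        for destination, count in enumerate(row):
            for _ in range(count):
                trips.append((rng.uniform(0.0, horizon_s), origin, destination))
    trips.sort()  # order trips by departure time
    return trips


# Hypothetical demand between three zones during one hour.
od = [[0, 3, 1],
      [2, 0, 0],
      [1, 4, 0]]
trips = od_to_trips(od, horizon_s=3600.0)
for depart, origin, destination in trips[:3]:
    print(f"t={depart:7.1f}s  zone {origin} -> zone {destination}")
```

Each generated trip still needs a route on the road network before it becomes a trajectory, which is the task of the simulation tools discussed in Sect. 4.2.3.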

Mobility trajectory generation: Given a predefined information set ${\mathcal {M}}\subseteq {\mathcal {S}}\cup {\mathcal {K}}$, mobility trajectory generation aims to learn a model or function ${\mathcal {F}}: {\mathcal {M}} \rightarrow \hat{{\mathcal {S}}}=\{{\hat{x}}_{1},{\hat{x}}_{2}, \ldots, {\hat{x}}_{n}\}\in {\mathbb {R}}^{N \times 2}$. The information set ${\mathcal {M}}$ is drawn from two components: ${\mathcal {S}}$, which is a set of sequential spatio-temporal movement records, and ${\mathcal {K}}$, which is a set of domain knowledge containing various types of information. The set $\hat{{\mathcal {S}}}$ represents the collection of trajectories generated using the model or function.
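As a toy instance of such a function ${\mathcal {F}}$, the sketch below fits a first-order Markov chain on observed region-ID trajectories and samples a new trajectory from it. This is only a simple stand-in chosen for illustration; it is not a method proposed in the surveyed literature, it uses ${\mathcal {S}}$ only (no domain knowledge ${\mathcal {K}}$), and it ignores timestamps.

```python
import random
from collections import defaultdict
from typing import Dict, List


def fit_transitions(trajectories: List[List[int]]) -> Dict[int, Dict[int, int]]:
    """Count region-to-region transitions observed in the real trajectories."""
    counts: Dict[int, Dict[int, int]] = defaultdict(lambda: defaultdict(int))
    for traj in trajectories:
        for current, nxt in zip(traj, traj[1:]):
            counts[current][nxt] += 1
    return counts


def generate(counts: Dict[int, Dict[int, int]], start: int,
             length: int, seed: int = 0) -> List[int]:
    """Sample a synthetic trajectory by following the transition counts."""
    rng = random.Random(seed)
    traj = [start]
    # Stop early if the current region was never observed with a successor.
    while len(traj) < length and traj[-1] in counts:
        successors = counts[traj[-1]]
        regions = list(successors)
        weights = [successors[r] for r in regions]
        traj.append(rng.choices(regions, weights=weights, k=1)[0])
    return traj


# Hypothetical observed trajectories over region IDs 1..4.
real = [[1, 2, 3, 4], [1, 2, 4, 3], [2, 3, 4, 1]]
model = fit_transitions(real)
synthetic = generate(model, start=1, length=6)
print(synthetic)
```

Every step of the synthetic trajectory is a transition that appears in the real data, so simple statistics such as transition frequencies are approximately preserved, while longer-range structure is not.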

The generated mobility trajectory data has statistical characteristics similar to those of real data and can be used for analysis and verification. The requirements for generating mobility trajectory data vary across scenarios. In the context of smart cities, generated trajectory data is used to assess traffic congestion and accidents, thereby improving urban transportation. Therefore, generated mobility trajectories mostly consider factors beyond location, such as weather, peak hours, and holidays (Fan et al. 2021). For autonomous driving, generated trajectory data is used for training to enhance the vehicle’s understanding of, and response to, the surrounding environment. Therefore, there is no need to generate long-term trajectories for autonomous driving; instead, the focus is on the interactions among different objects in the same space (Alahi et al. 2016). In terms of optimizing basic transportation infrastructure, generated trajectory data is used to evaluate the deployment of new infrastructure in cities and provide recommendations for urban planners and managers. Therefore, data is often generated for a specific area based on given historical conditions (Zhang et al. 2020b). In this paper, we divide mobility trajectory generation into three application scenarios.

Scenario 1: The first scenario of mobility trajectory generation is validation in VANETs and traffic simulation. For validation, some research (Codeca et al. 2015) used real information to build traffic scenarios for evaluating and comparing new communication protocols. For traffic simulation, the urban traffic state is estimated by generating trajectories (Dian Khumara et al. 2018).
Scenario 2: The second scenario of mobility trajectory generation is missing value imputation for urban data. A complete mobility trajectory dataset is hard to obtain due to privacy and security limitations, power outages, malfunctions, and transfer errors. To solve this problem, Xia et al. (2017) and Kong et al. (2018) introduce relevant domain knowledge to generate trajectories. Besides, this work can also be used to fill in missing data.
Scenario 3: The third scenario of mobility trajectory generation is autonomous driving. To enhance autonomous driving safety, researchers (Alahi et al. 2016; Gupta et al. 2018) have started to focus on making the algorithm understand the surrounding environment and the behavior of pedestrians and vehicles by generating possible trajectories.

4.2 Fundamentals

In this subsection, we first introduce the theories and tools used in the knowledge-driven methods, including spatial interaction models, traffic models, and two simulation tools. Then, we introduce the widely used techniques in data-driven methods, including Convolutional Neural Network (CNN), RNN, and GAN.

4.2.1 Spatial interaction models

Researchers have successively presented many models for predicting the flow of people, goods, and information between origins and destinations for more than 100 years. These models have different names in different disciplines and they are called travel distribution prediction models (Yan 2017) in transportation science. Prediction of flows can reduce the cost of spatial interaction while maintaining the diversity of choices in transportation.

The gravity model is successfully applied in mobility pattern analysis. There is a law similar to Newton’s law of universal gravitation in the flow distribution phenomenon between multiple places. In 2008, Jung et al. (2008) found that the traffic flow in the Seoul subway network in South Korea can be calculated using the following model:

$$T_{ij}=\alpha \frac{m_{i} m_{j}}{d_{ij}^{\beta }}, \tag{1}$$

where $T_{ij}$ is the passenger flow from station $i$ to station $j$, $m_{i}, m_{j}$ are the populations of stations $i$ and $j$, $d_{ij}$ is the distance between two stations $i$ and $j$, $\alpha$ and $\beta$ are two parameters.
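The gravity model of Eq. (1) can be evaluated directly. The sketch below is a minimal illustration; the populations, distances, and the values of $\alpha$ and $\beta$ are made-up numbers, not parameters fitted to any dataset.

```python
from typing import List


def gravity_flows(pop: List[float], dist: List[List[float]],
                  alpha: float, beta: float) -> List[List[float]]:
    """Compute T_ij = alpha * m_i * m_j / d_ij**beta for all pairs i != j."""
    n = len(pop)
    return [[0.0 if i == j
             else alpha * pop[i] * pop[j] / dist[i][j] ** beta
             for j in range(n)]
            for i in range(n)]


# Hypothetical populations of three stations and pairwise distances (km).
pop = [100.0, 200.0, 50.0]
dist = [[0.0, 2.0, 5.0],
        [2.0, 0.0, 4.0],
        [5.0, 4.0, 0.0]]

flows = gravity_flows(pop, dist, alpha=0.01, beta=2.0)
print(flows[0][1])  # 0.01 * 100 * 200 / 2**2
```

In practice $\alpha$ and $\beta$ are estimated from observed flows, for example by linear regression on the logarithm of Eq. (1).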

In addition to the railway network, this gravity law also exists in commuting travel (Viboud et al. 2006), population migration (Tobler 1995), and international trade (Fagiolo 2010). However, the gravity model parameters have different values in different regions and may also have different values for the same region in different periods; that is, its applicability is limited. Stouffer (1940) provided another spatial interaction model called the intervening opportunities (IO) model. This model does not use the actual distance but sorts the destinations from near to far. The decision-maker selects a destination with a certain probability according to the ranking. In actual application, the IO model does not need the actual distance as input; the population and the number of trips in each location are enough to complete the travel distribution forecast for the entire region. However, its theoretical basis is not easy to understand and it contains many parameters to be estimated, so it is rarely adopted in practical applications.
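The text above does not give a formula for the IO model. As a hedged sketch, the code below uses one commonly cited exponential form, in which the probability of stopping at the $j$th nearest destination is $e^{-L V_{j-1}} - e^{-L V_{j}}$, where $V_{j}$ is the cumulative number of opportunities up to rank $j$. This specific form, the parameter `L`, and the use of population as the opportunity count are assumptions for illustration; the result is renormalized over the finite set of destinations.

```python
import math
from typing import List, Tuple


def rank_by_distance(distances: List[float],
                     opportunities: List[float]) -> Tuple[List[int], List[float]]:
    """Sort destinations from near to far; only the ranking is used afterwards."""
    order = sorted(range(len(distances)), key=lambda j: distances[j])
    return order, [opportunities[j] for j in order]


def io_probabilities(opportunities_by_rank: List[float], L: float) -> List[float]:
    """Destination choice probabilities under an exponential IO form (assumed)."""
    probs = []
    cumulative = 0.0  # opportunities passed before reaching the current rank
    for opp in opportunities_by_rank:
        probs.append(math.exp(-L * cumulative) - math.exp(-L * (cumulative + opp)))
        cumulative += opp
    total = sum(probs)
    return [p / total for p in probs]


# Hypothetical destinations: distances from the origin and populations.
distances = [7.0, 2.0, 4.0]
populations = [100.0, 100.0, 100.0]

order, ranked = rank_by_distance(distances, populations)
probs = io_probabilities(ranked, L=0.01)
for j, p in zip(order, probs):
    print(f"destination {j}: probability {p:.3f}")
```

With equal opportunities at every destination, nearer destinations receive higher probabilities, even though the actual distances never enter the probability formula.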

4.2.2 Traffic models

Traffic models have a history of more than a hundred years. They are generally divided into the macro model at the strategic planning level and the micro model at the operational planning level. Establishing a traffic model is the basic method for traffic analysis. The four-step model (FSM) is currently the most commonly used macroscopic traffic model (McNally 2007).

The FSM is one of the first trip demand models that attempt to link the use and behavior of land for transportation planning. It includes the generation and distribution of trips, the choice of mode, and the assignment of traffic. Trip generation is determined by the population size, social economy, land use, travel frequency, and other factors. Trip distribution is used to predict the inter-regional trip flow related to the regional trip volume growth trend, trip resistance, and other factors. Due to the difference in time and other factors of various modes of transportation and the different preferences of travelers for different modes of transportation, the choice of trip mode is different. Traffic assignment will load OD traffic to each intersection section through route selection.
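The four steps can be chained in a few lines. The sketch below is a deliberately simplified illustration and makes several assumptions that are not part of the text above: a constant trip rate per person, a singly constrained gravity form for distribution, a fixed car share for mode choice, and all-or-nothing assignment on shortest paths in a tiny connected network.

```python
import heapq
from typing import Dict, List, Tuple

Graph = Dict[int, Dict[int, float]]


def trip_generation(population: List[float], trips_per_person: float) -> List[float]:
    """Step 1: trips produced by each zone (constant rate assumed)."""
    return [p * trips_per_person for p in population]


def trip_distribution(productions: List[float], attraction: List[float],
                      cost: List[List[float]], beta: float) -> List[List[float]]:
    """Step 2: split each zone's trips over destinations by attraction / cost**beta."""
    n = len(productions)
    od = [[0.0] * n for _ in range(n)]
    for i in range(n):
        weights = [0.0 if i == j else attraction[j] / cost[i][j] ** beta
                   for j in range(n)]
        total = sum(weights)
        for j in range(n):
            od[i][j] = productions[i] * weights[j] / total
    return od


def mode_choice(od: List[List[float]], car_share: float) -> List[List[float]]:
    """Step 3: keep the share of trips made by car (fixed share assumed)."""
    return [[flow * car_share for flow in row] for row in od]


def shortest_path(graph: Graph, src: int, dst: int) -> List[int]:
    """Dijkstra's algorithm; assumes dst is reachable from src."""
    best = {src: 0.0}
    prev: Dict[int, int] = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > best.get(u, float("inf")):
            continue
        for v, w in graph[u].items():
            if d + w < best.get(v, float("inf")):
                best[v] = d + w
                prev[v] = u
                heapq.heappush(heap, (d + w, v))
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def assign(car_od: List[List[float]], graph: Graph) -> Dict[Tuple[int, int], float]:
    """Step 4: all-or-nothing assignment of OD flows onto shortest paths."""
    link_flow: Dict[Tuple[int, int], float] = {}
    for i, row in enumerate(car_od):
        for j, flow in enumerate(row):
            if i == j or flow == 0.0:
                continue
            path = shortest_path(graph, i, j)
            for u, v in zip(path, path[1:]):
                link_flow[(u, v)] = link_flow.get((u, v), 0.0) + flow
    return link_flow


# Hypothetical three zones on a line: 0 - 1 - 2.
population = [1000.0, 500.0, 800.0]
graph = {0: {1: 1.0}, 1: {0: 1.0, 2: 1.0}, 2: {1: 1.0}}
cost = [[0.0, 1.0, 2.0], [1.0, 0.0, 1.0], [2.0, 1.0, 0.0]]

productions = trip_generation(population, trips_per_person=2.0)
od = trip_distribution(productions, population, cost, beta=2.0)
car_od = mode_choice(od, car_share=0.4)
link_flow = assign(car_od, graph)
print(link_flow)
```

Real four-step implementations use calibrated trip rates, doubly constrained distribution, discrete choice models for mode split, and congestion-aware equilibrium assignment.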

4.2.3 Simulation tool

Traffic simulation is the use of simulation technology to assist in the study of traffic. It has random characteristics and can be microscopic or macroscopic. It involves a mathematical model that describes the real-time movement of the transportation system within a certain period of time. Two simulation tools are widely used for this purpose: Simulation of Urban Mobility (SUMO) and VISSIM.

SUMO: SUMO was proposed in 2001 and first released in 2002 (Brockfeld et al. 2001; Krajzewicz et al. 2002). SUMO is an open-source simulation package that can process and simulate traffic-related data. Behrisch et al. (2011) introduced the developments and prospects of SUMO in different research topics. SUMO is an effective simulation tool that is highly portable, microscopic, and continuous. SUMO contains multiple application packages. The common ones are dfrouter, which builds vehicle routes; duarouter, which uses the Gawron model (1998) to compute shortest paths and the dynamic user equilibrium; netconvert, which converts road networks; od2trips, which imports the OD matrix and converts it into trips; and TraCITestClient, which explores communication with external applications such as network simulator version 2/3 (NS2/3).^{Footnote 5} As shown in Fig. 3, the two important modules for simulating vehicle mobility are the road network import and demand modeling components. With the help of SUMO, researchers can easily modify urban traffic conditions for study. For instance, the combination of SUMO with NS2/3 makes it possible to achieve vehicle-to-vehicle (V2V) data transmission and generate vehicle trajectories.
VISSIM: VISSIM is a discrete, stochastic, microscopic traffic simulation software, based on time steps and driving behavior, developed by the PTV Corporation in Germany. The traffic simulator relies on the “Wiedemann 74” or the “Wiedemann 99” car-following model, which is classified as a psycho-physical car-following model (Aycin and Benekohal 1999). Lateral lane changes use a rule-based algorithm. The VISSIM software is internally composed of a traffic simulator and a signal state generator. The simulator includes a car-following model and a lane-change model. The signal generator is signal control software that implements traffic flow control through programs. The two components exchange data and signal status information through an interface. VISSIM can perform functions such as road network evaluation and optimization and traffic impact evaluation. It can realistically simulate the behavior of cars, trucks, buses, subways, light rail, bicycles, and pedestrians. For example, VISSIM supports the location layout of light rail and public transportation systems, the evaluation of public transport priority schemes (such as bus lanes), indoor and outdoor pedestrian flow analysis, and public short-distance traffic simulation. Similar to SUMO, VISSIM can also simulate trajectories by importing the OD matrix.
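The SUMO workflow described above (network import, demand modeling from an OD matrix, routing, and simulation) can be outlined as a sequence of command-line calls. The sketch below only builds and prints the commands; it does not run SUMO. All file names are hypothetical, and the option names are given as we understand them from the SUMO documentation, so they should be checked against the installed version.

```python
import shlex
from typing import List


def sumo_pipeline(osm_file: str, taz_file: str, od_file: str,
                  prefix: str) -> List[List[str]]:
    """Build the command lines for a basic OD-matrix-driven SUMO run."""
    net = f"{prefix}.net.xml"      # road network produced by netconvert
    trips = f"{prefix}.trips.xml"  # individual trips produced by od2trips
    routes = f"{prefix}.rou.xml"   # routed trips produced by duarouter
    fcd = f"{prefix}.fcd.xml"      # per-time-step vehicle positions
    return [
        # 1. Road network import (here from an OpenStreetMap extract).
        ["netconvert", "--osm-files", osm_file, "--output-file", net],
        # 2. Demand modeling: OD matrix plus traffic zones -> trips.
        ["od2trips", "--taz-files", taz_file,
         "--od-matrix-files", od_file, "--output-file", trips],
        # 3. Routing: compute a route on the network for every trip.
        ["duarouter", "--net-file", net,
         "--route-files", trips, "--output-file", routes],
        # 4. Simulation with trajectory (floating car data) output.
        ["sumo", "--net-file", net, "--route-files", routes,
         "--fcd-output", fcd],
    ]


commands = sumo_pipeline("city.osm", "zones.taz.xml", "demand.od", "demo")
for cmd in commands:
    print(shlex.join(cmd))
```

The final output file contains the position of every vehicle at every time step, which is the generated mobility trajectory data.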

4.2.4 Convolutional neural network (CNN)

CNN usually plays an important role in hybrid deep network design. Its main purpose is to gradually learn inherent features, beginning with low-level features and then building more complex concepts through a series of layers. Similar to the traditional neural network, the architecture of a typical CNN (Fig. 4) generally includes an input layer, an output layer, and hidden layers. Convolution layers, pooling layers, fully connected layers, and Rectified Linear Unit (ReLU) activations are the most commonly used components in hidden layers. The purpose of convolution is to extract features from the input. In contrast, pooling aims to gradually reduce the spatial size of the data volume while preserving vital information. Convolutional layers can also handle temporal dependencies (Nikhil and Morris 2019). Pooling layers, which commonly include max-pooling and average pooling, perform downsampling on the spatial dimensions between successive convolutional layers. The ReLU layer applies the activation function element-wise and leaves the data size unchanged. Fully connected layers are similar to the traditional multilayer perceptron (MLP), in which every neuron connects to all neurons in the previous layer.
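The layer types described above can be illustrated on a one-dimensional sequence. The following sketch is a toy, dependency-free illustration with hand-picked (not learned) weights; the input values and kernel are assumptions chosen so that each step is easy to follow.

```python
from typing import List


def conv1d(x: List[float], kernel: List[float]) -> List[float]:
    """Valid 1-D convolution (cross-correlation, as in most deep learning code)."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k))
            for i in range(len(x) - k + 1)]


def relu(x: List[float]) -> List[float]:
    """Element-wise ReLU; the output has the same size as the input."""
    return [max(0.0, v) for v in x]


def max_pool1d(x: List[float], size: int) -> List[float]:
    """Non-overlapping max pooling, which downsamples the sequence."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]


def dense(x: List[float], weights: List[float], bias: float) -> float:
    """A single fully connected neuron connected to every input feature."""
    return sum(w * v for w, v in zip(weights, x)) + bias


# Hypothetical speed readings along a trajectory.
speeds = [1.0, 2.0, 4.0, 7.0, 6.0, 3.0, 1.0, 0.0]

feature_map = conv1d(speeds, kernel=[-1.0, 1.0])  # detects speed increases
activated = relu(feature_map)                     # keeps positive responses
features = max_pool1d(activated, size=2)          # halves the length
score = dense(features, weights=[1.0, 1.0, 1.0], bias=0.0)
print(features, score)
```

In a real CNN the kernel and dense weights are learned by backpropagation, and many kernels are applied in parallel to produce multiple feature maps.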

CNN is widely utilized not only for image data and natural language processing tasks (Krizhevsky et al. 2017; Nagarhalli et al. 2021), but also for addressing spatio-temporal data mining challenges. In transportation, CNN serves as a prevalent technique for extracting features that capture the spatial characteristics of traffic. For instance, Chen et al. (2020) proposed a methodology to extract spatio-temporal features across multiple layers, where CNN is employed to transform road representations into image format. This approach enables the extraction of pertinent information by considering the spatial structure of the roads. Similarly, Lv et al. (2018) treats trajectory data as two-dimensional images and utilizes multi-layer CNNs to integrate trajectory patterns at different scales, facilitating accurate prediction tasks. More about the use of CNN in mobility trajectory generation tasks can be found in Sect. 5.2.

4.2.5 Recurrent neural network (RNN)

RNN (Mikolov et al. 2010) is a type of neural network designed to capture temporal information in sequential data. Unlike other neural networks such as CNN, an RNN can handle inputs and outputs of varying lengths.

A classical RNN cell also consists of three layers (input, hidden, and output) and can be seen as a chain of nodes, as depicted in Fig. 5, where X represents the input data, Y represents the output data, H refers to the hidden state, and W and b refer to the parameters. Specifically, the node state ${H}_{t}$ not only processes the input data ${x}_{t}$ at time t but also processes the information stored in ${H}_{t-1}$, memorizing the important parts of the sequence. The node state ${H}_{t}$ then conveys the processed information to the next node state ${H}_{t+1}$. To calculate the loss, the output ${Y}_{t}$ is compared with the ground truth. In mobility trajectory generation, the input of the RNN is composed of the historical trajectories. A continuous time period is divided into multiple time steps, and the historical trajectory is read at each time step and sent to the RNN (Ma et al. 2019).
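The recurrence described above can be sketched in a few lines of plain Python. The tanh activation, the weight names `w_xh` and `w_hh`, and the toy trajectory are assumptions made for illustration; the text does not prescribe them.

```python
import math


def rnn_step(x_t, h_prev, w_xh, w_hh, b_h):
    """One recurrent update: H_t = tanh(W_xh x_t + W_hh H_{t-1} + b)."""
    hidden = []
    for i in range(len(b_h)):
        pre = b_h[i]
        pre += sum(w_xh[i][j] * x_t[j] for j in range(len(x_t)))
        pre += sum(w_hh[i][j] * h_prev[j] for j in range(len(h_prev)))
        hidden.append(math.tanh(pre))
    return hidden


def run_rnn(sequence, h0, w_xh, w_hh, b_h):
    """Read the trajectory one time step at a time and collect every hidden state."""
    states = []
    h = h0
    for x_t in sequence:
        h = rnn_step(x_t, h, w_xh, w_hh, b_h)
        states.append(h)
    return states


# Hypothetical historical trajectory: one (x, y) point per time step.
trajectory = [[0.0, 0.0], [1.0, 0.5], [2.0, 1.0]]
w_xh = [[0.5, 0.0], [0.0, 0.5]]
w_hh = [[0.1, 0.0], [0.0, 0.1]]
b_h = [0.0, 0.0]

states = run_rnn(trajectory, [0.0, 0.0], w_xh, w_hh, b_h)
```

Each entry of `states` depends on the current point and on the previous hidden state, which is how earlier parts of the trajectory influence later steps.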

However, because it learns in an auto-regressive manner, RNN suffers from vanishing gradients on long input sequences. To address this problem, Long Short-Term Memory (LSTM) was proposed (Hochreiter and Schmidhuber 1997) and further improved by Gers et al. (2000).

LSTM also consists of multiple layers and, compared with a simple RNN, possesses a stronger memorization capability. LSTM adds a memory state C: the current state ${C}_{t}$ combines the previous state ${C}_{t-1}$ with newly added information. In addition, LSTM has three gates that control the propagation of information in the network. The first is the input gate, which determines how much of the current information to retain, for example when remembering new information. The second is the forget gate, which determines how much of the previously stored information to keep or forget. The third is the output gate, which determines how much of the current information is output and passed on to the next step. As shown in Fig. 6, LSTM maintains the recurrent structure of the RNN; the difference is that its three gates control the transmission of information.
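The role of the three gates can be made concrete with a scalar sketch of one LSTM step. It follows the commonly used LSTM formulation; the parameter layout (a dictionary of hypothetical `(w_x, w_h, b)` triples) is an assumption for illustration only.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def lstm_step(x_t, h_prev, c_prev, params):
    """One scalar LSTM step; params maps a gate name to (w_x, w_h, b)."""
    def pre_activation(name):
        w_x, w_h, b = params[name]
        return w_x * x_t + w_h * h_prev + b

    i_t = sigmoid(pre_activation("input"))         # how much new information to store
    f_t = sigmoid(pre_activation("forget"))        # how much of C_{t-1} to keep
    o_t = sigmoid(pre_activation("output"))        # how much of the memory to expose
    c_new = math.tanh(pre_activation("candidate")) # the new candidate information
    c_t = f_t * c_prev + i_t * c_new               # C_t mixes old memory and new part
    h_t = o_t * math.tanh(c_t)
    return h_t, c_t


# Hypothetical parameters: forget gate open, input gate closed -> memory is kept.
keep = {"input": (0.0, 0.0, -50.0), "forget": (0.0, 0.0, 50.0),
        "output": (0.0, 0.0, 50.0), "candidate": (0.0, 0.0, 0.0)}
# Hypothetical parameters: forget gate closed, input gate open -> memory is replaced.
overwrite = {"input": (0.0, 0.0, 50.0), "forget": (0.0, 0.0, -50.0),
             "output": (0.0, 0.0, 50.0), "candidate": (0.0, 0.0, 2.0)}

h_keep, c_keep = lstm_step(1.0, 0.0, 0.7, keep)
h_over, c_over = lstm_step(1.0, 0.0, 0.7, overwrite)
```

With the `keep` parameters the memory state stays close to its previous value of 0.7, whereas with `overwrite` it is replaced by the candidate value, which illustrates how the gates decide what is memorized and what is forgotten.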

The main advantage of RNN-based methods is their memorization capability. Knowing when to memorize or forget information has made RNN-based models a popular choice for sequence data. However, because of the recurrent structure, their training time is considerably longer than that of other deep neural network models.

In transportation, RNNs are primarily utilized to capture the temporal and spatial movement patterns of individuals. These models often incorporate various types of data, such as weather conditions and holiday schedules, for modeling purposes. For instance, Feng et al. (2018) introduced a mobility prediction model based on a recurrent neural network with an attention mechanism. This attention mechanism captures multi-level periodic characteristics, thereby improving the prediction performance of the recurrent neural network. Additionally, Kong and Wu (2018) proposed the Hierarchical Spatio-temporal LSTM (HSTLSTM) model to address data sparsity and capture periodic variations for predicting short-term correlations among individuals. In Sect. 5.2.1, we will provide an overview of the common usage of RNN-based models.

4.2.6 Generative adversarial network (GAN)

GAN was proposed by Goodfellow et al. (2014). As shown in Fig. 7, the basic architecture of GAN comprises two fundamental components that compete against each other: the generator $G\left( {\varvec{z}} ; \theta _{g}\right)$ and the discriminator $D\left( {\varvec{x}} ; \theta _{d}\right)$. On the one hand, the generator maps noise variables drawn from $p_{z}(z)$ to a distribution $p_{g}$ over the data ${\varvec{x}}$, learning to generate fake data that look real enough to fool the discriminator. On the other hand, the discriminator acts as a classifier that models the probability that a sample is real rather than generated. In an ideal state, the generator G produces fake data G(z) that match the real data, and the discriminator can hardly distinguish whether a sample was generated by G. Finally, the two components reach a dynamic equilibrium in which D(G(z)) equals 0.5.
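The equilibrium value of 0.5 can be checked numerically on a small discrete example. The sketch below uses the optimal discriminator $D^{*}(x)=p_{data}(x)/(p_{data}(x)+p_{g}(x))$ and the value function from the original GAN analysis by Goodfellow et al. (2014); the three-outcome distributions and the function names are hypothetical.

```python
import math


def optimal_discriminator(p_data, p_g):
    """D*(x) = p_data(x) / (p_data(x) + p_g(x)) for every outcome x."""
    return [pd / (pd + pg) for pd, pg in zip(p_data, p_g)]


def value_function(p_data, p_g, d):
    """V(D, G) = E_{x~p_data}[log D(x)] + E_{x~p_g}[log(1 - D(x))]."""
    real_term = sum(pd * math.log(dx) for pd, dx in zip(p_data, d) if pd > 0)
    fake_term = sum(pg * math.log(1.0 - dx) for pg, dx in zip(p_g, d) if pg > 0)
    return real_term + fake_term


# Hypothetical real distribution over three discrete outcomes.
p_data = [0.2, 0.5, 0.3]

# Perfect generator: p_g equals p_data, so the discriminator can only guess.
d_equilibrium = optimal_discriminator(p_data, p_data)
v_equilibrium = value_function(p_data, p_data, d_equilibrium)

# Imperfect generator: the discriminator can still tell samples apart.
p_g = [0.6, 0.3, 0.1]
d_star = optimal_discriminator(p_data, p_g)
v_star = value_function(p_data, p_g, d_star)
```

When the generated distribution matches the real one, every entry of `d_equilibrium` is 0.5 and the value function equals $-\log 4$; for the mismatched generator the value is larger, which reflects the advantage the discriminator still holds.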

However, the original GAN also has some shortcomings. For instance, GAN is not suitable for processing discrete data such as text. In addition, GAN suffers from unstable training, vanishing gradients, and mode collapse/dropping. To cope with these problems, many variants of the vanilla GAN have been presented. Mirza and Osindero (2014) proposed the conditional generative adversarial net (CGAN), which adds prior conditions to the original model, making GAN more controllable. Arjovsky et al. (2017) proposed the Wasserstein generative adversarial network (WGAN), which uses the Wasserstein distance instead of the JS divergence: even when the two distributions do not overlap, the Wasserstein distance can still reflect how far apart they are. WGAN not only addresses training instability but also provides a reliable indicator of training progress.
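The difference between the JS divergence and the Wasserstein distance for non-overlapping distributions can be seen in a minimal sketch. It compares point masses on a one-dimensional grid with unit spacing; the grid size and the helper names are hypothetical, and the Wasserstein-1 distance is computed via the standard cumulative-distribution formula for one-dimensional histograms.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions."""
    def kl(a, b):
        return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def wasserstein_1d(p, q):
    """W1 between histograms on the grid 0, 1, 2, ...: sum of |CDF_p - CDF_q|."""
    total, cdf_p, cdf_q = 0.0, 0.0, 0.0
    for pi, qi in zip(p, q):
        cdf_p += pi
        cdf_q += qi
        total += abs(cdf_p - cdf_q)
    return total


def point_mass(position, size=10):
    """All probability mass on a single grid cell."""
    return [1.0 if i == position else 0.0 for i in range(size)]


real = point_mass(0)
near = point_mass(1)  # generated samples close to the real data
far = point_mass(8)   # generated samples far from the real data

js_near, js_far = js_divergence(real, near), js_divergence(real, far)
w_near, w_far = wasserstein_1d(real, near), wasserstein_1d(real, far)
```

Both JS values equal $\log 2$ regardless of how far apart the point masses are, so they give the generator no useful signal, while the Wasserstein distance grows from 1 to 8 with the separation.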

In the field of transportation, GANs have become a significant paradigm in data-driven generation methods, supplanting conventional stochastic models. GANs capture spatial dimension features that are beyond the reach of traditional methods and encompass additional information, including temporal dimension, social dimension, and complex nonlinear relationships in the data (Gupta et al. 2018; Ouyang et al. 2018). Recent studies have employed GANs as stochastic generators for synthesizing realistic mobility trajectory data. For detailed information on GAN-based methods, refer to Sects. 5.2.2 and 5.2.3.

5 Mobility trajectory generation techniques

Table 2 Summary of advantages and disadvantages of mobility trajectory generation

In this section, we will elaborate on the representative methods of mobility trajectory generation based on the categorization we presented. For each work, we will introduce the scenario in which it is applied and discuss the theories or techniques it has developed.

5.1 Knowledge-driven approaches

Early generation of mobility trajectories was mainly used for simulating human daily dynamics in regional planning or observing and dealing with traffic congestion. Raney et al. (2003) designed a multi-agent traffic system that simulated 24-h micro-traffic in Zurich, Switzerland. They generated vehicle trajectories covering metropolitan areas with a population of 10 million, which were used for regional planning. They utilized demographic information and spatial information as knowledge, using micro-queue simulation and the Dijkstra algorithm for generating routes. Likewise, Cetin et al. (2003) also conducted dynamic micro-simulation of car traffic throughout Switzerland using traffic flow queue models based on Scenario 2. The generated dataset has a long duration but mainly focuses on the morning peak period and does not consider daily traffic conditions. However, both of these studies solely focus on car traffic, overlooking other modes of transportation.
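Route generation with the Dijkstra algorithm, as mentioned above, amounts to a shortest-path search on the road network. The following is a minimal sketch of the textbook algorithm on a hypothetical four-node network with travel times as edge weights; it is not the implementation used by the cited simulators.

```python
import heapq


def dijkstra_route(graph, source, target):
    """graph: {node: [(neighbour, travel_time), ...]}. Returns (cost, path)."""
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    visited = set()
    while queue:
        cost, node = heapq.heappop(queue)
        if node in visited:
            continue  # stale queue entry
        visited.add(node)
        if node == target:
            break
        for neighbour, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))
    if target not in best:
        return float("inf"), []  # target is unreachable
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    return best[target], path


# Hypothetical directed road network; weights are travel times in minutes.
roads = {
    "A": [("B", 4.0), ("C", 1.0)],
    "B": [("D", 5.0)],
    "C": [("B", 2.0), ("D", 8.0)],
    "D": [],
}
cost, route = dijkstra_route(roads, "A", "D")
```

In a simulation, each simulated traveller would be assigned such a route between an origin and a destination, and the edge weights could be updated as queues build up.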

Kanaya et al. (2012) combined spatial information and SUMO to propose a human sensing system simulator that synthesizes realistic human movements. Under Scenario 2, it can assist in locating individuals for navigation purposes. In the simulation part, they utilized map data, sensor information, and network data as prior knowledge. However, validating the system is challenging because of the cost of setting up sensors in different cities. Moreover, this method only simulates human behavior in urban areas.

Considering the previously discussed constraints of privacy and security protection, the absence of authentic, publicly accessible mobility trajectory datasets capable of capturing regional traffic dynamics poses a challenge for evaluating and validating vehicular networking protocols outlined in Scenario 1. To mitigate this concern, Ferreira et al. (2009) provided an alternative method to obtain the urban mobility of vehicles and their respective driving speeds based on traffic images. They extracted trajectory-related knowledge, e.g., the distribution of buildings, from the Spatial information contained in stereoscopic aerial photos. This work generates fine-grained trajectories through SUMO, and the spatial knowledge is utilized to estimate an accurate O/D matrix between two regions. The spatial knowledge is mainly learned by feature selection.

$$\begin{aligned} P(Z)&= \sum _{i\in I} \sum _{j\in J}P(Z\cap Y_j \cap X_{i})\\&= \sum _{i\in I} \sum _{j\in J}P(Z\vert Y_j\cap X_{i})P(Y_j\vert X_{i})P(X_{i}),\\ \end{aligned}$$

(2)

where X, Y, and Z represent three events, and I and J denote two partitions of the event space. Given that two events X and Y have already occurred, the probability that Z happens can be represented as a conditional probability as in (2). This work treats P(Z) as the probability of a destination choice event and utilizes the demand information and spatial information to estimate the corresponding probabilities in (2). The estimated choice probabilities can be represented as an O/D matrix that is input into simulation tools to generate the trajectories. However, the short duration of connectivity in the aircraft and the cost of aerial photography make the data collection hard.
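The decomposition in (2) can be evaluated directly once the three probability tables are available. The sketch below is a toy illustration: the interpretation of $X_i$ as origin zones and $Y_j$ as trip purposes, as well as all numerical values, are hypothetical and not taken from the cited work.

```python
def total_probability(p_x, p_y_given_x, p_z_given_xy):
    """P(Z) = sum_i sum_j P(Z | Y_j, X_i) * P(Y_j | X_i) * P(X_i), as in (2)."""
    total = 0.0
    for i, p_xi in enumerate(p_x):
        for j, p_yj_given_xi in enumerate(p_y_given_x[i]):
            total += p_z_given_xy[i][j] * p_yj_given_xi * p_xi
    return total


# Hypothetical example: X_i = origin zone, Y_j = trip purpose,
# Z = "the trip ends in a given destination zone".
p_x = [0.6, 0.4]                      # P(X_i)
p_y_given_x = [[0.7, 0.3],            # P(Y_j | X_0)
               [0.5, 0.5]]            # P(Y_j | X_1)
p_z_given_xy = [[0.2, 0.5],           # P(Z | Y_j, X_0)
                [0.1, 0.4]]           # P(Z | Y_j, X_1)

p_destination = total_probability(p_x, p_y_given_x, p_z_given_xy)
```

Repeating this computation for every destination zone would yield one column of an O/D matrix; here the single probability evaluates to 0.274.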

Subsequently, Thakur et al. (2012) acquired traffic flow data from roadside surveillance cameras in cities including London, Sydney, and Toronto, in order to calibrate the mobility of micro-vehicles. However, like previous research, this approach is also burdened by high filming costs and the need for advanced image processing techniques. Moreover, aerial photography has a limited time interval, rendering it unsuitable for generating large-scale datasets.

For knowledge-driven methods to generate mobility trajectories, traffic simulation tools are indispensable. Typically, they combine prior knowledge to generate macroscopic traffic flow, which refers to traffic volume between regions, for the purpose of trajectory generation tasks. Uppoor et al. (2014) synthesized real vehicle trajectory datasets for the City of Cologne based on Scenario 1 using SUMO. This work combines Spatial information, Demographic information, and Report information to generate possible mobility distributions in urban areas. Firstly, they obtained road topology information from the OpenStreetMap database. Secondly, they utilized population, Points of Interest (POI), and time usage patterns (i.e., residents’ time planning) as knowledge to calculate traffic demand. Then, the authors chose to utilize the Gawron algorithm for traffic assignment to achieve dynamic user equilibrium. Compared to the Dijkstra algorithm, the Gawron algorithm maximizes the road network capacity more effectively. The authors provided solutions to the issues encountered during the simulation process.

Codeca et al. (2015) described the process of creating realistic scenarios based on SUMO in a medium-sized European city, Luxembourg, using Scenario 1. The authors extracted the road topology structure using OpenStreetMap (OSM). With the help of the simulator, they verified the accuracy of the manually corrected topology structure. They generated realistic traffic patterns based on activity-based demand using data easily obtained from government websites, such as population data. Additionally, they considered the reasonableness of traffic patterns. Bedogni et al. (2015) provided an openly available real trajectory dataset. Knowledge was extracted from Spatial information, particularly road network information. They employed the SUMO road network conversion tool NETCONVERT, which allows automated and clean importing of OSM data, and generated original circular movement trajectory datasets for the Bologna region in Italy. This work considered fine-grained road features such as connectivity and traffic lights when simulating trajectories. All three works mentioned above have long trajectory durations and wide coverage areas. However, these methods cannot be used for trajectory generation without relevant government research reports.

Gramaglia et al. (2016) generated a trajectory dataset based on Scenario 1 to characterize vehicular network connectivity. The Intelligent Driver Model (IDM; Liebner et al. 2012) is utilized to estimate the driving status and simulate the traces. IDM estimates the driving behavior of a vehicle i through the instantaneous acceleration $dv_{i}(t)/dt$ as:

$$\begin{aligned} \frac{dv_{i}(t)}{dt}&= a \left[ 1-\left( \frac{v_{i}(t)}{v_{i}^{max}} \right) ^{4}- \left( \frac{\Delta x_{i}^{des}(t)}{\Delta x_{i}(t)} \right) ^{2}\right] ,\\ \Delta x_{i}^{des}(t)&= \Delta x^{safe}+ \left[ v_{i}(t)\Delta t_{i}^{safe}-\frac{v_{i}(t)\Delta v_{i}(t)}{2\sqrt{ab}}\right] ,\\ \end{aligned}$$

(3)

where $v_{i}(t)$ is the current speed of i, $v_{i}^{max}$ denotes the maximum speed, $\Delta x_{i}(t)$ is the distance to the leading vehicle, and $\Delta x_{i}^{des}(t)$ represents the desired dynamical distance (the distance the driver would like to keep from the leading vehicle). In the standard IDM formulation, a and b denote the maximum acceleration and the comfortable deceleration, respectively. This work analyzed the data collected by sensors deployed on highway loops and incorporated the Demand information into traffic models to generate trajectory data. The generated trajectories have a duration of 24 h and a coverage range of 10 km. However, the work focuses primarily on the study of vehicle networks in highway environments.
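Equation (3) translates directly into a short function. The sketch below follows the signs of (3) as printed and assumes, as in the standard IDM, that a is the maximum acceleration and b the comfortable deceleration; the parameter values and the argument names are hypothetical.

```python
import math


def idm_acceleration(v, dv, gap, v_max, a, b, gap_safe, t_safe):
    """Instantaneous acceleration dv_i/dt according to Eq. (3).

    v: current speed, dv: speed difference Delta v_i(t) (sign convention of Eq. 3),
    gap: distance to the leading vehicle, gap_safe / t_safe: safety distance and time.
    """
    desired_gap = gap_safe + (v * t_safe - v * dv / (2.0 * math.sqrt(a * b)))
    return a * (1.0 - (v / v_max) ** 4 - (desired_gap / gap) ** 2)


# Hypothetical parameters (SI units).
params = dict(v_max=30.0, a=1.0, b=1.5, gap_safe=2.0, t_safe=1.5)

free_start = idm_acceleration(v=0.0, dv=0.0, gap=1e9, **params)   # empty road, standstill
cruising = idm_acceleration(v=30.0, dv=0.0, gap=1e9, **params)    # empty road, at v_max
tailgating = idm_acceleration(v=20.0, dv=0.0, gap=5.0, **params)  # far too close
```

On an empty road the vehicle accelerates with approximately a from standstill and stops accelerating at the maximum speed, while a gap much smaller than the desired distance yields a strong deceleration.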

SUMO is a highly regarded simulation tool primarily designed for right-hand traffic. However, countries that follow left-hand traffic need to make specific modifications to the SUMO files. Lim et al. (2017) proposed a method that enables the simulation of left-hand traffic in Malaysia using SUMO, building upon the foundation of Scenario 1. The research focused on making primary modifications to the road connections and traffic signal files. Nevertheless, due to the challenges of modifying extensive maps, this method is not suitable for large-scale areas.

The acquisition of inter-regional traffic flow is vital for simulating traffic using the SUMO platform, and numerous studies have relied on publicly available government data for estimation purposes. Kong et al. (2018) utilized floating car data in Beijing to generate a dataset of social vehicle trajectories within SUMO. It is important to highlight that the objective of their work was to produce trajectory datasets specifically for private cars, based on the floating car dataset, which is applicable to Scenario 1. The study integrated Report information, Demographic information, and Spatial information into a spatial interaction model to estimate the macroscopic travel distribution across different areas in Beijing. In a subsequent work, Kong et al. (2022) introduced an alternative method for generating mobility trajectories in the same application scenario. They proposed a three-layer framework, wherein the first layer focused on developing a regional partition scheme. The second layer presented a novel spatiotemporal interaction model to estimate traffic flow between two regions and conducted simulations using SUMO. Lastly, the third layer analyzed the validation results from both macroscopic and microscopic perspectives. However, it is important to acknowledge that this method exhibits certain limitations, performing better in high-density scenarios compared to low-density scenarios. Moreover, it lacks a comprehensive consideration of factors that influence travel behavior and requires specific urban road segmentation in the regional partition. The two aforementioned studies encompass an analysis of macroscopic traffic flow and microscopic driving behavior, resulting in extended duration and coverage of the entire Beijing Fifth Ring Road. Nevertheless, it is essential to note that utilizing simulation tools for route selection may contribute to traffic congestion.

In summary, knowledge-driven methods are predominantly used in Scenario 1 due to their application for validating works related to VANETs protocols or simulating traffic, which requires larger volumes of data, wider coverage, and longer duration. While realistic data, such as traffic flow or traffic average speed, can be collected (Gramaglia et al. 2016), these data are solely utilized for estimating statistical characteristics rather than learning features. When simulation tools are employed, knowledge-driven methods demonstrate a more effective capability in generating large-scale and long-term datasets. However, this approach heavily relies on supplementary information in addition to specific datasets. Furthermore, this generation paradigm is primarily based on spatial theories and traffic models. Nonetheless, theories or models tend to oversimplify real-world variables, leading to suboptimal performance in capturing intricate correlations or dependencies at a fine-grained microscopic level.

5.2 Data-driven approaches

Compared with knowledge-driven methods, data-driven methods make use of large-scale real datasets collected by sensors and incorporate deep learning techniques into mobility trajectory generation. This paradigm aims to learn the spatio-temporal dependencies preserved in the realistic data and then generates trajectories from the learned spatio-temporal correlations.

5.2.1 RNN-based models

RNNs and their variations have achieved certain accomplishments in generating pedestrian trajectories. Alahi et al. (2016) proposed Social LSTM for generating pedestrian trajectories. The model designed an aggregation strategy to connect neighboring LSTM units and learn the interactive behaviors among individuals in a larger spatial context. Social pooling aggregates the hidden states of adjacent pedestrians within a certain spatial distance, as shown in Fig. 8. However, this method neglects the influence of other factors, such as scene layout. Additionally, in crowded scenes, the strategy becomes more complex due to the use of an LSTM for each individual. Inspired by the aforementioned work, Fernando et al. (2018) presented an attention-based LSTM model that considers the past interactions between pedestrians and their neighbors in the contextual scene to generate future trajectories. The introduced attention model can handle highly congested scenarios.
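The idea of aggregating neighbors' hidden states within a spatial window can be sketched as a grid-based sum pooling. This is a simplified illustration in the spirit of social pooling, not the implementation of Alahi et al. (2016); the window size, the grid resolution, and the data are hypothetical.

```python
def social_pooling(positions, hidden_states, ego, neighborhood=4.0, grid=2):
    """Sum neighbours' hidden states into a grid x grid tensor centred on `ego`."""
    dim = len(hidden_states[ego])
    cell = neighborhood / grid
    pooled = [[[0.0] * dim for _ in range(grid)] for _ in range(grid)]
    ego_x, ego_y = positions[ego]
    for idx, (x, y) in enumerate(positions):
        if idx == ego:
            continue
        # Offset relative to the lower-left corner of the neighbourhood window.
        off_x = x - ego_x + neighborhood / 2.0
        off_y = y - ego_y + neighborhood / 2.0
        if not (0.0 <= off_x < neighborhood and 0.0 <= off_y < neighborhood):
            continue  # outside the spatial window: no influence on `ego`
        col, row = int(off_x // cell), int(off_y // cell)
        for k in range(dim):
            pooled[row][col][k] += hidden_states[idx][k]
    return pooled


# Hypothetical scene: pedestrian 0 is the ego; pedestrian 3 is too far away to matter.
positions = [(0.0, 0.0), (1.0, 1.0), (-1.0, 1.0), (10.0, 10.0), (0.5, 0.5)]
hidden_states = [[1.0, 1.0], [0.1, 0.2], [0.3, 0.4], [9.0, 9.0], [0.5, 0.5]]

pooled = social_pooling(positions, hidden_states, ego=0)
```

Pedestrians 1 and 4 fall into the same grid cell, so their hidden states are summed, while pedestrian 3 lies outside the window and is ignored. Because such a tensor has to be built for every pedestrian at every time step, the cost grows quickly in crowded scenes.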

Xue et al. (2017) further extended the previous work and presented a framework named Bi-Prediction for predicting pedestrian trajectories in a scene. Bi-Prediction designed a two-stage architecture based on bidirectional LSTM to learn fine-grained entry and exit trajectories in a given scene. Unlike the previous work that clusters trajectories, Bi-Prediction divides the scene into multiple regions and utilizes bidirectional LSTM classification to predict the destination selection probability of pedestrians.

Unlike previous studies that disregard the present intention of nearby pedestrians while concentrating solely on their adjacent hidden states, Zhang et al. (2019) introduced a states refinement module based on LSTM network. Acting as a feature extractor, this module employs an information passing mechanism to engage neighboring pedestrians’ intentions and jointly handles the current states of all pedestrians in congested scenarios. Furthermore, an information selection mechanism is introduced to selectively extract valuable features from individual neighbors.

In contrast to Social LSTM and Bi-Prediction, Lisotto et al. (2019) proposed three tensors to enhance the performance of the basic LSTM model. The first tensor is the Social Tensor, which aggregates neighboring interactions using a pooling mechanism. The Social Tensor follows a similar pooling strategy as in Social LSTM. The second tensor is the Navigation Tensor, which incorporates environmental content information for path selection. Specifically, a Navigation Map ${\mathcal {N}}$ was developed to quantify the frequency of crossings during navigation. Average pooling is employed to mitigate abrupt frequency transitions. The third tensor is the Semantic Tensor, which captures the semantic characteristics of spatial areas. The study defined a semantic class ${\mathcal {C}}=\{grass, building, obstacle, bench, car, road, sidewalk\}$ and encoded it using one-hot representations. However, this approach also models each pedestrian as an LSTM network, making it equally unsuitable for crowded scenarios.
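The one-hot encoding of the semantic class set can be written as a short sketch. The class names are those listed above; the ordering of the classes and the function name are assumptions for illustration.

```python
# Class names from the semantic class set; the ordering is an assumption.
SEMANTIC_CLASSES = ["grass", "building", "obstacle", "bench", "car", "road", "sidewalk"]


def one_hot(label):
    """Return a vector with a single 1 at the position of `label`."""
    if label not in SEMANTIC_CLASSES:
        raise ValueError(f"unknown semantic class: {label}")
    return [1 if name == label else 0 for name in SEMANTIC_CLASSES]


car_vector = one_hot("car")
road_vector = one_hot("road")
```

Each spatial area is thus described by a seven-dimensional vector, and no artificial ordering between classes such as grass and road is implied.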

In real-life scenarios, pedestrians influence each other’s movements and are also affected by the presence of obstacles in their surroundings. Therefore, it is essential to consider various factors when generating future trajectory predictions. The application of attention mechanisms has proven to be effective in generating more plausible trajectories, and its effectiveness has been demonstrated in many tasks.

Haddad et al. (2019) introduced a graph-based LSTM framework for generating pedestrian trajectories. In contrast to previous approaches, this framework represents spatial and temporal interactions using a spatio-temporal graph, as shown in Fig. 9. The graph components are decomposed into three LSTM-based modules: temporal edge LSTM, spatial edge LSTM, and node LSTM. Vanilla LSTM is employed to incorporate spatial and temporal relationships into deep representations.

Al-Molegi et al. (2018) proposed a neural network model that combines RNN and attention mechanisms. This model employs representation learning techniques to extract essential information from sequential trajectories. It tends to generate pedestrian trajectories that correspond to specific locations. However, the model lacks the capability to handle unseen locations. Similarly, Vemula et al. (2018) incorporated attention mechanisms to capture the relative importance of each individual in the crowd, irrespective of their proximity. However, the computational complexity increases due to the larger number of model parameters. The attention mechanism was also incorporated by Jiang et al. (2019) to distinguish the importance of different neighbors and tackle the issue of generating pedestrian trajectories. However, because destination information is first extracted from past trajectory data, the model neglects the influence of pedestrians on one another. Consequently, the intended destination is estimated with a deviation, and the generated trajectories deviate from the actual paths.

The utilization of soft attention and hard attention (Fernando et al. 2018), implemented with the LSTM model, addresses pedestrian interactions in densely populated scenarios by incorporating the trajectory information of nearby neighbors into future trajectory generation. In a similar vein, Bhujel et al. (2019) proposed two attention mechanisms within the LSTM framework. The first one is physical attention, which leverages input images to identify locations and generate contextual information. The second one is social attention, which computes social context vectors based on the encoder’s hidden states. Furthermore, the authors employed CNN as an extractor to acquire scene information. Notably, this study employs a single LSTM, effectively reducing the complexity of the training process. In the study conducted by Xue et al. (2020), the generation of pedestrian future trajectories relies exclusively on the observed partial trajectories. The model adopts the LSTM architecture and incorporates temporal attention mechanisms into the location and velocity LSTM layers. However, this research does not integrate comprehensive background information, such as static obstacles and scene details.
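A soft attention step over neighbors' hidden states can be sketched as a softmax-weighted sum. The dot-product scoring used here is an assumption for illustration; the cited models learn their own scoring functions, and all vectors are hypothetical.

```python
import math


def soft_attention(query, neighbor_states):
    """Dot-product scores -> softmax weights -> weighted sum (context vector)."""
    scores = [sum(q * h for q, h in zip(query, state)) for state in neighbor_states]
    peak = max(scores)  # subtracted for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    weights = [e / norm for e in exps]
    dim = len(neighbor_states[0])
    context = [
        sum(w * state[k] for w, state in zip(weights, neighbor_states))
        for k in range(dim)
    ]
    return weights, context


# Hypothetical hidden state of the ego pedestrian and of three neighbours.
ego_state = [1.0, 0.0]
neighbors = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

weights, context = soft_attention(ego_state, neighbors)
```

The neighbor whose state is most aligned with the ego state receives the largest weight, and the resulting context vector summarizes the neighborhood in a fixed size regardless of how many neighbors are present.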

The main objective of the aforementioned trajectory generation tasks is to generate pedestrian trajectories. However, there have also been several studies that focus on generating trajectories from the perspective of vehicles. Park et al. (2018) proposed a framework specifically designed for vehicle trajectory generation. In this framework, an LSTM encoder is utilized to capture the trajectory samples and state information of the ego vehicle. Subsequently, the LSTM decoder leverages a beam search algorithm to generate future trajectories. The architecture of the LSTM encoder–decoder framework is shown in Fig. 10.

The following works are built upon the LSTM Encoder–decoder framework. Deo and Trivedi (2018) introduced a unique approach by enhancing the social pooling layer with convolution, enabling robust learning of interdependencies in the data. Messaoud et al. (2019) tackled the challenge of long-term trajectory prediction (5 s) on highways by integrating attention mechanism and LSTM to capture spatio-temporal dependencies. Khakzar et al. (2020) aimed to overcome the limitations of existing methods, including computational complexity and dataset dependence, by employing ConvLSTM. This replaces the inner product of LSTM with convolution, ensuring the preservation of spatio-temporal motion patterns.

Existing LSTM models inadequately capture the spatial interactions and temporal relations among distinct vehicles. Furthermore, basic LSTM models encounter challenges with the vanishing gradient problem, impeding their training on long time series. Choi et al. (2019) proposed an attention mechanism to enhance the basic RNN and elucidate the impact of network-level traffic state information on generating trajectories for urban vehicles. Ma et al. (2019) devised an algorithm comprising two primary levels: an instance level for capturing agent mobility and interactions, and a category level for learning from agents of the same type. Nevertheless, its practical application is limited by the algorithm’s high computational cost and overreliance on traffic conditions and historical trajectories. Dai et al. (2019) integrated spatial interactions and temporal relations into the LSTM model to quantify the interactions among diverse vehicles. Additionally, they mitigated the vanishing gradient problem by introducing two consecutive LSTM layers between the input and output.

It is crucial to emphasize that the previously mentioned trajectory generation works utilizing RNN are employed in Scenario 3 with the goal of comprehending pedestrian and vehicle behaviors, and preventing collisions with obstacles in the surrounding environment. These works play a crucial role in the future advancement of socially compliant agents and autonomous vehicles.

5.2.2 GAN-based models

GAN has proven effective in generating pedestrian mobility trajectories. For instance, Gupta et al. (2018) designed an early GAN model named SocialGAN, which utilized a purely data-driven approach to model interactions among individuals. L2 loss was employed in this work to measure the distance between generated samples and real samples, as illustrated in Eq. 4. In contrast to the conventional GAN discussed in Sect. 4.2.6, SocialGAN integrates a new pooling mechanism within the encoder–decoder framework to capture information about individuals and generate trajectories in a scene.

$${\mathcal {L}}_{2}=\min _{m}\left\| Y_{t}-{\hat{Y}}_{t}^{(m)}\right\| _{2},$$

(4)

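where $Y_{t}$ is the real trajectory, ${\hat{Y}}_{t}^{(m)}$ is the m-th generated sample, and the number of generated samples over which the minimum is taken is a hyperparameter.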
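The loss in Eq. (4) keeps only the generated sample that is closest to the real trajectory. A minimal sketch, with hypothetical trajectories given as lists of (x, y) points and hypothetical function names, is as follows.

```python
import math


def l2_distance(traj_a, traj_b):
    """Euclidean norm of the difference between two trajectories of (x, y) points."""
    return math.sqrt(sum(
        (ax - bx) ** 2 + (ay - by) ** 2
        for (ax, ay), (bx, by) in zip(traj_a, traj_b)
    ))


def best_of_m_l2(ground_truth, samples):
    """Eq. (4): the smallest L2 distance among the generated samples."""
    return min(l2_distance(ground_truth, sample) for sample in samples)


# Hypothetical real trajectory and three generated samples.
real = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
generated = [
    [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)],    # shifted upwards
    [(0.0, 0.0), (1.0, 0.0), (2.0, 0.5)],    # deviates only at the last point
    [(0.0, -2.0), (1.0, -2.0), (2.0, -2.0)], # shifted downwards
]

loss = best_of_m_l2(real, generated)
```

Because only the best sample is penalized, the remaining samples are free to cover other plausible futures, which encourages diverse outputs. In this example the loss equals 0.5, the distance of the second sample.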

Ouyang et al. (2018) designed a non-parametric trajectory generator that combines WGAN-GP (Gulrajani et al. 2017) to capture high-order geographic and semantic features. Non-parametric means that the generator does not assume any explicit parameters for the movement trajectories. They evaluated the synthetic trajectories by comparing the geographic and semantic features with real trajectories. In the model proposed by Amirian et al. (2019), L2 loss was excluded during the training of the generator to avoid mode collapse issues. This work not only integrated the Info-GAN structure into their network but also defined an attention aggregation mechanism to capture interactions between humans.

Song et al. (2019) analyzed data from macro and micro perspectives within the GAN framework. The former applied the k-means clustering method, while the latter focused on understanding the correlations between different points. They used a four-layer CNN to generate trajectories represented as matrices. However, owing to its restriction to specific locations, the model’s capability is limited by high randomness. Additionally, this approach lacked a quantitative evaluation of the model’s realism. Subsequently, Liu et al. (2020) proposed CoL-GAN, whose generator uses an attention mechanism and whose discriminator is a convolutional neural network. The model includes a social attention module to capture pedestrians’ historical patterns.

GANs have also been utilized in the task of generating vehicle movement trajectories. For example, a GAN-based framework for predicting vehicle trajectories was proposed by Roy et al. (2019) to model the interactions between vehicles with diverse types and driving styles. The crucial aspect involves integrating the social environment into the GAN model, which incorporates the LSTM encoder–decoder architecture and has demonstrated superior performance compared to certain purely RNN- or LSTM-based approaches. To account for the interactions among multiple vehicles, Wang et al. (2020c) proposed a collaborative learning approach based on GANs to generate multi-modal distributions of vehicle trajectories. This approach comprises two modules: the autoencoder social convolution module and the recursive social module, enabling the modeling of spatiotemporal information for distinct vehicles. Zhao et al. (2021) introduced a GAN model for trajectory generation and a vehicle turning model to adapt the prediction process in urban scenarios. During the dataset preparation, the complex spatial dependencies of road topology were addressed through vehicle coordinate transformation.

The above-mentioned GAN-related models, similar to the RNN-based methods for generating pedestrian and vehicle movement trajectories, are applied in Scenario 3.

5.2.3 Hybrid methods

The majority of the models presented in Sects. 5.2.1 and 5.2.2 generate trajectories depicting the movement of pedestrians or vehicles within a shared scene. Subsequent approaches integrate multiple neural network models within their frameworks and capture intricate scenarios. Unless otherwise indicated, these methods are also employed for Scenario 3. Zhao et al. (2019b) presented the Multi-Agent Tensor Fusion (MATF) network, which generates trajectories considering both vehicles and pedestrians. Specifically, this method utilizes an LSTM encoder–decoder architecture and employs Conditional Generative Adversarial Networks (CGAN; Mirza and Osindero 2014) to learn a stochastic generative model that captures uncertainties across multiple modes. The future trajectories are subsequently obtained through iterative decoding processes.

Vishnu et al. (2023) further expanded upon the previously mentioned approach and introduced three prediction models with distinct architectures: TS-Transformer, Generative Adversarial Network-based (TS-GAN), and Conditional Variational Autoencoder-based (TS-CVAE). These models are designed to generate trajectories for multiple agents in interactive driving scenarios. Sadeghian et al. (2019) provided Sophie, a GAN-based model, to predict future social constraints among multiple interacting agents in a scene. This method, similar to SocialGAN, employs LSTM to estimate temporal states. However, it distinguishes itself by integrating two attention mechanisms (physical attention and social attention) to enable interpretable generation. Furthermore, CNN is utilized as a feature extractor to capture scene features.

In contrast to most scene generation models, which require extensive condition settings and parameters, Wu et al. (2020) introduced a fully data-driven model called LSTM-GAN, which relies solely on historical data. Moreover, the data generated by this method can concurrently encompass continuous time periods and locations.

5.2.4 Others

The Graph Neural Network (GNN) is a machine learning model that operates on graph structures. Graph attention networks (GAT), which combine GNN with attention mechanisms, have been employed for trajectory generation tasks. Kosaraju et al. (2019) utilized a GAN based on the GAT to generate multimodal pedestrian trajectories in interactive scenes, known as Social-BiGAT. Huang et al. (2019) introduced a spatial-temporal graph attention mechanism, using LSTM to capture the temporal correlations of pedestrian movements and GAT to model spatial interactions.
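To make the idea of attention over a graph concrete, the following is a minimal NumPy sketch of a single graph-attention head in the style of the standard GAT formulation. It is not the implementation used in Social-BiGAT or by Huang et al. (2019); the function name `gat_head`, the parameter shapes, and the toy three-pedestrian graph are assumptions made for illustration, and the usual output non-linearity is omitted.

```python
import numpy as np


def gat_head(h, adj, W, a, slope=0.2):
    """One graph-attention head (illustrative sketch, hypothetical names).

    h:   (N, F) node features, e.g. one hidden state per pedestrian
    adj: (N, N) boolean adjacency matrix, assumed to include self-loops
    W:   (F, Fp) shared linear projection
    a:   (2 * Fp,) attention vector
    """
    Wh = h @ W                              # projected features, (N, Fp)
    fp = Wh.shape[1]
    src = Wh @ a[:fp]                       # contribution of node i
    dst = Wh @ a[fp:]                       # contribution of neighbour j
    e = src[:, None] + dst[None, :]         # e[i, j] = a^T [Wh_i || Wh_j]
    e = np.where(e > 0, e, slope * e)       # LeakyReLU
    e = np.where(adj, e, -np.inf)           # only neighbours may be attended to
    e = e - e.max(axis=1, keepdims=True)    # stabilise the softmax
    alpha = np.exp(e)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return alpha @ Wh, alpha                # aggregated features, attention weights


# Toy example: pedestrians 0 and 2 are not neighbours of each other.
rng = np.random.default_rng(0)
h = rng.normal(size=(3, 4))
adj = np.array([[1, 1, 0], [1, 1, 1], [0, 1, 1]], dtype=bool)
W = rng.normal(size=(4, 5))
a = rng.normal(size=10)
out, alpha = gat_head(h, adj, W, a)
```

Each row of `alpha` sums to one and is zero for non-neighbours, so every agent's new representation is a weighted average over the agents it interacts with.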

In contrast to the aforementioned research, Gao et al. (2020a) introduced a hierarchical graph neural network known as VectorNet. Rather than employing CNN for encoding, they utilize vector representations to handle high-definition maps and agent movement trajectories. Additionally, they stack multiple GNN layers to capture higher-order interactions among all components. Lv et al. (2023) designed a model that combines Graph Convolutional Networks (GCN) with attention mechanisms to capture interactions among pedestrians and between pedestrians and the environment in complex scenes. However, since the functions are specifically designed based on inherent graph structures, they are not compatible with non-GCN methods.

Lv and Yuan (2023) integrated social knowledge (such as the distance, speed, and visual range between pedestrians) as a matrix and combined it with the GAT to generate pedestrian movement trajectories. However, this approach primarily emphasizes the interaction between pedestrians, neglecting the interaction between pedestrians and the environment.

Kang et al. (2021) proposed a method called TraG for urban crowd mobility, which automatically captures contextual and statistical mobility features, ranging from simple empirical data to synthetic trajectories, using real-world datasets. This study primarily focuses on Scenario 1 and Scenario 2 for evaluating network simulation and planning decisions.

To address the probabilistic generation task for multiple interacting entities, Li et al. (2019) employed a variational recurrent neural network (VRNN) to improve coordination classification accuracy and used a Coordination-Bayesian Conditional Generative Adversarial Network to generate future vehicle trajectories based on historical information and coordination outcomes of multiple vehicles.

Si et al. (2019) designed an Adaptive Generation (AGen) method for generating vehicle trajectories. This method combines online adaptation and offline learning models to account for individual variances and temporal behaviors. It also incorporates an RNN model.

To simulate human spatio-temporal mobility patterns, Luca Pappalardo (2018) designed a data-driven algorithm called DIary-based TRAjectory Simulator (DITRAS), which achieves realistic simulation of human mobility. The basic idea is to separate the temporal characteristics and spatial characteristics of human mobility. Specifically, it constructs a mobility diary from real data and transforms it into a mobility trajectory.

To address the issue of data scarcity in emerging cities, recent works have combined prior knowledge with data-driven methods based on Scenario 2. He et al. (2020) proposed a framework that integrates transfer learning and multiple-source data from the target city to generate mobility data for new cities. Inspired by this work, Rong et al. (2023) combined GNN and GAN to generate OD flow data in emerging cities using data from source cities.

For improved estimation of traffic conditions and patterns in urban development planning and management (Scenario 2), TrafficGAN, a deep generative model proposed by Zhang et al. (2020b), captures the underlying patterns of how traffic evolves with changing travel demands and the evolving structure of the underlying road network. Within their framework, they developed a generative adversarial network (GAN) architecture featuring a generator and a discriminator equipped with dynamic convolutional layers. Additionally, Zhang et al. (2020a) proposed a conditional GAN (cGAN) that addresses traffic planning problems by treating traffic demands as conditions for generating traffic estimates. They used dynamic convolutional layers to extract spatial correlations within localized networks and self-attention mechanisms to capture temporal relationships.

For the task of generating future trajectories of moving objects and forecasting traffic flow in urban areas (Scenario 2 and Scenario 3), Karimzadeh et al. (2021) employ both reinforcement learning and transfer learning techniques to design the architecture of LSTM models. Additionally, they leverage high-order convolution operations and adaptive distance adjacency matrices to effectively capture the spatiotemporal dependencies within urban environments.

In summary, data-driven approaches have the capability to uncover complex and latent factors or correlations from the data itself. In Sect. 3, we discussed two limitations of the data-driven paradigm. From this subsection, it can be concluded that the performance of data-driven approaches is contingent upon the quality of the training data. Furthermore, for the majority of existing data-driven approaches, the training process is not clearly understood. Taking GAN as an example, the dynamics of its adversarial learning process remain poorly understood, so training GANs stably is still a significant challenge. Nevertheless, data-driven methods offer distinct advantages over knowledge-driven methods, because our current understanding and theories may not fully capture the inherent complexity of the mobility trajectory process. For instance, accurately modeling the subtle psychological state of individual drivers, which significantly influences trajectory generation, remains elusive.

Data-driven methods are more commonly utilized in Scenario 2 and Scenario 3. Among them, work in Scenario 2 primarily generates and evaluates data for emerging cities by leveraging historical traffic data, and assesses the impact of new buildings on future traffic. These tasks are frequently complemented by knowledge-driven approaches, and the works primarily concentrate on generating traffic speed and traffic flow data. Data-driven methods in Scenario 3 can generate both vehicle and pedestrian movement trajectories. The generated trajectory data usually span shorter durations and cover smaller spatial areas, enabling a more detailed exploration of spatio-temporal dependencies.

In the trajectory generation process, knowledge-driven methods, in addition to the relevant information mentioned in Sect. 4, have taken into account the temporal patterns of residents’ travel in Scenario 1 (Uppoor et al. 2014; Bedogni et al. 2015). Conversely, data-driven methods incorporate a wider range of features for training, going beyond the sole reliance on location-based information. Within Scenario 2, data-driven methods primarily focus on the average speeds within specific regions and the traffic flow within each respective region (Zhang et al. 2020a). Moreover, certain studies also encompass demographic data, income-related data, epidemiological conditions, and policies (Bao et al. 2022), alongside the generated time periods (peak hours, holidays) and weather conditions (Wu et al. 2020). In Scenario 3, there is a notable emphasis on ensuring safe distances between vehicles (Dai et al. 2019) and considering environmental factors (Gao et al. 2020).

The complexity of knowledge-driven methods in handling large-scale and long-duration simulations depends on various factors, including the number of vehicles, road segment complexity, vehicle interactions, and traffic signal controls. The processing time increases as more factors are taken into consideration. In data-driven methods, the CNN module exhibits relatively low complexity, whereas RNN and GAN training involves higher complexity, demanding more computational resources (Huang et al. 2019). The complexity of hybrid methods is contingent upon network design, parameter size, and training process iterations. A summary is shown in Table 2.

Table 3 Evaluation methods for generating mobility trajectories


6 Evaluation metrics

A major problem with generated mobility trajectory data is that they are produced by model simulations and thus require validation. However, knowledge-driven and data-driven generation methods are validated for accuracy and effectiveness in different ways.

For knowledge-driven methods, most evaluations either rely on prior knowledge (e.g., real traffic conditions and navigation services data) or visualize the generated data to check whether it agrees with common sense. For instance, Dian Khumara et al. (2018), Kong et al. (2018), Pigné et al. (2011), Bedogni et al. (2015), Zhao et al. (2019a) and Raney et al. (2003) demonstrate the effectiveness of the generated dataset by comparing it with real traffic conditions. In addition, some works (Kong et al. 2018; Uppoor et al. 2014) visualize the generated data and analyze its plausibility. In some cases (Codeca et al. 2015), the generated data are used directly in practical scenarios, such as evaluating and testing network protocols. Kanaya et al. (2012) simulate a monitoring-based flow estimation system to validate the usefulness of the model.

For data-driven methods, the evaluation usually includes both qualitative and quantitative evaluation. The qualitative evaluation mainly presents the generated results through visual comparative analysis. The metrics used to quantitatively evaluate mobility trajectory generation models are listed in Table 3. First, for pedestrian trajectory data, the common error metrics used to quantitatively evaluate the accuracy of a generation model are the average displacement error (ADE) and the final displacement error (FDE).

(1) ADE (Gupta et al. 2018; Roy et al. 2019; Sadeghian et al. 2019): this metric is the average Euclidean distance between each generated position and the corresponding ground truth position over the generation horizon. In addition, the average non-linear displacement error (NL-ADE) restricts this calculation to the non-linear regions of a trajectory, i.e., the regions around the turning points that arise while pedestrians walk (Jiang et al. 2019). ADE is calculated as follows:
$$\text{ADE}=\frac{\sum _{j=1}^{N} \frac{\sum _{t=1}^{n} \sqrt{\left( {\hat{x}}_{t}^{j}-x_{t}^{j}\right) ^{2}+\left( {\hat{y}}_{t}^{j}-y_{t}^{j}\right) ^{2}}}{n}}{N}, \tag{5}$$
where N is the number of pedestrians, n is the number of generated time steps, $({\hat{x}}_{t}^{j}, {\hat{y}}_{t}^{j})$ are the generated coordinates of pedestrian j at time t, and $(x_{t}^{j}, y_{t}^{j})$ are the real position coordinates of pedestrian j at time t.
(2) FDE (Gupta et al. 2018; Roy et al. 2019; Sadeghian et al. 2019): this metric is the average Euclidean distance between the final generated positions and the corresponding ground truth positions. The calculation formula of this metric is as follows:
$$\text{FDE}=\frac{\sum _{j=1}^{N} \sqrt{\left( {\hat{x}}_{n}^{j}-x_{n}^{j}\right) ^{2}+\left( {\hat{y}}_{n}^{j}-y_{n}^{j}\right) ^{2}}}{N}, \tag{6}$$
where N is the number of pedestrians, $({\hat{x}}_{n}^{j}, {\hat{y}}_{n}^{j})$ are the generated coordinates of pedestrian j at the final time step n, and $(x_{n}^{j}, y_{n}^{j})$ are the real position coordinates of pedestrian j at time n.
(3) Jensen–Shannon Divergence (JSD) (Ouyang et al. 2018; Feng et al. 2020): JSD is a symmetric measure of the distance between two probability distributions P and Q. The smaller the JSD between the generated data distribution and the real-world data distribution, the better. The calculation formula of this metric is as follows:
$$\text{JSD}(P \Vert Q)=\frac{1}{2} {\mathbb {E}}_{P}\left[ \log \frac{P}{X}\right] +\frac{1}{2} {\mathbb {E}}_{Q}\left[ \log \frac{Q}{X}\right] , \tag{7}$$
where $X=\frac{1}{2}(P+Q)$.
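The three metrics above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the evaluation code of any surveyed work: the array layout `(N, n, 2)`, the function names, and the use of the natural logarithm and discrete histograms for JSD are all assumptions.

```python
import numpy as np


def ade(pred, truth):
    """Average displacement error for arrays of shape (N, n, 2):
    N pedestrians, n generated time steps, (x, y) coordinates."""
    dist = np.linalg.norm(pred - truth, axis=-1)   # (N, n) Euclidean distances
    return dist.mean(axis=1).mean()                # mean over time, then pedestrians


def fde(pred, truth):
    """Final displacement error: distance at the last time step only."""
    dist = np.linalg.norm(pred[:, -1] - truth[:, -1], axis=-1)   # (N,)
    return dist.mean()


def jsd(p, q):
    """Jensen-Shannon divergence between two discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    mix = 0.5 * (p + q)                            # the mixture X = (P + Q) / 2

    def kl(a, b):
        mask = a > 0                               # terms with a == 0 contribute 0
        return np.sum(a[mask] * np.log(a[mask] / b[mask]))

    return 0.5 * kl(p, mix) + 0.5 * kl(q, mix)


# Every generated point is offset by (3, 4) from the ground truth,
# so each displacement is exactly 5.
truth = np.zeros((2, 3, 2))
pred = truth + np.array([3.0, 4.0])
print(ade(pred, truth), fde(pred, truth))          # 5.0 5.0
print(jsd([0.2, 0.8], [0.2, 0.8]))                 # 0.0 for identical distributions
```

With the natural logarithm, JSD is bounded above by ln 2, which is reached when the two distributions have disjoint support, e.g. `jsd([1, 0], [0, 1])`.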

In addition, for vehicle trajectory data, the more common metrics are average accuracy (AA), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and Root-Mean-Squared Error (RMSE).
(4) AA (Zhao et al. 2021): it represents the average generation accuracy of the generated vehicle trajectory. The calculation formula of this metric is as follows:
$${\text{AA}}=\frac{1}{n} \sum _{i=1}^{n}\left( 1-\frac{|{\hat{y}}_{i}-y_{i}|}{K}\right) , \tag{8}$$
where ${y}_{i}$ is the real traffic value; ${\hat{y}}_{i}$ represents the predicted value of $y_{i}$; n represents the number of vehicles; and K represents a constant.
(5) MAE (Zhao et al. 2021; Li et al. 2018; Park et al. 2018): this metric represents the average of the absolute errors, which reflects the actual magnitude of the error of the generated values. The calculation formula of this metric is as follows:
$${\text{MAE}}=\frac{1}{n}\sum _{i=1}^{n}|{\hat{y}}_{i}-y_{i}|, \tag{9}$$
where ${y}_{i}$ is the real value; ${\hat{y}}_{i}$ is the predicted value of $y_{i}$; n represents the number of vehicles. The evaluation metric MAPE is a relative variant of MAE in which each absolute error is divided by the corresponding real value $y_{i}$.
(6) RMSE (Deo and Trivedi 2018; Khakzar et al. 2020; Zhang et al. 2020a; Wang et al. 2020c): this metric is the square root of the mean of the squared errors over the n generated values. RMSE is more sensitive to outliers than the other metrics. The calculation formula of this metric is as follows:
$${\text{RMSE}}=\left[ \frac{1}{n} \sum _{i=1}^{n}\left( {\hat{y}}_{i}-y_{i}\right) ^{2}\right] ^{\frac{1}{2}}, \tag{10}$$
where ${y}_{i}$ is the real value; ${\hat{y}}_{i}$ represents the predicted value of $y_{i}$; n represents the number of vehicles.
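As a rough illustration of how the vehicle-trajectory metrics behave, the sketch below implements AA, MAE, MAPE, and RMSE with NumPy and evaluates them on a small invented example. The function names, the sample values, and the choice K = 100 are assumptions made here; MAPE is returned as a fraction and assumes that no real value is zero.

```python
import numpy as np


def average_accuracy(pred, truth, k):
    """AA: mean of 1 - |error| / K, with K a user-chosen constant."""
    return np.mean(1.0 - np.abs(pred - truth) / k)


def mae(pred, truth):
    """Mean absolute error."""
    return np.mean(np.abs(pred - truth))


def mape(pred, truth):
    """Mean absolute percentage error as a fraction (truth must be non-zero)."""
    return np.mean(np.abs((pred - truth) / truth))


def rmse(pred, truth):
    """Root-mean-squared error."""
    return np.sqrt(np.mean((pred - truth) ** 2))


# Invented values, e.g. speeds in km/h; the last prediction is an outlier.
truth = np.array([10.0, 20.0, 40.0, 50.0])
pred = np.array([12.0, 18.0, 40.0, 60.0])

print(mae(pred, truth))                      # 3.5
print(mape(pred, truth))                     # 0.125
print(rmse(pred, truth))                     # about 5.196 (square root of 27)
print(average_accuracy(pred, truth, 100.0))  # 0.965
```

The single error of 10 pushes RMSE well above MAE, which is the outlier sensitivity noted for RMSE.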
Table 4 Open mobility trajectory datasets

7 Open mobility trajectory datasets and source code

In this section, we summarize the open datasets and code from existing research. We hope this section will help future researchers produce more valuable work in this domain.

7.1 Open datasets

We categorize the datasets into three types. The first type is road network data. As previously mentioned, road network data consist of points, lines, and planes. These data show the basic structure of a region and can be easily obtained from the Internet, for example from OpenStreetMap.^{Footnote 6} The second type is pedestrian and vehicle trajectory data. These data mainly include longitude and latitude information, and they can be matched with road network data. The third type is data generated by simulation tools. The key to generating these data is calculating regional traffic demand using domain knowledge.

The relevant open datasets are shown in Table 4.

(1) Geolife: this dataset includes timestamped points with latitude and longitude information collected from 182 users from April 2007 to August 2012. Al-Molegi et al. (2018) used it as a test dataset.

(2) ETH/UCY: these datasets contain thousands of trajectories from 1536 pedestrians, with abundant real-world interactions among pedestrians. ETH contains two subsets with different scenes (ETH-univ and ETH-hotel). UCY contains three subsets with different scenes (UCY-univ, UCY-zara01, and UCY-zara02).

(3) NGSIM: this dataset contains four trajectory subsets, namely US-101, I-80, Lankershim Boulevard, and Peachtree Street. The first two, which record vehicle trajectories on highways, are the most commonly used. The I-80 dataset was collected on Interstate 80 in Emeryville, California on April 13, 2005. The US-101 dataset was collected on US Highway 101 in Los Angeles, California on June 15, 2005.

(4) PeMS: this dataset is provided by the California Department of Transportation (Caltrans) and covers 2001 to 2019. It contains various traffic-relevant data, such as congestion.

(5) METR-LA: this dataset contains highway traffic information from Los Angeles County, collected by loop detectors. Li et al. (2018) used the time period from March 1st to June 30th, 2012.

(6) Cologne trace: this vehicle mobility dataset is provided by the Institute of Transportation Systems at the German Aerospace Center (ITS-DLR) as part of the TAPASCologne project. This dataset covers a region of 400 km$^{2}$ over a period of 24 h.

Table 5 Open source code


7.2 Open source code

Open source code not only helps researchers compare their results with other methods but also helps later researchers deepen their understanding by running and inspecting the implementations. Therefore, we provide the available hyperlinks to open source code in this paper (as shown in Table 5).

All the open source code is built on the PyTorch^{Footnote 7} framework. For the SocialLSTM model, the core part is the LSTM sequence network, and it can be trained on a single GPU. The SocialGAN model consists of three components: a generator, a max pooling module, and a discriminator. The code was developed on Ubuntu 16.04 with Python 3.5 and PyTorch 0.4. Conceptually, the SocialWay model improves on the SocialGAN model; for example, it replaces max pooling with attention pooling. The code of the CurbGAN and TrafficGAN models can be run on Ubuntu 16.04 with Python 3.6.7 and PyTorch 0.4.1.

8 Challenges and future opportunities

Mobility trajectory generation is very challenging because of the complicated spatio-temporal relationships in mobility trajectory data. In addition, evaluating the generated results is also an important aspect of mobility trajectory generation. In this section, based on a comprehensive survey of the knowledge-driven and data-driven approaches in mobility trajectory generation, we introduce four common challenges and their corresponding solutions and compare the two paradigms.

8.1 Long-term mobility trajectory generation

As mentioned above, most existing work generates mobility trajectory data over a short-term range ($\le 30$ min). Although the knowledge-driven methods reviewed in Sect. 5.1 can generate large-scale and long-term mobility trajectory data, the fine-grained quality of these generated data is still worse than that of data generated by data-driven methods. Moreover, prior or external knowledge is hard to obtain in some scenarios.

As stated in Sect. 5.2, many data-driven methods learn the temporal correlations in data with RNN-based approaches. However, their time-consuming sequential computation and the gradient vanishing/explosion problems limit their capability to generate long-term sequences. Therefore, long-term temporal dependency learning is one of the most important challenges for mobility trajectory generation.

In future research, it is crucial to focus on the development of models capable of capturing global temporal dependencies. Currently, attention-based methods (Vaswani et al. 2017; Kitaev et al. 2020; Zhou et al. 2021) have proven effective in learning long and global temporal dependencies. Moreover, integrating finer-grained knowledge into data-driven methods can guide the model in learning long-term dependencies within mobility trajectory data (Karpatne et al. 2017).
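To illustrate why attention can capture global temporal dependencies, the sketch below implements plain scaled dot-product self-attention in NumPy over a toy trajectory embedding. It is a simplified illustration under stated assumptions: the learned query, key, and value projections and the multi-head structure of a full Transformer are omitted, and the sequence length and feature size are arbitrary.

```python
import numpy as np


def scaled_dot_product_attention(q, k, v):
    """q, k, v: arrays of shape (T, d) for a sequence of T time steps."""
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)                # (T, T) pairwise similarities
    scores -= scores.max(axis=-1, keepdims=True)   # stabilise the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


# Toy sequence: 6 time steps, each embedded in 4 dimensions (values are random).
rng = np.random.default_rng(0)
x = rng.normal(size=(6, 4))
out, weights = scaled_dot_product_attention(x, x, x)
```

Row t of `weights` tells how strongly time step t draws on every other step, so the first and last points of a sequence are connected in a single operation rather than through a chain of recurrent updates.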

8.2 Spatio-temporal interactions

In mobility trajectory data generation, the basic requirement is that a model learns spatio-temporal dependencies or correlations sufficiently. Knowledge-driven methods achieve impressive performance in macroscopic scenarios but are inferior to data-driven methods in microscopic data generation.

Although data-driven methods can learn more fine-grained spatio-temporal correlations, their sequential learning manner still limits their capability to learn spatio-temporal interactions. Guo et al. (2019) argued that different spatially correlated locations at different time slots have different impacts on a given region in the future.

Most existing methods model the spatio-temporal correlations separately in generating mobility trajectory data. To overcome this challenge, future work should investigate representing mobility-related data in a more structured mode, such as using graph representations (Ye et al. 2020; Sheng et al. 2022) and knowledge graph triplets (Wang et al. 2020a). These representations can explicitly enhance the model’s learning of spatial and temporal interactions.

8.3 Model limitations

We reviewed the mobility trajectory data generation work according to its modeling driving force. Knowledge-driven methods rely very little on data and perform well on macroscopic mobility trajectory data generation, e.g., area trajectory status generation (Kong et al. 2018). However, the accuracy of these methods cannot support fine-grained downstream application tasks. Data-driven methods depend largely on data and can obtain accurate mobility trajectory data generation results. However, missing data, privacy protection, or difficulty in data acquisition may limit the application of data-driven methods.

To tackle this challenge, future research should explore the integration of prior knowledge into data-driven approaches. Knowledge-assisted learning has garnered considerable interest in recent years due to its potential in reducing the complexity of learning and mitigating overfitting problems when dealing with limited data (Karpatne et al. 2017). An exemplary application of this approach is seen in COVID-GAN (Bao et al. 2020, 2022), which incorporates various factors such as population demographics, median income, epidemic conditions, and policy parameters into a generative adversarial network (GAN). This integration allows for the generation of accurate mobility data specifically tailored to the COVID-19 period.

8.4 Fixed representation

In mobility trajectory data generation, the popular way to represent data is the image-based representation. The map is divided into regular grids and the mobility trajectory data are mapped onto these grid cells. Then, a CNN is utilized to extract the features of these data. However, the spatial structure of mobility trajectories has been shown to be more complex than Euclidean space (Ye et al. 2020). This fixed representation of mobility trajectory data is therefore an obstacle to generating more accurate results.
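The image-based representation described above can be sketched as a simple rasterization step. The example below counts trajectory points per grid cell with `numpy.histogram2d`; the function name `rasterize`, the bounding box, and the grid size are hypothetical, and real pipelines typically add further channels (e.g., inflow/outflow or speed) before applying a CNN.

```python
import numpy as np


def rasterize(points, bounds, shape):
    """Count trajectory points per regular grid cell.

    points: (M, 2) array of (lon, lat) pairs
    bounds: (lon_min, lon_max, lat_min, lat_max) of the study area
    shape:  (rows, cols) of the grid; rows index latitude, cols index longitude
    """
    lon_min, lon_max, lat_min, lat_max = bounds
    rows, cols = shape
    grid, _, _ = np.histogram2d(
        points[:, 1], points[:, 0],
        bins=[rows, cols],
        range=[[lat_min, lat_max], [lon_min, lon_max]],
    )
    return grid  # points outside the bounding box are ignored


# Three illustrative points on a 2 x 4 grid covering lon 0..4 and lat 0..2.
points = np.array([[0.5, 0.5], [0.6, 0.4], [3.5, 1.5]])
grid = rasterize(points, bounds=(0.0, 4.0, 0.0, 2.0), shape=(2, 4))
print(grid[0, 0], grid[1, 3])   # 2.0 1.0
```

Two points that fall in the same cell become indistinguishable, and cells that are adjacent in the image may not be connected by any road, which is one way to see why a fixed Euclidean grid loses information about the underlying network structure.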

In future work, the exploration of graph-structured data learning continues to hold significant promise (Wu et al. 2021; Lv et al. 2023). Given the prevalence of graph structures in traffic data, integrating GNNs with deep learning models such as RNNs and GANs offers a means to capture non-Euclidean spatial dependencies and obtain more precise generation outcomes.

9 Conclusion

In this survey, we present a novel taxonomy for the literature of mobility trajectory generation. We categorize the methods into two different paradigms: knowledge-driven and data-driven. Subsequently, we provide clear definitions of trajectory generation and analyze three common application scenarios of mobility trajectory generation. A detailed introduction of the fundamentals, including theories, tools, and techniques commonly used in mobility trajectory generation, is given. We elaborate on the knowledge-driven and data-driven methods according to the scenarios and fundamentals we introduced. Evaluation metrics, public datasets, and open-source code are also introduced in this survey. Four open challenges and future research directions are presented based on the work we surveyed. We anticipate that this survey paper will help readers comprehend the fundamental concepts, application scenarios, relevant theories, and techniques in the area of mobility trajectory generation, thereby providing valuable insights.

Notes

References

Alahi A, Goel K, Ramanathan V et al (2016) Social LSTM: human trajectory prediction in crowded spaces. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR), 2016, pp 961–971. https://doi.org/10.1109/CVPR.2016.110
Al-Molegi A, Jabreel M, Martínez-Ballesté A (2018) Move, attend and predict: an attention-based neural model for people’s movement prediction. Pattern Recognit Lett 112:34–40. https://doi.org/10.1016/j.patrec.2018.05.015
Amirian J, Hayet JB, Pettré J (2019) Social ways: learning multi-modal distributions of pedestrian trajectories with GANs. In: 2019 IEEE/CVF conference on computer vision and pattern recognition workshops (CVPRW), 2019, pp 2964–2972. https://doi.org/10.1109/CVPRW.2019.00359
Arjovsky M, Chintala S, Bottou L (2017) Wasserstein generative adversarial networks. In: Proceedings of the 34th international conference on machine learning, 2017, pp 214–223
Aycin M, Benekohal R (1999) Comparison of car-following models for simulation. Transp Res Rec 1678(1):116–127
Bao H, Zhou X, Zhang Y et al (2020) COVID-GAN: estimating human mobility responses to COVID-19 pandemic through spatio-temporal conditional generative adversarial networks. In: Proceedings of the 28th international conference on advances in geographic information systems, 2020, pp 273–282. https://doi.org/10.1145/3397536.3422261
Bao H, Zhou X, Xie Y et al (2022) COVID-GAN+: estimating human mobility responses to COVID-19 through spatio-temporal generative adversarial networks with enhanced features. ACM Trans Intell Syst Technol. https://doi.org/10.1145/3481617
Bedogni L, Gramaglia M, Vesco A et al (2015) The Bologna Ringway Dataset: improving road network conversion in SUMO and validating urban mobility via navigation services. IEEE Trans Veh Technol 64(12):5464–5476. https://doi.org/10.1109/TVT.2015.2475608
Behrisch M, Bieker L, Erdmann J et al (2011) SUMO—simulation of urban mobility: an overview. In: Proceedings of SIMUL 2011, the third international conference on advances in system simulation, 2011
Benko Loknar M, Klančar G, Blažič S (2023) Minimum-time trajectory generation for wheeled mobile systems using Bézier curves with constraints on velocity, acceleration and jerk. Sensors. https://doi.org/10.3390/s23041982
Bhujel N, Teoh EK, Yau WY (2019) Pedestrian trajectory prediction using RNN encoder–decoder with spatio-temporal attentions. In: 2019 IEEE 5th international conference on mechatronics system and robots (ICMSR), 2019, pp 110–114. https://doi.org/10.1109/ICMSR.2019.8835478
Brockfeld E, Barlovic R, Schadschneider A et al (2001) Optimizing traffic lights in a cellular automaton model for city traffic. Phys Rev E 64:056132. https://doi.org/10.1103/PhysRevE.64.056132
Cetin N, Burri A, Nagel K (2003) A large-scale agent-based traffic microsimulation based on queue model. In: Proceedings of Swiss transport research conference (STRC), Monte Verita, CH, 2003, p 3-4272
Chandra R, Bhattacharya U, Bera A et al (2019) TRAPHIC: trajectory prediction in dense and heterogeneous traffic using weighted interactions. In: 2019 IEEE/CVF conference on computer vision and pattern recognition (CVPR), 2019, pp 8475–8484. https://doi.org/10.1109/CVPR.2019.00868
Chen C, Li K, Teo SG et al (2020) Citywide traffic flow prediction based on multiple gated spatio-temporal convolutional neural networks. ACM Trans Knowl Discov Data. https://doi.org/10.1145/3385414
Choi S, Kim J, Yeo H (2019) Attention-based recurrent neural network for urban vehicle trajectory prediction. Procedia Comput Sci 151:327–334. https://doi.org/10.1016/j.procs.2019.04.046
Codeca L, Frank R, Engel T (2015) Luxembourg SUMO traffic (LUST) scenario: 24 hours of mobility for vehicular networking research. In: 2015 IEEE vehicular networking conference (VNC), 2015, pp 1–8. https://doi.org/10.1109/VNC.2015.7385539
Dai S, Li L, Li Z (2019) Modeling vehicle interactions via modified LSTM models for trajectory prediction. IEEE Access 7:38287–38296. https://doi.org/10.1109/ACCESS.2019.2907000
Deo N, Trivedi MM (2018) Convolutional social pooling for vehicle trajectory prediction. In: 2018 IEEE/CVF conference on computer vision and pattern recognition workshops (CVPRW), 2018, pp 1468–1476. https://doi.org/10.1109/CVPRW.2018.00196
Dian Khumara MA, Fauziyyah L, Kristalina P (2018) Estimation of urban traffic state using simulation of urban mobility (SUMO) to optimize intelligent transport system in smart city. In: 2018 International electronics symposium on engineering technology and applications (IES-ETA), 2018, pp 163–169. https://doi.org/10.1109/ELECSYM.2018.8615508
Fagiolo G (2010) The international-trade network: gravity equations and topological properties. J Econ Interact Coord 5(1):1–25. https://doi.org/10.1007/s11403-010-0061-y
Fan C, Jiang X, Mostafavi A (2021) Evaluating crisis perturbations on urban mobility using adaptive reinforcement learning. Sustain Cities Soc 75:103367. https://doi.org/10.1016/j.scs.2021.103367
Fellendorf M, Vortisch P (2010) Microscopic traffic flow simulator VISSIM, pp 63–93. https://doi.org/10.1007/978-1-4419-6142-6_2
Feng J, Li Y, Zhang C et al (2018) Deepmove: predicting human mobility with attentional recurrent networks. In: Proceedings of the 2018 World Wide Web conference. International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE, WWW ’18, 2018, pp 1459–1468. https://doi.org/10.1145/3178876.3186058
Feng J, Yang Z, Xu F et al (2020) Learning to simulate human mobility. In: Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery and data mining, 2020, pp 3426–3433. https://doi.org/10.1145/3394486.3412862
Fernando T, Denman S, Sridharan S et al (2018) Soft + hardwired attention: an LSTM framework for human trajectory prediction and abnormal event detection. Neural Netw 108:466–478. https://doi.org/10.1016/j.neunet.2018.09.002
Ferreira M, Conceição H, Fernandes R et al (2009) Stereoscopic aerial photography: an alternative to model-based urban mobility approaches. In: Proceedings of the sixth ACM international workshop on VehiculAr InterNETworking, 2009, pp 53–62. https://doi.org/10.1145/1614269.1614279
Gao J, Sun C, Zhao H et al (2020a) VectorNet: encoding HD maps and agent dynamics from vectorized representation. In: 2020 IEEE/CVF conference on computer vision and pattern recognition (CVPR), 2020, pp 11522–11530. https://doi.org/10.1109/CVPR42600.2020.01154
Gao N, Xue H, Shao W et al (2020b) Generative adversarial networks for spatio-temporal data: a survey. arXiv e-prints. arXiv:2008.08903. https://arxiv.org/abs/arXiv:2008.08903
Gawron C (1998) An iterative algorithm to determine the dynamic user equilibrium in a traffic simulation model. Int J Mod Phys C 09(03):393–407. https://doi.org/10.1142/S0129183198000303
Gers FA, Schmidhuber J, Cummins F (2000) Learning to forget: continual prediction with LSTM. Neural Comput 12(10):2451–2471. https://doi.org/10.1162/089976600300015015
Goodfellow IJ, Pouget-Abadie J, Mirza M et al (2014) Generative adversarial nets. In: Proceedings of the 27th international conference on neural information processing systems, 2014, pp 2672–2680
Gramaglia M, Trullols-Cruces O, Naboulsi D et al (2016) Mobility and connectivity in highway vehicular networks. Comput Commun 78(C):28–44. https://doi.org/10.1016/j.comcom.2015.10.014
Gulrajani I, Ahmed F, Arjovsky M et al (2017) Improved training of Wasserstein GANs. In: NIPS’17, 2017. Curran Associates Inc., Red Hook, pp 5769–5779
Guo S, Lin Y, Feng N et al (2019) Attention based spatial–temporal graph convolutional networks for traffic flow forecasting. In: Proceedings of the AAAI conference on artificial intelligence, 2019, vol 33(01), pp 922–929. https://doi.org/10.1609/aaai.v33i01.3301922
Gupta A, Johnson J, Fei-Fei L et al (2018) Social GAN: socially acceptable trajectories with generative adversarial networks. In: 2018 IEEE/CVF conference on computer vision and pattern recognition (CVPR), 2018, pp 2255–2264. https://doi.org/10.1109/CVPR.2018.00240
Gursoy ME, Liu L, Truex S et al (2019) Differentially private and utility preserving publication of trajectory data. IEEE Trans Mob Comput 18(10):2315–2329. https://doi.org/10.1109/TMC.2018.2874008
Haddad S, Wu M, Wei H et al (2019) Situation-aware pedestrian trajectory prediction with spatio-temporal attention model. CoRR abs/1902.05437. https://arxiv.org/abs/1902.05437
Halim Z, Kalsoom R, Bashir S et al (2016) Artificial intelligence techniques for driving safety and vehicle crash prediction. Artif Intell Rev 46:351–387. https://doi.org/10.1007/s10462-016-9467-9
Halim Z, Khan A, Sulaiman M et al (2022) On finding optimum commuting path in a road network: a computational approach for smart city traveling. Trans Emerg Telecommun Technol. https://doi.org/10.1002/ett.3786
Han X, Shen G, Yang X et al (2020) Congestion recognition for hybrid urban road systems via digraph convolutional network. Transp Res C 121:102877. https://doi.org/10.1016/j.trc.2020.102877
Harri J, Filali F, Bonnet C (2009) Mobility models for vehicular ad hoc networks: a survey and taxonomy. IEEE Commun Surv Tutor 11(4):19–41. https://doi.org/10.1109/SURV.2009.090403
He T, Bao J, Li R et al (2020) What is the human mobility in a new city: transfer mobility knowledge across cities. In: Proceedings of the web conference, 2020, pp 1355–1365. https://doi.org/10.1145/3366423.3380210
Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735
Hou M, Tang T, Xia F et al (2023) MISSII: missing information imputation for traffic data. IEEE Trans Emerg Top Comput. https://doi.org/10.1109/TETC.2023.3280481
Hu H, Lin Z, Hu Q et al (2023) Multi-source information fusion based DLAAS for traffic flow prediction. IEEE Trans Comput. https://doi.org/10.1109/TC.2023.3236902
Huang Y, Bi H, Li Z et al (2019) STGAT: modeling spatial–temporal interactions for human trajectory prediction. In: 2019 IEEE/CVF international conference on computer vision (ICCV), 2019, pp 6271–6280. https://doi.org/10.1109/ICCV.2019.00637
Jiang X, Lin W, Liu J (2019) A method of pedestrian trajectory prediction based on LSTM. In: Proceedings of the 2019 2nd international conference on computational intelligence and intelligent systems, 2019, pp 79–84. https://doi.org/10.1145/3372422.3372428
Jung WS, Wang F, Stanley HE (2008) Gravity model in the Korean highway. Europhys Lett 81(4):48005. https://doi.org/10.1209/0295-5075/81/48005
Kanaya T, Hiromori A, Yamaguchi H et al (2012) Humans: a human mobility sensing simulator. In: 2012 5th International conference on new technologies, mobility and security (NTMS), 2012, pp 1–4. https://doi.org/10.1109/NTMS.2012.6208740
Kang X, Liu L, Zhao D et al (2021) TRAG: a trajectory generation technique for simulating urban crowd mobility. IEEE Trans Ind Inform 17(2):820–829. https://doi.org/10.1109/TII.2020.2976777
Karimzadeh M, Aebi R, de Souza AM et al (2021) Reinforcement learning-designed LSTM for trajectory and traffic flow prediction. In: 2021 IEEE wireless communications and networking conference (WCNC), 2021, pp 1–6. https://doi.org/10.1109/WCNC49053.2021.9417511
Karpatne A, Atluri G, Faghmous JH et al (2017) Theory-guided data science: a new paradigm for scientific discovery from data. IEEE Trans Knowl Data Eng 29(10):2318–2331. https://doi.org/10.1109/TKDE.2017.2720168
Khakzar M, Rakotonirainy A, Bond A et al (2020) A dual learning model for vehicle trajectory prediction. IEEE Access 8:21897–21908. https://doi.org/10.1109/ACCESS.2020.2968618
Kitaev N, Kaiser L, Levskaya A (2020) Reformer: the efficient transformer. In: International conference on learning representations, 2020
Kong D, Wu F (2018) HST-LSTM: a hierarchical spatial-temporal long-short term memory network for location prediction. In: Proceedings of the 27th international joint conference on artificial intelligence. IJCAI’18, 2018. AAAI Press, pp 2341–2347
Kong X, Xia F, Wang J et al (2017) Time–location-relationship combined service recommendation based on taxi trajectory data. IEEE Trans Ind Inform 13(3):1202–1212. https://doi.org/10.1109/TII.2017.2684163
Kong X, Li M, Ma K et al (2018a) Big trajectory data: a survey of applications and services. IEEE Access 6:58295–58306
Kong X, Xia F, Ning Z et al (2018b) Mobility dataset generation for vehicular social networks based on floating car data. IEEE Trans Veh Technol 67(5):3874–3886. https://doi.org/10.1109/TVT.2017.2788441
Kong X, Chen Q, Hou M et al (2022a) RMGEN: a tri-layer vehicular trajectory data generation model exploring urban region division and mobility pattern. IEEE Trans Veh Technol. https://doi.org/10.1109/TVT.2022.3176243
Kong X, Zhu B, Shen G et al (2022b) Spatial–temporal-cost combination based taxi driving fraud detection for collaborative Internet of Vehicles. IEEE Trans Ind Inform 18(5):3426–3436
Kosaraju V, Sadeghian A, Martín-Martín R et al (2019) Social-BiGAT: multimodal trajectory forecasting using bicycle-GAN and graph attention networks. In: Advances in neural information processing systems, 2019
Krajzewicz D, Hertkorn G, Rössel C et al (2002) SUMO (simulation of urban mobility)—an open-source traffic simulation. In: 4th Middle East symposium on simulation and modelling, 2002, pp 183–187
Krizhevsky A, Sutskever I, Hinton GE (2017) ImageNet classification with deep convolutional neural networks. Commun ACM 60(6):84–90. https://doi.org/10.1145/3065386
Li B, Mostafavi A (2022) Location intelligence reveals the extent, timing, and spatial variation of hurricane preparedness. Sci Rep 12(1):16121
Li Y, Yu R, Shahabi C et al (2018) Diffusion convolutional recurrent neural network: data-driven traffic forecasting. In: International conference on learning representations, 2018
Li J, Ma H, Zhan W et al (2019) Coordination and trajectory prediction for vehicle interactions via Bayesian generative modeling. In: 2019 IEEE intelligent vehicles symposium (IV), 2019, pp 2496–2503. https://doi.org/10.1109/IVS.2019.8813821
Liebner M, Baumann M, Klanner F et al (2012) Driver intent inference at urban intersections using the intelligent driver model. In: 2012 IEEE intelligent vehicles symposium, 2012, pp 1162–1167. https://doi.org/10.1109/IVS.2012.6232131
Lim KG, Lee CH, Chin RKY et al (2017) SUMO enhancement for vehicular ad hoc network (VANET) simulation. In: 2017 IEEE 2nd international conference on automatic control and intelligent systems (I2CACIS), 2017, pp 86–91. https://doi.org/10.1109/I2CACIS.2017.8239038
Lisotto M, Coscia P, Ballan L (2019) Social and scene-aware trajectory prediction in crowded spaces. In: 2019 IEEE/CVF international conference on computer vision workshop (ICCVW), 2019, pp 2567–2574. https://doi.org/10.1109/ICCVW.2019.00314
Liu S, Liu H, Bi H et al (2020) Col-GAN: plausible and collision-less trajectory prediction by attention-based GAN. IEEE Access 8:101662–101671. https://doi.org/10.1109/ACCESS.2020.2987072
Pappalardo L, Simini F (2018) Data-driven generation of spatio-temporal routines in human mobility. Data Min Knowl Discov 32(3):787–829. https://doi.org/10.1007/s10618-017-0548-4
Lv K, Yuan L (2023) SKGACN: social knowledge-guided graph attention convolutional network for human trajectory prediction. IEEE Trans Instrum Meas 72:1–11. https://doi.org/10.1109/TIM.2023.3283544
Lv J, Li Q, Sun Q et al (2018) T-Conv: a convolutional neural network for multi-scale taxi trajectory prediction. In: 2018 IEEE international conference on big data and smart computing (BigComp), 2018, pp 82–89. https://doi.org/10.1109/BigComp.2018.00021
Lv P, Wang W, Wang Y et al (2023) SSAGCN: social soft attention graph convolution network for pedestrian trajectory prediction. IEEE Trans Neural Netw Learn Syst. https://doi.org/10.1109/TNNLS.2023.3250485
Ma Y, Zhu X, Zhang S et al (2019) TrafficPredict: trajectory prediction for heterogeneous traffic-agents. In: Proceedings of the AAAI conference on artificial intelligence, 2019, vol 33(01), pp 6120–6127. https://doi.org/10.1609/aaai.v33i01.33016120
McNally MG (2007) The four-step model. In: Handbook of transport modelling, pp 35–53. https://doi.org/10.1108/9780857245670-003
Messaoud K, Yahiaoui I, Verroust-Blondet A et al (2019) Relational recurrent neural networks for vehicle trajectory prediction. In: 2019 IEEE intelligent transportation systems conference (ITSC), 2019, pp 1813–1818. https://doi.org/10.1109/ITSC.2019.8916887
Mikolov T, Karafiát M, Burget L et al (2010) Recurrent neural network based language model. In: Eleventh annual conference of the International Speech Communication Association, 2010, pp 1045–1048
Mirza M, Osindero S (2014) Conditional generative adversarial nets. arXiv preprint. arXiv:1411.1784
Nagarhalli TP, Vaze V, Rana NK (2021) Impact of machine learning in natural language processing: a review. In: 2021 Third international conference on intelligent communication technologies and virtual mobile networks (ICICV), 2021, pp 1529–1534. https://doi.org/10.1109/ICICV50876.2021.9388380
Nikhil N, Morris BT (2019) Convolutional neural network for trajectory prediction. In: Computer vision—ECCV 2018 workshops, 2019, pp 186–196
Odlyzko A (2015) The forgotten discovery of gravity models and the inefficiency of early railway networks. Œcon Hist Method Philos 5–2:157–192. https://doi.org/10.2139/ssrn.2490241
Ouyang K, Shokri R, Rosenblum DS et al (2018) A non-parametric generative model for human trajectories. In: Proceedings of the 27th international joint conference on artificial intelligence, 2018, pp 3812–3817
Pan Z, Bao J, Zhang W et al (2019) TrajGuard: a comprehensive trajectory copyright protection scheme. In: Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’19, 2019. Association for Computing Machinery, New York, pp 3060–3070. https://doi.org/10.1145/3292500.3330685
Park SH, Kim B, Kang CM et al (2018) Sequence-to-sequence prediction of vehicle trajectory via LSTM encoder–decoder architecture. In: 2018 IEEE intelligent vehicles symposium (IV), 2018, pp 1672–1678. https://doi.org/10.1109/IVS.2018.8500658
Pfeiffer M, Paolo G, Sommer H et al (2018) A data-driven model for interaction-aware pedestrian motion prediction in object cluttered environments. In: 2018 IEEE international conference on robotics and automation (ICRA), 2018, pp 5921–5928. https://doi.org/10.1109/ICRA.2018.8461157
Pigné Y, Danoy G, Bouvry P (2011) A vehicular mobility model based on real traffic counting data. In: International workshop on communication technologies for vehicles, 2011. Springer, pp 131–142
Raney B, Cetin N, Völlmy A et al (2003) An agent-based microsimulation model of Swiss travel: first results. Netw Spat Econ 3(1):23–41. https://doi.org/10.1023/A:1022096916806
Ren H, Ruan S, Li Y et al (2021) MtrajRec: map-constrained trajectory recovery via seq2seq multi-task learning. In: Proceedings of the 27th ACM SIGKDD conference on knowledge discovery and data mining, KDD ’21, 2021. Association for Computing Machinery, New York, pp 1410–1419. https://doi.org/10.1145/3447548.3467238
Romero-Tris C, Megías D (2018) Protecting privacy in trajectories with a user-centric approach. ACM Trans Knowl Discov Data. https://doi.org/10.1145/3233185
Rong C, Feng J, Ding J (2023) GODDAG: generating origin–destination flow for new cities via domain adversarial training. IEEE Trans Knowl Data Eng. https://doi.org/10.1109/TKDE.2023.3268409
Roy JR, Thill JC (2003) Spatial interaction modelling. Pap Reg Sci 83(1):339–361
Roy D, Ishizaka T, Mohan CK et al (2019) Vehicle trajectory prediction at intersections using interaction based generative adversarial networks. In: 2019 IEEE intelligent transportation systems conference (ITSC), 2019, pp 2318–2323. https://doi.org/10.1109/ITSC.2019.8916927
Sadeghian A, Kosaraju V, Sadeghian A et al (2019) SoPhie: an attentive GAN for predicting paths compliant to social and physical constraints. In: 2019 IEEE/CVF conference on computer vision and pattern recognition (CVPR), 2019, pp 1349–1358. https://doi.org/10.1109/CVPR.2019.00144
Sheng Z, Xu Y, Xue S et al (2022) Graph-based spatial-temporal convolutional network for vehicle trajectory prediction in autonomous driving. IEEE Trans Intell Transp Syst. https://doi.org/10.1109/TITS.2022.3155749
Shin S, Jeon H, Cho C et al (2020) User mobility synthesis based on generative adversarial networks: a survey. In: 2020 22nd International conference on advanced communication technology (ICACT), 2020, pp 94–103. https://doi.org/10.23919/ICACT48636.2020.9061335
Si W, Wei T, Liu C (2019) AGEN: adaptable generative prediction networks for autonomous driving. In: 2019 IEEE intelligent vehicles symposium (IV), 2019, pp 281–286. https://doi.org/10.1109/IVS.2019.8814238
Slovic P, Fischhoff B, Lichtenstein S (1977) Behavioral decision theory. Annu Rev Psychol 28(1):1–39
Song HY, Baek MS, Sung M (2019) Generating human mobility route based on generative adversarial network. In: 2019 Federated conference on computer science and information systems (FedCSIS), 2019, pp 91–99. https://doi.org/10.15439/2019F320
Stouffer SA (1940) Intervening opportunities: a theory relating mobility and distance. Am Sociol Rev 5(6):845–867
Su B, Chang H, Chen YZ et al (2007) A game theory model of urban public traffic networks. Physica A 379(1):291–297. https://doi.org/10.1016/j.physa.2006.12.049
Thakur GS, Hui P, Helmy A (2012) Modeling and characterization of urban vehicular mobility using web cameras. In: 2012 Proceedings IEEE INFOCOM workshops, 2012, pp 262–267. https://doi.org/10.1109/INFCOMW.2012.6193503
Tobler W (1995) Migration: Ravenstein, Thornthwaite, and beyond. Urban Geogr 16(4):327–343. https://doi.org/10.2747/0272-3638.16.4.327
Uppoor S, Trullols-Cruces O, Fiore M et al (2014) Generation and analysis of a large-scale urban vehicular mobility dataset. IEEE Trans Mob Comput 13(5):1061–1075. https://doi.org/10.1109/TMC.2013.27
Vaswani A, Shazeer N, Parmar N et al (2017) Attention is all you need. In: Proceedings of the 31st international conference on neural information processing systems, 2017, pp 6000–6010
Vemula A, Muelling K, Oh J (2018) Social attention: modeling attention in human crowds. In: 2018 IEEE international conference on robotics and automation (ICRA), 2018, pp 4601–4607. https://doi.org/10.1109/ICRA.2018.8460504
Viboud C, Bjørnstad ON, Smith DL et al (2006) Synchrony, waves, and spatial hierarchies in the spread of influenza. Science 312(5772):447–451. https://doi.org/10.1126/science.1125237
Vishnu C, Abhinav V, Roy D et al (2023) Improving multi-agent trajectory prediction using traffic states on interactive driving scenarios. IEEE Robot Autom Lett 8(5):2708–2715. https://doi.org/10.1109/LRA.2023.3258685
Wang P, Fu Y, Liu G et al (2017) Human mobility synchronization and trip purpose detection with mixture of Hawkes processes. In: Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’17, 2017. Association for Computing Machinery, New York, pp 495–503. https://doi.org/10.1145/3097983.3098067
Wang J, Kong X, Xia F et al (2019) Urban human mobility: data-driven modeling and prediction. ACM SIGKDD Explor Newslett 21(1):1–19
Wang P, Liu K, Jiang L et al (2020a) Incremental mobile user profiling: reinforcement learning with spatial knowledge graph for modeling event streams. In: Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery and data mining, 2020, pp 853–861
Wang W, Xia F, Nie H et al (2020b) Vehicle trajectory clustering based on dynamic representation learning of Internet of Vehicles. IEEE Trans Intell Transp Syst 22(6):3567–3576
Wang Y, Zhao S, Zhang R et al (2020c) Multi-vehicle collaborative learning for trajectory prediction with spatio-temporal tensor fusion. IEEE Trans Intell Transp Syst. https://doi.org/10.1109/TITS.2020.3009762
Waqas M, Tu S, Halim Z et al (2020) Authentication of vehicles and road side units in intelligent transportation system. Comput Mater Contin 64(1):359–371. https://doi.org/10.32604/cmc.2020.09821
Wohlin C (2014) Guidelines for snowballing in systematic literature studies and a replication in software engineering. In: Proceedings of the 18th international conference on evaluation and assessment in software engineering, EASE ’14, 2014. Association for Computing Machinery, New York. https://doi.org/10.1145/2601248.2601268
Wu C, Chen L, Wang G et al (2020) Spatiotemporal scenario generation of traffic flow based on LSTM-GAN. IEEE Access 8:186191–186198. https://doi.org/10.1109/ACCESS.2020.3029230
Wu Z, Pan S, Chen F et al (2021) A comprehensive survey on graph neural networks. IEEE Trans Neural Netw Learn Syst 32(1):4–24. https://doi.org/10.1109/TNNLS.2020.2978386
Xia F, Rahim A, Kong X et al (2017) Modeling and analysis of large-scale urban mobility for green transportation. IEEE Trans Ind Inform 14(4):1469–1481
Xue H, Huynh DQ, Reynolds M (2017) Bi-Prediction: pedestrian trajectory prediction based on bidirectional LSTM classification. In: 2017 International conference on digital image computing: techniques and applications (DICTA), 2017, pp 1–8. https://doi.org/10.1109/DICTA.2017.8227412
Xue H, Huynh DQ, Reynolds M (2018) SS-LSTM: a hierarchical LSTM model for pedestrian trajectory prediction. In: 2018 IEEE winter conference on applications of computer vision (WACV), 2018, pp 1186–1194. https://doi.org/10.1109/WACV.2018.00135
Xue H, Huynh DQ, Reynolds M (2020) A location–velocity-temporal attention LSTM model for pedestrian trajectory prediction. IEEE Access 8:44576–44589. https://doi.org/10.1109/ACCESS.2020.2977747
Yan X (2017) Advances in modeling spatial interaction network. Sci Technol Rev 35(14):15–22
Yan XY, Zhou T (2019) Destination choice game: a spatial interaction theory on human mobility. Sci Rep 9(1):9466. https://doi.org/10.1038/s41598-019-46026-w
Yan XY, Zhao C, Fan Y et al (2014) Universal predictability of mobility patterns in cities. J R Soc Interface 11(100):20140834. https://doi.org/10.1098/rsif.2014.0834
Yan XY, Wang WX, Gao ZY et al (2017) Universal model of individual and population mobility on diverse spatial scales. Nat Commun 8(1):1–9
Yao Z, Fu Y, Liu B et al (2018) Representing urban functions through zone embedding with human mobility patterns. In: Proceedings of the 27th international joint conference on artificial intelligence, 2018, pp 3919–3925
Ye J, Zhao J, Ye K et al (2020) How to build a graph-based deep learning architecture in traffic domain: a survey. IEEE Trans Intell Transp Syst. https://doi.org/10.1109/TITS.2020.3043250
Zang T, Zhu Y, Xu Y et al (2021) Jointly modeling spatio-temporal dependencies and daily flow correlations for crowd flow prediction. ACM Trans Knowl Discov Data. https://doi.org/10.1145/3439346
Zhang P, Ouyang W, Zhang P et al (2019) SR-LSTM: state refinement for LSTM towards pedestrian trajectory prediction. In: 2019 IEEE/CVF conference on computer vision and pattern recognition (CVPR), 2019, pp 12077–12086. https://doi.org/10.1109/CVPR.2019.01236
Zhang Y, Li Y, Zhou X et al (2020a) Curb-GAN: conditional urban traffic estimation through spatio-temporal generative adversarial networks. In: Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery and data mining, 2020, pp 842–852. https://doi.org/10.1145/3394486.3403127
Zhang Y, Li Y, Zhou X et al (2020b) Off-deployment traffic estimation—a traffic generative adversarial networks approach. IEEE Trans Big Data. https://doi.org/10.1109/TBDATA.2020.3014511
Zhao L, Liu Y, Al-Dubai A et al (2019a) A learning-based vehicle-trajectory generation method for vehicular networking. In: 2019 IEEE 21st international conference on high performance computing and communications; IEEE 17th international conference on smart city; IEEE 5th international conference on data science and systems (HPCC/SmartCity/DSS), 2019, pp 519–526. https://doi.org/10.1109/HPCC/SmartCity/DSS.2019.00082
Zhao T, Xu Y, Monfort M et al (2019b) Multi-agent tensor fusion for contextual trajectory prediction. In: 2019 IEEE/CVF conference on computer vision and pattern recognition (CVPR), 2019, pp 12118–12126. https://doi.org/10.1109/CVPR.2019.01240
Zhao L, Gao Y, Ye J et al (2021a) Spatio-temporal event forecasting using incremental multi-source feature learning. ACM Trans Knowl Discov Data. https://doi.org/10.1145/3464976
Zhao L, Liu Y, Al-Dubai AY et al (2021b) A novel generation-adversarial-network-based vehicle trajectory prediction method for intelligent vehicular networks. IEEE Internet Things J 8(3):2066–2077. https://doi.org/10.1109/JIOT.2020.3021141
Zhou H, Zhang S, Peng J et al (2021) Informer: beyond efficient transformer for long sequence time-series forecasting. In: Proceedings of the AAAI conference on artificial intelligence, 2021, vol 35(12), pp 11106–11115

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grant 62072409, and in part by the Zhejiang Provincial Natural Science Foundation under Grant LR21F020003.

Funding

Open Access funding enabled and organized by CAUL and its Member Institutions.

Author information

Authors and Affiliations

School of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, 310023, China
Xiangjie Kong
School of Software, Dalian University of Technology, Dalian, 116620, China
Qiao Chen, Mingliang Hou & Hui Wang
School of Computing Technologies, RMIT University, Melbourne, 3000, Australia
Feng Xia

Authors

Xiangjie Kong
Qiao Chen
Mingliang Hou
Hui Wang
Feng Xia

Corresponding author

Correspondence to Feng Xia.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Kong, X., Chen, Q., Hou, M. et al. Mobility trajectory generation: a survey. Artif Intell Rev 56 (Suppl 3), 3057–3098 (2023). https://doi.org/10.1007/s10462-023-10598-x

Published: 24 September 2023
Issue Date: December 2023
DOI: https://doi.org/10.1007/s10462-023-10598-x
