Understanding how congestion at one location can cause ripples throughout large-scale transportation network is vital for transportation researchers and practitioners to pinpoint traffic bottlenecks for congestion mitigation. Traditional studies rely on either mathematical equations or simulation techniques to model traffic congestion dynamics. However, most of the approaches have limitations, largely due to unrealistic assumptions and cumbersome parameter calibration process. With the development of Intelligent Transportation Systems (ITS) and Internet of Things (IoT), transportation data become more and more ubiquitous. This triggers a series of data-driven research to investigate transportation phenomena. Among them, deep learning theory is considered one of the most promising techniques to tackle tremendous high-dimensional data. This study attempts to extend deep learning theory into large-scale transportation network analysis. A deep Restricted Boltzmann Machine and Recurrent Neural Network architecture is utilized to model and predict traffic congestion evolution based on Global Positioning System (GPS) data from taxi. A numerical study in Ningbo, China is conducted to validate the effectiveness and efficiency of the proposed method. Results show that the prediction accuracy can achieve as high as 88% within less than 6 minutes when the model is implemented in a Graphic Processing Unit (GPU)-based parallel computing environment. The predicted congestion evolution patterns can be visualized temporally and spatially through a map-based platform to identify the vulnerable links for proactive congestion mitigation.