Shigang Li 0002
Person information
- affiliation: Chinese Academy of Sciences, Institute of Computing Technology, Beijing, China
- affiliation (PhD 2014): University of Science and Technology Beijing, China
Other persons with the same name
- Shigang Li — disambiguation page
- Shigang Li 0001 — Hiroshima City University, Graduate School of Information Sciences, Japan (and 4 more)
2020 – today
- 2024
- [j17]Jinfan Chen, Shigang Li, Ran Guo, Jinhui Yuan, Torsten Hoefler:
AutoDDL: Automatic Distributed Deep Learning With Near-Optimal Bandwidth Cost. IEEE Trans. Parallel Distributed Syst. 35(8): 1331-1344 (2024)
- [c33]Nils Blach, Maciej Besta, Daniele De Sensi, Jens Domke, Hussein Harake, Shigang Li, Patrick Iff, Marek Konieczny, Kartik Lakhotia, Ales Kubicek, Marcel Ferrari, Fabrizio Petrini, Torsten Hoefler:
A High-Performance Design, Implementation, Deployment, and Evaluation of The Slim Fly Network. NSDI 2024
- [c32]Shunde Li, Junyu Gu, Jue Wang, Tiechui Yao, Zhiqiang Liang, Yumeng Shi, Shigang Li, Weiting Xi, Shushen Li, Chunbao Zhou, Yangang Wang, Xuebin Chi:
POSTER: ParGNN: Efficient Training for Large-Scale Graph Neural Network on GPU Clusters. PPoPP 2024: 469-471
- 2023
- [j16]Hang Cao, Liang Yuan, He Zhang, Yunquan Zhang, Baodong Wu, Kun Li, Shigang Li, Minghua Zhang, Pengqi Lu, Junmin Xiao:
AGCM-3DLF: Accelerating Atmospheric General Circulation Model via 3-D Parallelization and Leap-Format. IEEE Trans. Parallel Distributed Syst. 34(3): 766-780 (2023)
- [c31]Daning Cheng, Shigang Li, Yunquan Zhang:
Asynch-SGBDT: Train Stochastic Gradient Boosting Decision Trees in an Asynchronous Parallel Manner. IPDPS 2023: 256-267
- [c30]Kazuki Osawa, Shigang Li, Torsten Hoefler:
PipeFisher: Efficient Training of Large Language Models Using Pipelining and Fisher Information Matrices. MLSys 2023
- [c29]Kehao Lin, Chunbao Zhou, Yan Zeng, Ningming Nie, Jue Wang, Shigang Li, Yangde Feng, Yangang Wang, Kehan Yao, Tiechui Yao, Jilin Zhang, Jian Wan:
A Scalable Hybrid Total FETI Method for Massively Parallel FEM Simulations. PPoPP 2023: 135-147
- [c28]Yumeng Shi, Ningming Nie, Jue Wang, Kehao Lin, Chunbao Zhou, Shigang Li, Kehan Yao, Shunde Li, Yangde Feng, Yan Zeng, Fang Liu, Yangang Wang, Yue Gao:
Large-Scale Simulation of Structural Dynamics Computing on GPU Clusters. SC 2023: 11:1-11:14
- [c27]Shunde Li, Zongguo Wang, Lingkun Bu, Jue Wang, Zhikuang Xin, Shigang Li, Yangang Wang, Yangde Feng, Peng Shi, Yun Hu, Xuebin Chi:
ANT-MOC: Scalable Neutral Particle Transport Using 3D Method of Characteristics on Multi-GPU Systems. SC 2023: 12:1-12:13
- [c26]Wenqi Jiang, Shigang Li, Yu Zhu, Johannes de Fine Licht, Zhenhao He, Runbin Shi, Cédric Renggli, Shuai Zhang, Theodoros Rekatsinas, Torsten Hoefler, Gustavo Alonso:
Co-design Hardware and Algorithm for Vector Search. SC 2023: 87:1-87:15
- [i21]Jinfan Chen, Shigang Li, Ran Guo, Jinhui Yuan, Torsten Hoefler:
AutoDDL: Automatic Distributed Deep Learning with Asymptotically Optimal Communication. CoRR abs/2301.06813 (2023)
- [i20]Kazuki Osawa, Satoki Ishikawa, Rio Yokota, Shigang Li, Torsten Hoefler:
ASDL: A Unified Interface for Gradient Preconditioning in PyTorch. CoRR abs/2305.04684 (2023)
- [i19]Wenqi Jiang, Shigang Li, Yu Zhu, Johannes de Fine Licht, Zhenhao He, Runbin Shi, Cédric Renggli, Shuai Zhang, Theodoros Rekatsinas, Torsten Hoefler, Gustavo Alonso:
Co-design Hardware and Algorithm for Vector Search. CoRR abs/2306.11182 (2023)
- [i18]Nils Blach, Maciej Besta, Daniele De Sensi, Jens Domke, Hussein Harake, Shigang Li, Patrick Iff, Marek Konieczny, Kartik Lakhotia, Ales Kubicek, Marcel Ferrari, Fabrizio Petrini, Torsten Hoefler:
A High-Performance Design, Implementation, Deployment, and Evaluation of The Slim Fly Network. CoRR abs/2310.03742 (2023)
- 2022
- [j15]Tiechui Yao, Jue Wang, Meng Wan, Zhikuang Xin, Yangang Wang, Rongqiang Cao, Shigang Li, Xuebin Chi:
VenusAI: An artificial intelligence platform for scientific discovery on supercomputers. J. Syst. Archit. 128: 102550 (2022)
- [c25]Oliver Rausch, Tal Ben-Nun, Nikoli Dryden, Andrei Ivanov, Shigang Li, Torsten Hoefler:
A data-centric optimization framework for machine learning. ICS 2022: 36:1-36:13
- [c24]Shigang Li, Torsten Hoefler:
Near-optimal sparse allreduce for distributed deep learning. PPoPP 2022: 135-149
- [c23]Torsten Hoefler, Tommaso Bonato, Daniele De Sensi, Salvatore Di Girolamo, Shigang Li, Marco Heddes, Jon Belk, Deepak Goel, Miguel Castro, Steve Scott:
HammingMesh: A Network Topology for Large-Scale Deep Learning. SC 2022: 11:1-11:18
- [c22]Shigang Li, Kazuki Osawa, Torsten Hoefler:
Efficient Quantized Sparse Matrix Operations on Tensor Cores. SC 2022: 37:1-37:15
- [i17]Shigang Li, Torsten Hoefler:
Near-Optimal Sparse Allreduce for Distributed Deep Learning. CoRR abs/2201.07598 (2022)
- [i16]Torsten Hoefler, Tommaso Bonato, Daniele De Sensi, Salvatore Di Girolamo, Shigang Li, Marco Heddes, Jon Belk, Deepak Goel, Miguel Castro, Steve Scott:
HammingMesh: A Network Topology for Large-Scale Deep Learning. CoRR abs/2209.01346 (2022)
- [i15]Shigang Li, Kazuki Osawa, Torsten Hoefler:
Efficient Quantized Sparse Matrix Operations on Tensor Cores. CoRR abs/2209.06979 (2022)
- [i14]Kazuki Osawa, Shigang Li, Torsten Hoefler:
PipeFisher: Efficient Training of Large Language Models Using Pipelining and Fisher Information Matrices. CoRR abs/2211.14133 (2022)
- 2021
- [j14]Daning Cheng, Shigang Li, Hanping Zhang, Fen Xia, Yunquan Zhang:
Why Dataset Properties Bound the Scalability of Parallel Machine Learning Training Algorithms. IEEE Trans. Parallel Distributed Syst. 32(7): 1702-1712 (2021)
- [j13]Shigang Li, Tal Ben-Nun, Giorgi Nadiradze, Salvatore Di Girolamo, Nikoli Dryden, Dan Alistarh, Torsten Hoefler:
Breaking (Global) Barriers in Parallel Stochastic Optimization With Wait-Avoiding Group Averaging. IEEE Trans. Parallel Distributed Syst. 32(7): 1725-1739 (2021)
- [c21]Andrei Ivanov, Nikoli Dryden, Tal Ben-Nun, Shigang Li, Torsten Hoefler:
Data Movement Is All You Need: A Case Study on Optimizing Transformers. MLSys 2021
- [c20]Giorgi Nadiradze, Amirmojtaba Sabour, Peter Davies, Shigang Li, Dan Alistarh:
Asynchronous Decentralized SGD with Quantized and Local Updates. NeurIPS 2021: 6829-6842
- [c19]Shigang Li, Torsten Hoefler:
Chimera: efficiently training large-scale neural networks with bidirectional pipelines. SC 2021: 27
- [c18]Daniele De Sensi, Salvatore Di Girolamo, Saleh Ashkboos, Shigang Li, Torsten Hoefler:
Flare: flexible in-network allreduce. SC 2021: 35
- [i13]Daniele De Sensi, Salvatore Di Girolamo, Saleh Ashkboos, Shigang Li, Torsten Hoefler:
Flare: Flexible In-Network Allreduce. CoRR abs/2106.15565 (2021)
- [i12]Shigang Li, Torsten Höfler:
Chimera: Efficiently Training Large-Scale Neural Networks with Bidirectional Pipelines. CoRR abs/2107.06925 (2021)
- [i11]Oliver Rausch, Tal Ben-Nun, Nikoli Dryden, Andrei Ivanov, Shigang Li, Torsten Hoefler:
A Data-Centric Optimization Framework for Machine Learning. CoRR abs/2110.10802 (2021)
- 2020
- [j12]Xinming Qin, Honghui Shang, Lei Xu, Wei Hu, Jinlong Yang, Shigang Li, Yunquan Zhang:
The static parallel distribution algorithms for hybrid density-functional calculations in HONPAS package. Int. J. High Perform. Comput. Appl. 34(2) (2020)
- [j11]Daning Cheng, Shigang Li, Yunquan Zhang:
WP-SGD: Weighted parallel SGD for distributed unbalanced-workload training system. J. Parallel Distributed Comput. 145: 202-216 (2020)
- [j10]Kun Li, Shigang Li, Shan Huang, Yifeng Chen, Yunquan Zhang:
FastNBL: fast neighbor lists establishment for molecular dynamics simulation based on bitwise operations. J. Supercomput. 76(7): 5501-5520 (2020)
- [c17]Hang Cao, Liang Yuan, He Zhang, Baodong Wu, Shigang Li, Pengqi Lu, Yunquan Zhang, Yongjun Xu, Minghua Zhang:
A Highly Efficient Dynamical Core of Atmospheric General Circulation Model based on Leap-Format. IPDPS 2020: 95-104
- [c16]Shigang Li, Tal Ben-Nun, Salvatore Di Girolamo, Dan Alistarh, Torsten Hoefler:
Taming unbalanced training workloads in deep learning with partial collective operations. PPoPP 2020: 45-61
- [i10]Shigang Li, Tal Ben-Nun, Dan Alistarh, Salvatore Di Girolamo, Nikoli Dryden, Torsten Hoefler:
Breaking (Global) Barriers in Parallel Stochastic Optimization with Wait-Avoiding Group Averaging. CoRR abs/2005.00124 (2020)
- [i9]Peter Grönquist, Chengyuan Yao, Tal Ben-Nun, Nikoli Dryden, Peter Dueben, Shigang Li, Torsten Hoefler:
Deep Learning for Post-Processing Ensemble Weather Forecasts. CoRR abs/2005.08748 (2020)
- [i8]Andrei Ivanov, Nikoli Dryden, Tal Ben-Nun, Shigang Li, Torsten Hoefler:
Data Movement Is All You Need: A Case Study on Optimizing Transformers. CoRR abs/2007.00072 (2020)
2010 – 2019
- 2019
- [j9]Zhihao Li, Haipeng Jia, Yunquan Zhang, Shice Liu, Shigang Li, Xiao Wang, Hao Zhang:
Efficient parallel optimizations of a high-performance SIFT on GPUs. J. Parallel Distributed Comput. 124: 78-91 (2019)
- [j8]Kun Li, Shigang Li, Shan Huang, Yifeng Chen, Yunquan Zhang:
Correction to: FastNBL: fast neighbor lists establishment for molecular dynamics simulation based on bitwise operations. J. Supercomput. 75(12): 8339-8340 (2019)
- [c15]Daning Cheng, Hanping Zhang, Fen Xia, Shigang Li, Yunquan Zhang:
Using Gradient Based Multikernel Gaussian Process and Meta-Acquisition Function to Accelerate SMBO. ICTAI 2019: 440-447
- [c14]Kun Li, Shigang Li, Bei Wang, Yifeng Chen, Yunquan Zhang:
swMD: Performance Optimizations for Molecular Dynamics Simulation on Sunway Taihulight. ISPA/BDCloud/SocialCom/SustainCom 2019: 511-518
- [c13]Kun Li, Honghui Shang, Yunquan Zhang, Shigang Li, Baodong Wu, Dong Wang, Libo Zhang, Fang Li, Dexun Chen, Zhiqiang Wei:
OpenKMC: a KMC design for hundred-billion-atom simulation using millions of cores on Sunway Taihulight. SC 2019: 68:1-68:16
- [i7]Shigang Li, Tal Ben-Nun, Salvatore Di Girolamo, Dan Alistarh, Torsten Hoefler:
Taming Unbalanced Training Workloads in Deep Learning with Partial Collective Operations. CoRR abs/1908.04207 (2019)
- [i6]Daning Cheng, Hanping Zhang, Fen Xia, Shigang Li, Yunquan Zhang:
The Scalability for Parallel Machine Learning Training Algorithm: Dataset Matters. CoRR abs/1910.11510 (2019)
- [i5]Peter Grönquist, Tal Ben-Nun, Nikoli Dryden, Peter Dueben, Luca Lavarini, Shigang Li, Torsten Hoefler:
Predicting Weather Uncertainty with Deep Convnets. CoRR abs/1911.00630 (2019)
- 2018
- [j7]Shigang Li, Yunquan Zhang, Torsten Hoefler:
Cache-Oblivious MPI All-to-All Communications Based on Morton Order. IEEE Trans. Parallel Distributed Syst. 29(3): 542-555 (2018)
- [c12]Baodong Wu, Shigang Li, Hang Cao, Yunquan Zhang, He Zhang, Junmin Xiao, Minghua Zhang:
AGCM3D: A Highly Scalable Finite-Difference Dynamical Core of Atmospheric General Circulation Model Based on 3D Decomposition. ICPADS 2018: 355-364
- [c11]Junmin Xiao, Shigang Li, Baodong Wu, He Zhang, Kun Li, Erlin Yao, Yunquan Zhang, Guangming Tan:
Communication-Avoiding for Dynamical Core of Atmospheric General Circulation Model. ICPP 2018: 12:1-12:10
- [c10]Shigang Li, Baodong Wu, Yunquan Zhang, Xianmeng Wang, Jianjiang Li, Changjun Hu, Jue Wang, Yangde Feng, Ningming Nie:
Massively Scaling the Metal Microscopic Damage Simulation on Sunway TaihuLight Supercomputer. ICPP 2018: 47:1-47:11
- [i4]Daning Cheng, Fen Xia, Shigang Li, Yunquan Zhang:
Asynchronous Parallel Sampling Gradient Boosting Decision Tree. CoRR abs/1804.04659 (2018)
- [i3]Daning Cheng, Hanping Zhang, Fen Xia, Shigang Li, Yunquan Zhang:
Using Known Information to Accelerate HyperParameters Optimization Based on SMBO. CoRR abs/1811.03322 (2018)
- 2017
- [j6]Changjun Hu, Xianmeng Wang, Jianjiang Li, Xinfu He, Shigang Li, Yangde Feng, Shaofeng Yang, He Bai:
Kernel optimization for short-range molecular dynamics. Comput. Phys. Commun. 211: 31-40 (2017)
- [j5]Baodong Wu, Shigang Li, Yunquan Zhang, Ningming Nie:
Hybrid-optimization strategy for the communication of large-scale Kinetic Monte Carlo simulation. Comput. Phys. Commun. 211: 113-123 (2017)
- [c9]Shigang Li, Yunquan Zhang, Torsten Hoefler:
POSTER: Cache-Oblivious MPI All-to-All Communications on Many-Core Architectures. PPoPP 2017: 445-446
- [i2]Daning Cheng, Shigang Li, Yunquan Zhang:
Weighted parallel SGD for distributed unbalanced-workload training system. CoRR abs/1708.04801 (2017)
- [i1]Daning Cheng, Shigang Li, Yunquan Zhang:
Asynchronous COMID: the theoretic basis for transmitted data sparsification tricks on Parameter Server. CoRR abs/1709.02091 (2017)
- 2016
- [j4]Yunquan Zhang, Ting Cao, Shigang Li, Xinhui Tian, Liang Yuan, Haipeng Jia, Athanasios V. Vasilakos:
Parallel Processing Systems for Big Data: A Survey. Proc. IEEE 104(11): 2114-2136 (2016)
- [j3]Yunquan Zhang, Shigang Li, Shengen Yan, Huiyang Zhou:
A Cross-Platform SpMV Framework on Many-Core Architectures. ACM Trans. Archit. Code Optim. 13(4): 33:1-33:25 (2016)
- 2015
- [j2]Shigang Li, Changjun Hu, Junchao Zhang, Yunquan Zhang:
Automatic tuning of sparse matrix-vector multiplication on multicore clusters. Sci. China Inf. Sci. 58(9): 1-14 (2015)
- [c8]Xiaomin Zhu, Junchao Zhang, Kazutomo Yoshii, Shigang Li, Yunquan Zhang, Pavan Balaji:
Analyzing MPI-3.0 Process-Level Shared Memory: A Case Study with Stencil Computations. CCGRID 2015: 1099-1106
- [c7]Shigang Li, Yunquan Zhang, Chunyang Xiang, Lei Shi:
Fast Convolution Operations on Many-Core Architectures. HPCC/CSS/ICESS 2015: 316-323
- 2014
- [j1]Shigang Li, Torsten Hoefler, Chungjin Hu, Marc Snir:
Improved MPI collectives for MPI processes in shared address spaces. Clust. Comput. 17(4): 1139-1155 (2014)
- 2013
- [c6]Shigang Li, Torsten Hoefler, Marc Snir:
NUMA-aware shared-memory collective communication for MPI. HPDC 2013: 85-96
- [c5]Shigang Li, Jingyuan Hu, Xin Cheng, Chongchong Zhao:
Asynchronous Work Stealing on Distributed Memory Systems. PDP 2013: 198-202
- 2011
- [c4]Yunfeng Peng, Chongchong Zhao, Shucai Yao, Shigang Li, Yi Chen:
Scheduling Multi-paradigm and Multi-grain Parallel Components on Heterogeneous Platforms. ChinaGrid 2011: 15-21
- [c3]Shigang Li, Shucai Yao, Haohu He, Lili Sun, Yi Chen, Yunfeng Peng:
Extending Synchronization Constructs in OpenMP to Exploit Pipeline Parallelism on Heterogeneous Multi-core. ICA3PP (2) 2011: 54-63
- [c2]Yunfeng Peng, Changjun Hu, Chongchong Zhao, Shigang Li, Shucai Yao:
Management of Non-functional Attributes of Parallel Components. ICCS 2011: 461-470
- 2010
- [c1]Qian Cao, Changjun Hu, Haohu He, Xiang Huang, Shigang Li:
Support for OpenMP Tasks on Cell Architecture. ICA3PP (2) 2010: 308-317
last updated on 2024-10-11 17:29 CEST by the dblp team
all metadata released as open data under CC0 1.0 license