Jim M. Brandt
Person information
- affiliation: Sandia National Laboratories
2020 – today
- 2024
- [j5]Burak Aksar, Efe Sencan, Benjamin Schwaller, Omar Aaziz, Vitus J. Leung, Jim M. Brandt, Brian Kulis, Manuel Egele, Ayse K. Coskun:
Runtime Performance Anomaly Diagnosis in Production HPC Systems Using Active Learning. IEEE Trans. Parallel Distributed Syst. 35(4): 693-706 (2024)
- [c44]Alexander V. Goponenko, Kenneth Lamar, Benjamin A. Allan, James M. Brandt, Damian Dechev:
Job Scheduling for HPC Clusters: Constraint Programming vs. Backfilling Approaches. DEBS 2024: 135-146
- [i5]Francieli Boito, Jim M. Brandt, Valeria Cardellini, Philip H. Carns, Florina M. Ciorba, Hilary Egan, Ahmed Eleliemy, Ann C. Gentile, Thomas Gruber, Jeff Hanson, Utz-Uwe Haus, Kevin A. Huck, Thomas Ilsche, Thomas Jakobsche, Terry R. Jones, Sven Karlsson, Abdullah Mueen, Michael Ott, Tapasya Patki, Ivy Peng, Krishnan Raghavan, Stephen Simms, Kathleen Shoga, Michael T. Showerman, Devesh Tiwari, Torsten Wilde, Keiji Yamamoto:
Autonomy Loops for Monitoring, Operational Data Analytics, Feedback, and Response in HPC Operations. CoRR abs/2401.16971 (2024)
- 2023
- [c43]Francieli Boito, Jim M. Brandt, Valeria Cardellini, Philip H. Carns, Florina M. Ciorba, Hilary Egan, Ahmed Eleliemy, Ann C. Gentile, Thomas Gruber, Jeff Hanson, Utz-Uwe Haus, Kevin A. Huck, Thomas Ilsche, Thomas Jakobsche, Terry R. Jones, Sven Karlsson, Abdullah Mueen, Michael Ott, Tapasya Patki, Ivy Peng, Krishnan Raghavan, Stephen Simms, Kathleen Shoga, Michael T. Showerman, Devesh Tiwari, Torsten Wilde, Keiji Yamamoto:
Autonomy Loops for Monitoring, Operational Data Analytics, Feedback, and Response in HPC Operations. CLUSTER Workshops 2023: 37-43
- [c42]Kenneth Lamar, Alexander V. Goponenko, Omar Aaziz, Benjamin A. Allan, James M. Brandt, Damian Dechev:
Evaluating HPC Job Run Time Predictions Using Application Input Parameters. DEBS 2023: 127-138
- [c41]Burak Aksar, Efe Sencan, Benjamin Schwaller, Omar Aaziz, Vitus J. Leung, Jim M. Brandt, Brian Kulis, Manuel Egele, Ayse K. Coskun:
Prodigy: Towards Unsupervised Anomaly Detection in Production HPC Systems. SC 2023: 26:1-26:14
- [i4]Jim M. Brandt, Florina M. Ciorba, Ann C. Gentile, Michael Ott, Torsten Wilde:
Driving HPC Operations With Holistic Monitoring and Operational Data Analytics (Dagstuhl Seminar 23171). Dagstuhl Reports 13(4): 98-120 (2023)
- 2022
- [c40]Burak Aksar, Efe Sencan, Benjamin Schwaller, Omar Aaziz, Vitus J. Leung, Jim M. Brandt, Brian Kulis, Ayse K. Coskun:
ALBADross: Active Learning Based Anomaly Diagnosis for Production HPC Systems. CLUSTER 2022: 369-380
- [c39]Alexander V. Goponenko, Kenneth Lamar, Christina L. Peterson, Benjamin A. Allan, Jim M. Brandt, Damian Dechev:
Metrics for Packing Efficiency and Fairness of HPC Cluster Batch Job Scheduling. SBAC-PAD 2022: 241-252
- 2021
- [c38]Kenneth Lamar, Alexander V. Goponenko, Christina L. Peterson, Benjamin A. Allan, Jim M. Brandt, Damian Dechev:
Backfilling HPC Jobs with a Multimodal-Aware Predictor. CLUSTER 2021: 618-622
- [c37]Burak Aksar, Benjamin Schwaller, Omar Aaziz, Vitus J. Leung, Jim M. Brandt, Manuel Egele, Ayse K. Coskun:
E2EWatch: An End-to-End Anomaly Diagnosis Framework for Production HPC Systems. Euro-Par 2021: 70-85
- [c36]Yijia Zhang, Burak Aksar, Omar Aaziz, Benjamin Schwaller, Jim M. Brandt, Vitus J. Leung, Manuel Egele, Ayse K. Coskun:
Using Monitoring Data to Improve HPC Performance via Network-Data-Driven Allocation. HPEC 2021: 1-7
- [c35]Archit Patke, Saurabh Jha, Haoran Qiu, Jim M. Brandt, Ann C. Gentile, Joe Greenseid, Zbigniew Kalbarczyk, Ravishankar K. Iyer:
Delay sensitivity-driven congestion mitigation for HPC systems. ICS 2021: 342-353
- [c34]Emily Costa, Tirthak Patel, Benjamin Schwaller, Jim M. Brandt, Devesh Tiwari:
Systematically inferring I/O performance variability by examining repetitive job behavior. SC 2021: 33
- [c33]Burak Aksar, Yijia Zhang, Emre Ates, Benjamin Schwaller, Omar Aaziz, Vitus J. Leung, Jim M. Brandt, Manuel Egele, Ayse K. Coskun:
Proctor: A Semi-Supervised Performance Anomaly Diagnosis Framework for Production HPC Systems. ISC 2021: 195-214
- 2020
- [c32]Benjamin Schwaller, Nick Tucker, Tom Tucker, Benjamin A. Allan, Jim M. Brandt:
HPC System Data Pipeline to Enable Meaningful Insights through Analysis-Driven Visualizations. CLUSTER 2020: 433-441
- [c31]Alexander V. Goponenko, Ramin Izadpanah, Jim M. Brandt, Damian Dechev:
Towards workload-adaptive scheduling for HPC clusters. CLUSTER 2020: 449-453
- [c30]Saurabh Jha, Archit Patke, Jim M. Brandt, Ann C. Gentile, Benjamin Lim, Mike Showerman, Greg Bauer, Larry Kaplan, Zbigniew Kalbarczyk, William Kramer, Ravi K. Iyer:
Measuring Congestion in High-Performance Datacenter Interconnects. NSDI 2020: 37-57
- [c29]Ron Brightwell, Kurt B. Ferreira, Ryan E. Grant, Scott Levy, Jay F. Lofstead, Stephen L. Olivier, Kevin T. Pedretti, Andrew J. Younge, Ann C. Gentile, Jim M. Brandt:
ALAMO: Autonomous Lightweight Allocation, Management, and Optimization. SMC 2020: 408-422
- [i3]Archit Patke, Saurabh Jha, Haoran Qiu, Jim M. Brandt, Ann C. Gentile, Joe Greenseid, Zbigniew Kalbarczyk, Ravishankar K. Iyer:
Application-aware Congestion Mitigation for High-Performance Computing Systems. CoRR abs/2012.07755 (2020)
2010 – 2019
- 2019
- [j4]Ramin Izadpanah, Benjamin A. Allan, Damian Dechev, Jim M. Brandt:
Production Application Performance Data Streaming for System Monitoring. ACM Trans. Model. Perform. Evaluation Comput. Syst. 4(2): 8:1-8:25 (2019)
- [j3]Ozan Tuncer, Emre Ates, Yijia Zhang, Ata Turk, Jim M. Brandt, Vitus J. Leung, Manuel Egele, Ayse K. Coskun:
Online Diagnosis of Performance Variation in HPC Systems Using Machine Learning. IEEE Trans. Parallel Distributed Syst. 30(4): 883-896 (2019)
- [c28]Saurabh Jha, Archit Patke, Jim M. Brandt, Ann C. Gentile, Mike Showerman, Eric Roman, Zbigniew T. Kalbarczyk, Bill Kramer, Ravishankar K. Iyer:
A Study of Network Congestion in Two Supercomputing High-Speed Interconnects. Hot Interconnects 2019: 45-48
- [c27]Emre Ates, Yijia Zhang, Burak Aksar, Jim M. Brandt, Vitus J. Leung, Manuel Egele, Ayse K. Coskun:
HPAS: An HPC Performance Anomaly Suite for Reproducing Performance Variations. ICPP 2019: 40:1-40:10
- [i2]Valerio Formicola, Saurabh Jha, Daniel Chen, Fei Deng, Amanda Bonnie, Mike Mason, Jim M. Brandt, Ann C. Gentile, Larry Kaplan, Jason Repik, Jeremy Enos, Mike Showerman, Annette Greiner, Zbigniew Kalbarczyk, Ravishankar K. Iyer, Bill Kramer:
Understanding Fault Scenarios and Impacts through Fault Injection Experiments in Cielo. CoRR abs/1907.01019 (2019)
- [i1]Saurabh Jha, Archit Patke, Jim M. Brandt, Ann C. Gentile, Mike Showerman, Eric Roman, Zbigniew T. Kalbarczyk, William T. Kramer, Ravishankar K. Iyer:
A Study of Network Congestion in Two Supercomputing High-Speed Interconnects. CoRR abs/1907.05312 (2019)
- 2018
- [c26]Ville Ahlgren, Stefan Andersson, Jim M. Brandt, Nicholas Cardo, Sudheer Chunduri, Jeremy Enos, Parks Fields, Ann C. Gentile, Richard Gerber, Michael Gienger, Joe Greenseid, Annette Greiner, Bilel Hadri, Yun He, Dennis Hoppe, Urpo Kaila, Kaki Kelly, Mark Klein, Alex Kristiansen, Steve Leak, Mike Mason, Kevin T. Pedretti, Jean-Guillaume Piccinali, Jason Repik, Jim Rogers, Susanna Salminen, Mike Showerman, Cary Whitney, Jim Williams:
Large-Scale System Monitoring Experiences and Recommendations. CLUSTER 2018: 532-542
- [c25]Saurabh Jha, Jim M. Brandt, Ann C. Gentile, Zbigniew Kalbarczyk, Ravishankar K. Iyer:
Characterizing Supercomputer Traffic Networks Through Link-Level Analysis. CLUSTER 2018: 562-570
- [c24]Emre Ates, Ozan Tuncer, Ata Turk, Vitus J. Leung, Jim M. Brandt, Manuel Egele, Ayse K. Coskun:
Taxonomist: Application Detection Through Rich Monitoring Data. Euro-Par 2018: 92-105
- [c23]Ramin Izadpanah, Nichamon Naksinehaboon, Jim M. Brandt, Ann C. Gentile, Damian Dechev:
Integrating Low-latency Analysis into HPC System Monitoring. ICPP 2018: 5:1-5:10
- [c22]Kenneth Lamar, Ramin Izadpanah, Jim M. Brandt, Damian Dechev:
An Efficient Latch-free Database Index Based on Multi-dimensional Lists. IPCCC 2018: 1-2
- 2017
- [c21]Saurabh Jha, Jim M. Brandt, Ann C. Gentile, Zbigniew Kalbarczyk, Gregory H. Bauer, Jeremy Enos, Michael T. Showerman, Larry Kaplan, Brett M. Bode, Annette Greiner, Amanda Bonnie, Mike Mason, Ravishankar K. Iyer, William Kramer:
Holistic Measurement-Driven System Assessment. CLUSTER 2017: 797-800
- [c20]Ozan Tuncer, Emre Ates, Yijia Zhang, Ata Turk, Jim M. Brandt, Vitus J. Leung, Manuel Egele, Ayse K. Coskun:
Diagnosing Performance Variations in HPC Applications Using Machine Learning. ISC 2017: 355-373
- 2016
- [j2]Anthony M. Agelastos, Benjamin A. Allan, Jim M. Brandt, Ann C. Gentile, Sophia Lefantzi, Steve Monk, Jeff Ogden, Mahesh Rajan, Joel Stevenson:
Continuous whole-system monitoring toward rapid understanding of production HPC applications and systems. Parallel Comput. 58: 90-106 (2016)
- [c19]Benjamin A. Allan, Jim M. Brandt, Ann C. Gentile, Cory Lueninghoener, Nichamon Naksinehaboon, Boyana Norris, Narate Taerat:
HPCMASPA Introduction and Committees. IPDPS Workshops 2016: 1665-1666
- [c18]Jim M. Brandt, Ann C. Gentile, Michael T. Showerman, Jeremy Enos, Joshi Fullop, Gregory H. Bauer:
Large-Scale Persistent Numerical Data Source Monitoring System Experiences. IPDPS Workshops 2016: 1711-1720
- [c17]Sam Sanchez, Amanda Bonnie, Graham van Heule, Conor Robinson, Adam DeConinck, Kathleen Kelly, Quellyn Snead, Jim M. Brandt:
Design and Implementation of a Scalable HPC Monitoring System. IPDPS Workshops 2016: 1721-1725
- 2015
- [c16]Anthony M. Agelastos, Benjamin A. Allan, Jim M. Brandt, Ann C. Gentile, Sophia Lefantzi, Steve Monk, Jeff Ogden, Mahesh Rajan, Joel Stevenson:
Toward Rapid Understanding of Production HPC Applications and Systems. CLUSTER 2015: 464-473
- [c15]Jim M. Brandt, Ann C. Gentile, Cindy Martin, Jason Repik, Narate Taerat:
New Systems, New Behaviors, New Patterns: Monitoring Insights from System Standup. CLUSTER 2015: 658-665
- [c14]Steven D. Feldman, Deli Zhang, Damian Dechev, James Brandt:
Extending LDMS to Enable Performance Monitoring in Multi-core Applications. CLUSTER 2015: 717-720
- [c13]Jim M. Brandt, Karen D. Devine, Ann C. Gentile:
Infrastructure for In Situ System Monitoring and Application Data Analysis. ISAV@SC 2015: 36-40
- 2014
- [c12]Jim M. Brandt, Karen D. Devine, Ann C. Gentile, Kevin T. Pedretti:
Demonstrating improved application performance using dynamic monitoring and task mapping. CLUSTER 2014: 408-415
- [c11]Anthony M. Agelastos, Benjamin A. Allan, Jim M. Brandt, Paul Cassella, Jeremy Enos, Joshi Fullop, Ann C. Gentile, Steve Monk, Nichamon Naksinehaboon, Jeff Ogden, Mahesh Rajan, Michael T. Showerman, Joel Stevenson, Narate Taerat, Thomas W. Tucker:
The Lightweight Distributed Metric Service: A Scalable Infrastructure for Continuous Monitoring of Large Scale Computing Systems and Applications. SC 2014: 154-165
- 2012
- [c10]Li Yu, Ziming Zheng, Zhiling Lan, Terry R. Jones, Jim M. Brandt, Ann C. Gentile:
Filtering log data: Finding the needles in the Haystack. DSN 2012: 1-12
- 2011
- [j1]Narate Taerat, Jim M. Brandt, Ann C. Gentile, Matthew Wong, Chokchai Leangsuksun:
Baler: deterministic, lossless log message clustering tool. Comput. Sci. Res. Dev. 26(3-4): 285-295 (2011)
- [c9]Jim M. Brandt, Frank Chen, Ann C. Gentile, Chokchai Leangsuksun, Jackson R. Mayo, Philippe P. Pébay, Diana C. Roe, Narate Taerat, David C. Thompson, M. H. Wong:
Framework for Enabling System Understanding. Euro-Par Workshops (2) 2011: 231-240
- 2010
- [c8]James Brandt, Frank Chen, Vincent De Sapio, Ann C. Gentile, Jackson R. Mayo, Philippe P. Pébay, Diana C. Roe, David C. Thompson, Matthew Wong:
Using Cloud Constructs and Predictive Analysis to Enable Pre-Failure Process Migration in HPC Systems. CCGRID 2010: 703-708
- [c7]James Brandt, Frank Chen, Vincent De Sapio, Ann C. Gentile, Jackson R. Mayo, Philippe P. Pébay, Diana C. Roe, David C. Thompson, Matthew Wong:
Quantifying effectiveness of failure prediction and response in HPC systems: Methodology and example. DSN Workshops 2010: 2-7
- [c6]James Brandt, Frank Chen, Vincent De Sapio, Ann C. Gentile, Jackson R. Mayo, Philippe P. Pébay, Diana C. Roe, David C. Thompson, Matthew Wong:
Combining Virtualization, resource characterization, and Resource management to enable efficient high performance compute platforms through intelligent dynamic resource allocation. IPDPS Workshops 2010: 1-8
2000 – 2009
- 2009
- [c5]Jim M. Brandt, Ann C. Gentile, Jackson R. Mayo, Philippe P. Pébay, Diana C. Roe, David C. Thompson, Matthew Wong:
Resource monitoring and management with OVIS to enable HPC in cloud computing environments. IPDPS 2009: 1-8
- 2008
- [c4]Jim M. Brandt, Bert J. Debusschere, Ann C. Gentile, Jackson R. Mayo, Philippe P. Pébay, David C. Thompson, Matthew Wong:
Using Probabilistic Characterization to Reduce Runtime Faults in HPC Systems. CCGRID 2008: 759-764
- [c3]Jim M. Brandt, Bert J. Debusschere, Ann C. Gentile, Jackson R. Mayo, Philippe P. Pébay, David C. Thompson, M. H. Wong:
Ovis-2: A robust distributed architecture for scalable RAS. IPDPS 2008: 1-8
- 2006
- [c2]Jim M. Brandt, Ann C. Gentile, D. J. Hale, Philippe P. Pébay:
OVIS: a tool for intelligent, real-time monitoring of computational clusters. IPDPS 2006
- 2005
- [c1]Jim M. Brandt, Ann C. Gentile, Youssef M. Marzouk, Philippe P. Pébay:
Meaningful Automated Statistical Analysis of Large Computational Clusters. CLUSTER 2005: 1-2
last updated on 2024-10-07 21:19 CEST by the dblp team
all metadata released as open data under CC0 1.0 license