  • Gerogiannis G and Torrellas J. (2024). Practical Online Reinforcement Learning for Microprocessors With Micro-Armed Bandit. IEEE Micro. 44:4. (80-87). Online publication date: 1-Jul-2024.

    https://doi.org/10.1109/MM.2024.3408719

  • Gerogiannis G and Torrellas J. Micro-Armed Bandit: Lightweight & Reusable Reinforcement Learning for Microarchitecture Decision-Making. Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture. (698-713).

    https://doi.org/10.1145/3613424.3623780

  • Donyanavard B, Mück T, Rahmani A, Dutt N, Sadighi A, Maurer F and Herkersdorf A. SOSA. Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture. (685-698).

    https://doi.org/10.1145/3352460.3358312

  • Ding Y, Mishra N and Hoffmann H. Generative and multi-phase learning for computer systems optimization. Proceedings of the 46th International Symposium on Computer Architecture. (39-52).

    https://doi.org/10.1145/3307650.3326633

  • Rahmani A, Donyanavard B, Mück T, Moazzemi K, Jantsch A, Mutlu O and Dutt N. (2018). SPECTR. ACM SIGPLAN Notices. 53:2. (169-183). Online publication date: 30-Nov-2018.

    https://doi.org/10.1145/3296957.3173199

  • Rahmani A, Donyanavard B, Mück T, Moazzemi K, Jantsch A, Mutlu O and Dutt N. SPECTR. Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems. (169-183).

    https://doi.org/10.1145/3173162.3173199

  • Xu Q, Jeon H, Kim K, Ro W and Annavaram M. (2016). Warped-slicer. ACM SIGARCH Computer Architecture News. 44:3. (230-242). Online publication date: 12-Oct-2016.

    https://doi.org/10.1145/3007787.3001161

  • Zhang Y and Lin W. (2016). Efficient resource sharing algorithm for physical register file in simultaneous multi-threading processors. Microprocessors & Microsystems. 45:PB. (270-282). Online publication date: 1-Sep-2016.

    https://doi.org/10.1016/j.micpro.2016.06.002

  • Xu Q, Jeon H, Kim K, Ro W and Annavaram M. Warped-slicer. Proceedings of the 43rd International Symposium on Computer Architecture. (230-242).

    https://doi.org/10.1109/ISCA.2016.29

  • Zhou M, Du Y, Childers B, Mosse D and Melhem R. (2016). Symmetry-Agnostic Coordinated Management of the Memory Hierarchy in Multicore Systems. ACM Transactions on Architecture and Code Optimization. 12:4. (1-26). Online publication date: 7-Jan-2016.

    https://doi.org/10.1145/2847254

  • Porter L, Laurenzano M, Tiwari A, Jundt A, Ward, Jr. W, Campbell R and Carrington L. (2015). Making the Most of SMT in HPC. ACM Transactions on Architecture and Code Optimization. 11:4. (1-26). Online publication date: 9-Jan-2015.

    https://doi.org/10.1145/2687651

  • Jiménez V, Cazorla F, Gioiosa R, Buyuktosunoglu A, Bose P, O'Connell F and Mealey B. (2014). Adaptive Prefetching on POWER7. ACM Transactions on Parallel Computing. 1:1. (1-25). Online publication date: 3-Oct-2014.

    https://doi.org/10.1145/2588889

  • Kucuk G, Uslu G and Yesil C. History-Based Predictive Instruction Window Weighting for SMT Processors. Proceedings of the 29th International Conference on Supercomputing - Volume 8488. (187-198).

    https://doi.org/10.1007/978-3-319-07518-1_12

  • Zhang Y, Hays M, Lin W and John E. Autonomous control of issue queue utilization for simultaneous multi-threading processors. Proceedings of the High Performance Computing Symposium. (1-8).

    https://dl.acm.org/doi/10.5555/2663510.2663529

  • Dubach C, Jones T and Bonilla E. (2013). Dynamic microarchitectural adaptation using machine learning. ACM Transactions on Architecture and Code Optimization. 10:4. (1-28). Online publication date: 1-Dec-2013.

    https://doi.org/10.1145/2541228.2541238

  • Feliu J, Sahuquillo J, Petit S and Duato J. L1-bandwidth aware thread allocation in multicore SMT processors. Proceedings of the 22nd international conference on Parallel architectures and compilation techniques. (123-132).

    https://dl.acm.org/doi/10.5555/2523721.2523741

  • Jiménez V, Gioiosa R, Cazorla F, Buyuktosunoglu A, Bose P and O'Connell F. Making data prefetch smarter. Proceedings of the 21st international conference on Parallel architectures and compilation techniques. (137-146).

    https://doi.org/10.1145/2370816.2370837

  • Hoffmann H, Holt J, Kurian G, Lau E, Maggio M, Miller J, Neuman S, Sinangil M, Sinangil Y, Agarwal A, Chandrakasan A and Devadas S. Self-aware computing in the Angstrom processor. Proceedings of the 49th Annual Design Automation Conference. (259-264).

    https://doi.org/10.1145/2228360.2228409

  • Eyerman S and Eeckhout L. (2012). Probabilistic modeling for job symbiosis scheduling on SMT processors. ACM Transactions on Architecture and Code Optimization. 9:2. (1-27). Online publication date: 1-Jun-2012.

    https://doi.org/10.1145/2207222.2207223

  • Vandierendonck H and Seznec A. (2011). Managing SMT resource usage through speculative instruction window weighting. ACM Transactions on Architecture and Code Optimization. 8:3. (1-20). Online publication date: 1-Oct-2011.

    https://doi.org/10.1145/2019608.2019611

  • Chen J and John L. Predictive coordination of multiple on-chip resources for chip multiprocessors. Proceedings of the international conference on Supercomputing. (192-201).

    https://doi.org/10.1145/1995896.1995927

  • Dubach C, Jones T, Bonilla E and O'Boyle M. A Predictive Model for Dynamic Microarchitectural Adaptivity Control. Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture. (485-496).

    https://doi.org/10.1109/MICRO.2010.14

  • Watkins M and Albonesi D. Dynamically managed multithreaded reconfigurable architectures for chip multiprocessors. Proceedings of the 19th international conference on Parallel architectures and compilation techniques. (41-52).

    https://doi.org/10.1145/1854273.1854284

  • Meng J, Tarjan D and Skadron K. (2010). Dynamic warp subdivision for integrated branch and memory divergence tolerance. ACM SIGARCH Computer Architecture News. 38:3. (235-246). Online publication date: 19-Jun-2010.

    https://doi.org/10.1145/1816038.1815992

  • Meng J, Tarjan D and Skadron K. Dynamic warp subdivision for integrated branch and memory divergence tolerance. Proceedings of the 37th annual international symposium on Computer architecture. (235-246).

    https://doi.org/10.1145/1815961.1815992

  • Eyerman S and Eeckhout L. Probabilistic job symbiosis modeling for SMT processor scheduling. Proceedings of the fifteenth International Conference on Architectural support for programming languages and operating systems. (91-102).

    https://doi.org/10.1145/1736020.1736033

  • Eyerman S and Eeckhout L. (2010). Probabilistic job symbiosis modeling for SMT processor scheduling. ACM SIGPLAN Notices. 45:3. (91-102). Online publication date: 5-Mar-2010.

    https://doi.org/10.1145/1735971.1736033

  • Eyerman S and Eeckhout L. (2010). Probabilistic job symbiosis modeling for SMT processor scheduling. ACM SIGARCH Computer Architecture News. 38:1. (91-102). Online publication date: 5-Mar-2010.

    https://doi.org/10.1145/1735970.1736033

  • Ubal R, Sahuquillo J, Petit S and López P. Paired ROBs. Proceedings of the 15th International Euro-Par Conference on Parallel Processing. (309-320).

    https://doi.org/10.1007/978-3-642-03869-3_31

  • Liu C and Gaudiot J. The Impact of Resource Sharing Control on the Design of Multicore Processors. Proceedings of the 9th International Conference on Algorithms and Architectures for Parallel Processing. (315-326).

    https://doi.org/10.1007/978-3-642-03095-6_31

  • Eyerman S and Eeckhout L. (2009). Memory-level parallelism aware fetch policies for simultaneous multithreading processors. ACM Transactions on Architecture and Code Optimization. 6:1. (1-33). Online publication date: 30-Mar-2009.

    https://doi.org/10.1145/1509864.1509867

  • Eyerman S and Eeckhout L. Per-thread cycle accounting in SMT processors. Proceedings of the 14th international conference on Architectural support for programming languages and operating systems. (133-144).

    https://doi.org/10.1145/1508244.1508260

  • Eyerman S and Eeckhout L. (2009). Per-thread cycle accounting in SMT processors. ACM SIGARCH Computer Architecture News. 37:1. (133-144). Online publication date: 1-Mar-2009.

    https://doi.org/10.1145/2528521.1508260

  • Eyerman S and Eeckhout L. (2009). Per-thread cycle accounting in SMT processors. ACM SIGPLAN Notices. 44:3. (133-144). Online publication date: 28-Feb-2009.

    https://doi.org/10.1145/1508284.1508260

  • Choi S and Yeung D. (2009). Hill-climbing SMT processor resource distribution. ACM Transactions on Computer Systems. 27:1. (1-47). Online publication date: 1-Feb-2009.

    https://doi.org/10.1145/1482619.1482620

  • Bitirgen R, Ipek E and Martinez J. Coordinated management of multiple interacting resources in chip multiprocessors. Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture. (318-329).

    https://doi.org/10.1109/MICRO.2008.4771801

  • Sharkey J, Loew J and Ponomarev D. (2008). Reducing register pressure in SMT processors through L2-miss-driven early register release. ACM Transactions on Architecture and Code Optimization. 5:3. (1-28). Online publication date: 1-Nov-2008.

    https://doi.org/10.1145/1455650.1455652

  • Wang H, Koren I and Krishna C. An adaptive resource partitioning algorithm for SMT processors. Proceedings of the 17th international conference on Parallel architectures and compilation techniques. (230-239).

    https://doi.org/10.1145/1454115.1454148

  • Sharkey J and Ponomarev D. An L2-miss-driven early register deallocation for SMT processors. Proceedings of the 21st annual international conference on Supercomputing. (138-147).

    https://doi.org/10.1145/1274971.1274992

  • Weston K, Janfaza V, Taur A, Mungra D, Kansal A, Zahran M and Muzahid A. (2023). Post-Silicon Customization Using Deep Neural Networks. Architecture of Computing Systems. 10.1007/978-3-031-42785-5_9. (120-136).

    https://link.springer.com/10.1007/978-3-031-42785-5_9

  • Zhan H, Sheng V and Lin W. (2021). Reinforcement learning-based register renaming policy for simultaneous multithreading CPUs. Expert Systems with Applications. 10.1016/j.eswa.2021.115717. (115717). Online publication date: 1-Aug-2021.

    https://linkinghub.elsevier.com/retrieve/pii/S095741742101099X

  • Jin X, Yu N, Zhou Y, Huang B, Yu Z, Zhan X, Wang H, Wang S and Bao Y. (2020). Supporting Predictable Performance Guarantees for SMT Processors. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences. 10.1587/transfun.2019EAP1146. E103.A:6. (806-820). Online publication date: 1-Jun-2020.

    https://www.jstage.jst.go.jp/article/transfun/E103.A/6/E103.A_2019EAP1146/_article

  • Carroll S and Lin W. (2019). Applied On-Chip Machine Learning for Dynamic Resource Control in Multithreaded Processors. Parallel Processing Letters. 10.1142/S0129626419500130. 29:03. (1950013). Online publication date: 1-Sep-2019.

    https://www.worldscientific.com/doi/abs/10.1142/S0129626419500130

  • Jin X, Zhou Y, Huang B, Yu Z, Zhan X, Wang H, Wang S, Yu N, Sun N and Bao Y. QoSMT. Proceedings of the ACM International Conference on Supercomputing. (206-216).

    https://doi.org/10.1145/3330345.3330364

  • Margaritov A, Gupta S, Gonzalez-Alberquilla R and Grot B. (2019). Stretch: Balancing QoS and Throughput for Colocated Server Workloads on SMT Cores. 2019 IEEE International Symposium on High Performance Computer Architecture (HPCA). 10.1109/HPCA.2019.00024. 978-1-7281-1444-6. (15-27).

    https://ieeexplore.ieee.org/document/8675191/

  • Inal G and Kucuk G. (2018). Application of Machine Learning Techniques on Prediction of Future Processor Performance. 2018 Sixth International Symposium on Computing and Networking Workshops (CANDARW). 10.1109/CANDARW.2018.00044. 978-1-5386-9184-7. (190-195).

    https://ieeexplore.ieee.org/document/8590898/

  • Shahhosseini S, Moazzemi K, Rahmani A and Dutt N. (2018). On the feasibility of SISO control-theoretic DVFS for power capping in CMPs. Microprocessors and Microsystems. 10.1016/j.micpro.2018.09.012. 63. (249-258). Online publication date: 1-Nov-2018.

    https://linkinghub.elsevier.com/retrieve/pii/S0141933118300851

  • Chen J, Cai H and Wang W. (2018). A new metaheuristic algorithm. Soft Computing - A Fusion of Foundations, Methodologies and Applications. 22:12. (3857-3878). Online publication date: 1-Jun-2018.

    https://doi.org/10.1007/s00500-017-2845-7

  • Güngörer H and Küçük G. (2018). Dynamic Capping of Physical Register Files in Simultaneous Multi-threading Processors for Performance. Computer and Information Sciences. 10.1007/978-3-030-00840-6_5. (41-48).

    http://link.springer.com/10.1007/978-3-030-00840-6_5

  • Bao Y and Wang S. (2017). Labeled von Neumann Architecture for Software-Defined Cloud. Journal of Computer Science and Technology. 10.1007/s11390-017-1716-0. 32:2. (219-223). Online publication date: 1-Mar-2017.

    http://link.springer.com/10.1007/s11390-017-1716-0

  • Wang X and Martínez J. (2016). ReBudget. ACM SIGARCH Computer Architecture News. 44:2. (19-32). Online publication date: 29-Jul-2016.

    https://doi.org/10.1145/2980024.2872382

  • Wang X and Martínez J. (2016). ReBudget. ACM SIGPLAN Notices. 51:4. (19-32). Online publication date: 9-Jun-2016.

    https://doi.org/10.1145/2954679.2872382

  • Wang X and Martínez J. ReBudget. Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems. (19-32).

    https://doi.org/10.1145/2872362.2872382

  • (2015). A resource utilization based instruction fetch policy for SMT processors. Microprocessors & Microsystems. 39:1. (1-10). Online publication date: 1-Feb-2015.

    https://doi.org/10.1016/j.micpro.2014.10.001

  • Guney I, Yildiz A, Bayindir I, Serdaroglu K, Bayik U and Kucuk G. (2015). A Machine Learning Approach for a Scalable, Energy-Efficient Utility-Based Cache Partitioning. High Performance Computing. 10.1007/978-3-319-20119-1_29. (409-421).

    https://link.springer.com/10.1007/978-3-319-20119-1_29

  • Fang J, Pan Z, Yu L and Liu S. A Case of Chip Multithreading Architecture with Resource Unit Manager. Proceedings of the 2013 International Conference on Information Science and Cloud Computing Companion. (495-501).

    https://doi.org/10.1109/ISCC-C.2013.8

  • Zhang Y, Douglas C and Lin W. (2013). Recalling instructions from idling threads to maximize resource utilization for simultaneous multi-threading processors. Computers and Electrical Engineering. 39:7. (2031-2044). Online publication date: 1-Oct-2013.

    https://doi.org/10.1016/j.compeleceng.2013.05.013

  • Zhang Y and Lin W. (2013). Capping Speculative Traces to Improve Performance in Simultaneous Multi-threading CPUs. 2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW). 10.1109/IPDPSW.2013.27. 978-0-7695-4979-8. (1555-1564).

    http://ieeexplore.ieee.org/document/6651052/

  • Nemirovsky M and Tullsen D. (2013). Multithreading Architecture. Synthesis Lectures on Computer Architecture. 10.2200/S00458ED1V01Y201212CAC021. 8:1. (1-109). Online publication date: 15-Jan-2013.

    http://www.morganclaypool.com/doi/abs/10.2200/S00458ED1V01Y201212CAC021

  • Güney I, Küçük G and Özcan E. (2013). Hyper-Heuristics for Performance Optimization of Simultaneous Multithreaded Processors. Information Sciences and Systems 2013. 10.1007/978-3-319-01604-7_10. (97-106).

    https://link.springer.com/10.1007/978-3-319-01604-7_10

  • Yang H, Zheng C, Shi X and Pan Z. (2012). Resource efficiency and equity in chip Multithreading architecture. 2012 International Conference on Systems and Informatics (ICSAI). 10.1109/ICSAI.2012.6223162. 978-1-4673-0199-2. (932-937).

    http://ieeexplore.ieee.org/document/6223162/

  • Papadopoulos A, Maggio M, Negro S and Leva A. (2012). General control-theoretical framework for online resource allocation in computing systems. IET Control Theory & Applications. 10.1049/iet-cta.2011.0632. 6:11. (1594).

    http://digital-library.theiet.org/content/journals/10.1049/iet-cta.2011.0632

  • Yang Hua, Zheng Cai Ping, Zhou Zhen Hui, Zhuang Wei and Pan Zhuo Jin. (2011). Understanding performance-resource dependency by thread slicing and curve fitting. 2011 International Conference on Computer Science and Network Technology (ICCSNT). 10.1109/ICCSNT.2011.6181900. 978-1-4577-1587-7. (17-22).

    http://ieeexplore.ieee.org/document/6181900/

  • Wang H, Koren I and Krishna C. (2011). Utilization-Based Resource Partitioning for Power-Performance Efficiency in SMT Processors. IEEE Transactions on Parallel and Distributed Systems. 22:7. (1150-1163). Online publication date: 1-Jul-2011.

    https://doi.org/10.1109/TPDS.2010.199

  • Eyerman S and Eeckhout L. (2010). Per-Thread Cycle Accounting. IEEE Micro. 30:1. (71-80). Online publication date: 1-Jan-2010.

    https://doi.org/10.1109/MM.2010.23

  • Martinez J and Ipek E. (2009). Dynamic Multicore Resource Management. IEEE Micro. 29:5. (8-17). Online publication date: 1-Sep-2009.

    https://doi.org/10.1109/MM.2009.77

  • Chen H, Ping L, Lu K and Jiang X. A Dynamic Resource Allocation Optimization for SMT Processors. Proceedings of the 2009 International Conference on Future Computer and Communication. (353-357).

    https://doi.org/10.1109/ICFCC.2009.47

  • Chen H, Ping L, Chen X and Lu K. Design of Non-Critical Path Resource Distributor for SMT Processors. Proceedings of the 2009 International Conference on Computer Engineering and Technology - Volume 02. (48-52).

    https://doi.org/10.1109/ICCET.2009.83

  • Chen H, Pan X, Ping L, Lu K and Chen X. (2008). A spatially triggered dissipative resource distribution policy for SMT processors. Journal of Zhejiang University-SCIENCE A. 10.1631/jzus.A0720083. 9:8. (1070-1082). Online publication date: 1-Aug-2008.

    http://link.springer.com/10.1631/jzus.A0720083

  • Liu C and Gaudiot J. (2008). Resource sharing control in Simultaneous MultiThreading microarchitectures. 2008 13th Asia-Pacific Computer Systems Architecture Conference (ACSAC). 10.1109/APCSAC.2008.4625432. 978-1-4244-2682-9. (1-8).

    http://ieeexplore.ieee.org/document/4625432/

  • Boneti C, Cazorla F, Gioiosa R, Buyuktosunoglu A, Cher C and Valero M. Software-Controlled Priority Characterization of POWER5 Processor. Proceedings of the 35th Annual International Symposium on Computer Architecture. (415-426).

    https://doi.org/10.1109/ISCA.2008.8

  • Boneti C, Cazorla F, Gioiosa R, Buyuktosunoglu A, Cher C and Valero M. (2008). Software-Controlled Priority Characterization of POWER5 Processor. ACM SIGARCH Computer Architecture News. 36:3. (415-426). Online publication date: 1-Jun-2008.

    https://doi.org/10.1145/1394608.1382157

  • Latorre F, Gonzalez J and Gonzalez A. (2008). Efficient resources assignment schemes for clustered multithreaded processors. Distributed Processing Symposium (IPDPS). 10.1109/IPDPS.2008.4536226. 978-1-4244-1693-6. (1-12).

    http://ieeexplore.ieee.org/document/4536226/

  • Kang D, Liu C and Gaudiot J. (2008). The impact of speculative execution on SMT processors. International Journal of Parallel Programming. 36:4. (361-385). Online publication date: 1-Apr-2008.

    https://doi.org/10.1007/s10766-007-0052-3