Muning Wen
2020 – today
- 2024
- [j5] Dongzi Wang, Fangwei Zhong, Minglong Li, Muning Wen, Yuanxi Peng, Teng Li, Adam Yang:
  RoMAT: Role-based multi-agent transformer for generalizable heterogeneous cooperation. Neural Networks 174: 106129 (2024)
- [j4] Shangding Gu, Dianye Huang, Muning Wen, Guang Chen, Alois Knoll:
  Safe Multiagent Learning With Soft Constrained Policy Optimization in Real Robot Control. IEEE Trans. Ind. Informatics 20(9): 10706-10716 (2024)
- [c5] Ziyu Wan, Xidong Feng, Muning Wen, Stephen Marcus McAleer, Ying Wen, Weinan Zhang, Jun Wang:
  AlphaZero-Like Tree-Search can Guide Large Language Model Decoding and Training. ICML 2024
- [c4] Ruiwen Zhou, Yingxuan Yang, Muning Wen, Ying Wen, Wenhao Wang, Chunling Xi, Guoqiang Xu, Yong Yu, Weinan Zhang:
  TRAD: Enhancing LLM Agents with Step-Wise Thought Retrieval and Aligned Decision. SIGIR 2024: 3-13
- [i14] Muning Wen, Cheng Deng, Jun Wang, Weinan Zhang, Ying Wen:
  Entropy-Regularized Token-Level Policy Optimization for Large Language Models. CoRR abs/2402.06700 (2024)
- [i13] Ruiwen Zhou, Yingxuan Yang, Muning Wen, Ying Wen, Wenhao Wang, Chunling Xi, Guoqiang Xu, Yong Yu, Weinan Zhang:
  TRAD: Enhancing LLM Agents with Step-Wise Thought Retrieval and Aligned Decision. CoRR abs/2403.06221 (2024)
- [i12] Muning Wen, Ziyu Wan, Weinan Zhang, Jun Wang, Ying Wen:
  Reinforcing Language Agents via Policy Optimization with Action Decomposition. CoRR abs/2405.15821 (2024)
- [i11] Yingxuan Yang, Huayi Wang, Muning Wen, Weinan Zhang:
  P3: A Policy-Driven, Pace-Adaptive, and Diversity-Promoted Framework for Optimizing LLM Training. CoRR abs/2408.05541 (2024)
- [i10] Yiwei Shi, Muning Wen, Qi Zhang, Weinan Zhang, Cunjia Liu, Weiru Liu:
  Autonomous Goal Detection and Cessation in Reinforcement Learning: A Case Study on Source Term Estimation. CoRR abs/2409.09541 (2024)
- [i9] Qiqiang Lin, Muning Wen, Qiuying Peng, Guanyu Nie, Junwei Liao, Jun Wang, Xiaoyun Mo, Jiamu Zhou, Cheng Cheng, Yin Zhao, Jun Wang, Weinan Zhang:
  Hammer: Robust Function-Calling for On-Device Language Models via Function Masking. CoRR abs/2410.04587 (2024)
- 2023
- [j3] Muning Wen, Runji Lin, Hanjing Wang, Yaodong Yang, Ying Wen, Luo Mai, Jun Wang, Hai-Feng Zhang, Weinan Zhang:
  Large sequence models for sequential decision-making: a survey. Frontiers Comput. Sci. 17(6): 176349 (2023)
- [j2] Linghui Meng, Muning Wen, Chenyang Le, Xiyun Li, Dengpeng Xing, Weinan Zhang, Ying Wen, Haifeng Zhang, Jun Wang, Yaodong Yang, Bo Xu:
  Offline Pre-trained Multi-agent Decision Transformer. Mach. Intell. Res. 20(2): 233-248 (2023)
- [j1] Ming Zhou, Ziyu Wan, Hanjing Wang, Muning Wen, Runzhe Wu, Ying Wen, Yaodong Yang, Yong Yu, Jun Wang, Weinan Zhang:
  MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning. J. Mach. Learn. Res. 24: 150:1-150:12 (2023)
- [i8] Muning Wen, Runji Lin, Hanjing Wang, Yaodong Yang, Ying Wen, Luo Mai, Jun Wang, Haifeng Zhang, Weinan Zhang:
  Large Sequence Models for Sequential Decision-Making: A Survey. CoRR abs/2306.13945 (2023)
- [i7] Xidong Feng, Ziyu Wan, Muning Wen, Ying Wen, Weinan Zhang, Jun Wang:
  Alphazero-like Tree-Search can Guide Large Language Model Decoding and Training. CoRR abs/2309.17179 (2023)
- 2022
- [c3] Jakub Grudzien Kuba, Ruiqing Chen, Muning Wen, Ying Wen, Fanglei Sun, Jun Wang, Yaodong Yang:
  Trust Region Policy Optimisation in Multi-Agent Reinforcement Learning. ICLR 2022
- [c2] Muning Wen, Jakub Grudzien Kuba, Runji Lin, Weinan Zhang, Ying Wen, Jun Wang, Yaodong Yang:
  Multi-Agent Reinforcement Learning is a Sequence Modeling Problem. NeurIPS 2022
- [i6] Muning Wen, Jakub Grudzien Kuba, Runji Lin, Weinan Zhang, Ying Wen, Jun Wang, Yaodong Yang:
  Multi-Agent Reinforcement Learning is a Sequence Modeling Problem. CoRR abs/2205.14953 (2022)
- 2021
- [c1] Jakub Grudzien Kuba, Muning Wen, Linghui Meng, Shangding Gu, Haifeng Zhang, David Mguni, Jun Wang, Yaodong Yang:
  Settling the Variance of Multi-Agent Policy Gradients. NeurIPS 2021: 13458-13470
- [i5] Ming Zhou, Ziyu Wan, Hanjing Wang, Muning Wen, Runzhe Wu, Ying Wen, Yaodong Yang, Weinan Zhang, Jun Wang:
  MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning. CoRR abs/2106.07551 (2021)
- [i4] Jakub Grudzien Kuba, Muning Wen, Yaodong Yang, Linghui Meng, Shangding Gu, Haifeng Zhang, David Henry Mguni, Jun Wang:
  Settling the Variance of Multi-Agent Policy Gradients. CoRR abs/2108.08612 (2021)
- [i3] Jakub Grudzien Kuba, Ruiqing Chen, Muning Wen, Ying Wen, Fanglei Sun, Jun Wang, Yaodong Yang:
  Trust Region Policy Optimisation in Multi-Agent Reinforcement Learning. CoRR abs/2109.11251 (2021)
- [i2] Shangding Gu, Jakub Grudzien Kuba, Muning Wen, Ruiqing Chen, Ziyan Wang, Zheng Tian, Jun Wang, Alois C. Knoll, Yaodong Yang:
  Multi-Agent Constrained Policy Optimisation. CoRR abs/2110.02793 (2021)
- [i1] Linghui Meng, Muning Wen, Yaodong Yang, Chenyang Le, Xiyun Li, Weinan Zhang, Ying Wen, Haifeng Zhang, Jun Wang, Bo Xu:
  Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Tackles All SMAC Tasks. CoRR abs/2112.02845 (2021)
last updated on 2024-11-22 19:47 CET by the dblp team
all metadata released as open data under CC0 1.0 license