Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/2739480.2754744acmconferencesArticle/Chapter ViewAbstractPublication PagesgeccoConference Proceedingsconference-collections
research-article

Solving Interleaved and Blended Sequential Decision-Making Problems through Modular Neuroevolution

Published: 11 July 2015 Publication History

Abstract

Many challenging sequential decision-making problems require agents to master multiple tasks, such as defense and offense in many games. Learning algorithms thus benefit from having separate policies for these tasks, and from knowing when each one is appropriate. How well the methods work depends on the nature of the tasks: Interleaved tasks are disjoint and have different semantics, whereas blended tasks have regions where semantics from different tasks overlap. While many methods work well in interleaved tasks, blended tasks are difficult for methods with strict, human-specified task divisions, such as Multitask Learning. In such problems, task divisions should be discovered automatically. To demonstrate the power of this approach, the MM-NEAT neuroevolution framework is applied in this paper to two variants of the challenging video game of Ms. Pac-Man. In the simplified interleaved version of the game, the results demonstrate when and why such machine-discovered task divisions are useful. In the standard blended version of the game, a surprising, highly effective machine-discovered task division surpasses human-specified divisions, achieving the best scores to date in this game. Modular neuroevolution is thus a promising technique for discovering multimodal behavior for challenging real-world tasks.

References

[1]
A. M. Alhejali and S. M. Lucas. Evolving Diverse Ms. Pac-Man Playing Agents Using Genetic Programming. In UKCI, 2010.
[2]
A. M. Alhejali and S. M. Lucas. Using Genetic Programming to Evolve Heuristics for a Monte Carlo Tree Search Ms Pac-Man Agent. In CIG, pages 65--72. IEEE, 2013.
[3]
A. G. Barto and S. Mahadevan. Recent Advances in Hierarchical Reinforcement Learning. Discrete Event Dynamic Systems, pages 41--77, 2003.
[4]
M. F. Brandstetter and S. Ahmadi. Reactive Control of Ms. Pac Man Using Information Retrieval Based on Genetic Programming. In CIG, pages 250--256. IEEE, 2012.
[5]
R. A. Brooks. A Robust Layered Control System for a Mobile Robot. Robotics and Automation, 2(10), 1986.
[6]
R. Calabretta, S. Nolfi, D. Parisi, and G. Wagner. Duplication of Modules Facilitates the Evolution of Functional Specialization. ALife, 6(1):69--84, 2000.
[7]
R. A. Caruana. Multitask Learning: A Knowledge-based Source of Inductive Bias. In ICML, pages 41--48, 1993.
[8]
J. Clune, J.-B. Mouret, and H. Lipson. The Evolutionary Origins of Modularity. Royal Society B, pages 20122863--20122863, 2013.
[9]
K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan. A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II. Evolutionary Computation, 6:182--197, 2002.
[10]
T. G. Dietterich. The MAXQ Method for Hierarchical Reinforcement Learning. In ICML, 1998.
[11]
B. Hengst. Discovering Hierarchy in Reinforcement Learning with HEXQ. In ICML, pages 243--250, 2002.
[12]
J. Huizinga, J.-B. Mouret, and J. Clune. Evolving Neural Networks That Are Both Modular and Regular: HyperNeat Plus the Connection Cost Technique. In GECCO, pages 697--704. ACM, 2014.
[13]
N. Kashtan and U. Alon. Spontaneous Evolution of Modularity and Network Motifs. National Academy of Sciences, 102(39):13773--13778, 2005.
[14]
J. D. Knowles, R. A. Watson, and D. Corne. Reducing Local Optima in Single-Objective Problems by Multi-objectivization. In EMO, pages 269--283. Springer, 2001.
[15]
N. Kohl and R. Miikkulainen. Evolving Neural Networks for Strategic Decision-Making Problems. Neural Networks, Special issue on Goal-Directed Neural Systems, 2009.
[16]
G. Konidaris and A. Barto. Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining. In NIPS, pages 1015--1023, 2009.
[17]
J. R. Koza. Genetic Programming II: Automatic Discovery of Reusable Programs. MIT Press, 1994.
[18]
D. Lessin, D. Fussell, and R. Miikkulainen. Open-Ended Behavioral Complexity for Evolved Virtual Creatures. In GECCO, pages 335--342. ACM, 2013.
[19]
J.-B. Mouret and S. Doncieux. MENNAG: A Modular, Regular and Hierarchical Encoding for Neural-networks Based on Attribute Grammars. Evolutionary Intelligence, 2008.
[20]
G. Recio, E. Martín, C. Estébanez, and Y. Sáez. AntBot: Ant Colonies for Video Games. TCIAIG, 4(4):295--308, 2012.
[21]
P. Rohlfshagen and S. M. Lucas. Ms Pac-Man versus Ghost Team CEC 2011 Competition. In CEC, pages 70--77. IEEE, 2011.
[22]
J. P. Rosca. Generality Versus Size in Genetic Programming. In GP, pages 381--387. MIT Press, 1996.
[23]
J. Schrum and R. Miikkulainen. Evolving Multimodal Networks for Multitask Games. TCIAIG, 4(2):94--111, 2012.
[24]
J. Schrum and R. Miikkulainen. Evolving Multimodal Behavior With Modular Neural Networks in Ms. Pac-Man. In GECCO, pages 325--332. ACM, 2014.
[25]
K. O. Stanley, B. D. Bryant, I. Karpov, and R. Miikkulainen. Real-Time Evolution of Neural Networks in the NERO Video Game. In National Conference on Artificial Intelligence, 2006.
[26]
K. O. Stanley and R. Miikkulainen. Evolving Neural Networks Through Augmenting Topologies. Evolutionary Computation, pages 99--127, 2002.
[27]
P. Stone and M. Veloso. Layered Learning. In ECML, pages 369--381. Springer Verlag, 2000.
[28]
R. S. Sutton and A. G. Barto. Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA, 1998.
[29]
R. S. Sutton, D. Precup, and S. P. Singh. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning. Artificial Intelligence, 112(1--2):181--211, 1999.
[30]
T. Thompson, F. Milne, A. Andrew, and J. Levine. Improving Control Through Subsumption in the EvoTanks Domain. In CIG, pages 363--370. IEEE, 2009.
[31]
J. Togelius. Evolution of a Subsumption Architecture Neurocontroller. Intelligent and Fuzzy Systems, pages 15--20, 2004.
[32]
N. van Hoorn, J. Togelius, and J. Schmidhuber. Hierarchical Controller Learning in a First-Person Shooter. In CIG, pages 294--301. IEEE, 2009.
[33]
P. Verbancsics and K. O. Stanley. Constraining Connectivity to Encourage Modularity in HyperNEAT. In GECCO, pages 1483--1490. ACM, 2011.

Cited By

View all
  • (2016)Solving multiple isolated, interleaved, and blended tasks through modular neuroevolutionEvolutionary Computation10.1162/EVCO_a_0018124:3(459-490)Online publication date: 1-Sep-2016
  • (2015)Evolving Neural NetworksProceedings of the Companion Publication of the 2015 Annual Conference on Genetic and Evolutionary Computation10.1145/2739482.2756577(137-161)Online publication date: 11-Jul-2015

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
GECCO '15: Proceedings of the 2015 Annual Conference on Genetic and Evolutionary Computation
July 2015
1496 pages
ISBN:9781450334723
DOI:10.1145/2739480
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 11 July 2015

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. games
  2. multi-objective optimization
  3. neural networks

Qualifiers

  • Research-article

Funding Sources

Conference

GECCO '15
Sponsor:

Acceptance Rates

GECCO '15 Paper Acceptance Rate 182 of 505 submissions, 36%;
Overall Acceptance Rate 1,669 of 4,410 submissions, 38%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 16 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2016)Solving multiple isolated, interleaved, and blended tasks through modular neuroevolutionEvolutionary Computation10.1162/EVCO_a_0018124:3(459-490)Online publication date: 1-Sep-2016
  • (2015)Evolving Neural NetworksProceedings of the Companion Publication of the 2015 Annual Conference on Genetic and Evolutionary Computation10.1145/2739482.2756577(137-161)Online publication date: 11-Jul-2015

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media