Neuroplasticity in Genetic Programming Agents for Adaptive and Continual Decision Making



Abstract

Dynamically decomposing complex tasks into reusable sub-policies remains a core challenge in reinforcement learning. Tangled Program Graphs (TPG), a genetic-programming framework for general-purpose machine learning (applied here to reinforcement learning), addresses this by evolving connections between different agents in order to break complex problems down into manageable sub-problems. Inspired by memetic algorithms, which accelerate evolutionary search through agentic refinement, we introduce Neuro-Tangled Program Graphs (NeuroTPG). This biologically grounded extension uses hierarchical plasticity within the structure of an agent, applying a homeostatic rule at the initial decision edges and a competitive Oja-style update at each subsequent decision edge. Evaluated on both a static and a dynamic variant of the MuJoCo Ant environment, this approach yields higher peak returns and uses 59-88% fewer mean effective instructions per step, demonstrating stronger performance and a more compact search. Next, we add a TD-style online value baseline and eligibility traces to stabilize dense step-wise rewards and distribute them over time, sharpening temporal updates within each agent. We then examine how trace length and a per-team plasticity decay factor shape learning dynamics. To set these, we compare end-to-end evolutionary tuning with MAP-Elites using a multi-archive that explores (trace length × decay). The benefits of reward modulation are then tested with TPG and NeuroTPG variants on a customized static and dynamic maze environment. This addition shows consistently better performance across all seeds and a more interpretable final structure. Overall, our findings highlight the vital role of local search within population-based search algorithms. We hope this work opens a new avenue toward gradient-free memetic algorithms, which can draw benefits and opportunities from several already well-developed fields of study.
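
For readers unfamiliar with the term, the following is a minimal sketch of the classical (textbook) Oja rule that "Oja-style update" alludes to. It is not the update used in this work: the abstract does not give the competitive variant, the homeostatic rule, or how either is attached to decision edges, so the function name `oja_update`, the learning rate `lr`, and the plain weight-vector representation are all illustrative assumptions.

```python
def oja_update(w, x, lr=0.1):
    """One step of the classical Oja rule: w <- w + lr * y * (x - y * w).

    Hypothetical helper for illustration only; the thesis's competitive,
    per-edge variant is not specified in the abstract.
    """
    # Post-synaptic activity is the dot product of weights and input.
    y = sum(wi * xi for wi, xi in zip(w, x))
    # Hebbian term (y * x) minus a decay term (y^2 * w) that bounds the weights.
    return [wi + lr * y * (xi - y * wi) for wi, xi in zip(w, x)]


if __name__ == "__main__":
    w = [0.5, 0.5]
    for _ in range(500):
        # Repeatedly presenting one input pulls w toward that input's direction.
        w = oja_update(w, [1.0, 0.0])
    print(w)
```

The decay term is what distinguishes Oja's rule from plain Hebbian learning: it keeps the weight norm bounded without a separate normalization step, which is one reason such rules suit a gradient-free local search.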
