Multi-agent Reinforcement Learning for predator-prey ecosystem
| dc.contributor.author | Palak, Michal | |
| dc.contributor.department | Chalmers tekniska högskola / Institutionen för matematiska vetenskaper | sv |
| dc.contributor.examiner | Mostad, Petter | |
| dc.contributor.supervisor | Strannegård, Claes | |
| dc.date.accessioned | 2026-06-08T12:55:09Z | |
| dc.date.issued | 2026 | |
| dc.date.submitted | | |
| dc.description.abstract | Agent-based ecosystem models provide a flexible way to study population dynamics by simulating the behaviour of individual organisms. However, manually specifying realistic animal behaviour can be difficult, especially when agents must balance multiple needs and interact with other species. This thesis investigates the use of multi-agent reinforcement learning for simulating a simplified predator-prey ecosystem in a two-dimensional grid-based environment. A custom ecosystem environment was developed in which prey and predator agents interact with spatially distributed resources, including multiple grass types and water reservoirs. The model includes survival constraints based on energy, thirst, age, movement, reproduction, predation, and resource consumption. In contrast to simpler predator-prey simulations, the environment requires prey agents to balance grazing and drinking, creating a simple form of migration between food and water resources. Age-dependent movement speed was also introduced as a way to model increased vulnerability among young and old individuals. Several learning configurations were evaluated. In particular, the thesis compares a standard survival reward with a homeostatic reward based on internal energy and water levels, as well as two training setups for handling agent death. The results indicate that the homeostatic reward may improve early learning, although the difference was not statistically significant. The respawn-based training environment significantly improves training efficiency compared with the standard setup. Comparisons between PPO, TRPO, TQC and hand-coded agents showed broadly similar performance after extended training. Trained agents were also evaluated on satellite-derived terrain maps, where no significant reduction in the performance measure was observed compared with maps generated from Perlin noise. The results suggest that multi-agent reinforcement learning can be used to generate stable predator-prey dynamics in a spatially structured ecosystem while producing useful simulation statistics such as population trends, survival distributions, reproduction patterns, and spatial movement heatmaps. | |
| dc.identifier.coursecode | MVEX03 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.12380/311134 | |
| dc.language.iso | eng | |
| dc.setspec.uppsok | PhysicsChemistryMaths | |
| dc.subject | multi-agent reinforcement learning, predator-prey, Lotka-Volterra | |
| dc.title | Multi-agent Reinforcement Learning for predator-prey ecosystem | |
| dc.type.degree | Examensarbete för masterexamen | sv |
| dc.type.degree | Master's Thesis | en |
| dc.type.uppsok | H | |
| local.programme | Engineering mathematics and computational science (MPENM), MSc |
