Optimizing latency in multi-agent systems

Prum, Sophearoth

dc.contributor.author	Prum, Sophearoth
dc.contributor.department	Chalmers tekniska högskola / Institutionen för data och informationsteknik	sv
dc.contributor.department	Chalmers University of Technology / Department of Computer Science and Engineering	en
dc.date.accessioned	2026-06-30T11:45:14Z
dc.date.issued	2026
dc.date.submitted
dc.description.abstract	Large Language Model (LLM)-based multi-agent systems are increasingly used to solve complex tasks through collaboration between specialized agents. However, the use of multiple agents, tool invocations, and inter-agent communication can introduce significant latency and cost, limiting practical deployment. This thesis investigates how architectural optimizations affect the performance of an LLM-based multi-agent system. A financial analysis pipeline was implemented using the Agent-to-Agent (A2A) protocol for inter-agent communication and the Model Context Protocol (MCP) for tool use. Four cumulative optimization techniques were evaluated: agent parallelization, tool batching, schema pruning, and model assignment. Performance was assessed using end-to-end latency, inference cost, and output quality. The results show that agent parallelization provides negligible latency improvement under the evaluated deployment conditions because of contention on the shared model endpoint. In contrast, tool batching reduces median latency by up to 27.4% and inference cost by 54.6% while improving output quality from 4.18 to 5.00. Schema pruning and model assignment further reduce inference cost by up to 77.1% compared to the baseline without degrading quality. Overall, the results suggest that reducing tool invocation overhead and unnecessary context transfer provides greater benefits than agent-level parallelization in the evaluated multi-agent architecture.
dc.identifier.uri	https://hdl.handle.net/20.500.12380/311685
dc.language.iso	eng
dc.setspec.uppsok	Technology
dc.subject	Agent-to-Agent, Multi-agent system, Optimization, Latency, MCP
dc.title	Optimizing latency in multi-agent systems
dc.type.degree	Examensarbete på kandidatnivå	sv
dc.type.degree	Bachelor Thesis	en
dc.type.uppsok	M2

Original bundle

Name: CSE 26-18 SP.pdf
Size: 1.31 MB
Format: Adobe Portable Document Format

Collections

Bachelor's theses