More, Amit; Anwar, Tarique (orcid.org/0000-0001-7157-0236) and Yadav, Poonam (orcid.org/0000-0003-0169-0704) (Accepted: 2026) CEDAR: Carbon Efficient Dynamic Allocation and Routing for Agentic LLM Inference. In: GreenSys ’26, pp. 1-6. (In Press)
Abstract
LLM inference now dominates operational AI compute, yet production serving stacks typically optimise for performance alone, leaving cost and carbon unmanaged. We present CEDAR, a queue-level, multi-objective control framework for agentic LLM inference that jointly optimises tail latency, cloud cost, and marginal carbon emissions. CEDAR observes backlog depth, waiting-time percentiles, and service level objective (SLO) slack to route mixed-criticality requests across heterogeneous, geo-distributed fleets. In trace-driven evaluation, CEDAR reduces cost by up to 26% and carbon by up to 27% relative to a Performance-Only baseline, while maintaining competitive p95 latency (0.88 s) and a low SLO violation rate (4.3%). These results indicate that queue-level control is a practical path to sustainable agentic inference without unacceptable QoS degradation.
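To make the routing idea in the abstract concrete, the following is a minimal illustrative sketch, not CEDAR's actual algorithm, which this record does not describe. It assumes a hypothetical `Region` record, a simple weighted sum over normalised latency, cost, and carbon, and a rule that critical requests go to the region with the most SLO slack. All names, weights, and numbers are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Region:
    # Hypothetical per-region queue and price signals (not taken from the paper).
    name: str
    backlog: int             # requests currently queued
    p95_wait_s: float        # observed p95 waiting time in seconds
    cost_per_req: float      # cloud cost per request
    carbon_g_per_req: float  # marginal carbon per request, in gCO2e


def slo_slack(region: Region, slo_s: float) -> float:
    """Seconds of headroom between the SLO target and the observed p95 wait."""
    return slo_s - region.p95_wait_s


def route(regions, slo_s, critical, w_lat=1.0, w_cost=1.0, w_carbon=1.0):
    """Pick a region for one request (assumed policy, for illustration only)."""
    # Prefer regions that still have positive SLO slack; fall back to all.
    feasible = [r for r in regions if slo_slack(r, slo_s) > 0] or list(regions)
    if critical:
        # Critical requests ignore cost and carbon and maximise slack.
        return max(feasible, key=lambda r: slo_slack(r, slo_s))

    # Normalise each objective by its maximum so the weights are comparable.
    max_wait = max(r.p95_wait_s for r in feasible) or 1.0
    max_cost = max(r.cost_per_req for r in feasible) or 1.0
    max_carbon = max(r.carbon_g_per_req for r in feasible) or 1.0

    def score(r: Region) -> float:
        return (w_lat * r.p95_wait_s / max_wait
                + w_cost * r.cost_per_req / max_cost
                + w_carbon * r.carbon_g_per_req / max_carbon)

    # Lowest weighted score wins; a shorter backlog breaks ties.
    return min(feasible, key=lambda r: (score(r), r.backlog))


fast = Region("fast", backlog=5, p95_wait_s=0.4, cost_per_req=0.010, carbon_g_per_req=400.0)
green = Region("green", backlog=20, p95_wait_s=0.8, cost_per_req=0.007, carbon_g_per_req=150.0)
overloaded = Region("overloaded", backlog=200, p95_wait_s=2.5, cost_per_req=0.005, carbon_g_per_req=100.0)
fleet = [fast, green, overloaded]

print(route(fleet, slo_s=1.0, critical=True).name)
print(route(fleet, slo_s=1.0, critical=False).name)
```

With these made-up numbers, the critical request goes to the region with the most slack, while the non-critical request is steered to a cheaper, lower-carbon region that still meets the SLO. The overloaded region is excluded despite its low cost and carbon because it has no slack left.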
Metadata
| Item Type: | Proceedings Paper |
|---|---|
| Authors/Creators: | More, Amit; Anwar, Tarique; Yadav, Poonam |
| Copyright, Publisher and Additional Information: | © 2026 Copyright held by the owner/author(s) |
| Dates: | Accepted: 2026 |
| Institution: | The University of York |
| Academic Units: | The University of York > Faculty of Sciences (York) > Computer Science (York) |
| Funding Information: | EPSRC, grant number EP/X040518/1; EPSRC, grant number EP/Y019229/1 |
| Date Deposited: | 02 Apr 2026 15:00 |
| Last Modified: | 06 May 2026 05:05 |
| Status: | In Press |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:239602 |
Download
Filename: Eurosys-GreenSys-Energy_Simulated_data.pdf
Description: Eurosys-GreenSys-Energy_Simulated_data
Licence: CC-BY-NC-ND 2.5


