Items where Funder is Foreign, Commonwealth and Development Office.
The University of Leeds
Cohn, A.G. orcid.org/0000-0002-7652-8907 and Blackwell, R.E. (2024) Evaluating the Ability of Large Language Models to Reason about Cardinal Directions. [Preprint]
Li, F. orcid.org/0000-0002-1109-6285, Hogg, D.C. orcid.org/0000-0002-6125-9564 and Cohn, A.G. orcid.org/0000-0002-7652-8907 (2024) Advancing Spatial Reasoning in Large Language Models: An In-Depth Evaluation and Enhancement Using the StepGame Benchmark. In: Wooldridge, M.J., Dy, J.G. and Natarajan, S., (eds.) Proceedings of the AAAI Conference on Artificial Intelligence. Thirty-Eighth AAAI Conference on Artificial Intelligence, 20-27 Feb 2024, Vancouver, Canada. AAAI, pp. 18500-18507. ISBN 978-1-57735-887-9
Malfa, E.L., Petrov, A., Frieder, S. et al. (5 more authors) (2023) The ARRT of Language-Models-as-a-Service: Overview of a New Paradigm and its Challenges. [Preprint]
Huang, X.A., Malfa, E.L., Marro, S. et al. (3 more authors) (2024) A Notion of Complexity for Theory of Mind via Discrete World Models. [Preprint]
Cohn, A.G. orcid.org/0000-0002-7652-8907 and Blackwell, R.E. (2024) Can Large Language Models Reason about the Region Connection Calculus? [Preprint]
Lin, F., La Malfa, E., Hofmann, V. et al. (3 more authors) (2024) Graph-enhanced large language models in asynchronous plan reasoning. In: Proceedings of Machine Learning Research. ICML'24: Proceedings of the 41st International Conference on Machine Learning, 21-27 Jul 2024, Vienna, Austria. ACM, pp. 30108-30134.
Li, F., Hogg, D.C. and Cohn, A.G. orcid.org/0000-0002-7652-8907 (2024) Reframing Spatial Reasoning Evaluation in Language Models: A Real-World Simulation Benchmark for Qualitative Reasoning. [Preprint]