Predoaia, Ionut (orcid.org/0000-0002-2009-4054) and García-López, Pedro (2025) A Cloud-Agnostic Serverless Architecture for Distributed Machine Learning. In: Proceedings - 2024 IEEE/ACM International Conference on Big Data Computing, Applications and Technologies, BDCAT 2024. 11th IEEE/ACM International Conference on Big Data Computing, Applications and Technologies, BDCAT 2024, 16-19 Dec 2024. IEEE, ARE, pp. 131-140.
Abstract
Serverless computing has shown vast potential for big data analytics applications, especially those involving machine learning algorithms. Nevertheless, the literature has given little consideration to cloud-agnostic serverless architectures that leverage existing parallel implementations of machine learning algorithms. This work bridges this gap by proposing a multi-cloud serverless architecture for distributed machine learning that enables machine learning engineers without cloud computing expertise to effortlessly port already implemented parallel machine learning algorithms to serverless, whilst overcoming vendor lock-in. In this work, two stateful machine learning algorithms have been ported to serverless: k-means clustering and logistic regression. The serverless implementation of k-means provided superior performance and scalability compared to a serverful implementation when using a number of workers equal to or slightly lower than the total number of vCPUs available on the VM running the serverful implementation. Additionally, it achieved an 87-fold speedup compared to a sequential implementation. Moreover, two storage designs of the shared state are proposed for the serverless implementations: one that requires locks for updating the shared state, and another that is lock-free. Our experimental evaluation demonstrates that the performance of the lock-free serverless implementation of k-means declines as the number of clusters increases.
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Predoaia, Ionut (orcid.org/0000-0002-2009-4054); García-López, Pedro |
Copyright, Publisher and Additional Information: | This is an author-produced version of the published paper. Uploaded in accordance with the University’s Research Publications and Open Access policy. |
Keywords: | Distributed Machine Learning, Big Data, Serverless Architectures, Cloud Agnostic, Multicloud, Lithops |
Institution: | The University of York |
Academic Units: | The University of York > Faculty of Sciences (York) > Computer Science (York) |
Depositing User: | Pure (York) |
Date Deposited: | 20 May 2025 11:20 |
Last Modified: | 29 May 2025 23:14 |
Published Version: | https://doi.org/10.1109/BDCAT63179.2024.00032 |
Status: | Published |
Publisher: | IEEE |
Identification Number: | 10.1109/BDCAT63179.2024.00032 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:226782 |
Download
Filename: A_Cloud_Agnostic_Serverless_Architecture_for_Distributed_Machine_Learning.pdf
Licence: CC-BY 2.5