Bigger Isn't Better: the ethical and scientific vices of extra-large datasets in language models

Goetze, T.S. (orcid.org/0000-0002-3435-3264) and Abramson, D. (2021) Bigger Isn't Better: the ethical and scientific vices of extra-large datasets in language models. In: WebSci '21: 13th ACM Web Science Conference 2021 Proceedings. WebSci '21: 13th ACM Web Science Conference, 21-25 Jun 2021, Virtual conference. Association for Computing Machinery, pp. 69-75. ISBN 9781450385251

Abstract

Metadata

Authors/Creators: Goetze, T.S. (orcid.org/0000-0002-3435-3264); Abramson, D.
Copyright, Publisher and Additional Information: © 2021 The Authors. This is an author-produced version of a paper subsequently published in WebSci '21: 13th ACM Web Science Conference Proceedings. Uploaded in accordance with the publisher's self-archiving policy.
Keywords: computer ethics; natural language processing; computing profession; free and open source software; philosophy of computer science
Dates:
  • Accepted: 20 May 2021
  • Published (online): 21 June 2021
  • Published: 21 June 2021
Institution: The University of Sheffield
Academic Units: The University of Sheffield > Faculty of Arts and Humanities (Sheffield) > Department of Philosophy (Sheffield)
Funding Information:
  • Funder: Social Sciences and Humanities Research Council
  • Grant number: BPF-162695
Depositing User: Symplectic Sheffield
Date Deposited: 30 Jun 2021 10:05
Last Modified: 30 Jun 2021 10:42
Status: Published
Publisher: Association for Computing Machinery
Refereed: Yes
Identification Number: https://doi.org/10.1145/3462741.3466809
