Peng, P. orcid.org/0000-0003-2700-5699 and Yoshida, Y. (2020) Average sensitivity of spectral clustering. In: KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. The 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD 2020), 23-27 Aug 2020, Virtual Event, CA, USA. Association for Computing Machinery (ACM), pp. 1132-1140. ISBN 9781450379984
Abstract
Spectral clustering is one of the most popular clustering methods for finding clusters in a graph, and it has found many applications in data mining. However, the input graph in those applications may have many missing edges due to measurement errors, withholding for privacy reasons, or arbitrariness in data conversion. To make reliable and efficient decisions based on spectral clustering, we assess the stability of spectral clustering against edge perturbations in the input graph using the notion of average sensitivity, which is the expected size of the symmetric difference of the output clusters before and after we randomly remove edges. We first prove that the average sensitivity of spectral clustering is proportional to $\lambda_2/\lambda_3^2$, where $\lambda_i$ is the $i$-th smallest eigenvalue of the (normalized) Laplacian. We also prove an analogous bound for $k$-way spectral clustering, which partitions the graph into $k$ clusters. Then, we empirically confirm our theoretical bounds by conducting experiments on synthetic and real networks. Our results suggest that spectral clustering is stable against edge perturbations when there is a cluster structure in the input graph.
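The notion of average sensitivity described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' implementation: it assumes NumPy, a sign-based cut on the second eigenvector of the normalized Laplacian (a simplification of spectral clustering), and a perturbation model that removes a fixed number of edges uniformly at random. All function names and parameters are hypothetical.

```python
import numpy as np


def normalized_laplacian(adj):
    """Return I - D^{-1/2} A D^{-1/2} for a symmetric 0/1 adjacency matrix."""
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros(len(deg))
    mask = deg > 0
    inv_sqrt[mask] = deg[mask] ** -0.5  # isolated vertices keep a zero scale
    return np.eye(len(adj)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]


def spectral_bipartition(adj):
    """Assumed simplification: cut by the sign of the second eigenvector."""
    _, eigvecs = np.linalg.eigh(normalized_laplacian(adj))  # ascending eigenvalues
    return {i for i, x in enumerate(eigvecs[:, 1]) if x < 0}


def cluster_distance(s, t, n):
    """Size of the symmetric difference, up to swapping a cluster with its complement."""
    diff = len(s ^ t)
    return min(diff, n - diff)


def eigenvalue_ratio(adj):
    """The quantity lambda_2 / lambda_3^2 named in the abstract."""
    eigvals = np.linalg.eigvalsh(normalized_laplacian(adj))
    return eigvals[1] / eigvals[2] ** 2


def estimate_average_sensitivity(adj, num_removed=1, trials=200, seed=0):
    """Monte Carlo estimate under the assumed model: remove `num_removed`
    edges uniformly at random and average the resulting cluster distance."""
    rng = np.random.default_rng(seed)
    n = len(adj)
    base = spectral_bipartition(adj)
    edges = np.argwhere(np.triu(adj, k=1) > 0)  # each undirected edge once
    total = 0
    for _ in range(trials):
        drop = rng.choice(len(edges), size=num_removed, replace=False)
        perturbed = adj.copy()
        for u, v in edges[drop]:
            perturbed[u, v] = perturbed[v, u] = 0.0
        total += cluster_distance(base, spectral_bipartition(perturbed), n)
    return total / trials


def two_cliques(size):
    """Toy graph with clear cluster structure: two cliques joined by one edge."""
    n = 2 * size
    adj = np.zeros((n, n))
    adj[:size, :size] = 1.0
    adj[size:, size:] = 1.0
    np.fill_diagonal(adj, 0.0)
    adj[size - 1, size] = adj[size, size - 1] = 1.0
    return adj
```

For example, `estimate_average_sensitivity(two_cliques(5))` estimates how much the two-way partition of the toy graph moves when a single random edge is dropped, and `eigenvalue_ratio(two_cliques(5))` gives the corresponding spectral quantity. When the removed edge disconnects the graph, the second eigenvector is not unique, so the sign-based cut in this sketch can behave arbitrarily in that case.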
Metadata
Item Type: | Proceedings Paper |
---|---|
Copyright, Publisher and Additional Information: | © 2020 The Authors. This is an author-produced version of a paper subsequently published in KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. Uploaded in accordance with the publisher's self-archiving policy. |
Keywords: | Spectral clustering; Laplacian; average sensitivity |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 09 Jun 2020 09:43 |
Last Modified: | 16 Sep 2020 14:02 |
Status: | Published |
Publisher: | Association for Computing Machinery (ACM) |
Refereed: | Yes |
Identification Number: | 10.1145/3394486.3403166 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:161662 |