Wu, Z., Dong, Y.-N., Wei, H.-L. orcid.org/0000-0002-4704-7346 et al. (1 more author) (2020) Consistency measure based simultaneous feature selection and instance purification for multimedia traffic classification. Computer Networks, 173. 107190. ISSN 1389-1286
Abstract
With the increase of multimedia traffic, fast and accurate classification has become an important issue. In addition, a manually captured dataset contains a certain amount of noise and mislabeled instances, which reduces classifier accuracy to some extent. Motivated by these observations, a novel feature selection and instance purification (FS&IP) method based on a consistency measure is proposed. It uses a linear consistency-constrained algorithm for feature selection. In each iteration, it removes the instances with minority labels in every pattern subset. The method has three desirable properties: 1) It achieves feature selection and data purification simultaneously. 2) When purifying instances, it does not need to annotate the noisy instances with learned labels, because it is unsupervised with respect to data purification. 3) Through data purification, it can obtain a minimal feature subset while maintaining accuracy. In addition, the proposed method can be used to discover a new discriminative feature based on linking behaviors, called the flow fragment (F-Frag), which can reflect important information among the complex and numerous packet communication behaviors. Experimental results on six different datasets demonstrate the advantages of the proposed technique over six existing methods, as well as the discriminative power of the new flow fragment feature.
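The purification step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`group_by_pattern`, `inconsistency_rate`, `purify`), the data layout (rows as tuples of discrete feature values), and the tie-breaking rule are all assumptions made for illustration. A "pattern" is taken here to be the tuple of values an instance has on the selected features, and the inconsistency rate follows the common definition of the fraction of instances that do not carry the majority label of their pattern.

```python
from collections import Counter, defaultdict


def group_by_pattern(rows, labels, features):
    """Group (index, label) pairs by the instance's values on `features`."""
    groups = defaultdict(list)
    for idx, (row, label) in enumerate(zip(rows, labels)):
        pattern = tuple(row[f] for f in features)
        groups[pattern].append((idx, label))
    return groups


def inconsistency_rate(rows, labels, features):
    """Fraction of instances whose label is not the majority label of their pattern."""
    minority = 0
    for members in group_by_pattern(rows, labels, features).values():
        counts = Counter(label for _, label in members)
        minority += len(members) - max(counts.values())
    return minority / len(rows)


def purify(rows, labels, features):
    """Return sorted indices of instances carrying their pattern's majority label.

    Assumption: on a tie, the label seen first in the pattern subset is kept.
    """
    kept = []
    for members in group_by_pattern(rows, labels, features).values():
        counts = Counter(label for _, label in members)
        majority = counts.most_common(1)[0][0]
        kept.extend(idx for idx, label in members if label == majority)
    return sorted(kept)


# Hypothetical toy data: two discrete features, two traffic classes.
rows = [(1, 0), (1, 0), (1, 0), (0, 1), (0, 1)]
labels = ["video", "video", "audio", "audio", "audio"]

rate = inconsistency_rate(rows, labels, [0, 1])
kept_indices = purify(rows, labels, [0, 1])
```

In this toy data, the third instance shares a pattern with two "video" instances but is labeled "audio", so it is the one a purification pass of this kind would drop. No learned label is assigned to it, which matches the abstract's point that purification does not require re-annotating noisy instances.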
Metadata
Item Type: | Article |
---|---|
Copyright, Publisher and Additional Information: | © 2020 Elsevier. This is an author produced version of a paper subsequently published in Computer Networks. Uploaded in accordance with the publisher's self-archiving policy. Article available under the terms of the CC-BY-NC-ND licence (https://creativecommons.org/licenses/by-nc-nd/4.0/). |
Keywords: | Traffic classification; Feature selection; Instance purification; Flow fragment |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Automatic Control and Systems Engineering (Sheffield) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 08 Apr 2020 09:23 |
Last Modified: | 10 Mar 2021 01:38 |
Status: | Published |
Publisher: | Elsevier |
Refereed: | Yes |
Identification Number: | 10.1016/j.comnet.2020.107190 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:159329 |