University of Southampton Institutional Repository

Improving k-nearest neighbor pattern recognition models for privacy-preserving data analysis


Romsaiyud, Walisa, Schnoor, Henning and Hasselbring, Wilhelm (2020) Improving k-nearest neighbor pattern recognition models for privacy-preserving data analysis. Baru, Chaitanya, Huan, Jun, Khan, Latifur, Hu, Xiaohua Tony, Ak, Ronay, Tian, Yuanyuan, Barga, Roger, Zaniolo, Carlo, Lee, Kisung and Ye, Yanfang Fanny (eds.) In 2019 IEEE International Conference on Big Data (Big Data). IEEE. pp. 5804-5813. (doi:10.1109/BigData47090.2019.9006281).

Record type: Conference or Workshop Item (Paper)

Abstract

Supervised learning classification models are trained on labeled data to predict discrete class labels. A major challenge addressed in this paper is training a machine learning model to recognize patterns in both the original datasets and their privacy-preserving counterparts in order to improve predictive models. When the training and validation datasets mix original data with privacy-preserving data, the model can overfit because of high variance in the machine learning algorithm. This paper uses the k-Nearest Neighbor (k-NN) algorithm to build models and applies an automated hyperparameter tuning method that determines the optimal parameters, based on the characteristics of the data, before training on large-volume datasets. The models are evaluated against the goal of high accuracy, in terms of both prediction quality and model performance. Experiments on our real datasets and on datasets from the UCI Machine Learning Repository identify the best method across all of the training data, and different experiments are conducted to improve the accuracy, feasibility, correctness, and reliability of the scheme.
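
As an illustration of the general idea in the abstract (a k-NN classifier whose hyperparameter is chosen automatically before final training), the following is a minimal sketch and not the authors' method. It assumes plain Euclidean distance, majority voting, and an exhaustive search over a small, hypothetical list of candidate k values scored by cross-validated accuracy. The toy two-class data and all function names (`knn_predict`, `cv_accuracy`, `tune_k`, `make_toy_data`) are invented for the example, and no privacy-preserving transformation is modelled because the record does not specify one.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Predict the label of x by majority vote among its k nearest training points."""
    # Rank training samples by Euclidean distance to the query point.
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    # On a tied vote, Counter keeps first-seen order, so the closer label wins.
    return votes.most_common(1)[0][0]


def cv_accuracy(X, y, k, folds=5):
    """Mean k-NN accuracy over `folds` interleaved cross-validation splits."""
    n = len(X)
    scores = []
    for f in range(folds):
        test_idx = set(range(f, n, folds))  # every folds-th sample, offset f
        tr_X = [X[i] for i in range(n) if i not in test_idx]
        tr_y = [y[i] for i in range(n) if i not in test_idx]
        hits = sum(knn_predict(tr_X, tr_y, X[i], k) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / folds


def tune_k(X, y, candidates=(1, 3, 5, 7, 9), folds=5):
    """Return (best_k, best_score); ties are broken in favour of the smaller k."""
    scored = [(cv_accuracy(X, y, k, folds), k) for k in candidates]
    best_score, best_k = max(scored, key=lambda t: (t[0], -t[1]))
    return best_k, best_score


def make_toy_data(n_per_class=40, seed=0):
    """Two Gaussian blobs in 2-D; purely synthetic, for demonstration only."""
    rng = random.Random(seed)
    X, y = [], []
    for label, (cx, cy) in enumerate([(0.0, 0.0), (3.0, 3.0)]):
        for _ in range(n_per_class):
            X.append((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)))
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    best_k, best_score = tune_k(X, y)
    print(f"selected k={best_k}, cross-validated accuracy={best_score:.3f}")
```

Scoring each candidate on held-out folds rather than on the training data is what guards against the overfitting the abstract mentions: k = 1 always fits the training set perfectly, so training accuracy alone cannot distinguish good settings from bad ones.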

This record has no associated files available for download.

More information

e-pub ahead of print date: 24 February 2020
Venue - Dates: 2019 IEEE International Conference on Big Data, Big Data 2019, Los Angeles, United States, 2019-12-09 - 2019-12-12
Keywords: automated hyperparameter tuning, k-Nearest Neighbor (k-NN), privacy-preserving data, supervised learning classification models

Identifiers

Local EPrints ID: 488760
URI: http://eprints.soton.ac.uk/id/eprint/488760
PURE UUID: 70cb2e83-392b-478f-aff7-772a9310c8ef
ORCID for Wilhelm Hasselbring: orcid.org/0000-0001-6625-4335

Catalogue record

Date deposited: 05 Apr 2024 16:36
Last modified: 10 Apr 2024 02:15

Contributors

Author: Walisa Romsaiyud
Author: Henning Schnoor
Author: Wilhelm Hasselbring
Editor: Chaitanya Baru
Editor: Jun Huan
Editor: Latifur Khan
Editor: Xiaohua Tony Hu
Editor: Ronay Ak
Editor: Yuanyuan Tian
Editor: Roger Barga
Editor: Carlo Zaniolo
Editor: Kisung Lee
Editor: Yanfang Fanny Ye
