The University of Southampton
University of Southampton Institutional Repository

Correspondence analysis: handling cell-wise outliers via the reconstitution algorithm

Qianqian Qi, David J. Hessen, Aike N. Vonk, Peter Van Der Heijden

Record type: UNSPECIFIED

Abstract

Correspondence analysis (CA) is a popular technique for visualizing the relationship between two categorical variables. CA uses the data from a two-way contingency table and is affected by the presence of outliers. The supplementary points method is a popular way to handle outliers, but it has the disadvantage that the information from entire rows or columns is removed, even though an outlier may be caused by individual cells only. In this paper, a reconstitution algorithm is introduced to cope with such cells. This algorithm can reduce the contribution of outlying cells in CA instead of deleting entire rows or columns, so that the remaining information in the rows and columns involved can still be used in the analysis. The reconstitution algorithm is compared with two alternative methods for handling outliers, the supplementary points method and MacroPCA. It is shown that the proposed strategy works well.
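
For readers unfamiliar with CA, the following is a minimal sketch of the classical computation on a two-way contingency table, together with each cell's share of the total inertia, which is one simple way to see how a single cell can dominate the analysis. It is not the reconstitution algorithm of the paper, whose details are not given in this record. The function name `correspondence_analysis` and the example counts are hypothetical and chosen only for illustration.

```python
import numpy as np


def correspondence_analysis(table):
    """Classical CA of a two-way contingency table via the SVD.

    Illustrative sketch only; this is NOT the paper's reconstitution algorithm.
    """
    N = np.asarray(table, dtype=float)
    P = N / N.sum()                      # correspondence matrix (proportions)
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    expected = np.outer(r, c)            # proportions expected under independence
    S = (P - expected) / np.sqrt(expected)   # standardized residuals
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    row_coords = (U * sv) / np.sqrt(r)[:, None]     # principal row coordinates
    col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]  # principal column coordinates
    cell_share = S**2 / (S**2).sum()     # each cell's share of total inertia
    return row_coords, col_coords, sv**2, cell_share


# Hypothetical 4x3 table (made-up counts) with one inflated cell in the last row.
table = np.array([[20, 15, 10],
                  [18, 14, 12],
                  [22, 16, 11],
                  [19, 13, 90]])

row_coords, col_coords, inertias, cell_share = correspondence_analysis(table)
print(np.round(cell_share, 3))
```

In this made-up table the inflated cell in the last row and last column accounts for the largest share of the total inertia, which illustrates why treating the single cell, rather than its whole row or column, can be attractive.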

Text
Correspondence analysis- handling cell-wise outliers via the reconstitution algorithm - Author's Original
Download (380kB)

More information

Accepted/In Press date: 26 April 2024

Identifiers

Local EPrints ID: 490023
URI: http://eprints.soton.ac.uk/id/eprint/490023
ISSN: 2331-8422
PURE UUID: 8b8fbccf-b5db-405d-abe3-4354dda147ec
ORCID for Peter Van Der Heijden: orcid.org/0000-0002-3345-096X

Catalogue record

Date deposited: 13 May 2024 17:03
Last modified: 14 May 2024 01:44

Contributors

Author: Qianqian Qi
Author: David J. Hessen
Author: Aike N. Vonk
Author: Peter Van Der Heijden
