University of Southampton Institutional Repository

A semi-supervised regression model for mixed numerical and categorical variables

Ng, Michael K., Chan, Elaine Y., So, M.C. and Ching, Wai-Ki (2007) A semi-supervised regression model for mixed numerical and categorical variables. Pattern Recognition, 40, (16), pp. 1745-1752. (doi:10.1016/j.patcog.2006.06.018).

Record type: Article


In this paper, we develop a semi-supervised regression algorithm to analyze data sets which contain both categorical and numerical attributes. This algorithm partitions the data sets into several clusters and at the same time fits a multivariate regression model to each cluster. This framework allows one to incorporate both multivariate regression models for numerical variables (supervised learning methods) and k-mode clustering algorithms for categorical variables (unsupervised learning methods). The estimates of regression models and k-mode parameters can be obtained simultaneously by minimizing a function which is the weighted sum of the least-square errors in the multivariate regression models and the dissimilarity measures among the categorical variables. Both synthetic and real data sets are presented to demonstrate the effectiveness of the proposed method.
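
The idea of minimizing a weighted sum of regression error and categorical dissimilarity can be illustrated with a minimal sketch. This is not the authors' algorithm as published, since the full text is not available here. It is a plain alternating scheme under several assumptions: a single response `y`, integer-coded categorical attributes, simple matching (mismatch count) as the k-mode dissimilarity, and a hypothetical weight `gamma` balancing the two terms. The function name `fit_cluster_regressions` and all parameter names are illustrative only.

```python
import numpy as np

def fit_cluster_regressions(X, C, y, k, gamma=1.0, n_iter=20, seed=0):
    """Alternating minimization sketch (assumed scheme, not the paper's exact method).

    X: (n, p) numerical attributes, C: (n, q) integer-coded categorical
    attributes, y: (n,) response, gamma: assumed weight on the mismatch term.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    Xb = np.hstack([np.ones((n, 1)), X])      # prepend an intercept column
    labels = rng.integers(0, k, size=n)       # random initial partition

    for _ in range(n_iter):
        betas = np.zeros((k, Xb.shape[1]))
        modes = np.zeros((k, C.shape[1]), dtype=C.dtype)

        # Step 1: with the partition fixed, refit each cluster's
        # least-squares regression and its categorical mode.
        for j in range(k):
            members = labels == j
            if not members.any():
                # Reseed an empty cluster with one random point.
                members = np.zeros(n, dtype=bool)
                members[rng.integers(n)] = True
            betas[j] = np.linalg.lstsq(Xb[members], y[members], rcond=None)[0]
            for a in range(C.shape[1]):
                vals, counts = np.unique(C[members, a], return_counts=True)
                modes[j, a] = vals[np.argmax(counts)]

        # Step 2: with the models fixed, reassign each point to the
        # cluster with the smallest combined cost.
        sq_err = (y[:, None] - Xb @ betas.T) ** 2                    # (n, k)
        mismatch = (C[:, None, :] != modes[None, :, :]).sum(axis=2)  # (n, k)
        cost = sq_err + gamma * mismatch
        new_labels = cost.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels

    objective = cost[np.arange(n), labels].sum()
    return labels, betas, modes, objective
```

Calling `fit_cluster_regressions(X, C, y, k=2)` returns the cluster labels, one coefficient vector per cluster (intercept first), one mode vector per cluster, and the value of the combined objective. Larger values of `gamma` make the categorical attributes dominate the partition, while `gamma = 0` reduces the sketch to clusterwise regression on the numerical attributes alone.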

Full text not available from this repository.

More information

Published date: 1 June 2007
Keywords: clustering, regression, data mining, numerical variables, categorical variables


Local EPrints ID: 180719
ISSN: 0031-3203
PURE UUID: 47e8a94e-764f-4643-a82b-ae48832aeb21

Catalogue record

Date deposited: 13 Apr 2011 14:53
Last modified: 18 Jul 2017 12:00

Author: Michael K. Ng
Author: Elaine Y. Chan
Author: M.C. So
Author: Wai-Ki Ching
