The University of Southampton
University of Southampton Institutional Repository

Automatic Arabic Text Classification

Al-Harbi, S
Almuhareb, A
Al-Thubaity, A
Khorsheed, M. S.
Al-Rajeh, A

Al-Harbi, S, Almuhareb, A, Al-Thubaity, A, Khorsheed, M. S. and Al-Rajeh, A (2008) Automatic Arabic Text Classification. Proceedings of The 9th International Conference on the Statistical Analysis of Textual Data, France.

Record type: Conference or Workshop Item (Paper)

Abstract

Automated document classification is an important text mining task, especially given the rapid growth in the number of online documents in Arabic. Text classification aims to automatically assign a text to a predefined category based on its linguistic features. This process has many useful applications, including e-mail spam detection, web page content filtering, and automatic message routing. This paper presents the results of document classification experiments on seven different Arabic corpora using a statistical methodology, and evaluates the performance of two popular classification algorithms on those corpora.

Text
Arabic-Classification.pdf - Other
Download (290kB)

More information

Published date: March 2008
Venue - Dates: Proceedings of The 9th International Conference on the Statistical Analysis of Textual Data, France, 2008-03-01
Organisations: Electronics & Computer Science, Southampton Wireless Group

Identifiers

Local EPrints ID: 272254
URI: https://eprints.soton.ac.uk/id/eprint/272254
PURE UUID: d9af8b3d-f7a5-402c-9db0-10a438968fb2

Catalogue record

Date deposited: 05 May 2011 18:21
Last modified: 01 Dec 2017 17:33
