The University of Southampton
University of Southampton Institutional Repository

Düzenli Ifadeler ile Ingilizce Dil Gruplarinin Analiz Edilmesi

Duru, Ismail, Diri, Banu, Özçevik, M. Emir, Ataseven, Kerim, Doǧan, Gülüstan and White, Su (2018) Düzenli Ifadeler ile Ingilizce Dil Gruplarinin Analiz Edilmesi. In Proceedings - 2018 Innovations in Intelligent Systems and Applications Conference, ASYU 2018. IEEE. (doi:10.1109/ASYU.2018.8554018).

Record type: Conference or Workshop Item (Paper)

Abstract

In most widely used distance education platforms known as MOOCs (Massive Open Online Courses), the language of the lectures is English, yet they attract participants from many different countries. This leads to differences in learners' usage behaviors and performance. In our previous studies, we tried to divide users into language groups according to their English language proficiency. In this study, we used natural language processing techniques to improve the division of students into language groups and to automatically generate datasets for each language group from a distance education platform named FutureLearn. On the FutureLearn platform (as on other distance education platforms), learners do not have to provide their country information when registering. In addition, for some learners the country provided is where they currently live, which differs from their home country. In such situations, it is not possible to determine whether English is their first, official, or secondary language. Our study focused on using regex patterns to update learners' language group labels, with the aim of using them in future studies such as predicting learners' language groups. The data source is the datasets of the «Understanding Language: Learning and Teaching-4» course on the FutureLearn platform. To update the language groups with natural language processing, we mostly used features such as learners' comments, IDs, and country information. As a result, by analyzing users' comments, we identified the language groups of 63.06% of all users who commented. The three groups are: English is the official and primary language; English is an official but not the primary language; and English is not an official language. We observed that 78.19% of these learners belong to the same language group as the country information they provided during registration, while for the remaining 21.81% the language group identified from their comments differs from that of their provided country. When only the country information provided at registration is used, fewer learners can be assigned to an English language group, and the assigned groups may be wrong.
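
The idea of relabelling a learner from comment text can be illustrated with a minimal Python sketch. It is not the authors' implementation: this record does not give their regex patterns, so the pattern, the group label names, the tiny country table, and the function update_group are all hypothetical. The sketch looks for a self-reported origin in a learner's comments and falls back to the registration country only when no comment matches.

```python
import re

# Illustrative labels for the three groups named in the abstract.
OFFICIAL_PRIMARY = "official_and_primary"
OFFICIAL_NOT_PRIMARY = "official_not_primary"
NOT_OFFICIAL = "not_official"

# Hypothetical, deliberately tiny lookup table (not taken from the paper).
COUNTRY_GROUP = {
    "united kingdom": OFFICIAL_PRIMARY,
    "india": OFFICIAL_NOT_PRIMARY,
    "turkey": NOT_OFFICIAL,
}

# Alternation of the known country names, longest first.
COUNTRY_ALT = "|".join(
    re.escape(name) for name in sorted(COUNTRY_GROUP, key=len, reverse=True)
)

# Hypothetical pattern for self-reported origin, e.g. "I am from Turkey".
ORIGIN = re.compile(
    rf"\bI(?:'m| am| come| was born) (?:from|in) (?:the )?({COUNTRY_ALT})\b",
    re.IGNORECASE,
)


def update_group(registered_country, comments):
    """Return (group, source); evidence from comments takes priority."""
    for text in comments:
        match = ORIGIN.search(text)
        if match:
            return COUNTRY_GROUP[match.group(1).lower()], "comment"
    if registered_country:
        group = COUNTRY_GROUP.get(registered_country.lower())
        if group:
            return group, "registration"
    return None, "unknown"


# A learner registered in the UK who states a different origin in a comment.
print(update_group("United Kingdom", ["Hello, I am from Turkey and I teach English."]))
# A learner whose comments carry no origin statement.
print(update_group("India", ["Great course!"]))
```

Giving comment evidence priority over the registration field mirrors the abstract's point that the registered country may be where a learner currently lives rather than their home country. A real pattern set would need to be far broader than this single expression.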

Full text not available from this repository.

More information

Published date: 29 November 2018
Venue - Dates: 2018 Innovations in Intelligent Systems and Applications Conference, ASYU 2018, Adana, Turkey, 2018-10-04 - 2018-10-06
Alternative titles: Analysis of English language groups with regular expressions
Keywords: FutureLearn, identification of English language groups, MOOC, natural language processing, Regex

Identifiers

Local EPrints ID: 427610
URI: http://eprints.soton.ac.uk/id/eprint/427610
PURE UUID: 70c3633a-c183-481a-897b-851a6d11343f
ORCID for Su White: orcid.org/0000-0001-9588-5275

Catalogue record

Date deposited: 24 Jan 2019 17:30
Last modified: 20 Jul 2019 01:09

Contributors

Author: Ismail Duru
Author: Banu Diri
Author: M. Emir Özçevik
Author: Kerim Ataseven
Author: Gülüstan Doǧan
Author: Su White
