Is language modeling enough? Evaluating effective embedding combinations
Schneider, Rudolf, Oberhauser, Tom, Grundmann, Paul, Gers, Felix Alexander, Löser, Alexander and Staab, Steffen
(2020)
Is language modeling enough? Evaluating effective embedding combinations.
Proceedings of the 12th International Conference on Language Resources and Evaluation, Marseille, France.
11 - 16 May 2020.
(In Press)
Record type: Conference or Workshop Item (Paper)
Abstract
Universal embeddings, such as BERT or ELMo, are useful for a broad set of natural language processing tasks like text classification or sentiment analysis. Moreover, specialized embeddings exist for tasks like topic modeling or named entity disambiguation. We study whether we can complement these universal embeddings with specialized embeddings. We conduct an in-depth evaluation of nine well-known natural language understanding tasks with SentEval. In addition, we extend SentEval with two tasks from the medical domain. We present PubMedSection, a novel topic classification dataset focused on the biomedical domain. Our comprehensive analysis covers 11 tasks and combinations of six embeddings. We report that combined embeddings outperform state-of-the-art universal embeddings without any embedding fine-tuning. We observe that adding topic-model-based embeddings helps for most tasks and that differing pre-training tasks encode complementary features. Moreover, we present new state-of-the-art results on the MPQA and SUBJ tasks in SentEval.
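To illustrate the kind of embedding combination described in the abstract, the following is a minimal sketch of evaluating concatenated sentence embeddings with the SentEval toolkit. The encoders embed_universal and embed_topic are hypothetical placeholders standing in for a universal encoder (e.g. BERT or ELMo) and a topic-model-based encoder; combining them by simple concatenation is an assumption for illustration, not the authors' implementation.

# Minimal sketch (assumptions noted above): combine two sentence embeddings by
# concatenation and evaluate them with SentEval.
import numpy as np
import senteval

def embed_universal(sentence):
    # Placeholder: would return a fixed-size vector from a universal encoder (e.g. BERT/ELMo).
    return np.zeros(768)

def embed_topic(sentence):
    # Placeholder: would return a fixed-size vector from a topic-model-based encoder.
    return np.zeros(100)

def prepare(params, samples):
    # SentEval hook for task-specific preparation; nothing needed in this sketch.
    return

def batcher(params, batch):
    # SentEval passes a batch of sentences (usually tokenized as lists of words).
    sentences = [' '.join(s) if isinstance(s, list) else s for s in batch]
    combined = [np.concatenate([embed_universal(s), embed_topic(s)]) for s in sentences]
    return np.vstack(combined)

params_senteval = {'task_path': 'data/senteval', 'usepytorch': False, 'kfold': 10}
se = senteval.engine.SE(params_senteval, batcher, prepare)
results = se.eval(['MPQA', 'SUBJ'])  # two of the SentEval tasks named in the abstract
print(results)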
Text: LREC20_LM_TM(27)(1) - Author's Original
More information
Accepted/In Press date: 11 February 2020
Venue - Dates: Proceedings of the 12th International Conference on Language Resources and Evaluation, Marseille, France, 2020-05-11 - 2020-05-16
Identifiers
Local EPrints ID: 438613
URI: http://eprints.soton.ac.uk/id/eprint/438613
PURE UUID: df30f6d4-8937-433b-a872-ad41fa4c68c3
Catalogue record
Date deposited: 18 Mar 2020 17:33
Last modified: 17 Mar 2024 03:38
Contributors
Author: Rudolf Schneider
Author: Tom Oberhauser
Author: Paul Grundmann
Author: Felix Alexander Gers
Author: Alexander Löser
Author: Steffen Staab