Livedoc: showing contextual information using topic modeling techniques
We present LiveDoc, a solution that augments natural language text documents with relevant contextual background information. Fetched from other sources such as Wikipedia, this background information helps readers better understand the context of the discourse. Readers often lack the background and supplementary information needed to comprehend the purport of a narrative such as a news op-ed article. At the same time, authors cannot provide all contextual information while addressing a particular topic. LiveDoc processes the information in a document; uses extracted entities to fetch background information relevant to the document's context from various sources (as defined by the user), using semantic matching and topic modeling techniques such as Latent Dirichlet Allocation and Hierarchical Dirichlet Process; and presents the background information to the user by augmenting the original document with the fetched information. The reader is then better equipped to understand the document. We demonstrate the effectiveness of our solution through extensive experimentation and associated results.
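The topic-based matching step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a small collapsed Gibbs sampler for Latent Dirichlet Allocation and a Hellinger distance between topic distributions as the matching score, and every function name (`tokenize`, `lda_gibbs`, `hellinger`, `rank_background`) is hypothetical. Entity extraction, fetching from sources such as Wikipedia, and the Hierarchical Dirichlet Process variant are omitted; the candidate passages are simply passed in as strings.

```python
import math
import random
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop very short tokens."""
    words = (w.strip(".,;:!?()\"'").lower() for w in text.split())
    return [w for w in words if len(w) > 2]


def lda_gibbs(docs, num_topics=2, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns one topic distribution per doc."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]          # topic counts per document
    topic_word = [Counter() for _ in range(num_topics)]   # word counts per topic
    topic_total = [0] * num_topics                        # total words per topic
    assign = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)
    # Resample each token's topic conditioned on all other assignments.
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    # Smoothed per-document topic proportions.
    thetas = []
    for d, doc in enumerate(docs):
        denom = len(doc) + num_topics * alpha
        thetas.append([(doc_topic[d][t] + alpha) / denom for t in range(num_topics)])
    return thetas


def hellinger(p, q):
    """Hellinger distance between two discrete distributions (0 = identical)."""
    return math.sqrt(sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)) / 2)


def rank_background(document, candidates, num_topics=2):
    """Order candidate passages by topic-distribution closeness to the document."""
    texts = [document] + list(candidates)
    thetas = lda_gibbs([tokenize(t) for t in texts], num_topics=num_topics)
    scored = [(hellinger(thetas[0], th), c) for th, c in zip(thetas[1:], candidates)]
    return [c for _, c in sorted(scored, key=lambda pair: pair[0])]


if __name__ == "__main__":
    article = "The central bank raised interest rates to curb inflation."
    passages = [
        "Inflation is a general rise in prices; central banks adjust interest rates.",
        "The football club won the league after a late goal in the final match.",
    ]
    for passage in rank_background(article, passages):
        print(passage)
```

On a corpus this small the inferred topics are noisy, so the ordering is only illustrative. A practical system would fit the model on a much larger collection and would more likely rely on an established topic modeling library than on a hand-written sampler.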
Keywords: Data contextualization, Hierarchical Dirichlet Process, Information retrieval, Latent Dirichlet Allocation, Natural language processing, Topic modeling
Pages: 468-482
Deshmukh, Jayati
5903b0c1-b4d1-4fbf-b687-610d4fde3990
Annervaz, K. M.
60ecdbb0-0673-49ca-92d4-29e48a46a0bb
Sengupta, Shubhashis
b7c8401f-33ff-4edc-89cf-228aa902a6cc
Pathak, Neetu
538ec4b0-7082-422b-bd22-3f9bcaeeb9a2
2016
Deshmukh, Jayati, Annervaz, K. M., Sengupta, Shubhashis and Pathak, Neetu (2016) Livedoc: showing contextual information using topic modeling techniques. Perner, Petra (ed.) In Machine Learning and Data Mining in Pattern Recognition - 12th International Conference, MLDM 2016, Proceedings. vol. 9729, Springer. (doi:10.1007/978-3-319-41920-6_37).
Record type: Conference or Workshop Item (Paper)
More information
Published date: 2016
Additional Information: Publisher Copyright: © Springer International Publishing Switzerland 2016.
Venue - Dates: 12th International Conference on Machine Learning and Data Mining in Pattern Recognition, MLDM 2016, New York, United States, 2016-07-16 - 2016-07-21
Identifiers
Local EPrints ID: 493373
URI: http://eprints.soton.ac.uk/id/eprint/493373
ISSN: 0302-9743
PURE UUID: 4514879d-9245-4516-94c7-002195c88c33
Catalogue record
Date deposited: 30 Aug 2024 17:09
Last modified: 31 Aug 2024 02:12
Contributors
Author: Jayati Deshmukh
Author: K. M. Annervaz
Author: Shubhashis Sengupta
Author: Neetu Pathak
Editor: Petra Perner