SemanticNews: Enriching publishing of news stories

Hare, Jonathon, Newman, David, Peters, Wim, Greenwood, Mark and Eggink, Jana (2014) SemanticNews: Enriching publishing of news stories. University of Southampton.




A central goal of the EPSRC-funded Semantic Media Network project is to support collaboration between researchers, fostering relationships and encouraging joint work (EPSRC priority 'Working Together'). SemanticNews was one of the four projects funded in the first round of Semantic Media Network mini-projects, and was a collaboration between the Universities of Southampton and Sheffield, together with the BBC.
The SemanticNews project aimed to promote people's comprehension and assimilation of news by augmenting broadcast news discussion and debate with information from the semantic web in the form of linked open data (LOD). The project has laid the foundations for a toolkit for (semi-)automatic provision of semantic analysis and contextualisation of the discussion of current events, encompassing state-of-the-art semantic web technologies including text mining, consolidation against Linked Open Data, and advanced visualisation.
SemanticNews was bootstrapped using episodes of the BBC Question Time programme that already had transcripts and manually curated metadata, including a list of the topical questions being debated. This information was used to create a workflow that a) extracts relevant entities using established named entity recognition techniques to identify the types of information to contextualise for a news article; b) provides associations with concepts from LOD resources; and c) visualises the context using information derived from the LOD cloud.
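The three-step workflow can be illustrated with a minimal sketch. It is not the project's implementation: a small hand-written gazetteer stands in for the named entity recognition step, the mapping to DBpedia URIs is a hypothetical lookup table rather than a query against a live LOD resource, and a simple per-concept mention count stands in for the visualisation. All function names, the `GAZETTEER` table and its entries are assumptions made for illustration.

```python
import re
from collections import Counter
from dataclasses import dataclass

# Hypothetical gazetteer: surface form -> (entity type, LOD URI).
# Entries are illustrative only and are not taken from the report.
GAZETTEER = {
    "BBC": ("Organisation", "http://dbpedia.org/resource/BBC"),
    "Scotland": ("Location", "http://dbpedia.org/resource/Scotland"),
    "NHS": ("Organisation",
            "http://dbpedia.org/resource/National_Health_Service"),
}


@dataclass(frozen=True)
class LinkedEntity:
    surface: str      # text as it appears in the transcript
    entity_type: str  # coarse entity type
    uri: str          # associated LOD concept
    start: int        # character offset of the mention


def extract_entities(text):
    """Step a): find mentions (gazetteer lookup as a stand-in for NER)."""
    mentions = []
    for term in GAZETTEER:
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text):
            mentions.append((term, match.start()))
    # Return mentions in the order they occur in the text.
    return sorted(mentions, key=lambda mention: mention[1])


def link_entities(mentions):
    """Step b): associate each mention with a concept from the table."""
    linked = []
    for surface, start in mentions:
        entity_type, uri = GAZETTEER[surface]
        linked.append(LinkedEntity(surface, entity_type, uri, start))
    return linked


def summarise(entities):
    """Step c): count mentions per concept (stand-in for visualisation)."""
    return Counter(entity.uri for entity in entities)


if __name__ == "__main__":
    question = ("Should the NHS in Scotland be funded like the BBC? "
                "The NHS is under strain.")
    for uri, count in summarise(link_entities(extract_entities(question))).items():
        print(count, uri)
```

In a fuller pipeline each step would be swapped for a real component: a trained NER system for step a), a query against an LOD dataset for step b), and a graphical view for step c).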
This document forms the final report of the SemanticNews project, and describes in detail the processes and techniques explored for the enrichment of Question Time episodes. The final section of the report discusses how this work could be expanded in the future, and also makes a few recommendations for additional data that could be captured during the production process to make the automatic generation of the contextualisation easier.

Item Type: Monograph (Project Report)
Organisations: Web & Internet Science
ePrint ID: 366832
Date: 9 January 2014 (Published)
Date Deposited: 11 Jul 2014 13:53
Last Modified: 17 Apr 2017 13:30
