Boosting performance in data science competition using topic-driven analytics: evidence from recommendation system design on Kaggle
1-12
Li, Libo
838dda30-da62-41ad-b57c-bee6ad59acd3
2 September 2022
Li, Libo (2022) Boosting performance in data science competition using topic-driven analytics: evidence from recommendation system design on Kaggle. IEEE Transactions on Engineering Management. (doi:10.1109/TEM.2022.3199688).
Abstract
Research developments in the recommendation system and electronic commerce literature present more accurate and comprehensive recommendation system solutions. However, while these developments add new features to recommendation systems, the question of whether a novel solution would excel in practice remains open. Open innovation and crowdsourcing platforms are becoming an arena for designers to test their solutions in business competitions. Using forum message data from the competition period, we show how structural topic modeling identifies topical themes that improve contestant performance. Our topic modeling analysis identifies technological and business issues that emerge in recommendation system development. An econometric framework further investigates the link between topic distribution and performance. The multiperiod difference-in-differences estimator reports no statistically significant relation when all message communications are linked to performance. However, topic-dominant and topic-dispersed messages are both found to have a positive and significant impact on performance. Our results show that structural topic modeling plays an essential role in critically examining which message links are most valuable for boosting performance. Stakeholders may prioritize messages with specific topics and/or a mixture of topics. We provide research and practical implications for researchers, business analysts, developers, and managers to improve their experience when engaging in recommendation system design on platforms.
More information
Accepted/In Press date: 8 August 2022
e-pub ahead of print date: 2 September 2022
Published date: 2 September 2022
Additional Information:
Publisher Copyright: IEEE
Keywords:
Bibliographies, Business, Companies, Data science, Decision support, Dispersion, Recommender systems, Technological innovation, difference-in-differences (DID), knowledge sharing, recommendation systems, structural topic modeling (STM)
Identifiers
Local EPrints ID: 469888
URI: http://eprints.soton.ac.uk/id/eprint/469888
ISSN: 0018-9391
PURE UUID: b8e29305-5c02-4931-af5d-b2abdabad270
Catalogue record
Date deposited: 28 Sep 2022 16:32
Last modified: 17 Mar 2024 04:00