Extending journal-based research impact assessment to book-based
disciplines
(Research Proposal)
L. Carr (ECS, Southampton)
S. Hitchcock (ECS, Southampton)
C. Oppenheim (Information Science, Loughborough)
J.W. McDonald (Social Statistics, Southampton)
T. Champion (Archaeology, Southampton)
S. Harnad (ECS, Southampton)
Summary:
The ‘impact’ of academic research is typically measured by how much it
is read, used and cited, and by how much new work it influences.
Services that measure impact work well for journal-based disciplines.
Book-based disciplines can now benefit from online tools and methods of
impact analysis too. These analyses also predict fruitful directions
for future research, and so can inform research assessment and funding.
This project will extend tools for online bibliometric data collection
of publications and their citations with the aim of testing and
evaluating new Web metrics to assist research assessment in book-based
disciplines.
Objectives
A method has been proposed that could give online research assessment
far richer, more sensitive and more predictive measures of research
productivity and impact, for far less cost and effort (Harnad et al.
2003). This method builds on and extends well-established,
citation-based analysis of research impact applied typically to journal
articles; we will now test whether this can be generalized to
disciplines that depend on other types of publications such as books.
Many Arts and Humanities subjects are primarily book-based rather than
article-based. The aim of this proposal is to:
· engage book authors in these subject areas in self-archiving their
book's metadata (author, title, date, publishers, keywords, abstract)
plus its reference list (bibliography).
· apply the scientometric resources that are currently available only
for journal articles to book-based disciplines.
· compute book-to-book citation-counts for books as well as for
book-authors from the resulting database of self-archived book
metadata+references.
All of this will help extend and integrate the scientometric analysis
of citations and usage across disciplines, and between the domains of
books and journals. It should improve predictive capability for
assessment as well as extending our understanding of the embryology and
evolutionary trajectory of knowledge.
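As a rough illustration of the third aim, the sketch below computes book-to-book citation counts for books and for book-authors. It is not the project's software: the record layout (a dictionary with hypothetical "authors" and "cites" fields), the identifiers and the sample data are all assumptions made for the example, and it presumes the reference lists have already been linked to book identifiers.

```python
from collections import Counter

# Hypothetical records (an assumption for illustration): each self-archived
# book has an id, its authors, and the ids of the books its reference list
# cites, after reference linking.
books = {
    "b1": {"authors": ["Author A"], "cites": ["b2", "b3"]},
    "b2": {"authors": ["Author B"], "cites": ["b3"]},
    "b3": {"authors": ["Author A", "Author C"], "cites": []},
}


def citation_counts(records):
    """Return (per-book, per-author) book-to-book citation counts."""
    book_counts = Counter()
    for book_id, record in records.items():
        # Count each cited book once per citing book; skip self-references.
        for cited in set(record["cites"]):
            if cited != book_id:
                book_counts[cited] += 1

    author_counts = Counter()
    for book_id, count in book_counts.items():
        # A cited book may not be in the archive; it then credits no author.
        for author in records.get(book_id, {}).get("authors", []):
            author_counts[author] += count
    return book_counts, author_counts


book_counts, author_counts = citation_counts(books)
for book_id in sorted(books):
    print(book_id, book_counts[book_id])
for author in sorted(author_counts):
    print(author, author_counts[author])
```

Crediting every co-author with the full count is only one possible convention; fractional credit per author would be an equally simple variant.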
The objectives of this work are to:
· Produce a Web-based citation database with records for as many cited
books across arts and humanities disciplines as can be gathered (using
archaeology as the initial focus case)
· Develop and test a series of Web metrics for measuring research
impact based on these publications and their citations
· Evaluate user acceptance of these metrics and offer selected metrics
for ongoing use
Calls for researchers to self-archive their published research papers
in openly accessible institutional archives, thereby increasing the
visibility and impact of their work, would have little effect on arts
and humanities subjects, it is widely claimed, because
(1) their research impact is book-based and not journal-based
(2) authors are more concerned about finding a prestigious publisher
and obtaining a few good book reviews
(3) their journal articles are considered “waystations” on the road to
the book
If this is indeed an accurate description of the status quo in much of
Arts and Humanities research, in practice and in belief, it is anything
but optimal. The online medium is poised to change things radically.
The project will gather bibliographic data from arts and humanities
authors' books and articles, and will citation-link and
citation-analyse them to create a book citation-impact factor. This has
not been done systematically before, and it will not only be
informative about parallels between the impact of books and the impact
of articles, but it will have policy implications. The new measures
should help do two things:
(1) provide new ways of assessing the impact of book-based research
(2) enhance impact by making the research more visible online.
It cannot do the latter for books on the scale that is possible for
articles, because book-texts usually cannot be made openly accessible
to all would-be users online, but it can still increase book visibility
and might encourage some authors to make their books openly accessible
too.
The following are examples of the kinds of comparisons that could help
answer the research questions posed here:
(1) Compare the book-based impact magnitudes and rankings with their
journal-based counterparts for every author where both measures are
available.
(2) Compare the book-based impact magnitudes and rankings with the
traditional criterial rankings (publisher prestige, number and
favourability of review articles) and with rankings by peer or expert
judgment.
(3) Compare with the "gray" book impact in ISI's Web of Science, which
ISI does not calculate or use, but which could be calculated from the UK
ISI database. (ISI has already indicated that it will approve our
software agents as part of the UK national site license.)
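One way to carry out a ranking comparison of this kind is a rank correlation. The sketch below computes Spearman's rank correlation between two sets of per-author scores using only the standard library. The choice of Spearman's coefficient, the function names and the sample figures are assumptions for illustration; the proposal does not prescribe a particular statistic.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up citation counts for five authors, for illustration only.
book_based = [12, 40, 3, 25, 7]
journal_based = [30, 95, 10, 20, 15]
rho = spearman(book_based, journal_based)
print(round(rho, 3))
```

The same function could be applied to comparison (2) by replacing one list with a numeric coding of publisher prestige or of peer rankings.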
Research context
Scientometric methods have long proved remarkably powerful for
measuring impact of published works, and while research assessment
appears to use impact data only indirectly, it is inevitable there will
be some correlation. Recent studies have quantified the extent of the
correlation, thus demonstrating the predictive power of scientometric
methods. For example, statistical correlational analyses on the
numerical outcome of the RAE using average citation frequencies have
been shown to predict departmental outcome ratings remarkably closely.
Smith and Eysenck (2002) found a correlation of as high as .91 in
Psychology. Oppenheim (1998) and Holmes and Oppenheim (2001) found
correlations of .80 and higher in other disciplines.
These scientometric methods have traditionally been applied only to
select parts of the research literature, notably journal articles.
Availability of online data makes it possible to enhance the quality
and scope of scientometric tools, in particular to extend traditional
citation-based methods to books, and supplement this with, for example,
usage measures, to produce new metrics that can be used for continuous
online assessment of research productivity and to assist the prediction
of fruitful areas of new research for funding.
Being the only country with a national research assessment exercise,
the UK is in a unique position to exploit these methods, which would
increase the uptake and impact of UK research output, and set an
example for the rest of the world that will almost certainly be
emulated.
Methods
The project will be developed by a team that has built tools to manage
online publication data for scientometric analysis, backed by
researchers with experience and expertise in scientometric analysis and
Webmetric techniques and leaders in some of the key fields to be
analysed.
The primary method of the project will be to increase the number of
analytic techniques we are already developing to measure the uptake and
visibility of cited books. This will begin with collecting and
supplementing book data using tools developed in the successful
JISC-NSF funded Open Citation Project (opcit.eprints.org) for citation
analysis for Open Archives (Hitchcock et al. 2002). The project
proposed here would modify and extend these tools:
· Eprints.org software (software.eprints.org), for managing
institutional archives
· Citebase (citebase.eprints.org), a discovery service with usage and
citation-based ranking
· Paracite (paracite.eprints.org), an online reference finder
The user interface to Eprints software will be optimised for author
input of metadata and reference data for arts and humanities books.
Reference data will be automatically processed by Citebase and added to
its growing citation database of books and papers. Online records of
referenced items, where available, will be located by Paracite to
enable reference linking. Based on the collected data, we will be able
to build a number of analytic techniques to measure the uptake and
visibility of cited works. The following are examples of the analyses
that would become feasible as a result of this project:
· Books and authors can be credited with their citation counts from
both their book-to-book citations and their article-to-book citations
(and these can be added to the citation counts of article-to-article
citations).
· Early-stage predictors can be tested, such as usage-counts ("hits")
for the self-archived book metadata (partly analogous to the article
preprint hits that have proved predictive of later citation counts in
the case of articles).
· The correlation between book citation-counts and the publisher
imprimatur (the book publisher's "impact factor") will be computable.
· The correlation between book citation-counts and (1) the number of
book-reviews, (2) the impact factor of the book-review journal or
magazine, and (3) the citation-count for the reviews will all be
computable.
· The time-course, predictivity, and pattern of inter-correlations will
no doubt reveal a good deal more about the true impact of books, and
may even change publication and evaluation practices.
· "Hubs and authorities" can be derived from this data, again for
predictiveness and evaluation through recursive adjustment of citation
weights.
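The "hubs and authorities" analysis can be sketched as follows, assuming it refers to a HITS-style iteration in which each work's authority score is recomputed from the hub scores of the works citing it, and each hub score from the authority scores of the works it cites. The citation graph, the work names and the fixed iteration count are hypothetical and serve only to show the recursive adjustment of weights.

```python
from math import sqrt


def normalise(scores):
    """Scale scores to unit Euclidean length (left unchanged if all zero)."""
    norm = sqrt(sum(v * v for v in scores.values())) or 1.0
    return {k: v / norm for k, v in scores.items()}


def hubs_and_authorities(citations, iterations=50):
    """citations maps each citing work to the list of works it cites."""
    nodes = set(citations)
    for cited_list in citations.values():
        nodes.update(cited_list)
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of the works citing it.
        new_auth = {n: 0.0 for n in nodes}
        for citing, cited_list in citations.items():
            for cited in cited_list:
                new_auth[cited] += hub[citing]
        # Hub score: sum of the authority scores of the works it cites.
        new_hub = {n: 0.0 for n in nodes}
        for citing, cited_list in citations.items():
            for cited in cited_list:
                new_hub[citing] += new_auth[cited]
        auth = normalise(new_auth)
        hub = normalise(new_hub)
    return hub, auth


# Hypothetical citation graph, for illustration only.
graph = {
    "survey": ["classic", "monograph"],
    "thesis": ["classic"],
    "classic": [],
}
hub, auth = hubs_and_authorities(graph)
print(max(auth, key=auth.get), max(hub, key=hub.get))
```

A fixed number of iterations keeps the sketch short; a production version would more likely stop once the scores change by less than a chosen tolerance.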
Intermediate and final-stage evaluations will be performed on all user
interfaces and computed outputs to monitor user reactions and ensure
usability and fitness-for-purpose. Evaluations will involve observing
small user groups, supplemented with Web forms-based feedback, as
applied in the Open Citation Project (Hitchcock et al. 2003).
To direct and guide the project, monthly meetings (electronically
moderated as needed) will be held with project co-applicants, and
bi-annually with a project advisory board that is to include UK experts
on bibliometrics and experts in the key subject areas to be covered.
The first requirement is to establish methods of data collection. A
workable service that can continue beyond the project will need to
cultivate authors to input data on their book publications, and this
will be a primary aim of project advocacy to authors, to be directed
through national and international research agencies. The initial aim
is for the database to include records for at least as many books as
submitted in the 2001 RAE. A critical mass of data will need to be
built quickly in at least one focussed subject area to demonstrate the
types of services that will encourage others. We would initially focus
on archaeology, as outlined in the work plan below, where we will
attempt to solicit bibliographies from the entire discipline, UK and
international, supplemented by scanning in the bibliographies of books
published in the past 10 years. The rest of the sample will be based on
solicitations in other disciplines. In this test case, the UK focus
serves the RAE implications; the archaeology focus provides a complete
test case.
· Findings on webmetric tools to analyse citation data will be
disseminated through strategic presentations to universities and
research-funders, and with Web demonstrators to support continuous use.
These will also be linked to corresponding journal-based databases in
the same subject matter.
· A Web citation database for book-based publications consisting of
their metadata and their cited references will be created, and will be
accessible online to all (cf. Citebase, citebase.eprints.org). The
database will be used by both researchers and research assessors and
funders to search, navigate and rank publications on the basis of
existing and new measures of impact and usage.
· The results will be publicised and promoted in journals such as
Journal of Information Science, Journal of the American Society for
Information Science and Technology, Scientometrics, Research
Evaluation, Research Policy, D-Lib Magazine, First Monday, and at
conferences and symposia, as well as through talks at UK universities
and worldwide. A major conference in this area is the biennial
International Conference on Scientometrics and Informetrics, due in
2005.
· Presentations and reports will be prepared for organisations involved
in funding and research assessment decisions, including the Higher
Education Funding Councils, funding agencies such as ESRC, EPSRC and
AHRB, and major charities that fund research.
References
· Harnad, S. (2006) “Future UK Research Assessment Exercise (RAE) to be
Metrics-Based”.
http://openaccess.eprints.org/index.php?/archives/75-guid.html
· Harnad, S. et al. (2003) “Mandated online RAE CVs linked to
university eprint archives: Enhancing UK research impact and
assessment”. Ariadne, issue 35, April 30
http://www.ariadne.ac.uk/issue35/harnad/
· Hitchcock, S. et al. (2003) “Evaluating Citebase, an Open Access
Web-based Citation-Ranked Search and Impact Discovery Service”.
http://opcit.eprints.org/evaluation/Citebase-evaluation/evaluation-report.html
· Hitchcock, S. et al. (2002) “Open Citation Linking: the Way Forward”.
D-Lib Magazine, Vol. 8, No. 10, October 2002
http://www.dlib.org/dlib/october02/hitchcock/10hitchcock.html
· Holmes, Alison and Oppenheim, Charles (2001) “Use of citation
analysis to predict the outcome of the 2001 Research Assessment
Exercise for Unit of Assessment (UoA) 61: Library and Information
Management”. Information Research, Vol. 6, No. 2, January
http://www.shef.ac.uk/~is/publications/infres/paper103.html
· Oppenheim, Charles (1998) “The correlation between citation counts
and the 1992 research assessment exercise ratings for British research
in genetics, anatomy and archaeology”. Journal of Documentation,
53:477-87
http://dois.mimas.ac.uk/DoIS/data/Articles/julkokltny:1998:v:54:i:5:p:477-487.html
· Smith, Andrew and Eysenck, Michael (2002) “The correlation between
RAE ratings and citation counts in psychology”, June
http://psyserver.pc.rhbnc.ac.uk/citations.pdf
Work plan
· Technical requirements analysis; estimate extent of database
· Design manual and automated data collection methods
· Adapt and develop software for data collection, data finding,
citation analysis
· Data collection
  o Design Eprints interface for author deposit of book metadata
  o Focus on Archaeology to obtain a comprehensive test dataset
    § Approach authors and publishers
    § Scan and key-in data from library holdings, as necessary
  o Populate database to agreed targets and assure data quality
· Data processing
  o Adapt software to reference styles to automatically process input
  data for citation database
  o Design user interface to citation database
· Data finding
  o Tailor Paracite to seek online sources/records of referenced works
· Monitor and adapt methods of data collection
· Promote data deposit to all relevant book-based disciplines and
extend data collection in critical areas
· Investigate new Web metrics
· Design initial metrics: demonstrate Web-based results
· Develop tools to analyse and display Web metrics
· Evaluate outputs of metrics
· Optimise selected metrics for wider use
· Report results
Principal RA (Dr Steve Hitchcock) responsibilities
· Lead researcher
· Project management
· Liaise and work with technical RA
· Advocacy for author deposit of book metadata throughout arts and
humanities subjects, and uptake of Web metrics
· Maintain information for project team
· Build and coordinate advisory board
· Maintain project Web site
· Manage experimental services produced by project
· Dissemination: promotion, communication, writing papers and
presentations
· Evaluation and user testing
· Writing project reports