[BOAI] Re: How to compare research impact of toll- vs. open-access research

From: Stevan Harnad <harnad AT>
Date: Sat, 6 Sep 2003 13:34:14 +0100 (BST)

The following data posted by Peter Suber in 
indicate that open-access articles (from BioMed Central) average at least
89 times as many downloads as toll-access articles (from Elsevier). (The
89 is probably an undercount, because it does not include PubMed Central
downloads.)

    "Elsevier has put some PowerPoint slides on the web summarizing
    its interim results for 2003. Slide #16 shows that there were 4.5
    million full-text articles in ScienceDirect on June 30, 2003, and
    slide #15 shows that there were 124 million article downloads in
    the 12 months preceding that date. This means that its articles
    were downloaded an average of 28 times each during the past year.

    "For comparison I asked Jan Velterop of BioMed Central what the
    download figure was for BMC articles during the same time period. He
    reports that the average is about 2500 per year, which doesn't
    count downloads of the same articles from PubMed Central. This is
    89 times the Elsevier number."
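
As a quick check of the arithmetic in the quoted passage, here is a
minimal sketch in Python. It uses only the figures quoted above; the
variable names are illustrative and not part of the original posting.
It suggests that the "89" comes from dividing by the rounded figure of
28, and that the unrounded ratio is slightly higher.

```python
# Figures as quoted above (Elsevier slides via Peter Suber; BMC via Jan Velterop).
elsevier_downloads = 124_000_000   # article downloads in the 12 months to June 30, 2003
elsevier_articles = 4_500_000      # full-text articles in ScienceDirect on June 30, 2003
bmc_downloads_per_article = 2500   # approximate BMC average per article per year

# Average downloads per Elsevier article over the year.
elsevier_per_article = elsevier_downloads / elsevier_articles

# Ratio as computed in the quoted passage (dividing by the rounded 28).
ratio_rounded = bmc_downloads_per_article / round(elsevier_per_article)

# Ratio without the intermediate rounding step.
ratio_unrounded = bmc_downloads_per_article / elsevier_per_article

print(f"Elsevier downloads per article: {elsevier_per_article:.1f}")
print(f"BMC/Elsevier ratio (using rounded 28): {ratio_rounded:.0f}")
print(f"BMC/Elsevier ratio (unrounded): {ratio_unrounded:.0f}")
```

Either way the ratio is roughly 90, so the rounding does not change
the conclusion.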

Combine these download data with the citebase data on the correlation
between downloads and citations
and you will be able to estimate the dramatic way in which open access
enhances research citation impact, confirming what Steve Lawrence reported
in 2001 for computer science research:
and what Kurtz et al. reported for astrophysical research:

(In an ongoing collaboration with Charles Oppenheim we are currently
making controlled pairwise comparisons of citation impact between
open-access and toll-access articles that appear in the same journal and
year, comparing self-archived and non-self-archived articles, across time,
and across disciplines. We hope to extend these comparisons with the
help of ISI's citation database.)

Those individuals, institutions, research-funders, tax-payers and nations
who are interested in increasing the visibility, usage and impact of
their research output should take special note of these data! Apply the
estimates in reverse if you wish to estimate the amount of research impact
(and its rewards) that is currently being *lost* daily, monthly,
and yearly by researchers, their institutions, and by research itself as
long as we delay providing immediate open access to all research output --
as we could already do today, by self-archiving it.

Stevan Harnad

[Posted with permission from Michael Kurtz, Astrophysics, Harvard]

On Fri, 14 Nov 2003, Michael Kurtz wrote:

> You may be interested in:
> which is a report by the librarian liaison of the AAS Pub board meeting. 
> The relevant paragraph (at the end) is:
> Finally, there was a very interesting brief report from Greg Schwarz 
> (from the ApJ editorial office) on some work he's doing tracking 
> citation rates of papers published in the ApJ based on whether they were 
> posted on astro-ph or not. He studied samples from July-Dec. 1999 and 
> July-Dec. 2002. The first interesting point is that 72% of the papers 
> published in the latter period had appeared on astro-ph, although the 
> submission rate to the server seems to be leveling off. He also noted 
> that the number of authors per paper has been increasing along with the 
> total length and that most astro-ph submissions are after acceptance by 
> the journal. The really fascinating conclusion he's drawn, at least from 
> my perspective, is that ApJ papers that were also on astro-ph have a 
> citation rate that is _twice_ that of papers not on the preprint server. 
> Moreover, this higher citation rate appears to continue once the time 
> gap disappears (that is, papers on astro-ph are viewed about nine months 
> ahead of the journal paper, but after several years of availability, the 
> astro-ph papers are still being cited at a significantly higher rate).
> You have shown some similar work already, but this seems nicely done. 
> With the majority of ApJ papers going to astro-ph those which are not 
> preprinted (and which are less referenced) seem the oddballs.
> I have been assuming that the higher citation rates for papers which are 
> preprinted were due to the preprinting; perhaps the effect is that lower 
> quality/interest papers are not preprinted.  

Can I ask for a clarification (because the word "preprinted," unlike
"self-archived," is somewhat ambiguous): Are you specifically referring
here to the prepublication part of an article's timeline, your point being
that in astrophysics, where the publishers' versions are all effectively
"open access" by the time they appear (in that they are all available to
the entire worldwide astrophysical research community via site-licenses
to the relatively small and closed group of journals involved), there are
*still* twice as many citations of those papers that were self-archived
before publication (as either pre-refereeing preprints or post-refereeing
postprints or both) than to those that only became openly accessible
when they became available from the publisher?

That would be very useful news both for the value of open access to
eprints (preprints and postprints) in general and the value of prepublication
self-archiving in particular, suggesting that (if we take Steve Lawrence's
figure of 4.5 for the overall citation advantage of free online access to
eprints over its alternatives, online or on-paper) a two-fold advantage
already comes from free access in the prepublication phase alone.

The causality, of course, is uncertain here, as you note: Is it that
earlier open-access enhances the citation counts, or that the better
articles are the ones that are being self-archived earlier?

In any case, it is certainly a vote both for open access and for early
self-archiving.

Cheers, Stevan

Date: Fri, 14 Nov 2003 15:19:23 -0500
From: Michael Kurtz 

Hi Stevan,

First I should note that I personally have nothing to do with this 
study; I have only read Sarah's report.  The author, Greg Schwarz 
(gschwarz # AT #), would certainly be a better source as to what 
he is saying.

Certainly you may post my message about it.  

To your question:  Greg is referring to papers which have been deposited 
in the ArXiv, normally astro-ph, thus they are self-archived in advance 
of publication (preprinted).  There are other avenues for astronomy 
articles to be preprinted; he seems from the description not to be 
taking them into account.  In your terminology he notes that most of the 
articles were submitted to the ArXiv after they were accepted (thus are 
post-refereeing postprints); there is no requirement for this by any 
astronomy journal, but it has long been the common practice, since 
before preprints became electronic.

So the answer to your first paragraph question is YES!

Greg may know if there is a difference in citation rate for papers which 
were deposited in the ArXiv before they were accepted (pre-refereeing 
preprints) vs after they were accepted (postprints); this would help to 
clear up the causality issue, as the preprints were self-archived earlier.

In any event this is a huge vote for the importance of self-archiving.

Best wishes


