Budapest Open Access Initiative: BOAI Forum Archive

[BOAI] Cliff Lynch on Institutional Archives

From: Stevan Harnad <harnad AT ecs.soton.ac.uk>
Date: Sat, 15 Mar 2003 15:30:10 +0000 (GMT)


Quote/Comments on:

    Clifford A. Lynch: "Institutional Repositories: 
    Essential Infrastructure for Scholarship in the Digital Age"
    http://www.arl.org/newsltr/226/ir.html

Cliff Lynch makes many very good points. I disagree with him only on one
point, but it is a fundamental one, with important practical and
strategic implications for the immediate future: What is the most pressing
reason for creating and filling institutional repositories at this
time? Cliff thinks it is to promote new forms of scholarship whereas
I think it is to promote refereed research. The new scholarship
is coming too, and will certainly grow in importance, but the immediate
rationale for creating and filling institutional repositories is for the
self-archiving of institutional research output, in order to maximize
its research impact, by maximizing user access to it, through open access:
http://www.soros.org/openaccess/

> faculty have been exploring ways in which works of authorship in the new
> digital medium can enhance teaching and learning and the communication
> of scholarship

This is the familiar and valid complaint that the university has not
been sufficiently supportive of online innovations by faculty, either
in terms of resourcing them or in terms of rewarding them. This is true,
and it is indeed a problem, and is no doubt slowing innovation. But it is
also being remedied, by increasing recognition and support, and the
persistence of innovative faculty. It is *not* the reason universities
need digital repositories urgently at this time, and this is *not* the
(main) content that will fill them.

> faculty have exploited the Net as a vehicle for sharing their ideas
> worldwide, whether these ideas are expressed in relatively familiar
> forms such as digital versions of traditional journal articles or (less
> commonly) in entirely new forms...

This is a combination of the two kinds of content that are at issue
here. I am putting the primary emphasis on the "familiar forms" rather
than the new ones (important and valuable though they too are). The
progress, productivity and funding of scholarly and scientific research
depend directly on its visibility and accessibility: the degree to which
it is found, seen, read, used, cited, applied, built-upon by other
researchers. In a word, it all depends on *research impact.* And research
impact depends on research access. Whatever blocks access blocks impact.

There are 20,000 peer-reviewed research journals, across all disciplines
worldwide, publishing 2,000,000 articles annually. Almost all of these
articles are accessible to researchers (i.e., to their potential users)
only if their institution can afford the toll-access (subscription,
license) to the journal in which they were published. And most
universities cannot afford toll-access to most journals -- even the
richest can only afford a minority of the 20,000. This means that *all*
research on the planet is inaccessible to *most* of its potential
users. And every single case of access-denial is a case of potential
impact loss. The overwhelming, pressing rationale for institutional
repositories is accordingly: to put an end to this daily impact loss --
a legacy of the paper era when the true costs of paper access made it
unavoidable, but no longer necessary in the online era, when open access
can be provided by institutions for their own refereed research output.

It is quite natural for researchers to self-archive their own refereed
research output in their own institutional archives, giving it away to
all of its would-be users worldwide for free, in order to maximize its
research impact, for they have been giving it away free to their
publishers for the very same reason throughout the paper era: Unlike all
other authors, researchers have always given away their work, written
only for impact, not for royalty revenue from toll-income. Hence it is
only natural that now that it has become possible to do so, they should
self-archive it in their own institutional archives so as to put an end
to the needless daily impact loss that is a legacy of the paper era.

This -- and not new forms of scholarship -- is the immediate, pressing
rationale for creating and filling institutional repositories at this
time. And this (refereed research output) is the content with which they
need to be filled, as soon as possible. With it -- and their newfound
role as *outgoing* collections of a university's own research output
instead of *incoming* collections of the output of other universities --
the institutional archives will also become the repositories for new
forms of scholarship. But the first and most urgent step is to put an
end to the needless daily impact loss for peer-reviewed research.

What about the peer-reviewed journals? Their toll-access mechanism of
cost-recovery may continue to co-exist with the open-access versions in
the institutional repositories, with those researchers whose institutions
can afford it using the former and those who cannot using the latter
-- or the journals may eventually have to cut costs and downsize to
the essentials in the online era, which may well prove to be just
peer-review service-provision alone, with the access, storage and
distribution offloaded onto the institutional repositories. 

Peer-review only costs about $500 per outgoing paper, whereas
those institutions who can afford it are paying an average of $2000
(collectively) per incoming paper in access-tolls -- in exchange for
the very limited access this provides, restricted to the minority who
can afford it.
http://www.nature.com/nature/debates/e-access/Articles/harnad.html#B1

> faculty are well motivated to rise above the institutional failures to
> help them disseminate their works

Indeed they are, in the service of maximizing their research impact and
putting an end to its needless loss. But maximizing research impact is
in the interest of their institutions too, as the benefits of research
impact (research funding, prizes, prestige) are shared by faculty and
their institutions.

Let me count the three most obvious ways that the self-archiving of
institutional research output benefits researchers' institutions:

(1) Open access to an institution's research output maximizes its
impact and its rewards, as noted.

(2) Open access, being reciprocal if practised by other institutions too,
maximizes faculty access to the research output of *other* institutions,
generating better-informed and more current research (using the research
output of others, as you would have them use yours!).

(3) If/when there is ever an eventual downsizing of peer-reviewed
journals to the remaining online-age essentials (probably only peer
review itself), then there is also the prospect of eventual institutional
windfall savings of up to 75% on serials budgets.
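
The "up to 75%" figure in (3) appears to follow from the per-paper
estimates quoted earlier in this message (about $500 for peer review
versus an average of $2000 in access-tolls). A minimal Python sketch of
that arithmetic, assuming those two round figures and nothing else (the
function and constant names are invented for illustration):

```python
# Round per-paper figures quoted in the message (approximate, in USD).
PEER_REVIEW_COST_PER_PAPER = 500   # peer review per outgoing paper
ACCESS_TOLLS_PER_PAPER = 2000      # collective access-tolls per incoming paper


def potential_savings_fraction(essential_cost, current_cost):
    """Fraction of current spending freed if only the essential cost remained."""
    if current_cost <= 0:
        raise ValueError("current_cost must be positive")
    return 1 - essential_cost / current_cost


savings = potential_savings_fraction(PEER_REVIEW_COST_PER_PAPER,
                                     ACCESS_TOLLS_PER_PAPER)
print(f"Potential savings: {savings:.0%}")  # prints "Potential savings: 75%"
```

This is an upper bound under the stated assumption that peer review is
the only remaining cost; any other retained service would lower it.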

> a faculty member seeking... broader dissemination and availability of
> his or her traditional journal articles...faces several time-consuming
> problems...  [F]aculty time is being wasted, and expended ineffectively,
> on system administration activities and content curation.

Cliff here means the time-consuming problem of maintaining a website for
self-archiving one's own research output. An institutional archive
is certainly a more sensible solution than having each researcher
maintain his own archive.

> Institutional repositories can maintain data in addition to authored
> scholarly works. In this sense, the institutional repository is a
> complement and a supplement, rather than a substitute, for traditional
> scholarly publication venues.

Not only is the institutional archive a supplement rather than a
substitute when it self-archives data that could not be included with
the published article, but it is a supplement even when it self-archives
the article: The self-archived open-access version is a supplement to the
journal's toll-access version, to maximize its research impact. It is not
a substitute for journal publication -- and certainly not a substitute
for peer review -- though it might one day become a substitute for
toll-access (for those who can afford it: for those who cannot, it
is already a substitute today!).

> where the disciplinary practice is ready, institutional repositories can
> feed disciplinary repositories directly. In cases where the disciplinary
> culture is more conservative, where scholarly societies or key journals
> choose to hold back change, institutional repositories can help
> individual faculty take the lead in initiating shifts in disciplinary
> practice.

There is no need -- in the age of OAI-interoperability -- for
institutional archives to "feed" central disciplinary archives: They
need only feed OAI metadata harvesters. The institution is the natural
locus for self-archiving its own research output, for each of
its disciplines. And it is individual researchers, not disciplines,
who will overcome the old habits, with the incentive to self-archive
coming from the discipline-universal benefits of maximizing research
impact. These benefits are shared by researchers and their institutions,
not by researchers and their disciplines (which are more of a locus
for *competing* for impact than for *sharing* it!). And journals are not
holding back change (and cannot): They are themselves changing with the
new possibilities the online medium has provided to allow researchers to
maximize their research impact:
http://www.lboro.ac.uk/departments/ls/disresearch/romeo/Romeo%20Publisher%20Policies.htm
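
To illustrate what "feeding OAI metadata harvesters" involves, here is a
minimal Python sketch of the harvester's side: it builds a ListRecords
request for a data-provider and reads Dublin Core fields out of a
response. The base URL, record identifier and sample response are
hypothetical, made up for illustration; only the verb and metadataPrefix
parameters and the XML namespaces come from the OAI-PMH 2.0
specification. No network call is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespaces defined by OAI-PMH 2.0 and by Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical response from an institutional archive (illustration only).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:eprints.example.edu:42</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A hypothetical refereed article</dc:title>
          <dc:creator>Author, A.</dc:creator>
          <dc:identifier>http://eprints.example.edu/42/</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the request a harvester sends to a data-provider."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def parse_records(xml_text):
    """Extract metadata only; the full text stays in the institutional archive."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        records.append({
            "oai_id": record.findtext("oai:header/oai:identifier",
                                      default="", namespaces=NS),
            "title": record.findtext(".//dc:title", default="", namespaces=NS),
            "creators": [c.text for c in record.iterfind(".//dc:creator", NS)],
            # Pointer back to the full text held by the institution.
            "full_text_url": record.findtext(".//dc:identifier",
                                             default="", namespaces=NS),
        })
    return records


print(list_records_url("http://eprints.example.edu/perl/oai2"))
for rec in parse_records(SAMPLE_RESPONSE):
    print(rec["title"], "->", rec["full_text_url"])
```

The point of the sketch is that the harvester ends up holding only
titles, authors and a pointer; the paper itself never leaves the
institutional archive.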

But it is certainly true that university archives can help faculty take
the lead by providing the resources and policy that facilitate
self-archiving:
http://www.eprints.org/self-faq/#institution-facilitate-filling

> Institutional repositories can encourage the exploration and adoption of
> new forms of scholarly communication... This, to me, is perhaps the most
> important and exciting payoff

Here is where Cliff and I disagree. Exciting as they are, the new forms
are not the immediate priority: Open access to the "old forms" is. Then
the new forms will come too. But first the full research impact of the
old forms, at last. They will pave the way for the rest.

> The first potential danger is that institutional repositories are cast
> as tools of institutional (administrative) strategies to exercise
> control over what has typically been faculty controlled intellectual
> work. I believe that any institutional repository approach that requires
> deposit of faculty or student works and/or uses the institutional
> repository as a means of asserting control or ownership over these works
> will likely fail, and probably deserves to fail... This is not to say
> that policies mandating the deposit of materials that are broadly
> recognized as part of the institutional record ... are inappropriate.

I agree completely. The purpose of institutional archives and
archive-filling policies is not to assert control or ownership over
faculty research output! It is to maximize its research impact by
maximizing user access to it.
http://www.ecs.soton.ac.uk/~harnad/Temp/Ariadne-RAE.htm
http://paracite.eprints.org/cgi-bin/rae_front.cgi

Mixing up the open-access agenda with other university dreams about
generating new revenue streams from faculty intellectual output (software,
patents, courseware, distance education, electronic publishing) is not
only wrong-headed, but it risks delaying the real and sizeable benefits
of open access to refereed research output, turning the institutional
repository movement into aimless gridlock for some time to come.

> My second concern is... [that] administrators, librarians, and faculty
> members wishing to challenge existing systems of scholarly publishing
> (specifically their economic models and their creation of barriers to
> access through intellectual property control and licensing arrangements)
> may try to link their efforts too directly to institutional repositories
> by imposing inappropriate policy constraints 

I agree. See above. And here is a model for an appropriate policy:
http://www.ecs.soton.ac.uk/~lac/archpol.html

> it dramatically underestimates the importance of institutional
> repositories to characterize them as instruments for restructuring the
> current economics of scholarly publishing

I agree again. It is not the business of universities to restructure the
economics of scholarly publishing. It is the business of universities to
do research, publish their findings, and make sure that those findings are
put to full use. Maximizing all would-be users' access to them is the
way to ensure the latter. And that might (but just might) eventually
have some effects on the economics of refereed journal publication. But
that would only be a side-effect, not the direct motivation or
justification at all: That direct motivation and justification is
to maximize the impact of institutional research output by making it
open-access -- by self-archiving it in the institutional repository.

> the institutional repository isn't a journal, or a collection of
> journals, and should not be managed like one. That's not the point or
> the purpose of an institutional repository.

Correct. It is an open-access supplement to toll-access via the journals.

> Institutional repositories are not a challenge or alternative to
> disciplinary repositories; rather, they complement them, just as they
> can complement existing venues of scholarly publication.

In the era of OAI, institutional and disciplinary archives are equivalent,
because completely interoperable. However, the shared interest of
researchers and their institutions in maximizing the impact of their
research output makes institutional archives a better bet for hastening
open access, especially as they are in a position to modify their
existing publish/perish policies so as to mandate self-archiving in
order to maximize research impact.
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0293.html

> It is desirable to make this as simple as possible... with a simple and
> stable submission interface to the institutional repository. 

The simple solution is available already: See the 60+ Eprints.org
institutional archives http://software.eprints.org/#ep2
in use for over 2 years and growing:
http://www.ecs.soton.ac.uk/~harnad/Temp/tim.ppt

The challenging part lies not in creating the free self-archiving software,
nor in making it simple, nor in getting it adopted, but in getting
the archives filled, which requires a clear, coherent institutional
self-archiving policy -- with a clear sense of *what* needs to be
self-archived, *how* and *why*:
http://www.ecs.soton.ac.uk/~lac/archpol.html

> It's vital that institutions recognize institutional repositories as a
> serious and long-lasting commitment to the campus community (and to the
> scholarly world, and the public at large) 

Yes, but *far* more important than this advance long-lasting commitment
to an empty archive is a coherent policy for getting it filled!

> An institutional repository can fail over time for many reasons: policy
> (for example, the institution chooses to stop funding it), management
> failure or incompetence, or technical problems. Any of these failures
> can result in the disruption of access...I worry a great deal about what
> the various impacts and implications of the first few major failures of
> institutional repositories

And I worry a great deal about worries about the permanence of empty
or even non-existent archives, instead of directing all energies and
resourcefulness to filling the archives! Get the precious intellectual
eggs into the basket, and their very presence there will be the best
guarantor that they will be maintained in perpetuum. Worry instead
about permanence now and all you do is add another item to the
long list of needless worries that are holding back self-archiving:
http://www.eprints.org/self-faq/#1.Preservation

And this is also the point to remind ourselves, again, that
self-archiving is a *supplement* to, not a *substitute* for journal
publication. Until and unless there is a transition and downsizing 
from toll-access journal publication to open-access journal publication,
the primary preservation burden is not on the institutional archives!
Their burden is merely to provide open access to the refereed research
output, now, as a supplement for those who cannot afford toll-access.

So stop worrying about archives failing and work instead on archives
filling!

> Not every higher education institution will need or want to run an
> institutional repository, though I think ultimately almost every such
> institution will want to offer some institutional repository services to
> its community. We will see various forms of consortial or cluster
> institutional repositories. 

Maybe. But it seems to me that this is only a substantive question if we
are talking about the industrial strength archive software such as
DSpace. For "light" software such as Eprints, there is so little
start-up time and maintenance required that I would think any
institution that generated research output could and would run its own.
(Again, there is not enough *content* yet to talk about fancy consortial
schemes! Let's get the culture of self-archiving rolling before we worry
about the load being too great for an institution to manage on its own!)

> Federation of institutional repositories may also subsume the
> development of arrangements that recognize and facilitate faculty
> mobility and cross-institutional collaborations.

This can be managed at the metadata level without any special need to
"federate" (over and above OAI-interoperability). A metadata tag
indicating current institutions, and tags indicating prior institutions
and dates will allow all research to be triangulated upon (for where it
was done, and when).
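
A minimal Python sketch of how such tags might work, assuming each
author record simply carries a list of dated affiliations. The field
names and the sample institutions are invented for illustration and do
not correspond to any existing metadata schema:

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Affiliation:
    institution: str
    start: date
    end: Optional[date] = None  # None marks the current institution


def institution_on(affiliations: List[Affiliation], when: date) -> Optional[str]:
    """Return the institution where the author worked on a given date."""
    for a in affiliations:
        if a.start <= when and (a.end is None or when <= a.end):
            return a.institution
    return None


# Hypothetical career history of a researcher who moved once.
history = [
    Affiliation("University A", date(1995, 1, 1), date(2000, 8, 31)),
    Affiliation("University B", date(2000, 9, 1)),
]

print(institution_on(history, date(1998, 5, 1)))  # University A
print(institution_on(history, date(2003, 3, 15)))  # University B
```

With tags like these, a paper's date is enough to credit it to the
right institution, whichever archive happens to hold it.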

> The MIT [free repository] software is not the only option available,
> although I believe it is the most general-purpose; for example, there
> is [free repository] software from the University of Southampton in
> the U.K. <http://www.eprints.org/> designed more specifically for
> institutional or disciplinary repositories of papers, as opposed to
> arbitrary digital materials.

And I have here tried to give the reasons why the pressing challenge now
is not general-purpose archiving of arbitrary digital materials, but
the self-archiving of institutional refereed research output, to
maximize its research impact by maximizing its visibility and
accessibility, through open access.
http://www.ecs.soton.ac.uk/~harnad/Temp/unto-others.html

Stevan Harnad

-------------------------------------------------------------------
NOTE: A complete archive of the ongoing discussion of providing open
access to the peer-reviewed research literature online is available at
the American Scientist September Forum (98 & 99 & 00 & 01 & 02):

    http://amsci-forum.amsci.org/archives/september98-forum.html
                            or
    http://www.cogsci.soton.ac.uk/~harnad/Hypermail/Amsci/index.html

Discussion can be posted to: september98-forum AT amsci-forum.amsci.org 

See also the Budapest Open Access Initiative:
    http://www.soros.org/openaccess

the BOAI Forum:
    http://www.eprints.org/boaiforum.php/

the Free Online Scholarship Movement:
    http://www.earlham.edu/~peters/fos/timeline.htm

the SPARC position paper on institutional repositories:
    http://www.unites.uqam.ca/src/sante.htm

the OAI site:
    http://www.openarchives.org

and the free OAI institutional archiving software site:
    http://www.eprints.org/



[BOAI] Re: Cliff Lynch on Institutional Archives

From: Stevan Harnad <harnad AT ecs.soton.ac.uk>
Date: Sun, 16 Mar 2003 14:15:56 +0000 (GMT)


On Sat, 15 Mar 2003, Thomas Krichel wrote:

>   Stevan Harnad writes:
> 
>sh> There is no need -- in the age of OAI-interoperability -- for
>sh> institutional archives to "feed" central disciplinary archives:
> 
>   I do not share what I see as a  blind faith in interoperability
>   through a technical protocol. 

I am quite happy to defer to the technical OAI experts on this one, but let
us put the question precisely: 

Thomas Krichel suggests that institutional (OAI) data-archives
(full-texts) should "feed" disciplinary (OAI) data-archives,
because OAI-interoperability is somehow not enough. I suggest that
OAI-interoperability (if I understand it correctly) should be enough. No
harm in redundant archiving, of course, for backup and security, but not
necessary for the usage and functionality itself. In fact, if I understand
correctly the intent of the OAI distinction between OAI data-providers -- 
http://www.openarchives.org/Register/BrowseSites.pl 
-- and OAI service-providers --
http://www.openarchives.org/service/listproviders.html 
-- it is not the full-texts of data-archives that need to be "fed" to
(i.e., harvested by) the OAI service providers, but only their metadata.

Hence my conclusion that distributed, interoperable OAI institutional
archives are enough (and the fastest route to open-access). No need
to harvest their contents into central OAI discipline-based archives
(except perhaps for redundancy, as backup). Their OAI interoperability
should be enough so that the OAI service-providers can (among other things)
do the "virtual aggregation" by discipline (or any other computable
criterion) by harvesting the metadata alone, without the need to harvest
full-text data-contents too.
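
As a rough illustration of "virtual aggregation", the following Python
sketch groups metadata records from several institutional archives by
subject, keeping only pointers to the full texts. The archive names,
titles, URLs and the "subjects" field are all invented for the example;
a real service-provider would derive them from harvested OAI metadata.

```python
from collections import defaultdict

# Hypothetical metadata records, as if harvested from three separate
# institutional archives (all names and URLs are invented).
HARVESTED = [
    {"archive": "eprints.univ-a.example", "title": "Paper on trade",
     "subjects": ["economics"],
     "url": "http://eprints.univ-a.example/1/"},
    {"archive": "eprints.univ-b.example", "title": "Paper on cognition",
     "subjects": ["psychology"],
     "url": "http://eprints.univ-b.example/7/"},
    {"archive": "eprints.univ-c.example", "title": "Paper on auctions",
     "subjects": ["economics", "computer science"],
     "url": "http://eprints.univ-c.example/3/"},
]


def aggregate_by_subject(records):
    """Group pointers to full texts by subject; nothing is copied or moved."""
    index = defaultdict(list)
    for record in records:
        for subject in record["subjects"]:
            index[subject].append(record["url"])
    return dict(index)


by_discipline = aggregate_by_subject(HARVESTED)
print(by_discipline["economics"])
```

The "economics" view here spans two different institutions, yet no
full text was harvested into a central archive to produce it.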

It should be noted, though, that Thomas Krichel's excellent RePEc
archive and service in Economics -- http://repec.org/ -- goes
well beyond the confines of OAI-harvesting! RePEc harvests non-OAI
content too, along lines similar to the way ResearchIndex/citeseer --
http://citeseer.nj.nec.com/cs -- harvests non-OAI content in computer
science. What I said about there being no need to "feed" institutional
OAI archive content into disciplinary OAI archives certainly does not
apply to *non-OAI* content, which would otherwise be scattered
willy-nilly all over the net and not integrated in any way. Here RePEc's
and ResearchIndex's harvesting is invaluable, especially as RePEc already
does (and ResearchIndex has announced that it plans to) make all its
harvested content OAI-compliant!

To summarize: The goal is to get all research papers, pre- and
post-peer-review, openly accessible (and OAI-interoperable) as soon as
possible. (These are BOAI Strategies 1 [self-archiving] and 2
[open-access journals]: http://www.soros.org/openaccess/read.shtml
). In principle this can be done by (1) self-archiving them in central
OAI disciplinary archives like the Physics arXiv (the biggest and
first of its kind) -- http://arxiv.org/show_monthly_submissions
-- by (2) self-archiving them in distributed institutional OAI
Archives -- http://www.ecs.soton.ac.uk/~harnad/Temp/tim.ppt -- by (3)
self-archiving them on arbitrary Web and FTP sites (and hoping they
will be found or harvested by services like RePEc or ResearchIndex)
or by (4) publishing them in open-access journals (BOAI Strategy 2:
http://www.soros.org/openaccess/journals.shtml ).

My point was only that because researchers and their institutions
(*not* their disciplines) have shared interests vested in maximizing
their joint research impact and its rewards, institution-based
self-archiving (2) is a more promising way to go -- in the age of
OAI-interoperability -- than discipline-based self-archiving (1), even
though the latter began earlier. It is also obvious that both (1) and
(2) are preferable to arbitrary Web and FTP self-archiving (3), which
began even earlier (although harvesting arbitrary Website and FTP contents
into OAI-compliant Archives is still a welcome makeshift strategy
until the practice of OAI self-archiving is up to speed). Creating new
open-access journals and converting the established (20,000) toll-access
journals to open-access is desirable too, but it is obviously a much
slower and more complicated path to open access than self-archiving,
so should be pursued in parallel.

My conclusion in favor of institutional self-archiving is based on the
evidence and on logic, and it represents a change of thinking,
for I had originally advocated (3) Web/FTP self-archiving --
http://www.arl.org/scomm/subversive/toc.html -- then switched allegiance
to central self-archiving (1), even creating a discipline-based archive:
http://cogprints.ecs.soton.ac.uk/ But with the advent of OAI in 1999,
plus a little reflection, it became apparent that
institutional self-archiving (2) was the fastest, most direct, and most
natural road to open access: http://www.eprints.org/ 
And since then its accumulating momentum seems to be confirming that this
is indeed so: http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2212.html
http://www.ecs.soton.ac.uk/~harnad/Temp/tim.ppt

>   The primary sense of belonging
>   of a scholar in her research activities is with the disciplinary
>   community of which she thinks herself a part... It certainly
>   is not with the institution. 

That may or may not be the case, but in any case it is irrelevant to
the question of which is the more promising route to open-access. Our
primary sense of belonging may be with our family, our community,
our creed, our tribe, or even our species. But our rewards (research
grant funding and overheads, salaries, postdocs and students attracted
to our research, prizes and honors) are intertwined and shared with our
institutions (our employers) and not our disciplines (which are often
in fact the locus of competition for those same rewards!)

>   Therefore, if you want to fill
>   institutional archives---which I agree is the best long-run way
>   to enhance access and preservation to scholarly research--- [the]
>   institutional archive has to be accompanied by a discipline-based
>   aggregation process. 

But the question is whether this "aggregation" needs to be the "feeding"
of institutional OAI archive contents into disciplinary OAI archives, or
merely the "feeding" of OAI metadata into OAI services.

>    The RePEc project has produced such an aggregator
>   for economics for a while now. I am sure that other, similar
>   projects will follow the same aims, but, with the benefit of
>   hindsight, offer superior service. The lack of such services
>   in many disciplines,  or the lack of interoperability between
>   disciplinary and institutional archives, are major obstacles to
>   the filling of the institutional archives.  There are no
>   inherent contradictions between institution-based archives
>   and disciplinary aggregators,

There is no contradiction. In fact, I suspect this will prove to be a
non-issue, once we confirm that (a) we agree on the need for
OAI-compliance and (b) "aggregation" amounts to metadata-harvesting and
OAI service-provision when the full-texts in the institutional
archive are OAI-compliant (and calls for full-text harvesting only
if/when they are not). Content "aggregation," in other words, is a
paper-based notion. In the online era, it merely means digital sorting
of the pointers to the content.

>   In the paper that Stevan refers to, Cliff Lynch writes,
>   at http://www.arl.org/newsltr/226/ir.html
> 
>cl> But consider the plight of a faculty member seeking only broader
>cl> dissemination and availability of his or her traditional journal
>cl> articles, book chapters, or perhaps even monographs through use of
>cl> the network, working in parallel with the traditional scholarly
>cl> publishing system.
> 
>   I am afraid there are more and more such faculty members. Many
>   of the research papers found over the Internet are deposited
>   in this way. This trend is growing, not declining.

You mean self-archiving in arbitrary non-OAI author websites? That is
another reason why institutional OAI archives and official institutional
self-archiving policies (and assistance) are so important. In reality,
it is far easier to deposit and maintain one's papers in institutional
OAI archives like Eprints than to set up and maintain one's own website.
All that is needed is a clear official institutional policy, plus
some startup help in launching it. (No such thing is possible at a
"discipline" level.)
http://www.ecs.soton.ac.uk/~lac/archpol.html 
http://www.eprints.org/self-faq/#institution-facilitate-filling 
http://www.ecs.soton.ac.uk/~harnad/Temp/Ariadne-RAE.htm
http://paracite.eprints.org/cgi-bin/rae_front.cgi

>cl> Such a faculty member faces several time-consuming problems. He or
>cl> she must exercise stewardship over the actual content and its
>cl> metadata: migrating the content to new formats as they evolve over
>cl> time, creating metadata describing the content, and ensuring the
>cl> metadata is available in the appropriate schemas and formats and
>cl> through appropriate protocol interfaces such as open archives
>cl> metadata harvesting.
> 
>   Sure, but academics do not like their work-, and certainly
>   not their publishing-habits, [to] be interfered with by external
>   forces. Organizing academics is like herding cats!

I am sure academics didn't like to be herded into publishing with the
threat of perishing either. Nor did they like switching from paper to
word-processors. Their early counterparts probably clung to the oral
tradition, resisting writing too; and monks did not like being herded from
their peaceful manuscript-illumination chambers to the clamour of
printing presses. But where there is a causal contingency -- as there is
between (a) the research impact and its rewards, which academics like as
much as anyone else, and (b) the accessibility of their research -- academics
are surely no less responsive than Prof. Skinner's pigeons and rats to
those causal contingencies, and which buttons they will have to press 
in order to maximize their rewards!
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving.htm

Besides, it is not *publishing* habits that need to be changed, but
*archiving* habits, which are an online supplement, not a substitute,
for existing (and unchanged) publishing habits.

>cl> Faculty are typically best at creating new
>cl> knowledge, not maintaining the record of this process of
>cl> creation. Worse still, this faculty member must not only manage
>cl> content but must manage a dissemination system such as a personal
>cl> Web site, playing the role of system administrator (or the manager of
>cl> someone serving as a system administrator).
> 
>   There are lot of ways in which to maintain a web site or to get
>   access to a maintained one. It is a customary activity these days and
>   no longer requires much technical expertise. A primitive integration
>   of the contents can be done by Google, it requires  no metadata.
>   Academics don't care  about long-run preservation, so that problem
>   remains unsolved. In the meantime, the academic who uploads papers to
>   a web site takes steps to resolve the most pressing problem, access.

Agreed. And uploading it into a departmental OAI Eprints Archive is
by far the simplest and most effective way to do all of that. All it
needs is a policy to mandate it:
http://www.ecs.soton.ac.uk/~lac/archpol.html

>cl> Over the past few years, this has ceased to be a reasonable activity
>cl> for most amateurs; software complexity, security risks, backup
>cl> requirements, and other problems have generally relegated effective
>cl> operation of Web sites to professionals who can exploit economies of
>cl> scale, and who can begin each day with a review of recently issued
>cl> security patches.
> 
>   These are technical concerns. When you operate a linux box
>   on the web you simply fire up a script that will download
>   the latest version. That is easy enough. Most departments
>   have separate web operations. Arguing for one institutional
>   archive for digital contents is akin to calling for a single web
>   site for an institution. The diseconomies of scale of central
>   administration impose other types of costs than the ones that it was
>   meant to reduce. The secret is to find a middle way.

I couldn't quite follow all of this. The bottom line is this: The free
Eprints.org software (for example) can be installed within a few days. It
can then be replicated to handle all the departmental or research group
archives a university wants, with minimal maintenance time or costs. The
rest is just down to self-archiving, which takes a few minutes for the
first paper, and even less time for subsequent papers (as the repeating
metadata -- author, institution, etc., can be "cloned" into each new
deposit template). An institution may wish to impose an institutional
"look" on all of its separate eprints archives; but apart from that,
they can be as autonomous and as distributed and as many as desired:
OAI-interoperability works locally just as well as it does globally.

>cl> Today, our faculty time is being wasted, and expended ineffectively,
>cl> on system administration activities and content curation. And,
>cl> because system administration is ineffective, it places our
>cl> institutions at risk: because faculty are generally not capable of
>cl> responding to the endless series of security exposures and patches,
>cl> our university networks are riddled with vulnerable faculty machines
>cl> intended to serve as points of distribution for scholarly works.
> 
>   This is the fight many faculty face every day, where they
>   want to innovate scholarly communication, but someone
>   in the IT department does not give the necessary permission
>   for network access...

I don't think I need to get into this. It's not specific to
self-archiving, and a tempest in a teapot as far as that is concerned. An
efficient system can and will be worked out once there is an effective
institutional self-archiving policy. There are already plenty of excellent
examples, such as CalTech: 
http://library.caltech.edu/digital/ 
See also:
http://software.eprints.org/#ep2

Stevan Harnad


Re: [BOAI] Re: Cliff Lynch on Institutional Archives

From: "Margaret H. Freeman" <freemamh AT lavc.edu>
Date: Sun, 16 Mar 2003 09:35:42 -0500


Threading: [BOAI] Re: Cliff Lynch on Institutional Archives from harnad AT ecs.soton.ac.uk
      • This Message
             [BOAI] Re: Cliff Lynch on Institutional Archives from harnad AT ecs.soton.ac.uk
             Re: [BOAI] Re: Cliff Lynch on Institutional Archives from lqthede AT apk.net
             Re: [BOAI] Re: Cliff Lynch on Institutional Archives from krichel AT openlib.org

On 3/16/03 9:15 AM, "Stevan Harnad" <harnad AT ecs.soton.ac.uk> wrote:

> The bottom line is this: The free
> Eprints.org software (for example) can be installed within a few days. It
> can then be replicated to handle all the departmental or research group
> archives a university wants, with minimal maintenance time or costs. The
> rest is just down to self-archiving, which takes a few minutes for the
> first paper, and even less time for subsequent papers (as the repeating
> metadata -- author, institution, etc., can be "cloned" into each new
> deposit template). An institution may wish to impose an institutional
> "look" on all of its separate eprints archives; but apart from that,
> they can be as autonomous and as distributed and as many as desired:
> OAI-interoperability works locally just as well as it does globally.

I'd like to ask Stevan Harnad what arrangements can be made for publishing
faculty and independent scholars who don't have the kind of institutional
connections like a major research university for making their work OAI
accessible without having to create personal websites. Is there some
distributed depository that is or could be made available to them?

Margaret Freeman
Emeritus Professor
Los Angeles Valley College



[BOAI] Re: Cliff Lynch on Institutional Archives

From: Stevan Harnad <harnad AT ecs.soton.ac.uk>
Date: Sun, 16 Mar 2003 18:17:40 +0000 (GMT)


Threading: Re: [BOAI] Re: Cliff Lynch on Institutional Archives from freemamh AT lavc.edu
      • This Message

On Sun, 16 Mar 2003, Margaret H. Freeman wrote:

> I'd like to ask Stevan Harnad what arrangements can be made for publishing
> faculty and independent scholars who don't have the kind of institutional
> connections like a major research university for making their work OAI
> accessible without having to create personal websites. Is there some
> distributed depository that is or could be made available to them?

(1) Self-archiving of refereed, published research is not the same
as self-publishing.
http://www.ecs.soton.ac.uk/~harnad/Tp/resolution.htm#1.4

(2) Unaffiliated faculty can publish as they always did, but they can
self-archive their preprints and postprints either in central archives
such as http://cogprints.ecs.soton.ac.uk/ or (as is often the case
in the relations between universities and unaffiliated scholars), they
can be allowed to self-archive in a university's eprint archive.

(3) Or perhaps your question referred to the hypothetical future,
if/when all toll-access journals become open-access journals, charging
authors' institutions for the peer-review service? My guess would be that
unaffiliated authors are rare enough so a slush fund can cover their
costs for them out of a tiny portion of the costs paid by the institutions
of affiliated authors. (I don't believe this is a significant issue -- and
it is in any case hypothetical, as most journals are not yet open-access.)
http://www.ecs.soton.ac.uk/~harnad/Tp/resolution.htm#4.2

Stevan Harnad


Re: [BOAI] Re: Cliff Lynch on Institutional Archives

From: Linda Thede <lqthede AT apk.net>
Date: Sun, 16 Mar 2003 13:12:24 -0500


Threading: Re: [BOAI] Re: Cliff Lynch on Institutional Archives from freemamh AT lavc.edu
      • This Message
             Re: [BOAI] Re: Cliff Lynch on Institutional Archives from harnad AT ecs.soton.ac.uk

The problem that Margaret writes about is particularly true in practice
disciplines such as nursing or physical therapy. Those in these disciplines
are also at a great disadvantage when it comes to accessing the literature -
outside of academic medical centers, few healthcare facilities have good
libraries. They of course will be among the big beneficiaries of OAI.

I do have an additional question, and perhaps it stems from not understanding
this process, but how will all these individual institutional papers be indexed
so interested persons can find them?

"Margaret H. Freeman" wrote:

> On 3/16/03 9:15 AM, "Stevan Harnad" <harnad AT ecs.soton.ac.uk> wrote:
>
> > The bottom line is this: The free
> > Eprints.org software (for example) can be installed within a few days. It
> > can then be replicated to handle all the departmental or research group
> > archives a university wants, with minimal maintenance time or costs. The
> > rest is just down to self-archiving, which takes a few minutes for the
> > first paper, and even less time for subsequent papers (as the repeating
> > metadata -- author, institution, etc., can be "cloned" into each new
> > deposit template). An institution may wish to impose an institutional
> > "look" on all of its separate eprints archives; but apart from that,
> > they can be as autonomous and as distributed and as many as desired:
> > OAI-interoperability works locally just as well as it does globally.
>
> I'd like to ask Stevan Harnad what arrangements can be made for publishing
> faculty and independent scholars who don't have the kind of institutional
> connections like a major research university for making their work OAI
> accessible without having to create personal websites. Is there some
> distributed depository that is or could be made available to them?
>
> Margaret Freeman
> Emeritus Professor
> Los Angeles Valley College

--
Linda Q. Thede
435-4 Chandler Drive
Aurora, OH 44202
lqthede AT apk.net
330-562-3281




Re: [BOAI] Re: Cliff Lynch on Institutional Archives

From: Thomas Krichel <krichel AT openlib.org>
Date: Sun, 16 Mar 2003 20:57:13 +0200


Threading: Re: [BOAI] Re: Cliff Lynch on Institutional Archives from freemamh AT lavc.edu
      • This Message
             [BOAI] Re: Cliff Lynch on Institutional Archives from harnad AT ecs.soton.ac.uk

  Stevan Harnad writes

> Hence my conclusion that distributed, interoperable OAI institutional
> archives are enough (and the fastest route to open-access). No need
> to harvest their contents into central OAI discipline-based archives
> (except perhaps for redundancy, as backup).

  I agree. 

  But this is not what I mean by "not enough". I suggest that 
  institutional archives will lie empty unless there are better
  incentives for scholars to contribute to them. If you tell
  them that it will open their scholarship to the world to
  read, they will listen. If you tell them, figures in hand,
  how much it does, and how much impact they gain---relative
  to their colleagues in the offices next door---they will act.
  To be able to build such measures, you need to build complicated
  datasets. This is too complex a task to be done in all disciplines
  at once. Therefore you need to work discipline by discipline. 
  
> It should be noted, though, that Thomas Krichel's excellent RePEc
> archive and service in Economics -- http://repec.org/ -- goes
> well beyond the confines of OAI-harvesting! RePEc harvests non-OAI
> content too, along lines similar to the way ResearchIndex/citeseer --
> http://citeseer.nj.nec.com/cs

  Not really, these systems are quite different actually. But
  this is a matter for another email...

> by (3) self-archiving them on arbitrary Web and FTP sites (and
> hoping they will be found or harvested by services like RePEc or
> ResearchIndex)

  RePEc is not a harvesting service. RePEc has pioneered the way
  OAI operates before there was OAI. The degree of interoperability
  that it achieves goes way beyond what OAI achieves at present,
  but we are only at the start with OAI, remember. Basically RePEc aims to
  achieve a type of dataset that will allow impact to be measured---as
  mentioned in my first paragraph---but it is not quite there yet.
  In the meantime, it acts as the starting point for a whole bunch
  of user and contributor services.

  (sorry, I could not resist...)

> My conclusion in favor of institutional self-archiving is based on the
> evidence and on logic, and it represents a change of thinking,
> for I had originally advocated (3) Web/FTP self-archiving --
> http://www.arl.org/scomm/subversive/toc.html -- then switched allegiance
> to central self-archiving (1), even creating a discipline-based archive:
> http://cogprints.ecs.soton.ac.uk/ But with the advent of OAI in 1999,
> plus a little reflection, it became apparent that
> institutional self-archiving (2) was the fastest, most direct, and most
> natural road to open access: http://www.eprints.org/
> And since then its accumulating momentum seems to be confirming that this
> is indeed so: http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2212.html
> http://www.ecs.soton.ac.uk/~harnad/Temp/tim.ppt

  Hmm, with you changing your mind, and with more than a little
  reflection over that many years, I think all of us on this
  forum will be convinced that the best road is not an easy topic
  to approach. I don't have the answer either, but I will show
  instead that there is no answer.

  The way I see it, if you want to achieve self-archiving,
  you have to get authors to self-archive. To do that, you need
  to find the right incentives. One way is to have Clifford Lynch
  running around campus, switching off every independent web
  service because it is a security risk, and then forcing faculty
  to publish digitally through a central facility. Granted, my
  vision of Clifford's intention is exaggerated, but even a milder
  form of it will not succeed. This is no way to run a university.
  Right? So you are left to find a way to give
  incentives to academics. Now, please accept my hypothesis that
  publishing is done more with the academic colleagues in mind
  rather than with the university's central administration
  in mind. Then you inevitably end up with a situation where
  you have to get a whole discipline along to self-archive. As
  long as others in the discipline are not doing it, there
  is little interest in the individual scholar doing it. They
  may send the paper directly to closed-access publisher facilities
  or, maybe in addition, upload it on a web site somewhere.

> >   The primary sense of belonging
> >   of a scholar in her research activities is with the disciplinary
> >   community of which she thinks herself a part... It certainly
> >   is not with the institution.
> 
> That may or may not be the case, but in any case it is irrelevant to
> the question of which is the more promising route to open-access. Our
> primary sense of belonging may be with our family, our community,
> our creed, our tribe, or even our species. But our rewards (research
> grant funding and overheads, salaries, postdocs and students attracted
> to our research, prizes and honors) are intertwined and shared with our
> institutions (our employers) and not our disciplines (which are often
> in fact the locus of competition for those same rewards!)

  Sure, that is why we need institutional support to take the competition
  head on, by maximising the impact of our work. But the object of 
  the competition is still the discipline.

> Content "aggregation," in other words, is a paper-based notion. In
> the online era, it merely means digital sorting of the pointers to
> the content.

  I understand that. But you can aggregate and aggregate; as
  long as you do not prove that formal archiving improves impact,
  you are not likely to get far with your formal archiving.

> >   I am afraid there are more and more such faculty members. Many
> >   of the research papers found over the Internet are deposited
> >   in this way. This trend is growing, not declining.
> 
> You mean self-archiving in arbitrary non-OAI author websites? 

  I do.

> There is another reason why institutional OAI archives and official
> institutional self-archiving policies (and assistance) are so
> important. In reality, it is far easier to deposit and maintain
> one's papers in institutional OAI archives like Eprints than to set
> up and maintain one's own website.  All that is needed is a clear
> official institutional policy, plus some startup help in launching
> it. (No such thing is possible at a "discipline" level.)

> http://www.ecs.soton.ac.uk/~lac/archpol.html
> http://www.eprints.org/self-faq/#institution-facilitate-filling
> http://www.ecs.soton.ac.uk/~harnad/Temp/Ariadne-RAE.htm
> http://paracite.eprints.org/cgi-bin/rae_front.cgi

  If this is what authors feel, then this is wonderful. But the
  proof of the pudding is in the eating. If the authors do not
  deposit, you will have to think (yet again) about your best
  strategy.

  Incidentally, have you deposited all your papers in institutional
  archives? I see some ~harnad above. Heaven forbid I tell Clifford
  about this :-) 

> But where there is a causal contingency -- as there is
> between (a) the research impact and its rewards, which academics like as
> much as anyone else, and (b) the accessibility of their research -- academics
> are surely no less responsive than Prof. Skinner's pigeons and rats to
> those causal contingencies, and which buttons they will have to press
> in order to maximize their rewards!
> http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving.htm

  Yes, but arguing in the aggregate is not sufficient, I think.
  You have to demonstrate that to individual academics, figures in
  hand. In the meantime you have to collect formally archived contents.
  Institutional archiving is one way, departmental is another way,
  discipline-based archiving another, but there is no "right" or
  "wrong" way. Whichever way is chosen, discipline-based services will
  be a key to providing incentives to scholars.

  With greetings from Minsk, Belarus,


  Thomas Krichel                         http://openlib.org/home/krichel
                                     RePEc:per:1965-06-05:thomas_krichel



[BOAI] Re: Cliff Lynch on Institutional Archives

From: Stevan Harnad <harnad AT ecs.soton.ac.uk>
Date: Mon, 17 Mar 2003 01:50:59 +0000 (GMT)


Threading: Re: [BOAI] Re: Cliff Lynch on Institutional Archives from krichel AT openlib.org
      • This Message
             Re: [BOAI] Re: Cliff Lynch on Institutional Archives from paquetse AT iro.umontreal.ca
             Re: [BOAI] Re: Cliff Lynch on Institutional Archives from radu AT monicsoft.net
             [BOAI] Re: Cliff Lynch on Institutional Archives from harnad AT ecs.soton.ac.uk
             Re: [BOAI] Re: Cliff Lynch on Institutional Archives from krichel AT openlib.org

I basically agree with Thomas Krichel on all the substantive points:

On Sun, 16 Mar 2003, Thomas Krichel wrote:

>   institutional archives will lie empty unless there are better
>   incentives for scholars to contribute to them. If you tell
>   them that it will open their scholarship to the world to
>   read, they will listen. If you tell them, figures in hand, 
>   how much it does, and how much impact they gain---relative
>   to their colleagues in the offices next door---they will act...
>   Basically RePEc aims to achieve a type of dataset that will allow
>   impact to be measured

I agree. Steve Lawrence has gathered some data along these lines. We are
doing so too. And I know you are too. These data will help demonstrate
to the research community, quantitatively, the direct causal connection
between research access and research impact.
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving.htm

>   you have to get authors to self-archive. To do that, you need
>   to find the right incentives...
>   publishing is done more with the academic colleagues in mind
>   rather than with the university's central administration 
>   in mind. Then you inevitably end up with a situation where
>   you have to get a whole discipline along to self-archive. As
>   long as others in the discipline are not doing it, there 
>   is little interest in the individual scholar doing it. 
>   You have to demonstrate that to individual academics, figures in
>   hand. In the meantime you have to collect formally archived contents.

I also agree completely that until OAI-compliant self-archiving prevails,
harvesting or centralized links to authors' arbitrary websites is extremely
desirable and useful. I expect that there is an order of magnitude
more non-OAI self-archived content (preprints and postprints) on the
Web today than there is OAI content. Harvesting it (citeseer-style) or linking
to it with OAI-equivalent metadata (RePEc-style) is not only valuable
in itself (making a lot of open-access work more visible and usable)
but it will help encourage more self-archiving, as well as providing the
access/impact causality data that will help inspire still more!

[Les Carr is doing it now with the 2001
UK-wide RAE returns, generating "RAEprints":
http://www.hero.ac.uk/rae/submissions/ 
http://www.rareview.ac.uk/
http://www.ecs.soton.ac.uk/~harnad/Temp/Ariadne-RAE.htm ]

(I couldn't quite see the point about why individuals couldn't do it,
and a whole discipline needs to be convinced. Surely individuals
come first, but never mind.) 

>   Incidentally, have you deposited all your papers in institutional
>   archives? I see some ~harnad above.

Of course! All my papers (retroactive to the '70s) have been FTP- and
then web-archived since the late '80s, as well as in CogPrints since
1997 and the Southampton ECS Archive since 1999. Both Archives have since
become OAI-compliant:
http://www.ecs.soton.ac.uk/~harnad/genpub.html
http://www.ecs.soton.ac.uk/~harnad/intpub.html
http://makeashorterlink.com/?R3DD514D3
http://makeashorterlink.com/?S60652783

(I practise what I preach!)

Stevan Harnad

NOTE: A complete archive of the ongoing discussion of providing open
access to the peer-reviewed research literature online is available at
the American Scientist September Forum (98 & 99 & 00 & 01 & 02):

    http://amsci-forum.amsci.org/archives/september98-forum.html
                            or
    http://www.cogsci.soton.ac.uk/~harnad/Hypermail/Amsci/index.html

Discussion can be posted to: september98-forum AT amsci-forum.amsci.org 

See also the Budapest Open Access Initiative:
    http://www.soros.org/openaccess

the BOAI Forum:
    http://www.eprints.org/boaiforum.php/

the Free Online Scholarship Movement:
    http://www.earlham.edu/~peters/fos/timeline.htm

the SPARC position paper on institutional repositories:
    http://www.unites.uqam.ca/src/sante.htm

the OAI site:
    http://www.openarchives.org

and the free OAI institutional archiving software site:
    http://www.eprints.org/



Re: [BOAI] Re: Cliff Lynch on Institutional Archives

From: Stevan Harnad <harnad AT ecs.soton.ac.uk>
Date: Mon, 17 Mar 2003 02:01:29 +0000 (GMT)


Threading: Re: [BOAI] Re: Cliff Lynch on Institutional Archives from lqthede AT apk.net
      • This Message

On Sun, 16 Mar 2003, Linda Thede wrote:

> how will all these individual institutional papers be indexed
> so interested persons can find them?

Let me count the ways:

http://www.openarchives.org/service/listproviders.html
http://oaister.umdl.umich.edu/o/oaister/
http://www.scirus.com/search_simple_boolean/
http://citebase.eprints.org/cgi-bin/search
http://arc.cs.odu.edu/ ...

And as the OAI Archives grow in content --
http://www.ecs.soton.ac.uk/~harnad/Temp/tim.ppt --
there will be more and better cross-archive search engines.
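
A rough sketch of how such a cross-archive service gathers its records:
every OAI-compliant archive answers the same ListRecords request, so a
harvester needs little more than each archive's base URL. The endpoint
addresses below are placeholders, not real archives, and the helper name
is made up for illustration only.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for one archive.

    `verb`, `metadataPrefix` and `set` are standard OAI-PMH request
    arguments; `oai_dc` is the Dublin Core format every archive must offer.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec  # optional: restrict to one set/subject
    return base_url + "?" + urlencode(params)


# Hypothetical base URLs (assumptions, not real endpoints).
ARCHIVES = [
    "http://eprints.example.edu/perl/oai2",
    "http://archive.example.org/oai",
]

# One identical request per archive; a service provider would fetch each
# URL and merge the returned metadata into a single searchable index.
urls = [list_records_url(base) for base in ARCHIVES]
for url in urls:
    print(url)
```

The same request works whether the archive is institutional or
disciplinary, which is why the searcher need not care where a paper
was deposited.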

Stevan Harnad


Re: [BOAI] Re: Cliff Lynch on Institutional Archives

From: Sebastien Paquet <paquetse AT iro.umontreal.ca>
Date: Sun, 16 Mar 2003 21:36:15 -0500 (EST)


Threading: [BOAI] Re: Cliff Lynch on Institutional Archives from harnad AT ecs.soton.ac.uk
      • This Message
             [BOAI] Re: Cliff Lynch on Institutional Archives from harnad AT ecs.soton.ac.uk

On Mon, 17 Mar 2003, Stevan Harnad wrote:

> (I couldn't quite see the point about why individuals couldn't do it,
> and a whole discipline needs to be convinced. Surely individuals
> come first, but never mind.) 

I believe Thomas Krichel is referring to a network effect. A single
telephone is useless; the usefulness of having a telephone increases with
the overall number of telephones. A similar argument can be made for
putting papers in open access, but the value to a particular researcher
strongly depends on the number of people who are also doing it (or at
least using open archives) in his discipline and only weakly on the number
of people who are doing it in other disciplines. This is why it makes
sense to say, as he writes:

 "As long as others in the discipline are not doing it, there
  is little interest in the individual scholar doing it."

Since scholars are much better socially interconnected within disciplines,
peer pressure is likely to drive authors towards open access on a
disciplinary basis. It would be so much easier to convince a researcher 
to jump in if he were persuaded that everyone else in his discipline is 
also going to do it!

Sébastien Paquet
Université de Montréal
-- 
Seb's Open Research -
news, pointers and thoughts on the evolution of knowledge sharing
http://radio.weblogs.com/0110772


[BOAI] Re: Cliff Lynch on Institutional Archives

From: Stevan Harnad <harnad AT ecs.soton.ac.uk>
Date: Mon, 17 Mar 2003 13:53:22 +0000 (GMT)


Threading: Re: [BOAI] Re: Cliff Lynch on Institutional Archives from paquetse AT iro.umontreal.ca
      • This Message

On Sun, 16 Mar 2003, Sebastien Paquet wrote:

> the value to a particular researcher [of] putting papers in open access
> strongly depends on the number of people who are also doing it (or at
> least using open archives) in his discipline and only weakly on the number
> of people who are doing it in other disciplines. 

I think the answer to your own point is contained within your parentheses:
Even if I am the *only* author who self-archives, I get the full
impact-enhancing value of it as long as the relevant researchers are
*using* my self-archived version.

Well, evidence suggests they are! See:
http://www.neci.nec.com/~lawrence/papers/online-nature01/
(And DP9, for example, also ensures that all google-users can access
all OAI papers, if the OAI search engines are not enough to point them
there!  http://www.openarchives.org/service/listproviders.html ).

> Since scholars are much better socially interconnected within disciplines,
> peer pressure is likely to drive authors towards open access on a
> disciplinary basis. It would be so much easier to convince a researcher 
> to jump in if he were persuaded that everyone else in his discipline is 
> also going to do it!

I don't disagree. But the question was merely whether there was any
special need for central, discipline-based OAI Archives, rather than
distributed, institution-based ones, to promote self-archiving and
open-access (given that they are all interoperable anyway). I think the
answer is no. Both kinds of archives are useful and speed us toward open
access (but I still think there are good reasons to believe institutional
self-archiving is the more universal and natural route, as well as the
speediest one!).

Stevan Harnad


Re: [BOAI] Re: Cliff Lynch on Institutional Archives

From: Radu <radu AT monicsoft.net>
Date: Mon, 17 Mar 2003 10:24:52 -0500


Threading: [BOAI] Re: Cliff Lynch on Institutional Archives from harnad AT ecs.soton.ac.uk
      • This Message

Quoting Stevan Harnad <harnad AT ecs.soton.ac.uk>:

> (I couldn't quite see the point about why individuals couldn't
> do it, and a whole discipline needs to be convinced. Surely
> individuals come first, but never mind.) 

Not never mind, because this is a very important point. Disciplines 
are made out of individuals who share (or assume they share) a
specific body of knowledge.

Yes, individuals should do it, but individuals have enough of a hard 
time doing the research and writing the paper. If you give them yet 
another set of tasks to put their papers online, just for the sake of 
others, with no recognition... they simply won't do it.

Personally I'm against the waste of paper that paper-publishing 
creates. Still, I rarely get the time to put the stuff I place online 
in paper format and vice-versa.

Right now I'm working with Marie-Odile Junker at Carleton University,
Ottawa, Canada on documenting the Cree language (a couple of aboriginal
American dialects) online. And she's upset that though this effort
took many more resources than writing a dozen scholarly journal
papers, the University does not give her any publication-related
bonus. THAT is the reason the 'disciplines' have to be made aware of
the power of open distribution.

While I was doing my Master's thesis on the process of psychological 
research itself, I noticed that people are so burnt out after the 
minutiae of research and publishing that they forget or neglect the 
further steps of archiving, to a point that years later they can't 
find copies of the research materials. They tend to delegate this end
bit to lab staff who are simply not trained in archival theory and
practice.

So since then, whenever I get a bit of time out of my work and 
studies, I try to put together a system that would allow researchers 
to do the research steps in an organized way, that would allow them 
to really forget about archiving (other than backing up the archive 
files now and then.)

If you ask researchers to make their work available, you should not 
ask them to put more resources into it (not everyone is THAT 
altruistic), but show them how they can make their work easier.

Interestingly enough, most of my research and development has been 
directly or indirectly funded by the Open Society Foundation :)

Cheers,
Radu
(www.monicsoft.net)



[BOAI] Re: Cliff Lynch on Institutional Archives

From: Stevan Harnad <harnad AT ecs.soton.ac.uk>
Date: Tue, 18 Mar 2003 14:05:02 +0000 (GMT)


Threading: [BOAI] Re: Cliff Lynch on Institutional Archives from harnad AT ecs.soton.ac.uk
      • This Message
             Re: [BOAI] Re: Cliff Lynch on Institutional Archives from radu AT monicsoft.net
             [BOAI] BOAI Forum and FAQ [was: Cliff Lynch on Institutional Archives] from peters AT earlham.edu
             [BOAI] Re: Cliff Lynch on Institutional Archives from krichel AT openlib.org

On Tue, 18 Mar 2003, Christopher Gutteridge wrote:

> we are planning a University-wide eprints archive. I am 
> concerned that some physicists will want to place their items in both
> the university eprints service AND the arXiv physics archive. They may 
> be required to use the university service, but want to use arXiv as it
> is the primary source for their discipline. This is a duplication of 
> effort and a potential irritation.

This is a very minor technical problem (the interoperability of multiple
OAI Archives containing the same paper) and part of another, slightly
less minor problem, namely, version-control, within and across OAI
Archives (the coordination of multiple versions and revisions of the
same paper, within the same or different OAI archives), plus the
optimization of cross-archive OAI search services:
http://www.openarchives.org/service/listproviders.html

I recommend that this be discussed with the pertinent experts in oai-tech
or oai-general. It is not a general archiving or open-access matter, and
can only confuse researchers (needlessly). For them, self-archiving is
the optimal thing to do, institutionally in the first instance, but also
in a central disciplinary archive if/when they wish; and they should
not worry any further about it. (What is needed, urgently, today, is
universal self-archiving, and not trivial worries about whether to do it
here or there or both: OAI-interoperability makes this into a non-issue
from the self-archiver's point of view, and merely a technical feature
to sort out, from the OAI-developers' point of view.)

> Ultimately, of course, I'd hope that disciplinary archives will be replaced
> with subject-specific OAI service providers harvesting from the institutional
> archives. But there is going to be a very long transition period in which
> the solution evolves from our experience.

A very long transition period from what to what? Right now, most OAI
Archives, whether institutional or disciplinary, are either (1)
non-existent, or (2) near-empty! The transition we are striving for is
from empty to full archives (and let us hope it will not be too long!),
not from disciplinary to institutional archives!

What Chris has in mind is only one, exceptional, special case,
namely, the Physics ArXiv, a disciplinary archive (but the *only*
one) which is, since 1991, well on the road to getting filled in
certain subareas of physics (200,000+ papers) (although even this
archive is still a decade from completeness at its present linear
growth rate: http://arxiv.org/show_monthly_submissions see slide 10 of
http://www.ecs.soton.ac.uk/~harnad/Temp/tim-arch.htm )

Chris is imagining that if/when the institutions of those physicists
who are already self-archiving in ArXiv adopt an institutional
self-archiving policy like the one in Chris's own department --
http://www.ecs.soton.ac.uk/~lac/archpol.html -- then some of those
physicists may wonder why/whether they should self-archive twice!
(A tempest in a teapot! The real challenge is getting all the *other*
disciplines to self-archive in the first place. Don't worry about those
physicists who are already ahead of the game. They are not the
problem!)

> What I'm asking is; has anyone given consideration to ways of smoothing
> over this duplication of effort? Possibly some negotiated automated
> process for institutional archives uploading to the subject archive, or
> at least assisting the author in the process.

No need! First, because the "duplication of effort" is so minimal (the
centrally self-archiving physicists being such an infinitesimal subset
of all that needs to be self-archived -- namely, 2,000,000 articles per
year, across disciplines, not just 200,000 across 10 years, in one
discipline!). And second, because the technical problem (of duplicate
self-archiving) is so soluble, in so many obvious ways!
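
One of the "obvious ways" could be sketched as follows: a service provider
that harvests both an institutional archive and a central one collapses
records that describe the same paper. This is only an illustration, not
anything an existing archive is known to do. The field names (title,
surname, year, source) and the matching rule (normalised title plus first
author's surname plus year) are assumptions made for the example.

```python
import re
import unicodedata


def record_key(title, first_author_surname, year):
    """Build a crude matching key from fields most archives expose."""
    def norm(text):
        # Strip accents, lowercase, and drop everything but letters/digits.
        text = unicodedata.normalize("NFKD", text)
        text = "".join(ch for ch in text if not unicodedata.combining(ch))
        return re.sub(r"[^a-z0-9]+", "", text.lower())
    return (norm(title), norm(first_author_surname), int(year))


def merge_duplicates(records):
    """Keep one entry per key, remembering every archive that holds a copy."""
    merged = {}
    for rec in records:
        key = record_key(rec["title"], rec["surname"], rec["year"])
        # The first record seen for a key becomes the canonical entry.
        entry = merged.setdefault(key, {**rec, "sources": []})
        if rec["source"] not in entry["sources"]:
            entry["sources"].append(rec["source"])
    return list(merged.values())
```

With a rule like this the author deposits once, in either place, and the
reader still sees a single entry that lists both locations.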

> This isn't the biggest issue, but it'd be good to address it before it
> becomes more of a problem.

It is such a small issue that it does not belong in a general discussion
of open access and self-archiving for researchers. It belongs only in a
technical discussion group for developers and implementers of the OAI
protocol. The only issue for the research community is how to get the
OAI Archives created and filled, as soon as possible; and I think it
is becoming apparent that institution-based self-archiving is the most
general and natural route to this goal, for the many reasons already
discussed in this thread.

Stevan Harnad




Re: [BOAI] Re: Cliff Lynch on Institutional Archives

From: Thomas Krichel <krichel AT openlib.org>
Date: Tue, 18 Mar 2003 19:10:36 +0200


Threading: [BOAI] Re: Cliff Lynch on Institutional Archives from harnad AT ecs.soton.ac.uk
      • This Message
             Re: [BOAI] Re: Cliff Lynch on Institutional Archives from radu AT monicsoft.net
             Re: [BOAI] Re: Cliff Lynch on Institutional Archives from cjg AT ecs.soton.ac.uk

  Sebastien Paquet writes

> telephone is useless; the usefulness of having a telephone increases with
> the overall number of telephones. A similar argument can be made for
> putting papers in open access, but the value to a particular researcher
> strongly depends on the number of people who are also doing it (or at
> least using open archives) in his discipline and only weakly on the number
> of people who are doing it in other disciplines. This is why it makes
> sense to say, as he writes:
> 
>  "As long as others in the discipline are not doing it, there
>   is little interest in the individual scholar doing it."
> 
> Since scholars are much better socially interconnected within disciplines,
> peer pressure is likely to drive authors towards open access on a
> disciplinary basis. It would be so much easier to convince a researcher 
> to jump in if he were persuaded that everyone else in his discipline is 
> also going to do it!

  Merci Sébastien, this is a good way to put it. But there is a bit
  more to it. The incentives will come from the evaluative data that
  is computed on the discipline-based aggregative dataset, plus 
  some other optional data.



  With greetings from Minsk, Belarus,


  Thomas Krichel                         http://openlib.org/home/krichel
                                     RePEc:per:1965-06-05:thomas_krichel


Re: [BOAI] Re: Cliff Lynch on Institutional Archives

From: Radu <radu AT monicsoft.net>
Date: Tue, 18 Mar 2003 15:08:22 -0500


Threading: Re: [BOAI] Re: Cliff Lynch on Institutional Archives from krichel AT openlib.org
      • This Message

Quoting Christopher Gutteridge <cjg AT ecs.soton.ac.uk>:

> What I'm asking is; has anyone given consideration to ways
> of smoothing over this duplication of effort? Possibly some
> negotiated automated process for institutional archives
> uploading to the subject archive, or at least
> assisting the author in the process.

If peer-to-peer 'open' music sharing software like Napster and the 
like managed to get set up so quickly and be so successful, I wonder 
what the problem is within the academic circles.

Is it the inertia of 'researching the best standard'? Why don't we 
simply adopt one of the successful models already at work in 
the 'fringe industry'?

Why do we have to develop yet another standard?
- Is it for the sake of credit? Think about it. Are citations a good 
measure of credit? When you cite an article that simply describes 
someone else's work, who gets the credit? How far can one follow back 
the syntopical chain of citations? Just because a paper is cited a 
lot does it mean it's influential or plain wrong and lots of people 
jumped in the water to retrieve the stick?
- Because of reliability? That would be solved by someone investing 
in some servers that will be always up and which will selectively 
duplicate the works which get good 'marks' from their users.

Make the system 'credit-based', allow the researchers to just place 
the work they want to make public on dedicated machines within their 
Universities and other research venues.

 And please:
- stop creating all-new standards. Before you start standardization, 
look around and see if the same functionality is not already 
available.
- stop fragmenting the digital world into exclusivist 'servers' 
and 'services'. Are we striving for open or closed access?
- stop looking for the 'final ontology' for classifying stuff. The 
world is not perfect. People are not perfect. And good 
indexing/search facilities are more efficient than any ontology.

I could dig up references for most of my assertions, but I bet most 
of you are already aware of them.

We just need access to each other's work, so that our ideas grow in 
the fertile land of other minds.

Cheers,
Radu
(www.monicsoft.net)





Re: [BOAI] Re: Cliff Lynch on Institutional Archives

From: Radu <radu AT monicsoft.net>
Date: Tue, 18 Mar 2003 15:38:32 -0500


Threading: [BOAI] Re: Cliff Lynch on Institutional Archives from harnad AT ecs.soton.ac.uk
      • This Message

I would like to suggest a FAQ for people new to the list like me who 
come and don't know what issues have already been discussed and in 
what forum they should be discussed.

If such a thing exists already and I didn't find it, please point me to 
it.

Cheers,
Radu

(www.monicsoft.net)



[BOAI] BOAI Forum and FAQ [was: Cliff Lynch on Institutional Archives]

From: Peter Suber <peters AT earlham.edu>
Date: Tue, 18 Mar 2003 16:26:15 -0500


Threading: [BOAI] Re: Cliff Lynch on Institutional Archives from harnad AT ecs.soton.ac.uk
      • This Message

At 03:38 PM 3/18/2003 -0500, you wrote:
>I would like to suggest a FAQ for people new to the list like me who
>come and don't know what issues have already been discussed and in
>what forum they should be discussed.
>
>If such a thing exists already and I didn't find it, please point me to
>it.
>
>Cheers,
>Radu
>
>(www.monicsoft.net)


Radu,
      The BOAI Forum is just a few days more than one month old.  The 
number of topics already discussed is very small.  See the archive of past 
postings at <http://threader.ecs.soton.ac.uk/lists/boaiforum/>.
      The BOAI has a general FAQ, 
<http://www.earlham.edu/~peters/fos/boaifaq.htm>, and a more specialized 
Self-Archiving FAQ, <http://www.eprints.org/self-faq/>.  But the BOAI 
Forum does not have its own FAQ.

      Peter Suber
      (moderator of the BOAI Forum)


----------
Peter Suber, Professor of Philosophy
Earlham College, Richmond, Indiana, 47374
Email peters AT earlham.edu
Web http://www.earlham.edu/~peters

Editor, Free Online Scholarship Newsletter
http://www.earlham.edu/~peters/fos/
Editor, FOS News blog
http://www.earlham.edu/~peters/fos/fosblog.html



[BOAI] Re: Cliff Lynch on Institutional Archives

From: Thomas Krichel <krichel AT openlib.org>
Date: Wed, 19 Mar 2003 00:42:43 +0200


Threading: [BOAI] Re: Cliff Lynch on Institutional Archives from harnad AT ecs.soton.ac.uk
      • This Message
             [BOAI] The RePEc (Economics) Model from harnad AT ecs.soton.ac.uk
             [BOAI] Re: The RePEc (Economics) Model from krichel AT openlib.org
             [BOAI] Re: Cliff Lynch on Institutional Archives from hussein AT cs.uct.ac.za

  Stevan Harnad writes

> >   Success here depends on selling the idea to academics, and that
> >   depends crucially on what business models are followed. 
> 
> I have no idea what "business models" have to do with demonstrating to
> academics that increasing research access increases research impact.
> http://www.nature.com/nature/debates/e-access/Articles/lawrence.html

  For self-archiving, abstract understanding is not sufficient.
  You need action by academics. If you want to have an intermediated
  process (by means of an archive) then it will crucially depend
  on the behaviour of the intermediary, in this case of the archive
  management. This is what I mean here by the business model
  of the archive. 

  You have changed your mind twice on what the optimal business
  model is. You will change it again... Until then, I shall
  keep a bit more quiet. When I return to NYC, I will have
  web access again, and find other things to do. 

  Just for correction

> online papers that already exist on arbitrary websites webwide. This
> is the invaluable service Thomas's RePEc (Research Papers in
> Economics) is performing for over 86,000 non-OAI papers

  RePEc does not index arbitrary websites, but archive sites.
  They have the same functionality as OAI archives; in fact
  OAI was modeled after RePEc. The whole OAI concept was 
  first implemented there. 

  With greetings from Minsk, Belarus,


  Thomas Krichel                     http://openlib.org/home/krichel
                                 RePEc:per:1965-06-05:thomas_krichel


[BOAI] The RePEc (Economics) Model

From: Stevan Harnad <harnad AT ecs.soton.ac.uk>
Date: Wed, 19 Mar 2003 13:44:28 +0000 (GMT)


Threading: [BOAI] Re: Cliff Lynch on Institutional Archives from krichel AT openlib.org
      • This Message
             Re: [BOAI] The RePEc (Economics) Model from radu AT monicsoft.net

[Subject header changed from the Cliff Lynch paper to RePEc to reflect
the change in focus.]

On Wed, 19 Mar 2003, Thomas Krichel wrote:

>   For self-archiving, abstract understanding [by academics]
>   is not sufficient.  You need action by academics. 
>   If you want to have an intermediated
>   process (by means of an archive) then it will crucially depend
>   on the behaviour of the intermediary, in this case of the archive
>   management.

The Repec model is one in which many distributed institutions,
each having archives of multiple economics papers of
their own, have their metadata gathered together and
enriched to provide OAI-like interoperability: http://repec.org/
Instead of using the OAI protocol, Repec uses the "Guildford" protocol --
ftp://netec.mcc.ac.uk/pub/NetEc/RePEc/all/root/docu/guilp.html -- but
it has been announced that Repec plans to become OAI-compliant eventually.
(Repec does *not*, as I had wrongly assumed, cover individual websites
too, as ResearchIndex/citeseer http://citeseer.nj.nec.com/cs
does, only multi-paper institutional archives.)

Repec is accordingly a form of institutional self-archiving, pre-dating
the OAI, but (1) focused on one discipline only (economics), and
(2) not requiring the individual archives to be OAI-compliant (but
Guildford-compliant). It is a very activist project, "a collaborative
effort of over 100 volunteers in 30 countries to enhance the dissemination
of research in economics."

It should be noted at once that if every discipline had its own
institutional Guildford-compliant archives and volunteers, as Economics
has, then I and many others would today be promoting Institutional
Guildford-compliant repositories rather than Institutional OAI-compliant
repositories (and the free software that Southampton designed for creating
OAI-compliant institutional repositories for self-archiving 
http://www.dlib.org/dlib/october00/10inbrief.html would have
been Guildford-compliant software).

As it happened, it is OAI that prevailed (inspired partly by Guildford
and Repec), with Thomas Krichel as one of its co-founders, and still a
member of the OAI technical committee. What distinguished Repec is hence
not its interoperability protocol (since it plans to become OAI-compliant
anyway) but (a) its activism and (b) its discipline-specificity. If
there were a way to spread Repec's activism from economics to the other
disciplines, it would certainly be very welcome, just as it would be
very welcome if there were a way to spread ArXiv's central-archiving
tendency to the other disciplines.

Unfortunately, no such generalization of either Repec or Arxiv to the
other disciplines has taken place (Repec began in 1997, Arxiv in 1991).
http://www.earlham.edu/~peters/fos/timeline.htm
It is for this reason that it is OAI-compliant institutional
self-archiving that I happen to be promoting. And this is at last
showing signs of generalizing
http://www.ecs.soton.ac.uk/~harnad/Temp/tim-arch.htm
though still not fast enough. It is for that reason that various forms
of activism need to be promoted too, especially institutional activism:
http://www.eprints.org/self-faq/#institution-facilitate-filling
http://www.eprints.org/self-faq/#libraries-do
http://www.eprints.org/self-faq/#research-funders-do
http://www.ecs.soton.ac.uk/~harnad/Temp/Ariadne-RAE.htm
http://www.ecs.soton.ac.uk/~lac/archpol.html
http://paracite.eprints.org/cgi-bin/rae_front.cgi

>   You have changed your mind twice on what the optimal business
>   model is. You will change it again... 

I have changed my mind in response to specific empirical changes that
have taken place across the years. (I would hope everyone else has
done so too.) For me, the first major change was the Internet itself,
converting me from conducting most activities on-paper to on-line:
http://cogprints.soton.ac.uk/documents/disk0/00/00/15/81/ I even founded
an online-only journal (1989): http://psycprints.ecs.soton.ac.uk

Then came Ann Okerson's suggestion that information should be free,
which I initially dismissed as unrealistic, but then realized could be
turned into something that made excellent sense, on condition that it
was applied very specifically only to *author give-away* information
(of which the refereed research literature is the main representative),
rather than all information (or even all scholarly information):
http://www.ecs.soton.ac.uk/~harnad/Papers/Harnad/harnad95.quo.vadis.html

That was what then prompted the "subversive proposal" that researchers
should self-archive their give-away research (1994):
http://www.arl.org/scomm/subversive/toc.html  

At first, FTP sites and Web sites seemed the simplest, fastest and
most direct way for researchers to self-archive, on a distributed,
institutional basis; but then the slow progress in this, and the
success of the physicists' centralized disciplinary model suggested
that centralized, discipline-based self-archiving might be
faster, with the Physics Arxiv itself perhaps subsuming it all
http://cogprints.soton.ac.uk/documents/disk0/00/00/16/99/
(Thomas Krichel argued against central archiving, and in favor of
distributed archiving at the time, but at that time, pre-OAI, and with
Arxiv looking as if it would scale up, it was not at all clear why
distributed archiving was preferable.)

I even founded a central disciplinary archive modeled on the
Physics Arxiv (Cogprints, designed by Matt Hemus, 1997, and later
Rob Tansley) with a view to Arxiv's eventually subsuming it:
http://cogprints.ecs.soton.ac.uk/

But central archiving did not catch on (Cogprints has only reached
1500 papers in 2003) or generalize to other disciplines, and Arxiv
itself kept growing at only an unchanged linear rate from year to year:
http://arxiv.org/show_monthly_submissions

And then came the OAI protocol in 1999, making distributed self-archiving
equivalent to central (because of interoperability)
http://www.openarchives.org/documents/index.html
which immediately prompted me to ask Rob Tansley to redesign
the Cogprints software to make it OAI-compliant and then turn it
into free generic OAI archive-creating software for institutions
http://www.dlib.org/dlib/october00/10inbrief.html

Next came the Budapest Open Access Initiative, uniting the two roads
to open access (BOAI-1: self-archiving; BOAI-2: open-access journals)
http://www.soros.org/openaccess/

And the self-archiving momentum has been growing ever since:
http://www.ecs.soton.ac.uk/~harnad/Temp/tim-arch.htm

But I am still ready to change my mind if any new developments call
for it. (I hope you are too!) The momentum is still not nearly as great
as it could and should be.

>   RePEc does not index arbitrary website, but archive sites.
>   They have the same functionality as OAI archives, in fact
>   OAI was modeled after RePEc. The whole OAI concept was 
>   first implemented there. 

I think I now understand this. See above. Both Repec's
aggregation of institutional multi-paper archives in economics and
Citeseer/ResearchIndex's harvesting of arbitrary individual websites
in computer science are welcome interim measures for increasing the
visibility and usability of what open-access content already exists
online -- while the institutional OAI-compliant self-archiving momentum
grows. Anything that helps fast-forward us toward universal open-access
to the entire refereed research literature (2,000,000 papers per year,
across all disciplines) is welcome and should be embraced by all who are
open-minded among us, regardless of which open-access route they happen
to favor.

Stevan Harnad



Re: [BOAI] The RePEc (Economics) Model

From: Radu <radu AT monicsoft.net>
Date: Wed, 19 Mar 2003 11:34:10 -0500


Threading: [BOAI] The RePEc (Economics) Model from harnad AT ecs.soton.ac.uk
      • This Message

Excellent historical FAQ/timeline. Could you please put it up on the 
BOAI site and update it given other individual points of view of the 
people involved?

Thanks,
Radu
(www.monicsoft.net)



[BOAI] Re: The RePEc (Economics) Model

From: Thomas Krichel <krichel AT openlib.org>
Date: Wed, 19 Mar 2003 22:16:33 +0200


Threading: [BOAI] Re: Cliff Lynch on Institutional Archives from krichel AT openlib.org
      • This Message
             [BOAI] Re: The RePEc (Economics) Model from harnad AT ecs.soton.ac.uk


  Stevan Harnad writes

> The Repec model is one in which many distributed institutions,
> each having archives of multiple economics papers of
> their own, have their metadata gathered together and
> enriched to provide OAI-like interoperability: http://repec.org/

  The interoperability is more complicated than in a conventional
  OAI setting, because the structure of the data exchanged goes well
  beyond what can be done with oai_dc.

> Instead of using the OAI protocol, Repec uses the "Guildford"
> protocol -- ftp://netec.mcc.ac.uk/pub/NetEc/RePEc/all/root/docu/guilp.html --
> but it has been announced that Repec plans to become OAI-compliant
> eventually.

  I already operate a gateway at http://oai.repec.openlib.org. Its
  oai_dc data may be a bit thin, but there is plenty of AMF metadata.

> (Repec does *not*, as I had wrongly assumed, cover individual
> websites too, as ResearchIndex/citeseer
> http://citeseer.nj.nec.com/cs does, only multi-paper institutional
> archives.)

  Departmental archives, as distinguished from institutional archives.
  Some archives serve special purposes; they hold no document
  data at all. 

> Repec is accordingly a form of institutional self-archiving,
> pre-dating the OAI, but (1) focused on one discipline only
> (economics), and (2) not requiring the individual archives to be
> OAI-compliant (but Guildford-compliant).

  Correct, which is basically just a way to dump files on a disk,
  nothing more. 

> It is a very activist project, "a collaborative
> effort of over 100 volunteers in 30 countries to enhance the dissemination
> of research in economics."

  Correct, and almost all are economics faculty. Some folks do 
  little, but the construction of the whole enterprise means that
  even if they do little, since there are many 

> It should be noted at once that if every discipline had its own
> institutional Guildford-compliant archives and volunteers, as Economics
> has, then I and many others would today be promoting Institutional
> Guildford-compliant repositories rather than Institutional OAI-compliant
> repositories (and the free software that Southampton designed for creating
> OAI-compliant institutional repositories for self-archiving
> http://www.dlib.org/dlib/october00/10inbrief.html would have
> been Guildford-compliant software).

  The technical protocol for the transport matters little. This
  really (!) is a technical matter. We continue with what we
  got because we can not rearrange 250+ archives that otherwise
  do just fine. 

> What distinguished Repec is hence not its interoperability protocol
> (since it plans to become OAI-compliant anyway) but (a) its activism
> and (b) its discipline-specificity.

  and (c) its metadata model. This is by far the most important, but least
  well understood distinction. 

> If there were a way to spread Repec's activism from economics to the
> other disciplines, it would certainly be very welcome, just as it
> would be very welcome if there were a way to spread ArXiv's
> central-archiving tendency to the other disciplines.

  Could not agree more.

> Unfortunately, no such generalization of either Repec or Arxiv to the
> other disciplines has taken place (Repec began in 1997, Arxiv in 1991).

  RePEc has its origin in a project called WoPEc that I started on
  February 1, 1993. In 1997, RePEc was born essentially out of WoPEc
  and some other partners, but WoPEc had the lion's share (I am
  simplifying here a bit.)

> http://www.earlham.edu/~peters/fos/timeline.htm It is for this
> reason that it is OAI-compliant institutional self-archiving that I
> happen to be promoting. And this is at last showing signs of
> generalizing http://www.ecs.soton.ac.uk/~harnad/Temp/tim-arch.htm
> though still not fast enough. It is for that reason that various
> forms of activism need to be promoted too, especially institutional
> activism:

  There is no contradiction between institutional and departmental 
  archives, and aggregator structures. It is by no means an either/or
  choice. And let me emphasise again: having discipline-based
  aggregators will be the best way to stimulate institutional 
  and departmental archiving. The problem is, of course, that
  there are not many aggregators around. Therefore I have been
  arguing for a while that the institutional self-archiving
  community should stick together to elect one area of disciplinary
  priority. That is, rather than fight a war on all fronts,
  concentrate the effort and build systems that are interoperable
  beyond the unqualified DC data model. The DC data model is too simple
  for academic self-documentation.

> At first, FTP sites and Web sites seemed the simplest, fastest and
> most direct way for researchers to self-archive, on a distributed,
> institutional basis;

  They still are, just look at the amount of stuff that is on the
  web. There are so many grass-roots initiatives. The larger
  public is not aware of them because they serve specific communities. 
  This is where I get so angry with Clifford and his---implicit---call
  to shut them down, to fit all publishing activities into a central
  straightjacket. 

> but then the slow progress in this, and the success of the
> physicists' centralized disciplinary model suggested that
> centralized, discipline-based self-archiving might be faster, with
> the Physics Arxiv itself perhaps subsuming it all
> http://cogprints.soton.ac.uk/documents/disk0/00/00/16/99/ (Thomas
> Krichel argued against central archiving,

  Nope. I simply argued that the centralized model would not
  carry through to many disciplines. Where it worked it 
  was certainly an extremely good model. But you insisted
  that because the Physicists had done it everyone could
  and would, it was the optimal way (your flavour of the day).
  But I am still right. arXiv has a very unequal distribution
  of papers even in sub-areas of Physics, I am told. Ebs will
  know better. arXiv is still growing and that is a good thing.

> But central archiving did not catch on (Cogprints has only reached
> 1500 papers in 2003) or generalize to other disciplines,

  Exactly as I had forecast! And that, despite the fact that
  it was a project subsidized by public funds. When WoPEc became
  a funded project, by the same funders, it had around 5,000
  papers accumulated as a labor of love, only. Much of that
  work was done by José Manuel Barrueco Cruz. 

> and Arxiv itself kept growing at only an unchanged linear rate from
> year to year: http://arxiv.org/show_monthly_submissions
  
  Sure, but it still is the finest self-archiving project on the planet.
  But it really is self-archiving. Self-archiving is only a part 
  of what I call self-documentation. 
  
> And then came the OAI protocol in 1999, making distributed
> self-archiving equivalent to central (because of interoperability)
> http://www.openarchives.org/documents/index.html
 
  They are not quite, but that is a matter for another email...

> which immediately prompted me to ask Rob Tansley to redesign the
> Cogprints software to make it OAI-compliant and then turn it into
> free generic OAI archive-creating software for institutions
> http://www.dlib.org/dlib/october00/10inbrief.html

  And I think your team are doing a very good job with this.

> I think I now understand this. See above. Both Repec's
> aggregation of institutional multi-paper archives in economics and
> Citeseer/ResearchIndex's harvesting of arbitrary individual websites
> in computer science

  Citeseer is a truly fab project. The material that is there
  should become part of a new, RePEc-like data structure called
  rclis and pronounced "reckless". Watch out for it over the
  next few years. 

  With greetings from Minsk, Belarus,


  Thomas Krichel                     http://openlib.org/home/krichel
                                 RePEc:per:1965-06-05:thomas_krichel


[BOAI] Re: The RePEc (Economics) Model

From: Stevan Harnad <harnad AT ecs.soton.ac.uk>
Date: Sat, 22 Mar 2003 02:10:48 +0000 (GMT)


Threading: [BOAI] Re: The RePEc (Economics) Model from krichel AT openlib.org
      • This Message

On Wed, 19 Mar 2003, Thomas Krichel wrote:

> There is no contradiction between institutional and departmental 
> archives

I agree completely. In fact, departmental archives *are* institutional
archives (as opposed to centralised, disciplinary ones, like the Physics
ArXiv or Cogprints).


> having discipline-based
> aggregators will be the best way to stimulate institutional 
> and departmental archiving. The problem is, of course, that
> there are not many aggregators around. Therefore I have been
> arguing for a while that the institutional self-archiving
> community should stick together to elect one area of disciplinary
> priority... [R]ather than to fight a war on all fronts,
> concentrate the effort and build systems that are interoperable
> beyond the unqualified DC data model. The DC data model is too simple
> for academic self-documentation.

I have no problem with elaborating the OAI protocol if it is necessary
and useful (I am not technically qualified to judge one way or the
other). But I *definitely* disagree that the institutional self-archiving
community should "elect one area of disciplinary priority"! 

Repec's aggregating and enriching efforts with what Economics web content
exists already, and Citeseer's harvesting and enriching efforts with what
Computer-Science web content exists already are both invaluable interim
contributions to making existing web content more interoperable and
usable, but what is urgently needed is (much, much) more content, in all
disciplines! That is what the (OAI) self-archiving movement is about.
And this can and will be done in parallel, for all disciplines. There is
no sense in waiting to do it one-by-one serially, whether discipline by
discipline or journal by journal!

> just look at the amount of stuff that is on the web. There are so many
> grass-roots initiatives. The larger public is not aware of them because
> they serve specific communities. This is where I get so angry with
> Clifford and his---implicit---call to shut them down, to fit all
> publishing activities into a central straightjacket.

Cliff Lynch is not calling -- explicitly or implicitly -- for fitting
"all publishing activities into a central straitjacket"! He is simply
supporting self-archiving by institutions (which includes self-archiving
by their departments!)

And when I look at the web I am of course struck by how much is on it,
but far more struck by how much could so easily be on it, but is *not* --
across all disciplines. The target is the 2,000,000 papers published
annually in the planet's 20,000 peer-reviewed journals.

> you insisted that because the Physicists had done [centralised
> self-archiving], everyone could and would, it was the optimal way

No Thomas, what I said and wrote (many times) is "optimal and inevitable"
is *open access* (i.e., free online full-text access to all refereed
research). That is the *end.* Centralized Arxiv-style self-archiving is
merely one of the candidate *means,* and it did look like it was headed
toward prevailing for a while; but then it became clear that faster means
were needed. And with OAI-compliant institutional (including departmental)
self-archiving I think those faster means are at hand, the ones that
can and will at last scale up to the whole corpus, across disciplines.
http://www.ecs.soton.ac.uk/~harnad/Temp/tim-arch.htm

>sh> central archiving did not catch on
>
> Exactly as I had forecasted! 

Dear Thomas: *Nothing* has so far caught on, in over 10 years of having
open-access within reach! So it was always the safer bet that any new
candidate means would fail too! Don't be too proud of having predicted
that central archiving would not catch on. The challenge is still to
find a means that *will* catch on, and to *make* it catch on. (And the
Big Koan is still: Why is it taking so long, given that the outcome is
optimal and inevitable and reachable?)

>sh> The Big Koan is: "Why aren't all researchers self-archiving yet,
>sh> given its benefits and feasibility?"
>sh> http://www.dlib.org/dlib/december99/12harnad.html
>
> One answer that I have is that the benefits of doing
> self archiving have to be demonstrated to the individual
> level of each researcher.

Agreed. And we, and you, and others are working on doing exactly that.
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving.htm

Stevan Harnad


[BOAI] Re: Cliff Lynch on Institutional Archives

From: Hussein Suleman <hussein AT cs.uct.ac.za>
Date: Thu, 27 Mar 2003 09:17:52 +0200


Threading: [BOAI] Re: Cliff Lynch on Institutional Archives from krichel AT openlib.org
      • This Message
             [BOAI] Re: Interoperability - subject classification/terminology from harnad AT ecs.soton.ac.uk
             [BOAI] Re: Cliff Lynch on Institutional Archives from comyn AT utk.edu

hi

this may be stating the obvious, but why not use sets for the separate 
disciplines, aimed at particular service providers? i say it that way 
because some disciplines are not well-defined (namely, computer science) 
so such archives may want to play ball with multiple service providers 
and hence may need different sets.

in any event, for something like physics, a simple set might do the 
trick at the source. then, somewhat in keeping with the Kepler model (as 
published in DLib a while back), the service provider can provide an 
interface for potential data providers to self-register. i know this 
sounds dodgy, but think of it as an alternative mechanism for 
contribution. either individual users submit individual papers or groups 
submit baseURLS - both go through some kind of review and while one 
leads to once-off storage, the other leads to periodic harvesting.
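
to make that a bit more concrete, here is a minimal python sketch of what a service provider might do with self-registered baseURLs. it assumes the OAI-PMH ListRecords verb with its metadataPrefix and optional set arguments; the base URL, the set name "physics" and the registry structure are made up for illustration only.

```python
from urllib.parse import urlencode


def list_records_url(base_url, set_spec=None, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request, optionally limited to one set."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        # Selective harvesting: only records the archive placed in this set.
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Hypothetical registry kept by the service provider: baseURLs that passed
# review, each with the set the data provider exposes for this discipline.
registry = [
    {"base_url": "http://eprints.example.ac.uk/perl/oai2", "set": "physics"},
]


def harvest_urls(entries):
    """Return one harvesting request URL per registered data provider."""
    return [list_records_url(e["base_url"], e["set"]) for e in entries]
```

the periodic harvesting would then just re-issue these requests on a schedule; the review step and the scheduling are left out here.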

what remains a difficult problem, however, is how to recreate the 
metadata used by the service provider as its native format. so, for a 
typical example, if arXiv classifies items using a specific set 
structure, this is certainly not going to be the default for an 
institutional archive. does the service provider automatically or 
manually reclassify? or does it not allow browsing by categories? in 
either event, the quality of the metadata from the perspective of the 
service provider may be an impetus for potential users to want to 
replicate their effort rather than rely on the automated submission from 
their own institutions ... this needs more thought ...
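
one possible shape for the automatic option, as a rough python sketch: the service provider keeps a lookup table from each archive's own sets to its native categories and falls back to "unclassified" when nothing matches. the archive identifier, the set names and the category names below are all hypothetical; nothing here is taken from arXiv or any real archive.

```python
# Hypothetical mapping maintained by the service provider:
# (archive identifier, source set) -> the provider's own category.
CATEGORY_MAP = {
    ("example-univ", "dept:physics"): "physics-general",
    ("example-univ", "dept:cs"): "computer-science",
}


def reclassify(archive_id, source_sets, category_map=CATEGORY_MAP):
    """Map a record's source sets onto provider categories.

    Records whose sets are not in the table are kept, but marked
    "unclassified" so they can still be found by search or reviewed later.
    """
    found = [
        category_map[(archive_id, s)]
        for s in source_sets
        if (archive_id, s) in category_map
    ]
    return found or ["unclassified"]
```

the table itself still has to be built by someone, which is really the manual effort in disguise.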

ttfn,
----hussein


Christopher Gutteridge wrote:
> Disciplinary/subject archives vs. Institutional/Organisation/Region based
> archives. This is going to be a key challenge now open archives begin
> to gain momentum. 
> 
> For example; we are planning a University-wide eprints archive. I am 
> concerned that some physicists will want to place their items in both
> the university eprints service AND the arXiv physics archive. They may 
> be required to use the university service, but want to use arXiv as it
> is the primary source for their discipline. This is a duplication of 
> effort and a potential irritation.
> 
> Ultimately, of course, I'd hope that disciplinary archives will be replaced
> with subject-specific OAI service providers harvesting from the
> institutional archives. But there is going to be a very long transition
> period in which the solution evolves from our experience.
> 
> What I'm asking is; has anyone given consideration to ways of smoothing
> over this duplication of effort? Possibly some negotiated automated
> process for institutional archives uploading to the subject archive, or
> at least assisting the author in the process.
> 
> This isn't the biggest issue, but it'd be good to address it before it
> becomes more of a problem.
> 
>   Christopher Gutteridge
>   GNU EPrints Head Developer
>   http://software.eprints.org/
> 
> On Sun, Mar 16, 2003 at 02:15:56 +0000, Stevan Harnad wrote:
> 
>>On Sat, 15 Mar 2003, Thomas Krichel wrote:
>>
>>
>>>  Stevan Harnad writes:
>>>
>>>sh> There is no need -- in the age of OAI-interoperability -- for
>>>sh> institutional archives to "feed" central disciplinary archives:
>>>
>>>  I do not share what I see as a  blind faith in interoperability
>>>  through a technical protocol. 
>>
>>I am quite happy to defer to the technical OAI experts on this one, but
>>let us put the question precisely: 
>>
>>Thomas Krichel suggests that institutional (OAI) data-archives
>>(full-texts) should "feed" disciplinary (OAI) data-archives,
>>because OAI-interoperability is somehow not enough. I suggest that
>>OAI-interoperability (if I understand it correctly) should be enough.
>>No harm in redundant archiving, of course, for backup and security, but
>>not necessary for the usage and functionality itself. In fact, if I
>>understand correctly the intent of the OAI distinction between OAI
>>data-providers -- 
>>http://www.openarchives.org/Register/BrowseSites.pl 
>>-- and OAI service-providers --
>>http://www.openarchives.org/service/listproviders.html 
>>-- it is not the full-texts of data-archives that need to be "fed" to
>>(i.e., harvested by) the OAI service providers, but only their
>>metadata.
>>
>>Hence my conclusion that distributed, interoperable OAI institutional
>>archives are enough (and the fastest route to open-access). No need
>>to harvest their contents into central OAI discipline-based archives
>>(except perhaps for redundancy, as backup). Their OAI interoperability
>>should be enough so that the OAI service-providers can (among other
>>things) do the "virtual aggregation" by discipline (or any other
>>computable criterion) by harvesting the metadata alone, without the
>>need to harvest full-text data-contents too.
>>
>>It should be noted, though, that Thomas Krichel's excellent RePEc
>>archive and service in Economics -- http://repec.org/ -- goes
>>well beyond the confines of OAI-harvesting! RePEc harvests non-OAI
>>content too, along lines similar to the way ResearchIndex/citeseer --
>>http://citeseer.nj.nec.com/cs -- harvests non-OAI content in computer
>>science. What I said about there being no need to "feed" institutional
>>OAI archive content into disciplinary OAI archives certainly does not
>>apply to *non-OAI* content, which would otherwise be scattered
>>willy-nilly all over the net and not integrated in any way. Here RePEc's
>>and ResearchIndex's harvesting is invaluable, especially as RePEc already
>>does (and ResearchIndex has announced that it plans to) make all its
>>harvested content OAI-compliant!
>>
>>To summarize: The goal is to get all research papers, pre- and
>>post-peer-review, openly accessible (and OAI-interoperable) as soon as
>>possible. (These are BOAI Strategies 1 [self-archiving] and 2
>>[open-access journals]: http://www.soros.org/openaccess/read.shtml
>>). In principle this can be done by (1) self-archiving them in central
>>OAI disciplinary archives like the Physics arXiv (the biggest and
>>first of its kind) -- http://arxiv.org/show_monthly_submissions
>>-- by (2) self-archiving them in distributed institutional OAI
>>Archives -- http://www.ecs.soton.ac.uk/~harnad/Temp/tim.ppt -- by (3)
>>self-archiving them on arbitrary Web and FTP sites (and hoping they
>>will be found or harvested by services like RePEc or ResearchIndex)
>>or by (4) publishing them in open-access journals (BOAI Strategy 2:
>>http://www.soros.org/openaccess/journals.shtml ).
>>
>>My point was only that because researchers and their institutions
>>(*not* their disciplines) have shared interests vested in maximizing
>>their joint research impact and its rewards, institution-based
>>self-archiving (2) is a more promising way to go -- in the age of
>>OAI-interoperability -- than discipline-based self-archiving (1), even
>>though the latter began earlier. It is also obvious that both (1) and
>>(2) are preferable to arbitrary Web and FTP self-archiving (3), which
>>began even earlier (although harvesting arbitrary Website and FTP
>>contents into OAI-compliant Archives is still a welcome makeshift
>>strategy until the practise of OAI self-archiving is up to speed).
>>Creating new open-access journals and converting the established
>>(20,000) toll-access journals to open-access is desirable too, but it is
>>obviously a much slower and more complicated path to open access than
>>self-archiving, so should be pursued in parallel.
>>
>>My conclusion in favor of institutional self-archiving is based on the
>>evidence and on logic, and it represents a change of thinking,
>>for I had originally advocated (3) Web/FTP self-archiving --
>>http://www.arl.org/scomm/subversive/toc.html -- then switched
>>allegiance to central self-archiving (1), even creating a
>>discipline-based archive: http://cogprints.ecs.soton.ac.uk/
>>But with the advent of OAI in 1999, plus a little reflection, it
>>became apparent that institutional self-archiving (2) was the fastest,
>>most direct, and most natural road to open access:
>>http://www.eprints.org/ 
>>And since then its accumulating momentum seems to be confirming that
>>this is indeed so:
>>http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2212.html
>>http://www.ecs.soton.ac.uk/~harnad/Temp/tim.ppt
>>
>>
>>>  The primary sense of belonging
>>>  of a scholar in her research activities is with the disciplinary
>>>  community of which she thinks herself a part... It certainly
>>>  is not with the institution. 
>>
>>That may or may not be the case, but in any case it is irrelevant to
>>the question of which is the more promising route to open-access. Our
>>primary sense of belonging may be with our family, our community,
>>our creed, our tribe, or even our species. But our rewards (research
>>grant funding and overheads, salaries, postdocs and students attracted
>>to our research, prizes and honors) are intertwined and shared with our
>>institutions (our employers) and not our disciplines (which are often
>>in fact the locus of competition for those same rewards!)
>>
>>
>>>  Therefore, if you want to fill
>>>  institutional archives---which I agree is the best long-run way
>>>  to enhance access and preservation to scholarly research--- [the]
>>>  institutional archive has to be accompanied by a discipline-based
>>>  aggregation process. 
>>
>>But the question is whether this "aggregation" needs to be the
>>"feeding" of institutional OAI archive contents into disciplinary OAI
>>archives, or merely the "feeding" of OAI metadata into OAI services.
>>
>>
>>>   The RePEc project has produced such an aggregator
>>>  for economics for a while now. I am sure that other, similar
>>>  projects will follow the same aims, but, with the benefit of
>>>  hindsight, offer superior service. The lack of such services
>>>  in many disciplines, or the lack of interoperability between
>>>  disciplinary and institutional archives, are major obstacles to
>>>  filling the institutional archives. There are no
>>>  inherent contradictions between institution-based archives
>>>  and disciplinary aggregators,
>>
>>There is no contradiction. In fact, I suspect this will prove to be a
>>non-issue, once we confirm that (a) we agree on the need for
>>OAI-compliance and (b) "aggregation" amounts to metadata-harvesting and
>>OAI service-provision when the full-texts in the institutional
>>archives are OAI-compliant (and calls for full-text harvesting only
>>if/when they are not). Content "aggregation," in other words, is a
>>paper-based notion. In the online era, it merely means digital sorting
>>of the pointers to the content.
>>
>>
>>>  In the paper that Stevan refers to, Cliff Lynch writes,
>>>  at http://www.arl.org/newsltr/226/ir.html
>>>
>>>cl> But consider the plight of a faculty member seeking only broader
>>>cl> dissemination and availability of his or her traditional journal
>>>cl> articles, book chapters, or perhaps even monographs through use of
>>>cl> the network, working in parallel with the traditional scholarly
>>>cl> publishing system.
>>>
>>>  I am afraid there are more and more such faculty members. Many
>>>  of the research papers found over the Internet are deposited
>>>  in this way. This trend is growing, not declining.
>>
>>You mean self-archiving in arbitrary non-OAI author websites? There is
>>another reason why institutional OAI archives and official
>>institutional self-archiving policies (and assistance) are so
>>important. In reality, it is far easier to deposit and maintain one's
>>papers in institutional OAI archives like Eprints than to set up and
>>maintain one's own website. All that is needed is a clear official
>>institutional policy, plus some startup help in launching it. (No such
>>thing is possible at a "discipline" level.)
>>http://www.ecs.soton.ac.uk/~lac/archpol.html 
>>http://www.eprints.org/self-faq/#institution-facilitate-filling 
>>http://www.ecs.soton.ac.uk/~harnad/Temp/Ariadne-RAE.htm
>>http://paracite.eprints.org/cgi-bin/rae_front.cgi
>>
>>
>>>cl> Such a faculty member faces several time-consuming problems. He or
>>>cl> she must exercise stewardship over the actual content and its
>>>cl> metadata: migrating the content to new formats as they evolve over
>>>cl> time, creating metadata describing the content, and ensuring the
>>>cl> metadata is available in the appropriate schemas and formats and
>>>cl> through appropriate protocol interfaces such as open archives
>>>cl> metadata harvesting.
>>>
>>>  Sure, but academics do not like their work-, and certainly
>>>  not their publishing-habits, [to] be interfered with by external
>>>  forces. Organizing academics is like herding cats!
>>
>>I am sure academics didn't like to be herded into publishing with the
>>threat of perishing either. Nor did they like switching from paper to
>>word-processors. Their early counterparts probably clung to the oral
>>tradition, resisting writing too; and monks did not like to be herded
>>from their peaceful manuscript-illumination chambers to the clamour of
>>printing presses. But where there is a causal contingency -- as there
>>is between (a) research impact and its rewards, which academics like
>>as much as anyone else, and (b) the accessibility of their research --
>>academics are surely no less responsive than Prof. Skinner's pigeons
>>and rats to those causal contingencies, and to which buttons they will
>>have to press in order to maximize their rewards!
>>http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving.htm
>>
>>Besides, it is not *publishing* habits that need to be changed, but
>>*archiving* habits, which are an online supplement, not a substitute,
>>for existing (and unchanged) publishing habits.
>>
>>
>>>cl> Faculty are typically best at creating new
>>>cl> knowledge, not maintaining the record of this process of
>>>cl> creation. Worse still, this faculty member must not only manage
>>>cl> content but must manage a dissemination system such as a personal
>>>cl> Web site, playing the role of system administrator (or the manager
>>>cl> of someone serving as a system administrator).
>>>
>>>  There are a lot of ways in which to maintain a web site or to get
>>>  access to a maintained one. It is a customary activity these days
>>>  and no longer requires much technical expertise. A primitive
>>>  integration of the contents can be done by Google; it requires no
>>>  metadata. Academics don't care about long-run preservation, so that
>>>  problem remains unsolved. In the meantime, the academic who uploads
>>>  papers to a web site takes steps to resolve the most pressing
>>>  problem, access.
>>
>>Agreed. And uploading it into a departmental OAI Eprints Archive is 
>>by far the simplest and most effective way to do all of that. All it
>>needs is a policy to mandate it:
>>http://www.ecs.soton.ac.uk/~lac/archpol.html
>>
>>
>>>cl> Over the past few years, this has ceased to be a reasonable activity
>>>cl> for most amateurs; software complexity, security risks, backup
>>>cl> requirements, and other problems have generally relegated effective
>>>cl> operation of Web sites to professionals who can exploit economies of
>>>cl> scale, and who can begin each day with a review of recently issued
>>>cl> security patches.
>>>
>>>  These are technical concerns. When you operate a linux box
>>>  on the web you simply fire up a script that will download
>>>  the latest version. That is easy enough. Most departments
>>>  have separate web operations. Arguing for one institutional
>>>  archive for digital contents is akin to calling for a single web
>>>  site for an institution. The diseconomies of scale of central
>>>  administration impose other types of costs than the ones that it
>>>  was meant to reduce. The secret is to find a middle way.
>>
>>I couldn't quite follow all of this. The bottom line is this: The free
>>Eprints.org software (for example) can be installed within a few days.
>>It can then be replicated to handle all the departmental or research
>>group archives a university wants, with minimal maintenance time or
>>costs. The rest is just down to self-archiving, which takes a few
>>minutes for the first paper, and even less time for subsequent papers
>>(as the repeating metadata -- author, institution, etc. -- can be
>>"cloned" into each new deposit template). An institution may wish to
>>impose an institutional "look" on all of its separate eprints archives;
>>but apart from that, they can be as autonomous and as distributed and
>>as many as desired: OAI-interoperability works locally just as well as
>>it does globally.
>>
>>
>>>cl> Today, our faculty time is being wasted, and expended ineffectively,
>>>cl> on system administration activities and content curation. And,
>>>cl> because system administration is ineffective, it places our
>>>cl> institutions at risk: because faculty are generally not capable of
>>>cl> responding to the endless series of security exposures and patches,
>>>cl> our university networks are riddled with vulnerable faculty machines
>>>cl> intended to serve as points of distribution for scholarly works.
>>>
>>>  This is the fight many faculty face every day, where they
>>>  want to innovate scholarly communication, but someone
>>>  in the IT department does not give the necessary permission
>>>  for network access...
>>
>>I don't think I need to get into this. It's not specific to
>>self-archiving, and a tempest in a teapot as far as that is concerned.
>>An efficient system can and will be worked out once there is an
>>effective institutional self-archiving policy. There are already plenty
>>of excellent examples, such as CalTech: 
>>http://library.caltech.edu/digital/ 
>>See also:
>>http://software.eprints.org/#ep2
>>
>>Stevan Harnad
> 
> 


-- 
=====================================================================
hussein suleman ~ hussein AT cs.uct.ac.za ~ http://www.husseinsspace.com
=====================================================================


[BOAI] Re: Interoperability - subject classification/terminology

From: Stevan Harnad <harnad AT ecs.soton.ac.uk>
Date: Thu, 27 Mar 2003 09:36:42 +0000 (GMT)


Threading: [BOAI] Re: Cliff Lynch on Institutional Archives from hussein AT cs.uct.ac.za
      • This Message

On Thu, 27 Mar 2003, Hussein Suleman wrote:

> ...why not use sets for the separate 
> disciplines, aimed at particular service providers?...
> some disciplines are not well-defined (namely, computer science) 
> so such archives may want to play ball with multiple service providers 
> and hence may need different sets.

The question of taxonomic classification sets and version-control for
Open Archives is a technical one, so I will not presume to comment on it
except from the point of view of the potential *users* of one particular
kind of Archive Content, namely, unrefereed preprints and refereed
postprints of research papers from one or many or all disciplines: This
-- in the google-age of boolean inverted full-text searchability --
does not require a detailed a-priori taxonomy, as book metadata or the
metadata for other kinds of material might. A fairly general sorting by
discipline should suffice.
http://www.eprints.org/self-faq/#26.Classification
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2385.html

> ...the service provider can provide an 
> interface for potential data providers to self-register.

I hope that once the number and contents of Open-Access Eprint Archives
for research preprints and postprints have scaled up toward something
closer to universality, the simple metadata descriptors "pre-refereeing
preprint" and "refereed journal article" plus perhaps "discipline name"
will be enough to guide relevant service-providers in automatically
harvesting their relevant metadata. Multiple self-registration seems a
tedious and unnecessary constraint. (Possibly a master-registry of valid
institutions and disciplinary archives will also help, but may not be
necessary unless commercial spamming invades this sector too.)
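
As a purely illustrative sketch (not a description of any existing service), a harvester that relied only on those few descriptors might filter records along the following lines. The record layout and the field names "status" and "discipline" are assumptions made for the example; real OAI metadata would first have to be mapped onto them.

```python
# The two document-status descriptors proposed in the paragraph above.
WANTED_STATUS = {"pre-refereeing preprint", "refereed journal article"}


def is_relevant(record, discipline):
    """True if a harvested record is a preprint/postprint in the discipline."""
    return (
        record.get("status") in WANTED_STATUS
        and record.get("discipline", "").lower() == discipline.lower()
    )


def select(records, discipline):
    """Keep only the records a discipline-based service would want."""
    return [r for r in records if is_relevant(r, discipline)]


# Hypothetical harvested metadata from two archives.
harvested = [
    {"title": "A", "status": "refereed journal article", "discipline": "Physics"},
    {"title": "B", "status": "lecture notes", "discipline": "Physics"},
    {"title": "C", "status": "pre-refereeing preprint", "discipline": "Economics"},
]
```

With this data, select(harvested, "physics") keeps only record "A"; no per-service registration by the author or institution is involved.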

> what remains a difficult problem, however, is how to recreate the 
> metadata used by the service provider as its native format. so, for a 
> typical example, if arXiv classifies items using a specific set 
> structure, this is certainly not going to be the default for an 
> institutional archive. does the service provider automatically or 
> manually reclassify? or does it not allow browsing by categories? 

Worrying about "recreating the categories" in this boolean full-text age
is, I believe, a waste of time (for research preprints/postprints). Just
harness google's harvested full-text to your engine's search capability,
if it is incapable of contending with boolean full-text search on its
own. (Manual reclassification! Heaven forfend! Don't bother classifying
this material in the first place, beyond the simplest of first-cuts,
such as discipline. Any further classification should be algorithmic and
text-data-driven, not manual.)
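
To show what "algorithmic and text-data-driven" could mean at its very crudest, here is a toy sketch that assigns a label by counting keyword hits in the full text. The keyword lists and labels are invented for the example, and a real service would presumably use something far better than keyword counting.

```python
import re
from collections import Counter

# Hypothetical keyword lists; in practice these would be derived from data.
KEYWORDS = {
    "physics": {"quantum", "boson", "lattice"},
    "economics": {"inflation", "market", "equilibrium"},
}


def classify(text, keywords=KEYWORDS):
    """Return the label whose keywords occur most often, or None if none occur."""
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    scores = {
        label: sum(words[w] for w in terms) for label, terms in keywords.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

No human ever touches a classification form here; the only manual input is the author's first-cut choice of discipline, if even that.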

> in either event, the quality of the metadata from the perspective of the 
> service provider may be an impetus for potential users to want to 
> replicate their effort rather than rely on the automated submission from 
> their own institutions ... this needs more thought ...

Again, I speak only for research preprints/postprints, but please let's
not inject any further credibility into the notion that self-archiving
author/institutions will also have to self-advertise by multiple
self-archiving of the same paper. Surely that is one headache that
OAI-interoperability should eradicate from the planet! Self-archiving
itself is self-advertising (and effort) enough. Please let us not
now -- when the momentum is still not big enough -- saddle would-be
self-archivers with needless extra worries, and tasks!
http://www.ecs.soton.ac.uk/~harnad/Temp/tim-arch.htm

Stevan Harnad


[BOAI] Re: Cliff Lynch on Institutional Archives

From: "Paul Cummins" <comyn AT utk.edu>
Date: Thu, 27 Mar 2003 10:05:54 -0500 (EST)


Threading: [BOAI] Re: Cliff Lynch on Institutional Archives from hussein AT cs.uct.ac.za
      • This Message

I have thought about trying to make sets for each subject entry, and then ran
across the idea of a "home set" identifier that would point to the original
association.  But I am just beginning to work with OAI and probably need to
read all the archives. :)
--Paul Cummins
UT Library, Systems


> hi
>
> this may be stating the obvious, but why not use sets for the separate
> disciplines, aimed at particular service providers? i say it that way
> because some disciplines are not well-defined (namely, computer science)  
so
> such archives may want to play ball with multiple service providers  and
> hence may need different sets.
>
> in any event, for something like physics, a simple set might do the  trick
> at the source. then, somewhat in keeping with the Kepler model (as
> published in DLib a while back), the service provider can provide an
> interface for potential data providers to self-register. i know this  
sounds
> dodgy, but think of it as an alternative mechanism for
> contribution. either individual users submit individual papers or groups
> submit baseURLS - both go through some kind of review and while one  leads
> to once-off storage, the other leads to periodic harvesting.
>
> what remains a difficult problem, however, is how to recreate the  
metadata
> used by the service provider as its native format. so, for a  typical
> example, if arXiv classifies items using a specific set
> structure, this is certainly not going to be the default for an
> institutional archive. does the service provider automatically or  
manually
> reclassify? or does it not allow browsing by categories? in  either event,
> the quality of the metadata from the perspective of the  service provider
> may be an impetus for potential users to want to  replicate their effort
> rather than rely on the automated submission from  their own institutions
> ... this needs more thought ...
>
> ttfn,
> ----hussein
>
>
> Christopher Gutteridge wrote:
>> Disciplinary/subject archives vs. Institutional/Organisation/Region 
based
>> archives. This is going to be a key challenge now open archives begin 
to
>> gain momentum.
>>
>> For example; we are planning a University-wide eprints archive. I am
>> concerned that some physisists will want to place their items in both 
the
>> university eprints service AND the arXiv physics archive. They may  be
>> required to use the university service, but want to use arXiv as it is 
the
>> primary source for their discipline. This is a duplication of  effort 
and
>> a potential irritation.
>>
>> Ultimately, of course, I'd hope that diciplinary archives will be 
replaced
>> with subject-specific OAI service providers harvesting from the
>> institutional archives. But there is going to be a very long 
transition
>> period in which the solution evolves from our experience.
>>
>> What I'm asking is; has anyone given consideration to ways of 
smoothing
>> over this duplication of effort? Possibly some negotiated automated
>> process for insitutional archives uploading to the subject archive, or 
at
>> least assisting the author in the process.
>>
>> This isn't the biggest issue, but it'd be good to address it before it
>> becomes more of a problem.
>>
>>   Christopher Gutteridge
>>   GNU EPrints Head Developer
>>   http://software.eprints.org/
>>
>> On Sun, Mar 16, 2003 at 02:15:56 +0000, Stevan Harnad wrote:
>>
>>>On Sat, 15 Mar 2003, Thomas Krichel wrote:
>>>
>>>
>>>>  Stevan Harnad writes:
>>>>
>>>>sh> There is no need -- in the age of OAI-interoperability 
-- for sh>
>>>> institutional archives to "feed" central 
disciplinary archives:
>>>>
>>>>  I do not share what I see as a  blind faith in 
interoperability through
>>>> a technical protocol.
>>>
>>>I am quite happy to defer to the technical OAI experts on this one, 
but
>>> let us put the question precisely:
>>>
>>>Thomas Krichel suggests that institutional (OAI) data-archives
>>>(full-texts) should "feed" disciplinary (OAI) 
data-archives,
>>>because OAI-interoperability is somehow not enough. I suggest that
>>> OAI-interoperability (if I understand it correctly) should be 
enough. No
>>> harm in redundant archiving, of course, for backup and security, 
but not
>>> necessary for the usage and functionality itself. In fact, if I 
understand
>>> correctly the intent of the OAI distinction between OAI 
data-providers --
>>> http://www.openarchives.org/Register/BrowseSites.pl
>>>-- and OAI service-providers --
>>>http://www.openarchives.org/service/listproviders.html
>>>-- it is not the full-texts of data-archives that need to be 
"fed" to
>>> (i.e., harvested by) the OAI service providers, but only their 
metadata.
>>>
>>>Hence my conclusion that distributed, interoperable OAI 
institutional
>>> archives are enough (and the fastest route to open-access). No 
need to
>>> harvest their contents into central OAI discipline-based archives 
(except
>>> perhaps for redundancy, as backup). Their OAI interoperability 
should be
>>> enough so that the OAI service-providers can (among other things) 
do the
>>> "virtual aggregation" by discipline (or any other 
computable criterion) by
>>> harvesting the metadata alone, without the need to harvest 
full-text
>>> data-contents too.
>>>
>>>It should be noted, though, that Thomas Krichel's excellent RePec 
archive
>>> and service in Economics -- http://repec.org/ -- goes
>>>well beyond the confines of OAI-harvesting! RePec harvests non-OAI 
content
>>> too, along lines similar to the way ResearchIndex/citeseer --
>>> http://citeseer.nj.nec.com/cs -- harvests non-OAI content in 
computer
>>> science. What I said about there being no need to "feed" 
institutional OAI
>>> archive content into disciplinary OAI archives certainly does not 
apply to
>>> *non-OAI* content, which would otherwise be scattered willy-nilly 
all over
>>> the net and not integrated in any way. Here RePec's and 
ResearchIndex's
>>> harvesting is invaluable, especially as RePec already does (and
>>> ResearchIndex has announced that it plans to) make all its 
harvested
>>> content OAI-compliant!
>>>
>>>To summarize: The goal is to get all research papers, pre- and
>>>post-peer-review, openly accessible (and OAI-interoperable) as soon 
as
>>> possible. (These are BOAI Strategies 1 [self-archiving] and 2
>>>[open-access journals]: http://www.soros.org/openaccess/read.shtml 
). In
>>> principle this can be done by (1) self-archiving them in central 
OAI
>>> disciplinary archives like the Physics arXiv (the biggest and 
first of its
>>> kind) -- http://arxiv.org/show_monthly_submissions
>>>-- by (2) self-archiving them in distributed institutional OAI
>>>Archives -- http://www.ecs.soton.ac.uk/~harnad/Temp/tim.ppt -- by 
(3)
>>> self-archiving them on arbitrary Web and FTP sites (and hoping 
they will
>>> be found or harvested by services like Repec or ResearchIndex) or 
by (4)
>>> publishing them in open-access journals (BOAI Strategy 2:
>>> http://www.soros.org/openaccess/journals.shtml ).
>>>
>>>My point was only that because researchers and their institutions 
(*not*
>>> their disciplines) have shared interests vested in maximizing 
their joint
>>> research impact and its rewards, institution-based
>>>self-archiving (2) is a more promising way to go -- in the age of
>>> OAI-interoperability -- than discipline-based self-archiving (1), 
even
>>> though the latter began earlier. It is also obvious that both (1) 
and (2)
>>> are preferable to arbitrary Web and FTP self-archiving (3), which 
began
>>> even earlier (although harvesting arbitrary Website and FTP 
contents into
>>> OAI-compliant Archives is still a welcome makeshift strategy until 
the
>>> practise of OAI self-archiving is up to speed). Creating new 
open-access
>>> journals and converting the established (20,000) toll-access 
journals to
>>> open-access is desirable too, but it is obviously a much slower 
and more
>>> complicated path to open access than self-archiving, so should be 
pursued
>>> in parallel.
>>>
>>>My conclusion in favor of institutional self-archiving is based on 
the
>>> evidence and on logic, and it represents a change of thinking,
>>>for I had originally advocated (3) Web/FTP self-archiving --
>>>http://www.arl.org/scomm/subversive/toc.html -- then switched 
allegiance
>>> to central self-archiving (1), even creating a discipline-based 
archive:
>>> http://cogprints.ecs.soton.ac.uk/ But with the advent of OAI in 
1999, plus
>>> a little reflection, it became apparent that
>>>institutional self-archiving (2) was the fastest, most direct, and 
most
>>> natural road to open access: http://www.eprints.org/
>>>And since then its accumulating momentum seems to be confirming 
that this
>>> is indeed so: 
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2212.html
>>> http://www.ecs.soton.ac.uk/~harnad/Temp/tim.ppt
>>>
>>>
>>>>  The primary sense of belonging
>>>>  of a scholar in her research activities is with the 
disciplinary
>>>> community of which she thinks herself a part... It certainly
>>>>  is not with the institution.
>>>
>>>That may or may not be the case, but in any case it is irrelevant 
to the
>>> question of which is the more promising route to open-access. Our 
primary
>>> sense of belonging may be with our family, our community, our 
creed, our
>>> tribe, or even our species. But our rewards (research grant 
funding and
>>> overheads, salaries, postdocs and students attracted to our 
research,
>>> prizes and honors) are intertwined and shared with our 
institutions (our
>>> employers) and not our disciplines (which are often in fact the 
locus of
>>> competition for those same rewards!)
>>>
>>>
>>>>  Therefore, if you want to fill
>>>>  institutional archives---which I agree is the best long-run 
way to
>>>> enhance access and preservation to scholarly research--- [the]
>>>> institutional archive has to be accompanied by a 
discipline-based
>>>> aggregation process.
>>>
>>>But the question is whether this "aggregation" needs to 
be the "feeding"
>>> of institutional OAI archive contents into disciplinary OAI 
archives, or
>>> merely the "feeding" of OAI metadata into OAI services.
>>>
>>>
>>>>  The RePEc project has produced such an aggregator
>>>>  for economics for a while now. I am sure that other, similar
>>>>  projects will follow the same aims, but, with the benefit of
>>>>  hindsight, offer superior service. The lack of such services
>>>>  in many disciplines, or the lack of interoperability between
>>>>  disciplinary and institutional archives, are major obstacles to
>>>>  filling the institutional archives. There are no
>>>>  inherent contradictions between institution-based archives
>>>>  and disciplinary aggregators,
>>>
>>>There is no contradiction. In fact, I suspect this will prove to be a
>>> non-issue, once we confirm that (a) we agree on the need for
>>> OAI-compliance and (b) "aggregation" amounts to metadata-harvesting and
>>> OAI service-provision when the full-texts in the institutional archive
>>> are OAI-compliant (and calls for full-text harvesting only if/when they
>>> are not). Content "aggregation," in other words, is a paper-based notion.
>>> In the online era, it merely means digital sorting of the pointers to the
>>> content.
>>>
>>>
>>>>  In the paper that Stevan refers to, Cliff Lynch writes,
>>>>  at http://www.arl.org/newsltr/226/ir.html
>>>>
>>>>cl> But consider the plight of a faculty member seeking only broader
>>>>cl> dissemination and availability of his or her traditional journal
>>>>cl> articles, book chapters, or perhaps even monographs through use of
>>>>cl> the network, working in parallel with the traditional scholarly
>>>>cl> publishing system.
>>>>
>>>>  I am afraid there are more and more such faculty members. Many
>>>>  of the research papers found over the Internet are deposited
>>>>  in this way. This trend is growing, not declining.
>>>
>>>You mean self-archiving in arbitrary non-OAI author websites? There is
>>> another reason why institutional OAI archives and official institutional
>>> self-archiving policies (and assistance) are so important. In reality, it
>>> is far easier to deposit and maintain one's papers in institutional OAI
>>> archives like Eprints than to set up and maintain one's own website. All
>>> that is needed is a clear official institutional policy, plus some startup
>>> help in launching it. (No such thing is possible at a "discipline" level.)
>>> http://www.ecs.soton.ac.uk/~lac/archpol.html
>>> http://www.eprints.org/self-faq/#institution-facilitate-filling
>>> http://www.ecs.soton.ac.uk/~harnad/Temp/Ariadne-RAE.htm
>>> http://paracite.eprints.org/cgi-bin/rae_front.cgi
>>>
>>>
>>>>cl> Such a faculty member faces several time-consuming problems. He or
>>>>cl> she must exercise stewardship over the actual content and its
>>>>cl> metadata: migrating the content to new formats as they evolve over
>>>>cl> time, creating metadata describing the content, and ensuring the
>>>>cl> metadata is available in the appropriate schemas and formats and
>>>>cl> through appropriate protocol interfaces such as open archives
>>>>cl> metadata harvesting.
>>>>
>>>>  Sure, but academics do not like their work-, and certainly
>>>>  not their publishing-habits, [to] be interfered with by external
>>>>  forces. Organizing academics is like herding cats!
>>>
>>>I am sure academics didn't like to be herded into publishing with the
>>> threat of perishing either. Nor did they like switching from paper to
>>> word-processors. Their early counterparts probably clung to the oral
>>> tradition, resisting writing too; and monks did not like to be herded from
>>> their peaceful manuscript-illumination chambers to the clamour of printing
>>> presses. But where there is a causal contingency -- as there is between
>>> (a) the research impact and its rewards, which academics like as much as
>>> anyone else, and (b) the accessibility of their research -- academics are
>>> surely no less responsive than Prof. Skinner's pigeons and rats to those
>>> causal contingencies, and which buttons they will have to press in order
>>> to maximize their rewards!
>>> http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving.htm
>>>
>>>Besides, it is not *publishing* habits that need to be changed, but
>>> *archiving* habits, which are an online supplement, not a substitute, for
>>> existing (and unchanged) publishing habits.
>>>
>>>
>>>>cl> Faculty are typically best at creating new
>>>>cl> knowledge, not maintaining the record of this process of
>>>>cl> creation. Worse still, this faculty member must not only manage
>>>>cl> content but must manage a dissemination system such as a personal Web
>>>>cl> site, playing the role of system administrator (or the manager of
>>>>cl> someone serving as a system administrator).
>>>>
>>>>  There are a lot of ways in which to maintain a web site or to get access
>>>>  to a maintained one. It is a customary activity these days and no
>>>>  longer requires much technical expertise. A primitive integration of
>>>>  the contents can be done by Google; it requires no metadata. Academics
>>>>  don't care about long-run preservation, so that problem remains
>>>>  unsolved. In the meantime, the academic who uploads papers to a web
>>>>  site takes steps to resolve the most pressing problem, access.
>>>
>>>Agreed. And uploading it into a departmental OAI Eprints Archive is by
>>> far the simplest and most effective way to do all of that. All it
>>> needs is a policy to mandate it:
>>> http://www.ecs.soton.ac.uk/~lac/archpol.html
>>>
>>>
>>>>cl> Over the past few years, this has ceased to be a reasonable activity
>>>>cl> for most amateurs; software complexity, security risks, backup
>>>>cl> requirements, and other problems have generally relegated effective
>>>>cl> operation of Web sites to professionals who can exploit economies of
>>>>cl> scale, and who can begin each day with a review of recently issued
>>>>cl> security patches.
>>>>
>>>>  These are technical concerns. When you operate a linux box
>>>>  on the web you simply fire up a script that will download
>>>>  the latest version. That is easy enough. Most departments
>>>>  have separate web operations. Arguing for one institutional
>>>>  archive for digital contents is akin to calling for a single web site
>>>>  for an institution. The diseconomies of scale of central administration
>>>>  impose other types of costs than the ones that it was to reduce. The
>>>>  secret is to find a middle way.
>>>
>>>I couldn't quite follow all of this. The bottom line is this: The free
>>> Eprints.org software (for example) can be installed within a few days. It
>>> can then be replicated to handle all the departmental or research group
>>> archives a university wants, with minimal maintenance time or costs. The
>>> rest is just down to self-archiving, which takes a few minutes for the
>>> first paper, and even less time for subsequent papers (as the repeating
>>> metadata -- author, institution, etc., can be "cloned" into each new
>>> deposit template). An institution may wish to impose an institutional
>>> "look" on all of its separate eprints archives; but apart from that, they
>>> can be as autonomous and as distributed and as many as desired:
>>> OAI-interoperability works locally just as well as it does globally.
>>>
>>>
>>>>cl> Today, our faculty time is being wasted, and expended ineffectively,
>>>>cl> on system administration activities and content curation. And,
>>>>cl> because system administration is ineffective, it places our
>>>>cl> institutions at risk: because faculty are generally not capable of
>>>>cl> responding to the endless series of security exposures and patches,
>>>>cl> our university networks are riddled with vulnerable faculty machines
>>>>cl> intended to serve as points of distribution for scholarly works.
>>>>
>>>>  This is the fight many faculty face every day, where they
>>>>  want to innovate scholarly communication, but someone
>>>>  in the IT department does not give the necessary permission
>>>>  for network access...
>>>
>>>I don't think I need to get into this. It's not specific to
>>> self-archiving, and a tempest in a teapot as far as that is concerned. An
>>> efficient system can and will be worked out once there is an effective
>>> institutional self-archiving policy. There are already plenty of excellent
>>> examples, such as CalTech:
>>> http://library.caltech.edu/digital/
>>>See also:
>>> http://software.eprints.org/#ep2
>>>
>>>Stevan Harnad
>>
>>
>
>
> --
> =====================================================================
> hussein suleman ~ hussein AT cs.uct.ac.za ~ http://www.husseinsspace.com
> =====================================================================
>




Re: [BOAI] Re: Cliff Lynch on Institutional Archives

From: Christopher Gutteridge <cjg AT ecs.soton.ac.uk>
Date: Thu, 27 Mar 2003 15:59:58 +0000


Threading: Re: [BOAI] Re: Cliff Lynch on Institutional Archives from krichel AT openlib.org
      • This Message

I agree! Most archives currently existing have, so far as I can tell,
created sets based on their own subject schemes.

Given that sets are *not* part of the metadata, but a way to harvest
a subset of the records, an archive is free to create sets which conform
to the requirements of a service provider.

For example, our archive contains 8000 records, but only 800 of those
have actual documents available online. Some OAI services only want to
deal with records which are available online, so the 800 records are
available as an OAI set.
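
As a rough sketch of what that looks like on the wire: an OAI-PMH harvester picks a set simply by adding a "set" argument to its ListRecords request. The verb and argument names below come from the OAI-PMH protocol; the base URL and the set name "fulltext" are made-up placeholders, not the actual values used by our archive.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL, optionally limited to one set."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        # Selective harvesting: only records belonging to this set are returned.
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Hypothetical base URL and set name, for illustration only.
BASE_URL = "http://eprints.example.org/perl/oai2"

print(list_records_url(BASE_URL))                       # every record
print(list_records_url(BASE_URL, set_spec="fulltext"))  # only the online subset
```

The records themselves are unchanged either way; the set only narrows which ones the harvester is handed.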

Keeping it simple would be good. It would help if harvesters could
describe their scope in terms of popular classification schemes (Dewey,
LoC, etc.).

Although there is an argument for making the service provider/
harvester do all the work, as anything which makes it harder to
set up an OAI archive is a Bad Thing.
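
To illustrate the "harvester does all the work" option, here is a minimal sketch in which the service provider harvests everything and selects records itself by looking at the Dublin Core subject field. The XML namespaces are the standard OAI-PMH and Dublin Core ones; the sample response, the identifiers and the subject strings are invented for the example, and a real harvester would need a proper mapping between subject schemes rather than a plain string match.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented, heavily trimmed ListRecords response with two records.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A paper on quantum optics</dc:title>
          <dc:subject>Physics</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header><identifier>oai:example.org:2</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A paper on labour markets</dc:title>
          <dc:subject>Economics</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def records_in_scope(xml_text, wanted_subjects):
    """Return identifiers of harvested records whose dc:subject matches the scope."""
    wanted = {s.lower() for s in wanted_subjects}
    root = ET.fromstring(xml_text)
    selected = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        subjects = {
            el.text.strip().lower()
            for el in record.iterfind(".//dc:subject", NS)
            if el.text
        }
        if subjects & wanted:
            selected.append(identifier)
    return selected


print(records_in_scope(SAMPLE, ["physics"]))
```

The archive does nothing special here, which keeps setup easy; the cost is that the harvester has to cope with whatever subject terms each archive happens to use.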

On Thu, Mar 27, 2003 at 09:17:52 +0200, Hussein Suleman wrote:
> hi
> 
> this may be stating the obvious, but why not use sets for the separate 
> disciplines, aimed at particular service providers? i say it that way 
> because some disciplines are not well-defined (namely, computer science) 
> so such archives may want to play ball with multiple service providers 
> and hence may need different sets.
> 
> in any event, for something like physics, a simple set might do the 
> trick at the source. then, somewhat in keeping with the Kepler model (as 
> published in DLib a while back), the service provider can provide an 
> interface for potential data providers to self-register. i know this 
> sounds dodgy, but think of it as an alternative mechanism for 
> contribution. either individual users submit individual papers or groups 
> submit baseURLS - both go through some kind of review and while one 
> leads to once-off storage, the other leads to periodic harvesting.
> 
> what remains a difficult problem, however, is how to recreate the 
> metadata used by the service provider as its native format. so, for a 
> typical example, if arXiv classifies items using a specific set 
> structure, this is certainly not going to be the default for an 
> institutional archive. does the service provider automatically or 
> manually reclassify? or does it not allow browsing by categories? in 
> either event, the quality of the metadata from the perspective of the 
> service provider may be an impetus for potential users to want to 
> replicate their effort rather than rely on the automated submission from 
> their own institutions ... this needs more thought ...
> 
> ttfn,
> ----hussein
> 
> 
> Christopher Gutteridge wrote:
> >Disciplinary/subject archives vs. Institutional/Organisation/Region based
> >archives. This is going to be a key challenge now open archives begin
> >to gain momentum.
> >
> >For example: we are planning a University-wide eprints archive. I am
> >concerned that some physicists will want to place their items in both
> >the university eprints service AND the arXiv physics archive. They may
> >be required to use the university service, but want to use arXiv as it
> >is the primary source for their discipline. This is a duplication of
> >effort and a potential irritation.
> >
> >Ultimately, of course, I'd hope that disciplinary archives will be replaced
> >with subject-specific OAI service providers harvesting from the
> >institutional archives. But there is going to be a very long transition
> >period in which the solution evolves from our experience.
> >
> >What I'm asking is: has anyone given consideration to ways of smoothing
> >over this duplication of effort? Possibly some negotiated automated process
> >for institutional archives uploading to the subject archive, or at least
> >assisting the author in the process.
> >
> >This isn't the biggest issue, but it'd be good to address it before it
> >becomes more of a problem.
> >
> >  Christopher Gutteridge
> >  GNU EPrints Head Developer
> >  http://software.eprints.org/
> >
> >On Sun, Mar 16, 2003 at 02:15:56 +0000, Stevan Harnad wrote:
> >
> >>On Sat, 15 Mar 2003, Thomas Krichel wrote:
> >>
> >>
> >>> Stevan Harnad writes:
> >>>
> >>>sh> There is no need -- in the age of OAI-interoperability -- for
> >>>sh> institutional archives to "feed" central disciplinary archives:
> >>>
> >>> I do not share what I see as a blind faith in interoperability
> >>> through a technical protocol.
> >>
> >>I am quite happy to defer to the technical OAI experts on this one, but let
> >>us put the question precisely:
> >>
> >>Thomas Krichel suggests that institutional (OAI) data-archives
> >>(full-texts) should "feed" disciplinary (OAI) data-archives,
> >>because OAI-interoperability is somehow not enough. I suggest that
> >>OAI-interoperability (if I understand it correctly) should be enough. No
> >>harm in redundant archiving, of course, for backup and security, but not
> >>necessary for the usage and functionality itself. In fact, if I understand
> >>correctly the intent of the OAI distinction between OAI data-providers --
> >>http://www.openarchives.org/Register/BrowseSites.pl
> >>-- and OAI service-providers --
> >>http://www.openarchives.org/service/listproviders.html
> >>-- it is not the full-texts of data-archives that need to be "fed" to
> >>(i.e., harvested by) the OAI service providers, but only their metadata.
> >>
> >>Hence my conclusion that distributed, interoperable OAI institutional
> >>archives are enough (and the fastest route to open-access). No need
> >>to harvest their contents into central OAI discipline-based archives
> >>(except perhaps for redundancy, as backup). Their OAI interoperability
> >>should be enough so that the OAI service-providers can (among other things)
> >>do the "virtual aggregation" by discipline (or any other computable
> >>criterion) by harvesting the metadata alone, without the need to harvest
> >>full-text data-contents too.
> >>
> >>It should be noted, though, that Thomas Krichel's excellent RePEc
> >>archive and service in Economics -- http://repec.org/ -- goes
> >>well beyond the confines of OAI-harvesting! RePEc harvests non-OAI
> >>content too, along lines similar to the way ResearchIndex/citeseer --
> >>http://citeseer.nj.nec.com/cs -- harvests non-OAI content in computer
> >>science. What I said about there being no need to "feed" institutional OAI
> >>archive content into disciplinary OAI archives certainly does not apply
> >>to *non-OAI* content, which would otherwise be scattered willy-nilly
> >>all over the net and not integrated in any way. Here RePEc's and
> >>ResearchIndex's harvesting is invaluable, especially as RePEc already
> >>does (and ResearchIndex has announced that it plans to) make all its
> >>harvested content OAI-compliant!
> >>
> >>To summarize: The goal is to get all research papers, pre- and
> >>post-peer-review, openly accessible (and OAI-interoperable) as soon as
> >>possible. (These are BOAI Strategies 1 [self-archiving] and 2
> >>[open-access journals]: http://www.soros.org/openaccess/read.shtml
> >>). In principle this can be done by (1) self-archiving them in central
> >>OAI disciplinary archives like the Physics arXiv (the biggest and
> >>first of its kind) -- http://arxiv.org/show_monthly_submissions
> >>-- by (2) self-archiving them in distributed institutional OAI
> >>Archives -- http://www.ecs.soton.ac.uk/~harnad/Temp/tim.ppt -- by (3)
> >>self-archiving them on arbitrary Web and FTP sites (and hoping they
> >>will be found or harvested by services like RePEc or ResearchIndex)
> >>or by (4) publishing them in open-access journals (BOAI Strategy 2:
> >>http://www.soros.org/openaccess/journals.shtml ).
> >>
> >>My point was only that because researchers and their institutions
> >>(*not* their disciplines) have shared interests vested in maximizing
> >>their joint research impact and its rewards, institution-based
> >>self-archiving (2) is a more promising way to go -- in the age of
> >>OAI-interoperability -- than discipline-based self-archiving (1), even
> >>though the latter began earlier. It is also obvious that both (1) and
> >>(2) are preferable to arbitrary Web and FTP self-archiving (3), which
> >>began even earlier (although harvesting arbitrary Website and FTP contents
> >>into OAI-compliant Archives is still a welcome makeshift strategy
> >>until the practise of OAI self-archiving is up to speed). Creating new
> >>open-access journals and converting the established (20,000) toll-access
> >>journals to open-access is desirable too, but it is obviously a much
> >>slower and more complicated path to open access than self-archiving,
> >>so should be pursued in parallel.
> >>
> >>My conclusion in favor of institutional self-archiving is based on the
> >>evidence and on logic, and it represents a change of thinking,
> >>for I had originally advocated (3) Web/FTP self-archiving --
> >>http://www.arl.org/scomm/subversive/toc.html -- then switched allegiance
> >>to central self-archiving (1), even creating a discipline-based archive:
> >>http://cogprints.ecs.soton.ac.uk/ But with the advent of OAI in 1999,
> >>plus a little reflection, it became apparent that
> >>institutional self-archiving (2) was the fastest, most direct, and most
> >>natural road to open access: http://www.eprints.org/
> >>And since then its accumulating momentum seems to be confirming that this
> >>is indeed so: http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2212.html
> >>http://www.ecs.soton.ac.uk/~harnad/Temp/tim.ppt
> >>
> >>
> >
> >
> 
> 
> -- 
> =====================================================================
> hussein suleman ~ hussein AT cs.uct.ac.za ~ http://www.husseinsspace.com
> =====================================================================

-- 
    Christopher Gutteridge -- cjg AT ecs.soton.ac.uk -- +44 (0)23 8059 4833

                          >O___,
 __________________________(___)___________________________________________
|                                   |                                      |
| Now Playing: "For You" from       | Pessimist by policy, optimist by     |
| Tracy Chapman - Tracy Chapman     | temperament -- it is possible to be  |
|                                   | both. How? By never taking an        |
|                                   | unnecessary chance and by            |
|                                   | minimizing risks you can't avoid.    |
|                                   | This permits you to play out the     |
|                                   | game happily, untroubled by the      |
|                                   | certainty of the outcome. -- From    |
|                                   | "The Notebooks of Lazarus Long" by   |
|                                   | Robert Heinlein                      |
|___________________________________|______________________________________|

