Budapest Open Access Initiative      

Budapest Open Access Initiative: BOAI Forum Archive



RE: [BOAI] Launch of SPARC Europe Seal for OA (standards, license and metadata)

From: "David Prosser" <david.prosser AT bodley.ox.ac.uk>
Date: Tue, 13 May 2008 17:33:30 +0100



Tom

I think that you are downplaying what data- and text-mining have already
achieved.  There have been some very interesting results from text-mining
just the abstracts of papers in Medline (and abstracts have been used
because the researchers did not have access to all of the full text).  Also,
in the humanities, text-mining has been used to good effect using
newspapers as the 'mines'.  (And to head off a tangent - I know that access
to newspapers is not the aim of OA, but I mention it to show that results
are possible in the humanities.)  Yes, this is an embryonic field, but one
that is already providing results.

So, would an individual working in a commercial setting be allowed to
download copies of all of the papers in your journal and host them locally,
reformat them, and run text-mining programmes over them?  If not, whose
permission would they need:  yours as publisher or that of each author?  If
they needed permission and wanted to run their programme across 100,000
papers from 2000 publishers, how would they go about getting that permission?

Best wishes

David


-----Original Message-----
From: owner-boai-forum AT ecs.soton.ac.uk
[mailto:owner-boai-forum AT ecs.soton.ac.uk] On Behalf Of Prof. Tom Wilson
Sent: 11 May 2008 15:45
To: BOAI Forum
Subject: RE: [BOAI] Launch of SPARC Europe Seal for OA (standards, license
and metadata)

Data mining is usually defined as searching for hitherto unrecognized
patterns in collections of data, employing a variety of statistical and AI
techniques. To do this, one needs collections of DATA sets, not simply
collections of papers. Papers are predominantly text and the treatment of
text to discover new relationships is not only in its infancy but probably
embryonic - the problems are the usual ones faced by AI - texts,
particularly a collection of texts such as the entire contents of a journal
are notoriously difficult to carry out any automatic extraction process
upon.  Even information retrieval after about 50 years of development has
still not cracked the fundamentally ambiguous nature of human language.

However, let us suppose that some machine exists that could take the entire
contents of an OA journal and somehow "mine" it, so that the researcher
could discover relationships among concepts of which s/he was previously
unaware. How is this different from the human being carrying out a
literature review to do exactly the same thing?  In neither case is there
any infringement of copyright - research is published precisely to enable
this kind of analysis and the consequent further progression of the field
of knowledge. At the level of data within a document, consider the
publication of a new way of calculating a statistical measure that makes
that measure more useful in certain circumstances to do with the nature of
the sample population: and suppose that datum is discovered in a search (by
man or machine) - am I banned from using it because it is in a copyrighted
document?  Of course not. I am banned from passing it off as my own
discovery, but not from using it.  It was published with the intention that
it should be used.

The notion, therefore, that only the CC-BY licence can aid "data mining" -
whatever that turns out to be beyond the usual hype - is untenable. To base
the Seal upon this perception, therefore, is misguided.

Professor T.D. Wilson, PhD, Hon.PhD
Publisher/Editor in Chief
Information Research
InformationR.net
e-mail: t.d.wilson AT shef.ac.uk
Web site: http://InformationR.net/
___________________________________________________ 


Quoting David Prosser <david.prosser AT bodley.ox.ac.uk>:

> 
> 
> The confusion for me (and this may just be my misunderstanding) comes from
> the way in which people do data-mining.  It is not just a question of
> searching across a range of articles.  Many data-miners want to copy the
> articles onto a local computer, possibly re-format them so that they are
> in a standard form, and then perform the data-mining.  It is not clear to
> me that a researcher at a commercial organisation could do that to papers
> that are published under a non-commercial license.  If that is so, then
> they would need to contact either all the publishers individually (in the
> best case) or each author (in the worst case).  Surely this would not be
> practical.
> 
> (I would welcome comment from either copyright lawyers or data-miners to
> tell me if I have this wrong.)
> 
> David 
> 
> -----Original Message-----
> From: owner-boai-forum AT ecs.soton.ac.uk
> [mailto:owner-boai-forum AT ecs.soton.ac.uk] On Behalf Of Andras Holl
> Sent: 08 May 2008 15:22
> To: boai-forum AT ecs.soton.ac.uk
> Subject: RE: [BOAI] Launch of SPARC Europe Seal for OA (standards, license
> and metadata)
> 
> 
> V. Sasi Kumar wrote:
> 
> > I have a small doubt here. Can we say that searching a journal is a
> > commercial use of the journal?
> 
> No, you are right, I have to clarify my position. Doing the
> search is not commercial, and there is no need for permission
> if they use the full-text search facilities provided by the
> journal. Nor is there any need for permission if the journal
> permits transfer of its whole content. In my opinion, permission is
> needed only where no such permission exists and they want to transfer
> the whole content to their site for data mining. Also, most
> journals provide their full-text search or other facilities for the
> individual user, for "normal" use. It is advisable, at least, for a
> commercial company (or anyone who is capable of that) to contact the
> journal publisher if they want to run "industrial", "massive"
> searches or downloads.
> 
> In my opinion, there is a difference between "normal" or
> "average" or "fair" use - when an individual researcher pursues his/her
> own scholarly agenda - and a commercial company doing something
> on an industrial scale, and for profit. If nothing else,
> such an activity could overload the server, and potentially deny
> other users access to the content.
> 
> On the other hand, as to whether permission is needed from the
> authors when a company uses their work: if they use the
> result, the conclusion, then probably not. If they reproduce the
> paper as a whole, a table, or a figure in a product which
> is for sale, then I would say yes.
> 
> Andras Holl
> 
> 
> 



