Budapest Open Access Initiative: BOAI Forum Archive
RE: [BOAI] Launch of SPARC Europe Seal for OA (standards, license and metadata)
From: "David Prosser" <david.prosser AT bodley.ox.ac.uk>
Tom

I think that you are downplaying what data- and text-mining have already achieved. There have been some very interesting results from text-mining just the abstracts of papers in Medline (and abstracts have been used because the researchers did not have access to all of the full-text). Also, in the humanities, text-mining has been used with good effect using newspapers as the 'mines'. (And to head-off a tangent - I know that access to newspapers is not the aim of OA, but I mention it to show that results are possible in the humanities.) Yes this is an embryonic field, but one that is already providing results.

So, would an individual working in a commercial setting be allowed to download copies of all of the papers in your journal and host them locally, reformat them, and run text-mining programmes over them? If not, whose permission would they need: yours as publisher or that of each author? If they need permission and wanted to run their programme across 100,000 papers from 2000 publishers how would they go about getting that permission?

Best wishes

David

-----Original Message-----
From: owner-boai-forum AT ecs.soton.ac.uk [mailto:owner-boai-forum AT ecs.soton.ac.uk] On Behalf Of Prof. Tom Wilson
Sent: 11 May 2008 15:45
To: BOAI Forum
Subject: RE: [BOAI] Launch of SPARC Europe Seal for OA (standards, license and metadata)

Data mining is usually defined as searching for hitherto unrecognized patterns in collections of data, employing a variety of statistical and AI techniques. To do this, one needs collections of DATA sets, not simply collections of papers. Papers are predominantly text and the treatment of text to discover new relationships is not only in its infancy but probably embryonic - the problems are the usual ones faced by AI - texts, particularly a collection of texts such as the entire contents of a journal are notoriously difficult to carry out any automatic extraction process upon.
Even information retrieval, after about 50 years of development, has still not cracked the fundamentally ambiguous nature of human language.

However, let us suppose that some machine exists that could take the entire contents of an OA journal and somehow "mine" it, so that the researcher could discover relationships among concepts of which s/he was previously unaware. How is this different from the human being carrying out a literature review to do exactly the same thing? In neither case is there any infringement of copyright - research is published precisely to enable this kind of analysis and the consequent further progression of the field of knowledge.

At the level of data within a document, consider the publication of a new way of calculating a statistical measure that makes that measure more useful in certain circumstances to do with the nature of the sample population: and suppose that datum is discovered in a search (by man or machine) - am I banned from using it because it is in a copyrighted document? Of course not. I am banned from passing it off as my own discovery, but not from using it. It was published with the intention that it should be used.

The notion, therefore, that only the CC-BY licence can aid "data mining" - whatever that turns out to be beyond the usual hype - is untenable. To base the Seal upon this perception, therefore, is misguided.

Professor T.D. Wilson, PhD, Hon.PhD
Publisher/Editor in Chief
Information Research
InformationR.net
e-mail: t.d.wilson AT shef.ac.uk
Web site: http://InformationR.net/
___________________________________________________

Quoting David Prosser <david.prosser AT bodley.ox.ac.uk>:

> The confusion for me (and this may just be my misunderstanding) comes from the way in which people do data-mining. It is not just a question of searching across a range of articles.
> Many data-miners want to copy the articles onto a local computer, possibly re-format them so that they are in a standard form, and then perform the data-mining. It is not clear to me that a researcher at a commercial organisation could do that to papers that are published under a non-commercial license. If that is so, then they would need to contact either all the publishers individually (in the best case) or each author (in the worst case). Surely this would not be practical.
>
> (I would welcome comment from either copyright lawyers or data-miners to tell me if I have this wrong.)
>
> David
>
> -----Original Message-----
> From: owner-boai-forum AT ecs.soton.ac.uk [mailto:owner-boai-forum AT ecs.soton.ac.uk] On Behalf Of Andras Holl
> Sent: 08 May 2008 15:22
> To: boai-forum AT ecs.soton.ac.uk
> Subject: RE: [BOAI] Launch of SPARC Europe Seal for OA (standards, license and metadata)
>
> V. Sasi Kumar wrote:
>
> > I have a small doubt here. Can we say that searching a journal is a commercial use of the journal?
>
> No, you are right, I have to clarify my position. Doing the search is not commercial, and there is no need for permission if they use the full-text search facilities provided by the journal. Nor is there any need to get permission if the journal permits transfer of its whole content. In my opinion, permission is needed only where the journal grants no such permission and they want to transfer the whole content to their own site for data mining. Also, most journals provide their full-text search or other facilities for the individual user, for "normal" use. It is advisable, at least, for a commercial company (or anyone capable of doing so) to contact the journal publisher if they want to run "industrial", "massive" searches or downloads.
>
> In my opinion, there is a difference between "normal" or "average" or "fair" use - when an individual researcher pursues his/her own scholarly agenda - and a commercial company doing something on an industrial scale, and for profit. If nothing else, such an activity could overload the server and potentially deny other users access to the content.
>
> On the other hand, there is the question of whether permission is needed from the authors when a company uses their work. If they use the result, the conclusion, then probably not. If they use the paper as a whole, or a table or a figure, reproduced in a product which is for sale, then I would say yes.
>
> Andras Holl