Budapest Open Access Initiative      

Budapest Open Access Initiative: BOAI Forum Archive

[BOAI] OA Growth Monitoring Needs a Google Data-Mining Exemption

From: Stevan Harnad <amsciforum AT gmail.com>
Date: Fri, 23 Aug 2013 01:17:19 -0400


This is a response to a query by Adam G Dunn
<http://news.sciencemag.org/scientific-community/2013/08/free-papers-have-reached-tipping-point-study-claims#comment-1014280880>
in *Science Insider* regarding Eric Archambault's report on OA Growth
<http://www.science-metrix.com/pdf/SM_EC_OA_Availability_2004-2011.pdf>:
"I find it difficult to believe that the authors of the study managed to
create a harvester that could identify and verify the pdfs linked to by
Google Scholar when Google Scholar actively blocks IP addresses when they
identify crawling."

Our own "harvester" attempts to gather the all-important data on OA growth
were blocked by Google.

It is completely understandable and justifiable that Google shields its
increasingly vital global database and search mechanisms from the countless
and incessant worldwide attempts at exploitation by commercial interests,
spammers, and malware that could bring Google to its knees if not
rigorously and relentlessly blocked.

But in the very special (and tiny) case of scientific research articles, it
would be a great help not only to the worldwide research community but also
to Google (and Google Scholar) itself if Google granted special individual
exemptions for important international studies like Eric Archambault's,
which was commissioned by the European Union to monitor the global growth
rate of open access to research.

Google and Google Scholar would become all the richer as research databases
if data like Eric's (and our own) were not made so excruciatingly difficult
and time-consuming to gather by Google's blanket blockage of automated
data-mining.

(We do not trawl books, so Google's agreements with publishers are not
violated or at issue in any way. We just want to trawl for articles whose
metadata match the metadata from Web of Science or SCOPUS and that have
been made freely accessible on the web. Nor do we want their full texts:
we just want to check whether they are there!)

Stevan Harnad

--      
To unsubscribe from the BOAI Forum, use the form on this page:
http://mailman.ecs.soton.ac.uk/mailman/listinfo/boai-forum

 E-mail:  openaccess@soros.org .