Budapest Open Access Initiative      

Budapest Open Access Initiative: BOAI Forum Archive

Re: [BOAI] Formats for electronic dissemination

From: Mike Brown <M.Brown AT liverpool.ac.uk>
Date: Thu, 30 Oct 2003 15:11:45 +0000 (GMT)



And I have to strongly disagree ;-)

>They can always use them as GIFs or JPEGs which is much handy and
>easily downloadable and there is no special advantage being a PDF by
>itself,

Have you ever tried to assemble a journal article for dissemination
using just GIFs or JPEGs? Take a look at this article:

An investigation on filariasis in the Berau region (Inanwatan District,
North-West New Guinea) (1.4MB pdf)
http://filariasis.net/library/media/report_pdfs/spc/1957/spc_tpn_105.pdf

and tell me it would be easier for non-IT-literate people (99% of the
audience - unless it's an IT one :-)) to view it as a series of
individual GIFs/JPEGs, rather than click on the link and have it open
in Acrobat Reader, just like a facsimile of the original printed
report, across *all* platforms (UNIX, Mac, WinPC).  Furthermore, if the
article is still protected by copyright, using PDF allows you (should
you desire) to control access to the document - something you cannot
do with GIFs/JPEGs.

>there is no special advantage being a PDF

I would suggest spending some time with the PDF specification and an
application such as Adobe Acrobat to see what you can do with PDF - did
you know, for example, that the windowing system in Mac OS X uses
PDF as the basis of its imaging model?

http://www.apple.com/macosx/features/quartz/

>these are virtually impenetrable by data mining tools.

This is not the fault of PDF - but the person who applies the
technology - see my last e-mail on this subject.

>Moreover, the print quality of many of these scanned PDFs are equally
>poor.

Again, this is not the fault of PDF - but the person who applies the
technology.

The report I link to above (and all of the articles I have converted on
filariasis.net) prints out at high quality. Why? Because it has a
resolution of 300 DPI, not the screen resolution of 72 DPI.

Many journals, when converting their archives to PDF, simply choose a low
resolution, either at the scan stage or at the conversion-to-PDF stage, as
it lowers the final size of the PDF. Why?

I think because:

1)  smaller footprint (less cost to produce and store)
2)  less bandwidth needed to transmit across a network (lower
transmission costs)

It's down to economics.
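
A rough sketch of the arithmetic behind that trade-off follows. It assumes an A4 page scanned as 8-bit greyscale and ignores compression entirely; the page size, bit depth, and the helper name scan_size are illustrative assumptions, not figures taken from any particular journal archive.

```python
# Hypothetical helper (not from any library): pixel dimensions and raw
# size of one scanned page at a given resolution.
def scan_size(width_in, height_in, dpi, bytes_per_pixel=1):
    """Return (width_px, height_px, uncompressed_bytes) for one page."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px, height_px, width_px * height_px * bytes_per_pixel

# Assumption: an A4 page, 210 x 297 mm, converted to inches (25.4 mm/inch).
A4_WIDTH_IN = 210 / 25.4
A4_HEIGHT_IN = 297 / 25.4

for dpi in (72, 300):
    w, h, size = scan_size(A4_WIDTH_IN, A4_HEIGHT_IN, dpi)
    print(f"{dpi:>3} DPI: {w} x {h} px, about {size / 1e6:.1f} MB uncompressed")

# The pixel count grows with the square of the resolution, so going from
# 72 to 300 DPI multiplies the raw data by roughly (300 / 72) ** 2.
print(f"ratio: about {(300 / 72) ** 2:.1f}x")
```

Real PDFs compress their images, so actual file sizes are much smaller than these raw figures, but the roughly 17-fold ratio between the two resolutions shows why a low scan resolution is tempting when storage and bandwidth cost money.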

The result, often, is poor quality printing. A further problem is that
image quality in these PDFs is often very poor, making the images
unreadable and, for us in the world of medicine, useless.

So in short, it's not often the fault of a technology - it is mostly the
fault of a human using the technology inappropriately or being constrained
by economic factors.

Best wishes,

Mike


On Mon, 20 Oct 2003, Dr.Vinod Scaria wrote:

>I agree to Radu's views.
>I have always wondered why they convert page scans to PDFs.
>They can always use them as GIFs or JPEGs which is much handy and easily
>downloadable and there is no special advantage being a PDF by itself, as
>Radu notes, these are virtually impenetrable by data mining tools.
>Moreover, the print quality of many of these scanned PDFs are equally poor.
>
>kind regards
>Vinod
>
>
>Dr.Vinod Scaria
>WEB: www.drvinod.netfirms.com
>MAIL: vinodscaria AT yahoo.co.in
>Mobile: +91 98474 65452
>
>
>
>----- Original Message -----
>From: Radu
>To: BOAI Forum
>Sent: Wednesday, October 29, 2003 12:07 AM
>Subject: Re: [BOAI] Formats for electronic dissemination
>
>
>At 11:55 AM 10/27/03, Dario Taraborelli wrote:
>>(I confess that I don't thoroughly understand the problem with pdf's,
>>since pdf documents can be indexed by search engines as easily as html
>>documents: it doesn't look like an insuperable technical problem).
>
>There's something else about archived pdfs, much worse than the relative
>inaccessibility of the semantics for their content, and that's image-based
>text.
>
>I have seen many journal archives which simply dump page scans into pdf
>format. The resulting documents are huge and totally impenetrable by
>current classification/data mining tools. It's even impossible to
>copy/paste text out of these 'archives'.
>
>
>Yours,
>Radu
>--
>Eastcree.org project
>Carleton University
>www.monicsoft.net/proj/creeTime.html
>(613) 520-2600x2174
>
>
>

