Re: [BOAI] Formats for electronic dissemination

From: "Dr.Vinod Scaria" <drvinod AT>
Date: Mon, 20 Oct 2003 13:46:17 +0530

I agree with Radu's views.
I have always wondered why they convert page scans to PDFs.
They could just as well serve the scans as GIFs or JPEGs, which are handier
and easier to download. There is no special advantage in the PDF wrapper
itself; as Radu notes, these files are virtually impenetrable to data
mining tools.
Moreover, the print quality of many of these scanned PDFs is equally poor.

kind regards

Dr.Vinod Scaria
MAIL: vinodscaria AT
Mobile: +91 98474 65452

From: Radu
To: BOAI Forum
Sent: Wednesday, October 29, 2003 12:07 AM
Subject: Re: [BOAI] Formats for electronic dissemination

At 11:55 AM 10/27/03, Dario Taraborelli wrote:
>(I confess that I don't thoroughly understand the problem with pdf's,
>since pdf documents can be indexed by search engines as easily as html
>documents: it doesn't look like an insuperable technical problem).

There's something else about archived pdfs, much worse than the relative
inaccessibility of the semantics for their content, and that's image-based

I have seen many journal archives which simply dump page scans into pdf
format. The resulting documents are huge and totally impenetrable by
current classification/data mining tools. It's even impossible to
copy/paste text out of these 'archives'.
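
As a rough illustration of the point above (a sketch only, not something
posted in this thread): a scan-only pdf usually contains image objects but
no font resources, so there is no text for a tool to extract. The function
name looks_image_only and the two byte patterns are assumptions made for
this example. The check reads only uncompressed parts of the file, so pdfs
that keep their object dictionaries in compressed object streams may be
misjudged.

```python
import re

# Byte patterns that normally appear in uncompressed object dictionaries.
# Assumption: the file does not hide its dictionaries in compressed
# object streams, in which case neither pattern would be visible here.
IMAGE_MARKER = re.compile(rb"/Subtype\s*/Image")
FONT_MARKER = re.compile(rb"/Font")


def looks_image_only(pdf_bytes: bytes) -> bool:
    """Heuristic: True if the pdf shows image objects but no font resources."""
    has_images = IMAGE_MARKER.search(pdf_bytes) is not None
    has_fonts = FONT_MARKER.search(pdf_bytes) is not None
    return has_images and not has_fonts
```

To try it on a real file, read the file in binary mode, e.g.
open(path, "rb").read(), and pass the bytes in. A True result only suggests
that the document is a stack of page scans; it does not prove it.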

-- project
Carleton University
(613) 520-2600x2174

