Budapest Open Access Initiative: BOAI Forum Archive

Re: [BOAI] Formats for electronic dissemination

From: Radu <radu AT monicsoft.net>
Date: Tue, 28 Oct 2003 13:37:25 -0500

At 11:55 AM 10/27/03, Dario Taraborelli wrote:
>(I confess that I don't thoroughly understand the problem with pdf's,
>since pdf documents can be indexed by search engines as easily as html
>documents: it doesn't look like an insuperable technical problem).

Archived PDFs have another problem, much worse than the relative
inaccessibility of their content's semantics: image-based text.

I have seen many journal archives that simply dump page scans into PDF
format. The resulting documents are huge and totally impenetrable to
current classification/data mining tools, because they contain pictures
of pages rather than text. It's even impossible to copy/paste text out
of these 'archives'.
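
As a rough illustration of how such files might be spotted, here is a
minimal sketch in Python. It assumes that a page-scan PDF declares image
objects but no fonts in its raw bytes. The function name
looks_image_only is hypothetical, and the check is only a heuristic: it
misses files whose object dictionaries are compressed, and it reports a
scan with an added OCR text layer as having text.

```python
import re

# Markers that appear in uncompressed PDF object dictionaries.
# Assumption: a font declaration implies some real text, and an
# image XObject with no fonts at all implies a bare page scan.
FONT_MARKER = re.compile(rb"/Type\s*/Font\b")
IMAGE_MARKER = re.compile(rb"/Subtype\s*/Image\b")


def looks_image_only(pdf_bytes: bytes) -> bool:
    """Return True when the raw bytes declare images but no fonts."""
    has_fonts = FONT_MARKER.search(pdf_bytes) is not None
    has_images = IMAGE_MARKER.search(pdf_bytes) is not None
    return has_images and not has_fonts
```

One could read each archived file in binary mode and pass its bytes to
this function to get a first list of candidates for OCR. A proper PDF
parser would be needed for a reliable answer.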


Yours,
Radu
--
Eastcree.org project
Carleton University
www.monicsoft.net/proj/creeTime.html
(613) 520-2600x2174 