Budapest Open Access Initiative      

Budapest Open Access Initiative: BOAI Forum Archive

Re: [BOAI] Formats for electronic dissemination

From: Michael Brown <m.brown AT liverpool.ac.uk>
Date: Fri, 31 Oct 2003 16:59:20 +0000




I was responding to the comments about the PDF format made on this
forum which are clearly incorrect - not about its suitability as a
format for self-archiving - for example:

> and two, they are too often in columns making them difficult to read
> online.

This is *not* the fault of the format - clearly you can make
one-column PDFs.

> There's something else about archived pdfs, much worse than the
> relative inaccessibility of the semantics for their content, and
> that's image-based text.

Again, this is *not* the fault of the format - a PDF can hold real,
searchable text; image-only pages come from how the file was produced.

I don't believe that PDF is the only (or best) format for archiving
material. That is what SGML and XML are for: they allow information to
be extracted and re-used in a variety of dissemination strategies,
giving the user the choice.
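
To make the extract-and-re-use point concrete, here is a minimal sketch
in Python. It assumes a made-up XML vocabulary (article, title, para)
and two hypothetical helpers, render_text and extract_paragraphs; none
of these names come from any real archive or standard. It only
illustrates that one structured source can be laid out to suit the
reader, or mined for its content, without touching the original.

```python
import textwrap
import xml.etree.ElementTree as ET

# Hypothetical structured source; the element names are invented for this sketch.
SOURCE = """<article>
  <title>Formats for electronic dissemination</title>
  <para>PDF fixes the layout when the document is produced.</para>
  <para>Structured text leaves the layout to the reader.</para>
</article>"""


def extract_paragraphs(xml_source):
    """Pull the paragraph text out of the source, whitespace normalised."""
    root = ET.fromstring(xml_source)
    return [" ".join("".join(p.itertext()).split()) for p in root.iter("para")]


def render_text(xml_source, width):
    """Lay the article out as plain text at a line width chosen by the reader."""
    root = ET.fromstring(xml_source)
    title = root.findtext("title", default="").strip().upper()
    blocks = [textwrap.fill(title, width=width)]
    for text in extract_paragraphs(xml_source):
        blocks.append(textwrap.fill(text, width=width))
    return "\n\n".join(blocks)


if __name__ == "__main__":
    # Same source, two layouts: a narrow column and a wide one.
    print(render_text(SOURCE, width=30))
    print()
    print(render_text(SOURCE, width=72))
```

The width here stands in for any reader-side choice (font size, column
count, speech output); the source itself carries no layout at all.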

Cheers,

Mike

On 30 Oct 2003, at 20:57, Michael J. O'Donnell wrote:

> Well, first I agree with Stevan Harnad that achieving open archiving
> is much more important than the choice of format. Can we coin a
> proverb: "PDF in the archive is better than perfect format deleted"?
> But, since I am an expert on formats, I have to respond:
>
> Mike Brown wrote:
>
>>> and two, they are too often in columns making them difficult to
>>> read online.
>>>
>>
>> This is not the technology at fault - this is the preference of the
>> producer of the document.
>>
> It's partly the fault of the choice of "technology" (that is,
> format). PDF is a page layout format, not a structured text format.
> It requires the producer to make choices that are much better left
> to readers, in particular because different readers have different
> needs. It should be the reader, rather than the producer, who is
> making layout choices.
>
> Supposing that we can choose which format to put in the archive, we  
> should choose formats that provide maximum flexible utility to  
> readers. We may enumerate some things that readers might like to do  
> with documents: display them, search them, perform statistical  
> analyses on them, display them in huge fonts because of low visual  
> acuity (magnifying a 10-point layout does a very bad job of this),  
> browse them audibly (which is much more than just having them read),  
> ... But the point is not to support particular anticipated uses.  
> Rather, the point is to support maximum flexibility, and avoid  
> foreclosing uses that will be invented in the future.
>
> Paradoxically, it is quite possible to evaluate the inherent  
> flexibility of a format without knowing the precise use to which that  
> flexibility will contribute. I wrote up some analysis of known types  
> of formats, and the article is openly (but not very well organizedly)  
> archived at  
> http://people.cs.uchicago.edu/~odonnell/Scholar/Technical_papers/Electronic_Journal/description.html
>
> Mike O'Donnell
> The University of Chicago
>
>




E-mail: openaccess@soros.org