[Simh] OSs with accessible documentation

Johnny Billquist bqt at softjar.se
Sun Feb 7 06:49:35 EST 2016


On 2016-02-06 20:32, Dave Wade wrote:
>
>
>> -----Original Message-----
>> From: Simh [mailto:simh-bounces at trailing-edge.com] On Behalf Of Paul
>> Koning
>> Sent: 06 February 2016 19:01
>> To: Timothe Litt <litt at ieee.org>
>> Cc: simh at trailing-edge.com
>> Subject: Re: [Simh] OSs with accessible documentation
>>
>>
>>> On Feb 5, 2016, at 6:10 PM, Timothe Litt <litt at ieee.org> wrote:
>>>
>>> Some of the PDFs on bitsavers are searchable.  It would be a good
>>> project to OCR the rest into searchable pdfs - as that also means that
>>> the text can be extracted.   OCR is getting good enough (finally) that
>>> it's feasible.  I'm sure that they'd be accepted back into bitsavers
>>> - searchable is good for everyone.
>>
>> Some disapprove of OCR for reasons I don't really understand.
>
> It depends how you build the PDF. If you replace the images with the OCR's text, which seems to be the default, then you introduce errors.
> If you leave the images in place and put the text behind the images, I can't see what the problem is.

For me personally, I would like to have two copies of documentation. One
which is pure/plain text, with no preservation of the scan. Images in the
documentation need to be preserved, but nothing else. And then you can
have the full scanned sources in a different file for those who actually
want that.

The reason is that working on a 50M PDF file is horrible. PDFs do not
work that well with huge amounts of data for each page. It gets slow, it
eats resources, and it becomes almost unusable as reading material.

I want manuals to use them, not to just "preserve" them.

	Johnny

-- 
Johnny Billquist                  || "I'm on a bus
                                   ||  on a psychedelic trip
email: bqt at softjar.se             ||  Reading murder books
pdp is alive!                     ||  tryin' to stay hip" - B. Idol

