[Simh] OSs with accessible documentation
Johnny Billquist
bqt at softjar.se
Sun Feb 7 06:49:35 EST 2016
On 2016-02-06 20:32, Dave Wade wrote:
>
>
>> -----Original Message-----
>> From: Simh [mailto:simh-bounces at trailing-edge.com] On Behalf Of Paul
>> Koning
>> Sent: 06 February 2016 19:01
>> To: Timothe Litt <litt at ieee.org>
>> Cc: simh at trailing-edge.com
>> Subject: Re: [Simh] OSs with accessible documentation
>>
>>
>>> On Feb 5, 2016, at 6:10 PM, Timothe Litt <litt at ieee.org> wrote:
>>>
>>> Some of the PDFs on bitsavers are searchable. It would be a good
>>> project to OCR the rest into searchable pdfs - as that also means that
>>> the text can be extracted. OCR is getting good enough (finally) that
>>> it's feasible. I'm sure that they'd be accepted back into bitsavers
>>> - searchable is good for everyone.
>>
>> Some disapprove of OCR for reasons I don't really understand.
>
> It depends how you build the PDF. If you replace the images with the OCR's text, which seems to be the default, then you introduce errors.
> If you leave the images in place and put the text behind the images, I can't see what the problem is.
For me personally, I would like to have two copies of documentation. One
that is pure plain text, with no preservation of the scan. Images in the
documentation need to be preserved, but nothing else. And then you can
have the full scanned sources in a different file for those who actually
want that.
The reason is that working on a 50M PDF file is horrible. PDFs do not
work that well with huge amounts of data for each page. It gets slow, it
eats resources, and becomes almost unusable as reading material.
I want manuals in order to use them, not just to "preserve" them.
Johnny
--
Johnny Billquist || "I'm on a bus
|| on a psychedelic trip
email: bqt at softjar.se || Reading murder books
pdp is alive! || tryin' to stay hip" - B. Idol