[Simh] best way to scan 172 column fanfold 80s printout?

Timothe Litt litt at ieee.org
Sun Feb 11 15:10:36 EST 2018


On 11-Feb-18 14:29, Davis Johnson wrote:
> I think what you need is a wide carriage printer with the typical feed
> up through a slot in the bottom, and a camera.
>
> The only working function needed from the printer is form feed.
> Photograph the page that is hanging below the printer, form feed and
> repeat.
>
> Anybody here ought to be able to handle the programming to automate
> this process.
>
> You would need to manually photograph the first page.
>
> The camera would need good depth of field.
>
>
It's not that simple.  You need to deal with at least 2 common vertical
pitches (6 & 8 LPI), and a number of page lengths (and widths).  These
need to be set up per job; not all printers support all of them.  Plus,
there's misalignment (as Al noted, crossing the perforations at the bottom
of a page is quite common).  The OP mentioned that his listings have a hard
crease; this will cause (at least) feed and stacking problems.  Form
feed causes a high-speed slew; this becomes less reliable as the
distance moved increases.  You're proposing an entire page at a time -
which means that the paper will jump off the tractors frequently.[1] 
Old paper is fragile.  Over hundreds of pages, dimensions may not be
stable; it was not uncommon to have to re-adjust TOF after a while. 
There's a fair bit of error detection and recovery to work out.
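
To make the per-job setup concrete, here is a minimal sketch of the
geometry involved.  It assumes the form length is known in inches and
that the capture resolution (dpi) has been calibrated; the function
names and the 300 dpi figure are made up for illustration, not taken
from any existing tool.

```python
def lines_per_form(form_length_in, lpi):
    """Number of print lines on one form at a given vertical pitch."""
    return round(form_length_in * lpi)


def line_pitch_px(lpi, dpi):
    """Expected distance in pixels between successive text baselines."""
    return dpi / lpi


def guess_lpi(measured_pitch_px, dpi, candidates=(6, 8)):
    """Pick the candidate pitch whose expected spacing is closest to
    the spacing measured in the image (hypothetical helper)."""
    return min(candidates, key=lambda lpi: abs(dpi / lpi - measured_pitch_px))


if __name__ == "__main__":
    # An 11 inch form holds 66 lines at 6 LPI and 88 lines at 8 LPI.
    for lpi in (6, 8):
        print(lpi, "LPI:", lines_per_form(11, lpi), "lines,",
              line_pitch_px(lpi, 300), "px per line at 300 dpi")
    # A measured baseline spacing of about 49 px at 300 dpi suggests 6 LPI.
    print("guessed pitch:", guess_lpi(49.0, 300), "LPI")
```

Even something this small has to be told (or has to detect) the pitch
and form length for each listing, which is the point: it's job setup,
not a constant.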

Lighting is an issue, as is compensating for keystoning and other
misalignments.  Most cameras don't have a standard remote trigger
interface - one of the pointers I provided loads modified firmware into
cameras from one manufacturer to make this work.  If you look at digital
camera reviews, you'll see that the lenses have varying degrees of
artifacts, especially at the edges.  So you need to find and zoom to an
area that's relatively "flat" & doesn't need a lot of correction.  While
depth of field will help, it also will result in apparent font size
changes as paper sways forward and back.  If you stop that, you simplify
the OCR - and don't need as much depth of field.
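
For a rough feel of the font size effect, here is a sketch using a
simple pinhole-camera model, where image size is inversely proportional
to the distance from the camera.  The 500 mm working distance and the
sway amounts are assumed numbers for illustration only; real lenses
will deviate from this model.

```python
def apparent_scale(nominal_distance_mm, sway_mm):
    """Relative image size when the paper moves sway_mm away from
    (positive) or toward (negative) the camera, pinhole model."""
    distance = nominal_distance_mm + sway_mm
    if distance <= 0:
        raise ValueError("paper cannot be at or behind the camera")
    return nominal_distance_mm / distance


def char_height_px(nominal_height_px, nominal_distance_mm, sway_mm):
    """Character height in pixels after the paper has moved."""
    return nominal_height_px * apparent_scale(nominal_distance_mm, sway_mm)


if __name__ == "__main__":
    # Assumed setup: camera 500 mm from the paper, characters 30 px tall.
    for sway in (-10, 0, 10):
        print(f"sway {sway:+} mm: {char_height_px(30, 500, sway):.2f} px")
```

With those assumed numbers, 10 mm of sway changes the character height
by about 2% in either direction, on top of whatever the lens does at
the edges.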

There are many backgrounds that need to be subtracted for OCR to work. 
(Printer paper was notorious for institutional logos, as well as bars
and other aids to human readers.)  Then there are the other issues
mentioned in my earlier note.
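
As a minimal sketch of one kind of background subtraction: assume the
shading bars run horizontally across the whole page and that text is
sparse, so the median of each pixel row approximates the paper tone for
that row.  That assumption does not cover logos or vertical rules, and
the function names and threshold values here are hypothetical.

```python
from statistics import median


def global_threshold(image, cutoff):
    """Naive approach: mark every pixel darker than cutoff as ink."""
    return [[1 if pixel < cutoff else 0 for pixel in row] for row in image]


def remove_row_background(image, threshold=40):
    """Mark a pixel as ink only if it is noticeably darker than the
    median of its own row.

    image is a list of rows of grayscale values (0 = black, 255 = white).
    Returns a same-shaped list with 1 for ink and 0 for background.
    """
    result = []
    for row in image:
        background = median(row)  # approximate paper tone for this row
        result.append([1 if background - pixel > threshold else 0
                       for pixel in row])
    return result


if __name__ == "__main__":
    page = [
        [255, 255, 30, 255, 255],  # white row with one ink pixel
        [200, 200, 30, 200, 200],  # shaded-bar row with one ink pixel
    ]
    print(global_threshold(page, 220))   # the whole shaded row reads as ink
    print(remove_row_background(page))   # only the ink pixels are marked
```

A single global cutoff either swallows the shaded rows or loses faint
print; a per-row estimate handles the bars but nothing else, which is
why the backgrounds end up needing separate treatment.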

It seems simple, but it is a P.roject.  That's a capital P.  With a lot
of roject to work out.

It's worthwhile, but it's not simple.  It's a pretty interesting
hardware (and software) project.  I don't mean to discourage anyone who
wants to work on it - but you need to go in with eyes open, or you'll
end up very, very frustrated.

Thunderscan tried to scan line by line & retrieve grayscale; the
challenge was piecing together the adjacent lines at pixel
resolution.  The focal distance was constant because the camera was on
a carriage.  The idea here is to capture a page per frame, so the
registration problems are quite different.  One could try the
Thunderscan approach; it would trade one set of problems xxx "challenges
and opportunities" for another.

[1] In my experience, with many brands and models of tractor feed
printers over many years.  Paper handling is really difficult to get right.

> On 02/11/2018 01:17 PM, Al Kossow wrote:
>>
>> On 2/11/18 10:11 AM, Dan Gahlinger wrote:
>>
>>> which is why I wondered what people thought of turning an old DEC
>>> teletype or printer into a scanner, by fixing a camera
>>> to it
>> sounds like a bigger version of the Thunderscan
>> https://www.folklore.org/StoryView.py?story=Thunderscan.txt
>>
