[Printers] printers@frotz.zork.net

Michael Sleator sleat at hottie.net
Mon Oct 31 01:27:34 PST 2005


I guess I should have spent more time digging through the archives.
Perhaps I would have found this:

    http://frotz.zork.net/pipermail/printers/2005-September/000013.html

before reinventing (or, perhaps, rediscovering) the wheel here:

    http://frotz.zork.net/pipermail/printers/2005-October/000048.html

Nonetheless, the exercise may have been worthwhile, as I have a slightly
different interpretation of a couple of points.

But first, let me direct your attention to the following image files:

    http://www.sleator.com/printers/tosky_FC-22.png
    http://www.sleator.com/printers/tosky_C5016N.png
    http://www.sleator.com/printers/tosky_CLJ2500.png
    http://www.sleator.com/printers/tosky_HP2600N_1.png
    http://www.sleator.com/printers/tosky_HP2600N_2.png
    http://www.sleator.com/printers/tosky_Template.png

These show the data from a Toshiba FC-22, a Kyocera C5016N, an HP
2500, and two HP 2600Ns, superimposed on a template that shows
the structure of the code as I presently understand it.  The Toshiba
data came from my own scans, as I posted earlier, the Kyocera data
from Ralf Muschall's post (.../2005-October/000030.html), the 2500
from Patrick Burns (http://www.burnsonline.net/dots/grid1.png), and
the HP2600Ns from David Carne's posting of Patrick Murphy's decoding
(http://www.carne.sys-techs.com/2600N_patterns/viewer/murphy_decode.php).

By the way, Patrick Burns, as you'll see, your pattern was rotated by
a column pair.  If you could confirm my transcription of your data, it
would be much appreciated.

The file tosky_Template.png shows just the template, for those who want
to superimpose their own data.  The dot spacing is 36 pixels, and in Gimp,
if you set your grid spacing and offset properly and turn on snap-to-grid,
it is a very quick matter to transcribe a pattern onto the template.  I
create a new layer and use the pencil tool with a 19-pixel diameter brush.
The sharp eye might notice that the dots in tosky_FC-22.png differ slightly
from those in the other files.  This is because the FC-22 pattern was
directly processed from a scanned image, whereas the others were manually
transcribed.  At this point I'm still working with data in the pure
physical representation so as not to overlook any clues that might be
buried in the geometry.

Regarding the file names, I had taken to calling this code the "TosKy"
code (for TOShibaKYocera) before I realized it was also used by the
HP CLJ series.

I've changed the block numbering from my earlier post, for which I
apologize.  Although it is arbitrary, my instincts are pulling me in
the direction of numbering the top blocks "0".

In the above-mentioned .../2005-September/000013.html post, Patrick
Murphy says:

	Each row has 2 numbers on it.  Each of these numbers is
	one bit, so the only possible "digits" are 1, 2, 4, or 8.
	Therefore, there are 16 possible 2-"digit" numbers:  11,
	12, 14, 18, 21, 22, 24, 28, 41, 42, 44, 48, 81, 82, 84, 88.
	I believe each one of these represents a single hex digit,
	though that's the part I haven't been able to establish yet.

I have been working with the interpretation, as I suggested in my
.../2005-October/000048.html post, that each dot represents two bits
in a one-of-four code.  This still leaves many possible ways of
associating the bits to produce meaningful data.  One of the first
questions, given this interpretation, is whether the two halves of
a row (labeled A and B in my template) stand alone, or are associated
in some way.  Note that if the four eight-column rows on the A side
or the B side are taken as a block, they represent a single 8-bit byte
with its own parity word.  Temptingly tidy.  It is somehow less compelling
to take the A and B sides of a single row together as four bits of a
decoded word.  A related observation is that, in the Toshiba and Kyocera
data, block 3B seems to have properties that suggest assigning it a
terminal position.  However, none of the HP data show this characteristic,
so this may be a red herring.

Blocks 0A and 0B clearly have some special properties.  First, within
each data set, 0A and 0B are identical.  Second, in each of these sets,
each column pair is used exactly once.  With these rules, there are
only 24 possible patterns for these blocks.  This seems a bit scant for
something like a manufacturer's code, but note that all three HP datasets
included here are identical in the 0 blocks, and the Toshiba and Kyocera
are each unique, so a manufacturer's code is a reasonable conjecture.
But why then the "each column exactly once" rule, and why duplicate
the block?
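
As a quick check on the count of 24, here is a minimal Python sketch.  It
assumes a block is four rows with one dot per row in one of four column
pairs; the list-of-column-indices representation is my own convention for
illustration, not anything taken from the printers.

```python
from itertools import permutations

COLUMN_PAIRS = 4  # column pairs per block, as drawn in the template
ROWS = 4          # rows per block

# A pattern is a tuple: pattern[row] = index of the column pair holding
# that row's dot.  "Each column pair exactly once" makes it a permutation.
constrained = list(permutations(range(COLUMN_PAIRS), ROWS))

# Without the rule, every row could independently pick any column pair.
unconstrained = COLUMN_PAIRS ** ROWS

print(len(constrained), "patterns with the rule,", unconstrained, "without")
```

So the rule gives up 256 possible values in exchange for 24, which is part
of why it looks deliberate rather than accidental.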

Backing up to the interpretation of blocks 1-3, if we assume 1-of-4
coding, the question immediately occurs, which column is assigned
which value?  (Notice that I didn't label the columns in my template,
to avoid prejudicing the issue.)  The obvious choices would be 0-3
or 3-0 left-to-right, but so far I don't know of anything to support
either one in particular.
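
To make the two obvious assignments concrete, here is a small Python
sketch.  The block contents are made up, and treating row 0 as the
most significant dibit is purely an assumption on my part; the row order
is as open a question as the column order.

```python
def decode_block(dot_columns, column_values):
    """Pack four dibits into one 8-bit value.

    dot_columns[row] is the column pair (0-3, left to right) holding the
    dot in that row.  Row 0 is taken as the most significant dibit, which
    is an assumption, not an established fact.
    """
    value = 0
    for col in dot_columns:
        value = (value << 2) | column_values[col]
    return value

LEFT_TO_RIGHT = [0, 1, 2, 3]  # columns valued 0-3
RIGHT_TO_LEFT = [3, 2, 1, 0]  # columns valued 3-0

example_block = [2, 0, 3, 1]  # hypothetical data, not from any scan

a = decode_block(example_block, LEFT_TO_RIGHT)
b = decode_block(example_block, RIGHT_TO_LEFT)
print(f"0-3: {a:#04x}   3-0: {b:#04x}")
```

One thing this makes plain: since 3 - v is the two-bit complement of v,
the two assignments always decode a block to bitwise complements of each
other.  Choosing between them amounts to choosing a polarity.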

I've been toying with another idea, but so far haven't found a lot to
support it either, and several reasons to doubt it.  Nevertheless, I'll
toss it into the ring as yet another interesting idea, which I call the
"Column Swizzling" conjecture.

Suppose that instead of a fixed column-to-value assignment, there is
some reason to want to reassign the column values on an instance-by-
instance basis.  This might be to improve visual dispersion of dot
patterns, or something to do with automated recovery of the data that
I haven't thought of yet.  For example, remapping could break up the
diagonal run of dots in block 2A of HP2600N_2 (serial number CNBC55L0QJ).

This could be done if the 0 blocks are in fact maps for the interpretation
of the rest of the pattern.  Here's how it might work.  Row 0 of block
0 (A or B) contains one dot.  That dot would indicate the column for
the 00 dibit.  Row 1 would indicate the column of the 01 dibit, etc.
This is consistent with the above observed rule for the 0 blocks that
each column is used exactly once.  (Of course, the numbering of the
rows could be reversed or swizzled as well.)
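
The mapping described above can be sketched in Python as follows.  This is
only an illustration of the conjecture: the function names and the sample
dot positions are hypothetical, and row 0 is again assumed to be the most
significant dibit.

```python
def column_map_from_block0(block0):
    """Build the column-to-dibit table implied by a 0 block.

    block0[row] is the column pair holding the dot in that row.  Under
    the conjecture, row r marks the column that carries dibit value r.
    """
    if sorted(block0) != [0, 1, 2, 3]:
        raise ValueError("block 0 must use each column pair exactly once")
    column_values = [0] * 4
    for dibit, col in enumerate(block0):
        column_values[col] = dibit
    return column_values

def decode_swizzled(dot_columns, block0):
    """Decode a data block using the map taken from its 0 block."""
    column_values = column_map_from_block0(block0)
    value = 0
    for col in dot_columns:  # row 0 first (assumed most significant)
        value = (value << 2) | column_values[col]
    return value

block0 = [2, 0, 3, 1]      # hypothetical 0 block
data_block = [0, 1, 2, 3]  # hypothetical data block (a diagonal run)

print(column_map_from_block0(block0))
print(hex(decode_swizzled(data_block, block0)))
```

Reversing or swizzling the row numbering would only change the order in
which enumerate() hands out the dibit values.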

One observation that set me on this track was regarding the 3B block of
the Toshiba and Kyocera data.  I commented in my previous post on the
fact that in both cases the 3B block contains the same column bit in
all four rows, but the column differs between the two files.  However,
it is striking that in each case the column is that which is marked by
row 0 of block 0B.  So if this dibit mapping is taking place, both 3B
blocks would decode to the same value: four 00 dibits, i.e. zero.
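
This holds for any map, not just the two observed ones.  A minimal Python
check, assuming the row-0-marks-the-00-dibit convention and using no
actual scan data:

```python
from itertools import permutations

def decode_with_map(block, block0):
    """Decode a block, taking row r of block0 as the column for dibit r."""
    values = {col: dibit for dibit, col in enumerate(block0)}
    out = 0
    for col in block:
        out = (out << 2) | values[col]
    return out

# For each of the 24 possible 0 blocks, build a block whose four dots all
# sit in the column marked by row 0 of the 0 block, and decode it.
results = set()
for block0 in permutations(range(4)):
    same_column_block = [block0[0]] * 4
    results.add(decode_with_map(same_column_block, block0))

print(results)
```

Every such pair decodes to zero, so two printers whose 3B columns differ
would still agree once their own maps are applied.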

Unfortunately, the HP data doesn't support this theory.  It doesn't
disprove the swizzling idea, but it contradicts the idea of 3B as a
fixed pattern terminal block, which was one of the things propping up
the swizzling idea.  It would seem quite wasteful to throw away a whole
block as some sort of terminal marker, so in one sense I don't feel
bad about letting that part go.  On the other hand, it might make
sense that, in both the Kyocera and Toshiba data, block 3B represents
unused high-order bits that are all set to zero.

Another problem with the theory of the 0 blocks as maps is that there's
no obvious way to justify the fact that there are two identical ones.
It could be that the code designers wanted to provide the capability of
independently swizzling the left and right sides of the pattern (which
would in fact be desirable if the goal is to improve dispersion of the
dots), but then decided that it was unnecessary in practice.  In fact,
it could be that the refinement of optimizing the mapping was never taken
up, and the 0 blocks are not so much manufacturer's codes as arbitrary
patterns, where each manufacturer picked a column map and stuck with it.
I'm not quite ready to believe this without some supporting evidence,
though.

By the way, another interesting observation on the HP data is that block
2A is identical for both 2600N files and different from block 2A in the
2500 file.  The same holds true for block 2B.  Perhaps HP uses these bytes
as a model code?

A fundamental problem I see in this entire endeavor is that the same
coding scheme is used by several different manufacturers, but different
manufacturers tend to use different formats for their product serial
numbers.  For example, it is not clear how to reconcile the serial
numbers CNBC55L0QJ, CNBC573058, and BD011291 in a single coding scheme.

Thus it may be the case that there is no direct correlation between the
coded data on the page and the machine serial number.  Instead, there may
be a translation table that is hidden from us.  Each machine would still
have a unique identifier, but there might be no correlation between this
identifier and the product serial number.  This could easily come about if
the marking is implemented at a very low level by a chip that is common
to all products using this code.  That chip would contain a unique ID,
in silicon, but each manufacturer would be responsible for maintaining
records that associate that unique ID with a given product serial number.
This situation is common with other schemes that rely on unique IDs buried
in silicon.  In such schemes it is also common, though certainly not
universal, for each manufacturer to be assigned a dedicated range of IDs.

Oh well.  So little data, so much speculation.  Bits!  Must have more BITS!

Michael Sleator
sleat at sleator.com


