Thanks for the rapid reply, Seth! I'm extremely interested in the
brochure produced by EFF - is there an online copy available? <br><br>A question: are there known printer watermarking schemes that involve permutations of the watermark across the printed sheet? In the case of Xerox DocuColor, it appears that there isn't any "room" in the watermark for such information, and it suggests to me that the same watermark is repeated across the sheet. Therefore, in the case of DocuColor printers, the logical-AND method proposed by Seth Schoen would be a trivial attack against the random-dot overlay generated by yellowdot. <br>
<br>However, a more sophisticated watermarking technique (checksumming the row and column, or perhaps encryption) would require more sophisticated analysis to disambiguate yellowdot noise. For what it's worth, there is a density parameter in yellowdot that
roughly (though not directly) corresponds to how much noise appears in
the generated PNG. This parameter is currently hardcoded, but the next release will make this available as a command-line parameter.<br><br>Although the proposed attack methodology is completely straightforward, I am curious to know how time-consuming it is to employ. It sounds a little tedious to scan the page at an appropriate DPI, extract and align the "watermark tiles," then apply a logical-AND (e.g. with GIMP and multiple 50% opacity layers)... but to get a good sample of the "signal" should only require a few such tiles. <br>
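<br>To make the AND step concrete, here is a minimal sketch in Python. It assumes the tiles have already been scanned, aligned and thresholded into grids of booleans, which is probably the tedious part in practice. The function names, the 8x15 grid, and the values of n and p are all hypothetical choices for illustration; none of them come from yellowdot.<br><br>

```python
import random

# Hypothetical helpers for illustration; none of these names come from yellowdot.

def add_noise(tile, p, rng):
    """Overlay random noise: each empty cell gains a dot with probability p."""
    return [[cell or rng.random() < p for cell in row] for row in tile]

def and_tiles(tiles):
    """Keep only the cells that carry a dot in every tile (logical AND)."""
    rows, cols = len(tiles[0]), len(tiles[0][0])
    return [[all(t[r][c] for t in tiles) for c in range(cols)]
            for r in range(rows)]

def false_positive_rate(p, n):
    """Chance that one empty cell receives a noise dot in all n tiles."""
    return p ** n

rng = random.Random(2008)
# Made-up 8x15 tracking pattern; a real one would come from the scan.
pattern = [[rng.random() < 0.3 for _ in range(15)] for _ in range(8)]
n, p = 12, 0.2  # assumed tile count and noise density
tiles = [add_noise(pattern, p, rng) for _ in range(n)]
recovered = and_tiles(tiles)

# Count noise dots that survived the AND where the pattern has no dot.
stray = sum(rec and not pat
            for rec_row, pat_row in zip(recovered, pattern)
            for rec, pat in zip(rec_row, pat_row))
print("stray noise dots surviving the AND:", stray)
print("per-cell false positive rate:", false_positive_rate(p, n))
```

Every genuine tracking dot survives the AND by construction, while a stray noise dot survives only if it lands on the same cell in all n tiles. With the assumed p = 0.2 and n = 12, that is roughly 4e-9 per cell, which is why a handful of tiles should already give a clean signal.<br>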
<br>An attacker with multiple pages of output from a single printer could optimize the attack by selecting only the top-most, left-most watermark from each page, which might make extraction more efficient. To mitigate this risk, in the case of a multi-page document, the same yellowdot noise should be used for all pages. In fact, perhaps this suggests that the same yellowdot overlay should be used for all documents produced by a single printer, and never for any document produced by another printer.<br>
<br>I think Seth Schoen's challenge sounds like lots of fun. I don't have a laser printer, but I'm happy to find a local print shop for this purpose. However, if anyone else has faster access to a DocuColor printer, by all means take advantage of that access!<br>
<br>Finally, as a general question to the list: I have followed the printer watermark _decoding_ meme for several years now, but have there been other proactive attempts, aside from yellowdot, to _obfuscate_ printer watermarking?<br>
<br>-Ian<br><br><div class="gmail_quote">On Fri, Oct 24, 2008 at 1:02 AM, Seth David Schoen <span dir="ltr">&lt;<a href="mailto:schoen@eff.org">schoen@eff.org</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
I Miller writes:<br>
<br>
> This software will create a PNG containing yellow dots that can then be<br>
> overlaid onto existing printer output. These dots are randomly distributed<br>
> across the image, and are therefore not useful for securely obfuscating the<br>
> identity that is encoded within the printer's generated dot pattern.<br>
> However, ``yellowdot'' raises the amount of resources that must be expended<br>
> to successfully identify the printer that generated a given document. It is<br>
> hoped that ``yellowdot'' can be used to thwart amateur attempts to breach<br>
> privacy, such as could conceivably be practiced in an office environment.<br>
<br>
This is an interesting idea, but I'm concerned that a lot of people<br>
won't realize just how easy and automatic it could be to undo this.<br>
(EFF has also just produced a brochure in which I argue against this<br>
idea for the reasons below, in a lot less technical detail.)<br>
<br>
Suppose we're talking about DocuColor or Dell dots, which we understand<br>
well and where the repeating unit of the tracking pattern is relatively<br>
small. We probably don't need Fourier transforms or anything fancy to<br>
find the repeating periodic signal; if we know its period accurately<br>
enough we can take n identically-sized adjacent rectangular regions of<br>
the appropriate size, do a color separation and then invert the blue<br>
channel values, and then multiply (or logical-AND) to find yellow pixels<br>
that are present in each such region. Then a false positive added noise<br>
dot will only occur at a given location with probability p^n, where p is<br>
the probability that the added noise will turn on a dot at a particular<br>
randomly chosen location. If p is very high, the output image quality<br>
would be adversely affected, and anyway choosing even a moderately large<br>
n makes p^n pretty small. What's worse, the resulting image can probably<br>
be manually edited to remove false positive dots because the normal<br>
structure of a tracking pattern is predictable enough. (There's also<br>
the problem that even the grid alignment of the tracking pattern is<br>
probably extremely constrained, and of course the DocuColor and Dell<br>
tracking patterns have row and column parity, sufficient to correct a<br>
single bit error.)<br>
<br>
I'm not an image-processing expert, but I know that there are an enormous<br>
number of techniques and tools to remove noise from periodic signals,<br>
and to detect changes (or the lack of changes) in related images. For<br>
instance, I know of an astronomy project where students get time-lapse<br>
imagery from telescopes and then digitally subtract images from one<br>
another in order to remove the fixed stars and find asteroids. In a way,<br>
these obfuscation techniques are like taking a number of images of fixed<br>
stars (the forensic tracking dots) and adding asteroids (the noise dots)<br>
to them. If the students can distinguish the two in telescope images, I<br>
bet they could distinguish them in forensic images.<br>
<br>
I think it's valuable to understand how practical this attack is, so I'd<br>
be happy to try a challenge -- if someone wants to print a "yellowdot"<br>
document (mostly whitespace for simplicity) on a DocuColor or Dell, write<br>
down the printer serial number but don't tell me what it was, and then<br>
send it to me in the mail, I'll make an attempt to read the serial number<br>
and tell the list the results.<br>
<font color="#888888"><br>
--<br>
Seth Schoen<br>
Staff Technologist <a href="mailto:schoen@eff.org">schoen@eff.org</a><br>
Electronic Frontier Foundation <a href="http://www.eff.org/" target="_blank">http://www.eff.org/</a><br>
454 Shotwell Street, San Francisco, CA 94110 1 415 436 9333 x107<br>
</font></blockquote></div><br>