I know Calibre can remove DRM, but it seems that Calibre does not remove things like watermarks, references to the buyer by name, etc. Maybe I can try to find those manually, but that is an error-prone process. Plus, what if they embed a unique digital signature that ties back to me? I understand that this is a very uncommon practice, but I do not want to find myself in a bad place.
I suppose the only way to remove a digital signature of any sort is to have two different people buy the same e-book, diff the two copies, and remove anything that differentiates them.
Is there any tool that does this or automates the process? Am I being too paranoid, and this is not a real threat?
Have a look at “snowdrop” (search for it together with “steganography”). It's basically the opposite of what you want, but worth mentioning here. Watermarks could be placed into whitespace, and not only actual spaces or line breaks: intentionally changed paragraph breaks, tabs, or even page boundaries could possibly be detected after scanning, and even after OCR. IMHO snowdrop uses, depending on the chosen operation mode, small errors like misspelled words, commas, etc., but it also has a mode that produces fine grammar without misspelled words…
How do you make sure that by diffing two versions you cover everything that has been deliberately placed into both documents but carries literally the same information in each?
Let's say you bought two books at two different stores with two different watermarks. If the watermark contains the date and time of the purchase, and the only difference is the minutes because you bought them within the same hour, the remaining watermark would point to all buyers who bought exactly this book in that hour, worldwide. That could still be “very” precise, depending on all the other(!) buyers, if there are any at all within that timeframe. What if the watermark includes the Unix epoch? Then the part that is the same in both watermarks would not be bounded by hours, but by seconds, 10 seconds, 100 seconds, etc.
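To put a number on the Unix epoch case, here is a minimal sketch. It assumes, purely for illustration, that the watermark stores the purchase time as a decimal epoch string and that a diff keeps the leading digits both copies share. The function names and the sample timestamps are made up.

```python
def shared_prefix(a: str, b: str) -> str:
    """Return the leading characters that both strings have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return a[:n]


def surviving_window_seconds(epoch_a: int, epoch_b: int) -> int:
    """Size of the time window still pinned down by the digits a diff keeps.

    Assumption: both timestamps are embedded as decimal text of equal length.
    """
    a, b = str(epoch_a), str(epoch_b)
    if len(a) != len(b):
        raise ValueError("timestamps must have the same number of digits")
    kept = shared_prefix(a, b)
    # Every digit removed widens the window by a factor of ten.
    return 10 ** (len(a) - len(kept))


first_purchase = 1700000000                # hypothetical purchase time
second_purchase = first_purchase + 1234    # about 20 minutes later

print(shared_prefix(str(first_purchase), str(second_purchase)))
print(surviving_window_seconds(first_purchase, second_purchase))
```

With these two made-up purchases, the shared digits are `170000`, so what is left after diffing still narrows the purchase down to a window of 10,000 seconds, a bit under three hours.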
And you could not know whether there were other watermarks hidden that just happened to be the same for your two (or three?) purchases (same country, continent, payment method, credit card holder name, name of the internet provider used during purchase, browser used, etc.). It depends entirely on the creator of the watermark what is included and what is not. If you happen to know all of that (without any possible exceptions), you might be on the safe side, but if not…
My general suggestion here is:
Just to mention… the “safe” side sometimes seems limited, but maybe it actually is not, if you really look at it.
Diffing should reveal any differences, even white space. I suppose with white space it may be harder to fix, as you have to figure out the neutral state. But it is still possible.
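As a small illustration of the whitespace point, here is a sketch using Python's standard `difflib`. The two sample strings are invented stand-ins for the same passage from two copies, and `show_differences` is just a name I picked. It only makes the differences visible; it says nothing about which variant is the neutral one.

```python
import difflib


def show_differences(copy_a: str, copy_b: str) -> list[str]:
    """List differing lines, using repr() so invisible characters show up."""
    diff = difflib.ndiff(
        copy_a.splitlines(keepends=True),
        copy_b.splitlines(keepends=True),
    )
    out = []
    for line in diff:
        # "- " marks lines only in copy_a, "+ " lines only in copy_b;
        # ndiff's "? " hint lines and unchanged lines are skipped.
        if line.startswith(("- ", "+ ")):
            out.append(line[:2] + repr(line[2:]))
    return out


copy_a = "It was a dark night.\nThe end.\n"
copy_b = "It was a dark  night.\nThe end.\t\n"  # extra space, trailing tab

for entry in show_differences(copy_a, copy_b):
    print(entry)
```

The doubled space and the trailing tab both show up once the lines are printed with `repr()`, even though the two passages look identical on screen.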
Regarding the timestamp, I actually did think of this, and you’re right. It would work especially well for a small online bookstore. I believe the two books just have to be bought at very different times, and ideally differ in other respects too, such as buyers with different last names and even different general billing locations.
Regarding your other points… You make good points, so I will consider them.
I have to admit that my point, ‘just don’t do it’, does not in reality guarantee that you avoid trouble. It is still possible to be sued for things someone else did.
Also, one suggestion to think about:
If the seller just sprays some random changes over the book for every copy sold, every sold copy would differ from every other sold copy. By blindly changing those parts to something else, you could reveal exactly which two or three copies you used for diffing.
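A toy simulation of why that is, under loud assumptions: each sold copy carries 128 random one-bit marks, the person diffing replaces every differing mark with a "neutral" value, and the seller keeps a record of every copy sold. All names and numbers are hypothetical. This is not how any particular store works.

```python
import random

NEUTRAL = 2  # stand-in for "changed to something else"


def make_copies(num_copies: int, num_marks: int, seed: int = 0) -> list[list[int]]:
    """Seller side: give every sold copy its own random pattern of 0/1 marks."""
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(num_marks)] for _ in range(num_copies)]


def blind_merge(a: list[int], b: list[int]) -> list[int]:
    """Buyer side: keep marks where both copies agree, neutralize the rest."""
    return [x if x == y else NEUTRAL for x, y in zip(a, b)]


def suspects(leak: list[int], copies: list[list[int]]) -> list[int]:
    """Seller side: which sold copies match every mark that survived in the leak?"""
    return [
        i for i, copy in enumerate(copies)
        if all(l == NEUTRAL or l == m for l, m in zip(leak, copy))
    ]


copies = make_copies(num_copies=1000, num_marks=128)
leak = blind_merge(copies[3], copies[7])  # diff of two hypothetical purchases
print(suspects(leak, copies))
```

The marks that the two copies happen to share survive the merge untouched, and an unrelated copy would have to match every one of them by chance. With about 64 surviving marks, that essentially never happens, so the seller's list comes back as just the two copies that were diffed.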
UPDATE: someone else here had the same thought a bit earlier…
My suggestion not to do it stays the same ;-)
It could be interesting to figure out how these things work and what could be done to prevent or circumvent such measures, but actually doing it seems risky no matter what.