hpr2771 :: Embedding hidden text in Djvu files
Part 2 of Klaatu's Djvu mini series
Hosted by Klaatu on 2019-03-18. This episode is flagged as Clean and is released under a CC-BY-SA license.
pdf, ebook, bloat, djvu.
To embed text into a Djvu file, you must create a djvused script that gives the page and the bitmap location of each piece of text. A piece of text can be a character, word, line, paragraph, or region.
For good measure, you should first list the contents of your Djvu bundle:
$ djvused -e 'select; ls' test.djvu
1 P 177062 p0001.djvu
2 P 199144 p0002.djvu
3 P 12323 p0003.djvu
4 P 57059 p0004.djvu
5 P 96725 p0005.djvu
6 P 53868 p0006.djvu
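The listing gives component names and byte sizes, but not pixel dimensions, and hidden-text coordinates depend on those. Here is a minimal sketch for looking them up, assuming djvused's size command. The helper name page_size is made up for illustration and is not part of DjVuLibre.

```shell
# Hypothetical helper (not part of DjVuLibre): print the pixel
# dimensions of one component of a Djvu bundle.
# Arguments: $1 = Djvu file, $2 = component name as shown by 'ls'
page_size() {
  djvused -e "select \"$2\"; size" "$1"
}
```

Running page_size test.djvu p0004.djvu should report the width and height of page 4 in pixels.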
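Then define the location of text in a file called, for instance, content.dsed. Each entry gives a bounding box as xmin ymin xmax ymax, measured in pixels from the bottom-left corner of the page. Assume that my page is 1000 px by 1000 px: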
select; remove-ant; remove-txt
select "p0004.djvu" # page 4
set-txt
(page 0 0 1000 1000
  (word 100 600 450 800 "Hello" )
  (word 100 600 450 800 "world" ))
.
select "p0005.djvu"
set-txt
(page 0 0 1000 1000
  (line 100 400 900 600 "Hacker Puppy Radio"))
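Writing these blocks by hand gets tedious for more than a few words, so here is a rough sketch of generating one from shell. The helpers dsed_word and dsed_page are hypothetical names, not DjVuLibre tools, and the coordinates are sample values rather than measurements from a real page.

```shell
# Hypothetical helpers, not part of DjVuLibre.
# Caveat: the text is not escaped, so it must not contain
# double quotes or backslashes.

# Print one (word ...) entry.
# Arguments: xmin ymin xmax ymax text
dsed_word() {
  printf ' (word %d %d %d %d "%s")\n' "$1" "$2" "$3" "$4" "$5"
}

# Wrap the entries read from stdin in a select/set-txt page block,
# ending with the single period that terminates set-txt input.
# Arguments: component-name page-width page-height
dsed_page() {
  printf 'select "%s"\n' "$1"
  printf 'set-txt\n'
  printf '(page 0 0 %d %d\n' "$2" "$3"
  cat
  printf ')\n.\n'
}

# Build a block for page 4 and print it to stdout.
{
  dsed_word 100 600 450 800 Hello
  dsed_word 500 600 850 800 world
} | dsed_page p0004.djvu 1000 1000
```

Redirect the output to a file such as content.dsed to use it as a djvused script.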
Apply this script to your Djvu file. The -f option names the script, and -s saves the result back into test.djvu:
djvused -f ./content.dsed -s test.djvu
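To check that the text actually landed in the file, you can read it back. This is a minimal sketch, assuming djvused's print-txt command. The helper name show_hidden_text is made up for illustration.

```shell
# Hypothetical helper (not part of DjVuLibre): print the hidden text
# stored for one component of a Djvu bundle.
# Arguments: $1 = Djvu file, $2 = component name as shown by 'ls'
show_hidden_text() {
  djvused -e "select \"$2\"; print-txt" "$1"
}
```

Running show_hidden_text test.djvu p0004.djvu should print back the page block that the script set for page 4.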
Converting from PDF to Djvu
You can convert PDF files to Djvu with the djvudigital command. Because of a license incompatibility, it requires you to compile a Ghostscript plugin yourself, but it's an easy build. Get the gsdjvu code, and then follow its README instructions.
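Once you've built the Ghostscript driver, you can convert PDF to Djvu. The --words option extracts the text from the PDF and records the location of each word as hidden text: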
djvudigital --words foo.pdf foo.djvu