Not a bug

JPEG 2000 Error

nxmehta 10 years ago, updated by Tom 10 years ago
I see the following error in my Ubooquity logs when scanning certain PDFs:

java.lang.RuntimeException: JPeg 2000 Images needs the VM parameter -Dorg.jpedal.jai=true switch turned on
at org.jpedal.parser.PdfStreamDecoder.decodeStreamIntoObjects(Unknown Source) ~[jpedal_lgpl.jar.1278601029713114070.tmp:4.92b23]
at org.jpedal.parser.PdfStreamDecoder.decodePageContent(Unknown Source) ~[jpedal_lgpl.jar.1278601029713114070.tmp:4.92b23]
at org.jpedal.PDFtoImageConvertor.convert(Unknown Source) ~[jpedal_lgpl.jar.1278601029713114070.tmp:4.92b23]
at org.jpedal.PdfDecoder.getPageAsImage(Unknown Source) ~[jpedal_lgpl.jar.1278601029713114070.tmp:4.92b23]
at org.jpedal.PdfDecoder.getPageAsImage(Unknown Source) ~[jpedal_lgpl.jar.1278601029713114070.tmp:4.92b23]
at com.ubooquity.fileformat.pdf.c.a(SourceFile:62) ~[Ubooquity.jar:1.6.0]
at com.ubooquity.fileformat.pdf.c.a(SourceFile:48) ~[Ubooquity.jar:1.6.0]
at com.ubooquity.data.feeder.a.b(SourceFile:359) [Ubooquity.jar:1.6.0]
at com.ubooquity.data.feeder.a.a(SourceFile:202) [Ubooquity.jar:1.6.0]
at com.ubooquity.data.feeder.a.b(SourceFile:302) [Ubooquity.jar:1.6.0]
at com.ubooquity.data.feeder.a.a(SourceFile:34) [Ubooquity.jar:1.6.0]
at com.ubooquity.data.feeder.a$1.run(SourceFile:112) [Ubooquity.jar:1.6.0]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]

I passed the requested parameter to Ubooquity like so:

java -Dorg.jpedal.jai=true -jar Ubooquity.jar

But that didn't seem to help.  Is it possible to fix this issue in Ubooquity itself?
Under review
The error comes from images in your PDF files that are encoded in the JPEG2000 format.

JPEG2000 is an efficient but very complex image compression format that never really took off. Because it is complex and rarely used, few applications and libraries support it well.
In your case, Ubooquity decodes the PDF file with the JPedal PDF library. JPedal does not support JPEG2000 out of the box and relies on an external library for it: Java Advanced Imaging (JAI). The problem is that JAI is an old library that is no longer maintained (abandoned since 2006).

So in theory, depending on the architecture you're running Ubooquity on, you might be able to download the last JAI package, install it and add the JAI library to your classpath when running Ubooquity. But I doubt it will succeed.

Bottom line: Ubooquity does not support JPEG2000, sorry.
Hey Tom, thanks for the explanation. I was actually able to get JPEG2000 support to work with your advice! I added the jai_imageio.jar file to Ubooquity.jar, and with the -Dorg.jpedal.jai=true parameter, JPEG2000 images inside PDFs are read perfectly.
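
For anyone trying the same thing, a quick way to see whether the JVM picked up the flag and a JPEG 2000 reader is a small standalone check like the one below. This is only a sketch: the class name Jpeg2000Check is made up for illustration, it assumes the JAI Image I/O plugin registers itself with the standard javax.imageio registry under the format name "jpeg2000", and it says nothing about how JPedal itself looks the reader up.

```java
import javax.imageio.ImageIO;

// Hypothetical helper, not part of Ubooquity or JPedal.
final class Jpeg2000Check {

    // True if some ImageIO reader plugin visible to this JVM handles the format name.
    static boolean hasReader(String formatName) {
        return ImageIO.getImageReadersByFormatName(formatName).hasNext();
    }

    // True if the JVM was started with -Dorg.jpedal.jai=true.
    static boolean jaiFlagEnabled() {
        return Boolean.getBoolean("org.jpedal.jai");
    }

    public static void main(String[] args) {
        System.out.println("-Dorg.jpedal.jai=true set: " + jaiFlagEnabled());
        // Assumption: "jpeg2000" is the name the plugin registers.
        System.out.println("JPEG 2000 reader found: " + hasReader("jpeg2000"));
    }
}
```

Run it with the same -D flag and with jai_imageio.jar on the classpath; if the reader line prints false, the jar was probably not picked up.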

I was also seeing another funny error in my logs:

java.lang.RuntimeException: This PDF file is encrypted and JPedal needs an additional library to
decode on the classpath (we recommend bouncycastle library).
There is additional explanation at http://www.idrsolutions.com/additional-jars
at org.jpedal.io.PdfFileReader.setupDecryption(Unknown Source) ~[jpedal_lgpl.jar.7484534240993843609.tmp:4.92b23]
at org.jpedal.io.PdfFileReader.readLegacyReferenceTable(Unknown Source) ~[jpedal_lgpl.jar.7484534240993843609.tmp:4.92b23]
at org.jpedal.io.PdfFileReader.readReferenceTable(Unknown Source) ~[jpedal_lgpl.jar.7484534240993843609.tmp:4.92b23]
at org.jpedal.PdfDecoder.openPdfFile(Unknown Source) ~[jpedal_lgpl.jar.7484534240993843609.tmp:4.92b23]
at org.jpedal.PdfDecoder.openPdfFile(Unknown Source) ~[jpedal_lgpl.jar.7484534240993843609.tmp:4.92b23]
at com.ubooquity.fileformat.pdf.c.a(SourceFile:37) ~[Ubooquity.jar:1.6.0]
at com.ubooquity.data.feeder.a.b(SourceFile:359) [Ubooquity.jar:1.6.0]
at com.ubooquity.data.feeder.a.a(SourceFile:202) [Ubooquity.jar:1.6.0]
at com.ubooquity.data.feeder.a.b(SourceFile:302) [Ubooquity.jar:1.6.0]
at com.ubooquity.data.feeder.a.a(SourceFile:34) [Ubooquity.jar:1.6.0]
at com.ubooquity.data.feeder.a$1.run(SourceFile:112) [Ubooquity.jar:1.6.0]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]

I don't have any password-protected PDFs, but it does look like there is some encryption in my PDFs. In any case, I did the same thing with the bouncycastle library as I did with JAI: I added bcprov.jar to Ubooquity.jar, and voila, my encrypted PDFs were read transparently by Ubooquity.
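
A similar sanity check works for Bouncy Castle. The sketch below only tests whether a class can be found by the class loader; the helper name BouncyCastleCheck is hypothetical, and it assumes bcprov.jar exposes the usual provider class org.bouncycastle.jce.provider.BouncyCastleProvider.

```java
// Hypothetical helper, not part of Ubooquity or JPedal.
final class BouncyCastleCheck {

    // True if the named class can be located without initializing it.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, BouncyCastleCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: this is the provider class shipped in bcprov.jar.
        String provider = "org.bouncycastle.jce.provider.BouncyCastleProvider";
        System.out.println("Bouncy Castle found: " + isOnClasspath(provider));
    }
}
```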

You might want to consider adding these two jars (and the -D flag) to Ubooquity, as it seems like it would give Ubooquity additional features right out of the box. I think pretty much all the crazy PDFs I throw at Ubooquity now work (I'll let you know if I find any other weird things).
I stand corrected. :)
Thanks for the information and the tests you have done!

I am a little bit reluctant to add almost 5 MB of additional jars to the Ubooquity binary for what I think are quite specific cases, but I'll add an entry in the FAQ so that other people can easily solve this problem if they encounter it.

I changed my mind and included both libraries (JAI and Bouncy Castle) in Ubooquity. :)