Planned

Java error during scanning

Elouan 8 years ago updated by Anthony Reinking 8 years ago 7

Every now and then during scanning, generally when my NAS is otherwise busy, I get a "Java heap space" exception. The error shows up in the log, like the example at the bottom of this post.

Once the error has happened, Ubooquity hangs: it's not killed (it still shows in the process list), but it's not responsive either. In my view, having a running process that is really dead is a problem.

When that happens, all I have to do is stop and restart Ubooquity, and the scan can generally continue without problems. So the issue is temporary. I also suspect memory leaks in the code, since this has only ever happened when the number of comics to scan was above a few thousand.

To avoid this situation, I propose some improvements to the scanning process:

  • catch errors when scanning a book, and show the user a more relevant message (for example, the name of the file and the item that was being scanned at the time of the error: cover image, title, metadata...)
  • when a memory error occurs, shut down the Ubooquity process. In my case, since I've added "nohup" commands to the startup script, upstart will restart Ubooquity and I would have a functioning server instead of a zombie process (I've also added to the "nohup" commands a limit on the number of restarts, so that if Ubooquity crashes 3 times in a row, upstart will not restart it anymore)
  • Optional: add the offending file to a "black list" so Ubooquity doesn't try to rescan it next time... and of course allow the admin to reset the black list
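The first and third proposals above could look roughly like the sketch below. It is only an illustration: `GuardedScanner`, `BookReader` and every other name here are hypothetical and not taken from Ubooquity's code. It also assumes that the memory used for one book becomes unreachable once the error propagates out of the reader, which is often but not always the case after an `OutOfMemoryError`.

```java
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these names come from Ubooquity itself.
final class GuardedScanner {

    /** Hypothetical stand-in for whatever extracts the cover and metadata of one book. */
    interface BookReader {
        void read(Path book) throws Exception;
    }

    private final Set<Path> blacklist = new LinkedHashSet<>();
    private final List<String> messages = new ArrayList<>();

    /** Scans each book in turn; a failure on one file does not stop the whole scan. */
    void scan(List<Path> books, BookReader reader) {
        for (Path book : books) {
            if (blacklist.contains(book)) {
                messages.add("Skipped blacklisted file: " + book);
                continue;
            }
            try {
                reader.read(book);
            } catch (OutOfMemoryError e) {
                // Remember the file so the next scan does not retry it,
                // and report which file triggered the error.
                blacklist.add(book);
                messages.add("Out of memory while scanning " + book + ": " + e.getMessage());
            } catch (Exception e) {
                messages.add("Failed to scan " + book + ": " + e);
            }
        }
    }

    Set<Path> blacklist() { return blacklist; }

    List<String> messages() { return messages; }

    /** Lets an administrator clear the black list. */
    void resetBlacklist() { blacklist.clear(); }
}
```

Catching `OutOfMemoryError` per file is a trade-off: it keeps the scanner thread alive, but if the heap is exhausted for another reason (a real leak, for example), every following file will fail too, and shutting down is then the safer choice.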

ERROR com.ubooquity.Ubooquity - Uncaught exception on thread: Scanner thread
java.lang.OutOfMemoryError: Java heap space
   at java.awt.image.DataBufferByte.<init>(DataBufferByte.java:92) ~[na:1.8.0_101]
   at java.awt.image.ComponentSampleModel.createDataBuffer(ComponentSampleModel.java:445) ~[na:1.8.0_101]
   at java.awt.image.Raster.createWritableRaster(Raster.java:941) ~[na:1.8.0_101]
   at javax.imageio.ImageTypeSpecifier.createBufferedImage(ImageTypeSpecifier.java:1074) ~[na:1.8.0_101]
   at javax.imageio.ImageReader.getDestination(ImageReader.java:2892) ~[na:1.8.0_101]
   at com.sun.imageio.plugins.jpeg.JPEGImageReader.readInternal(JPEGImageReader.java:1071) ~[na:1.8.0_101]
   at com.sun.imageio.plugins.jpeg.JPEGImageReader.read(JPEGImageReader.java:1039) ~[na:1.8.0_101]
   at com.twelvemonkeys.imageio.plugins.jpeg.JPEGImageReader.read(Unknown Source) ~[imageio-jpeg-3.1.0.jar.2469300273615590271.tmp:3.1.0]
   at javax.imageio.ImageIO.read(ImageIO.java:1448) ~[na:1.8.0_101]
   at javax.imageio.ImageIO.read(ImageIO.java:1352) ~[na:1.8.0_101]
   at com.ubooquity.fileformat.cbz.a.a(SourceFile:87) ~[Ubooquity.jar:1.10.1]
   at com.ubooquity.f.a.a(SourceFile:41) ~[Ubooquity.jar:1.10.1]
   at com.ubooquity.data.feeder.b.a(SourceFile:63) ~[Ubooquity.jar:1.10.1]
   at com.ubooquity.data.feeder.a.b(SourceFile:531) ~[Ubooquity.jar:1.10.1]
   at com.ubooquity.data.feeder.a.c(SourceFile:470) ~[Ubooquity.jar:1.10.1]
   at com.ubooquity.data.feeder.a.b(SourceFile:35) ~[Ubooquity.jar:1.10.1]
   at com.ubooquity.data.feeder.a$1.run(SourceFile:123) ~[Ubooquity.jar:1.10.1]
   at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_101]

Going back to this "Java heap space" error.

The same situation sometimes happens when reading a comic in the web reader. It's generally because some scans are insanely large (more than 1 MB per image... I should probably reprocess them, but that's too much work).

When it runs into a Java heap space error, Ubooquity hangs... but it's not 100% dead: for example, I can still refresh the log pages (even though the reader is stuck because of this error).

So I believe you should definitely catch that error in both situations:
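A file size around 1 MB says little about the memory needed to decode the image: the stack trace above fails while allocating the uncompressed pixel buffer (`DataBufferByte`), whose size depends on the pixel dimensions. The sketch below estimates that size without decoding the image, using the standard `javax.imageio` API. The class name `ImageMemoryEstimate` and the default of 3 bytes per pixel (an RGB image without alpha) are assumptions made for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

// Hypothetical helper, not part of Ubooquity.
final class ImageMemoryEstimate {

    /** Size in bytes of an uncompressed pixel buffer with the given dimensions. */
    static long estimateDecodedBytes(int width, int height, int bytesPerPixel) {
        return (long) width * height * bytesPerPixel;
    }

    /** Reads only the image header to estimate the decoded size (assumes 3 bytes per pixel). */
    static long estimateDecodedBytes(File image) throws IOException {
        try (ImageInputStream in = ImageIO.createImageInputStream(image)) {
            if (in == null) {
                throw new IOException("Cannot open " + image);
            }
            Iterator<ImageReader> readers = ImageIO.getImageReaders(in);
            if (!readers.hasNext()) {
                throw new IOException("No image reader available for " + image);
            }
            ImageReader reader = readers.next();
            try {
                reader.setInput(in, true, true);
                // getWidth/getHeight read the header only; no pixel data is decoded.
                return estimateDecodedBytes(reader.getWidth(0), reader.getHeight(0), 3);
            } finally {
                reader.dispose();
            }
        }
    }
}
```

For example, a 4000 x 6000 pixel scan needs about 72 MB for its pixel buffer under this assumption, regardless of how small the JPEG file is. Comparing such an estimate with `Runtime.getRuntime().maxMemory()` would be one way to skip or downscale a page before it exhausts the heap.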

  • during a scan
  • when reading a book

When that error occurs, show a message to the user, and either kill Ubooquity entirely or try to recover (although I don't know how you can force Java to release all memory).
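The "kill Ubooquity entirely" option can be sketched with a default uncaught exception handler, which is standard Java. The class name `FailFast` and the exit code 3 are arbitrary choices for this illustration; nothing here reflects how Ubooquity is actually written.

```java
// Hypothetical sketch, not Ubooquity code.
final class FailFast {

    /** Returns true if the throwable, or any of its causes, is an OutOfMemoryError. */
    static boolean isOutOfMemory(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof OutOfMemoryError) {
                return true;
            }
        }
        return false;
    }

    /** Installs a handler that logs any uncaught error and stops the JVM on memory errors. */
    static void install() {
        Thread.setDefaultUncaughtExceptionHandler((thread, error) -> {
            System.err.println("Uncaught exception on thread: " + thread.getName());
            error.printStackTrace();
            if (isOutOfMemory(error)) {
                // halt() skips shutdown hooks, which might themselves need memory to run.
                Runtime.getRuntime().halt(3);
            }
        });
    }
}
```

Calling `FailFast.install()` once at startup would make the process exit instead of lingering, so a supervisor such as upstart can restart it. HotSpot builds from Java 8u92 onward also accept the `-XX:+ExitOnOutOfMemoryError` option, which should give a similar result without code changes; whether it suits a given setup would need to be checked.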


Hello!


I'm a new user of Ubooquity and I have the same error as Elouan.

I have some big PDF files that block the scan process. The scan starts to hang on files of 100 MB and more. I can process some files of 65 MB without errors.

The Ubooquity server keeps running well even when the scan process is blocked, but all files and folders after the big PDF are not scanned.

So the only way out for me is to add a regular expression to exclude them.


Here is the log message I have:

0161130 18:48:45 [Scanner thread] ERROR com.ubooquity.Ubooquity - Uncaught exception on thread: Scanner thread
java.lang.OutOfMemoryError: Java heap space
at org.apache.pdfbox.io.ScratchFileBuffer.addPage(ScratchFileBuffer.java:132) ~[pdfbox-2.0.0.jar.4074501298508754503.tmp:2.0.0]
at org.apache.pdfbox.io.ScratchFileBuffer.ensureAvailableBytesInPage(ScratchFileBuffer.java:184) ~[pdfbox-2.0.0.jar.4074501298508754503.tmp:2.0.0]
at org.apache.pdfbox.io.ScratchFileBuffer.write(ScratchFileBuffer.java:203) ~[pdfbox-2.0.0.jar.4074501298508754503.tmp:2.0.0]
at org.apache.pdfbox.io.RandomAccessOutputStream.write(RandomAccessOutputStream.java:58) ~[pdfbox-2.0.0.jar.4074501298508754503.tmp:2.0.0]
at java.io.FilterOutputStream.write(Unknown Source) ~[na:1.8.0_111]
at java.io.FilterOutputStream.write(Unknown Source) ~[na:1.8.0_111]
at org.apache.pdfbox.pdfparser.COSParser.readValidStream(COSParser.java:1128) ~[pdfbox-2.0.0.jar.4074501298508754503.tmp:2.0.0]
at org.apache.pdfbox.pdfparser.COSParser.parseCOSStream(COSParser.java:952) ~[pdfbox-2.0.0.jar.4074501298508754503.tmp:2.0.0]
at org.apache.pdfbox.pdfparser.COSParser.parseFileObject(COSParser.java:760) ~[pdfbox-2.0.0.jar.4074501298508754503.tmp:2.0.0]
at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:721) ~[pdfbox-2.0.0.jar.4074501298508754503.tmp:2.0.0]
at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:652) ~[pdfbox-2.0.0.jar.4074501298508754503.tmp:2.0.0]
at org.apache.pdfbox.pdfparser.COSParser.parseDictObjects(COSParser.java:612) ~[pdfbox-2.0.0.jar.4074501298508754503.tmp:2.0.0]
at org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:215) ~[pdfbox-2.0.0.jar.4074501298508754503.tmp:2.0.0]
at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:249) ~[pdfbox-2.0.0.jar.4074501298508754503.tmp:2.0.0]
at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:847) ~[pdfbox-2.0.0.jar.4074501298508754503.tmp:2.0.0]
at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:803) ~[pdfbox-2.0.0.jar.4074501298508754503.tmp:2.0.0]
at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:757) ~[pdfbox-2.0.0.jar.4074501298508754503.tmp:2.0.0]
at com.ubooquity.fileformat.pdf.b.a(SourceFile:34) ~[Ubooquity.jar:1.10.1]
at com.ubooquity.b.c.a(SourceFile:58) ~[Ubooquity.jar:1.10.1]
at com.ubooquity.data.feeder.a.b(SourceFile:512) ~[Ubooquity.jar:1.10.1]
at com.ubooquity.data.feeder.a.c(SourceFile:470) ~[Ubooquity.jar:1.10.1]
at com.ubooquity.data.feeder.a.b(SourceFile:35) ~[Ubooquity.jar:1.10.1]
at com.ubooquity.data.feeder.a$1.run(SourceFile:123) ~[Ubooquity.jar:1.10.1]
at java.lang.Thread.run(Unknown Source) ~[na:1.8.0_111]

Hope this helps to resolve the issue!

I can send you a pdf file if you want to try to reproduce it.

Thanks.

I'm having the same Java crash as you with large PDFs.

PDF causes problems: I've converted all my comics to CBZ to avoid them.

Thanks for the comment; however, ALL of my downloads and thousands of magazines are in PDF format. None of them are comic books either... Is there any forward momentum on fixing this?

Planned

PDF has been a painful topic since the beginning.

There is no "perfect" free library to render PDF in Java yet (although PdfBox is making progress).

So what I'll do in a future version is allow users to plug external PDF renderers (MuPDF, Poppler...) into Ubooquity.

This should solve most of the PDF-related problems, as they are usually more efficient than PdfBox.


In the meantime, the only workaround (which might not work in all cases) is to try to allocate more memory to Ubooquity (see the FAQ).

Thanks Tom! Hope you had a Merry Christmas.