Your comments

Hello!


I'm a new user of Ubooquity and I have the same error as Elouan.

I have some big PDF files that block the scan process. The scan hangs on files of 100 MB and more, while files of 65 MB are processed without errors.

The Ubooquity server keeps running fine while the scan process is blocked, but none of the files and folders that come after the big PDF are scanned.

So the only solution for me is to add a regular expression to exclude them.
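In case it helps someone with the same problem, here is a rough sketch of how the oversized files could be listed and turned into a pattern. It is only an illustration under a few assumptions: the class and method names (BigPdfFinder, findBigPdfs, exclusionRegex) are made up for this example, the 100 MB threshold is just what I observed on my setup, and I am assuming the exclusion setting is matched against file names, which I have not verified.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Ubooquity.
public class BigPdfFinder {

    /** Returns the PDF files under root whose size is at least minBytes, sorted by path. */
    static List<Path> findBigPdfs(Path root, long minBytes) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().toLowerCase().endsWith(".pdf"))
                    .filter(p -> sizeOf(p) >= minBytes)
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    /** File size in bytes, or -1 if it cannot be read (such files are never reported). */
    static long sizeOf(Path p) {
        try {
            return Files.size(p);
        } catch (IOException e) {
            return -1L;
        }
    }

    /** Builds a regex alternation that matches exactly the given file names. */
    static String exclusionRegex(List<Path> files) {
        return files.stream()
                .map(p -> Pattern.quote(p.getFileName().toString()))
                .collect(Collectors.joining("|"));
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the library folder is passed as the first argument.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        long threshold = 100L * 1024 * 1024; // 100 MB, the size where my scans start to hang

        List<Path> big = findBigPdfs(root, threshold);
        big.forEach(p -> System.out.println(sizeOf(p) + "\t" + p));
        System.out.println(exclusionRegex(big));
    }
}
```

The names are wrapped with Pattern.quote so that dots and brackets in file names are matched literally. The printed pattern would still have to be adapted to whatever form the exclusion setting really expects.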


Here is the log message I get:

0161130 18:48:45 [Scanner thread] ERROR com.ubooquity.Ubooquity - Uncaught exception on thread: Scanner thread
java.lang.OutOfMemoryError: Java heap space
at org.apache.pdfbox.io.ScratchFileBuffer.addPage(ScratchFileBuffer.java:132) ~[pdfbox-2.0.0.jar.4074501298508754503.tmp:2.0.0]
at org.apache.pdfbox.io.ScratchFileBuffer.ensureAvailableBytesInPage(ScratchFileBuffer.java:184) ~[pdfbox-2.0.0.jar.4074501298508754503.tmp:2.0.0]
at org.apache.pdfbox.io.ScratchFileBuffer.write(ScratchFileBuffer.java:203) ~[pdfbox-2.0.0.jar.4074501298508754503.tmp:2.0.0]
at org.apache.pdfbox.io.RandomAccessOutputStream.write(RandomAccessOutputStream.java:58) ~[pdfbox-2.0.0.jar.4074501298508754503.tmp:2.0.0]
at java.io.FilterOutputStream.write(Unknown Source) ~[na:1.8.0_111]
at java.io.FilterOutputStream.write(Unknown Source) ~[na:1.8.0_111]
at org.apache.pdfbox.pdfparser.COSParser.readValidStream(COSParser.java:1128) ~[pdfbox-2.0.0.jar.4074501298508754503.tmp:2.0.0]
at org.apache.pdfbox.pdfparser.COSParser.parseCOSStream(COSParser.java:952) ~[pdfbox-2.0.0.jar.4074501298508754503.tmp:2.0.0]
at org.apache.pdfbox.pdfparser.COSParser.parseFileObject(COSParser.java:760) ~[pdfbox-2.0.0.jar.4074501298508754503.tmp:2.0.0]
at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:721) ~[pdfbox-2.0.0.jar.4074501298508754503.tmp:2.0.0]
at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:652) ~[pdfbox-2.0.0.jar.4074501298508754503.tmp:2.0.0]
at org.apache.pdfbox.pdfparser.COSParser.parseDictObjects(COSParser.java:612) ~[pdfbox-2.0.0.jar.4074501298508754503.tmp:2.0.0]
at org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:215) ~[pdfbox-2.0.0.jar.4074501298508754503.tmp:2.0.0]
at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:249) ~[pdfbox-2.0.0.jar.4074501298508754503.tmp:2.0.0]
at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:847) ~[pdfbox-2.0.0.jar.4074501298508754503.tmp:2.0.0]
at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:803) ~[pdfbox-2.0.0.jar.4074501298508754503.tmp:2.0.0]
at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:757) ~[pdfbox-2.0.0.jar.4074501298508754503.tmp:2.0.0]
at com.ubooquity.fileformat.pdf.b.a(SourceFile:34) ~[Ubooquity.jar:1.10.1]
at com.ubooquity.b.c.a(SourceFile:58) ~[Ubooquity.jar:1.10.1]
at com.ubooquity.data.feeder.a.b(SourceFile:512) ~[Ubooquity.jar:1.10.1]
at com.ubooquity.data.feeder.a.c(SourceFile:470) ~[Ubooquity.jar:1.10.1]
at com.ubooquity.data.feeder.a.b(SourceFile:35) ~[Ubooquity.jar:1.10.1]
at com.ubooquity.data.feeder.a$1.run(SourceFile:123) ~[Ubooquity.jar:1.10.1]
at java.lang.Thread.run(Unknown Source) ~[na:1.8.0_111]

I hope this helps to resolve the issue!

I can send you a PDF file if you want to try to reproduce it.

Thanks.