Any tips on processing large numbers of files? Retrobatch crashes when I try to specify the source bucket (~7K files).
Could I call Retrobatch one image at a time instead of as a batch? I see from the documentation that I can call it via AppleScript, but it seems the unit of work is “directory at a time”.
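One workaround in the meantime (my own sketch, not a Retrobatch feature): pre-split the big source folder into smaller subfolders, then point the workflow at each one in turn, so the “directory at a time” unit of work stays small. The `split_into_batches` helper name is mine, assuming a flat source folder:

```python
import shutil
from pathlib import Path

def split_into_batches(src, dest, batch_size=500):
    """Move files from src into dest/batch_000, batch_001, ... so a
    folder-at-a-time tool can process smaller units. Returns the batch dirs."""
    src, dest = Path(src), Path(dest)
    files = sorted(p for p in src.iterdir() if p.is_file())
    batches = []
    for i in range(0, len(files), batch_size):
        batch_dir = dest / f"batch_{i // batch_size:03d}"
        batch_dir.mkdir(parents=True, exist_ok=True)
        for f in files[i:i + batch_size]:
            shutil.move(str(f), str(batch_dir / f.name))
        batches.append(batch_dir)
    return batches
```

With ~7K files and a batch size of 500, that would give 15 subfolders to run one at a time.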
Can you try that out and let me know if it’s any faster? I tried with 8k files on my end, and it’s working much better after these changes.
And if it’s still crashing, can you send in one of the crash reports? To get at Retrobatch’s crash reports, open the Finder, choose the Go menu while holding down the Option key so the Library menu item shows up, and then choose the Library option. Then navigate to the Logs/DiagnosticReports/ folder, and you should find some entries from Retrobatch.
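If clicking through the Finder is a pain, a small script can list the same reports. A sketch checking the standard macOS per-user and system-wide report folders (the `find_crash_reports` helper name is mine):

```python
from pathlib import Path

def find_crash_reports(app_name="Retrobatch"):
    """Return diagnostic reports matching app_name, newest first.
    Checks the standard macOS per-user and system-wide report folders."""
    reports = []
    for logs in (Path.home() / "Library/Logs/DiagnosticReports",
                 Path("/Library/Logs/DiagnosticReports")):
        if logs.is_dir():
            reports.extend(logs.glob(f"{app_name}*"))
    return sorted(reports, key=lambda p: p.stat().st_mtime, reverse=True)

print(find_crash_reports() or "no Retrobatch reports found")
```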
Spoke too soon – it’s not a crash; rather, it “preflights” for 685.87 seconds and then says “0 pixels processed”.
In the nodes it shows “7145 Files” (which I think is accurate).
I’ve tried letting it “preflight to completion” and then pressing “go”, as well as pressing “go” right away – same behavior.
The workflow was complex (10-ish nodes) – I’ve been deleting nodes trying to get to a working state, but it still fails.
I’ve looked in ~/Library/Logs/DiagnosticReports/Retrobatch-*, and this entry stood out:
Event: cpu usage
Action taken: none
CPU: 90 seconds cpu time over 154 seconds (58% cpu average), exceeding limit of 50% cpu over 180 seconds
CPU limit: 90s
Limit duration: 180s
CPU used: 90s
Duration: 154.17s
Steps: 110
Where can I send the full log (if that would be helpful)?
I’m guessing there’s something in the nodes that we’re not expecting, and files aren’t being closed for some reason, which is what’s causing the error.
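If it is an open-file leak, one way to check is to watch the process’s open-file count with `lsof` while the preflight runs. A rough sketch (assumes `lsof` is installed; `open_file_count` is a hypothetical helper of mine):

```python
import subprocess

def open_file_count(pid):
    """Count files a process has open, via `lsof -p PID`.
    Returns None if lsof is missing or the pid doesn't exist."""
    try:
        out = subprocess.run(["lsof", "-p", str(pid)],
                             capture_output=True, text=True, check=True)
    except (FileNotFoundError, subprocess.CalledProcessError):
        return None
    # lsof prints a header line, then one line per open file.
    return max(len(out.stdout.splitlines()) - 1, 0)
```

Running it a few times during the preflight and seeing a count that climbs without ever dropping would support the leak theory. (Retrobatch’s pid comes from Activity Monitor or `pgrep Retrobatch`.)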
I thought I tried this on my data set and it worked, but I downloaded/updated and it doesn’t seem to work, so I don’t know if I was dreaming or not. I’m using 1.b1 (544) and it seems to behave as it did before.
(But seriously, I thought it did work while I was on the beta version before the current one.)
Trying to set up the test again. No crashes, but I was getting the same behavior as before – it (seemingly) hangs during the preflight at “0 of XXXXX”. It doesn’t seem to matter if I let the whole preflight finish and then press the run button, or if I press the run button during the first preflight.
(If I do press the run button early, it changes to “pre-flight for run”, but then it never gets beyond processing “0 of XXXX”.)
I’ve tried a few times, waiting 30–60 minutes. The source directory is approx. 400 GB. I’m re-exporting the ‘source’ images as JPEG (rather than TIFF) and will try again.
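Before kicking off another long run, it can be worth confirming the scale of the source folder up front. A small sketch that counts files and total bytes (the `folder_stats` name is mine):

```python
import os

def folder_stats(root):
    """Walk root and return (file_count, total_bytes)."""
    count = total = 0
    for dirpath, _dirs, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
                count += 1
            except OSError:
                pass  # a file vanished or is unreadable mid-walk; skip it
    return count, total
```

Something like `folder_stats("/path/to/source")` gives a quick sanity check on whether the 7145-file / 400 GB numbers are what the workflow is actually seeing.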
I have a directory of 61 large PDF files that I would like to explode into single-page PDFs. The result will be ~71K individual pages. I am using the Read Directory node to read a folder; I send it to Page Splitter, and then to Write. Preflighting seems to take forever – I have yet to be able to press start.
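As an aside, the same page split can be scripted outside Retrobatch to sanity-check the expected output. A sketch using the third-party pypdf library (assumes pypdf is installed; `explode_pdf` is my name for it, not a Retrobatch API):

```python
from pathlib import Path

def explode_pdf(pdf_path, out_dir):
    """Write each page of pdf_path out as its own single-page PDF.
    Returns the number of pages written."""
    from pypdf import PdfReader, PdfWriter  # third-party; deferred so this loads without it
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    reader = PdfReader(str(pdf_path))
    stem = Path(pdf_path).stem
    for i, page in enumerate(reader.pages, start=1):
        writer = PdfWriter()
        writer.add_page(page)
        with open(out_dir / f"{stem}_page_{i:04d}.pdf", "wb") as fh:
            writer.write(fh)
    return len(reader.pages)
```

Looping that over the 61 source PDFs would produce the same ~71K single-page files, one output folder per source file if you want to keep directory sizes down.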
What I do is just press the “go” button during pre-flight, and it seems to run through fine. I don’t think it makes a difference whether you let it “preflight to done” or not.
Holy smokes that’s a bunch of pages. How does the file system hold up on that?
@martinstreicher Wow, yeah, that’s a ton of pages. I’d love to hear what happens when that’s finished as well. Your PDF files are around 1.2k pages each, then?
@andreweick I didn’t realize the total size of your images was up to 400GB. The 8k images I’ve been playing around with are ~2GB total. I’ll try to make things faster for this case as well.
@martinstreicher I’ve just uploaded a new build of Retrobatch to the latest builds page, with a couple of fixes that should speed up page splitting in the preprocessing stage. If you grab the latest version from Latest Builds from Flying Meat, does it preprocess faster for you?
And if not, can you send me a sample PDF? (support@flyingmeat.com would be a good place to send it.)