That's a weird one; I haven't encountered it myself, though I haven't played around with that in a bit.

Ah yeah, I had that issue with descriptions, and rather than actually sorting it out I just replaced &'s with and's. Something about sed really doesn't like ampersands; I just haven't looked much into it.
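For what it's worth, the likely culprit: on the replacement side of a sed s/// command, a bare & stands for the entire matched pattern, so any & in text you're substituting in has to be escaped first. A minimal sketch (the variable names are just illustrative):

# & on the replacement side expands to the whole match:
echo "title" | sed 's/title/Cloak & Dagger/'     # prints "Cloak title Dagger"
# so escape any &'s in the description before using it as a replacement
desc="Cloak & Dagger return"
safe=$(printf '%s' "$desc" | sed 's/&/\\&/g')    # & -> \&
echo "DESCRIPTION" | sed "s/DESCRIPTION/$safe/"  # prints the description intact

(Slashes in the text would need the same treatment.)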


Right now I'm fighting with cover downloads to try to avoid getting flagged by Amazon Web Services (too many requests yields a 403/503 error).
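The usual fix is just to slow down: pause between requests and back off when a 403/503 comes back. A rough sketch of the shape I'm going for (cover_urls and the output name are placeholders):

# pause between cover downloads and back off on 403/503
for url in "${cover_urls[@]}"; do
  for wait in 2 10 30; do
    code=$(curl -s -o cover.jpg -w '%{http_code}' "$url")
    [ "$code" = 200 ] && break
    sleep "$wait"        # back off before retrying
  done
  sleep 2                # throttle even on success
done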

Try the latest when you get a chance; it tries first with the date (if given) and then falls back to searching without it on failure. Also, hammering Google like this can sometimes keep it from returning results for a little while; I need a better check for that.
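The fallback itself is simple; something like this, where search_covers is just a stand-in for the actual Google lookup in the script:

# try the dated query first, fall back to the bare title on failure
search_covers "$title ($year)" || search_covers "$title"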

I just posted a standalone imageGet that uses Google to get the series id and then builds from there. It seems pretty solid, but some series use weird URLs (Chew is ID_300x300.jpg instead of just ID.jpg). I think the key now is to track down any more of these.

Try the standalone imageGet.sh I just posted. It uses a new method: googling to get the series id and then building the image URL from it. It uses sed, but not the -i flag that causes Mac issues, and it's curl-based, like PageBuilder itself.
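Roughly, the flow is the sketch below; google_series_id stands in for the actual scrape in imageGet.sh, and $base for the image host, since the exact URLs live in the script itself:

# googled series id -> cover URL, trying the plain name first,
# then the _300x300 variant some series (like Chew) use
id=$(google_series_id "$title")
for img in "${id}.jpg" "${id}_300x300.jpg"; do
  curl -fs -o cover.jpg "$base/$img" && break
done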

Mylar has the option of saving cvinfo files in the folders of comic series. This convention was actually borrowed from ComicRack, so I assumed it would already exist. I was pretty sure this was what it was using for ComicVine scraping, since they contain the appropriate ComicVine link for the series. If the scraper isn't making these files by default, there may be an option for it to do so.


PageBuilder uses the URL in these files to get the related ComicVine IDs, so without them it won't work in its current form.


For example, Batman (2011) has a cvinfo file with the URL http://comicvine.gamespot.com/batman/4050-42721/. From that, I pull the last digit string, 42721 (the 4050 prefix designates a series), and use it to get the name/publisher/description info for building the comic page. It then passes the name to imageGet to download the appropriate cover image.
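The extraction is just URL surgery; a minimal version of the idea:

# pull the series id out of the cvinfo URL, e.g.
# http://comicvine.gamespot.com/batman/4050-42721/ -> 42721
url=$(cat "/path/to/comic/folder/cvinfo")
id=$(echo "$url" | sed 's|.*4050-\([0-9]*\).*|\1|')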

Be sure to include the cvinfo/CVInfo file in the path.

/path/to/PageBuilder.sh /path/to/comic/folder/cvinfo


And be sure to use the right case (cvinfo vs. CVInfo), since Linux filenames are case-sensitive.
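If you're not sure which casing a given folder uses, find can spot the file regardless:

find /path/to/comics -iname cvinfo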

Nice, that's a hefty improvement. I tried it manually (URL in a browser), and .eu is great for IDW/Image. Marvel threw some curveballs, though. My test case is Spidey; .eu had it in the result list, but waaaaay down the list.

I'm wondering if it might be cleaner to use Google proper, rather than Google Images, to get Comixology's series id. You could then grab the cover based on that, since the image URLs follow a standard format.

Yes, it should be downloading into the PageBuilder folder... though I left a couple of stray !'s in PageBuilder.sh that made it not move the file from there.

As for why it's not downloading, I'm wondering if wget has different syntax on the Mac, since PageBuilder itself uses curl.
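Worth noting: stock OS X doesn't ship with wget at all, only curl, so a wget call would just fail outright there. The curl equivalents of a basic wget download:

wget -O cover.jpg "$url"       # wget form (not installed by default on OS X)
curl -o cover.jpg "$url"       # curl equivalent, available out of the box
curl -L -o cover.jpg "$url"    # add -L to follow redirects (wget does by default)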

Oh right, missed that one. Fixed and uploaded, thanks!