AutoComic
AutoComic is a Python script that turns a webcomic into a PDF. It visits each comic page in sequence, downloads the images, and places them (along with other information) into a PDF. I have successfully converted over 70 comics with the script. However, it is still a work in progress as I continue to add features.
See the GitHub repository for current work.
examples of input and output from Everblue by Michael Sexton
examples of input and output from Housepets! by Rick Griffin
Challenges and Obstacles
- signal handling
- converting images to pdf
- HTML link format parsing
- PDF filesize optimizations
- image splitting
What I Learned
- Web crawling is complicated. Even with several libraries to help, it is not a simple problem and must be approached with care. Using web-crawling libraries is still much more efficient than doing it by hand: I originally wrote my own system with regular expressions, but libraries support CSS selectors, which are much easier to use.
- Config files usually annoy me, but in this case they improve the usability of the script. If the script relied on user input, the user would have to re-enter every detail about the output each time they ran it. If the settings were edited directly into the script, it would only work for one comic at a time. Usability is not about following strict guidelines, but about doing what is easiest for the user.
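To illustrate the config-file point, here is a minimal sketch of how per-comic settings could be kept in one file and loaded with Python's standard `configparser`. It is not taken from AutoComic itself: the section names, the keys (`start_url`, `image_selector`, `next_selector`, `output`), the `ComicConfig` class, and the example.com URLs are all hypothetical and used only for illustration.

```python
import configparser
from dataclasses import dataclass
from typing import List

# Hypothetical config text; in practice this would live in a file on disk.
# One section per comic, so adding a comic means adding a section, not editing code.
EXAMPLE_CONFIG = """
[everblue]
start_url = https://example.com/everblue/1
image_selector = div.comic img
next_selector = a.next
output = everblue.pdf

[housepets]
start_url = https://example.com/housepets/1
image_selector = div.comic img
next_selector = a.next
"""


@dataclass
class ComicConfig:
    name: str
    start_url: str
    image_selector: str  # CSS selector for the comic image (assumed key)
    next_selector: str   # CSS selector for the "next page" link (assumed key)
    output: str


def load_comics(text: str) -> List[ComicConfig]:
    # interpolation=None keeps '%' characters in URLs from being treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    comics = []
    for name in parser.sections():
        section = parser[name]
        comics.append(
            ComicConfig(
                name=name,
                start_url=section["start_url"],
                image_selector=section["image_selector"],
                next_selector=section["next_selector"],
                # Fall back to "<section name>.pdf" when no output is given.
                output=section.get("output", fallback=f"{name}.pdf"),
            )
        )
    return comics


if __name__ == "__main__":
    for comic in load_comics(EXAMPLE_CONFIG):
        print(comic.name, "->", comic.output)
```

With a layout like this, the settings for each comic are entered once and reused on every run, and the same script can handle any number of comics.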
Future Work
There is still a lot of work to do on this script. Here are some things I plan on adding in the near future:
- More robust testing: testing is currently tedious and slow.
- More output: when something goes wrong, it is very difficult to tell where the error originated.