I’ve been playing around with AI image generation for a few weeks now, but I haven’t been using it to generate images for any real purpose. It’s been fun to see what a dumb prompt like “Godzilla in a suit and tie” would look like, but I’d be wary of anyone who could put such a thing to any use.
I’ve been using Stable Diffusion: Home Edition rather than bothering with one of the hosted versions, and in the process I’ve been learning a lot about how the whole AI image generation process works. The front-runner among local SD front-ends is (or has been) Automatic1111, an easy-to-use web-based tool that lets users enter positive and negative prompts, slide a few sliders, and generate images.
Never one to make life easy for myself, I have been using the alternate SD front-end, ComfyUI.
Comfy is a node-based tool, and I like that because I can build workflows that cater to whatever operation I want to perform. Do I want to generate one basic image using one model and a positive and negative prompt? That’s what Comfy can do out of the box. Do I want to test several different models with the same settings? I can do that too.
In order to learn about SD and ComfyUI, I have been watching the YouTube channel of Scott Detweiler. Scott works for Stability AI, makers of Stable Diffusion, and has himself switched from Auto1111 to Comfy. He is a great source of info not just on SD, but also on how to set up Comfy to perform specific tasks. One task that I will probably find a use for is upscaling existing images.
Resizing images down works pretty well, usually. Algorithms which do this remove pixels and shift color information around, and at a smaller size we don’t really see the results at the pixel level; our eyes only care about the overall composition, which is often more than passable. However, scaling up is a problem. In this direction, the algorithm needs to add pixels and color info that isn’t there. Modern resizing with tools such as Photoshop does OK if you’re taking a 512×512 image and increasing its size to maybe 2x that, but go beyond that 2x multiplier and you might be looking at a blurry image. Considering that upsizing is probably being done for a reason, having blurry, jagged-edged images is not going to cut it in the final product.
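To make the "adding pixels that aren’t there" problem concrete, here is a minimal sketch in plain Python. It is not what Photoshop actually does (its resampling is more sophisticated); it just uses simple averaging to shrink and simple linear interpolation to enlarge a single row of grayscale pixels. The function names are my own, made up for illustration.

```python
def downscale_row_average(row, factor):
    """Shrink a row by averaging each group of `factor` pixels into one."""
    return [
        sum(row[i:i + factor]) / factor
        for i in range(0, len(row) - factor + 1, factor)
    ]


def upscale_row_linear(row, factor):
    """Enlarge a row by linearly interpolating between neighbouring pixels."""
    n = len(row)
    out = []
    for i in range(n * factor):
        # Map the centre of each output pixel back into source coordinates.
        src = (i + 0.5) / factor - 0.5
        src = min(max(src, 0.0), n - 1.0)  # clamp at the borders
        left = int(src)
        right = min(left + 1, n - 1)
        t = src - left
        # The new value is a blend of its two nearest source pixels.
        out.append((1 - t) * row[left] + t * row[right])
    return out


# A hard black-to-white edge, four pixels wide.
edge = [0, 0, 255, 255]

small = downscale_row_average(edge, 2)  # two pixels, edge still crisp
big = upscale_row_linear(edge, 4)       # sixteen pixels, edge becomes a ramp

print(small)
print([round(v) for v in big])
```

Going down, the edge survives. Going up, the interpolation has nothing to work with except the two original values on either side of the edge, so it fills the gap with in-between grays. That gray ramp is the blur you see when you zoom in.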
Thanks to a recent video from Scott Detweiler, I now have a workflow which allows me to drop in any image and upsample it to make it larger. Let’s look!
For my test, I used a variation on the logo I’m using for this site. The original dimensions of the full image are 512×486. Whenever I want to use this avatar, this size works well as a starting point, and I would only downsize this image to something smaller if need be.
The zoomed-in version above was resized in Photoshop by a factor of 4, making it a 2048×1944 pixel image. As expected, when zoomed in, everything is jagged. The resizing process tried to expand the composition (and actually did a fairly good job of it)…just don’t look too closely and you probably won’t notice the blur and pixels. Again, an image is probably being blown up so that it can be viewed much closer, so making the image larger this way is kind of a waste of time as the quality just isn’t there.
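For a sense of how much the resizer has to make up at 4x, here is the arithmetic as a quick Python sketch. The helper names are mine, and "invented" here just means output pixels that have no one-to-one counterpart in the source, which is a rough way of looking at it rather than a formal measure.

```python
def scaled_size(width, height, factor):
    """Return the pixel dimensions after scaling both sides by `factor`."""
    return width * factor, height * factor


def invented_pixel_share(factor):
    """Rough share of output pixels with no one-to-one source pixel."""
    return 1 - 1 / (factor * factor)


orig_w, orig_h = 512, 486
new_w, new_h = scaled_size(orig_w, orig_h, 4)

print(new_w, new_h)                            # 2048 1944
print((new_w * new_h) // (orig_w * orig_h))    # 16 times as many pixels
print(invented_pixel_share(4))                 # 0.9375
```

Scaling each side by 4 means 16 times the pixels, so roughly 15 out of every 16 pixels in the enlarged image are filled in by the algorithm rather than taken from the original.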
I recreated Scott’s upsample workflow in Comfy, and loaded the original avatar image into the source.
A lot of these nodes are present only because they are required by the app, such as the red and green text prompt boxes, but aren’t being used to generate a new image. The rest of the nodes handle scaling the original image’s dimensions by the factor defined in the Resize Factor node, loading the AI model used to upscale the source, and then the actual act of upscaling. This can take a few minutes, although one of the benefits of learning the complexities of ComfyUI over Auto1111 is that Comfy is designed to work on low-end computers…it can even utilize the CPU instead of the GPU, if that’s your jam (I have a 3070 right now).
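To be clear, what follows is not Scott’s workflow, which has more going on. It is a stripped-down sketch of the same general idea (load an image, run it through an upscale model, rescale to the factor you actually want, save), written as the kind of Python dictionary ComfyUI accepts when a workflow is exported in API format. The node class names and input names are ComfyUI’s built-in ones as I understand them, so treat them as assumptions and check them against your own install. The image and model file names are hypothetical placeholders, and I’m assuming the upscale model is a fixed 4x model.

```python
def build_upscale_workflow(image_name, model_name, target_factor, model_factor=4):
    """Build a minimal upscale workflow as an API-format dict (a sketch).

    Links between nodes are written as [source_node_id, output_index].
    """
    # An upscale model enlarges by a fixed amount (assumed 4x here), so a
    # plain resize afterwards brings the result to the factor we want.
    rescale = target_factor / model_factor
    return {
        "1": {"class_type": "LoadImage",
              "inputs": {"image": image_name}},
        "2": {"class_type": "UpscaleModelLoader",
              "inputs": {"model_name": model_name}},
        "3": {"class_type": "ImageUpscaleWithModel",
              "inputs": {"upscale_model": ["2", 0], "image": ["1", 0]}},
        "4": {"class_type": "ImageScaleBy",
              "inputs": {"image": ["3", 0],
                         "upscale_method": "lanczos",
                         "scale_by": rescale}},
        "5": {"class_type": "SaveImage",
              "inputs": {"images": ["4", 0],
                         "filename_prefix": "upscaled"}},
    }


def dangling_links(workflow):
    """Return any links that point at node ids missing from the workflow."""
    bad = []
    for node_id, node in workflow.items():
        for name, value in node["inputs"].items():
            if isinstance(value, list) and value[0] not in workflow:
                bad.append((node_id, name, value[0]))
    return bad


# Hypothetical file names, for illustration only.
workflow = build_upscale_workflow("avatar.png", "some_4x_model.pth", target_factor=2)
print(workflow["4"]["inputs"]["scale_by"])  # 0.5
print(dangling_links(workflow))             # []
```

The nice thing about seeing it this way is that the "node graph" stops being mysterious: each node is just a name plus its inputs, and the wires in the UI are those little two-item lists.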
The above image shows the results of the upscale operation along with the workflow. I had to zoom out so I could screenshot the results alongside the original image, so you can see the difference in size.
Above is a zoomed-in version that shows the results up close.
And even closer. This image was upscaled to the same size as the Photoshop version, 2048×1944, but as you can see, the results are much, much, much better.
One of my goals for this workflow is to take some of the photos from my recent trip to Colorado and see if I can’t blow them up to make them suitable for printing and framing. As of right now, that’s the only use I can think of for this process, although as the test above shows, it could fit the bill anywhere a small image needs to be made larger without falling apart.