Stable Diffusion

By now this entire topic is old hat, so I won’t spend any time explaining it in detail. Basically, Stable Diffusion lets you generate images from text, replace parts of an image, or create a new image from an existing one and a prompt. People have been screwing around with it for a while, and it might have even made OpenAI realize that turning their DALL-E AI into a secret club is not only the opposite of what their name suggests but also pretty lame, since they’ve recently opened it up more.

So there’s now this cool new neural network that lets you generate images that range anywhere from convincingly real to abhorrent abomination. While you can use online services for this, it’s always cooler to run it on your own machine. I finally got a new video card a while ago and decided that it was time to give AMD a chance. This was before Stable Diffusion came out; otherwise I probably would’ve gone with another Nvidia card. AMD cards are just not as good with machine learning and other stuff like hardware-accelerated rendering.

My initial attempts at getting SD to run ended very quickly with ROCm errors, specifically hipErrorNoBinaryForGpu. My card is newer and not officially supported, which made me immediately regret getting an AMD GPU. The strange thing is that older cards work just fine; I would usually expect the opposite. A few days ago I gave it another try after hearing from some people that their cards worked. First I used a pre-made Docker image, specifically the rocm5.2_ubuntu20.04_py3.7_pytorch_1.11.0_navi21 one. I got this from this video, but it only goes through the steps in the description and also has an editing mistake at the end where one step is done twice, because he fucked it up on the first try and left it in the video so you all can make the same mistake. That image worked out of the box for me, but it’s a bit old and doesn’t have the web interface.
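As far as I understand it, hipErrorNoBinaryForGpu means the installed ROCm libraries ship no compiled kernels for the gfx target your card reports. The sketch below is one way to see which target that is. It assumes rocminfo (part of ROCm) is on your PATH; gfx_targets is just a helper name I made up for this example.

```shell
# Extract the unique gfx target names from rocminfo-style text on stdin.
# gfx_targets is a hypothetical helper, not part of ROCm.
gfx_targets() {
    grep -oE 'gfx[0-9a-f]+' | sort -u
}

# Only query the real hardware if rocminfo is actually installed.
if command -v rocminfo >/dev/null 2>&1; then
    rocminfo | gfx_targets
fi
```

If the target printed here is not one your ROCm build supports, that would explain the error.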

Since the Docker image eats like 20 GB and I hate Docker with a passion, I then tried to get it to run directly. For that I followed these instructions, which I got from mint. Because of my previous attempts I already had most of the things installed. The steps basically come down to adding these lines to webui-user.sh:

export TORCH_COMMAND="pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/rocm5.1.1"
export HSA_OVERRIDE_GFX_VERSION=10.3.0
export MIOPEN_DEBUG_COMGR_HIP_PCH_ENFORCE=0
export PATH=/opt/rocm/bin:/opt/rocm/llvm/bin:$PATH
export LLVM_PATH=/opt/rocm/llvm
export ROCM_PATH=/opt/rocm

That’s basically it. For me the issue was that I was missing some of the environment variables that coerce ROCm into playing nice with my GPU. I still probably don’t get as much performance as with an Nvidia GPU, but at the same time I paid less and have more VRAM, which is probably the better deal for this stuff. One other issue I’ve had was ./webui.sh: line 141: 113907 Killed "${python_cmd}" "${LAUNCH_SCRIPT}", but that just means that it ran out of RAM.
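Since missing variables were what tripped me up, here is a minimal sketch that lists which of the variables from the exports above are unset or empty in the current shell. The function name missing_rocm_vars is made up for this example; it prints nothing when everything is set.

```shell
# Print the name of every ROCm-related variable that is unset or empty.
# missing_rocm_vars is a hypothetical helper; the list mirrors the exports above.
missing_rocm_vars() {
    for name in HSA_OVERRIDE_GFX_VERSION MIOPEN_DEBUG_COMGR_HIP_PCH_ENFORCE LLVM_PATH ROCM_PATH; do
        # Indirect lookup via eval keeps this usable in plain POSIX sh.
        eval "value=\${$name:-}"
        [ -n "$value" ] || echo "$name"
    done
}

missing_rocm_vars
```

Run it in the same shell that launches webui.sh, otherwise it only tells you about your interactive session.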

The steps required to get it to run are surprisingly simple, but information about this is pretty scarce and I was almost convinced that it didn’t work at all. Funnily enough, the environment variables also fix issues I’ve had with Blender, and with them I can use HIP to render images with Cycles, which is quite convenient considering that Blender just recently started to ship with HIP by default.

Finally I’ll just leave some useful links regarding Stable Diffusion.

Edit: Today this stopped working for me and I just got this error when generating an image:

MIOpen(HIP): Warning [SQLiteBase] Missing system database file: gfx1030_20.kdb Performance may degrade. Please follow instructions to install: https://github.com/ROCmSoftwarePlatform/MIOpen#installing-miopen-kernels-package
MIOpen(HIP): Error [Do] 'amd_comgr_do_action(kind, handle, in.GetHandle(), out.GetHandle())' AMD_COMGR_ACTION_COMPILE_SOURCE_TO_BC: ERROR (1)
MIOpen(HIP): Error [BuildHip] comgr status = ERROR (1)
MIOpen(HIP): Warning [BuildHip] <built-in>:1:10: fatal error: '__clang_hip_runtime_wrapper.h' file not found
#include "__clang_hip_runtime_wrapper.h"
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated when compiling for gfx1030.

MIOpen Error: /MIOpen/src/hipoc/hipoc_program.cpp:300: Code object build failed. Source: convolution_forward_implicit_gemm_v6r1_dlops_nchw_kcyx_nkhw.cpp

The error is a bit confusing, but it seems like MIOpen’s runtime compile step stopped finding the header __clang_hip_runtime_wrapper.h for some reason. Adding export CPLUS_INCLUDE_PATH=/opt/rocm/llvm/lib/clang/15.0.0/include/ to webui-user.sh, like the other environment variables, fixed the issue for me.
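The 15.0.0 in that path is simply the clang version my ROCm install came with, so it may be different on your machine. A rough sketch that looks the directory up instead of hardcoding it, assuming the header lives somewhere under /opt/rocm/llvm/lib/clang like it does for me (find_hip_wrapper_dir is a name I made up):

```shell
# Print the directory that contains __clang_hip_runtime_wrapper.h under the given root.
# Returns non-zero and prints nothing if the header is not found.
find_hip_wrapper_dir() {
    header=$(find "$1" -name '__clang_hip_runtime_wrapper.h' 2>/dev/null | head -n 1)
    [ -n "$header" ] && dirname "$header"
}

# Assumed search root; adjust it if your ROCm install lives elsewhere.
if dir=$(find_hip_wrapper_dir /opt/rocm/llvm/lib/clang); then
    export CPLUS_INCLUDE_PATH="$dir"
    echo "CPLUS_INCLUDE_PATH=$dir"
else
    echo "header not found, CPLUS_INCLUDE_PATH left unchanged"
fi
```

If more than one clang version is installed, this just takes the first match, so check the printed path before relying on it.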