Stable Diffusion 2.0 on Mac and Linux via imaginAIry Python library

234 points by bryced 3 years ago

davely 3 years ago

I've been working on a web client[1] that interacts with a neat project called Stable Horde[2] to create a distributed cluster of GPUs that run Stable Diffusion. Just added support for SD 2.0:

[1] https://tinybots.net/artbot?model=stable_diffusion_2.0

[2] https://stablehorde.net/

davidkunz 3 years ago

Wow, this is a great site, thanks for the links!

bryced 3 years ago

Try out the pre-release like this:

`pip install imaginairy==6.0.0a0 --upgrade`

New 512x512 model supported with all samplers and inpainting

New 768x768 model supported with the DDIM sampler only

Upscaling and depth maps are not yet supported.

To be honest, I'm not sure the new model produces better images, but maybe they will release some improved models in the future now that they have the pipeline open.

swyx 3 years ago

congrats! how did you upgrade it so fast? and what would you call out as the main technical pointers to adapting the base release for M1's?
- bryced 3 years ago
  
  All the same issues as migrating 1.5 to M1s. It went fast because I upgraded my existing codebase, which already had those fixes, instead of building off the new CompVis one.

greggh 3 years ago

This is awesome, but I still like using the GUI for m1/m2 Macs, DiffusionBee.

https://github.com/divamgupta/diffusionbee-stable-diffusion-...

malshe 3 years ago

Thanks for sharing this. I was looking for something simple like this

diebeforei485 3 years ago

Does this use Stable Diffusion 2.0?

jibbers 3 years ago

And apparently Intel Macs also! I had no idea!

Smaug123 3 years ago

Nicely done; this seems to work for me. In my own attempt, I got stock Stable Diffusion 2.0 "working" on M1 using the GPU but it's producing some of the most cursed (and low-res) images I've ever seen, so I've definitely got it wrong somewhere. The reader can infer the usual rant about dynamic typing causing runtime misconfiguration in Python.

liuliu 3 years ago

There are some network changes in the UNet, so if you ported the code over or have mismatched configuration files, it may generate garbage outputs. I wrote some notes here: https://www.reddit.com/r/StableDiffusion/comments/z42yph/som...

typest 3 years ago

How much of this is Stable Diffusion 2, and how much is something else? For instance, the text-based masks, the syntax like AND and OR, the face upscaling: are these all part of Stable Diffusion 2 (and can they be used via other Stable Diffusion APIs)?

bryced 3 years ago

- text-based masks use a clipseg model
- the boolean mask logic is unique to this library
- the face fixing is done by CodeFormer

yreg 3 years ago

As with previous macOS Stable Diffusion tools, this is Apple Silicon only.

smoldesu 3 years ago

If you have an Intel Mac with sufficient memory, it's totally possible to run it on-CPU as well.
- dylan604 3 years ago
  
  > If you have an Intel Mac with sufficient memory,
  Which means what? Why be so ambiguous? If it needs 16GB, say so. If it needs 32, say so. Your "sufficient memory" comment is insufficient.
  
  smoldesu 3 years ago
  
  The figure isn't static. Some models require as little as 3.5 GB of free memory, others demand 8-16 GB. macOS is weird with memory management and everyone's Mac is different; I'd really only recommend running the model on 32 GB machines to avoid writing into swap, but technically it's possible with 8 and 16 GB machines.
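As a rough sanity check on figures like these, the memory taken by the weights alone can be estimated as parameter count times bytes per parameter. The sketch below uses a made-up round parameter count (`N_PARAMS` is hypothetical, not a measured figure for any Stable Diffusion release) and ignores activations, the OS, and any other loaded components, all of which add to the total.

```python
def weights_gib(n_params: int, bytes_per_param: int) -> float:
    """Memory needed to hold the weights alone, in GiB."""
    return n_params * bytes_per_param / 2**30


# Hypothetical round number, for illustration only.
N_PARAMS = 1_000_000_000

for label, width in (("fp32", 4), ("fp16", 2)):
    print(f"{label}: {weights_gib(N_PARAMS, width):.2f} GiB for weights")
```

Halving the precision halves the weight footprint, which is one reason the quoted requirements vary so much between setups.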

fareesh 3 years ago

What's the minimum VRAM requirement?

gbighin 3 years ago

Requirements:

> A decent computer with either a CUDA supported graphics card or M1 processor.

Why so? How does an M1 processor replace CUDA in a way an x86_64 processor can't? Do they use ARM assembly?

pavlov 3 years ago

It’s not the ARM core but the integrated GPU in the M1. It has access to the entire main memory unlike a traditional GPU with its own local VRAM.
- gbighin 3 years ago
  
  Oh, interesting! But does it support CUDA? How is the integrated GPU used for ML tasks?
  
  hnarayanan 3 years ago
  
  Both PyTorch and TensorFlow offer backends for Metal that work pretty well on Apple Silicon.
  
  dagmx 3 years ago
  
  To add to what people said, most of these ML models target an ML library like TensorFlow or PyTorch.
  Those in turn have hardware accelerated backends. Traditionally they’ve only had CUDA backends but Apple ported large chunks of both to Metal as well.
  So none of these libraries really target CUDA. In fact they’d run fine without a supported GPU but much slower.
  
  Filligree 3 years ago
  
  It does not support CUDA; SD does not require CUDA.
  
  pavlov 3 years ago
  
  I believe there’s a TensorFlow acceleration adapter for Apple’s ML API which uses Metal behind the scenes.
  
  malshe 3 years ago
  
  PyTorch can use the GPUs on M1 Macs. Sebastian Raschka's post explains it nicely and shows some benchmarks too: https://sebastianraschka.com/blog/2022/pytorch-m1-gpu.html
  From his post:
  > if you want to run PyTorch code on the GPU, use `torch.device("mps")` analogous to `torch.device("cuda")` on an Nvidia GPU.
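A minimal sketch of how that device choice might look in practice, assuming PyTorch 1.12 or later for the MPS backend. The helper names `pick_device_name` and `detect_device_name` are made up for this example; only `torch.cuda.is_available()` and `torch.backends.mps.is_available()` are actual PyTorch calls.

```python
def pick_device_name(cuda_available: bool, mps_available: bool) -> str:
    """Prefer CUDA, then Apple's MPS backend, then plain CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device_name() -> str:
    """Ask PyTorch what is available; report "cpu" if torch is not installed."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # torch.backends.mps is absent in older PyTorch releases.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = mps_backend is not None and mps_backend.is_available()
    return pick_device_name(torch.cuda.is_available(), mps_ok)


if __name__ == "__main__":
    print(detect_device_name())
```

The returned string can be passed to `torch.device(...)` and then to `.to(device)` on a model or tensor, so the same script runs on an Nvidia card, an M1, or a CPU-only machine.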
  
  crucialfelix 3 years ago
  
  In some cases there are operations not supported on MPS. For those, set:
  `os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"`
  and it will run on the CPU if some operation isn't supported.
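A small sketch of where that line might go. The usual advice, which I am treating as an assumption here and not something confirmed in this thread, is to set the variable before `torch` is imported. The helper `mps_fallback_enabled` is hypothetical and exists only to make the state easy to check.

```python
import os

# Set the flag first. setdefault keeps any value already exported in the shell.
os.environ.setdefault("PYTORCH_ENABLE_MPS_FALLBACK", "1")

try:
    import torch  # imported after the flag is set (assumed ordering)
except ImportError:
    torch = None  # the sketch still runs without PyTorch installed


def mps_fallback_enabled() -> bool:
    """True if the CPU fallback flag is switched on for this process."""
    return os.environ.get("PYTORCH_ENABLE_MPS_FALLBACK") == "1"


if __name__ == "__main__":
    print("MPS fallback enabled:", mps_fallback_enabled())
```

Setting it on the command line, as in `PYTORCH_ENABLE_MPS_FALLBACK=1 python script.py`, avoids the ordering question altogether.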
  
  malshe 3 years ago
  
  Excellent! Thanks

anothernewdude 3 years ago

2.0 is a mixed bag. It has set pixel art back entirely. I'm pretty sure this is down to the aesthetic filter: it has a very biased idea of what good images are. It's silly to do that at the training stage; that should be something you do in the prompt.

Fine-tuning is out of reach for me, so I'm sticking to 1.5.

lostintangent 3 years ago

Wow, this looks awesome! I noticed that the sample notebook doesn’t include SD 2.0 by default, and says that it’s too big for Colab. Is that a disk size/RAM limitation?

As an aside, it would be cool if you versioned that notebook in the repo, so that it could be easily opened with Codespaces.

bryced 3 years ago

Yeah I tried to get it running but it kept crashing with "out-of-ram" errors.
Good idea to version the notebook.

egeozcan 3 years ago

This would have been perfect if it worked on Windows too. I need to look into dual booting Linux (opening a can of worms) just to give it a try, as WSL doesn't seem to cut it.

bryced 3 years ago

It might work on Windows but I haven't tested it there.
- patates 3 years ago
  
  It only uses the CPU. Somehow the GPU detection fails.
- dekhn 3 years ago
  
  For me the pip install on Windows (Anaconda) failed while installing basicsr: `error: metadata-generation-failed`
  
  bryced 3 years ago
  
  I don't think it works with Anaconda on any OS.

satvikpendem 3 years ago

Why not use Automatic1111's? I think he already added SD 2.0.

boycott-israel 3 years ago

fwiw dual booting is ultimately simpler than WSL and its quirks

underlines 3 years ago

is it possible to add volta or xformers for a massive speed increase?

https://github.com/VoltaML/voltaML-fast-stable-diffusion

bryced 3 years ago

Possibly. Haven't tried. In principle should be possible.

superpope99 3 years ago

This seems to work for me. Incredible work turning this around so quickly!

habibur 3 years ago

If you are running it natively (not in the cloud), what's the RAM size of your graphics card?

semicolon_storm 3 years ago

Pretty slick, SD 2.0 performance actually seems to be better than 1.5?

bryced 3 years ago

You're probably noticing the newest sampler, which also works with 1.5.

algon33 3 years ago

Nice, a friend was looking for something like this.

TekMol 3 years ago

What is a good VM to try this out?

Something on AWS, Hetzner etc?

petercooper 3 years ago

AWS g5.xlarge instances. Very fast (roughly RTX 3080 speeds) and about $1 an hour. However, you can just turn the instance on and off and not pay anything except the latent EBS cost.
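A back-of-the-envelope sketch of that on/off trade-off. The hourly rate is the "about $1" from the comment above; the disk size and per-GB-month storage rate are placeholder assumptions, not quoted AWS prices, and `monthly_cost` is a name made up for this example.

```python
def monthly_cost(hours_on: float, hourly_rate: float,
                 storage_gb: float, storage_rate_gb_month: float) -> float:
    """Compute charges for the hours the instance runs, plus always-on disk storage."""
    compute = hours_on * hourly_rate
    storage = storage_gb * storage_rate_gb_month
    return compute + storage


HOURLY_RATE = 1.00    # "about $1 an hour", from the comment
STORAGE_GB = 100      # hypothetical disk size
STORAGE_RATE = 0.08   # hypothetical USD per GB-month

for hours in (5, 20, 100):
    total = monthly_cost(hours, HOURLY_RATE, STORAGE_GB, STORAGE_RATE)
    print(f"{hours:>3} h/month: ${total:.2f}")
```

Under these assumptions the storage charge is a small fixed floor, and the bill is dominated by how many hours the instance is actually left running.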

88stacks 3 years ago

Awesome library, I haven't seen this before. I just added it to my Stable Diffusion API service so you can query Stable Diffusion 2.0 if you don't have GPUs set up currently: https://88stacks.com

ttpphd 3 years ago

Why is it called 88 stacks?
- turnsout 3 years ago
  
  Also wondering about the 88—only because of its Neo-Nazi/hate-speech connotations