lucidrains mentioned the possibility of building a Haiku version of Guided Diffusion (or something similar), so the TPU route is also an option (and we always have enough TPUs). We can borrow some model-parallel components from GPT-J, so that'll make it simpler 😉


