Aran Komatsuzaki   @arankomatsuzaki   6/9/2021       

We have code to produce a CC-BY-SA dataset of nearly a billion image-text pairs by collecting them from Common Crawl, but we don't have the money to cover the egress costs lol https://t.co/g0l1HIfNuC





