Stas Bekman
@StasBekman

Toolmaker. Software creator, optimizer and harmonizer. Current domain: Natural Language Processing/Machine Learning.

Stas Bekman    @StasBekman
Additionally, DeepSpeed released a super-fast CUDA-kernel-based DeepSpeed-Inference for BERT, GPT-2, and GPT-Neo https://t.co/tvybS0xMX5 and https://t.co/qgd2QUB7Yl
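
A minimal sketch of what this looks like in use, assuming GPT-2 as an illustrative checkpoint (the exact init_inference arguments vary across DeepSpeed versions):

    import torch
    import deepspeed
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # illustrative checkpoint; the release covers BERT, GPT-2 and GPT-Neo
    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    # ask DeepSpeed to swap in its fused CUDA inference kernels
    ds_engine = deepspeed.init_inference(model, mp_size=1, dtype=torch.half, replace_method="auto")
    model = ds_engine.module  # the kernel-injected model, placed on GPU in fp16

    inputs = tokenizer("DeepSpeed-Inference is", return_tensors="pt").to("cuda")
    print(tokenizer.decode(model.generate(**inputs, max_length=30)[0]))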

Stas Bekman    @StasBekman
If you ever need to shrink a large 🤗 tokenizer, e.g. when you need it for fast testing, here are several ways of doing it: https://t.co/2H2xibP6IP Huge thanks to @LysandreJik and @moi_anthony for offering the recipes!
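
A sketch of one way to do it (the checkpoint, corpus, and vocab size below are illustrative; the linked recipes may differ):

    from datasets import load_dataset
    from transformers import AutoTokenizer

    # start from the full-size tokenizer you want to shrink
    tokenizer = AutoTokenizer.from_pretrained("gpt2")

    # retrain it on a tiny corpus with a tiny vocab, then save for fast tests
    corpus = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1%]")["text"]
    small_tokenizer = tokenizer.train_new_from_iterator(corpus, vocab_size=1000)
    small_tokenizer.save_pretrained("tiny-gpt2-tokenizer")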

Stas Bekman    @StasBekman
🤗 Transformers DeepSpeed integration (master) tested to work with: Albert, Bart, Bert, DistilBert, Electra, FSMT, GPT2, Marian, Mbart, Pegasus, Roberta, T5, XLM-Roberta, XLNet. If you tried another model and had problems, please file an Issue and we will make it work. Thank you!
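
Enabling the integration comes down to handing the Trainer a DeepSpeed config; a minimal sketch, with illustrative ZeRO stage-2 values ("auto" lets the Trainer fill in matching settings):

    from transformers import TrainingArguments

    ds_config = {
        "fp16": {"enabled": "auto"},
        "zero_optimization": {"stage": 2},
        "train_micro_batch_size_per_gpu": "auto",
        "train_batch_size": "auto",
    }

    args = TrainingArguments(
        output_dir="output",
        fp16=True,
        deepspeed=ds_config,  # also accepts a path to a ds_config.json file
    )
    # then run your training script with the deepspeed launcher, e.g.:
    #   deepspeed your_script.py --deepspeed ds_config.json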

Stas Bekman    @StasBekman
The different model parallelism techniques (and combinations), and which of them are supported by 🤗 Transformers, are now documented here: https://t.co/wAT0GI2j8P If you have questions, please open an Issue and tag stas00, or make a forum post and tag stas. Thank you!

Stas Bekman    @StasBekman
wav2vec2 🤗 Transformers + DeepSpeed integration is now complete. For a quick start please see: https://t.co/vHWnpvHihB Also, a quick note that the DeepSpeed integration now has its own dedicated doc: https://t.co/byoZZr4ydo