This is really useful / I have a request

#13
by WyattTheSkid - opened

This is a really neat concept and reminds me of what NVIDIA did with their Nemotron line. That said, I would appreciate it a lot (I'm sure many others would as well) if you guys did the same thing to gpt-oss-120b. It's already an incredibly efficient model, and I think it could benefit even further from such an approach. It could potentially fit on two 3090s without needing to offload anything! Thank you for what you guys do, Cerebras.
