scratchtoscale (Scratch to Scale)

arjunguha

authored 3 papers about 2 months ago

Saurav2023

published a Space 3 months ago

README

👀

chansung

posted an update 4 months ago

YAML engineering is becoming more important than ever, from infra provisioning to model training (recipes).

I built a simple editor, starting with @dstackai, and I will share the live endpoint this week. Let me know what you think about this approach.

If people find this useful, I will do the same for LLM training recipes in popular frameworks such as Hugging Face open-r1, Axolotl, and so on. Let me know.

Aurelien-Morgan

posted an update 6 months ago

Hey everyone, I'll be presenting @retrain-pipelines and the almighty function-calling at the Hugging Face Paris HQ.
Monday evening. Lightning-talk style. With AI Tinkerers.

Come hang!

https://paris.aitinkerers.org/p/ai-tinkerers-paris-ai21-labs-takeover-on-may-19th

https://huggingface.co/blog/Aurelien-Morgan/the-almighty-function-caller

Aurelien-Morgan

posted an update 7 months ago

The Almighty function-caller

How would you like to build smart GenAI infrastructure,
give your edge agentic system an extensive tools memory,
and optimize the resources it takes to run a set of agents that is still high-performance?

We came up with a novel approach to function-calling at scale for smart companies and corporate-grade use cases.

Read our full blog article about it on Hugging Face:
https://huggingface.co/blog/Aurelien-Morgan/the-almighty-function-caller

Aurelien-Morgan

posted an update 7 months ago

retrain-pipelines 0.1.2 finally dropped. It comes with a hot Hugging Face Hub integration. Go check it out. We have 2 articles about it coming up. One is already fully written, so be on the lookout!
@retrain-pipelines

Also, I'll be volunteering at GOSIM AI Paris 2025. If you're interested in chatting, hit me up.

Aurelien-Morgan

posted an update 8 months ago

Almost there!
https://test.pypi.org/project/test-010-retrain-pipelines/

chansung

posted an update 8 months ago

A simple guide to the GRPO recipe in Open-R1, which is built on top of TRL.

I think the FastAPI wrapper of vLLM with WeightSyncWorker is a pretty cool feature. Also, there are many predefined reward functions out of the box!

chansung

posted an update 8 months ago

Mistral AI's Small 3.1 24B is not only free for commercial use but also the best model for single-GPU deployment.

I packed all the information you need to know into a single picture. Hope this helps! :)

chansung

posted an update 8 months ago

Gemma 3 release in a nutshell
(it seems function calling is not supported, even though the announcement said it was)

seopbo

authored 2 papers 9 months ago

What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers

Paper • 2109.04650 • Published Sep 10, 2021

Kanana: Compute-efficient Bilingual Language Models

Paper • 2502.18934 • Published Feb 26 • 65

arjunguha

authored a paper 10 months ago