---
library_name: transformers.js
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
license: apache-2.0
datasets:
- Kukedlc/dpo-orpo-spanish-15k
language:
- en
- es
---
[<img src="https://cdn-avatars.huggingface.co/v1/production/uploads/67b2f4e49edebc815a3a4739/R1g957j1aBbx8lhZbWmxw.jpeg" width="200"/>](https://huggingface.co/fjmgAI)
## Fine-Tuned Model
**`fjmgAI/b1-R1-1.5B-ONNX`**
## Base Model
**`deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B`**
## Fine-Tuning Method
Fine-tuning was performed using **[`unsloth`](https://github.com/unslothai/unsloth)**, an efficient fine-tuning framework optimized for low-resource environments, together with Hugging Face's TRL library.
The resulting weights were then converted to ONNX (the format used by ONNX Runtime) to make the model compatible with Transformers.js.
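A conversion along these lines can be done with [🤗 Optimum](https://huggingface.co/docs/optimum/index). The sketch below is illustrative, not the exact command used for this repository (the export flags and quantization settings used here are not documented):

```bash
# Export the model to ONNX with 🤗 Optimum
# (requires: pip install "optimum[exporters]")
optimum-cli export onnx \
  --model deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B \
  --task text-generation-with-past \
  onnx/
```

Transformers.js expects the ONNX weights in a subfolder named `onnx` (see the note at the end of this card). Quantized variants such as `q4f16` require an additional quantization pass after export.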
## Dataset
**[`Kukedlc/dpo-orpo-spanish-15k`](https://huggingface.co/datasets/Kukedlc/dpo-orpo-spanish-15k)**
### Description
A Spanish-language dataset containing **15,000 examples**, designed for **Direct Preference Optimization (DPO)** or **Odds Ratio Preference Optimization (ORPO)**.
### Adaptation
The dataset was adapted to a reasoning-based format for GRPO, enhancing its ability to guide preference-based decision-making during fine-tuning. This adaptation ensures better alignment with instruction-following tasks in Spanish.
## Fine-Tuning Details
- The model was trained using the **GRPO algorithm** (Group Relative Policy Optimization), leveraging structured preference data to refine its response generation.
- The focus was on retaining the model's **instructional abilities** while improving its **understanding and generation** of Spanish text.
## Usage (Transformers.js)
If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
```bash
npm i @huggingface/transformers
```
**Example:** Text-generation w/ `fjmgAI/b1-R1-1.5B-ONNX`
```js
import { pipeline, TextStreamer } from "@huggingface/transformers";
// Create a text generation pipeline
const generator = await pipeline(
"text-generation",
"fjmgAI/b1-R1-1.5B-ONNX",
{ dtype: "q4f16" },
);
// Define the list of messages
const messages = [
{ role: "user", content: "Resuelve esta ecuación: x^2 - 3x + 2 = 0" },
];
// Create text streamer
const streamer = new TextStreamer(generator.tokenizer, {
skip_prompt: true,
// callback_function: (text) => { }, // Optional callback function
});
// Generate a response
const output = await generator(messages, { max_new_tokens: 512, do_sample: false, streamer });
console.log(output[0].generated_text.at(-1).content);
```
<details>
<summary>See example output</summary>
```
<think>
To solve the quadratic equation \( x^2 - 3x + 2 = 0 \), I'll start by factoring the left-hand side. I need to find two numbers that multiply to 2 and add up to -3. These numbers are -1 and -2.
Next, I'll rewrite the equation as \( (x - 1)(x - 2) = 0 \).
Using the zero product property, I'll set each factor equal to zero:
1. \( x - 1 = 0 \) leads to \( x = 1 \).
2. \( x - 2 = 0 \) leads to \( x = 2 \).
Therefore, the solutions to the equation are \( x = 1 \) and \( x = 2 \).
</think>
To solve the quadratic equation:
\[
x^2 - 3x + 2 = 0
\]
**Step 1: Factor the Quadratic**
We look for two numbers that multiply to \( +2 \) and add up to \( -3 \). These numbers are \( -1 \) and \( -2 \).
\[
x^2 - 3x + 2 = (x - 1)(x - 2) = 0
\]
**Step 2: Apply the Zero Product Property**
If the product of two factors is zero, at least one of the factors must be zero.
\[
x - 1 = 0 \quad \text{or} \quad x - 2 = 0
\]
**Step 3: Solve for \( x \)**
\[
x = 1 \quad \text{or} \quad x = 2
\]
**Final Answer:**
\[
\boxed{1 \text{ and } 2}
\]
```
</details>
---
## Purpose
This fine-tuned model is intended for **Spanish-language applications** that require an efficient, instruction-following model with a **lightweight reasoning process**.
- **Developed by:** fjmgAI
- **License:** apache-2.0
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth) [<img src="https://camo.githubusercontent.com/9585eb3e70c8138cbc0f73de7e970be4c668e957e45d16fc3ee6687fcc1da905/68747470733a2f2f68756767696e67666163652e636f2f64617461736574732f74726c2d6c69622f646f63756d656e746174696f6e2d696d616765732f7265736f6c76652f6d61696e2f74726c5f62616e6e65725f6461726b2e706e67" width="200"/>](https://github.com/huggingface/trl?tab=readme-ov-file)
[<img src="https://github.com/microsoft/onnxruntime/blob/main/docs/images/ONNX_Runtime_logo_dark.png?raw=true" width="200"/>](https://github.com/microsoft/onnxruntime)
Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`). |