Update README.md
README.md
CHANGED
@@ -7,6 +7,75 @@ base_model:
- google/gemma-2-2b-it
---

@@ -18,15 +87,3 @@ more information.

Removed from the previous version:

```
preset = "gemma2_instruct_2b_en"
seed = 42
sequence_length = 512
batch_size = 16
lora_rank = 8
learning_rate = 2e-5
epochs = 20
max_length = 512
train_data_rate = 0.9
early_stopping_step = 5
```

# Icebreaking Quiz Generation Model

This model generates icebreaking quizzes. Icebreaking refers to short activities run before a meeting to help participants relax and ease tension by solving simple quizzes. Given a simple, easy-to-answer question together with a user's response, the model generates a new four-option multiple-choice question, its answer options, and the correct answer. These customized questions help participants engage with each other and reduce awkwardness in the room.

## 1. Data Collection

We collected questions and answers from blogs through web crawling and used state-of-the-art models such as ChatGPT to generate new four-option multiple-choice questions. We then manually reviewed and edited the generated questions to correct mismatched answers and adjusted the difficulty of the answer options to make the questions more challenging. The result is a well-structured dataset.

**Example Dataset:**

| blog_id | question_crawling | question_number | answer_crawling | question_generated | multiple_choice_generated | answer_generated |
|---------|-------------------|-----------------|-----------------|--------------------|---------------------------|------------------|
| gemma | Who is the most memorable person in your life? | 95 | My art academy teacher! She is the one who got me interested in art. When I went to the academy after school she gave me lots of treats, and she left me with nothing but good memories. | Who was this person's most memorable person from their elementary school days? | 1. Elementary school homeroom teacher 2. Middle school classmate 3. Art academy teacher 4. University professor | 3 |

*The dataset itself is in Korean; the example row above is shown in English translation.*

## 2. Fine-tuning Process
|
| 24 |
+
This model is fine-tuned using the Google Gemma 2B instruct model, with LoRA applied to make the training process lighter and faster on TPU. LoRA reduces the number of parameters updated during training, allowing efficient fine-tuning without consuming excessive memory resources. Early stopping was also applied to prevent overfitting.

Training ran on the JAX backend with TPU acceleration, distributing the workload across multiple TPU cores for better efficiency. The model was optimized with the AdamW optimizer, and the loss was sparse categorical cross-entropy.
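
A minimal sketch of this kind of JAX/TPU setup with the Keras 3 distribution API; the sharding rules below are illustrative, not the exact layout used for training, and the `ModelParallel` signature differs slightly across Keras 3 releases:

```python
import os
os.environ["KERAS_BACKEND"] = "jax"  # select the JAX backend before importing Keras

import keras

# Build a device mesh over the available TPU cores and shard the model across it.
devices = keras.distribution.list_devices()
device_mesh = keras.distribution.DeviceMesh(
    shape=(1, len(devices)), axis_names=("batch", "model"), devices=devices
)
layout_map = keras.distribution.LayoutMap(device_mesh)
# Illustrative sharding rules: split the embedding and attention kernels over the "model" axis.
layout_map["token_embedding/embeddings"] = ("model", None)
layout_map["decoder_block.*attention.*(query|key|value)/kernel"] = ("model", None, None)

keras.distribution.set_distribution(
    keras.distribution.ModelParallel(layout_map=layout_map, batch_dim_name="batch")
)
```
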
### Key Fine-tuning Details:

- **sequence_length**: 512
- **batch_size**: 16
- **lora_rank**: 8
- **learning_rate**: 2e-5
- **epochs**: 20
- **max_length**: 512
- **train_data_rate**: 0.9
- **early_stopping_step**: 5

**Training Method:**

The training inputs are the "question_crawling" and "answer_crawling" columns, and the targets are the "question_generated", "multiple_choice_generated", and "answer_generated" columns. Each example was formatted into the model's expected input structure before training.
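
For illustration only (the exact prompt template is not reproduced here), the column-to-text mapping can be sketched as:

```python
# Hypothetical formatting helper: maps one dataset row to a single training
# string for the causal LM. The real template used for fine-tuning may differ.
def format_example(row: dict) -> str:
    prompt = (
        f"Question: {row['question_crawling']}\n"
        f"Answer: {row['answer_crawling']}\n"
        "Generate a new four-option multiple-choice quiz:\n"
    )
    target = (
        f"{row['question_generated']}\n"
        f"{row['multiple_choice_generated']}\n"
        f"{row['answer_generated']}"
    )
    return prompt + target

# `rows` is assumed to be a list of dicts parsed from the dataset;
# 90% of the formatted strings go to training, 10% to validation.
texts = [format_example(row) for row in rows]
split = int(len(texts) * 0.9)
train_texts, val_texts = texts[:split], texts[split:]
```
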

The model is built on KerasNLP's pre-built `GemmaCausalLM` with the Gemma 2 2B instruct preset. LoRA is activated on the decoder blocks, and the model is spread across TPU devices through model parallelism. Training ran for 20 epochs using TPU resources on Kaggle.
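
A sketch of that setup with KerasNLP; `train_texts` and `val_texts` are the splits from the formatting sketch above, and this approximates rather than reproduces the actual training script:

```python
import keras
import keras_nlp
import tensorflow as tf  # tf.data is used only for the input pipeline

# Load the Gemma 2 2B instruct preset and enable rank-8 LoRA on the decoder weights.
gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma2_instruct_2b_en")
gemma_lm.backbone.enable_lora(rank=8)
gemma_lm.preprocessor.sequence_length = 512

gemma_lm.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=keras.optimizers.AdamW(learning_rate=2e-5),
    weighted_metrics=[keras.metrics.SparseCategoricalAccuracy()],
)

train_ds = tf.data.Dataset.from_tensor_slices(train_texts).shuffle(1000).batch(16)
val_ds = tf.data.Dataset.from_tensor_slices(val_texts).batch(16)

# Stop once validation loss has not improved for 5 epochs ("early_stopping_step" above).
early_stopping = keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True
)

gemma_lm.fit(train_ds, validation_data=val_ds, epochs=20, callbacks=[early_stopping])
```
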
**Training Time and Results:**

- Training ran for 20 epochs with a batch size of 16.
- Time per epoch: ~5 minutes (on TPU)
- Total training time: ~100 minutes
- The model generated quizzes reliably, reaching approximately 81% accuracy on the validation data.

**Generated Example:**

```
What do you want to do right now?
1. Dating  2. Travel  3. Exercise  4. Cooking
1
```

**Correct Data:**

```
What do you want to do right now?
1. Overseas travel  2. Dating  3. Skydiving  4. Marathon
2
```
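
Generation itself is a single `generate` call on the fine-tuned model; the prompt below mirrors the formatting sketch above and is illustrative rather than the exact template:

```python
user_question = "What do you want to do right now?"
user_answer = "..."  # free-text response collected from a participant

prompt = (
    f"Question: {user_question}\n"
    f"Answer: {user_answer}\n"
    "Generate a new four-option multiple-choice quiz:\n"
)
print(gemma_lm.generate(prompt, max_length=512))
```
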
## 3. Model Usage

The model has been deployed as a service using the Flask framework and Google Survey, so real users can use it for icebreaking. Participants answer simple questions on Google Survey, and the model generates new quiz questions from their responses. Flask handles the question generation, and a "conversation topic guide" is provided to keep the conversation going while the questions are being generated.
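
A minimal Flask sketch of such an endpoint; the route, field names, and port are assumptions for illustration, not the deployed service's actual API:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/generate", methods=["POST"])
def generate_quiz():
    payload = request.get_json()
    # Build the prompt from the survey question and the participant's answer.
    prompt = (
        f"Question: {payload['question']}\n"
        f"Answer: {payload['answer']}\n"
        "Generate a new four-option multiple-choice quiz:\n"
    )
    # `gemma_lm` is the fine-tuned model loaded as in the fine-tuning section above.
    quiz = gemma_lm.generate(prompt, max_length=512)
    return jsonify({"quiz": quiz})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=7860)
```
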
**Deployment Platform:**

- Supported backends: JAX, TensorFlow, PyTorch

**Service link:** [http://64.110.84.104:7860](http://64.110.84.104:7860)

**Code link:** [https://huggingface.co/spaces/yunzi7/icebreaking/tree/main](https://huggingface.co/spaces/yunzi7/icebreaking/tree/main)

---
This model has been uploaded using the Keras library and can be used with JAX,
TensorFlow, and PyTorch backends.
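
For example, the weights can be loaded under any of these backends; the repo id below is a placeholder, and loading Keras presets from the Hugging Face Hub via the `hf://` scheme assumes a recent KerasNLP version:

```python
import os
os.environ["KERAS_BACKEND"] = "jax"  # or "tensorflow" / "torch"

import keras_nlp

# Placeholder repo id: substitute this model's actual Hugging Face repo.
model = keras_nlp.models.GemmaCausalLM.from_preset("hf://<namespace>/<this-model-repo>")
print(model.generate("Question: ...", max_length=128))
```
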
For more details about the model architecture, check out
[config.json](./config.json).