---
library_name: keras
license: mit
language:
- ko
base_model:
- google/gemma-2-2b-it
---

# Icebreaking Quiz Generation Model

This model generates icebreaking quizzes. Icebreaking refers to short activities held before a meeting, such as solving simple quizzes, that help participants ease tension and get to know each other. Given an easy-to-answer question and a user's response to it, the model generates a new multiple-choice question with four options and its correct answer. These customized questions help participants engage with each other and reduce awkwardness in the room.

## 1. Data Collection
We collected questions and answers from blogs via web crawling and used state-of-the-art models like ChatGPT to generate new four-option multiple-choice questions. We then manually reviewed and edited the generated questions to correct any errors in answer matching and adjusted the difficulty level to make the multiple-choice options more challenging. This process resulted in a well-structured dataset.

**Example Data**

| blog_id | question_crawling             | question_number | answer_crawling                                                                                                      | question_generated                                      | multiple_choice_generated                                           | answer_generated |
|---------|-------------------------------|-----------------|----------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------|---------------------------------------------------------------------|-----------------|
| gemma   | The most memorable person in your life | 95              | My art academy teacher! She is the person who helped me become interested in art. After school I would go to the academy, and she gave me lots of tasty treats and left me with only good memories. | Who is the person this person remembers most from their elementary school days? | 1. Elementary school homeroom teacher  2. Middle school classmate  3. Art academy teacher  4. University professor | 3               |

## 2. Fine-tuning Process
This model was fine-tuned from Google's Gemma 2 2B instruct model, with LoRA applied to make training lighter and faster on TPU. LoRA reduces the number of parameters updated during training, enabling efficient fine-tuning without excessive memory use. Early stopping was also applied to prevent overfitting.

Training was conducted on the JAX backend with TPU acceleration, distributing the workload across multiple TPU cores for efficiency (a sketch of this setup follows). The model was optimized with AdamW, and the loss was sparse categorical cross-entropy.
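
For illustration, here is a minimal sketch of that distribution setup using the Keras 3 distribution API. The mesh shape and layout rules below follow the public Gemma guides and are assumptions, not copied from this project's notebook:

```python
import keras

# Build a 1 x N device mesh over the available TPU cores:
# "batch" for data sharding, "model" for weight sharding.
devices = keras.distribution.list_devices()
mesh = keras.distribution.DeviceMesh(
    shape=(1, len(devices)), axis_names=("batch", "model"), devices=devices
)

# Shard Gemma's large weight matrices along the "model" axis.
layout_map = keras.distribution.LayoutMap(mesh)
layout_map["token_embedding/embeddings"] = ("model", None)
layout_map["decoder_block.*attention.*(query|key|value).kernel"] = ("model", None, None)
layout_map["decoder_block.*attention_output.kernel"] = ("model", None, None)
layout_map["decoder_block.*ffw_gating.*kernel"] = (None, "model")
layout_map["decoder_block.*ffw_linear.kernel"] = ("model", None)

keras.distribution.set_distribution(
    keras.distribution.ModelParallel(mesh, layout_map, batch_dim_name="batch")
)
```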

### Key Fine-tuning Details

- **Sequence Length**: 512  
- **Batch Size**: 16  
- **LoRA Rank**: 8  
- **Learning Rate**: 2e-5  
- **Epochs**: 20  
- **Train Data Rate**: 0.9 (the remaining 0.1 is held out for validation)  
- **Early Stopping Step**: 5  

**Training Method:**

The training data uses the "question_crawling" and "answer_crawling" columns as inputs and the "question_generated", "multiple_choice_generated", and "answer_generated" columns as targets. Each row is formatted into the model's prompt template (see Section 3) before training; a sketch of this step follows.
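
Here is a sketch of this formatting step, assuming the dataset sits in a pandas DataFrame with the column names above (the file name is hypothetical; the delimiters match the template in Section 3):

```python
import pandas as pd

def format_example(row: pd.Series) -> str:
    """Turn one dataset row into a single training string."""
    return (
        "<instruction>\n"
        f"Using the text: {row['question_crawling']} {row['answer_crawling']}, "
        "create a new multiple-choice question with 4 answer options.\n"
        "<Response>\n"
        f"{row['question_generated']}\n"
        f"{row['multiple_choice_generated']}\n"
        f"{row['answer_generated']}"
    )

df = pd.read_csv("icebreaking_quiz.csv")  # hypothetical file name
train_texts = df.apply(format_example, axis=1).tolist()
```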

The model is built on KerasNLP's pre-built `GemmaCausalLM` with the Gemma 2 2B instruct preset. LoRA is enabled on the decoder blocks, and the weights are distributed across TPU devices via model parallelism. Training ran for up to 20 epochs using TPU resources on Kaggle; the sketch below puts these pieces together.
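
A minimal end-to-end sketch with the hyperparameters listed above (`train_texts` comes from the formatting step; the preset name, weight decay, and early-stopping configuration are assumptions):

```python
import keras
import keras_nlp
import tensorflow as tf  # tf.data is used for input pipelines on any backend

gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma2_instruct_2b_en")
gemma_lm.backbone.enable_lora(rank=8)        # LoRA rank: 8
gemma_lm.preprocessor.sequence_length = 512  # sequence length: 512

split = int(len(train_texts) * 0.9)          # train data rate: 0.9
train_ds = tf.data.Dataset.from_tensor_slices(train_texts[:split]).batch(16)
val_ds = tf.data.Dataset.from_tensor_slices(train_texts[split:]).batch(16)

gemma_lm.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=keras.optimizers.AdamW(learning_rate=2e-5, weight_decay=0.01),
    weighted_metrics=[keras.metrics.SparseCategoricalAccuracy()],
)

gemma_lm.fit(
    train_ds,
    validation_data=val_ds,
    epochs=20,  # epochs: 20, with early stopping (patience of 5 is assumed)
    callbacks=[keras.callbacks.EarlyStopping(monitor="val_loss", patience=5)],
)
```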

**Training Time and Results:**

- Training ran for 20 epochs with a batch size of 16.  
- Time per epoch: ~5 minutes (with TPU)  
- Total training time: ~100 minutes  
- The model demonstrated high accuracy in quiz generation, achieving approximately 81% accuracy on the validation data.

**Input**
```
<instruction>
Using the text: Favorite hero? My hero is my dad, create a new multiple-choice question with 4 answer options.
```
**Model Output**
```
<Response>
Who is your favorite hero?
1. My dad  2. Superman  3. Captain America  4. Hulk
1
```
**Expected Answer**
```
<Response>
Who is your favorite hero?
1. My mom  2. Spider-Man  3. My dad  4. Iron Man
3
```

**Kaggle Notebook**: [https://www.kaggle.com/code/mukmukmukmuk/v2-fine-tune-2b-icebreaking-quiz-tpu/notebook](https://www.kaggle.com/code/mukmukmukmuk/v2-fine-tune-2b-icebreaking-quiz-tpu/notebook)

## 3. Model Usage

When using this model, data must be formatted according to a specific template. The following structure is required for input:

```plaintext
<instruction>
Using the text: {question_crawling} {answer_crawling}, create a new multiple-choice question with 4 answer options.
```

The model takes the `question_crawling` and `answer_crawling` data to generate new multiple-choice questions. The expected format for the generated output is:

```plaintext
<Response>
{question_generated}
{multiple_choice_generated}
{answer_generated}
```

This template includes the newly generated question, the multiple-choice options, and the correct answer. At inference time, the `question_generated`, `multiple_choice_generated`, and `answer_generated` fields are left blank for the model to fill in; a usage sketch follows.
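
A minimal usage sketch under this template; the preset path is a placeholder for wherever this repository's weights are stored, and `max_length` is an assumption:

```python
import keras_nlp

# Load the fine-tuned model (placeholder path; point this at the
# weights downloaded from this repository).
gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("./icebreaking-quiz-gemma")

question_crawling = "Favorite hero?"
answer_crawling = "My hero is my dad"

# The response fields are left blank for the model to fill in.
prompt = (
    "<instruction>\n"
    f"Using the text: {question_crawling} {answer_crawling}, "
    "create a new multiple-choice question with 4 answer options.\n"
    "<Response>\n"
)

# The completion after <Response> carries the generated question,
# the four options, and the answer index on separate lines.
print(gemma_lm.generate(prompt, max_length=512))
```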

This model has been developed as a service using the Flask framework and Google Survey, allowing real users to use it for icebreaking. Users answer simple questions on Google Survey, and the model generates new quiz questions from their responses. A Flask backend handles the question generation, and a 'conversation topic guide' is provided to keep the conversation going while the questions are generated; a minimal serving sketch follows.
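
As an illustration only (the route name and payload shape are hypothetical, not taken from the linked Space), the serving layer can be a single Flask route wrapping the template above:

```python
import keras_nlp
from flask import Flask, jsonify, request

# Placeholder path, as in the usage sketch above.
gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("./icebreaking-quiz-gemma")

app = Flask(__name__)

@app.route("/quiz", methods=["POST"])  # hypothetical endpoint
def generate_quiz():
    # Expects one survey response as {"question": "...", "answer": "..."}.
    data = request.get_json()
    prompt = (
        "<instruction>\n"
        f"Using the text: {data['question']} {data['answer']}, "
        "create a new multiple-choice question with 4 answer options.\n"
        "<Response>\n"
    )
    completion = gemma_lm.generate(prompt, max_length=512)
    return jsonify({"quiz": completion})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=7860)
```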

**Deployment Platform:**  
- Supported backends: JAX, TensorFlow, PyTorch

**Service link**: [http://64.110.84.104:7860](http://64.110.84.104:7860)  
**Code link**: [https://huggingface.co/spaces/yunzi7/icebreaking/tree/main](https://huggingface.co/spaces/yunzi7/icebreaking/tree/main)

For more details about the model architecture, check out the [config.json](./config.json).