---
license: apache-2.0
language:
- en
tags:
- text-to-sql
- code
- sql
- fine-tuned
- unsloth
- lora
base_model: Snowflake/Arctic-Text2SQL-R1-7B
---

# Snowflake/Arctic-Text2SQL-R1-7B Fine-tuned for NL2SQL++ v8

This model is a fine-tuned version of [Snowflake/Arctic-Text2SQL-R1-7B](https://huggingface.co/Snowflake/Arctic-Text2SQL-R1-7B), trained on the NL2SQL++ v8 dataset with code-with-thought reasoning.
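
A minimal inference sketch follows, assuming the merged 16-bit weights load through the standard `transformers` API. The repository id and the schema-plus-question prompt below are placeholders: adapt the prompt to the exact format used during fine-tuning.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/arctic-text2sql-nl2sqlpp-v8"  # placeholder repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # weights are published as merged 16-bit
    device_map="auto",
)

# Hypothetical prompt layout; match the template used during fine-tuning.
prompt = (
    "Schema:\n"
    "CREATE TABLE employees (id INT, name TEXT, salary INT, dept TEXT);\n\n"
    "Question: What is the average salary per department?\n\n"
    "SQL:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```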

## Model Details

- **Base Model**: Snowflake/Arctic-Text2SQL-R1-7B
- **Task**: Text-to-SQL generation
- **Dataset**: NL2SQL++ v8 with code-with-thought reasoning
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation) with Unsloth
- **Saved Weights**: merged 16-bit (LoRA adapters merged into the base model; no quantization)
- **Maximum Sequence Length**: 32,768 tokens
- **Training Dataset Size**: 46,344 examples
- **Validation Dataset Size**: 1,986 examples
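
Because the adapters are already merged, the checkpoint can also be loaded directly with Unsloth, which was used for the fine-tuning itself. A sketch under that assumption; the repository id is again a placeholder:

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="your-username/arctic-text2sql-nl2sqlpp-v8",  # placeholder repo id
    max_seq_length=32768,  # matches the maximum sequence length above
    dtype=None,            # auto-detect (bfloat16 on Ampere and newer GPUs)
    load_in_4bit=False,    # the published weights are merged 16-bit, not quantized
)
FastLanguageModel.for_inference(model)  # switch to Unsloth's fast inference mode
```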

## Training Configuration

### LoRA Parameters

- **LoRA Rank (r)**: 64
- **LoRA Alpha**: 128
- **Target Modules**: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
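
These settings correspond to an Unsloth adapter configuration along the following lines. This is a reconstruction, not the original training script: `lora_dropout`, `bias`, and gradient checkpointing are assumptions, since the card does not list them.

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Snowflake/Arctic-Text2SQL-R1-7B",
    max_seq_length=32768,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=64,            # LoRA rank
    lora_alpha=128,  # scaling factor: alpha / r = 2.0
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.0,                      # assumption: not listed in this card
    bias="none",                           # assumption: not listed in this card
    use_gradient_checkpointing="unsloth",  # assumption: not listed in this card
    random_state=3407,                     # matches the training seed below
)
```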

### Training Hyperparameters

- **Learning Rate**: 0.0002
- **Training Epochs**: 2
- **Max Steps**: not set (training runs for the full 2 epochs)
- **Train Batch Size**: 64
- **Eval Batch Size**: 50
- **Gradient Accumulation Steps**: 2
- **Effective Batch Size**: 128
- **Warmup Steps**: 0 (warmup length is set by the ratio below)
- **Warmup Ratio**: 0.1
- **Optimizer**: AdamW (torch)
- **Learning Rate Scheduler**: cosine
- **Weight Decay**: 0.01
- **Max Gradient Norm**: 1.0
- **Seed**: 3407
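
Taken together, these hyperparameters map onto a trainer configuration roughly like the sketch below. It assumes TRL's `SFTTrainer`, the usual companion to Unsloth (the card only states that Unsloth was used); `train_dataset` and `eval_dataset` are placeholders for the NL2SQL++ v8 splits.

```python
from trl import SFTConfig, SFTTrainer

args = SFTConfig(
    output_dir="outputs",
    learning_rate=2e-4,
    num_train_epochs=2,              # max_steps left unset: epoch-based training
    per_device_train_batch_size=64,
    per_device_eval_batch_size=50,
    gradient_accumulation_steps=2,   # 64 * 2 = 128 effective (assuming one GPU)
    warmup_steps=0,
    warmup_ratio=0.1,                # takes effect because warmup_steps is 0
    optim="adamw_torch",
    lr_scheduler_type="cosine",
    weight_decay=0.01,
    max_grad_norm=1.0,
    seed=3407,
)
trainer = SFTTrainer(
    model=model,                  # the LoRA-wrapped model from the sketch above
    args=args,
    train_dataset=train_dataset,  # placeholder: NL2SQL++ v8 train split
    eval_dataset=eval_dataset,    # placeholder: NL2SQL++ v8 validation split
)
trainer.train()
```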