uploaded readme
README.md
ADDED
@@ -0,0 +1,68 @@
Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)


tat-llm-7b-fft - bnb 8bits
- Model creator: https://huggingface.co/next-tat/
- Original model: https://huggingface.co/next-tat/tat-llm-7b-fft/




Original model description:
---
language:
- en
license: llama2
---

# TAT-LLM: A Specialized Language Model for Discrete Reasoning over Tabular and Textual Data

Paper: https://arxiv.org/abs/2401.13223

Code: https://github.com/fengbinzhu/TAT-LLM

## Introduction

We present TAT-LLM, a language model specialized for question answering (QA) over tabular and textual data. It is obtained by fine-tuning LLaMA 2 on a new dataset generated automatically from expert-annotated resources, following a Step-wise Pipeline with three steps: Extraction, Reasoning, and Execution. In our experiments, TAT-LLM outperforms previous best models as well as much larger general-purpose models such as GPT-4 on the financial QA benchmarks FinQA, TAT-QA, and TAT-DQA. In the table below, every TAT-LLM variant, including the 7B models, scores above GPT-4 on all three benchmarks. These results suggest that smaller models can be optimized effectively for highly specialized tasks.

| Model | Size | FinQA | TAT-QA | TAT-DQA |
| --- | --- | --- | --- | --- |
| GPT-3.5-Turbo | - | 58.00 | 59.47 | 52.74 |
| GPT-4 | - | 63.91 | 71.92 | 64.46 |
| [TAT-LLM-7B-LORA](https://huggingface.co/next-tat/tat-llm-7b-lora) | 7B | 65.13 | 76.49 | 71.38 |
| [TAT-LLM-7B-FFT](https://huggingface.co/next-tat/tat-llm-7b-fft) | 7B | 69.75 | 76.91 | 72.64 |
| [TAT-LLM-13B-LORA](https://huggingface.co/next-tat/tat-llm-13b-lora) | 13B | 71.93 | 77.51 | 72.22 |
| [TAT-LLM-13B-FFT](https://huggingface.co/next-tat/tat-llm-13b-fft) | 13B | 72.97 | 78.41 | 73.18 |
| [TAT-LLM-70B-LORA](https://huggingface.co/next-tat/tat-llm-70b-lora) | 70B | **76.81** | 81.42 | 76.55 |
| [TAT-LLM-70B-FFT](https://huggingface.co/next-tat/tat-llm-70b-fft) | 70B | 76.11 | **82.20** | **76.97** |

## Training

We train TAT-LLM in three sizes (7B, 13B, and 70B) using both parameter-efficient fine-tuning (LoRA) and full-parameter fine-tuning (FFT) of LLaMA 2 on a combination of financial data from the FinQA, TAT-QA, and TAT-DQA training sets ([🤗HuggingFace Repo](https://huggingface.co/datasets/next-tat/tat-llm-instructions)). To improve accuracy, we add an External Executor that processes the model's intermediate outputs to derive the final answers. Please refer to the [paper](https://arxiv.org/abs/2401.13223) for more details.
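
To make the External Executor idea concrete, here is a minimal sketch in Python. It assumes the model's intermediate output is a plain arithmetic expression. The function name `execute_expression` and the sample equation are hypothetical and used only for illustration, so the executor in the paper and the linked repository may work differently.

```python
import ast
import operator

# Arithmetic operators this sketch is willing to evaluate (an assumption,
# not the operator set of the actual TAT-LLM executor).
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def execute_expression(expression: str) -> float:
    """Evaluate a plain arithmetic expression without calling eval()."""

    def _eval(node: ast.AST) -> float:
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        # Accept numeric literals only (bool is excluded explicitly).
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return float(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        # Names, calls, attribute access, etc. are rejected.
        raise ValueError(f"unsupported expression: {expression!r}")

    return _eval(ast.parse(expression.strip(), mode="eval"))


# Hypothetical intermediate output: the model writes the equation,
# the executor computes the number.
model_equation = "(1200 - 1000) / 1000"
answer = execute_expression(model_equation)
print(round(answer * 100, 2))  # prints 20.0
```

Delegating the arithmetic to a deterministic executor means the language model only has to extract the right operands and write the right equation. The final number then does not depend on the model computing it correctly token by token.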

## Inference & Evaluation

Please refer to the code [here](https://github.com/fengbinzhu/TAT-LLM).

## Citation

If you find this model helpful, please consider citing our paper:

```
@misc{zhu2024tatllm,
      title={TAT-LLM: A Specialized Language Model for Discrete Reasoning over Tabular and Textual Data},
      author={Fengbin Zhu and Ziyang Liu and Fuli Feng and Chao Wang and Moxin Li and Tat-Seng Chua},
      year={2024},
      eprint={2401.13223},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```