RichardErkhov committed
Commit 016bfbc · verified · 1 Parent(s): 015b5f2

uploaded readme

Files changed (1)
  1. README.md +68 -0
README.md ADDED
@@ -0,0 +1,68 @@
+ Quantization made by Richard Erkhov.
+
+ [Github](https://github.com/RichardErkhov)
+
+ [Discord](https://discord.gg/pvy7H8DZMG)
+
+ [Request more models](https://github.com/RichardErkhov/quant_request)
+
+
+ tat-llm-7b-fft - bnb 8bits
+ - Model creator: https://huggingface.co/next-tat/
+ - Original model: https://huggingface.co/next-tat/tat-llm-7b-fft/
+
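As a rough guide to what the 8-bit quantization buys, the sketch below estimates the memory needed for the weights alone. It assumes a nominal 7 billion parameters (the exact count is not stated in this README) and ignores activations, the KV cache, and any layers that are left unquantized, so treat the numbers as approximate. The helper name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(n_params: float, bits_per_param: int) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


# Assumption: nominal "7B" parameter count taken from the model name.
N_PARAMS = 7e9

fp16_gib = weight_memory_gib(N_PARAMS, 16)  # half-precision baseline
int8_gib = weight_memory_gib(N_PARAMS, 8)   # 8-bit quantized weights

print(f"fp16 weights: ~{fp16_gib:.1f} GiB")
print(f"int8 weights: ~{int8_gib:.1f} GiB")
```

Under these assumptions the 8-bit weights take about half the memory of a half-precision copy.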
+
+
+
+ Original model description:
+ ---
+ language:
+ - en
+ license: llama2
+ ---
+
+ # TAT-LLM: A Specialized Language Model for Discrete Reasoning over Tabular and Textual Data
+
+ Paper: https://arxiv.org/abs/2401.13223
+
+ Code: https://github.com/fengbinzhu/TAT-LLM
+
+
+ ## Introduction
+
+ We present TAT-LLM, a specialized language model for question answering (QA) over tabular and textual data, built with a Step-wise Pipeline approach. The model is obtained by fine-tuning LLaMA 2 on a new dataset generated automatically from expert-annotated resources. The pipeline has three steps: Extraction, Reasoning, and Execution. In our experiments, TAT-LLM outperforms both prior state-of-the-art models and large general-purpose language models such as GPT-4 on the financial QA benchmarks FinQA, TAT-QA, and TAT-DQA. These results suggest that smaller models can be optimized effectively for highly specialized tasks.
+
+ | Model | Size | FinQA | TAT-QA | TAT-DQA |
+ | --- | --- | --- | --- | --- |
+ | GPT-3.5-Turbo | - | 58.00 | 59.47 | 52.74 |
+ | GPT-4 | - | 63.91 | 71.92 | 64.46 |
+ | [TAT-LLM-7B-LORA](https://huggingface.co/next-tat/tat-llm-7b-lora) | 7B | 65.13 | 76.49 | 71.38 |
+ | [TAT-LLM-7B-FFT](https://huggingface.co/next-tat/tat-llm-7b-fft) | 7B | 69.75 | 76.91 | 72.64 |
+ | [TAT-LLM-13B-LORA](https://huggingface.co/next-tat/tat-llm-13b-lora) | 13B | 71.93 | 77.51 | 72.22 |
+ | [TAT-LLM-13B-FFT](https://huggingface.co/next-tat/tat-llm-13b-fft) | 13B | 72.97 | 78.41 | 73.18 |
+ | [TAT-LLM-70B-LORA](https://huggingface.co/next-tat/tat-llm-70b-lora) | 70B | **76.81** | 81.42 | 76.55 |
+ | [TAT-LLM-70B-FFT](https://huggingface.co/next-tat/tat-llm-70b-fft) | 70B | 76.11 | **82.20** | **76.97** |
+
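To make the comparison with GPT-4 concrete, the short sketch below computes per-benchmark margins from the scores in the table above. Only three rows are copied in, and the helper name `margin_over` is hypothetical. It is plain arithmetic on the reported numbers, not a re-evaluation.

```python
# Scores copied from the table above, in the order (FinQA, TAT-QA, TAT-DQA).
BENCHMARKS = ("FinQA", "TAT-QA", "TAT-DQA")
SCORES = {
    "GPT-4": (63.91, 71.92, 64.46),
    "TAT-LLM-7B-FFT": (69.75, 76.91, 72.64),
    "TAT-LLM-70B-FFT": (76.11, 82.20, 76.97),
}


def margin_over(baseline: str, model: str) -> tuple:
    """Return the score difference (model minus baseline) per benchmark."""
    return tuple(
        round(m - b, 2) for m, b in zip(SCORES[model], SCORES[baseline])
    )


for name in ("TAT-LLM-7B-FFT", "TAT-LLM-70B-FFT"):
    margins = margin_over("GPT-4", name)
    summary = ", ".join(f"{b}: +{d:.2f}" for b, d in zip(BENCHMARKS, margins))
    print(f"{name} vs GPT-4 -> {summary}")
```

For the 7B full fine-tuned model that this repository quantizes, the reported margins over GPT-4 are 5.84, 4.99, and 8.18 points on FinQA, TAT-QA, and TAT-DQA respectively.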
+ ## Training
+
+ We train TAT-LLM in three sizes (7B, 13B, and 70B) using both parameter-efficient fine-tuning (the LORA variants in the table) and full-parameter fine-tuning (the FFT variants) of LLaMA 2 on a combination of financial data from the FinQA, TAT-QA, and TAT-DQA training sets ([🤗HuggingFace Repo](https://huggingface.co/datasets/next-tat/tat-llm-instructions)). To improve accuracy, we introduce an External Executor, which processes the model's intermediate outputs to derive the final answers. Please refer to the [paper](https://arxiv.org/abs/2401.13223) for more details.
+
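The External Executor idea can be illustrated with a minimal sketch: the model emits an intermediate arithmetic expression, and a separate program evaluates it so the final number does not depend on the model's own arithmetic. The expression format and the function name `execute_expression` are assumptions made for illustration. The executor in the paper and the linked code may parse a different output format and support more operations.

```python
import ast
import operator

# Whitelisted arithmetic operators; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def execute_expression(expression: str) -> float:
    """Safely evaluate a plain arithmetic expression produced by a model."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant):
            value = node.value
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                return value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"unsupported expression element: {ast.dump(node)}")

    return _eval(ast.parse(expression, mode="eval"))


# Hypothetical intermediate output for a "percentage change" question.
model_expression = "(1200 - 1000) / 1000"
print(execute_expression(model_expression))
```

Walking the syntax tree with a whitelist, rather than calling `eval`, keeps the sketch from executing arbitrary code if the model output is malformed.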
+ ## Inference & Evaluation
+
+ Please refer to the code [here](https://github.com/fengbinzhu/TAT-LLM).
+
+ ## Citation
+
+ If you find this model helpful, please consider citing our paper:
+
+ ```
+ @misc{zhu2024tatllm,
+ title={TAT-LLM: A Specialized Language Model for Discrete Reasoning over Tabular and Textual Data},
+ author={Fengbin Zhu and Ziyang Liu and Fuli Feng and Chao Wang and Moxin Li and Tat-Seng Chua},
+ year={2024},
+ eprint={2401.13223},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL}
+ }
+ ```
+