Figure 1: Learning Capability of Fine-tuning Approaches. Shown is the percentage of correct responses on a hotel booking task before and after each of the four fine-tuning methods is applied. The percentage above each post-fine-tuning bar is the net change in proficiency produced by that method.
We observe that the Tenyx method produces the highest-performing hotel booking agent, with the model fine-tuned by TogetherAI a close second. The TogetherAI method yields the greatest proficiency increase, but this may largely be because its pre-fine-tuning baseline is considerably lower than those of the models used to evaluate the other methods. The substantially higher pre-fine-tuning proficiency of the OpenAI model is to be expected, since that model is substantially larger (20+ billion parameters) than the others (7 billion parameters).
Safety (Toxicity)
Figure 2 illustrates the most striking results of our investigations. It shows how the fine-tuning approaches used by industry leaders, as well as the open-source one (LoRA), degrade the safeguards obtained via the costly reinforcement learning from human feedback (RLHF) process. While every current fine-tuning solution loses some of this protective layer, the Tenyx approach preserves it best, making it the safest of the approaches we tested.
Knowledge (Forgetting)
To measure the forgetting induced by each fine-tuning method, we compared each model's performance on a domain-general dataset before and after fine-tuning that model on our domain-specific dataset (hotel room booking). For the domain-general dataset, we used Dolly Q&A, which consists of general knowledge questions covering a wide range of topics. Our results are displayed in Figure 3. We found that the Tenyx fine-tuning method best mitigates the forgetting phenomenon, losing only 3% of domain-general performance and so retaining most of the base model's knowledge. Again, the OpenAI model's pre-fine-tuning performance is, as expected, significantly better than that of the others, which is explained by its much larger size.
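The before-and-after comparison described above comes down to simple arithmetic on two scores. The sketch below is illustrative only: the `EvalResult` class and the example numbers are hypothetical and are not taken from the figures, and the post does not say whether its percentages are absolute (percentage points) or relative to the pre-fine-tuning score, so both are computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalResult:
    """Scores on one benchmark, as fractions in [0, 1] (hypothetical helper)."""
    before: float  # score before fine-tuning
    after: float   # score after fine-tuning

    @property
    def absolute_change(self) -> float:
        """Change in percentage points (negative means forgetting)."""
        return (self.after - self.before) * 100.0

    @property
    def relative_change(self) -> float:
        """Change as a percentage of the pre-fine-tuning score."""
        if self.before == 0:
            raise ValueError("relative change is undefined for a zero baseline")
        return (self.after - self.before) / self.before * 100.0


# Made-up numbers for illustration; they are NOT results from the post.
general_knowledge = EvalResult(before=0.80, after=0.776)

print(f"absolute change: {general_knowledge.absolute_change:.1f} points")
print(f"relative change: {general_knowledge.relative_change:.1f} %")
```

The same two quantities apply to the proficiency and safety comparisons; only the sign convention differs, since a gain is desirable on the booking task and a loss is what is being measured on the general dataset.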
Safer, Smarter Fine-Tuning
The loss of safety and knowledge observed above is a direct consequence of learning a new domain through fine-tuning. But why does this happen?
Fine-tuning alters the pre-trained weights of the model. In particular, techniques such as LoRA add a low-rank update that shifts every weight in the layers to which they are applied. These perturbations typically distort the model outputs. This explains the general trend in the figures above: proficiency on the hotel booking use case increases after fine-tuning, yet general knowledge and safety capabilities are drastically reduced.
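The point about LoRA touching every weight in a layer can be seen in a small NumPy sketch. This is a toy illustration under stated assumptions, not the setup used in the experiments above: the matrix sizes are arbitrary, the random factors `B` and `A` stand in for trained adapter weights, and the usual LoRA scaling factor is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only; real layers are far larger.
d_out, d_in, rank = 8, 8, 2

# Stand-in for one frozen pre-trained weight matrix.
W = rng.standard_normal((d_out, d_in))

# Low-rank adapter factors. Random values stand in for trained factors
# (an assumption made purely for illustration).
B = rng.standard_normal((d_out, rank))
A = rng.standard_normal((rank, d_in))

# The effective weight after adaptation is W + B @ A.
delta = B @ A
W_adapted = W + delta

changed = np.count_nonzero(delta)
print(f"entries of W shifted by the update: {changed} of {W.size}")
print(f"rank of the update: {np.linalg.matrix_rank(delta)}")
print(f"relative size of the shift: {np.linalg.norm(delta) / np.linalg.norm(W):.2f}")
```

Although the update has only `rank * (d_out + d_in)` trainable parameters, the product `B @ A` is generally dense, so the effective value of every entry in the layer moves.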
The Tenyx fine-tuning method, by contrast, minimally disturbs these pre-trained weights while still learning from the new data. Thus, while the other fine-tuning schemes we assessed gain proficiency at the cost of degraded knowledge and safety, the Tenyx approach achieves the highest proficiency with the least loss in both.