Digital Product Studio

The New OpenChat-3.5-0106-Gemma Outshines Google Gemma 7B Model with 6T Tokens

OpenChat has recently introduced its new language model: OpenChat-3.5-0106-Gemma. Built on Google's Gemma 7B base, which was pretrained on an astounding 6T tokens, it outperforms the original Gemma 7B model on several benchmarks. Let's explore what makes this OpenChat Gemma model so special and how it achieves such impressive results.

OpenChat-3.5-0106: A Foundation to Build Upon

Before we delve into the details of the OpenChat Gemma model, let's first understand the foundation on which it is built. OpenChat-3.5-0106 is a 7B-parameter conversational language model originally built on Mistral and fine-tuned with C-RLFT, a technique designed to advance open-source models using mixed-quality data. It demonstrates state-of-the-art performance on benchmarks such as HumanEval and AGIEval. The model code and weights are released under an open-source license for everyone to use freely.

Introduction to OpenChat-3.5-0106-Gemma

OpenChat-3.5-0106-Gemma is a unique model that applies the same C-RLFT training procedure as OpenChat-3.5-0106 but uses Google's Gemma 7B model as the base instead of Mistral. Gemma is distributed under Google's terms of use, which permit this kind of fine-tuning and redistribution of the resulting model. The OpenChat Gemma model achieves results on par with the Mistral-based version, which is no small feat, while clearly outperforming the original Gemma 7B model.

The Secret Recipe: 6T Tokens

The secret lies in the recipe, and the main ingredient is the 6T tokens. The base Gemma model was pretrained on 6T (trillion) tokens, a substantial increase over the roughly 1T-2T tokens used for many earlier open models, and that extra data yields significant gains. It helps the model learn richer representations, leading to more coherent, consistent, and capable conversations. The OpenChat Gemma model benefits from inheriting this high-quality pretraining foundation, and the vast pretraining data can thus be considered a key "secret sauce" behind its strong zero-shot capabilities.

Performance Evaluation of OpenChat-3.5-0106-Gemma

OpenChat-3.5-0106-Gemma achieves similar performance to the Mistral-based OpenChat model and outperforms the original Gemma 7B model by Google on various benchmarks, which we will discuss in detail below. It also outperforms the popular OpenHermes 2.5 7B model on 7 of 8 benchmarks and OpenAI's widely used ChatGPT on 4 of 8 benchmarks.

| Benchmark | OpenChat-3.5-0106 Gemma (7B) | ChatGPT (March) | OpenHermes 2.5 (7B) |
|---|---|---|---|
| Average | 64.4 | 61.5 | 59.3 |
| MT-Bench | 7.83 | 7.94 | 7.54 |
| HumanEval | 67.7 | 48.1 | 48.2 |
| BBH MC | 52.7 | 47.6 | 49.4 |
| AGIEval | 50.2 | 47.1 | 46.5 |
| TruthfulQA | 55.4 | 57.7 | 57.5 |
| MMLU | 65.7 | 67.3 | 63.8 |
| GSM8K | 81.5 | 74.9 | 73.5 |
| BBH CoT | 63.7 | 70.1 | 59.9 |
Table Credits: OpenChat (HuggingFace)
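As a sanity check, the reported 64.4 average can be reproduced from the per-benchmark scores in the table above, assuming (as OpenChat's reported averages suggest) that the 0-10 MT-Bench score is rescaled to a 0-100 range before averaging:

```python
# Per-benchmark scores for OpenChat-3.5-0106-Gemma from the table above.
scores = {
    "MT-Bench": 7.83,   # 0-10 scale
    "HumanEval": 67.7,
    "BBH MC": 52.7,
    "AGIEval": 50.2,
    "TruthfulQA": 55.4,
    "MMLU": 65.7,
    "GSM8K": 81.5,
    "BBH CoT": 63.7,
}

# Assumption: MT-Bench (0-10) is multiplied by 10 so all scores share a 0-100 scale.
normalized = [v * 10 if name == "MT-Bench" else v for name, v in scores.items()]
average = sum(normalized) / len(normalized)
print(round(average, 1))  # 64.4
```

Under that scaling assumption, the eight scores average out to exactly the 64.4 reported in the table.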

1. OpenChat-3.5-0106-Gemma vs. OpenChat-3.5-0106 Mistral

As per the benchmarks, the OpenChat Gemma model achieves state-of-the-art performance similar to the Mistral version on most NLP tasks. On average, across all benchmarks, the Gemma version scores 64.4 compared to 64.5 for Mistral. Both models outperform other baselines like ChatGPT. On the HumanEval coding benchmark, the Gemma version scores 67.7, while Mistral achieves 71.3. However, on mathematical problem solving under the MATH benchmark, Gemma scores slightly higher at 29.3 vs. 28.6 for Mistral.

| Benchmark | OpenChat-3.5-0106 Gemma | OpenChat-3.5-0106 Mistral |
|---|---|---|
| # Params | 7B | 7B |
| Average | 64.4 | 64.5 |
| MT-Bench | 7.83 | 7.8 |
| HumanEval | 67.7 | 71.3 |
| BBH MC | 52.7 | 51.5 |
| AGIEval | 50.2 | 49.1 |
| TruthfulQA | 55.4 | 61.0 |
| MMLU | 65.7 | 65.8 |
| GSM8K | 81.5 | 77.4 |
| BBH CoT | 63.7 | 62.2 |
Table Credits: OpenChat (HuggingFace)

Overall, the benchmark results establish that OpenChat Gemma is able to match the high-quality performance of the original Mistral model despite differences in implementation frameworks.

2. OpenChat-3.5-0106-Gemma vs. Gemma-7B

When compared to the original Gemma-7B model, OpenChat Gemma demonstrates clear improvements on all benchmarks where scores are available.

For example, on HumanEval, the OpenChat Gemma model scores 67.7 compared to 32.3 for the original Gemma-7B model. Similarly, on AGIEval, the scores are 50.2 vs. 41.7 respectively. Notably, OpenChat Gemma even sets a new state-of-the-art score of 81.5 on the GSM8K mathematical reasoning benchmark, surpassing the Gemma-7B score of 46.4.

| Model | # Params | HumanEval | AGIEval | MMLU | GSM8K |
|---|---|---|---|---|---|
| OpenChat-3.5-0106 Gemma | 7B | 67.7 | 50.2 | 65.7 | 81.5 |
| Gemma-7B | 7B | 32.3 | 41.7 | 64.3 | 46.4 |
Table Credits: OpenChat (HuggingFace)
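The margins in the table above translate into the following absolute point gains for OpenChat Gemma over the base Gemma-7B:

```python
# Scores from the table above: (OpenChat-3.5-0106 Gemma, base Gemma-7B).
benchmarks = {
    "HumanEval": (67.7, 32.3),
    "AGIEval": (50.2, 41.7),
    "MMLU": (65.7, 64.3),
    "GSM8K": (81.5, 46.4),
}

# Print the absolute improvement on each benchmark.
for name, (openchat, base) in benchmarks.items():
    print(f"{name}: +{openchat - base:.1f} points")
```

The gains are largest on HumanEval (+35.4) and GSM8K (+35.1), exactly the coding and math-reasoning tasks that C-RLFT fine-tuning targets, while MMLU improves only modestly (+1.4).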

These results validate that the C-RLFT fine-tuning methodology effectively enhances the capabilities of Gemma-7B.

How to Get Started With OpenChat Gemma Model

To use this model, visit the model page on HuggingFace by OpenChat. There, you’ll find all the necessary information and resources to use it effectively. The model is compatible with the OpenAI ChatCompletion API specifications. Additionally, you can use the OpenChat Web UI for a user-friendly experience. 
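Because the model follows a fixed conversation template, prompts must be assembled in that format before inference. As an illustrative sketch, assuming the "GPT4 Correct" role names and `<|end_of_turn|>` separator that OpenChat-3.5 models document on their model card, a prompt string can be built like this:

```python
def build_openchat_prompt(messages):
    """Assemble a single prompt string from chat messages using the
    OpenChat conversation template (GPT4 Correct variant)."""
    role_names = {"user": "GPT4 Correct User", "assistant": "GPT4 Correct Assistant"}
    parts = []
    for msg in messages:
        # Each turn is a role header, the content, and an end-of-turn token.
        parts.append(f"{role_names[msg['role']]}: {msg['content']}<|end_of_turn|>")
    # A trailing assistant header signals the model to generate its reply.
    parts.append("GPT4 Correct Assistant:")
    return "".join(parts)

prompt = build_openchat_prompt([{"role": "user", "content": "Hello"}])
print(prompt)
# GPT4 Correct User: Hello<|end_of_turn|>GPT4 Correct Assistant:
```

In practice, you can let the HuggingFace tokenizer's built-in chat template handle this for you, or send standard ChatCompletion-style messages to an OpenChat API server and let it apply the template server-side; the sketch above simply shows what the serialized prompt looks like.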

Conclusion

OpenChat-3.5-0106-Gemma represents a significant advancement in conversational AI. With its fine-tuning technique, it pushes the boundaries of what is possible in text generation tasks. Its impressive performance demonstrates the potential for even more sophisticated conversational AI models in the future.


Faizan Ali Naqvi

