Digital Product Studio

Sky-T1, An Open-Source Reasoning AI Model That Can be Trained Under $450 to be as Powerful as OpenAI o1-preview

With the introduction of the OpenAI o1 model, many AI reasoning models are being developed to rival its performance. Sky-T1-32B-Preview is one such open-source model offering competitive performance against OpenAI o1-preview and Alibaba’s QwQ-32B-Preview. Developed by the NovaSky team at UC Berkeley, Sky-T1 performs complex reasoning tasks at a remarkably low cost, under $450. The performance is on par with the OpenAI o1-preview model on both math and coding benchmarks. 

Training Details of Sky-T1-32B-Preview

1. Training Data

Sky-T1-32B-Preview was trained on a carefully curated dataset comprising 17,000 verified correct responses. This dataset includes data from the Qwen/QwQ-32B-Preview models, focusing on coding and math. This initial dataset was then refined; the team curated the data mixture to ensure a diverse representation across various reasoning domains.  By incorporating OpenAI’s GPT-4o-mini, they enhanced the formatting of this data, making it more suitable for training purposes. Additionally, the training data was supplemented with scientific questions derived from the Still-2 paper, broadening the model’s reasoning scope. 

2. Training Procedure

To ensure optimal performance, the NovaSky team employed a supervised fine-tuning approach. The training utilized a batch size of 96, enabling the model to learn from a diverse range of examples effectively. The training was conducted using Llama-Factory, a framework that supports efficient model training. The entire process took approximately 19 hours on a setup of 8 Nvidia H100 GPUs, leveraging DeepSpeed Zero-3 Offload technology to enhance computational efficiency. Notably, the entire operation was accomplished at a cost of less than $450, according to Lambda Cloud pricing. This contrasts the millions of dollars typically required for training models of comparable performance.

Competitive Performance Metrics

The NovaSky team evaluated the performance of Sky-T1-32B-Preview across several benchmarks. These include math and coding assessments that are critical for gauging the model’s reasoning capabilities. For instance, the model achieved an impressive score of 82.4% on the Math500 benchmark, surpassing the o1-preview.  Additionally, on the AIME2024 dataset, it scored 43.3%, showcasing its effective reasoning in mathematical contexts, again outperforming the o1-preview. In coding evaluations, Sky-T1 demonstrated its prowess with a score of 86.3% on the LiveCodeBench-Easy benchmark. 

Sky-T1-32B-Preview vs. OpenAI o1-preview

This score indicates the model’s ability to understand and generate functional code, a critical skill in software development. Overall, the evaluation results establish Sky-T1-32B-Preview as a competitive player capable of handling complex reasoning tasks with accuracy and efficiency.

The Open-Source Advantage

Sky-T1-32B-Preview is an open-source reasoning model. By making the model’s code, training data, and weights publicly available, the NovaSky team aims to foster collaboration within the academic and open-source communities. This allows others to replicate the model. Users can enhance their own projects using a state-of-the-art reasoning model. This enables a broader range of researchers and developers to engage in meaningful AI development without the prohibitive costs traditionally associated with such technologies. You can download the model through Hugging Face: Sky-T1-32B-Preview.

Sky-T1, An Open-Source Reasoning AI Model That Can be Trained Under $450 to be as Powerful as OpenAI o1-preview

Future of Sky-T1-32B-Preview

As the NovaSky team looks towards the future, they will further develop Sky-T1-32B-preview and enhance its capabilities. Ongoing research will focus on optimizing the model’s efficiency and exploring advanced techniques that could elevate its performance even further. The potential applications of this reasoning model extend beyond academic research; industries that rely on reasoning, coding, and problem-solving can benefit significantly from this innovative model. Overall, Sky-T1-32B-Preview represents a monumental achievement in the field of open-source AI. By demonstrating that training a powerful reasoning model is this affordable, the NovaSky team has set a new standard for accessibility and collaboration in AI research. 

| Latest From Us

SUBSCRIBE TO OUR NEWSLETTER

Stay updated with the latest news and exclusive offers!


* indicates required
Picture of Faizan Ali Naqvi
Faizan Ali Naqvi

Research is my hobby and I love to learn new skills. I make sure that every piece of content that you read on this blog is easy to understand and fact checked!

Leave a Reply

Your email address will not be published. Required fields are marked *


The reCAPTCHA verification period has expired. Please reload the page.

Don't Miss Out on AI Breakthroughs!

Advanced futuristic humanoid robot

*No spam, no sharing, no selling. Just AI updates.