Magic Takes Coding to New Heights with 100 Million Token Model

Magic is a San Francisco-based startup building AI assistants for software development. In a recent research update, Magic announced a major milestone: it has trained its first model with a context window of 100 million tokens. Let's take a closer look at what this means and how Magic achieved this frontier-level breakthrough.

Magic's First 100 Million Token Model: LTM-2-mini

Magic reveals that it has successfully trained its first 100 million token context model, LTM-2-mini. That context is equivalent to roughly 10 million lines of code or 750 novels' worth of text, which shows the potential of such a large window for software development tasks.
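To make the scale concrete, here is a rough back-of-envelope check. The per-line and per-novel token counts are my own assumed figures, not Magic's:

```python
# Back-of-envelope check of what a 100M-token context can hold.
# Assumed figures (not from Magic): ~10 tokens per line of code,
# ~130,000 tokens per average novel.
CONTEXT_TOKENS = 100_000_000
TOKENS_PER_LINE_OF_CODE = 10
TOKENS_PER_NOVEL = 130_000

lines_of_code = CONTEXT_TOKENS // TOKENS_PER_LINE_OF_CODE  # 10,000,000 lines
novels = CONTEXT_TOKENS // TOKENS_PER_NOVEL                # ~770 novels

print(f"~{lines_of_code:,} lines of code, or ~{novels} novels")
```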

Performance Evaluation of LTM-2-mini: HashHop

Existing evaluations for long-context models, such as "Needle in a Haystack", are imperfect because the out-of-place "needle" gives the model semantic hints about what to retrieve. Magic has created a new benchmark called "HashHop" that uses random hashes, removing semantic clues and truly testing a model's ability to store and recall information across the full context.

HashHop evaluates single-hop recall as well as "chain of thought" recall across multiple chained lookups. Magic shares preliminary results showing the model can chain inferences over hashes and complete simple code-generation tasks using an in-context framework.
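As an illustration only, here is a minimal sketch of how a HashHop-style prompt could be constructed. This is not Magic's benchmark code; the pair format, chain length, and helper name are assumptions:

```python
# Toy sketch of a HashHop-style prompt (not Magic's actual benchmark).
# Random hashes carry no semantic meaning, so recall can't rely on hints.
import random
import secrets

def make_hashhop_prompt(num_pairs=1000, hops=3):
    # One long chain of random hashes: h0 -> h1 -> h2 -> ...
    hashes = [secrets.token_hex(8) for _ in range(num_pairs + 1)]
    pairs = list(zip(hashes[:-1], hashes[1:]))
    random.shuffle(pairs)  # shuffle so line order gives no positional hint
    context = "\n".join(f"{a} = {b}" for a, b in pairs)
    question = (f"Starting from {hashes[0]}, follow {hops} assignments "
                f"and state the final hash.")
    return context + "\n\n" + question, hashes[hops]

prompt, expected_answer = make_hashhop_prompt()
```

A multi-hop variant like this forces the model to resolve several lookups in sequence, which is what the "chain of thought" recall setting measures.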

Efficiency Gains of Magic’s Architecture

Magic notes that, for each decoded token, the sequence-dimension algorithm used in LTM-2-mini is roughly 1,000 times cheaper than the attention mechanism in Llama 3.1 405B at a 100 million token context. Their approach also requires vastly less memory, allowing them to process this huge context on a small fraction of the hardware.
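To see why standard attention struggles at this scale, here is a rough, hedged estimate of the key-value cache a conventional transformer would need at 100 million tokens, using commonly cited Llama 3.1 405B configuration values (126 layers, 8 key-value heads, head dimension 128) and fp16 storage. These are illustrative figures, not Magic's own numbers:

```python
# Rough estimate of the KV-cache memory full attention would need at 100M tokens.
# Config values are commonly cited for Llama 3.1 405B; treat as illustrative.
CONTEXT = 100_000_000
N_LAYERS = 126
N_KV_HEADS = 8
HEAD_DIM = 128
BYTES_PER_VALUE = 2  # fp16

# Keys and values (factor of 2) for every layer, head, and context position.
kv_cache_bytes = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * CONTEXT * BYTES_PER_VALUE
kv_cache_tb = kv_cache_bytes / 1e12
h100s_needed = kv_cache_bytes / (80 * 1e9)  # 80 GB of HBM per H100

print(f"KV cache: ~{kv_cache_tb:.0f} TB, i.e. ~{h100s_needed:.0f} H100s just to hold it")
```

That works out to tens of terabytes of cache per user, which is the kind of cost a sequence-dimension approach is designed to avoid.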

Building Supercomputers with Google Cloud

To train their larger LTM models, Magic has partnered with Google Cloud to build two new supercomputers: Magic-G4, built on NVIDIA H100 GPUs, and Magic-G5, built on the new NVIDIA GB200 NVL72 systems. Google Cloud will provide the infrastructure and tools to train models at scale. Magic believes these supercomputers can enable the next breakthroughs in AI by training models at a scale not possible before.

Roadmap to Larger Models and Applications

While LTM-2-mini showed promising results on hashes, Magic acknowledges it is still limited for complex tasks like code generation. However, it proved their approach is viable. Their new supercomputers will enable training far larger models that can tackle nuanced software development problems.

Going Forward

Magic recently raised $320M in a round led by Eric Schmidt, Jane Street, and Sequoia Capital, bringing its total funding to date to $465M. The new funds will support scaling its AI infrastructure through the Google Cloud partnership and advancing safety-focused research.

The company is hiring across various engineering and research roles to expand their 23-person team. This includes supercomputing experts to manage their growing GPU clusters as well as key roles in security, distributed systems and more. The ultimate goal is to deploy AI assistants capable of massive contextual understanding to transform software development.

To get more technical details, please visit their blog.
