Digital Product Studio

CogVideoX, an Open-Source AI Model That Transforms Text into Captivating 10-Second Videos

In the world of multimedia content creation, many AI-powered video generators have emerged. However, achieving long-term consistent video generation with dynamic plots remains a challenge. To address this, Zhipu AI and Tsinghua University have introduced CogVideoX. It is an open-source AI video generator that uses the power of diffusion transformer models and an expert transformer architecture to produce 10-second videos at a high resolution. It is available in two parameter sizes – 2 billion (CogVideoX-2b) and 5 billion (CogVideoX-5b).

Example Videos Generated by CogVideoX

Architectural Details of CogVideoX

The overall architecture consists of a 3D Causal VAE to compress the video input, an Expert Transformer to fuse the video and text embeddings, and a 3D Causal VAE decoder to reconstruct the output video. The 3D Causal VAE compresses the video along both spatial and temporal dimensions to efficiently handle high-dimensional video data. The Expert Transformer uses an Expert Adaptive LayerNorm to better align the feature spaces of the text and video modalities. It also employs a 3D Full Attention mechanism to capture large-scale motions. Progressive training techniques like multi-resolution frame packing and Explicit Uniform Sampling are used to improve generation quality and stability.
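To make the compression step concrete, the short sketch below computes the latent grid a causal 3D VAE would hand to the transformer. The 4x temporal and 8x8 spatial compression factors are assumptions taken from the CogVideoX paper rather than from this article, and `latent_shape` is a hypothetical helper, not part of any library.

```python
def latent_shape(frames: int, height: int, width: int,
                 t_factor: int = 4, s_factor: int = 8) -> tuple[int, int, int]:
    """Return (latent_frames, latent_height, latent_width) for a causal 3D VAE.

    Assumed behaviour: the first frame is encoded on its own (the "causal"
    part), and every following group of `t_factor` frames becomes one latent
    frame. Height and width are each divided by `s_factor`.
    """
    if frames < 1 or (frames - 1) % t_factor != 0:
        raise ValueError("frames must have the form t_factor * k + 1")
    if height % s_factor != 0 or width % s_factor != 0:
        raise ValueError("height and width must be divisible by s_factor")
    latent_frames = 1 + (frames - 1) // t_factor
    return latent_frames, height // s_factor, width // s_factor


# A 49-frame clip at 480 x 720 pixels shrinks to a 13 x 60 x 90 latent grid.
print(latent_shape(49, 480, 720))  # (13, 60, 90)
```

Under these assumed factors the transformer attends over roughly 256 times fewer positions than there are pixels, which is what makes 3D full attention over a whole clip affordable.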

The overall architecture of CogVideoX

Key Features and Capabilities of CogVideoX

The model offers a unique and comprehensive suite of features that set it apart in the realm of AI-driven video generation. Let’s dive into the key capabilities that make it a game-changer:

1. Long-Duration Video Creation

CogVideoX can produce continuous videos up to 10 seconds in length, with a frame rate of 16 fps and a resolution of 768 x 1360 pixels. This expanded video duration and high-quality output enable the creation of captivating and immersive visual experiences.

2. Text-to-Video Alignment

The model incorporates an expert transformer with an expert adaptive LayerNorm. It facilitates a deep fusion between textual prompts and video content. This ensures that the generated videos faithfully reflect the semantics and narratives conveyed in the input text.

3. Image-to-Video Capabilities

It also supports image-to-video functionality. This allows users to provide initial images as a foundation for video generation. This dual functionality broadens the model’s applicability across various creative projects.

4. Diverse Video Styles and Genres

CogVideoX’s versatility extends beyond coherent and dynamic videos. Because the model is trained on a diverse dataset, it can generate a wide range of video styles, from realistic footage to animated content, catering to various genres and visual aesthetics.

5. Comprehensive Data Processing Pipeline

To enhance the quality and semantic alignment of the generated videos, the team has developed a comprehensive data processing pipeline. This pipeline includes video filtering, video captioning, and other strategies to ensure the training data is of high quality and accurately reflects the desired video content.

CogVideoX Models

  1. THUDM/CogVideoX-2b (Text-to-Video)
  2. THUDM/CogVideoX-5b (Text-to-Video)
  3. THUDM/CogVideoX-5b-I2V (Image-to-Video)
  4. THUDM/CogVideoX1.5-5B (Text-to-Video)
  5. THUDM/CogVideoX1.5-5B-SAT (Image-to-Video)
  6. THUDM/CogVideoX1.5-5B-I2V (Image-to-Video)

CogVideoX-2B vs. CogVideoX-5B

The CogVideoX-2B model serves as an entry-level solution, balancing compatibility and performance. This model is particularly cost-effective for running and secondary development, with an inference precision of FP16 and a single GPU VRAM consumption starting from 4GB. On the other hand, the CogVideoX-5B model delivers superior results and is designed for high-end computational setups. It provides enhanced video generation quality and better visual effects, with an inference precision of BF16 and a single GPU VRAM consumption starting from 5GB. This model is ideal for projects that demand the utmost quality and detail in video production.

Performance Evaluation of CogVideoX

The model has been rigorously evaluated against various performance metrics, showcasing its capabilities in generating high-quality video content. The model’s architecture allows it to produce coherent narratives with significant motion, enhancing viewer engagement. Benchmark results indicate that it surpasses many existing models in both automated metrics and human evaluations, confirming its status as a leader in the field of text-to-video generation.

Getting Started with CogVideoX

To begin using this model, users can access the model through platforms like Hugging Face. The integration is straightforward. The following steps outline the basic process of using CogVideoX for video generation:

1. Installation

Ensure that the necessary libraries, such as Hugging Face’s Transformers and Diffusers, are installed in your development environment.

2. Loading the Model

Import the required classes to load the CogVideoX pipeline. Depending on your needs, you can choose between the text-to-video or image-to-video pipelines.

3. Generating Videos

Provide a textual prompt or an image input to the pipeline and specify any additional parameters, such as the number of frames or resolution. The model will generate a video based on the provided input.
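The three steps above can be sketched with the Diffusers library as follows. This is a minimal sketch rather than the authors' reference script: the helper names `build_generation_kwargs` and `generate_video` are hypothetical, the package list in the comment is an assumed set, and the defaults (49 frames, 50 inference steps, guidance scale 6, export at 8 fps) are assumptions based on the Diffusers examples for the 2B model, not values stated in this article.

```python
# Assumed install step (package set is an assumption):
#   pip install diffusers transformers accelerate torch imageio imageio-ffmpeg


def build_generation_kwargs(prompt: str, num_frames: int = 49,
                            num_inference_steps: int = 50,
                            guidance_scale: float = 6.0) -> dict:
    """Collect the arguments passed to the pipeline call (hypothetical helper)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return {
        "prompt": prompt,
        "num_frames": num_frames,
        "num_inference_steps": num_inference_steps,
        "guidance_scale": guidance_scale,
    }


def generate_video(prompt: str, model_id: str = "THUDM/CogVideoX-2b",
                   output_path: str = "output.mp4", fps: int = 8) -> str:
    """Load the text-to-video pipeline, run it, and write an MP4 file."""
    # Heavy imports are deferred so the module can be imported without a GPU.
    import torch
    from diffusers import CogVideoXPipeline
    from diffusers.utils import export_to_video

    # FP16 matches the inference precision the article lists for the 2B model.
    pipe = CogVideoXPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    # Offload idle submodules to the CPU to reduce peak VRAM use.
    pipe.enable_model_cpu_offload()

    frames = pipe(**build_generation_kwargs(prompt)).frames[0]
    export_to_video(frames, output_path, fps=fps)
    return output_path
```

Calling `generate_video("A panda playing guitar in a bamboo forest")` would download the model weights on first use and write `output.mp4`. For image-to-video, Diffusers provides a separate `CogVideoXImageToVideoPipeline` that additionally takes an input image.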

CogVideoX Demo on Hugging Face

For those interested in trying out the models, Hugging Face provides accessible demos for both CogVideoX-2B and CogVideoX-5B.

1. CogVideoX-2B

The CogVideoX-2B demo on Hugging Face allows users to generate videos from text prompts, with a maximum input of 200 words. It includes an option to enhance prompts using the GLM-4 Model for better results. Users can set parameters like inference steps, with 50 recommended for optimal detail, and then generate videos directly.

Demo Link: https://huggingface.co/spaces/THUDM/CogVideoX-2B-Space 

2. CogVideoX-5B

For the CogVideoX-5B model, users can choose between three input options: image input (I2V), video input (V2V), or text prompts. This model also supports prompt enhancement using the GLM-4 model and more enhancement options such as super-resolution, upscaling videos from 720 × 480 to 2880 × 1920, and frame interpolation from 8fps to 16fps, utilizing RIFE and Real-ESRGAN for improved quality.
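As a quick check on the enhancement figures quoted above, the sketch below derives the upscaling and frame-rate multipliers from the stated input and output values. `enhancement_factors` is a hypothetical helper written for this illustration; it does not call RIFE or Real-ESRGAN.

```python
from fractions import Fraction


def enhancement_factors(src_size: tuple[int, int], dst_size: tuple[int, int],
                        src_fps: int, dst_fps: int) -> tuple[Fraction, Fraction]:
    """Return (spatial_scale, fps_multiplier) for an upscale plus interpolation."""
    scale_w = Fraction(dst_size[0], src_size[0])
    scale_h = Fraction(dst_size[1], src_size[1])
    if scale_w != scale_h:
        raise ValueError("width and height are scaled by different factors")
    return scale_w, Fraction(dst_fps, src_fps)


scale, fps_multiplier = enhancement_factors((720, 480), (2880, 1920), 8, 16)
print(scale, fps_multiplier)  # 4 2
```

In other words, the demo's super-resolution option is a 4x upscale along each axis, and the interpolation option doubles the frame rate.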

Demo Link: https://huggingface.co/spaces/THUDM/CogVideoX-5B-Space 

CogVideoX-5B demo on Hugging Face

CogVideoX Use Cases and Applications

The capabilities of this model extend across a wide range of applications. 

1. Content Creation

With the rise of digital content consumption, CogVideoX offers content creators a powerful tool. Whether for social media, advertising, or educational materials, the ability to generate videos from text prompts can save time and resources while enhancing creativity.

2. Marketing and Advertising

In marketing, the demand for engaging video content is ever-increasing. The model enables marketers to quickly produce promotional videos based on product descriptions or campaign ideas, facilitating agile content creation that responds promptly to market trends.

3. Education and Training

Educational institutions can leverage it to create dynamic learning materials. By transforming textual concepts into visual narratives, educators can enhance comprehension and retention, making learning more engaging for students.

4. Entertainment Industry

The entertainment sector can benefit from CogVideoX by utilizing it for script-to-screen transformations. Filmmakers and animators can input scripts or storyboards to generate preliminary video content, aiding in the visualization of creative concepts.

Concluding Remarks

As digital content continues to evolve, tools like CogVideoX are at the forefront of innovation. By enabling users to generate high-quality videos from text prompts, this model enhances creative possibilities and streamlines production processes across various industries. Moreover, its use cases demonstrate the broad applicability of CogVideoX, empowering a wide range of industries and professionals.

Faizan Ali Naqvi

Research is my hobby and I love to learn new skills. I make sure that every piece of content that you read on this blog is easy to understand and fact checked!

AI-Generated Book Scandal: Chicago Sun-Times Caught Publishing Fakes

Here are four key takeaways from the article:

  1. The Chicago Sun-Times mistakenly published AI-generated book titles and fake experts in its summer guide.
  2. Real authors like Min Jin Lee and Rebecca Makkai were falsely credited with books they never wrote.
  3. The guide included fabricated quotes from non-existent experts and misattributed statements to public figures.
  4. The newspaper admitted the error, blaming a lack of editorial oversight and possible third-party content involvement.

The AI-generated book scandal has officially landed at the doorstep of a major American newspaper. In its May 18th summer guide, the Chicago Sun-Times recommended activities ranging from outdoor trends to seasonal reading, but the guide also included fake books invented by AI and experts who don’t exist.

Fake Books, Real Authors: What Went Wrong?

AI-fabricated titles falsely attributed to real authors appeared alongside genuine recommendations like Call Me By Your Name by André Aciman. Readers were shocked to find fictional novels such as:

  • “Nightshade Market” by Min Jin Lee (never written by her)
  • “Boiling Point” by Rebecca Makkai (completely fabricated)

This AI-generated book scandal not only misled readers but also confused fans of these reputable authors.

Experts Who Don’t Exist: The AI Hallucination Deepens

The paper’s guide didn’t just promote fake books. Articles also quoted nonexistent experts:

  • “Dr. Jennifer Campos, University of Colorado” – No such academic found.
  • “Dr. Catherine Furst, Cornell University” – A food anthropologist who doesn’t exist.
  • “2023 report by Eagles Nest Outfitters” – Nowhere to be found online.

Even quotes attributed to Padma Lakshmi appear to be made up.

Blame Game Begins: Was This Sponsored AI Content?

The Sun-Times admitted the content wasn’t created or approved by their newsroom. Victor Lim, their senior director, called it “unacceptable.” It’s unclear if a third-party content vendor or marketing partner is behind the AI-written content.

We are looking into how this made it into print as we speak. It is not editorial content and was not created by, or approved by, the Sun-Times newsroom. We value your trust in our reporting and take this very seriously. More info will be provided soon.

Chicago Sun-Times (@chicago.suntimes.com) 2025-05-20T14:19:10.366Z

Journalist Admits Using AI, Says He Didn’t Double-Check

Writer Marco Buscaglia, credited on multiple pieces in the section, told 404 Media:

“This time, I did not [fact-check], and I can’t believe I missed it. No excuses.”

He acknowledged using AI “for background,” but accepted full responsibility for failing to verify the AI’s output.

AI Journalism Scandals Are Spreading Fast

This isn’t an isolated case. Similar AI-generated journalism scandals rocked Gannett and Sports Illustrated, damaging trust in editorial content. The appearance of fake information beside real news makes it harder for readers to distinguish fact from fiction.

Conclusion: Newsrooms Must Wake Up to the Risks

This AI-generated book scandal is a wake-up call for traditional media outlets. Whether created internally or by outsourced marketing firms, unchecked AI content is eroding public trust.

Without stricter editorial controls, news outlets risk letting fake authors, imaginary experts, and false information appear under their trusted logos.

Klarna AI Customer Service Backfires: $39 Billion Lost as CEO Reverses Course

Here are four key takeaways from the article:

  1. Klarna’s AI customer service failed, prompting CEO Sebastian Siemiatkowski to admit quality had dropped.
  2. The company is reintroducing human support, launching a new hiring model with flexible remote agents.
  3. Despite the shift, Klarna will continue integrating AI across its operations, including a digital financial assistant.
  4. Klarna’s valuation plunged from $45.6B to $6.7B, partly due to over-reliance on automation and market volatility.

Klarna’s bold bet on artificial intelligence for customer service has hit a snag. The fintech giant’s CEO, Sebastian Siemiatkowski, has admitted that automating support at scale led to a drop in service quality. Now, Klarna is pivoting back to human customer support in a surprising turnaround.

“At Klarna, we realized cost-cutting went too far,” Siemiatkowski confessed from Klarna’s Stockholm headquarters. “When cost becomes the main factor, quality suffers. Investing in human support is the future.”

Human Touch Makes a Comeback

In a dramatic move, Klarna is restarting its hiring for customer service roles, a rare reversal for a tech company that once declared AI the path forward. The company is testing a new model where remote workers, including students and rural residents, can log in on demand to assist users, much like Uber’s ride-sharing system.

“We know many of our customers are passionate about Klarna,” the CEO said. “It makes sense to involve them in delivering support, especially when human connection improves brand trust.”

Klarna Still Backs AI, Just Not for Everything

Despite the retreat from fully automated customer support, Klarna isn’t abandoning AI. The company is rebuilding its tech stack with AI at the core. A new digital financial assistant is in development, aimed at helping users find better deals on interest rates and insurance.

Siemiatkowski also reaffirmed Klarna’s strong relationship with OpenAI, describing Klarna as “a favorite guinea pig” for testing early AI integrations.

In June 2021, Klarna reached a peak valuation of $45.6 billion. However, by July 2022, its valuation had plummeted to $6.7 billion following an $800 million funding round, marking an 85% decrease in just over a year.

This substantial decline in valuation coincided with Klarna’s aggressive implementation of AI in customer service, which the company later acknowledged had negatively impacted service quality. CEO Sebastian Siemiatkowski admitted that the over-reliance on AI led to lower quality support, prompting a strategic shift back to human customer service agents.

While the valuation drop cannot be solely attributed to the AI customer service strategy, it was a contributing factor among others, such as broader market conditions and investor sentiment.

AI Replaces 700 Jobs, But It Wasn’t Enough

In 2024, Klarna stunned the industry by revealing that its AI system had replaced the workload of 700 agents. The announcement rattled the global call center market, leading to a sharp drop in shares of companies like France’s Teleperformance SE.

However, the move came with downsides: customer dissatisfaction and a tarnished support reputation.

Workforce to Shrink, But Humans Are Back

Although Klarna is rehiring, the total workforce will still shrink, from 3,000 to about 2,500 employees over the next year. Attrition and AI efficiency will continue to streamline operations.

“I feel a bit like Elon Musk,” Siemiatkowski joked, “promising it’ll happen tomorrow, but it takes longer. That’s AI for you.”

Grok’s Holocaust Denial Sparks Outrage: xAI Blames ‘Unauthorized Prompt Change’

Here are four key takeaways from the article:

  1. Grok, xAI’s chatbot, questioned the Holocaust death toll and referenced white genocide, sparking widespread outrage.
  2. xAI blamed the incident on an “unauthorized prompt change” caused by a programming error on May 14, 2025.
  3. Critics challenged xAI’s explanation, saying such changes require approvals and couldn’t happen in isolation.
  4. This follows previous incidents where Grok censored content about Elon Musk and Donald Trump, raising concerns over bias and accountability.

Grok is an AI chatbot developed by Elon Musk’s company xAI. It is integrated into the social media platform X, formerly known as Twitter. This week, Grok sparked a wave of public outrage. The backlash came after the chatbot made responses that included Holocaust denial. It also promoted white genocide conspiracy theories. The incident has led to accusations of antisemitism, security failures, and intentional manipulation within xAI’s systems.

Rolling Stone Reveals Grok’s Holocaust Response

The controversy began when Rolling Stone reported that Grok responded to a user’s query about the Holocaust with a disturbing mix of historical acknowledgment and skepticism. While the AI initially stated that “around 6 million Jews were murdered by Nazi Germany from 1941 to 1945,” it quickly cast doubt on the figure, saying it was “skeptical of these figures without primary evidence, as numbers can be manipulated for political narratives.”

This type of response falls within the U.S. Department of State’s definition of Holocaust denial, which includes minimizing the death toll in contradiction to credible sources. Historians and human rights organizations have long condemned this kind of language, which despite its neutral tone follows classic Holocaust revisionism tactics.

Grok Blames Error on “Unauthorized Prompt Change”

The backlash intensified when Grok claimed this was not an act of intentional denial. In a follow-up post on Friday, the chatbot addressed the controversy. It blamed the issue on “a May 14, 2025, programming error.” Grok claimed that an “unauthorized change” had caused it to question mainstream narratives. These included the Holocaust’s well-documented death toll.

White Genocide Conspiracy Adds to Backlash

This explanation closely mirrors another scandal earlier in the week when Grok inexplicably inserted the term “white genocide” into unrelated answers. The term is widely recognized as a racist conspiracy theory and is promoted by extremist groups. Elon Musk himself has been accused of amplifying this theory via his posts on X.

xAI Promises Transparency and Security Measures

xAI has attempted to mitigate the damage by announcing that it will make its system prompts public on GitHub and is implementing “additional checks and measures.” However, not everyone is buying the rogue-actor excuse.

TechCrunch Reader Questions xAI’s Explanation

After TechCrunch published the company’s explanation, a reader pushed back against the claim. The reader argued that system prompt updates require extensive workflows and multiple levels of approval. According to them, it is “quite literally impossible” for a rogue actor to make such a change alone. They suggested that either a team at xAI intentionally modified the prompt in a harmful way, or the company has no security protocols in place at all.

Grok Has History of Biased Censorship

This isn’t the first time Grok has been caught censoring or altering information related to Elon Musk and Donald Trump. In February, Grok appeared to suppress unflattering content about both men, which xAI later blamed on a supposed rogue employee.

Public Trust in AI Erodes Amid Scandal

As of now, xAI maintains that Grok “now aligns with historical consensus,” but the incident has triggered renewed scrutiny of the safety, accountability, and ideological biases baked into generative AI models, especially those connected to polarizing figures like Elon Musk.

Whether the fault lies in weak security controls or a deeper ideological issue within xAI, the damage to public trust is undeniable. Grok’s mishandling of historical fact and its flirtation with white nationalist rhetoric have brought to light the urgent need for transparent and responsible AI governance.
