What is Sora? The OpenAI model explained


You’re probably familiar with ChatGPT by now and, by extension, OpenAI, the company behind it. However, ChatGPT isn’t OpenAI’s only AI model. Here’s everything you need to know about Sora.

Sora is OpenAI’s text-to-video AI model, but what does that mean and how does it work? Keep reading to learn all about Sora, its features and how much it costs. 

What is Sora? 

Sora is a text-to-video AI generation model by OpenAI, the same company behind the popular chatbot, ChatGPT. 

The AI model is designed to create realistic video footage up to one minute long from a text prompt, with users able to specify both the subject matter and style of the video. 

According to OpenAI, Sora is capable of generating complex scenes with multiple characters and specific types of motion. The company says the model understands not only what the user has asked for in a prompt, but also how those things exist in the physical world.

What features are available?

OpenAI has released several features for Sora, including Remix, Re-cut, Storyboard, Loop, Blend and Style Presets. 

Remix makes it possible to replace, remove and re-imagine different elements within your video, such as replacing doors or transforming a library into a spaceship. Re-cut, on the other hand, invites you to isolate your favourite frames and extend them to further your scene. 

The Storyboard feature is for organising and editing a sequence of videos, while Loop allows you to create seamless repetitions. Blend is for combining two videos into one, while Style Presets let you apply a particular look, such as film noir or archival footage.

Sora Style Presets: Film Noir

How does Sora work? 

Like other image and video models, Sora is a diffusion model. This means that videos begin as static noise and gradually transform into the end result by removing noise in steps. 
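To make the idea of removing noise in steps more concrete, here is a minimal Python sketch. It is a toy illustration and not Sora's actual code: the function name `toy_denoise`, the step count and the tiny 4x4 "image" are all made up for this example. Most importantly, a real diffusion model uses a trained neural network to predict the noise, whereas this sketch cheats by comparing against a known target.

```python
import numpy as np

def toy_denoise(target, steps=50, seed=0):
    """Start from random noise and remove a little of it at each step.

    Assumption: the "predicted noise" is computed from a known target.
    A real diffusion model has no target to look at and relies on a
    trained network, guided by the prompt, to make this prediction.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(target.shape)  # pure static noise
    errors = [float(np.abs(x - target).mean())]
    for t in range(steps):
        predicted_noise = x - target  # stand-in for the learned model
        # Remove a fraction of the remaining noise; the fraction grows
        # so that the final step removes whatever noise is left.
        x = x - predicted_noise / (steps - t)
        errors.append(float(np.abs(x - target).mean()))
    return x, errors

# A tiny 4x4 "image" standing in for a video frame.
target = np.linspace(0.0, 1.0, 16).reshape(4, 4)
result, errors = toy_denoise(target)
print(f"mean error: start={errors[0]:.3f}, end={errors[-1]:.3f}")
```

Running it shows the gap between the noisy sample and the finished picture shrinking with every step, which is the same broad idea a diffusion model follows at a much larger scale.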

With Sora, prompts are submitted as text descriptions or by uploading an existing image or video. In the latter case, the user can ask Sora to animate the still or extend the video by creating new frames.

Like ChatGPT, Sora uses a transformer architecture, which helps the model scale. Images and videos are represented as collections of smaller data units called patches (comparable to tokens in a GPT model). By representing data in this way, diffusion transformers can be trained on a larger variety of visual data, including different video lengths, resolutions and aspect ratios.
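As a rough illustration of what splitting a video into patches can look like, the Python sketch below cuts a small video array into blocks that span a few frames and a few pixels each, then flattens every block into one row. This is an assumption-laden example rather than OpenAI's implementation: the function name `patchify`, the patch sizes and the video dimensions are invented for demonstration.

```python
import numpy as np

def patchify(video, pt, ph, pw):
    """Split a (frames, height, width, channels) video into flat patches.

    pt, ph and pw are the patch sizes along time, height and width.
    These values are hypothetical and chosen only for this example.
    """
    T, H, W, C = video.shape
    if T % pt or H % ph or W % pw:
        raise ValueError("video dimensions must be divisible by patch sizes")
    # Break each axis into (number of patches, patch size).
    x = video.reshape(T // pt, pt, H // ph, ph, W // pw, pw, C)
    # Group the patch-index axes first, then the contents of each patch.
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)
    # One row per patch, much like one token per word piece in a GPT model.
    return x.reshape(-1, pt * ph * pw * C)

# 8 frames of 16x16 RGB video, filled with counting numbers.
video = np.arange(8 * 16 * 16 * 3).reshape(8, 16, 16, 3)
patches = patchify(video, pt=2, ph=4, pw=4)
print(patches.shape)
```

Because each patch is just a row in a list, a longer clip or a different aspect ratio simply produces a longer or shorter list, which is one way to see why this representation copes with varied video lengths and resolutions.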

When was Sora released? 

Sora was announced in February 2024 and officially released to ChatGPT Plus and ChatGPT Pro users in December of that same year. 

Is Sora free? 

Sora is not free. Accessing the text-to-video model requires a ChatGPT Plus or ChatGPT Pro subscription.

ChatGPT Plus starts at $20/month and includes the ability to generate videos of up to 10 seconds at 720p resolution, along with up to two concurrent generations.

ChatGPT Pro, on the other hand, allows users to generate videos of up to 20 seconds at 1080p resolution. Pro users can run up to five concurrent generations, and each generation is faster too. Additionally, Pro users can download their videos without watermarks, though the plan is significantly more costly at $200/month.
