Tech Accelerator What is enterprise AI? A complete guide for businesses

Prev Next

Feature

OpenAI o1 explained: Everything you need to know

OpenAI's o1 models, launched in December 2024, enhance reasoning in AI and excel in complex tasks, such as generating and debugging code.

Sean Michael Kerner

By

Sean Michael Kerner

Published: 11 Dec 2024

OpenAI has emerged to be one of the primary leaders of the generative AI era. The company's ChatGPT is among the most popular and widely used instances of generative AI, powered by its GPT family of large language models, or LLMs. As of December 2024, the primary models used by ChatGPT are GPT-4o and GPT-3.5.

For multiple weeks in August and into September 2024, reports circulated about a new model from OpenAI -- codenamed "Strawberry." Initially, it was not clear whether Strawberry was the successor to GPT-4o or something else.

On Sept. 12, 2024, the suspense behind Strawberry lifted with the initial launch of OpenAI o1 models, including o1-preview and o1-mini. On Dec. 5, 2024, as part of its "12 Days of OpenAI" event, the company made the o1 model generally available, alongside the introduction of the o1 pro mode offering.

What is OpenAI o1?

OpenAI o1 is a family of LLMs from OpenAI that have been optimized with enhanced reasoning functionality.

The o1 models were initially intended to be preview models, designed to provide users -- as well as OpenAI -- with a different type of LLM experience than the GPT-4o model. As is the case with all OpenAI's LLMs, o1 is a transformer model. It can be used to summarize content, generate new content, answer questions and write application code.

This article is part of

What is enterprise AI? A complete guide for businesses

Which also includes:
How can AI drive revenue? Here are 10 approaches
8 jobs that AI can't replace and why
10 AI and machine learning trends to watch in 2026

As opposed to OpenAI's prior models, the o1 models are designed to reason better. That is, instead of just providing a response as quickly as possible and using the basic transformer approach of weights and understanding what word or words belong together, o1 "thinks" about what the right approach is to solve a problem. The process of reasoning about a given problem in response to a user query is intended to provide a potentially more accurate response to certain types of complex queries. Unlike previous models, the o1 series spends more time processing information before responding. The o1 models are targeted at tackling hard problems that require multistep reasoning and complex problem-solving strategies.

The basic strategy taken by OpenAI for reasoning is chain-of-thought prompting, where a model reasons step by step through a problem in an iterative approach. The development of o1 involved advanced training techniques, such as reinforcement learning.

The initial launch in September 2024 included two models:

OpenAI o1-preview -- excels at tackling sophisticated problems.
OpenAI o1-mini -- provides a smaller, more cost-efficient version of o1.

In December 2024, OpenAI graduated the o1-preview to become just o1 and introduced the o1 pro mode as part of the $200 ChatGPT Pro service tier.

The o1 model family

There are three models in the OpenAI o1 model family, and each model is designed to meet a specific target use case.

o1

The full o1 model is the graduated version of the original o1-preview release. According to OpenAI, the release version introduces significant improvements, including a 34% reduction in major errors on difficult problems. It also includes the ability to analyze and respond to uploaded images.

o1-mini

The o1-mini model is a small version of the primary o1 model, optimized for speed and efficiency while maintaining strong performance metrics. According to OpenAI, o1-mini does particularly well at coding tasks, making it a good choice for developers and programmers who need quick, reliable responses.

o1 pro mode

The o1 pro mode is the most powerful iteration of the OpenAI reasoning model family. This premium version uses additional computing power to improve performance across multiple challenging benchmarks. According to OpenAI, o1 pro mode had an 86% pass rate on American Invitational Mathematics Examination (AIME) 2024 math competitions, compared to 78% for standard o1.

Some queries can take more time than ChatGPT users have grown to expect. To help manage expectations, the o1 pro mode also provides a progress bar and a notification system for long-running queries to keep users updated.

But all that power comes at a cost. The o1 pro mode is exclusively available through OpenAI's high-end ChatGPT Pro subscription, which costs $200 per month.

What can OpenAI o1 do?

OpenAI o1 can perform many tasks like any of OpenAI's other GPT models -- such as answering questions, summarizing content and generating new content.

As an advanced reasoning model, o1 is particularly well-suited for certain tasks and use cases, including the following:

Enhanced reasoning. The o1 models are optimized for complex reasoning tasks, especially in STEM (science, technology, engineering and mathematics).
Brainstorming and ideation. The model's advanced reasoning abilities make it useful for generating creative ideas and solutions in various contexts.
Scientific research. The o1 models are ideal for different types of scientific research tasks. For example, o1 can annotate cell sequencing data and handle complex mathematical formulas needed in fields such as quantum optics.
Coding. The o1 models are effective at generating and debugging code, performing well in coding benchmarks such as HumanEval and Codeforces, according to OpenAI. The models are also effective in helping build and execute multi-step workflows for developers.
Mathematics. According to OpenAI, o1 excels in math-related benchmarks, outscoring the company's prior models. On the American Invitational Mathematics Examination (AIME) benchmark, o1 pro mode scored 86%, while standard o1 scored 78%. The model's math capabilities could potentially be used to help generate complex mathematical formulas for physicists.
Self-fact-checking. The o1 models can self-fact-check, improving the accuracy of its responses.
Image analysis capabilities. The o1 models provide advanced image analysis capabilities, letting users upload images and receive detailed responses. For example, users can upload photos of objects such as birdhouses and receive building instructions, or submit sketches for data center designs and receive detailed technical feedback.

How to use OpenAI o1

There are several ways users and organizations can use the o1 models.

ChatGPT Plus, Team Enterprise and Education users. The o1 and o1-mini models are available directly for users of ChatGPT Plus, Team, Enterprise and Education subscribers. Users can select the model manually in the model picker.
ChatGPT Pro users. The ChatGPT Pro tier at $200 a month is the initial exclusive home to the o1 pro model. ChatGPT Pro also includes a grant program providing free access to leading medical researchers, with initial grants awarded to researchers at institutions including Boston Children's Hospital, Berkeley Lab and The Jackson Laboratory.
API developers. Developers can access o1 and o1-mini through OpenAI's API.
Third-party services. Multiple third-party services have made the models available, including Microsoft Azure AI Studio and GitHub Models.

What are the limitations of OpenAI o1

As a new type of LLM, there are several limitations to the OpenAI o1 model, including the following:

Feature gaps. The o1 models lack web browsing, though it is a planned future capability.
API restrictions. At launch, there are a variety of restrictions on the API limiting the models. OpenAI has announced plans to expand o1's API functionality to include enhanced features such as function calling and structured outputs in future updates.
Response time. OpenAI users have come to expect rapid responses with little delay. But the o1 models are initially slower than previous models due to more thorough reasoning processes.
Cost. For API users OpenAI o1 is more expensive than previous models -- including GPT-4o.

How OpenAI o1 improves safety

As part of the o1 models release, OpenAI also publicly released a System Card, which is a document that describes the safety evaluations and risk assessments that were done during model development. It details how the models were evaluated using OpenAI's framework for assessing risks in areas such as cybersecurity, persuasion and model autonomy.

Chain-of-thought reasoning. The o1 models use large-scale reinforcement learning to perform complex reasoning before responding. This lets them refine the generation process and recognize mistakes. As a result, they can better follow specific guidelines and model policies, improving their ability to provide safe and appropriate content.
Advanced jailbreak resistance. The o1 models demonstrate significant improvements in resisting jailbreaks. On the Strong Reject benchmark, which tests resistance against common attacks from literature, o1 and o1-mini achieve better scores than GPT-4o.
Improved content policy adherence. On the Challenging Refusal Evaluation, which tests the model's ability to refuse unsafe content across categories such as harassment, hate speech and illicit activities, o1 achieves a not-unsafe score of 0.92, which is superior to GPT-4o's 0.713.
Enhanced bias mitigation. On the Bias Benchmark for QA evaluation, which tests for demographic fairness, o1 selects the correct answer 94% of the time on unambiguous questions, compared to GPT-4o's 72%. The models also show improved performance on evaluations measuring the use of race, gender and age in decision-making, with o1 generally outperforming GPT-4o.
Legible safety monitoring. The chain-of-thought summaries provided by o1 models offer a new approach for safety monitoring. In an analysis of 100,000 synthetic prompts, only 0.17% of o1's responses were flagged as deceptive, with most of these being forms of hallucination rather than intentional deception.

GPT-4o vs. OpenAI o1

The following chart provides a comparison of OpenAI's GPT-4o and o1 models, showing a number of differences across them.

Feature	GPT-4o	o1 models
Release date	May 13, 2024	Dec. 5, 2024
Model variants	Single model	Three variants: o1, o1-mini and o1 pro
Reasoning capabilities	Good performance	Enhanced reasoning, especially in STEM fields
Performance benchmarks	13% on Mathematics Olympiad	86% on Mathematics Olympiad, PhD-level accuracy in STEM
Multimodal capabilities	Handles text, images, audio and video	Handles text and images
Context window	128K tokens	128K tokens
Speed	Twice as fast as previous models	Slower due to more reasoning processes
Availability	Widely available across OpenAI products	Limited access for specific users
Features	Includes web browsing, file uploads	Lacks some features from GPT-4o, such as web browsing
Safety and alignment	Focused on safety measures	Improved safety measures, higher resistance to jailbreaking

Sean Michael Kerner is an IT consultant, technology enthusiast and tinkerer. He has pulled Token Ring, configured NetWare and been known to compile his own Linux kernel. He consults with industry and media organizations on technology issues.

Dig Deeper on Data analytics and AI

Search Networking

What is multi-access edge computing? Benefits and use cases
Multi-access edge computing (MEC) is a network architecture concept that brings cloud computing capabilities and IT services ...
What is 5G?
Fifth-generation wireless or 5G is a global standard and technology for wireless and telecommunications networks.
What is a small cell in wireless networks?
A small cell is a type of low-power cellular radio access point or base station that provides wireless service within a limited ...

Search Security

What is identity and access management? Guide to IAM
No longer just a good idea, IAM is a crucial piece of the cybersecurity puzzle. It's how an organization regulates access to ...
What is data masking?
Data masking is a security technique that modifies sensitive data in a data set so it can be used safely in a non-production ...
What is antivirus software?
Antivirus software (antivirus program) is a security program designed to prevent, detect, search and remove viruses and other ...

Search CIO

What is a chief data officer (CDO)?
A chief data officer (CDO) in many organizations is a C-level executive whose position has evolved into a range of strategic data...
What is user-generated content?
User-generated content (UGC) is published information that an unpaid contributor provides to a website.
What is business process outsourcing (BPO)?
Business process outsourcing (BPO) is a business practice in which an organization contracts with an external service provider to...

Search HRSoftware

What is compensation management?
Compensation management is the discipline and process for determining employees' appropriate pay, incentives, rewards, bonuses ...
What is HR technology (human resources tech)?
HR technology (human resources tech) refers to the hardware and software that support an organization's human resource management...
What is core HR (core human resources)?
Core HR (core human resources) is an umbrella term that refers to the essential, mandatory and fundamental tasks and functions of...

Search Customer Experience

What are virtual agents and how are they being used?
A virtual agent is an AI-powered software application or service that interacts with humans or other digital systems in a ...
Customer acquisition cost (CAC): How to calculate and reduce it
Customer acquisition cost (CAC) is the cost associated with convincing a consumer to buy your product or service, including ...
What is direct marketing?
Direct marketing is a type of advertising campaign that seeks to elicit an action (such as an order, a visit to a store or ...

Close