Chainguard Container for tiktoken

tiktoken is a fast BPE tokeniser for use with OpenAI's models.

Chainguard Containers are regularly-updated, secure-by-default container images.

Download this Container Image

For those with access, this container image is available on cgr.dev:

docker pull cgr.dev/ORGANIZATION/tiktoken:latest

Be sure to replace the ORGANIZATION placeholder with the name used for your organization's private repository within the Chainguard Registry.

Compatibility Notes

The tiktoken container provides the core functionality of OpenAI's tiktoken library with the following differences:

  • Unlike the upstream Python package installation, this container image provides tiktoken as a ready-to-use Python environment
  • The container includes the latest Python and the tiktoken library pre-installed, so no separate package installation is needed
  • It uses Chainguard's minimal base image for enhanced security and reduced attack surface

The container maintains full compatibility with the upstream tiktoken API and functionality.

Getting Started

The tiktoken container runs Python as its entrypoint, allowing you to use tiktoken directly in your Python applications. You can run interactive Python sessions or execute Python scripts that use tiktoken.

Basic Usage

Run an interactive Python session with tiktoken:

docker run -it cgr.dev/ORGANIZATION/tiktoken
Python 3.13.x (main, ...) 
>>> import tiktoken
>>> enc = tiktoken.get_encoding("cl100k_base")
>>> enc.encode("hello world")
[15339, 1917]
>>> enc.decode([15339, 1917])
'hello world'
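
To see where the token boundaries fall, each token ID can be mapped back to the bytes it stands for. The following is a minimal sketch rather than part of the upstream documentation: the helper name token_pieces is hypothetical, and it assumes an encoding object that exposes encode and decode_single_token_bytes, as tiktoken's Encoding class does.

```python
from typing import List, Tuple


def token_pieces(text: str, enc) -> List[Tuple[int, bytes]]:
    """Pair every token ID with the raw bytes it represents.

    `enc` is assumed to behave like a tiktoken Encoding, for example
    the object returned by tiktoken.get_encoding(...) in this container.
    """
    return [(tok, enc.decode_single_token_bytes(tok)) for tok in enc.encode(text)]
```

In the interactive session above, token_pieces("hello world", enc) would return one (ID, bytes) pair per token. Bytes are used instead of strings because a single token is not guaranteed to be valid UTF-8 on its own.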

Text Tokenization

You can tokenize text with the encoding associated with a specific model:

docker run cgr.dev/ORGANIZATION/tiktoken python3 -c "
import tiktoken

# Get the encoding for GPT-4 (cl100k_base)
enc = tiktoken.encoding_for_model('gpt-4')
text = 'tiktoken is great!'
tokens = enc.encode(text)
print(f'Text: {text}')
print(f'Tokens: {tokens}')
print(f'Token count: {len(tokens)}')
print(f'Decoded: {enc.decode(tokens)}')
"
Text: tiktoken is great!
Tokens: [83, 1609, 5963, 374, 2294, 0]
Token count: 6
Decoded: tiktoken is great!
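
A common follow-up to tokenizing is trimming text to fit a token budget. The sketch below is one way to do that under stated assumptions: truncate_to_tokens is a hypothetical helper name, and enc is assumed to be an encoding object with encode and decode methods, such as the one returned by tiktoken.encoding_for_model.

```python
def truncate_to_tokens(text: str, enc, max_tokens: int) -> str:
    """Return `text` cut down to at most `max_tokens` tokens."""
    if max_tokens < 0:
        raise ValueError("max_tokens must be non-negative")
    tokens = enc.encode(text)
    if len(tokens) <= max_tokens:
        return text  # already within budget, return unchanged
    # Keep the first max_tokens tokens and turn them back into text.
    return enc.decode(tokens[:max_tokens])
```

Cutting on a token boundary can split a multi-byte character. tiktoken's decode replaces invalid bytes by default, so the tail of a truncated string may end in a replacement character.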

Token Counting for Cost Estimation

Use tiktoken to count tokens before making OpenAI API calls:

docker run cgr.dev/ORGANIZATION/tiktoken python3 -c "
import tiktoken

def count_tokens(text, model='gpt-4o'):
    enc = tiktoken.encoding_for_model(model)
    return len(enc.encode(text))

text = 'This is a sample text for token counting.'
tokens = count_tokens(text)
print(f'Text: {text}')
print(f'Token count: {tokens}')
"
Text: This is a sample text for token counting.
Token count: 9
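
Turning a token count into a cost figure is simple arithmetic once a price is known. The sketch below assumes a hypothetical helper named estimate_input_cost and a caller-supplied price per million tokens; no real price is built in, so take the value from current OpenAI pricing. As before, enc is assumed to be a tiktoken encoding object.

```python
from typing import Tuple


def estimate_input_cost(
    text: str, enc, usd_per_million_tokens: float
) -> Tuple[int, float]:
    """Return (token_count, estimated_cost_in_usd) for `text`.

    `usd_per_million_tokens` is a placeholder the caller must supply;
    this sketch does not know any model's actual pricing.
    """
    token_count = len(enc.encode(text))
    cost = token_count * usd_per_million_tokens / 1_000_000
    return token_count, cost
```

Inside the container this could be called with enc = tiktoken.encoding_for_model('gpt-4o'). The figure covers only the raw text passed in: chat-style requests add formatting tokens per message, so treat it as a lower bound rather than an exact bill.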

Documentation and Resources

  • OpenAI tiktoken GitHub Repository - Official source code and documentation
  • OpenAI Cookbook: How to count tokens with tiktoken - Comprehensive examples and use cases
  • OpenAI Tokenizer - Interactive web tool for testing tokenization
  • OpenAI API Documentation - Official API documentation
  • Chainguard Academy: Python - Working with Python in Chainguard containers

What are Chainguard Containers?

Chainguard Containers are minimal container images that are secure by default.

In many cases, the Chainguard Containers tagged as :latest contain only an open-source application and its runtime dependencies. These minimal container images typically do not contain a shell or package manager. Chainguard Containers are built with Wolfi, our Linux undistro designed to produce container images that meet the requirements of a more secure software supply chain.

For cases where you need container images with shells and package managers to build or debug, most Chainguard Containers come paired with a -dev variant.

Although the -dev variants offer security features similar to those of their minimal counterparts, they include additional software that is typically unnecessary in production environments. We recommend using multi-stage builds with the -dev variants: build there, then copy the application artifacts into a final minimal container, which offers a reduced attack surface and does not allow package installations or logins.

Learn More

To better understand how to work with Chainguard Containers, please visit Chainguard Academy and Chainguard Courses.

In addition to Containers, Chainguard offers VMs and Libraries. Contact Chainguard to access additional products.

Trademarks

This software listing is packaged by Chainguard. The trademarks set forth in this offering are owned by their respective companies, and use of them does not imply any affiliation, sponsorship, or endorsement by such companies.

Licenses

Chainguard container images contain software packages that are direct or transitive dependencies. The following licenses were found in the "latest" tag of this image:

  • Apache-2.0
  • BSD-1-Clause
  • BSD-2-Clause
  • BSD-3-Clause
  • BSD-4-Clause-UC
  • CC-PDDC
  • GPL-1.0-only

For a complete list of licenses, please refer to this image's SBOM.

Category: application
