Chainguard Container for tiktoken

tiktoken is a fast BPE (byte pair encoding) tokeniser for use with OpenAI's models.

Chainguard Containers are regularly-updated, secure-by-default container images.

Download this Container Image

For those with access, this container image is available on cgr.dev:

docker pull cgr.dev/ORGANIZATION/tiktoken:latest

Be sure to replace the ORGANIZATION placeholder with the name used for your organization's private repository within the Chainguard Registry.

Compatibility Notes

The tiktoken container provides the core functionality of OpenAI's tiktoken library with the following differences:

  • Unlike the upstream Python package installation, this container image provides tiktoken as a ready-to-use Python environment
  • The container includes the latest Python and the tiktoken library pre-installed, so no separate package installation step is needed
  • It uses Chainguard's minimal base image for enhanced security and reduced attack surface

The container maintains full compatibility with the upstream tiktoken API and functionality.

Getting Started

The tiktoken container runs Python as its entrypoint, allowing you to use tiktoken directly in your Python applications. You can run interactive Python sessions or execute Python scripts that use tiktoken.

Basic Usage

Run an interactive Python session with tiktoken:

docker run -it cgr.dev/ORGANIZATION/tiktoken
Python 3.13.x (main, ...) 
>>> import tiktoken
>>> enc = tiktoken.get_encoding("o200k_base")
>>> enc.encode("hello world")
[24912, 2375]
>>> enc.decode([24912, 2375])
'hello world'
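
To see what each token ID in a session like the one above stands for, the tokens can be mapped back to their raw bytes. The following is a minimal sketch, not part of the image: the helper names `token_pieces` and `round_trips` and the `Encoder` protocol are hypothetical, and the encoder is passed in as an argument so the helpers work with any object that offers `encode` and `decode_single_token_bytes`, both of which are methods on tiktoken's `Encoding` objects.

```python
from typing import List, Protocol


class Encoder(Protocol):
    """The two tiktoken.Encoding methods this sketch relies on."""

    def encode(self, text: str) -> List[int]: ...

    def decode_single_token_bytes(self, token: int) -> bytes: ...


def token_pieces(enc: Encoder, text: str) -> List[bytes]:
    """Return the raw bytes behind each token of `text`, in order."""
    return [enc.decode_single_token_bytes(token) for token in enc.encode(text)]


def round_trips(enc: Encoder, text: str) -> bool:
    """Check that the token pieces concatenate back to the UTF-8 input."""
    return b"".join(token_pieces(enc, text)) == text.encode("utf-8")
```

Inside the container, these helpers could be called as `token_pieces(tiktoken.get_encoding("o200k_base"), "hello world")` after `import tiktoken`. Pieces are returned as bytes rather than strings because a single token may cover only part of a multi-byte UTF-8 character.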

Text Tokenization

You can tokenize text using the encoding associated with a specific model:

docker run cgr.dev/ORGANIZATION/tiktoken python3 -c "
import tiktoken

# Get encoding for GPT-4o
enc = tiktoken.encoding_for_model('gpt-4o')
text = 'tiktoken is great!'
tokens = enc.encode(text)
print(f'Text: {text}')
print(f'Tokens: {tokens}')
print(f'Token count: {len(tokens)}')
print(f'Decoded: {enc.decode(tokens)}')
"
Text: tiktoken is great!
Tokens: [83, 8251, 2488, 382, 2212, 0]
Token count: 6
Decoded: tiktoken is great!
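
`tiktoken.encoding_for_model` raises a `KeyError` when it cannot map a model name to an encoding, which can happen with newly released or custom model names. One possible way to handle this is to fall back to a default encoding. The sketch below is an illustration under stated assumptions: the helper name `encoding_with_fallback` is hypothetical, the choice of `o200k_base` as the default is an assumption rather than a recommendation from the source, and the two tiktoken functions are passed in as arguments so the helper stays independent of how tiktoken is installed.

```python
from typing import Any, Callable


def encoding_with_fallback(
    model: str,
    for_model: Callable[[str], Any],
    get_encoding: Callable[[str], Any],
    default: str = "o200k_base",  # assumed default, adjust to your models
) -> Any:
    """Return the encoding for `model`, or the `default` encoding if unknown."""
    try:
        return for_model(model)
    except KeyError:
        # encoding_for_model signals an unmapped model name with KeyError.
        return get_encoding(default)
```

With tiktoken imported, a call might look like `encoding_with_fallback("my-model", tiktoken.encoding_for_model, tiktoken.get_encoding)`. Token counts obtained through a fallback encoding are only an approximation if the model actually uses a different tokenizer.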

Token Counting for Cost Estimation

Use tiktoken to count tokens before making OpenAI API calls:

docker run cgr.dev/ORGANIZATION/tiktoken python3 -c "
import tiktoken

def count_tokens(text, model='gpt-4o'):
    enc = tiktoken.encoding_for_model(model)
    return len(enc.encode(text))

text = 'This is a sample text for token counting.'
tokens = count_tokens(text)
print(f'Text: {text}')
print(f'Token count: {tokens}')
"
Text: This is a sample text for token counting.
Token count: 9
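
A token count can then be turned into a rough cost figure. The sketch below assumes a flat price per million input tokens. The helper name `estimate_input_cost` and the price constant are hypothetical placeholders for illustration, not real pricing, so substitute the current rate for your model.

```python
from decimal import Decimal

# Hypothetical price in USD per one million input tokens, for illustration only.
EXAMPLE_PRICE_PER_MILLION = Decimal("2.50")


def estimate_input_cost(
    token_count: int,
    price_per_million: Decimal = EXAMPLE_PRICE_PER_MILLION,
) -> Decimal:
    """Estimate the input cost in USD for `token_count` tokens."""
    if token_count < 0:
        raise ValueError("token_count must be non-negative")
    # Decimal avoids binary floating-point rounding on very small amounts.
    return Decimal(token_count) * price_per_million / Decimal(1_000_000)
```

For example, `estimate_input_cost(count_tokens(text))` would combine this with the `count_tokens` function shown above. The estimate covers only the text you pass in; actual API requests may include additional tokens for message formatting, and output tokens are billed separately.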

Documentation and Resources

  • OpenAI tiktoken GitHub Repository - Official source code and documentation
  • OpenAI Cookbook: How to count tokens with tiktoken - Comprehensive examples and use cases
  • OpenAI Tokenizer - Interactive web tool for testing tokenization
  • OpenAI API Documentation - Official API documentation
  • Chainguard Academy: Python - Working with Python in Chainguard containers

What are Chainguard Containers?

Chainguard's free tier of Starter container images is built with Wolfi, our minimal Linux undistro.

All other Chainguard Containers are built with Chainguard OS, Chainguard's minimal Linux operating system designed to produce container images that meet the requirements of a more secure software supply chain.

The main features of Chainguard Containers include:

For cases where you need container images with shells and package managers to build or debug, most Chainguard Containers come paired with a development, or -dev, variant.

In all other cases, including Chainguard Containers tagged as :latest or with a specific version number, the container images include only an open-source application and its runtime dependencies. These minimal container images typically do not contain a shell or package manager.

Although the -dev container image variants have security features similar to those of their more minimal counterparts, they include additional software that is typically not necessary in production environments. We recommend using multi-stage builds to copy artifacts from the -dev variant into a more minimal production image.

Need additional packages?

To improve security, Chainguard Containers include only essential dependencies. Chainguard customers who need more packages can use Custom Assembly to add them through the Console, chainctl, or the API.

To use Custom Assembly in the Chainguard Console: navigate to the image you'd like to customize in your Organization's list of images, and click on the Customize image button at the top of the page.

Learn More

Refer to our Chainguard Containers documentation on Chainguard Academy. Chainguard also offers VMs and Libraries; contact us for access.

Trademarks

This software listing is packaged by Chainguard. The trademarks set forth in this offering are owned by their respective companies, and use of them does not imply any affiliation, sponsorship, or endorsement by such companies.

Licenses

Chainguard container images contain software packages that are direct or transitive dependencies. The following licenses were found in the "latest" tag of this image:

  • Apache-2.0
  • BSD-1-Clause
  • BSD-2-Clause
  • BSD-3-Clause
  • BSD-4-Clause-UC
  • CC-PDDC
  • GCC-exception-3.1

For a complete list of licenses, please refer to this Image's SBOM.

Software license agreement

Category
application

© 2025 Chainguard. All Rights Reserved.