Contact our team to test out this image for free. Please also indicate any other images you would like to evaluate.
Apache Beam is a unified programming model for Batch and Streaming data processing.
Chainguard Containers are regularly-updated, secure-by-default container images.
For those with access, this container image is available on cgr.dev:
Be sure to replace the ORGANIZATION placeholder with the name used for your organization's private repository within the Chainguard Registry.
Get started with the Beam Python SDK quickstart to set up your Python development environment, get the Beam SDK for Python, and run an example pipeline.
To get started quickly, this section walks through an example that uses the DirectRunner. The Apache Beam examples directory contains many more examples, all of which can be run locally by passing the required arguments described in each example script.
For example, run wordcount.py with the following command:
Here, ${OUTPUT_DIR} is any directory where you want the output to be written. The output file is named in the format ${OUTPUT_DIR}/part-00000-of-00001. You can inspect the output to see each word and how many times it occurred in the input file.
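As a rough illustration of inspecting that output programmatically, the sketch below merges every shard in the output directory into a single dictionary. The helper name read_word_counts is hypothetical, and the parser assumes each output line has the form "word: count", which is how the upstream wordcount.py example formats its results; adjust the parsing if your pipeline writes a different format.

```python
import glob
import os
import tempfile


def read_word_counts(output_dir):
    """Merge all 'part-*-of-*' shards in output_dir into one {word: count} dict."""
    counts = {}
    for shard in sorted(glob.glob(os.path.join(output_dir, "part-*-of-*"))):
        with open(shard, encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if not line:
                    continue
                # Assumed line format: "<word>: <count>"
                word, sep, count = line.rpartition(": ")
                if not sep:
                    continue  # skip lines that do not match the assumed format
                counts[word] = counts.get(word, 0) + int(count)
    return counts


if __name__ == "__main__":
    # Demo with a fabricated shard; point this at your real ${OUTPUT_DIR} instead.
    with tempfile.TemporaryDirectory() as demo_dir:
        shard_path = os.path.join(demo_dir, "part-00000-of-00001")
        with open(shard_path, "w", encoding="utf-8") as out:
            out.write("king: 3\nlear: 2\n")
        print(read_word_counts(demo_dir))
```

Summing counts across shards means the helper also works if a run produces more than one output file.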
We have another example of using PortableRunner documented in our TESTING.md.
Chainguard's free tier of Starter container images is built with Wolfi, our minimal Linux undistro.
All other Chainguard Containers are built with Chainguard OS, Chainguard's minimal Linux operating system designed to produce container images that meet the requirements of a more secure software supply chain.
The main features of Chainguard Containers include:
For cases where you need container images with shells and package managers to build or debug, most Chainguard Containers come paired with a development, or -dev, variant.
In all other cases, including Chainguard Containers tagged as :latest or with a specific version number, the container images include only an open-source application and its runtime dependencies. These minimal container images typically do not contain a shell or package manager.
Although the -dev container image variants have security features similar to those of their more minimal counterparts, they include additional software that is typically not necessary in production environments. We recommend using multi-stage builds to copy artifacts from the -dev variant into a more minimal production image.
To improve security, Chainguard Containers include only essential dependencies. Need more packages? Chainguard customers can use Custom Assembly to add packages through the Console, chainctl, or the API.
To use Custom Assembly in the Chainguard Console: navigate to the image you'd like to customize in your Organization's list of images, and click on the Customize image button at the top of the page.
Refer to our Chainguard Containers documentation on Chainguard Academy. Chainguard also offers VMs and Libraries — contact us for access.
This software listing is packaged by Chainguard. The trademarks set forth in this offering are owned by their respective companies, and use of them does not imply any affiliation, sponsorship, or endorsement by such companies.
Chainguard container images contain software packages that are direct or transitive dependencies. The following licenses were found in the "latest" tag of this image:
Apache-2.0
BSD-1-Clause
BSD-2-Clause
BSD-3-Clause
BSD-4-Clause-UC
Bitstream-Vera
CC-PDDC
For a complete list of licenses, please refer to this Image's SBOM.