Running Cron Jobs in Container Environments

By Darren Broemmer on 15 MAR 2021 4:11 PM

For decades, engineers have been using the system time-based job scheduler cron to manage processes that run on a periodic basis. As applications move to container-based infrastructures, teams are faced with the decision of how to implement scheduled jobs in a Kubernetes environment. Because containers provide a virtualization of the underlying operating system, the use of system services requires additional consideration. Fortunately, there are some very straightforward solutions for this common scenario.

What is Cron?

Cron automates running shell processes using definitions supplied in a crontab, or cron table. The format for this is shown below. You can use a single crontab, or multiple files typically located in the /etc/cron.d directory.

# ┌───────────── minute (0 - 59)
# │ ┌───────────── hour (0 - 23)
# │ │ ┌───────────── day of the month (1 - 31)
# │ │ │ ┌───────────── month (1 - 12)
# │ │ │ │ ┌───────────── day of the week (0 - 6) (Sunday to Saturday;
# │ │ │ │ │                                 7 is also Sunday on some systems)
# * * * * * <command to execute>

This file defines the jobs to be run and their schedule. An asterisk denotes every increment of that unit of time, so in the example above, the command specified would run every minute of every hour of every day, and so on. Specifying numbers in these slots denotes specific instances of the given time unit when the process should be run.
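To make the matching rule concrete, here is a minimal Python sketch of how a five-field schedule could be checked against a point in time. The function names are made up for this illustration and are not part of cron. The sketch assumes only asterisks, single numbers, and comma-separated lists; ranges, step values such as */5, and the "7 is also Sunday" alias are left out.

```python
from datetime import datetime


def field_matches(field: str, value: int) -> bool:
    """Return True if a single crontab field matches the given value.

    Only "*", single numbers, and comma-separated lists are supported here.
    """
    if field == "*":
        return True
    return value in {int(part) for part in field.split(",")}


def schedule_matches(expression: str, moment: datetime) -> bool:
    """Return True if the five-field expression matches the given moment."""
    minute, hour, day, month, weekday = expression.split()

    # crontab counts Sunday as 0, while datetime.weekday() counts Monday as 0.
    cron_weekday = (moment.weekday() + 1) % 7

    day_of_month_ok = field_matches(day, moment.day)
    day_of_week_ok = field_matches(weekday, cron_weekday)
    if day != "*" and weekday != "*":
        # When both day fields are restricted, cron runs if either one matches.
        day_ok = day_of_month_ok or day_of_week_ok
    else:
        day_ok = day_of_month_ok and day_of_week_ok

    return (
        field_matches(minute, moment.minute)
        and field_matches(hour, moment.hour)
        and field_matches(month, moment.month)
        and day_ok
    )
```

With this sketch, schedule_matches("* * * * *", ...) is true for any moment, while schedule_matches("30 2 * * *", ...) is true only at 02:30 on any day.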

We will cover three options in this article to implement cron in a container-based environment.

  1. Install and configure cron in your Docker image
  2. Use the Kubernetes CronJob mechanism
  3. Use a software process implementation such as Cronenberg

Run Cron Using Docker

The first approach uses Docker to set up cron on the container image. When looking at the individual container, this will look and run almost the same as it would on a bare metal server or virtual machine. However, there are a few important differences to consider.

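First, let’s examine how to set this up. The Dockerfile is used to put the crontab files in place and run the system service. As an example, consider the crontab file shown below, which is located in the root directory of your project. Because it will be installed under /etc/cron.d, each entry names the user to run as (root here) before the command.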

* * * * * root echo "Hello world" >> /var/log/cron.log 2>&1
# Cron requires newline characters at the end of each entry, so leave this here

The following commands are used in the Dockerfile to configure and run the cron service.

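# Install the cron daemon (a Debian or Ubuntu base image is assumed)
RUN apt-get update && apt-get -y install cron

# Files in /etc/cron.d are read by cron automatically and use the system
# format with a user field, so they should not also be loaded with `crontab`
COPY crontab /etc/cron.d/my-cron-file
RUN chmod 0644 /etc/cron.d/my-cron-file
RUN touch /var/log/cron.log

# Start cron in the background and keep the container alive by tailing the log
CMD cron && tail -f /var/log/cron.log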

At this point, you can modify the crontab file in your repository and redeploy to make changes. Containers have a more ephemeral life cycle than their virtual machine counterparts, so keep this in mind when using this option. Your cron jobs may be interrupted more often by a container instance starting or stopping. You may also have multiple container instances running if the resource utilization is high enough, so make sure your jobs are prepared to deal with contention and idempotence.

For these reasons and others, this approach may not be the right fit for your architecture. Container orchestration platforms offer their own mechanisms to accomplish the same goal.
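One way to cope with several container instances firing the same job is to let each instance try to claim the current time slot and skip the work if another instance got there first. The Python sketch below illustrates the idea with an atomically created marker file. It assumes all instances can see the same directory (for example a shared volume) and that the file system honors exclusive creation. The names claim_slot, run_once, and lock_dir are hypothetical, and old markers are never cleaned up here.

```python
import os
from datetime import datetime
from typing import Callable


def claim_slot(lock_dir: str, job_name: str, moment: datetime) -> bool:
    """Try to claim the one-minute slot containing `moment` for `job_name`.

    Returns True for the first caller and False for later callers, because
    O_CREAT | O_EXCL makes creation fail when the marker file already exists.
    """
    slot = moment.strftime("%Y%m%dT%H%M")
    marker = os.path.join(lock_dir, f"{job_name}-{slot}.claimed")
    try:
        fd = os.open(marker, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def run_once(lock_dir: str, job_name: str, moment: datetime,
             job: Callable[[], None]) -> bool:
    """Run `job` only if this caller is the first to claim the slot."""
    if not claim_slot(lock_dir, job_name, moment):
        return False
    job()
    return True
```

A job invoked by cron would call run_once with the current time, so a second instance started in the same minute returns without doing the work. A database row or a distributed lock could play the same role where a shared directory is not available.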

Use the Kubernetes CronJob Mechanism

Kubernetes itself offers a CronJob mechanism. Similar to other Kubernetes features, the CronJob is created in a manifest, an example of which is shown below.

# batch/v1 is available from Kubernetes 1.21; older clusters use batch/v1beta1
apiVersion: batch/v1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: hello
              image: busybox
              imagePullPolicy: IfNotPresent
              command:
                - /bin/sh
                - -c
                - date; echo Hello from the Kubernetes cluster
          # restartPolicy belongs to the pod spec, not to the container
          restartPolicy: OnFailure

The format itself is a bit cumbersome, but will look familiar to those dealing with Kubernetes manifest files. Note that you specify the container image on which the scheduled jobs run.

You do not have the same strong guarantee as the native operating system service. The Kubernetes documentation states:

“A cron job creates a job object about once per execution time of its schedule. We say “about” because there are certain circumstances where two jobs might be created, or no job might be created. We attempt to make these rare, but do not completely prevent them. Therefore, jobs should be idempotent.”

Once again, we see a slight difference and additional considerations in terms of designing and running these scheduled processes in a container environment.

Use a Software Process Solution such as Cronenberg

Cronenberg provides a software implementation of cron that complements your Twelve-Factor application. It runs as another process in your container, not as a system service. It is intended to be simple and portable, so it avoids the use of hard-coded locations for the equivalent of crontab files. Instead, it takes an argument with the location of the configuration file.

To run Cronenberg, use the following command in your Dockerfile, or in your Procfile on common platform-as-a-service (PaaS) offerings.

cronenberg ./config/cron-jobs.yml

The cron-jobs.yml file format is shown in the example below. It is much simpler than the Kubernetes manifest and a bit closer to the crontab format we are used to dealing with.

# This is just a normal job that runs every minute
- name: hello-world
  command: echo "Hello World"
  when: "* * * * *"

Additional Design Considerations

In each of these scenarios, your scheduled jobs run on container instances specifically deployed for this purpose. This leads to a few decision points.

  • Consider whether you should auto-scale these containers, or set their scale specifically to one. If your jobs are not idempotent or you need to avoid contention, set the scale to one. If you don’t have these issues and can leverage the scalability, then by all means use those containers to their potential.
  • In a strict Docker implementation, the Dockerfile will determine what user your cron daemon runs as, so be sure to consider permissions and related setup in your Dockerfile. Cron will also fail silently unless you stream the output to a log file, so at a minimum you will want to set up this logging during development.
  • Many scheduled processes take up very minimal amounts of resources, so consider packing related jobs on specific containers so that you can most efficiently use your infrastructure resources.
  • Cron jobs run operating system commands that have their own environment and configuration. If you want to leverage application components and services in your existing projects, consider using an asynchronous job framework such as Sidekiq for Ruby-on-Rails apps.
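The silent-failure point in the list above can be addressed with a small wrapper that runs the scheduled command and records its output and exit status. The following Python sketch is one possible shape for such a wrapper. The name run_logged and the logger name are hypothetical, and logging to standard output is an assumption that suits containers whose logs are collected from the process streams.

```python
import logging
import subprocess
import sys
from typing import Sequence


def run_logged(command: Sequence[str], logger: logging.Logger) -> int:
    """Run `command`, log its output line by line, and return its exit status."""
    result = subprocess.run(command, capture_output=True, text=True)
    for line in result.stdout.splitlines():
        logger.info("stdout: %s", line)
    for line in result.stderr.splitlines():
        logger.warning("stderr: %s", line)
    if result.returncode != 0:
        # Make failures visible instead of letting them pass silently.
        logger.error("%s exited with status %d", list(command), result.returncode)
    return result.returncode


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("usage: run_logged.py COMMAND [ARGS...]", file=sys.stderr)
        sys.exit(2)
    logging.basicConfig(stream=sys.stdout, level=logging.INFO)
    sys.exit(run_logged(sys.argv[1:], logging.getLogger("cron-job")))
```

Saved under a name of your choosing, the script would be placed in front of the real command in the schedule entry, so that a non-zero exit status always leaves a trace in the container logs.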
