What is Cron?
Cron automates running shell processes using definitions supplied in a crontab, or cron table. The format for this is shown below. You can use a single crontab, or multiple files typically located in the /etc/cron.d directory.

    # ┌───────────── minute (0 - 59)
    # │ ┌───────────── hour (0 - 23)
    # │ │ ┌───────────── day of the month (1 - 31)
    # │ │ │ ┌───────────── month (1 - 12)
    # │ │ │ │ ┌───────────── day of the week (0 - 6) (Sunday to Saturday;
    # │ │ │ │ │                                   7 is also Sunday on some systems)
    # * * * * * <command to execute>

This file defines the jobs to be run and their schedule. An asterisk matches every value of that unit of time, so in the example above, the command would run every minute of every hour of every day, and so on. Putting numbers in these slots restricts the job to those specific values of the given time unit.
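To make the field semantics concrete, here is a minimal Python sketch that checks whether a five-field schedule fires at a given minute. It is an illustration, not how any cron daemon is implemented: the names `expand_field` and `cron_matches` are invented for this example, and it assumes only numeric fields built from `*`, single values, comma lists, `a-b` ranges, and `/n` steps (no month or weekday names, no `@daily`-style shortcuts).

```python
from datetime import datetime

# Allowed (min, max) for each crontab field, in order:
# minute, hour, day of month, month, day of week.
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 7)]


def expand_field(field, lo, hi):
    """Return the set of integers that one crontab field selects."""
    values = set()
    for part in field.split(","):
        rng, _, step = part.partition("/")
        step = int(step) if step else 1
        if rng == "*":
            start, end = lo, hi
        elif "-" in rng:
            start, end = (int(x) for x in rng.split("-"))
        else:
            start = end = int(rng)
        values.update(range(start, end + 1, step))
    return values


def cron_matches(expr, when):
    """Return True if the five-field schedule `expr` fires at minute `when`."""
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError("expected five fields: minute hour dom month dow")
    minute, hour, dom, month, dow = (
        expand_field(f, lo, hi) for f, (lo, hi) in zip(fields, FIELD_RANGES)
    )
    dow = {d % 7 for d in dow}               # treat 7 as Sunday, like 0
    cron_weekday = (when.weekday() + 1) % 7  # Python: Monday=0; cron: Sunday=0
    dom_ok = when.day in dom
    dow_ok = cron_weekday in dow
    # When both day fields are restricted, the job runs if either one matches.
    if fields[2] != "*" and fields[4] != "*":
        day_ok = dom_ok or dow_ok
    else:
        day_ok = dom_ok and dow_ok
    return (
        when.minute in minute
        and when.hour in hour
        and when.month in month
        and day_ok
    )
```

For instance, `cron_matches("*/15 9-17 * * 1-5", datetime(2024, 1, 1, 9, 30))` evaluates to `True`, because that date is a Monday and 09:30 falls on a quarter-hour within working hours.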
We will cover three options in this article to implement cron in a container-based environment.
- Install and configure cron in your Docker image
- Use the Kubernetes CronJob mechanism
- Use a software process implementation such as Cronenberg
Run Cron Using Docker
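The crontab file below, kept in your repository, defines a single job that appends a line to a log file every minute.

    * * * * * root echo "Hello world" >> /var/log/cron.log 2>&1
    # Cron requires newline characters at the end of each entry, so leave this here

The following commands are used in the Dockerfile to configure and run the cron service.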

    RUN apt-get update && apt-get -y install cron
    # cron reads files in /etc/cron.d directly; entries there include the user field
    COPY crontab /etc/cron.d/my-cron-file
    RUN chmod 0644 /etc/cron.d/my-cron-file
    RUN touch /var/log/cron.log
    CMD cron && tail -f /var/log/cron.log

At this point, you can modify the crontab file in your repository and redeploy to make changes. Containers have a more ephemeral life cycle than virtual machines, so keep this in mind when using this option: your cron jobs may be interrupted more often by a container instance starting or stopping. You may also have multiple container instances running if resource utilization is high enough, so make sure your jobs can handle contention and are idempotent. For these reasons and others, this approach may not be the right fit for your architecture, which is why container orchestration platforms offer their own mechanisms to accomplish the same goal.
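One way to picture the contention concern: if several container instances run the same crontab, each of them will try to start the same job at the same minute. The Python sketch below lets exactly one instance claim a given scheduled run by atomically creating a marker file. It rests on assumptions not found in this article: `claim_slot` is a made-up helper, `marker_dir` is a hypothetical directory on storage shared by all instances, and that storage is assumed to honor exclusive file creation. A unique constraint in a shared database could play the same role.

```python
import os


def claim_slot(marker_dir, job_name, slot):
    """Try to claim one scheduled run; return True for exactly one caller.

    `slot` identifies the scheduled time, for example "2024-01-01T09:30".
    """
    os.makedirs(marker_dir, exist_ok=True)
    safe_slot = slot.replace(":", "-")  # keep the file name portable
    marker = os.path.join(marker_dir, f"{job_name}.{safe_slot}.claimed")
    try:
        # O_CREAT | O_EXCL makes creation fail if the file already exists,
        # so only the first instance to get here wins the slot.
        fd = os.open(marker, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True
```

A job would call `claim_slot(...)` first and exit quietly when it returns `False`, so the work for a given minute happens once no matter how many instances are running.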
Use the Kubernetes CronJob Mechanism
Kubernetes itself offers a CronJob mechanism. Like other Kubernetes resources, a CronJob is defined in a manifest, an example of which is shown below.

    apiVersion: batch/v1
    kind: CronJob
    metadata:
      name: hello
    spec:
      schedule: "*/1 * * * *"
      jobTemplate:
        spec:
          template:
            spec:
              containers:
              - name: hello
                image: busybox
                imagePullPolicy: IfNotPresent
                command:
                - /bin/sh
                - -c
                - date; echo Hello from the Kubernetes cluster
              restartPolicy: OnFailure

The format itself is a bit cumbersome, but will look familiar to those dealing with Kubernetes manifest files. Note that you specify the container image on which the scheduled jobs run.
A Kubernetes CronJob does not give you the same strong guarantee as the native operating system service. The Kubernetes documentation states:
“A cron job creates a job object about once per execution time of its schedule. We say “about” because there are certain circumstances where two jobs might be created, or no job might be created. We attempt to make these rare, but do not completely prevent them. Therefore, jobs should be idempotent.”
Once again, we see a slight difference and additional considerations in terms of designing and running these scheduled processes in a container environment.
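Since a scheduled run may be duplicated or skipped, one common way to keep a job idempotent is to have it work from a checkpoint rather than from "now". The Python sketch below is only an illustration under stated assumptions: `pending_slots` and `run_job` are hypothetical helpers, and the `state` dictionary stands in for durable storage (a database row, for example) that survives between runs. A repeated run finds nothing left to do, and a run after a skipped execution catches up on the missed slots.

```python
from datetime import datetime, timedelta


def pending_slots(last_done, now, interval):
    """List the scheduled times after `last_done`, up to and including `now`."""
    slots = []
    slot = last_done + interval
    while slot <= now:
        slots.append(slot)
        slot += interval
    return slots


def run_job(state, now, interval, process):
    """Process every slot not yet recorded in `state`, advancing the checkpoint.

    `state` is a plain dict here; a real job would persist it durably.
    """
    for slot in pending_slots(state["last_done"], now, interval):
        process(slot)
        state["last_done"] = slot  # checkpoint after each completed slot
```

This handles duplicates that run one after another; two copies running at the same instant would additionally need some form of mutual exclusion around the checkpoint.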
Use a Software Process Solution such as Cronenberg
Cronenberg provides a software implementation of cron that complements your Twelve-Factor application. It runs as another process in your container, not as a system service. It is intended to be simple and portable, so it avoids hard-coded locations for the equivalent of crontab files. Instead, it takes an argument with the location of the configuration file.
To run Cronenberg, invoke it from your Dockerfile, or from a Procfile on common platform-as-a-service (PaaS) providers, passing it the location of the configuration file (cron-jobs.yml in this article).
The cron-jobs.yml file format is shown in the example below. It is much simpler than the Kubernetes manifest and a bit closer to the crontab format we are used to dealing with.

    # This is just a normal job that runs every minute
    - name: hello-world
      command: echo "Hello World"
      when: "* * * * *"

Additional Design Considerations
In each of these scenarios, your scheduled jobs run on container instances specifically deployed for this purpose. This leads to a few decision points.
- Consider whether you should auto-scale these containers, or set their scale specifically to one. If your jobs are not idempotent or you need to avoid contention, set the scale to one. If you don’t have these issues and can leverage the scalability, then by all means use those containers to their potential.
- In a strict Docker implementation, the Dockerfile determines which user your cron daemon runs as, so be sure to consider permissions and related setup in your Dockerfile. Cron will also fail silently unless you stream the output to a log file, so at a minimum you will want to set up this logging during development.
- Many scheduled processes use very few resources, so consider packing related jobs onto specific containers to use your infrastructure efficiently.
- Cron jobs run operating system commands that have their own environment and configuration. If you want to leverage application components and services in your existing projects, consider using an asynchronous job framework such as Sidekiq for Ruby on Rails apps.
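The logging point in the list above can be sketched as a small wrapper that runs a job, records everything it prints, and reports its exit status so that failures are no longer silent. This is a minimal Python illustration, not part of cron, Kubernetes, or Cronenberg: `run_logged` is a hypothetical helper, and `log` can be any writable text stream, such as an open log file.

```python
import subprocess
from datetime import datetime, timezone


def run_logged(command, log):
    """Run `command` (a list of arguments), copy its output to `log`,
    and return its exit status."""
    started = datetime.now(timezone.utc).isoformat(timespec="seconds")
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into the same stream
        text=True,
    )
    # Prefix each output line with the time the job was started.
    for line in result.stdout.splitlines():
        log.write(f"{started} {line}\n")
    log.write(f"{started} exit status {result.returncode}\n")
    log.flush()
    return result.returncode
```

Because the wrapper returns the job's exit status, a calling script can pass it on to its own exit code, which lets the scheduler or orchestrator see that a run failed.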