Let’s say you have a website that does some heavy lifting. This means that you have designed a backend and hosted it on your web server. Now, you might want to run some processes periodically, like generating thumbnails or enriching data in the background. The reason is that we don’t want these processes to interfere with the user interface while they run. They should happen somewhere in the background, and they should happen automatically. Unix-based systems have a great program for this called ‘cron’. It allows tasks to run automatically in the background at regular intervals. You could also use it to automatically create backups, synchronize files, schedule updates, and much more. So how do we set this up?
Welcome to the world of crontab
The crontab command is found in Unix-like operating systems such as Ubuntu, Mac OS X, etc. It is used to schedule commands to be executed periodically. Crontab is short for cron table. A cron table contains a list of commands that will be executed periodically. These scheduled commands are called cronjobs. To see which cronjobs are currently scheduled for the root user, you can open a terminal and run:
$ sudo crontab -l
To edit the list of cronjobs, you need to run:
$ sudo crontab -e
This will open the default editor (usually vi) to let us manipulate the crontab. When you save and exit the editor, your cronjobs are installed into the crontab. You can only execute crontab if your name appears in the file /usr/lib/cron/cron.allow. If that file does not exist, you can use crontab if your name does not appear in the file /usr/lib/cron/cron.deny. If only cron.deny exists and is empty, all users can use crontab. If neither file exists, only the root user can use crontab. The location of these files varies between systems (on many Linux distributions they are /etc/cron.allow and /etc/cron.deny), and the behavior when neither file exists can differ as well. The allow/deny files consist of one user name per line. It’s a good way to prevent a random person from scheduling a cronjob on someone else’s machine!
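The allow/deny rules above can be sketched as a small shell function. This is only an illustration of the decision logic as described here; the function name can_use_crontab and its arguments are hypothetical, and the real check is done by the crontab program itself.

```shell
#!/bin/sh
# Sketch of the allow/deny rules described above (hypothetical helper).
# Usage: can_use_crontab USER ALLOW_FILE DENY_FILE
# Returns 0 if USER may use crontab, non-zero otherwise.
can_use_crontab() {
    user=$1 allow=$2 deny=$3
    if [ -f "$allow" ]; then
        # cron.allow exists: the user must be listed in it (one name per line)
        grep -qxF "$user" "$allow"
    elif [ -f "$deny" ]; then
        # only cron.deny exists: the user must NOT be listed in it
        ! grep -qxF "$user" "$deny"
    else
        # neither file exists: only root may use crontab
        [ "$user" = "root" ]
    fi
}
```

For example, "can_use_crontab alice /usr/lib/cron/cron.allow /usr/lib/cron/cron.deny && echo allowed" prints "allowed" only if the rules above permit the (made-up) user alice.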
Once you open your crontab file with “sudo crontab -e”, we are ready to add some jobs. Each line in a crontab file has five time fields, with which you specify the minute, hour, day, month and weekday, followed by the command to be run on that schedule. The sections are separated by spaces, and only the final section (the command) may contain spaces itself. No spaces are allowed within sections 1-5, only between them. Sections 1-5 indicate when and how often you want the task to be executed. This is how a cron job is laid out:
min hr dd mo wk command
The expanded form is:
minute (0-59), hour (0-23, 0 = midnight), day (1-31), month (1-12), weekday (0-6, 0 = Sunday), command
For example, you can do something like this:
21 06 12 3 1 python /usr/bin/myfolder/myscript.py
The above example will run “python /usr/bin/myfolder/myscript.py” at 6:21 AM on March 12 plus every Monday in March. When both the day and the weekday are restricted, cron runs the command when either one matches. An asterisk (*) can be used so that every instance (every hour, every weekday, every month, etc.) of a time period is used. For example, you can do this:
21 06 * * * python /usr/bin/myfolder/myscript.py
The above example will run “python /usr/bin/myfolder/myscript.py” at 6:21 AM on every day of every month.
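To make the field layout concrete, here is a minimal sketch that splits such an entry into its five time fields and the command using the shell's read builtin. The variable names are chosen for illustration only; cron does its own parsing internally.

```shell
#!/bin/sh
# Split one crontab entry into its five time fields and the command.
# `read` assigns everything after the fifth field to the last variable,
# so spaces inside the command are preserved.
entry='21 06 * * * python /usr/bin/myfolder/myscript.py'

read -r minute hour day month weekday command <<EOF
$entry
EOF

echo "minute=$minute hour=$hour day=$day month=$month weekday=$weekday"
echo "command=$command"
```

This shows why spaces are only allowed inside the final section: the first five space-separated words are always taken as the schedule.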
Comma-separated values can be used to run a command at more than one instance within a time period. Hyphen-separated values specify a range, so the command runs at every instance from the first value through the last. For example:
01,31 04,05 1-15 1,6 * python /usr/bin/myfolder/myscript.py
The above example will run “python /usr/bin/myfolder/myscript.py” at 4:01, 4:31, 5:01 and 5:31 AM on the 1st through the 15th of every January and June.
The text “python /usr/bin/myfolder/myscript.py” indicates the task which will run at the specified times. It is recommended that you use the full path to the desired commands as shown in the above examples. The crontab will begin running as soon as it is properly edited and saved.
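If you are not sure what the full path of a command is, the shell can usually tell you. The sketch below uses "command -v"; it looks up sh only because that program exists on practically every system, so substitute the program your job actually runs (for example python).

```shell
#!/bin/sh
# Look up the full path of a command so it can be written out in a crontab entry.
# 'sh' is just a stand-in here; replace it with the program your job runs.
full_path=$(command -v sh)
echo "$full_path"
```

The printed path is what you would put in front of the script name in the crontab entry.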
You may want to run a script several times per time unit. For example, if you want to run it every 10 minutes, use the following crontab entry (it runs on minutes divisible by 10, like 0, 10, 20, 30, etc.):
*/10 * * * * python /usr/bin/myfolder/myscript.py
The above line is a more compact version of the line given below:
0,10,20,30,40,50 * * * * python /usr/bin/myfolder/myscript.py
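The list, range and step notations can all be expanded into explicit values. The following is a rough sketch of that expansion, not cron's actual parser; expand_field is a hypothetical helper that takes a field plus the lowest and highest value allowed for it.

```shell
#!/bin/sh
# Expand one crontab time field into the explicit values it matches.
# Hypothetical helper for illustration; handles lists, ranges and steps.
# Usage: expand_field FIELD MIN MAX   (e.g. expand_field '*/10' 0 59)
expand_field() {
    field=$1 min=$2 max=$3
    out=
    # Split the field on commas; set -f stops '*' from matching file names
    old_ifs=$IFS
    IFS=,
    set -f
    set -- $field
    set +f
    IFS=$old_ifs
    for part in "$@"; do
        step=1
        case $part in
            */*) step=${part#*/}; part=${part%%/*} ;;   # e.g. */10 or 1-15/2
        esac
        case $part in
            '*') lo=$min; hi=$max ;;                    # every value
            *-*) lo=${part%-*}; hi=${part#*-} ;;        # a range
            *)   lo=$part; hi=$part ;;                  # a single value
        esac
        # Drop one leading zero so values like 08 are not read as octal
        lo=${lo#0}; lo=${lo:-0}
        hi=${hi#0}; hi=${hi:-0}
        i=$lo
        while [ "$i" -le "$hi" ]; do
            out="$out${out:+ }$i"
            i=$((i + step))
        done
    done
    echo "$out"
}

expand_field '*/10' 0 59    # prints: 0 10 20 30 40 50
expand_field '01,31' 0 59   # prints: 1 31
```

The first call confirms that */10 in the minute field is the same as writing 0,10,20,30,40,50.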
Instead of the five time fields, you can also put in a single keyword:
@reboot     Run once, at startup
@yearly     Run once a year     "0 0 1 1 *"
@annually   (same as @yearly)
@monthly    Run once a month    "0 0 1 * *"
@weekly     Run once a week     "0 0 * * 0"
@daily      Run once a day      "0 0 * * *"
@midnight   (same as @daily)
@hourly     Run once an hour    "0 * * * *"
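As a quick reference, the keyword table can be written as a lookup. This is only a sketch; keyword_to_fields is a hypothetical helper name, and cron itself understands the keywords directly.

```shell
#!/bin/sh
# Translate a cron keyword into its five-field equivalent.
# Hypothetical helper for illustration only.
keyword_to_fields() {
    case $1 in
        @yearly|@annually) echo '0 0 1 1 *' ;;
        @monthly)          echo '0 0 1 * *' ;;
        @weekly)           echo '0 0 * * 0' ;;
        @daily|@midnight)  echo '0 0 * * *' ;;
        @hourly)           echo '0 * * * *' ;;
        @reboot)           echo 'no five-field equivalent: runs once at startup' ;;
        *)                 echo "unknown keyword: $1" >&2; return 1 ;;
    esac
}

keyword_to_fields @daily    # prints: 0 0 * * *
```

So an entry such as "@daily python /usr/bin/myfolder/myscript.py" behaves like "0 0 * * * python /usr/bin/myfolder/myscript.py".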
To specify an editor to open the crontab file:
$ export EDITOR=vi
To edit your crontab file, or create one if it doesn’t already exist:
$ crontab -e
To display your crontab file:
$ crontab -l
To remove your crontab file:
$ crontab -r
To display the last time you edited your crontab file (this option is only available on a few systems):
$ crontab -v
Keep in mind that cron invokes the command from the user’s HOME directory with the shell (/usr/bin/sh). cron supplies a default environment for every shell, defining:
HOME=user’s-home-directory
LOGNAME=user’s-login-id
PATH=/usr/bin:/usr/sbin
SHELL=/usr/bin/sh
Users who desire to have their .profile executed must explicitly do so in the crontab entry or in a script called by the entry.
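A job that works in your terminal but fails under cron is often missing something from this sparse environment. One way to check is to run the command by hand with a similar environment. The sketch below uses "env -i" for that; run_like_cron is a hypothetical helper, the PATH and SHELL values are the defaults quoted above and may differ on your system, and it does not change into the HOME directory the way cron does.

```shell
#!/bin/sh
# Run a command with roughly the sparse environment cron provides.
# Hypothetical helper; the values mirror the defaults listed above.
# Usage: run_like_cron 'command line'
run_like_cron() {
    # env -i starts from an empty environment and sets only these variables
    env -i \
        HOME="${HOME:-/}" \
        LOGNAME="${LOGNAME:-$(id -un)}" \
        PATH=/usr/bin:/usr/sbin \
        SHELL=/usr/bin/sh \
        /bin/sh -c "$1"
}

# Variables from your interactive shell are not visible to the job
run_like_cron 'echo "PATH is $PATH"'
```

To have your .profile executed, source it explicitly inside the command, for example run_like_cron '. "$HOME/.profile"; echo "$PATH"', which mirrors what you would write in the crontab entry itself.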
Storing the crontab output
By default, cron sends the output of a job such as /bin/execute/this/script.sh to the user’s mailbox (in this case, root). But it’s nicer if the output is saved in a separate log file. Here’s how we do it:
*/10 * * * * /bin/execute/this/script.sh >> /var/log/script_output.log 2>&1
Linux can report on different levels. There’s standard output (STDOUT) and standard error (STDERR). STDOUT is marked 1, STDERR is marked 2. So the following statement tells Linux to send STDERR to STDOUT as well, creating one data stream for messages and errors:
2>&1
Now that we have one output stream, we can pour it into a file. The symbol ‘>’ will overwrite the file, and the symbol ‘>>’ will append to the file. In this case we’d like to append:
>> /var/log/script_output.log
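You can try the difference between ‘>’ and ‘>>’ safely with a throwaway file. In this sketch, noisy_job is a hypothetical stand-in for a real cron job that writes one line to each stream, and the log is a temporary file rather than anything under /var/log.

```shell
#!/bin/sh
# Demonstrate 2>&1, '>>' and '>' with a temporary log file.
# noisy_job is a hypothetical stand-in for a real cron job.
noisy_job() {
    echo "regular message"      # goes to STDOUT (stream 1)
    echo "error message" >&2    # goes to STDERR (stream 2)
}

log=$(mktemp)

# Two runs with '>>': both streams of both runs end up in the log
noisy_job >> "$log" 2>&1
noisy_job >> "$log" 2>&1
appended=$(grep -c 'message' "$log")

# One run with '>': the file is emptied first, so only this run remains
noisy_job > "$log" 2>&1
overwritten=$(grep -c 'message' "$log")

echo "after two appends: $appended lines; after overwrite: $overwritten lines"
rm -f "$log"
```

Note that 2>&1 comes after the file redirection: STDOUT is pointed at the file first, and STDERR is then sent to wherever STDOUT is going.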
By default, cron sends an email to the user account executing the cronjob. If this is not needed, then put the following at the end of the cron job line:
>/dev/null 2>&1
You can also configure crontab to forward all output to a real email address by starting your crontab with the following line (the address shown is a placeholder):
MAILTO="yourname@yourdomain.com"
Generate log file
To collect the cron execution log in a file:
30 18 * * * rm /home/someuser/tmp/* > /home/someuser/cronlogs/clean_tmp_dir.log
There are a million other things you can do with it. Now that you are familiar with the basics, you can play around and discover them by yourself.