Database timed tasks can be used to implement regular backups, statistics collection, data aggregation, data cleaning and optimization, etc. PostgreSQL does not provide built-in task scheduling features similar to Oracle, MySQL and Microsoft SQL Sever, so this article introduces you to 4 ways to implement timed tasks in PostgreSQL database.

OS Timed Tasks

Linux timed tasks (crontab) or Windows Task Scheduler provides us with a traditional way to implement timed tasks. Taking crontab as an example, we can edit the task list using the following command.

1
crontab -e

Then add a line of data in the open file using the following format.

1
2
3
#分钟     小时   月份中的某一天  月份     星期          命令
#(0-59)  (0-23) (1-31)         (1-12)  (0-7 [7 or 0 == Sunday])
<minute> <hour> <day of month> <month> <day of week> <command>

The first five of these fields indicate when the command was executed, and you can use an asterisk (*) to match all times. For example, setting <month> to an asterisk indicates that the command is executed every month.

For example, enter the following to indicate that the database logical backup operation is executed at zero every day.

1
0 0 * * * pg_dump --no-password -U user db_name > backup.sql

For security reasons, instead of entering the password directly, you should add it to the .pgpass file and set the permissions of the file to be visible only to the current user.

1
chmod 600 .pgpass

pgAgent

pgAgent is a task scheduling agent for PostgreSQL databases, capable of running multi-step batch processes, shell scripts, and SQL commands based on complex scheduling plans. For Unix/Linux systems, pgAgent runs as a background process; for Windows systems, pgAgent runs as a service.

Install pgAgent

The PgAdmin 4 administration tool integrates with pgAgent, but both need to be installed separately. We can download PgAdmin 4 and pgAgent from the official website, and you can refer to the official documentation for installation steps and notes. After the installation is complete, we can see the “pgAgent Jobs” node in the left navigation tree of PgAdmin 4.

postgresql

Create a timed job

Right-click on the “pgAgent Jobs” node and select “Create” > “pgAgent Job” to create a new timed job.

sobyte

The “General” screen allows you to enter some basic information, including the name of the task. The “Steps” screen allows you to set multiple steps, including scripts or SQL statements to be executed. The “Schedules” screen is used to define the time schedule for the task execution. The “SQL” screen displays the statements that will be used to create or modify the task.

sobyte

Click the “Save” button to save the settings and create the job, then we can see the created job under the “pgAgent Job” node.

pg_cron

pg_cron is a PostgreSQL timed task plugin (similar to DBMS_SCHEDULER in Oracle) developed by citusdata. pg_cron runs as a background worker, using a cron-like editing syntax, allowing timed tasks to be executed directly in the database. Example.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
-- 每周六 3:30am (GMT) 删除历史记录
SELECT cron.schedule('30 3 * * 6', $$DELETE FROM events WHERE event_time < now() - interval '1 week'$$);
 schedule
----------
       42

-- 每天 10:00am (GMT) 执行清理作业
SELECT cron.schedule('nightly-vacuum', '0 10 * * *', 'VACUUM');
 schedule
----------
       43

-- 将清理作业修改为 3:00am (GMT)
SELECT cron.schedule('nightly-vacuum', '0 3 * * *', 'VACUUM');
 schedule
----------
       43

-- 停止计划中的任务
SELECT cron.unschedule('nightly-vacuum' );
 unschedule 
------------
 t
(1 row)

SELECT cron.unschedule(42);
 unschedule
------------
          t

Install pg_cron

pg_cron currently only supports Linux operating systems. For Red Hat, CentOS, and Fedora operating systems the following commands can be used to install (PostgreSQL 12).

1
sudo yum install -y pg_cron_12

For Debian and Ubuntu the following command can be used to install (PostgreSQL 12).

1
sudo apt-get -y install postgresql-12-cron

Alternatively, we can use the source code for compilation and installation.

1
2
3
4
5
git clone https://github.com/citusdata/pg_cron.git
cd pg_cron
# Ensure pg_config is in your path, e.g.
export PATH=/usr/pgsql-12/bin:$PATH
make && sudo PATH=$PATH make install

Configuring pg_cron

In order to run the pg_cron background worker process when starting PostgreSQL, we need to add pg_cron to the shared_preload_libraries configuration entry in the postgresql.conf file. By default, the pg_cron background process uses the postgres database to get the metadata it needs. However, we can also set this using the cron.database_name configuration parameter.

1
2
shared_preload_libraries = 'pg_cron'
cron.database_name = 'postgres'

Restarting PostgreSQL.

1
sudo service postgresql-12 restart

We can then use the following command to create the pg_cron function and the metadata related tables.

1
2
3
4
5
-- 使用 superuser 运行以下命令
CREATE EXTENSION pg_cron;

-- 可选操作,为其他用户授予访问权限
GRANT USAGE ON SCHEMA cron TO username;

pg_timetable

pg_timetable is a PostgreSQL job scheduler developed by CYBERTEC that provides a flexible configuration approach and many advanced features. These include task chains consisting of multiple tasks, support for SQL commands and executables, built-in tasks (e.g. sending emails), fully database-based configuration and logging capabilities, cron-style scheduled scheduling, protection against concurrent execution, etc.

sobyte

Install pg_timetable

First, we can install pg_timetable using the official binary installer, which currently supports Windows, Linux, and macOS operating systems.

Alternatively, the official docker image can be downloaded here.

The latest tab of the master branch is the latest version, and is run using the command line as follows.

1
2
3
docker run --rm \
  cybertecpostgresql/pg_timetable:latest \
  -h 10.0.0.3 -p 54321 -c worker001

Specify the environment variables in the following way.

1
2
3
4
5
docker run --rm \
  -e PGTT_PGHOST=10.0.0.3 \
  -e PGTT_PGPORT=54321 \
  cybertecpostgresql/pg_timetable:latest \
  -c worker001

In addition, we can also use the source code to compile and install. First download and install the Go language environment, and then copy the pg_timetable source code using the go get command.

1
2
3
$ env GIT_TERMINAL_PROMPT=1 go get github.com/cybertec-postgresql/pg_timetable/
Username for 'https://github.com': <Github Username>
Password for 'https://cyberboy@github.com': <Github Password>

Run pg_timetable.

1
2
$ cd ~/go/src/github.com/cybertec-postgresql/pg_timetable/
$ go run main.go --dbname=dbname --clientname=worker001 --user=scheduler --password=strongpwd

Alternatively, it can be compiled into a binary program and run.

1
2
$ go build
$ ./pg_timetable --dbname=dbname --clientname=worker001 --user=scheduler --password=strongpwd

If you want to run all the tests in your project, you can execute the following command.

1
2
3
$ cd ~/go/src/github.com/cybertec-postgresql/pg_timetable/
$ go get github.com/stretchr/testify/
$ go test ./...

You can also run tests using a postgres docker image.

1
$ RUN_DOCKER=true go test ./...

Using pg_timetable

pg_timetable runs independently of the PostgreSQL server and is equivalent to a client process. After installation, run the pg_timetable program by executing the following command.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
# ./pg_timetable

Application Options:
  -c, --clientname=               Unique name for application instance
  -v, --verbose                   Show verbose debug information [$PGTT_VERBOSE]
  -h, --host=                     PG config DB host (default: localhost) [$PGTT_PGHOST]
  -p, --port=                     PG config DB port (default: 5432) [$PGTT_PGPORT]
  -d, --dbname=                   PG config DB dbname (default: timetable) [$PGTT_PGDATABASE]
  -u, --user=                     PG config DB user (default: scheduler) [$PGTT_PGUSER]
  -f, --file=                     SQL script file to execute during startup
      --password=                 PG config DB password (default: somestrong) [$PGTT_PGPASSWORD]
      --sslmode=[disable|require] What SSL priority use for connection (default: disable)
      --pgurl=                    PG config DB url [$PGTT_URL]
      --init                      Initialize database schema and exit. Can be used with --upgrade
      --upgrade                   Upgrade database to the latest version
      --no-program-tasks          Disable executing of PROGRAM tasks [$PGTT_NOPROGRAMTASKS]

A number of examples are provided in the sample directory of the source code for reference. The following command creates a timed task that runs “MyJob” at 00:05 in August.

1
SELECT timetable.job_add('MyJob', 'SELECT public.my_func()' , NULL, 'SQL', '5 0 * 8 *', live := TRUE);

The following command runs the “MyJob” task at 23 minutes of every two hours from 0:00 to 20:00.

1
SELECT timetable.job_add('MyJob', 'SELECT public.my_func()' , NULL, 'SQL', '23 0-20/2 * * *', live := TRUE);

The complete configuration of the pg_timetable scheduled task consists of 3 phases.

  • The first phase is used to configure the base_task, defining the operations that need to be executed. This includes SQL statements, external procedures, and built-in operations.

  • The second stage is used to configure task_chain, which defines a set of basic tasks to be executed sequentially.

  • The third stage is used to configure chain_execution_config, which defines the execution plan of the task chain.

    sobyte

In addition, in order to pass control parameters to the basic tasks, each task in the task chain can be accompanied by an execution parameter. For detailed configuration methods and examples, please refer to the official website.

Summary

This article introduced 4 ways to implement timed tasks in PostgreSQL database, including OS timed tasks, pgAgent agent, pg_cron plug-in and pg_timetable tool.