There are already many articles on the web comparing PostgreSQL with MySQL, this article is just to sort out some important information. Before we start our analysis, let’s take a look at these two charts.

MySQL

MySQL claims to be the most popular open source database, and the M in LAMP refers to MySQL. MySQL is used by applications built on LAMP, such as WordPress, Drupal and most php open source programs. MySQL was originally developed by MySQL AB and then sold to Sun in 2008 for $1 billion, which was acquired by Oracle in 2010. Oracle supports several versions of MySQL: Standard, Enterprise, Classic, Cluster, Embedded and Community, some of which are free to download and some of which are paid for. The core code is based on the GPL license, and because MySQL is controlled by Oracle, the community is concerned about the impact on MySQL open source, so some branches have been developed, such as: MariaDB and Percona.

PostgreSQL

PostgreSQL touts itself as the world’s most advanced open source database. some fans of PostgreSQL say it rivals Oracle, and without the expensive price tag and arrogant customer service. Originally developed at the University of California, Berkeley in 1985 as a successor to the Ingres database, PostgreSQL is a completely community-driven open source project. It offers a single full-featured version, unlike MySQL, which offers several different community, commercial and enterprise editions. postgreSQL is based on a free BSD/MIT license, and organizations can use, copy, modify and redistribute the code, requiring only a copyright notice.

MySQL vs. PostgreSQL

MySQL is backed by a well-established commercial company, while PostgreSQL is backed by a large volunteer development group. This makes MySQL’s development process more deliberate, while PostgreSQL is more responsive. Such two backgrounds lead directly to the inherent strengths and weaknesses of each.

Advantages of PostgreSQL over MySQL

1) Not just a relational database

In addition to storing normal data types, it also supports storing.

  • array, whether it’s a one-digit array or a multi-digit array
  • json (hStore) and jsonb, which are much more efficient than using text to store transfers

difference between json and jsonb

jsonb and json look almost the same at a higher level, but are different in their store implementations.

  • json stores finished text, json columns parse the stored values each time, it does not support indexes, but you can create expression indexes for queries.
  • jsonb stores in binary format, which avoids re-parsing the data structure. It supports indexes, which means you can query any path without using the specified index.

When we compare the speed of writing data, jsonb is slightly slower than json because of the way the data is stored. json columns parse the stored values each time, which means that the keys are in the same order as they were entered. Unlike jsonb, which is stored in binary format and does not guarantee the order of the keys. So if you have software that relies on the order of the keys, jsonb may not be the best choice for your application. The advantage of using jsonb is that you can easily integrate relational and non-relational data. PostgreSQL is a real threat to document-based databases like mongodb, after all, if only one column of data in a table is of a semi-structured type, there is no need to accommodate it by designing the entire table with a schemaless structure.

2 ) Support for geographic information processing extensions

PostGIS provides PostgreSQL with support for storing spatial geographic data, making PostgreSQL a spatial database capable of spatial data management, quantity measurement and geometric topology analysis. In terms of functionality, compared with MYSQL, PostGIS has the following advantages.

The LBS business in O2O business scenario has incomparable advantages using PostgreSQL + PostGIS.

3 ) REST API can be built quickly

PostgREST can easily provide a full RESTful API service for any PostgreSQL database.

4 ) Support for tree structures

Support for extensible index types like R-trees makes it easier to handle special data. MySQL can be complicated to handle tree design and requires a lot of code to be written, while PostgreSQL can handle tree structure efficiently.

5 ) has extremely strong SQL programming capabilities

Recursion support, very rich statistical functions and statistical syntax support.

  • MySQL: CREATE PROCEDURE and CREATE FUNCTION statements are supported. Stored procedures can be written in SQL and C++. User-defined functions can be written in SQL, C, and C++.
  • PostgreSQL: There is no separate stored procedure, it is all implemented by functions. User-defined functions can be written in PL/pgSQL (a specialized procedural language), PL/Tcl, PL/Perl, PL/Python, SQL, and C.

6 ) External Data Source Support

You can query 70 external data sources (including Mysql, Oracle, CSV, hadoop …) as if they were tables in your own database. Wrapper (FDW)" feature. Originally developed by PostgreSQL community leader Dave Page four years ago based on the SQL standard SQL/MED (SQL Management of External Data), FDW provides a SQL interface to remote big data objects in remote data stores, allowing DBAs to consolidate data from unrelated data sources into a common model in the Postgres database. This allows DBAs to access and manipulate data managed by other systems, just as they would in a local Postgres table. For example, using FDW for MongoDB, a database administrator can query data from the document database and use SQL to correlate it with data from the local Postgres table. With this approach, users can view, sort, and group data as rows, columns, or JSON documents. They can even write (insert, refine, or delete) data directly from Postgres to the source document database, like an all-in-one seamless deployment. FDW enables Postgres to act as a central federated database or “Hub” for the enterprise.

7 ) No string length limit

The general relational database has a limited length of about 8k strings, and the infinite length TEXT type is limited in functionality and can only be accessed as external big data. The TEXT type of PostgreSQL can be accessed directly, and the SQL syntax has built-in regular expressions, indexing, full-text search, or xml xpath. … PostgreSQL does not have this restriction, and text can support various sizes.

8 ) Support for graph structured data storage

No specific use, you can search for yourself.

9 ) Support for window functions

Window functions provide the ability to perform calculations across row-related sets of rows of the current query. Aggregate functions are called as window functions only when they follow the OVER clause; otherwise they act as regular aggregate functions. Windows are also a type of grouping, but are different from group by’s grouping. Windows, in addition to providing grouping, can also perform calculations on each window. MySQL does not support the OVER clause, while PostgreSQL does, and the OVER clause is a simple solution to the “take the top 5 of each group” problem. MySQL supports a very small part of the SQL syntax (ANSI SQL standard). No support for recursive queries, generic table expressions (Oracle’s with statement) or window functions (parser functions).

10 ) Stronger support for indexes

PostgreSQL’s can use functions and conditional indexes, which makes PostgreSQL database tuning very flexible, mysql doesn’t have this feature, conditional indexes are important in web applications. For index types.

  • MySQL: depends on the storage engine. myISAM: BTREE, InnoDB: BTREE.
  • PostgreSQL: B-Tree, Hash, R-Tree and Gist indexes are supported.

InnoDB’s tables and indexes are stored in the same way. This means that the tables are index-organized tables. This generally requires that the primary key should not be too long and the primary key should preferably be incremented sequentially when inserting, otherwise it has a big impact on performance. postgreSQL does not have this problem.

For index types, MySQL depends on the storage engine. myISAM: BTREE, InnoDB: BTREE. postgreSQL supports B-tree, hash, R-tree and Gist indexes.

11 ) Cluster support is better

Mysql Cluster may be more different from what you think. There is less open source cluster software. The replication feature is asynchronous and has significant limitations. For example, it is single-threaded, so it is difficult for a more capable Slave to keep up with a less capable Master.

PostgreSQL has rich open source cluster software support. plproxy can support statement-level mirroring or slicing, slony can set up field-level synchronization, and standby can build WAL file-level or streaming read/write separation clusters, which are easy to adjust the synchronization frequency and cluster policy, and very simple to operate.

In addition, PostgreSQL’s primary backup replication is physical replication, which is more reliable in terms of data consistency, higher replication performance and less impact on host performance compared to MySQL’s binlog-based logical replication. For WEB applications, the replication feature is important. mysql is asynchronous replication even now, and pgsql can do synchronous, asynchronous, and semi-synchronous replication. And mysql’s synchronization is based on binlog replication, similar to oracle golden gate, is based on stream replication, to do synchronization is very difficult, this way more suitable for offsite replication, pgsql’s replication based on wal, can do synchronous replication. At the same time, pgsql also provides stream replication.

12 ) Transaction isolation is better

MySQL’s transaction isolation level repeatable read doesn’t prevent common concurrent updates, you have to add locks to do so, but pessimistic locks will affect performance, and it is complicated to implement optimistic locks manually. PostgreSQL has a hidden optimistic lock version field in the column, and the default repeatable read level can guarantee the correctness of concurrent updates with the performance of optimistic locking.

13 ) Better support for characters

MySQL needs utf8mb4 to display emoji, PostgreSQL doesn’t have this pitfall.

14 ) More complete support for table joins

MySQL has only one type of table join: nested-loop, but not sort-merge join and hash join.

15 ) Storage method supports larger data volume

PostgreSQL uses heap tables for main table storage, MySQL uses indexes to organize tables, which can support larger data volume than MySQL.

16 ) Higher Time Accuracy

MySQL has no storage type for time types such as time, date, interval, etc. at sub-second level, while PostgreSQL can be accurate to sub-second.

17 ) Optimizer is more complete

MySQL is weak on complex queries, the query optimizer is not mature enough, and explain looks at the results of execution plans simply. Performance optimization tools and metric information are insufficient.

PostgreSQL has a very powerful query optimizer that supports very complex query processing. explain returns rich information. It provides some performance views to easily see the select, delete, update, and insert statistics that occur on a table and index, as well as the cache hit rate. There is an open source pgstatspack tool online.

18) Better Sequence Support

MySQL does not support multiple tables taking ids from the same sequence, while PostgreSQL does.

19) Better support for subqueries

Support for subqueries. Although using subqueries in SQL statements is inefficient in many cases, and in most cases you can use conditional multi-table joins instead of subqueries, the presence of subqueries is still inevitable in many cases. Moreover, SQL statements that use subqueries have higher program readability than those that use conditional multi-table joins. Almost any database has better subquery performance than MySQL.

20 ) Adding Columns is Easier

Adding columns to a MySQL table is basically rebuilding the table and indexes, which can take a long time. adding columns to a PostgreSQL table is just adding table definitions to the data dictionary and does not rebuild the table.

Advantages of MySQL over PostgreSQL

1) MySQL More popular than PostgreSQL

Popularity is also a very important indicator for a commercial software. Popularity means more users, means more tested, means better commercial support, means more and better documentation. Easy to use, very easy to install. Third-party tools, including visualization tools, allow users to get started easily.

2 ) Rollback implementation is superior

innodb’s rollback segment-based implementation of the MVCC mechanism is superior to PG’s XID-based MVCC mechanism that stores old and new data together. New and old data stored together, the need to regularly trigger VACUUM, will bring excess IO and database object locking overhead, causing the overall concurrency of the database to decline. And VACUUM cleanup is not timely, it may also trigger data bloat.

3 ) Runs more reliably on Windows

Compared with PostgreSQL, MySQL is more suitable to run under Windows environment. MySQL runs as a native Windows application (under NT/Win2000/WinXP, it is a service), while PostgreSQL runs under Cygwin emulation environment. postgreSQL running under It should be conceivable that PostgreSQL does not run as stable as MySQL under Windows.

4 ) Advantages of threaded mode over process mode

MySQL uses threads, while PostgreSQL uses processes. It is obviously much faster to switch environments and access common storage areas between different threads than between different processes.

  • Process mode is more efficient for multiple CPU utilization. Process mode shared data requires the use of shared memory, while thread mode data itself is all shared within the process space, and access by different threads only requires controlled synchronization between threads.
  • Threaded mode consumes less resources. So MySQL can support far more connections than PostgreSQL. But PostgreSQL has excellent connection pooling software software, such as pgbouncer and pgpool, so it can also support many connections through connection pooling.

5 ) Better Permission Setting

MySQL has a more sophisticated permissions system than PostgreSQL in some respects; whereas PostgreSQL only supports INSERT, SELECT and UPDATE/DELETE per user on one database or one data table, MySQL allows you to define a whole set of different data-level, table-level and column-level permissions. For column-level permissions, PostgreSQL can compensate by creating views and determining the permissions for the views. MySQL also allows you to specify host-based permissions, which is not possible with current PostgreSQL, but is useful in many cases.

6 ) Storage engine plug-in mechanism

MySQL’s storage engine plug-in mechanism makes it more widely used, for example, myisam is suitable for static data query scenarios in addition to innodb for transaction processing scenarios.

7 ) Adaptable to 24/7 operation

MySQL can adapt to run 24/7. In most cases, you don’t need to run any cleanup for MySQL. postgreSQL is still not fully adapted to run 24/7, this is because you have to run VACUUM every once in a while.

8 ) is more tried and true for simple scenarios

PostgreSQL only supports heap tables, not index-organized tables, Innodb only supports index-organized tables.

  • Advantage of index-organized table: the data in the table is organized by index, the data is ordered, and if the data are accessed by primary key, then it is faster to access the data. While heap tables, when accessing data by primary key, it is necessary to find the physical location of the data by primary key index first.
  • Disadvantage of indexed organization table: When other indexes are added to the indexed organization table, the data location of other index records is no longer the physical location, but the primary key value, so for indexed organization table, the value of the primary key cannot be too large, otherwise the space occupied is relatively large.
  • For index-organized tables, if you insert data in the middle each time, it may cause index splitting, and index splitting will greatly reduce the performance of inserting. So for using innodb, we generally better make the primary key a nonsensical sequence so that inserts occur at the end each time to avoid this problem.

Since the index organization table is organized by an index tree, generally it accesses data blocks must be accessed by the relationship between data blocks, not by the physical block of accessing data, so when doing a full table scan is much slower than a heap table, which may not be obvious in OLTP, but may be a problem in data warehouse applications.

Summary

MySQL was not intended to do everything from the start, and thus it has some limitations in terms of functionality and does not meet the requirements of some advanced applications. the way MySQL implements certain features (such as referencing, transactions, auditing, etc.) makes it lack some reliability compared to other relational databases. For simple and heavy read operations, using PostgreSQL can be a bit of a fuss, while performance is worse than its counterpart like MySQL. Unless you need absolute data integrity, ACID compliance or a complex design, PostgreSQL is a bit redundant for simple scenarios.

The following rules are always valid if you are sure to choose between MySQL and PostgreSQL only.

  • If your operating system is Windows, you should use MySQL.
  • When reliability and data integrity are absolutely necessary, PostgreSQL is the better choice.
  • If you need the database to perform custom procedures, then the scalable PostgreSQL is a better choice.
  • Your application deals with geographic data, and because of R-TREES, you should use PostgreSQL.
  • If you don’t know much about databases, or even what exactly transactions, stored procedures, etc. are, you should use MySQL.