Release Date: 2012-06-07
Postgres-XC 1.0 is a symetric (multi-master, read and write-scalable) shared-nothing cluster based on PostgreSQL. This release version is based on PostgreSQL 9.1.
Currently the only architectures supported are 64 bit Linux operating systems.
This release of Postgres-XC is the first major release and contains the following features, characteristics and enhancements:
Support and extensions for existing features of PostgreSQL in a cluster-wide environment.
SQL extensions and functionalities exclusive to Postgres-XC for management and operations related to a cluster, which add a node-level granularity for cluster operations.
Creation of Global Transaction Manager (GTM), which is a centralized component providing cluster-wide Multi-version Concurrency Control (MVCC).
Creation of mechanisms exclusive to Postgres-XC and enhancements of existing internal mechanisms of PostgreSQL, which are related to connection pooling, global transaction management, query planning, rewriting, analyzing and execution.
Restrictions related to existing features in PostgreSQL and currently not supported by Postgres-XC.
The above items are explained in more detail in the sections below.
The original overall architecture and design of Postgres-XC is by Koichi Suzuki, Mason Sharp, Pavan Deolasee, Andrei Martsinchyk and Michael Paquier. Koichi Suzuki is the original project lead.
This section is divided into the following parts:
Existing standard features supported and related extensions
SQL extensions exclusive to Postgres-XC
Internal mechanisms exclusive to Postgres-XC
Global transaction management features
This is an exhaustive list of all the features included in PostgreSQL and currently supported in Postgres-XC. Some features have needed extensions. In this case, the contributors are specified. If not listed in the restrictions, all the PostgreSQL standard functionalities are supported and are expected to work.
List of all the CREATE/ALTER/DROP SQL commands supported. All the features listed here work like native PostgreSQL. Extensions may have been added to make them usable in a cluster environment.
CREATE USER, ALTER USER, DROP USER, CREATE AGGREGATE, ALTER AGGREGATE, DROP AGGREGATE, CREATE COLLATION, ALTER COLLATION, DROP COLLATION, CREATE CONVERSION, ALTER CONVERSION, DROP CONVERSION, CREATE DATABASE, ALTER DATABASE, DROP DATABASE, ALTER DEFAULT PRIVILEGES, CREATE DOMAIN, ALTER DOMAIN, DROP DOMAIN, CREATE FUNCTION, ALTER FUNCTION, DROP FUNCTION, CREATE GROUP, ALTER GROUP, DROP GROUP, CREATE INDEX, ALTER INDEX, DROP INDEX, CREATE LANGUAGE, ALTER LANGUAGE, DROP LANGUAGE, CREATE OPERATOR CLASS, ALTER OPERATOR CLASS, DROP OPERATOR CLASS, CREATE OPERATOR FAMILY, ALTER OPERATOR FAMILY, DROP OPERATOR FAMILY, CREATE ROLE, ALTER ROLE, DROP ROLE, CREATE SCHEMA, ALTER SCHEMA, DROP SCHEMA, CREATE SEQUENCE, ALTER SEQUENCE, DROP SEQUENCE, CREATE TABLE, ALTER TABLE, DROP TABLE, CREATE TABLESPACE, ALTER TABLESPACE, DROP TABLESPACE, CREATE TEXT SEARCH CONFIGURATION, ALTER TEXT SEARCH CONFIGURATION, DROP TEXT SEARCH CONFIGURATION, CREATE TEXT SEARCH DICTIONARY, ALTER TEXT SEARCH DICTIONARY, DROP TEXT SEARCH DICTIONARY, CREATE TEXT SEARCH PARSER, ALTER TEXT SEARCH PARSER, DROP TEXT SEARCH PARSER, CREATE TEXT SEARCH TEMPLATE, ALTER TEXT SEARCH TEMPLATE, DROP TEXT SEARCH TEMPLATE, CREATE TYPE, ALTER TYPE, DROP TYPE, CREATE USER, ALTER USER, DROP USER, CREATE VIEW, ALTER VIEW, DROP VIEW, CREATE CAST, DROP CAST, CREATE RULE, DROP RULE, CREATE TABLE AS,
List of other supported SQL
ANALYZE, BEGIN, CHECKPOINT, CLOSE, CLUSTER, COMMENT, COMMIT, COMMIT PREPARED, COPY, DEALLOCATE, DECLARE, DELETE, DISCARD, DO, END, EXECUTE, EXPLAIN, FETCH, GRANT, INSERT, LOAD, LOCK, MOVE, PREPARE, PREPARE TRANSACTION, REASSIGN OWNED, REINDEX, RESET, REVOKE, ROLLBACK, ROLLBACK PREPARED, SELECT, SELECT INTO, SET, SET CONSTRAINTS, SET ROLE, SET SESSION AUTHORIZATION, SHOW, START TRANSACTION, TRUNCATE, UPDATE, VACUUM, VALUES
HOT-Standby and streaming replication, working as in native PostgreSQL, at the Coordinator and Datanode level.
pg_dump (Michael Paquier): extension to manage table distribution type in a dump.
initdb (Michael Paquier): addition of a new mandatory option --nodename=nodename to specify a node name when initializing it. This adds a parameter (pgxc_node_name) to the node's postgresql.conf file that the node uses to identify itself.
ORDER BY, LIMIT, OFFSET clauses: those clauses can be evaluated directly on Datanodes or on Coordinators depending on how query is shippable or not to remote nodes (Andrei Martsinchyk).
Table distribution definition (Mason Sharp, Abbas Butt, Michael Paquier): CREATE TABLE has been extended with DISTRIBUTE BY to be able to specify the table distribution at table creation. Distribution is tuple-based. Four types are supported: replication, hash-distribution, modulo-distribution and round robin.
The node list where table data is stored can also be defined with the extension TO NODE or TO GROUP.
All the distribution data is stored in catalog pgxc_node .
JOIN clauses (Mason Sharp, Andrei Martsinchyk): extension to generate a remote join plan depending on the quals used in queries and table distribution types. This remote join plan can be reduced depending on distribution types or nodes where table data is located.
GROUP BY clause (Ashutosh Bapat): some optimizations have been added in planner to evaluate if a clause can be shipped to remote nodes or not.
Window functions (Ashutosh Bapat): such functions are not shipped to remote nodes and are exclusively evaluated on Coordinator
INSERT with multiple values (Wang Diancheng, Pavan Deolasee): this feature has been extended to manage the case of distributed tables by hash, modulo or round robin.
Aggregate functions (Andrei Martsinchyk, Ashutosh Bapat): a collection function is determined on Coordinator to collect correctly the results from Datanodes.
SELECT FOR UPDATE (Abbas Butt). The implementation is not the most performant as in the case of a join on native PostgreSQL, where locks are taken on the tuples where join condition is achieved, but instead locks are taken on all the tuples used to evaluate the join condition.
SERIAL types (Michael Paquier): they behave like native PostgreSQL, but an extension has been added to create the table having SERIAL columns in cluster correctly.
TABLESPACE (Michael Paquier): an extension has been made to allow nodes to share the same tablespace repository even if those nodes are on the same server. It is possible to use EXECUTE DIRECT with CREATE TABLESPACE and DROP TABLESPACE to choose the tablespace folder for each node.
INSERT SELECT (Pavan Deolasee): extensions have been done to manage that for all the table distribution types.
PREPARE and EXECUTE (Andrei Martsinchyk, Ashutosh Bapat): extensions have been done to send parse and bind parameters to remote Datanodes. Those functions behave like in vanilla.
EXPLAIN (Andrei Martsinchyk, Ashutosh Bapat, Michael Paquier): Addition of options to print the nodes involved in query, as well as extensions to print remote queries for remote node plans.
COPY (Andrei Martsinchyk, Amit Khandekar): some extension has been added to copy data to remote nodes depending on table distribution. Support for BINARY and DEFAULT columns using non-shippable expressions as default expressions are also supported.
CREATE TABLE AS and SELECT INTO (Pavan Deolasee): Support is extended with remote node planning and table distributions. Table distribution can be specified for CREATE TABLE AS. SELECT INTO uses the default distribution type, i.e distribution by hash on the first column found as hashable or round robin if nothing can be used as a distribution key.
Session parameters (Michael Paquier): session and local parameters are all supported. Those parameters are managed with connection pooling. Connections with user-customized session parameters are locked in session and not pulled back to pool. When a session finishes, session parameter reset is made on each connection that was used by session and then pulled back to connection pool.
Sequence consistency in transaction blocks (Michael Paquier): As sequence values are evaluated in Global Transaction Manager, a callback mechanism for the following cases has been created.
Drop a sequence on GTM when the transaction that created this sequence aborts.
Drop a sequence on GTM when the transaction that dropped this sequence commits.
Fallback to its former name a sequence whose name has been modified in a transaction that aborts.
System functions (Amit Khandekar and Koichi Suzuki): most of them are working like native but some extensions have been
added for relation-size functions (
pg_database_size()) and locking functions.
New configuration parameters in postgresql.conf
Port opened by pooler to which backends can connect to communicate with
Minimum number of connections in pool
Maximum number of connections in pool
On/off switch to make sessions keep the same connections when used. This may be useful in lower concurrency environments with many session parameters set.
Maximum number of Coordinators that can be defined in local node
Maximum number of Datanodes that can be defined in local node
Host to connect to GTM
Port to connect to GTM
Name of the local node. This is currently set by initdb.
Forbid unsafe SQL
Control parameter used for temporary objects to enforce the use of autocommit on transactions that used temporary objects if disabled.
This section lists all the new SQL functionalities and system functions that are exclusive to Postgres-XC and can be used to manage a cluster environment.
These SQL commands are used to manage cluster node information in catalog pgxc_node.
These commands run only on the local node where they are run, and running them on Datanodes make no sense as this catalog data is used only by Coordinator to identify remote nodes and by connection pooling to get necessary remote connection information.
CREATE NODE GROUP and DROP NODE GROUP manage the node groups that can be used when creating a table with the extension TO GROUP of CREATE TABLE.
CREATE BARRIER (Pavan Deolasee)
When specified with an ID, this command allows to register in all of the nodes of the cluster a common and consistent time point to be able to recover all the nodes consistently back to this point. Internally, a barrier is written in the WAL file of all of the nodes.
recovery_target_barrier in recovery.conf can be used to recover a node to a given barrier ID.
CLEAN CONNECTION (Michael Paquier)
CLEAN CONNECTION is a connection pooling utility able to drop connections on chosen node(s) for a given database or/and user.
pgxc_pool_reload() (Michael Paquier, Abbas Butt)
Those system functions can be used to check or update the data cached in pooler with
checks if the connection information is consistent between pooler cache and catalogs.
pgxc_pool_reload() updates the connection information cached in pool.
EXECUTE DIRECT (Andrei Martsinchyk, Michael Paquier)
EXECUTE DIRECT can be used to launch a query directly to a given node. Only a single node can be targetted at the same time.
INSERT, UPDATE and DELETE are not authorized.
Utilities are basically forbidden but some are authorized for cluster management purposes.
Connection pooling (Andrei Martsinchyk, Michael Paquier)
Connection pooling is added on Coordinator to dynamically manage and with a minimum cost connections to remote nodes. This pooler uses data of catalog pgxc_node . It is a separate process that is forked off of the postmaster.
Connection pools are divided per both user and database for security reasons.
Fast-query shipping (Mason Sharp, Ashutosh Bapat, Andrei Martsinchyk)
Fast-query shipping (FQS) is an additional planner exclusive to Postgres-XC designed to determine as fast as possible if a query can be completely shipped to Datanodes depending on its parsed content. A fallback to standard planner is made if query cannot be shipped as-is.
Remote query planning, standard and remote join (Mason Sharp, Andrei Martsinchyk, Pavan Deolasee, Ashutosh Bapat)
This additional planner can build a plan to replace the PostgreSQL scan plan by a RemoteQuery plan able to scan remote nodes in executor. This includes plans to be able to build remote joins, reducible join functionalities are also included.
Remote INSERT, UPDATE, DELETE planning (Pavan Deolasee, Michael Paquier)
This additional planner can be used by PostgreSQL standard planner to generate plans for DML queries to remote nodes. Those plans are placed on top of an inner scan plan.
Remode node location identification and data (Michael Paquier, Abbas Butt)
Nodes use their names as unique identifiers in the cluster and use the information stored in pgxc_node to define the location of remote node to be used. Cluster nodes can also be defined as groups stored in pgxc_group .
Global Transaction manager (GTM) is a module exclusive to Postgres-XC able to maintain the same level of MVCC as vanilla PostgreSQL by distributing global transaction ID and global snapshot. GTM is also used to store global information about two-phase commit transactions when external application requested it. Sequence information (name, values) is managed directly in GTM.
This section lists all the modules and extensions related to GTM.
GTM-Proxy can be used between Postgres-XC nodes and GTM to control the number of messages by grouping them.
GTM-Standby is a solution that prevents GTM single point of failure. (Koichi Suzuki)
initgtm (Michael Paquier)
Module that can initialize GTM data folder. Can be used to initialize GTM or GTM-Proxy.
gtm.conf (Koichi Suzuki)
Configuration file of GTM. More details about configuration parameters here.
gtm_proxy.conf (Koichi Suzuki)
Configuration file of GTM Proxy. More details about configuration parameters here.
Module similar to pg_ctl able to control GTM, GTM-Standby and GTM-Proxy in similar ways. pg_ctl can be used to reconnect GTM-Proxy to a new GTM and to failover a GTM-Standy to become a primary GTM.
List of all the unsupported SQL commands
CREATE EXTENSION, ALTER EXTENSION, DROP EXTENSION, CREATE FOREIGN DATA WRAPPER, ALTER FOREIGN DATA WRAPPER, DROP FOREIGN DATA WRAPPER, CREATE FOREIGN TABLE, ALTER FOREIGN TABLE, DROP FOREIGN TABLE, ALTER LARGE OBJECT, CREATE SERVER, ALTER SERVER, DROP SERVER, CREATE TRIGGER, ALTER TRIGGER, DROP TRIGGER, CREATE USER MAPPING, ALTER USER MAPPING, DROP USER MAPPING, LISTEN, NOTIFY, RELEASE SAVEPOINT, ROLLBACK TO SAVEPOINT, SAVEPOINT, SECURITY LABEL, UNLISTEN,
Table distribution definition cannot be changed.
Distribution key of a table cannot be updated.
Utilities cannot be used in SQL functions.
DML cannot be used in plpgsql functions.
CREATE TABLE AS EXECUTE is not supported.
WHERE CURRENT OF is not supported.
RETURNING is not supported.
In cursors, MOVE BACKWARD works only if SCROLL is used in CURSOR
WITH HOLD cursors are not supported.
Statictics are not managed globally. When requesting statistics only the data related to node directly connected to is taken into account.
Barriers have no timeout functionalities, meaning that if a 2PC transaction is stuck forever, barrier will be stuck too.
NTT Open Source Software Center