E.4. Release 1.0

Release Date: 2012-06-07

E.4.1. Overview

Postgres-XC 1.0 is a symmetric (multi-master, read- and write-scalable) shared-nothing cluster based on PostgreSQL. This release is based on PostgreSQL 9.1.

Currently, only 64-bit Linux operating systems are supported.

This release of Postgres-XC is the first major release and contains the following features, characteristics and enhancements:

The above items are explained in more detail in the sections below.

The original overall architecture and design of Postgres-XC is by Koichi Suzuki, Mason Sharp, Pavan Deolasee, Andrei Martsinchyk and Michael Paquier. Koichi Suzuki is the original project lead.

E.4.2. Details

This section is divided into the following parts:

E.4.2.1. Existing standard features supported and related extensions

This is an exhaustive list of the PostgreSQL features currently supported in Postgres-XC. Some features required extensions; in those cases, the contributors are specified. Unless listed in the restrictions, all standard PostgreSQL functionality is supported and expected to work.

E.4.2.2. SQL extensions exclusive to Postgres-XC

This section lists all the new SQL functionalities and system functions that are exclusive to Postgres-XC and can be used to manage a cluster environment.

  • CREATE NODE, ALTER NODE, DROP NODE (Michael Paquier, Abbas Butt)

    These SQL commands are used to manage cluster node information in catalog pgxc_node.

    These commands affect only the node on which they are run. Running them on Datanodes makes little sense, as this catalog data is used only by Coordinators to identify remote nodes and by the connection pooler to obtain remote connection information.
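    A minimal sketch of node management, assuming a hypothetical Datanode named dn1 (host and port values are illustrative):

    ```sql
    -- Register a Datanode in pgxc_node on the local Coordinator
    CREATE NODE dn1 WITH (TYPE = 'datanode', HOST = 'localhost', PORT = 15432);

    -- Update the port after the Datanode has been moved
    ALTER NODE dn1 WITH (PORT = 15433);

    -- Remove the node definition from the local catalog
    DROP NODE dn1;
    ```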

  • CREATE NODE GROUP, DROP NODE GROUP (Michael Paquier, Abbas Butt)

    CREATE NODE GROUP and DROP NODE GROUP manage the node groups that can be used when creating a table with the extension TO GROUP of CREATE TABLE.
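    A sketch under the assumption that Datanodes dn1 and dn2 already exist in pgxc_node (table and group names are illustrative):

    ```sql
    -- Group two Datanodes, then create a table distributed over that group
    CREATE NODE GROUP gp1 WITH dn1, dn2;
    CREATE TABLE t1 (a int, b text) DISTRIBUTE BY HASH (a) TO GROUP gp1;
    DROP NODE GROUP gp1;
    ```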

  • CREATE BARRIER (Pavan Deolasee)

    When specified with an ID, this command registers a common, consistent point in time on all of the nodes of the cluster, so that all of the nodes can later be recovered consistently back to this point. Internally, a barrier is written to the WAL file of every node.

    recovery_target_barrier in recovery.conf can be used to recover a node to a given barrier ID.
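    A minimal sketch, with an illustrative barrier name:

    ```sql
    -- On a Coordinator: write a barrier named 'before_upgrade' into every node's WAL
    CREATE BARRIER 'before_upgrade';
    ```

    To recover a node back to that point, set recovery_target_barrier = 'before_upgrade' in its recovery.conf.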

  • CLEAN CONNECTION (Michael Paquier)

    CLEAN CONNECTION is a connection pooling utility that drops connections on chosen node(s) for a given database and/or user.
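    One plausible invocation, assuming a Datanode dn1 plus a database and user whose names are illustrative:

    ```sql
    -- Drop pooled connections to dn1 held for database mydb and user appuser
    CLEAN CONNECTION TO NODE (dn1) FOR DATABASE mydb TO USER appuser;
    ```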

  • pgxc_pool_check(), pgxc_pool_reload() (Michael Paquier, Abbas Butt)

    These system functions can be used to check or update the connection data cached in the pooler against pgxc_node. pgxc_pool_check() verifies that the connection information is consistent between the pooler cache and the catalogs. pgxc_pool_reload() updates the connection information cached in the pool.
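    For example, after changing node definitions on a Coordinator:

    ```sql
    SELECT pgxc_pool_check();   -- compare pooler cache with pgxc_node
    SELECT pgxc_pool_reload();  -- refresh the pooler cache from the catalogs
    ```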

  • EXECUTE DIRECT (Andrei Martsinchyk, Michael Paquier)

    EXECUTE DIRECT can be used to run a query directly on a given node. Only a single node can be targeted at a time.

    INSERT, UPDATE and DELETE are not authorized.

    Utility commands are generally forbidden, though some are allowed for cluster management purposes.
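    A sketch, assuming a node named dn1 (the exact syntax of the node clause may differ between versions):

    ```sql
    -- Run a read-only query on the single node dn1
    EXECUTE DIRECT ON (dn1) 'SELECT count(*) FROM pg_class';
    ```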

E.4.2.3. Internal mechanisms exclusive to Postgres-XC

  • Connection pooling (Andrei Martsinchyk, Michael Paquier)

    Connection pooling is added on the Coordinator to dynamically manage connections to remote nodes at minimum cost. The pooler uses data from the catalog pgxc_node. It is a separate process forked off of the postmaster.

    Connection pools are separated by both user and database for security reasons.
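    A hypothetical Coordinator postgresql.conf fragment; parameter values are illustrative:

    ```
    pooler_port = 6667      # port the pooler process listens on
    min_pool_size = 1       # minimum connections kept per database/user pair
    max_pool_size = 100     # upper bound on pooled connections per pair
    ```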

  • Fast-query shipping (Mason Sharp, Ashutosh Bapat, Andrei Martsinchyk)

    Fast-query shipping (FQS) is an additional planner exclusive to Postgres-XC, designed to determine as quickly as possible, based on the parsed query, whether the query can be shipped completely to Datanodes. If the query cannot be shipped as-is, planning falls back to the standard planner.
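    As an illustration, assuming a table t1 hash-distributed on column a: a query with an equality condition on the distribution column can typically be shipped whole to a single Datanode, and EXPLAIN shows whether FQS produced a fully shipped plan.

    ```sql
    EXPLAIN SELECT * FROM t1 WHERE a = 10;
    ```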

  • Remote query planning, standard and remote join (Mason Sharp, Andrei Martsinchyk, Pavan Deolasee, Ashutosh Bapat)

    This additional planner can build a plan that replaces the PostgreSQL scan plan with a RemoteQuery plan able to scan remote nodes in the executor. This includes plans for building remote joins; reducible-join functionality is also included.

  • Remote INSERT, UPDATE, DELETE planning (Pavan Deolasee, Michael Paquier)

    This additional planner can be used by the standard PostgreSQL planner to generate plans for DML queries targeting remote nodes. These plans are placed on top of an inner scan plan.

  • Remote node location identification and data (Michael Paquier, Abbas Butt)

    Nodes use their names as unique identifiers in the cluster and use the information stored in pgxc_node to determine the locations of the remote nodes to be used. Cluster nodes can also be organized into groups, stored in pgxc_group.
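    The catalogs can be inspected directly; column names are given as remembered and may differ slightly:

    ```sql
    -- Cluster layout as seen from the local node
    SELECT node_name, node_type, node_host, node_port FROM pgxc_node;

    -- Defined node groups
    SELECT group_name FROM pgxc_group;
    ```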

E.4.2.4. Global Transaction Manager (Pavan Deolasee, with input from Koichi Suzuki and Mason Sharp)

The Global Transaction Manager (GTM) is a module exclusive to Postgres-XC that maintains the same level of MVCC as vanilla PostgreSQL by distributing global transaction IDs and global snapshots. GTM is also used to store global information about two-phase commit transactions when an external application requests it. Sequence information (names and values) is managed directly in GTM.

This section lists all the modules and extensions related to GTM.

GTM-Proxy can be used between Postgres-XC nodes and GTM to reduce the number of messages by grouping them.

GTM-Standby is a solution that prevents GTM from being a single point of failure. (Koichi Suzuki)

  • initgtm (Michael Paquier)

    Module that initializes the GTM data folder. It can be used to initialize either GTM or GTM-Proxy.
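    A sketch of typical invocations; the data directory paths are illustrative:

    ```shell
    # Initialize data folders for a GTM and a GTM-Proxy
    initgtm -Z gtm -D /var/lib/pgxc/gtm
    initgtm -Z gtm_proxy -D /var/lib/pgxc/gtm_proxy
    ```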

  • gtm.conf (Koichi Suzuki)

    Configuration file of GTM. See the documentation for more details about its configuration parameters.
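    A hypothetical gtm.conf fragment; names and values are illustrative:

    ```
    nodename = 'gtm_master'      # unique name of this GTM instance
    listen_addresses = '*'
    port = 6666
    startup = ACT                # ACT for an active GTM, STANDBY for a standby
    ```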

  • gtm_proxy.conf (Koichi Suzuki)

    Configuration file of GTM-Proxy. See the documentation for more details about its configuration parameters.
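    A hypothetical gtm_proxy.conf fragment; names, hosts, and values are illustrative:

    ```
    nodename = 'gtm_proxy1'
    listen_addresses = '*'
    port = 6666
    gtm_host = 'gtm-server'      # host where the active GTM runs
    gtm_port = 6666
    worker_threads = 1
    ```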

  • gtm_ctl

    Module similar to pg_ctl, able to control GTM, GTM-Standby, and GTM-Proxy in similar ways. gtm_ctl can be used to reconnect a GTM-Proxy to a new GTM and to fail over a GTM-Standby to become the primary GTM.
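    A sketch of typical invocations; paths, host names, and the exact reconnect options are given as remembered and may differ:

    ```shell
    # Start a GTM
    gtm_ctl start -Z gtm -D /var/lib/pgxc/gtm

    # Promote a GTM-Standby to become the primary GTM
    gtm_ctl promote -Z gtm -D /var/lib/pgxc/gtm_standby

    # Reconnect a GTM-Proxy to a new GTM host and port
    gtm_ctl reconnect -Z gtm_proxy -D /var/lib/pgxc/gtm_proxy -o "-s newhost -t 6666"
    ```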

E.4.2.5. Restrictions

E.4.3. Acknowledgement