E.21. ltree

Note: The following description applies both to Postgres-XC and PostgreSQL if not described explicitly.

This module implements a data type ltree for representing labels of data stored in a hierarchical tree-like structure. Extensive facilities for searching through label trees are provided.

Note: The following description applies only to Postgres-XC

Please note that a column of the type ltree cannot be a distribution column.

E.21.1. Definitions

Note: The following description applies both to Postgres-XC and PostgreSQL if not described explicitly.

A label is a sequence of alphanumeric characters and underscores (for example, in C locale the characters A-Za-z0-9_ are allowed). Labels must be less than 256 bytes long.

Examples: 42, Personal_Services

A label path is a sequence of zero or more labels separated by dots, for example L1.L2.L3, representing a path from the root of a hierarchical tree to a particular node. The length of a label path must be less than 65Kb, but keeping it under 2Kb is preferable. In practice this is not a major limitation; for example, the longest label path in the DMOZ catalogue (http://www.dmoz.org) is about 240 bytes.

Example: Top.Countries.Europe.Russia

The ltree module provides several data types:

Note: ltxtquery allows whitespace between symbols, but ltree and lquery do not.

E.21.2. Operators and Functions

Note: The following description applies both to Postgres-XC and PostgreSQL if not described explicitly.

Type ltree has the usual comparison operators =, <>, <, >, <=, >=. Comparison sorts in the order of a tree traversal, with the children of a node sorted by label text. In addition, the specialized operators shown in Table E-12 are available.

Table E-12. ltree Operators

OperatorReturnsDescription
ltree @> ltreebooleanis left argument an ancestor of right (or equal)?
ltree <@ ltreebooleanis left argument a descendant of right (or equal)?
ltree ~ lquerybooleandoes ltree match lquery?
lquery ~ ltreebooleandoes ltree match lquery?
ltree ? lquery[]booleandoes ltree match any lquery in array?
lquery[] ? ltreebooleandoes ltree match any lquery in array?
ltree @ ltxtquerybooleandoes ltree match ltxtquery?
ltxtquery @ ltreebooleandoes ltree match ltxtquery?
ltree || ltreeltreeconcatenate ltree paths
ltree || textltreeconvert text to ltree and concatenate
text || ltreeltreeconvert text to ltree and concatenate
ltree[] @> ltreebooleandoes array contain an ancestor of ltree?
ltree <@ ltree[]booleandoes array contain an ancestor of ltree?
ltree[] <@ ltreebooleandoes array contain a descendant of ltree?
ltree @> ltree[]booleandoes array contain a descendant of ltree?
ltree[] ~ lquerybooleandoes array contain any path matching lquery?
lquery ~ ltree[]booleandoes array contain any path matching lquery?
ltree[] ? lquery[]booleandoes ltree array contain any path matching any lquery?
lquery[] ? ltree[]booleandoes ltree array contain any path matching any lquery?
ltree[] @ ltxtquerybooleandoes array contain any path matching ltxtquery?
ltxtquery @ ltree[]booleandoes array contain any path matching ltxtquery?
ltree[] ?@> ltreeltreefirst array entry that is an ancestor of ltree; NULL if none
ltree[] ?<@ ltreeltreefirst array entry that is a descendant of ltree; NULL if none
ltree[] ?~ lqueryltreefirst array entry that matches lquery; NULL if none
ltree[] ?@ ltxtqueryltreefirst array entry that matches ltxtquery; NULL if none

The operators <@, @>, @ and ~ have analogues ^<@, ^@>, ^@, ^~, which are the same except they do not use indexes. These are useful only for testing purposes.

The available functions are shown in Table E-13.

Table E-13. ltree Functions

FunctionReturn TypeDescriptionExampleResult
subltree(ltree, int start, int end)ltreesubpath of ltree from position start to position end-1 (counting from 0)subltree('Top.Child1.Child2',1,2)Child1
subpath(ltree, int offset, int len)ltreesubpath of ltree starting at position offset, length len. If offset is negative, subpath starts that far from the end of the path. If len is negative, leaves that many labels off the end of the path.subpath('Top.Child1.Child2',0,2)Top.Child1
subpath(ltree, int offset)ltreesubpath of ltree starting at position offset, extending to end of path. If offset is negative, subpath starts that far from the end of the path.subpath('Top.Child1.Child2',1)Child1.Child2
nlevel(ltree)integernumber of labels in pathnlevel('Top.Child1.Child2')3
index(ltree a, ltree b)integerposition of first occurrence of b in a; -1 if not foundindex('0.1.2.3.5.4.5.6.8.5.6.8','5.6')6
index(ltree a, ltree b, int offset)integerposition of first occurrence of b in a, searching starting at offset; negative offset means start -offset labels from the end of the pathindex('0.1.2.3.5.4.5.6.8.5.6.8','5.6',-4)9
text2ltree(text)ltreecast text to ltree
ltree2text(ltree)textcast ltree to text
lca(ltree, ltree, ...)ltreelowest common ancestor, i.e., longest common prefix of paths (up to 8 arguments supported)lca('1.2.2.3','1.2.3.4.5.6')1.2
lca(ltree[])ltreelowest common ancestor, i.e., longest common prefix of pathslca(array['1.2.2.3'::ltree,'1.2.3'])1.2

E.21.3. Indexes

Note: The following description applies both to Postgres-XC and PostgreSQL if not described explicitly.

ltree supports several types of indexes that can speed up the indicated operators:

E.21.4. Example

Note: The following description applies both to Postgres-XC and PostgreSQL if not described explicitly.

This example uses the following data (also available in file contrib/ltree/ltreetest.sql in the source distribution):

CREATE TABLE test (path ltree);
INSERT INTO test VALUES ('Top');
INSERT INTO test VALUES ('Top.Science');
INSERT INTO test VALUES ('Top.Science.Astronomy');
INSERT INTO test VALUES ('Top.Science.Astronomy.Astrophysics');
INSERT INTO test VALUES ('Top.Science.Astronomy.Cosmology');
INSERT INTO test VALUES ('Top.Hobbies');
INSERT INTO test VALUES ('Top.Hobbies.Amateurs_Astronomy');
INSERT INTO test VALUES ('Top.Collections');
INSERT INTO test VALUES ('Top.Collections.Pictures');
INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy');
INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Stars');
INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Galaxies');
INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Astronauts');
CREATE INDEX path_gist_idx ON test USING gist(path);
CREATE INDEX path_idx ON test USING btree(path);

Now, we have a table test populated with data describing the hierarchy shown below:

                        Top
                     /   |  \
             Science Hobbies Collections
                 /       |              \
        Astronomy   Amateurs_Astronomy Pictures
           /  \                            |
Astrophysics  Cosmology                Astronomy
                                        /  |    \
                                 Galaxies Stars Astronauts

We can do inheritance:

ltreetest=> SELECT path FROM test WHERE path <@ 'Top.Science';
                path
------------------------------------
 Top.Science
 Top.Science.Astronomy
 Top.Science.Astronomy.Astrophysics
 Top.Science.Astronomy.Cosmology
(4 rows)

Here are some examples of path matching:

ltreetest=> SELECT path FROM test WHERE path ~ '*.Astronomy.*';
                     path
-----------------------------------------------
 Top.Science.Astronomy
 Top.Science.Astronomy.Astrophysics
 Top.Science.Astronomy.Cosmology
 Top.Collections.Pictures.Astronomy
 Top.Collections.Pictures.Astronomy.Stars
 Top.Collections.Pictures.Astronomy.Galaxies
 Top.Collections.Pictures.Astronomy.Astronauts
(7 rows)

ltreetest=> SELECT path FROM test WHERE path ~ '*.!pictures@.*.Astronomy.*';
                path
------------------------------------
 Top.Science.Astronomy
 Top.Science.Astronomy.Astrophysics
 Top.Science.Astronomy.Cosmology
(3 rows)

Here are some examples of full text search:

ltreetest=> SELECT path FROM test WHERE path @ 'Astro*% & !pictures@';
                path
------------------------------------
 Top.Science.Astronomy
 Top.Science.Astronomy.Astrophysics
 Top.Science.Astronomy.Cosmology
 Top.Hobbies.Amateurs_Astronomy
(4 rows)

ltreetest=> SELECT path FROM test WHERE path @ 'Astro* & !pictures@';
                path
------------------------------------
 Top.Science.Astronomy
 Top.Science.Astronomy.Astrophysics
 Top.Science.Astronomy.Cosmology
(3 rows)

Path construction using functions:

ltreetest=> SELECT subpath(path,0,2)||'Space'||subpath(path,2) FROM test WHERE path <@ 'Top.Science.Astronomy';
                 ?column?
------------------------------------------
 Top.Science.Space.Astronomy
 Top.Science.Space.Astronomy.Astrophysics
 Top.Science.Space.Astronomy.Cosmology
(3 rows)

We could simplify this by creating a SQL function that inserts a label at a specified position in a path:

CREATE FUNCTION ins_label(ltree, int, text) RETURNS ltree
    AS 'select subpath($1,0,$2) || $3 || subpath($1,$2);'
    LANGUAGE SQL IMMUTABLE;

ltreetest=> SELECT ins_label(path,2,'Space') FROM test WHERE path <@ 'Top.Science.Astronomy';
                ins_label
------------------------------------------
 Top.Science.Space.Astronomy
 Top.Science.Space.Astronomy.Astrophysics
 Top.Science.Space.Astronomy.Cosmology
(3 rows)

E.21.5. Authors

All work was done by Teodor Sigaev () and Oleg Bartunov (). See http://www.sai.msu.su/~megera/postgres/gist/ for additional information. Authors would like to thank Eugeny Rodichev for helpful discussions. Comments and bug reports are welcome.