Postgres-XC 0.9.7 Documentation
Prev	Fast Backward	Appendix E. Additional Supplied Modules	Fast Forward	Next

E.21. ltree

Note: The following description applies both to Postgres-XC and PostgreSQL if not described explicitly.

This module implements a data type ltree for representing labels of data stored in a hierarchical tree-like structure. Extensive facilities for searching through label trees are provided.

Note: The following description applies only to Postgres-XC

Please note that a column of the type ltree cannot be a distribution column.

E.21.1. Definitions

Note: The following description applies both to Postgres-XC and PostgreSQL if not described explicitly.

A label is a sequence of alphanumeric characters and underscores (for example, in C locale the characters A-Za-z0-9_ are allowed). Labels must be less than 256 bytes long.

Examples: 42, Personal_Services

A label path is a sequence of zero or more labels separated by dots, for example L1.L2.L3, representing a path from the root of a hierarchical tree to a particular node. The length of a label path must be less than 65Kb, but keeping it under 2Kb is preferable. In practice this is not a major limitation; for example, the longest label path in the DMOZ catalogue (http://www.dmoz.org) is about 240 bytes.

Example: Top.Countries.Europe.Russia

The ltree module provides several data types:

ltree stores a label path.
lquery represents a regular-expression-like pattern for matching ltree values. A simple word matches that label within a path. A star symbol (*) matches zero or more labels. For example:
```
foo         Match the exact label path foo
*.foo.*     Match any label path containing the label foo
*.foo       Match any label path whose last label is foo
```
Star symbols can also be quantified to restrict how many labels they can match:
```
*{n}        Match exactly n labels
*{n,}       Match at least n labels
*{n,m}      Match at least n but not more than m labels
*{,m}       Match at most m labels — same as  *{0,m}
```
There are several modifiers that can be put at the end of a non-star label in lquery to make it match more than just the exact match:
```
@           Match case-insensitively, for example a@ matches A
*           Match any label with this prefix, for example foo* matches foobar
%           Match initial underscore-separated words
```
The behavior of % is a bit complicated. It tries to match words rather than the entire label. For example foo_bar% matches foo_bar_baz but not foo_barbaz. If combined with *, prefix matching applies to each word separately, for example foo_bar%* matches foo1_bar2_baz but not foo1_br2_baz.
Also, you can write several possibly-modified labels separated with | (OR) to match any of those labels, and you can put ! (NOT) at the start to match any label that doesn't match any of the alternatives.
Here's an annotated example of lquery:
```
Top.*{0,2}.sport*@.!football|tennis.Russ*|Spain
a.  b.     c.      d.               e.
```
This query will match any label path that:
1. begins with the label Top
2. and next has zero to two labels before
3. a label beginning with the case-insensitive prefix sport
4. then a label not matching football nor tennis
5. and then ends with a label beginning with Russ or exactly matching Spain.
ltxtquery represents a full-text-search-like pattern for matching ltree values. An ltxtquery value contains words, possibly with the modifiers @, *, % at the end; the modifiers have the same meanings as in lquery. Words can be combined with & (AND), | (OR), ! (NOT), and parentheses. The key difference from lquery is that ltxtquery matches words without regard to their position in the label path.
Here's an example ltxtquery:
```
Europe & Russia*@ & !Transportation
```
This will match paths that contain the label Europe and any label beginning with Russia (case-insensitive), but not paths containing the label Transportation. The location of these words within the path is not important. Also, when % is used, the word can be matched to any underscore-separated word within a label, regardless of position.

Note: ltxtquery allows whitespace between symbols, but ltree and lquery do not.

E.21.2. Operators and Functions

Note: The following description applies both to Postgres-XC and PostgreSQL if not described explicitly.

Type ltree has the usual comparison operators =, <>, <, >, <=, >=. Comparison sorts in the order of a tree traversal, with the children of a node sorted by label text. In addition, the specialized operators shown in Table E-12 are available.

Table E-12. ltree Operators

Operator	Returns	Description
`ltree` `@>` `ltree`	`boolean`	is left argument an ancestor of right (or equal)?
`ltree` `<@` `ltree`	`boolean`	is left argument a descendant of right (or equal)?
`ltree` `~` `lquery`	`boolean`	does `ltree` match `lquery`?
`lquery` `~` `ltree`	`boolean`	does `ltree` match `lquery`?
`ltree` `?` `lquery[]`	`boolean`	does `ltree` match any `lquery` in array?
`lquery[]` `?` `ltree`	`boolean`	does `ltree` match any `lquery` in array?
`ltree` `@` `ltxtquery`	`boolean`	does `ltree` match `ltxtquery`?
`ltxtquery` `@` `ltree`	`boolean`	does `ltree` match `ltxtquery`?
`ltree` `\|\|` `ltree`	`ltree`	concatenate `ltree` paths
`ltree` `\|\|` `text`	`ltree`	convert text to `ltree` and concatenate
`text` `\|\|` `ltree`	`ltree`	convert text to `ltree` and concatenate
`ltree[]` `@>` `ltree`	`boolean`	does array contain an ancestor of `ltree`?
`ltree` `<@` `ltree[]`	`boolean`	does array contain an ancestor of `ltree`?
`ltree[]` `<@` `ltree`	`boolean`	does array contain a descendant of `ltree`?
`ltree` `@>` `ltree[]`	`boolean`	does array contain a descendant of `ltree`?
`ltree[]` `~` `lquery`	`boolean`	does array contain any path matching `lquery`?
`lquery` `~` `ltree[]`	`boolean`	does array contain any path matching `lquery`?
`ltree[]` `?` `lquery[]`	`boolean`	does `ltree` array contain any path matching any `lquery`?
`lquery[]` `?` `ltree[]`	`boolean`	does `ltree` array contain any path matching any `lquery`?
`ltree[]` `@` `ltxtquery`	`boolean`	does array contain any path matching `ltxtquery`?
`ltxtquery` `@` `ltree[]`	`boolean`	does array contain any path matching `ltxtquery`?
`ltree[]` `?@>` `ltree`	`ltree`	first array entry that is an ancestor of `ltree`; NULL if none
`ltree[]` `?<@` `ltree`	`ltree`	first array entry that is a descendant of `ltree`; NULL if none
`ltree[]` `?~` `lquery`	`ltree`	first array entry that matches `lquery`; NULL if none
`ltree[]` `?@` `ltxtquery`	`ltree`	first array entry that matches `ltxtquery`; NULL if none

The operators <@, @>, @ and ~ have analogues ^<@, ^@>, ^@, ^~, which are the same except they do not use indexes. These are useful only for testing purposes.

The available functions are shown in Table E-13.

Table E-13. ltree Functions

Function	Return Type	Description	Example	Result
`subltree(ltree, int start, int end)`	`ltree`	subpath of `ltree` from position `start` to position `end`-1 (counting from 0)	`subltree('Top.Child1.Child2',1,2)`	`Child1`
`subpath(ltree, int offset, int len)`	`ltree`	subpath of `ltree` starting at position `offset`, length `len`. If `offset` is negative, subpath starts that far from the end of the path. If `len` is negative, leaves that many labels off the end of the path.	`subpath('Top.Child1.Child2',0,2)`	`Top.Child1`
`subpath(ltree, int offset)`	`ltree`	subpath of `ltree` starting at position `offset`, extending to end of path. If `offset` is negative, subpath starts that far from the end of the path.	`subpath('Top.Child1.Child2',1)`	`Child1.Child2`
`nlevel(ltree)`	`integer`	number of labels in path	`nlevel('Top.Child1.Child2')`	`3`
`index(ltree a, ltree b)`	`integer`	position of first occurrence of `b` in `a`; -1 if not found	`index('0.1.2.3.5.4.5.6.8.5.6.8','5.6')`	`6`
`index(ltree a, ltree b, int offset)`	`integer`	position of first occurrence of `b` in `a`, searching starting at `offset`; negative `offset` means start `-offset` labels from the end of the path	`index('0.1.2.3.5.4.5.6.8.5.6.8','5.6',-4)`	`9`
`text2ltree(text)`	`ltree`	cast `text` to `ltree`
`ltree2text(ltree)`	`text`	cast `ltree` to `text`
`lca(ltree, ltree, ...)`	`ltree`	lowest common ancestor, i.e., longest common prefix of paths (up to 8 arguments supported)	`lca('1.2.2.3','1.2.3.4.5.6')`	`1.2`
`lca(ltree[])`	`ltree`	lowest common ancestor, i.e., longest common prefix of paths	`lca(array['1.2.2.3'::ltree,'1.2.3'])`	`1.2`

E.21.3. Indexes

Note: The following description applies both to Postgres-XC and PostgreSQL if not described explicitly.

ltree supports several types of indexes that can speed up the indicated operators:

B-tree index over ltree: <, <=, =, >=, >
GiST index over ltree: <, <=, =, >=, >, @>, <@, @, ~, ?
Example of creating such an index:
```
CREATE INDEX path_gist_idx ON test USING GIST (path);
```
GiST index over ltree[]: ltree[] <@ ltree, ltree @> ltree[], @, ~, ?
Example of creating such an index:
```
CREATE INDEX path_gist_idx ON test USING GIST (array_path);
```
Note: This index type is lossy.

E.21.4. Example

Note: The following description applies both to Postgres-XC and PostgreSQL if not described explicitly.

This example uses the following data (also available in file contrib/ltree/ltreetest.sql in the source distribution):

CREATE TABLE test (path ltree);
INSERT INTO test VALUES ('Top');
INSERT INTO test VALUES ('Top.Science');
INSERT INTO test VALUES ('Top.Science.Astronomy');
INSERT INTO test VALUES ('Top.Science.Astronomy.Astrophysics');
INSERT INTO test VALUES ('Top.Science.Astronomy.Cosmology');
INSERT INTO test VALUES ('Top.Hobbies');
INSERT INTO test VALUES ('Top.Hobbies.Amateurs_Astronomy');
INSERT INTO test VALUES ('Top.Collections');
INSERT INTO test VALUES ('Top.Collections.Pictures');
INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy');
INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Stars');
INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Galaxies');
INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Astronauts');
CREATE INDEX path_gist_idx ON test USING gist(path);
CREATE INDEX path_idx ON test USING btree(path);

Now, we have a table test populated with data describing the hierarchy shown below:

                        Top
                     /   |  \
             Science Hobbies Collections
                 /       |              \
        Astronomy   Amateurs_Astronomy Pictures
           /  \                            |
Astrophysics  Cosmology                Astronomy
                                        /  |    \
                                 Galaxies Stars Astronauts

We can do inheritance:

ltreetest=> SELECT path FROM test WHERE path <@ 'Top.Science';
                path
------------------------------------
 Top.Science
 Top.Science.Astronomy
 Top.Science.Astronomy.Astrophysics
 Top.Science.Astronomy.Cosmology
(4 rows)

Here are some examples of path matching:

ltreetest=> SELECT path FROM test WHERE path ~ '*.Astronomy.*';
                     path
-----------------------------------------------
 Top.Science.Astronomy
 Top.Science.Astronomy.Astrophysics
 Top.Science.Astronomy.Cosmology
 Top.Collections.Pictures.Astronomy
 Top.Collections.Pictures.Astronomy.Stars
 Top.Collections.Pictures.Astronomy.Galaxies
 Top.Collections.Pictures.Astronomy.Astronauts
(7 rows)

ltreetest=> SELECT path FROM test WHERE path ~ '*.!pictures@.*.Astronomy.*';
                path
------------------------------------
 Top.Science.Astronomy
 Top.Science.Astronomy.Astrophysics
 Top.Science.Astronomy.Cosmology
(3 rows)

Here are some examples of full text search:

ltreetest=> SELECT path FROM test WHERE path @ 'Astro*% & !pictures@';
                path
------------------------------------
 Top.Science.Astronomy
 Top.Science.Astronomy.Astrophysics
 Top.Science.Astronomy.Cosmology
 Top.Hobbies.Amateurs_Astronomy
(4 rows)

ltreetest=> SELECT path FROM test WHERE path @ 'Astro* & !pictures@';
                path
------------------------------------
 Top.Science.Astronomy
 Top.Science.Astronomy.Astrophysics
 Top.Science.Astronomy.Cosmology
(3 rows)

Path construction using functions:

ltreetest=> SELECT subpath(path,0,2)||'Space'||subpath(path,2) FROM test WHERE path <@ 'Top.Science.Astronomy';
                 ?column?
------------------------------------------
 Top.Science.Space.Astronomy
 Top.Science.Space.Astronomy.Astrophysics
 Top.Science.Space.Astronomy.Cosmology
(3 rows)

We could simplify this by creating a SQL function that inserts a label at a specified position in a path:

CREATE FUNCTION ins_label(ltree, int, text) RETURNS ltree
    AS 'select subpath($1,0,$2) || $3 || subpath($1,$2);'
    LANGUAGE SQL IMMUTABLE;

ltreetest=> SELECT ins_label(path,2,'Space') FROM test WHERE path <@ 'Top.Science.Astronomy';
                ins_label
------------------------------------------
 Top.Science.Space.Astronomy
 Top.Science.Space.Astronomy.Astrophysics
 Top.Science.Space.Astronomy.Cosmology
(3 rows)

E.21.5. Authors

All work was done by Teodor Sigaev (<teodor@stack.net>) and Oleg Bartunov (<oleg@sai.msu.su>). See http://www.sai.msu.su/~megera/postgres/gist/ for additional information. Authors would like to thank Eugeny Rodichev for helpful discussions. Comments and bug reports are welcome.

Prev	Home	Next
lo	Up	oid2name