CREATE INDEX

Define a new index.

Summary

CREATE [UNIQUE] INDEX [ CONCURRENTLY ] [name] ON table_name [USING method]
       ( {column_name | (expression)} [COLLATE parameter] [opclass] [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] )
       [ WITH ( storage_parameter = value [, ... ] ) ]
       [ TABLESPACE tablespace ]
       [ WHERE predicate ]

Description

CREATE INDEX constructs an index on the specified column(s) of the specified table. Indexes are primarily used to enhance database performance (although improper use can result in performance degradation).

The key fields of the index are specified as column names or as expressions enclosed in parentheses. Multiple fields can be specified if the index method supports multi-column indexes.

Index fields can be expressions computed from one or more columns of the table row. This feature allows for rapid data access based on transformations of the underlying data. For example, an index computed on upper(col) will enable a clause like WHERE upper(col) = 'JIM' to utilize the index.

YMatrix provides indexing methods including B-tree, bitmap, GiST, SP-GiST, and GIN. Users can also define their own indexing methods, though this is quite complex.

When a WHERE clause is present, a partial index is created. A partial index only contains entries for a subset of a table, which is often more useful than the rest of the table. For instance, if you have a table containing both invoiced and uninvoiced orders, where uninvoiced orders make up a small portion of the total but are frequently accessed, creating an index on just that subset can enhance performance.

Expressions used in the WHERE clause may only reference columns of the underlying table, but they can use all columns, not just the indexed ones. Subqueries and aggregate expressions are prohibited in WHERE. The same restrictions apply to indexed fields that are expressions.

All functions and operators used in the index definition must be immutable. Their results must depend solely on their parameters and must not be influenced by external factors such as the contents of another table or parameter values. This ensures well-defined index behavior. To use user-defined functions in index expressions or WHERE clauses, remember to mark them as IMMUTABLE during function creation.

Parameters

  • UNIQUE: Creates an index and checks for duplicate values each time data is added. Duplicate entries will trigger an error. Unique indexes are only applicable to B-tree indexes. In YMatrix, a unique index is only permitted if the index key columns match the YMatrix distribution key (or a superset thereof). On partitioned tables, unique indexes are only supported within individual partitions and cannot span all partitions.

  • CONCURRENTLY: Creates an index without locks that prevent insert, update, and delete operations. A standard index creation locks the table for write operations (though reads are unaffected) until completion. When CONCURRENTLY is specified, YMatrix avoids locking writes. For temporary tables, CREATE INDEX is typically non-concurrent as no other sessions can access them. Non-concurrent index creation has lower overhead.

  • name: The name of the index to be created. Indexes are always created in the same schema as their parent table. If omitted, YMatrix will generate a suitable name based on the parent table and index columns.

  • table_name: The name of the table on which to create the index (can be schema-qualified).

  • method: The name of the indexing method to use. Options include btree, bitmap, gist, spgist, and gin. The default is btree. Currently, only B-tree, GiST, and GIN index methods support multi-column indexes. By default, up to 32 fields can be specified. Only B-tree currently supports unique indexes.

  • column_name: The name of the table column on which to create the index. Only B-tree, bitmap, GiST, and GIN index methods support multi-column indexes.

  • expression: An expression based on one or more columns of the table. The expression must typically be enclosed in parentheses, as shown in the syntax. However, parentheses can be omitted if the expression is a function call.

  • collation: The collation name used by the index. By default, the index uses the collation declared for the column or the result collation of the expression. Indexes with non-default collations are useful for queries involving expressions with non-default collations.

  • opclass: The name of the operator class. The operator class identifies the operators to be used by the index for the column. For example, a B-tree index on a four-byte integer will use the int4_ops class. The default operator class for the column's data type is usually sufficient. Operator classes are useful when multiple meaningful orderings exist for a data type. For instance, complex numbers can be ordered by absolute value or real part, requiring separate operator classes for each.

  • ASC: Specifies ascending order sorting (the default).

  • DESC: Specifies descending order sorting.

  • NULLS FIRST: Specifies that nulls appear before non-nulls. This is the default when DESC is specified.

  • NULLS LAST: Specifies that nulls appear after non-nulls. This is the default when DESC is not specified.

  • storage_parameter: The name of a storage parameter specific to the index method. Each index method has its own set of allowed storage parameters.

    • FILLFACTOR: Accepted by B-tree, bitmap, GiST, and SP-GiST index methods. It determines the packing density of index pages. For B-tree, leaf pages are filled to this percentage during initial build and rightward expansion. If pages fill completely, they split, gradually reducing index efficiency. B-tree defaults to 90, but values between 10 and 100 can be chosen. For static tables, fillfactor 100 minimizes physical size, while smaller values better handle heavily updated tables. Other index methods use fillfactor similarly, with method-specific defaults.

    • BUFFERING: In addition to FILLFACTOR, GiST indexes accept the BUFFERING parameter. It determines whether buffering build technology is used. Setting it to OFF disables it; ON enables it; AUTO initially disables it but enables it once the index size reaches the effective cache size. The default is AUTO.

    • FASTUPDATE: The GIN index method accepts the FASTUPDATE storage parameter. This boolean parameter enables or disables GIN fast update technology. ON enables it (the default), while OFF disables it. See PostgreSQL documentation for details.

  • tablespace_name: The tablespace in which to create the index. If not specified, the default tablespace is used.

  • predicate: The constraint expression for a partial index.

Notes

An operator class can be specified for each index column. The operator class identifies the operators to be used by the index for the column. For example, a B-tree index on a four-byte integer will use the int4_ops class. This class contains comparison functions for four-byte integers. The default operator class for the column's data type is usually sufficient. Operator classes are useful when multiple meaningful orderings exist for a data type. For instance, complex numbers can be ordered by absolute value or real part, requiring separate operator classes for each.

For index methods supporting ordered scans (currently only B-tree), optional clauses ASC, DESC, NULLS FIRST, and/or NULLS LAST can modify the index's sort order. Since ordered indexes can be scanned forward or backward, creating a single-column DESC index is typically unnecessary as the default index can provide the sort order. The value of these options lies in creating multi-column indexes that match the sort order requested by mixed-sorting queries, such as SELECT ... ORDER BY x ASC, y DESC. The NULLS options are useful for supporting "nulls sort low" behavior in queries relying on indexes to avoid sort steps, contrary to the default "nulls sort high" behavior.

For most index methods, index creation speed depends on maintenance_work_mem. Larger values reduce index creation time, provided they do not exceed actual available memory, which would cause swapping.

After creating an index on a partitioned table, it will propagate to all sub-tables created by YMatrix. Creating indexes on tables for use with partitioned tables created by YMatrix is not supported.

A UNIQUE index is only allowed if the index columns match the YMatrix distribution key columns (or a superset thereof).

Unique indexes are not permitted on append-optimized tables.

Unique indexes can be created on partitioned tables. However, uniqueness is only enforced within each partition. Uniqueness is not enforced across partitions. For example, for a partitioned table with partitions based on years and subpartitions based on quarters, uniqueness is only enforced within each quarter partition. Uniqueness is not enforced across quarter partitions.

By default, indexes are not used for IS NULL clauses. In such cases, the best way to utilize an index is to create a partial index using the IS NULL predicate.

Bitmap indexes perform best for columns with 100 to 100,000 distinct values. For columns with over 100,000 distinct values, bitmap index performance and space efficiency decline. The size of a bitmap index is proportional to the number of rows in the table multiplied by the number of distinct values in the indexed column.

Columns with fewer than 100 distinct values typically do not benefit much from any type of index. For example, a gender column with only two distinct values (male and female) is not an ideal candidate for indexing.

Previous versions of YMatrix also included the R-tree indexing method. This method has been removed as it offers no significant advantages over the GiST method. Specifying USING rtree will be interpreted as USING gist by CREATE INDEX.

The use of hash indexes has been disabled in YMatrix.

Example

To create a B-tree index on the title column of the films table:

CREATE UNIQUE INDEX title_idx ON films (title);

To create a bitmap index on the gender column in the employee table:

CREATE INDEX gender_bmp_idx ON employee USING bitmap (gender);

To create an index on the expression lower(title) to enable case-insensitive searches:

CREATE INDEX ON films ((lower(title)));

(In this example, we omit the index name, so the system will generate one, typically films_lower_idx.)

To create an index using non-default collation:

CREATE INDEX title_idx_german ON films (title COLLATE "de_DE");

To create an index with a non-default fill factor:

CREATE UNIQUE INDEX title_idx ON films (title) WITH (fillfactor = 70);

To create a GIN index with fast updates disabled:

CREATE INDEX gin_idx ON documents_table USING gin (locations) WITH (fastupdate = off);

To create an index on the code column of the films table and store the index in the indexspace tablespace:

CREATE INDEX code_idx ON films(code) TABLESPACE indexspace;

To create a GiST index on the point property to effectively use the box operator for conversion function results:

CREATE INDEX pointloc ON points USING gist (box(location, location));
SELECT * FROM points WHERE box(location, location) && '(0,0),(1,1)'::box;

To create an index for which write operations are not enabled:

CREATE INDEX CONCURRENTLY sales_quantity_index ON sales_table (quantity);

Compatibility

CREATE INDEX is an extension of the YMatrix database language. There is no indexing provision in the SQL standard.

See Also

ALTER INDEX, DROP INDEX