---
# ----------------------------------------------------------------------------
#
# *** AUTO GENERATED CODE *** Type: MMv1 ***
#
# ----------------------------------------------------------------------------
#
# This file is automatically generated by Magic Modules and manual
# changes will be clobbered when the file is regenerated.
#
# Please read more about how to change this file in
# .github/CONTRIBUTING.md.
#
# ----------------------------------------------------------------------------
subcategory: "Vertex AI"
description: |-
A representation of a collection of database items organized in a way that allows for approximate nearest neighbor (a.
---
# google_vertex_ai_index
A representation of a collection of database items organized in a way that allows for approximate nearest neighbor (ANN) search algorithms.
To get more information about Index, see:
* [API documentation](https://cloud.google.com/vertex-ai/docs/reference/rest/v1/projects.locations.indexes/)
## Example Usage - Vertex Ai Index
```hcl
resource "google_storage_bucket" "bucket" {
  name                        = "vertex-ai-index-test"
  location                    = "us-central1"
  uniform_bucket_level_access = true
}

# The sample data comes from the following link:
# https://cloud.google.com/vertex-ai/docs/matching-engine/filtering#specify-namespaces-tokens
resource "google_storage_bucket_object" "data" {
  name    = "contents/data.json"
  bucket  = google_storage_bucket.bucket.name
  content = <<EOF
{"id": "42", "embedding": [0.5, 1.0], "restricts": [{"namespace": "class", "allow": ["cat", "pet"]},{"namespace": "category", "allow": ["feline"]}]}
{"id": "43", "embedding": [0.6, 1.0], "restricts": [{"namespace": "class", "allow": ["dog", "pet"]},{"namespace": "category", "allow": ["canine"]}]}
EOF
}

resource "google_vertex_ai_index" "index" {
  labels = {
    foo = "bar"
  }
  region       = "us-central1"
  display_name = "test-index"
  description  = "index for test"
  metadata {
    contents_delta_uri = "gs://${google_storage_bucket.bucket.name}/contents"
    config {
      dimensions                  = 2
      approximate_neighbors_count = 150
      shard_size                  = "SHARD_SIZE_SMALL"
      distance_measure_type       = "DOT_PRODUCT_DISTANCE"
      algorithm_config {
        tree_ah_config {
          leaf_node_embedding_count    = 500
          leaf_nodes_to_search_percent = 7
        }
      }
    }
  }
  index_update_method = "BATCH_UPDATE"
}
```
## Example Usage - Vertex Ai Index Streaming
```hcl
resource "google_storage_bucket" "bucket" {
  name                        = "vertex-ai-index-test"
  location                    = "us-central1"
  uniform_bucket_level_access = true
}

# The sample data comes from the following link:
# https://cloud.google.com/vertex-ai/docs/matching-engine/filtering#specify-namespaces-tokens
resource "google_storage_bucket_object" "data" {
  name    = "contents/data.json"
  bucket  = google_storage_bucket.bucket.name
  content = <<EOF
{"id": "42", "embedding": [0.5, 1.0], "restricts": [{"namespace": "class", "allow": ["cat", "pet"]},{"namespace": "category", "allow": ["feline"]}]}
{"id": "43", "embedding": [0.6, 1.0], "restricts": [{"namespace": "class", "allow": ["dog", "pet"]},{"namespace": "category", "allow": ["canine"]}]}
EOF
}

resource "google_vertex_ai_index" "index" {
  labels = {
    foo = "bar"
  }
  region       = "us-central1"
  display_name = "test-index"
  description  = "index for test"
  metadata {
    contents_delta_uri = "gs://${google_storage_bucket.bucket.name}/contents"
    config {
      dimensions            = 2
      shard_size            = "SHARD_SIZE_LARGE"
      distance_measure_type = "COSINE_DISTANCE"
      feature_norm_type     = "UNIT_L2_NORM"
      algorithm_config {
        brute_force_config {}
      }
    }
  }
  index_update_method = "STREAM_UPDATE"
}
```
## Argument Reference
The following arguments are supported:
* `display_name` -
(Required)
The display name of the Index. The name can be up to 128 characters long and can consist of any UTF-8 characters.
- - -
* `description` -
(Optional)
The description of the Index.
* `metadata` -
(Optional)
Additional information about the Index.
Structure is [documented below](#nested_metadata).
* `labels` -
(Optional)
The labels with user-defined metadata to organize your Indexes.
**Note**: This field is non-authoritative, and will only manage the labels present in your configuration.
Please refer to the field `effective_labels` for all of the labels present on the resource.
* `index_update_method` -
(Optional)
The update method to use with this Index. The value must be one of the following. If not set, BATCH_UPDATE is used by default.
  * BATCH_UPDATE: the user can call indexes.patch with files on Cloud Storage containing the datapoints to update.
  * STREAM_UPDATE: the user can call indexes.upsertDatapoints/DeleteDatapoints to update the Index, and the updates are applied to the corresponding DeployedIndexes in near real time.
* `region` -
(Optional)
The region of the index, e.g. `us-central1`.
* `project` - (Optional) The ID of the project in which the resource belongs.
If it is not provided, the provider project is used.
<a name="nested_metadata"></a>The `metadata` block supports:
* `contents_delta_uri` -
(Required)
Allows inserting, updating or deleting the contents of the Matching Engine Index.
The string must be a valid Cloud Storage directory path. If this
field is set when calling IndexService.UpdateIndex, then no other
Index field can also be updated as part of the same call.
The expected structure and format of the files this URI points to is
described at https://cloud.google.com/vertex-ai/docs/matching-engine/using-matching-engine#input-data-format
* `is_complete_overwrite` -
(Optional)
If this field is set together with contentsDeltaUri when calling IndexService.UpdateIndex,
then existing content of the Index will be replaced by the data from the contentsDeltaUri.
* `config` -
(Optional)
The configuration of the Matching Engine Index.
Structure is [documented below](#nested_config).
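
As a minimal sketch of how `contents_delta_uri` and `is_complete_overwrite` fit together, the configuration below points the index at a new directory of datapoint files and asks for the existing contents to be replaced. The bucket name `my-embeddings-bucket`, the `v2` prefix, and the resource name `overwrite_example` are placeholders chosen for illustration, and the `config` values simply mirror the batch example above.

```hcl
# Hypothetical names: "my-embeddings-bucket", the "v2" prefix and
# "overwrite_example" are placeholders, not values from this page.
resource "google_vertex_ai_index" "overwrite_example" {
  region       = "us-central1"
  display_name = "overwrite-example"

  metadata {
    # Cloud Storage directory that holds the replacement datapoint files.
    contents_delta_uri = "gs://my-embeddings-bucket/v2"

    # Replace the existing contents with the data under contents_delta_uri.
    is_complete_overwrite = true

    config {
      dimensions                  = 2
      approximate_neighbors_count = 150
      shard_size                  = "SHARD_SIZE_SMALL"
      distance_measure_type       = "DOT_PRODUCT_DISTANCE"
      algorithm_config {
        tree_ah_config {
          leaf_node_embedding_count    = 500
          leaf_nodes_to_search_percent = 7
        }
      }
    }
  }

  index_update_method = "BATCH_UPDATE"
}
```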
<a name="nested_config"></a>The `config` block supports:
* `dimensions` -
(Required)
The number of dimensions of the input vectors.
* `approximate_neighbors_count` -
(Optional)
The default number of neighbors to find via approximate search before exact reordering is
performed. Exact reordering is a procedure where results returned by an
approximate search algorithm are reordered via a more expensive distance computation.
Required if tree-AH algorithm is used.
* `shard_size` -
(Optional)
Index data is split into equal parts to be processed. These are called "shards".
The shard size must be specified when creating an index. The value must be one of the following:
  * SHARD_SIZE_SMALL: Small (2GB)
  * SHARD_SIZE_MEDIUM: Medium (20GB)
  * SHARD_SIZE_LARGE: Large (50GB)
* `distance_measure_type` -
(Optional)
The distance measure used in nearest neighbor search. The value must be one of the following:
  * SQUARED_L2_DISTANCE: Euclidean (L_2) Distance
  * L1_DISTANCE: Manhattan (L_1) Distance
  * COSINE_DISTANCE: Cosine Distance. Defined as 1 - cosine similarity.
  * DOT_PRODUCT_DISTANCE: Dot Product Distance. Defined as the negative of the dot product.
* `feature_norm_type` -
(Optional)
Type of normalization to be carried out on each vector. The value must be one of the following:
  * UNIT_L2_NORM: Unit L2 normalization type
  * NONE: No normalization type is specified.
* `algorithm_config` -
(Optional)
The configuration with regard to the algorithms used for efficient search.
Structure is [documented below](#nested_algorithm_config).
<a name="nested_algorithm_config"></a>The `algorithm_config` block supports:
* `tree_ah_config` -
(Optional)
Configuration options for using the tree-AH algorithm (Shallow tree + Asymmetric Hashing).
Please refer to this paper for more details: https://arxiv.org/abs/1908.10396
Structure is [documented below](#nested_tree_ah_config).
* `brute_force_config` -
(Optional)
Configuration options for using brute force search, which simply implements the
standard linear search in the database for each query.
<a name="nested_tree_ah_config"></a>The `tree_ah_config` block supports:
* `leaf_node_embedding_count` -
(Optional)
Number of embeddings on each leaf node. The default value is 1000 if not set.
* `leaf_nodes_to_search_percent` -
(Optional)
The default percentage of leaf nodes that may be searched for any query. Must be in
range 1-100, inclusive. The default value is 10 (means 10%) if not set.
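
The 1-100 range on `leaf_nodes_to_search_percent` can be checked at plan time if the value is passed in through an input variable. The following is only a sketch: the variable name, the resource name `tuned`, and the bucket `my-embeddings-bucket` are hypothetical, and the `validation` block is a general Terraform feature rather than anything specific to this resource.

```hcl
# Hypothetical input variable; the bounds follow the documented 1-100 range.
variable "leaf_nodes_to_search_percent" {
  type        = number
  default     = 10
  description = "Percentage of leaf nodes searched per query."

  validation {
    condition     = var.leaf_nodes_to_search_percent >= 1 && var.leaf_nodes_to_search_percent <= 100
    error_message = "The value must be between 1 and 100, inclusive."
  }
}

resource "google_vertex_ai_index" "tuned" {
  region       = "us-central1"
  display_name = "tuned-index"

  metadata {
    # Placeholder bucket path, assumed to contain datapoint files.
    contents_delta_uri = "gs://my-embeddings-bucket/contents"

    config {
      dimensions = 2
      # Required when the tree-AH algorithm is used.
      approximate_neighbors_count = 150
      algorithm_config {
        tree_ah_config {
          leaf_node_embedding_count    = 1000
          leaf_nodes_to_search_percent = var.leaf_nodes_to_search_percent
        }
      }
    }
  }
}
```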
## Attributes Reference
In addition to the arguments listed above, the following computed attributes are exported:
* `id` - an identifier for the resource with format `projects/{{project}}/locations/{{region}}/indexes/{{name}}`
* `name` -
The resource name of the Index.
* `metadata_schema_uri` -
Points to a YAML file stored on Google Cloud Storage describing additional information that is specific to this Index. Unset if the Index does not have any additional information.
* `deployed_indexes` -
The pointers to DeployedIndexes created from this Index. An Index can only be deleted after all of its DeployedIndexes have been undeployed.
Structure is [documented below](#nested_deployed_indexes).
* `etag` -
Used to perform consistent read-modify-write updates.
* `create_time` -
The timestamp of when the Index was created in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits.
* `update_time` -
The timestamp of when the Index was last updated in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits.
* `index_stats` -
Stats of the index resource.
Structure is [documented below](#nested_index_stats).
* `terraform_labels` -
The combination of labels configured directly on the resource
and default labels configured on the provider.
* `effective_labels` -
All labels (key/value pairs) present on the resource in GCP, including the labels configured through Terraform, other clients and services.
<a name="nested_deployed_indexes"></a>The `deployed_indexes` block contains:
* `index_endpoint` -
(Output)
A resource name of the IndexEndpoint.
* `deployed_index_id` -
(Output)
The ID of the DeployedIndex in the above IndexEndpoint.
<a name="nested_index_stats"></a>The `index_stats` block contains:
* `vectors_count` -
(Output)
The number of vectors in the Index.
* `shards_count` -
(Output)
The number of shards in the Index.
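
The computed attributes above can be surfaced as outputs. This sketch assumes a resource named `google_vertex_ai_index.index`, as in the Example Usage sections; the output names are arbitrary. Splat expressions are used for the nested blocks so the expressions stay valid whether or not any entries are present.

```hcl
# Assumes the resource address google_vertex_ai_index.index from the
# Example Usage sections; output names are placeholders.
output "index_id" {
  value = google_vertex_ai_index.index.id
}

# Vector counts reported in the index_stats block.
output "index_vectors_count" {
  value = google_vertex_ai_index.index.index_stats[*].vectors_count
}

# IndexEndpoint resource names this Index is currently deployed to.
output "index_endpoints" {
  value = google_vertex_ai_index.index.deployed_indexes[*].index_endpoint
}

# All labels on the resource, including those set outside this configuration.
output "index_effective_labels" {
  value = google_vertex_ai_index.index.effective_labels
}
```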
## Timeouts
This resource provides the following
[Timeouts](https://developer.hashicorp.com/terraform/plugin/sdkv2/resources/retries-and-customizable-timeouts) configuration options:
- `create` - Default is 180 minutes.
- `update` - Default is 180 minutes.
- `delete` - Default is 180 minutes.
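
For illustration, the defaults can be overridden with a `timeouts` block inside the resource. The durations, the resource name `with_timeouts`, and the bucket `my-embeddings-bucket` below are arbitrary placeholders rather than recommended values.

```hcl
resource "google_vertex_ai_index" "with_timeouts" {
  region       = "us-central1"
  display_name = "index-with-timeouts"

  metadata {
    # Placeholder bucket path, assumed to contain datapoint files.
    contents_delta_uri = "gs://my-embeddings-bucket/contents"

    config {
      dimensions = 2
      algorithm_config {
        brute_force_config {}
      }
    }
  }

  # Override the 180-minute defaults; values here are examples only.
  timeouts {
    create = "60m"
    update = "60m"
    delete = "30m"
  }
}
```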
## Import
Index can be imported using any of these accepted formats:
* `projects/{{project}}/locations/{{region}}/indexes/{{name}}`
* `{{project}}/{{region}}/{{name}}`
* `{{region}}/{{name}}`
* `{{name}}`
In Terraform v1.5.0 and later, use an [`import` block](https://developer.hashicorp.com/terraform/language/import) to import Index using one of the formats above. For example:
```tf
import {
  id = "projects/{{project}}/locations/{{region}}/indexes/{{name}}"
  to = google_vertex_ai_index.default
}
```
When using the [`terraform import` command](https://developer.hashicorp.com/terraform/cli/commands/import), Index can be imported using one of the formats above. For example:
```
$ terraform import google_vertex_ai_index.default projects/{{project}}/locations/{{region}}/indexes/{{name}}
$ terraform import google_vertex_ai_index.default {{project}}/{{region}}/{{name}}
$ terraform import google_vertex_ai_index.default {{region}}/{{name}}
$ terraform import google_vertex_ai_index.default {{name}}
```
## User Project Overrides
This resource supports [User Project Overrides](https://registry.terraform.io/providers/hashicorp/google/latest/docs/guides/provider_reference#user_project_override).
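
A minimal provider configuration that turns the override on might look like the following. The project IDs are placeholders, and setting `billing_project` is an assumption about the setup: use it only when quota and billing should be charged to a project other than the resource's own.

```hcl
provider "google" {
  # Placeholder project IDs; replace with real values.
  project = "my-project-id"
  region  = "us-central1"

  # Send the resource project (or billing_project, if set) as the quota project.
  user_project_override = true
  billing_project       = "my-billing-project-id"
}
```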