| --- |
| # ---------------------------------------------------------------------------- |
| # |
| # *** AUTO GENERATED CODE *** Type: MMv1 *** |
| # |
| # ---------------------------------------------------------------------------- |
| # |
| # This file is automatically generated by Magic Modules and manual |
| # changes will be clobbered when the file is regenerated. |
| # |
| # Please read more about how to change this file in |
| # .github/CONTRIBUTING.md. |
| # |
| # ---------------------------------------------------------------------------- |
| subcategory: "Discovery Engine" |
| description: |- |
| A data store is a collection of websites and documents used to find answers for |
| end users' questions in Discovery Engine (a.k.a. Vertex AI Search and Conversation). |
| --- |
| |
| # google_discovery_engine_data_store |
| |
| A data store is a collection of websites and documents used to find answers for |
| end users' questions in Discovery Engine (a.k.a. Vertex AI Search and |
| Conversation). |
| |
| |
| To get more information about DataStore, see: |
| |
| * [API documentation](https://cloud.google.com/generative-ai-app-builder/docs/reference/rest/v1/projects.locations.collections.dataStores) |
| * How-to Guides |
| * [Create a search data store](https://cloud.google.com/generative-ai-app-builder/docs/create-data-store-es) |
| |
| <div class = "oics-button" style="float: right; margin: 0 0 -15px"> |
| <a href="https://console.cloud.google.com/cloudshell/open?cloudshell_git_repo=https%3A%2F%2Fgithub.com%2Fterraform-google-modules%2Fdocs-examples.git&cloudshell_image=gcr.io%2Fcloudshell-images%2Fcloudshell%3Alatest&cloudshell_print=.%2Fmotd&cloudshell_tutorial=.%2Ftutorial.md&cloudshell_working_dir=discoveryengine_datastore_basic&open_in_editor=main.tf" target="_blank"> |
| <img alt="Open in Cloud Shell" src="//gstatic.com/cloudssh/images/open-btn.svg" style="max-height: 44px; margin: 32px auto; max-width: 100%;"> |
| </a> |
| </div> |
| ## Example Usage - Discoveryengine Datastore Basic |
| |
| |
| ```hcl |
| resource "google_discovery_engine_data_store" "basic" { |
| location = "global" |
| data_store_id = "data-store-id" |
| display_name = "tf-test-structured-datastore" |
| industry_vertical = "GENERIC" |
| content_config = "NO_CONTENT" |
| solution_types = ["SOLUTION_TYPE_SEARCH"] |
| create_advanced_site_search = false |
| skip_default_schema_creation = false |
| } |
| ``` |
| <div class = "oics-button" style="float: right; margin: 0 0 -15px"> |
| <a href="https://console.cloud.google.com/cloudshell/open?cloudshell_git_repo=https%3A%2F%2Fgithub.com%2Fterraform-google-modules%2Fdocs-examples.git&cloudshell_image=gcr.io%2Fcloudshell-images%2Fcloudshell%3Alatest&cloudshell_print=.%2Fmotd&cloudshell_tutorial=.%2Ftutorial.md&cloudshell_working_dir=discoveryengine_datastore_document_processing_config&open_in_editor=main.tf" target="_blank"> |
| <img alt="Open in Cloud Shell" src="//gstatic.com/cloudssh/images/open-btn.svg" style="max-height: 44px; margin: 32px auto; max-width: 100%;"> |
| </a> |
| </div> |
| ## Example Usage - Discoveryengine Datastore Document Processing Config |
| |
| |
| ```hcl |
| resource "google_discovery_engine_data_store" "document_processing_config" { |
| location = "global" |
| data_store_id = "data-store-id" |
| display_name = "tf-test-structured-datastore" |
| industry_vertical = "GENERIC" |
| content_config = "NO_CONTENT" |
| solution_types = ["SOLUTION_TYPE_SEARCH"] |
| create_advanced_site_search = false |
| document_processing_config { |
| default_parsing_config { |
| digital_parsing_config {} |
| } |
| parsing_config_overrides { |
| file_type = "pdf" |
| ocr_parsing_config { |
| use_native_text = true |
| } |
| } |
| } |
| } |
| ``` |
| |
| ## Argument Reference |
| |
| The following arguments are supported: |
| |
| |
| * `display_name` - |
| (Required) |
| The display name of the data store. This field must be a UTF-8 encoded |
| string with a length limit of 128 characters. |
| |
| * `industry_vertical` - |
| (Required) |
| The industry vertical that the data store registers. |
| Possible values are: `GENERIC`, `MEDIA`, `HEALTHCARE_FHIR`. |
| |
| * `content_config` - |
| (Required) |
| The content config of the data store. |
| Possible values are: `NO_CONTENT`, `CONTENT_REQUIRED`, `PUBLIC_WEBSITE`. |
| |
| * `location` - |
| (Required) |
| The geographic location where the data store should reside. The value must |
| be one of `global`, `us`, or `eu`. |
| |
| * `data_store_id` - |
| (Required) |
| The unique id of the data store. |
| |
| |
| - - - |
| |
| |
| * `solution_types` - |
| (Optional) |
| The solutions that the data store enrolls. |
| Each value may be one of: `SOLUTION_TYPE_RECOMMENDATION`, `SOLUTION_TYPE_SEARCH`, `SOLUTION_TYPE_CHAT`, `SOLUTION_TYPE_GENERATIVE_CHAT`. |
| |
| * `document_processing_config` - |
| (Optional) |
| Configuration for Document understanding and enrichment. |
| Structure is [documented below](#nested_document_processing_config). |
| |
| * `create_advanced_site_search` - |
| (Optional) |
| If true, an advanced data store for site search will be created. If the |
| data store is not configured for site search (`GENERIC` vertical and |
| `PUBLIC_WEBSITE` content config), this flag is ignored. |
| |
| * `skip_default_schema_creation` - |
| (Optional) |
| A boolean flag indicating whether to skip the default schema creation for |
| the data store. Only enable this flag if you are certain that the default |
| schema is incompatible with your use case. |
| If set to true, you must manually create a schema for the data store |
| before any documents can be ingested. |
| This flag cannot be specified if `data_store.starting_schema` is |
| specified. |
| |
| * `project` - (Optional) The ID of the project to which the resource belongs. |
| If it is not provided, the provider project is used. |
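| |
| As a sketch of how the arguments above combine, the following configuration declares a website data store with advanced site search enabled. The resource label `site_search`, the `data_store_id`, and the `display_name` are placeholder values chosen for illustration. Per the argument descriptions, `create_advanced_site_search` only takes effect with the `GENERIC` vertical and the `PUBLIC_WEBSITE` content config. |
| |
| ```hcl |
| resource "google_discovery_engine_data_store" "site_search" { |
|   location                    = "global" |
|   data_store_id               = "example-site-search-store" # placeholder ID |
|   display_name                = "example-site-search"       # placeholder name |
|   industry_vertical           = "GENERIC" |
|   content_config              = "PUBLIC_WEBSITE" |
|   solution_types              = ["SOLUTION_TYPE_SEARCH"] |
|   create_advanced_site_search = true |
| } |
| ``` |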
| |
| |
| <a name="nested_document_processing_config"></a>The `document_processing_config` block supports: |
| |
| * `name` - |
| (Output) |
| The full resource name of the Document Processing Config. Format: |
| `projects/{project}/locations/{location}/collections/{collection_id}/dataStores/{data_store_id}/documentProcessingConfig`. |
| |
| * `chunking_config` - |
| (Optional) |
| Whether chunking mode is enabled. |
| Structure is [documented below](#nested_chunking_config). |
| |
| * `default_parsing_config` - |
| (Optional) |
| Configuration for the default document parser. If not specified, this resource |
| is configured to use a default DigitalParsingConfig, and the default parsing |
| config is applied to all file types for document parsing. |
| Structure is [documented below](#nested_default_parsing_config). |
| |
| * `parsing_config_overrides` - |
| (Optional) |
| Map from file type to override the default parsing configuration based on the file type. Supported keys: |
| * `pdf`: Override parsing config for PDF files. Digital parsing, OCR parsing, or layout parsing is supported. |
| * `html`: Override parsing config for HTML files. Only digital parsing and layout parsing are supported. |
| * `docx`: Override parsing config for DOCX files. Only digital parsing and layout parsing are supported. |
| Structure is [documented below](#nested_parsing_config_overrides). |
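| |
| Because `parsing_config_overrides` is keyed by file type, the block can be repeated once per file type. The sketch below keeps digital parsing as the default and overrides it for PDF and DOCX files. The resource label, ID, and display name are placeholders, and the `CONTENT_REQUIRED` content config is an assumption made for the example rather than a documented requirement. |
| |
| ```hcl |
| resource "google_discovery_engine_data_store" "parsing_overrides" { |
|   location          = "global" |
|   data_store_id     = "example-parsing-overrides" # placeholder ID |
|   display_name      = "example-parsing-overrides" # placeholder name |
|   industry_vertical = "GENERIC" |
|   content_config    = "CONTENT_REQUIRED" # assumed for this example |
|   solution_types    = ["SOLUTION_TYPE_SEARCH"] |
| |
|   document_processing_config { |
|     # Applied to every file type without an override below. |
|     default_parsing_config { |
|       digital_parsing_config {} |
|     } |
| |
|     # PDFs: OCR parsing, preferring native text where a page has it. |
|     parsing_config_overrides { |
|       file_type = "pdf" |
|       ocr_parsing_config { |
|         use_native_text = true |
|       } |
|     } |
| |
|     # DOCX files: layout parsing. |
|     parsing_config_overrides { |
|       file_type = "docx" |
|       layout_parsing_config {} |
|     } |
|   } |
| } |
| ``` |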
| |
| |
| <a name="nested_chunking_config"></a>The `chunking_config` block supports: |
| |
| * `layout_based_chunking_config` - |
| (Optional) |
| Configuration for the layout based chunking. |
| Structure is [documented below](#nested_layout_based_chunking_config). |
| |
| |
| <a name="nested_layout_based_chunking_config"></a>The `layout_based_chunking_config` block supports: |
| |
| * `chunk_size` - |
| (Optional) |
| The token size limit for each chunk. |
| Supported values: 100-500 (inclusive). Default value: 500. |
| |
| * `include_ancestor_headings` - |
| (Optional) |
| Whether to append the different levels of ancestor headings to chunks taken from the middle of the document, to prevent context loss. |
| Default value: `false`. |
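| |
| A minimal sketch of a chunking configuration follows. The `chunk_size` of 300 is an arbitrary value inside the documented 100-500 range, the resource label and IDs are placeholders, and pairing chunking with the layout parser and `CONTENT_REQUIRED` is an assumption made for the example. |
| |
| ```hcl |
| resource "google_discovery_engine_data_store" "chunked" { |
|   location          = "global" |
|   data_store_id     = "example-chunked-store" # placeholder ID |
|   display_name      = "example-chunked-store" # placeholder name |
|   industry_vertical = "GENERIC" |
|   content_config    = "CONTENT_REQUIRED" # assumed for this example |
|   solution_types    = ["SOLUTION_TYPE_SEARCH"] |
| |
|   document_processing_config { |
|     chunking_config { |
|       layout_based_chunking_config { |
|         chunk_size                = 300  # tokens per chunk, 100-500 inclusive |
|         include_ancestor_headings = true # carry parent headings into each chunk |
|       } |
|     } |
| |
|     # Assumption: layout-based chunking is paired with the layout parser. |
|     default_parsing_config { |
|       layout_parsing_config {} |
|     } |
|   } |
| } |
| ``` |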
| |
| <a name="nested_default_parsing_config"></a>The `default_parsing_config` block supports: |
| |
| * `digital_parsing_config` - |
| (Optional) |
| Configuration applied to the digital parser. |
| |
| * `ocr_parsing_config` - |
| (Optional) |
| Configuration applied to the OCR parser. Currently it only applies to PDFs. |
| Structure is [documented below](#nested_ocr_parsing_config). |
| |
| * `layout_parsing_config` - |
| (Optional) |
| Configuration applied to the layout parser. |
| |
| |
| <a name="nested_ocr_parsing_config"></a>The `ocr_parsing_config` block supports: |
| |
| * `use_native_text` - |
| (Optional) |
| If true, native text is used instead of OCR text on pages that contain native text. |
| |
| <a name="nested_parsing_config_overrides"></a>The `parsing_config_overrides` block supports: |
| |
| * `file_type` - (Required) The file type this override applies to. Must be one of the supported keys listed above (`pdf`, `html`, `docx`). |
| |
| * `digital_parsing_config` - |
| (Optional) |
| Configuration applied to the digital parser. |
| |
| * `ocr_parsing_config` - |
| (Optional) |
| Configuration applied to the OCR parser. Currently it only applies to PDFs. |
| Structure is [documented below](#nested_ocr_parsing_config). |
| |
| * `layout_parsing_config` - |
| (Optional) |
| Configuration applied to the layout parser. |
| |
| |
| ## Attributes Reference |
| |
| In addition to the arguments listed above, the following computed attributes are exported: |
| |
| * `id` - an identifier for the resource with format `projects/{{project}}/locations/{{location}}/collections/default_collection/dataStores/{{data_store_id}}` |
| |
| * `name` - |
| The unique full resource name of the data store. Values are of the format |
| `projects/{project}/locations/{location}/collections/{collection_id}/dataStores/{data_store_id}`. |
| This field must be a UTF-8 encoded string with a length limit of 1024 |
| characters. |
| |
| * `default_schema_id` - |
| The id of the default Schema associated with this data store. |
| |
| * `create_time` - |
| Timestamp when the DataStore was created. |
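| |
| The computed attributes can be referenced like any other resource attribute. The sketch below assumes the `basic` resource from the first example on this page; the output names are placeholders. |
| |
| ```hcl |
| # Full resource name of the data store. |
| output "data_store_name" { |
|   value = google_discovery_engine_data_store.basic.name |
| } |
| |
| # ID of the default schema associated with the data store. |
| output "data_store_default_schema_id" { |
|   value = google_discovery_engine_data_store.basic.default_schema_id |
| } |
| ``` |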
| |
| |
| ## Timeouts |
| |
| This resource provides the following |
| [Timeouts](https://developer.hashicorp.com/terraform/plugin/sdkv2/resources/retries-and-customizable-timeouts) configuration options: |
| |
| - `create` - Default is 20 minutes. |
| - `update` - Default is 20 minutes. |
| - `delete` - Default is 20 minutes. |
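| |
| To override these defaults, add a `timeouts` block to the resource. In the sketch below the 30-minute values are arbitrary, and the resource label, ID, and display name are placeholders. |
| |
| ```hcl |
| resource "google_discovery_engine_data_store" "with_timeouts" { |
|   location          = "global" |
|   data_store_id     = "example-timeouts-store" # placeholder ID |
|   display_name      = "example-timeouts-store" # placeholder name |
|   industry_vertical = "GENERIC" |
|   content_config    = "NO_CONTENT" |
| |
|   timeouts { |
|     create = "30m" |
|     update = "30m" |
|     delete = "30m" |
|   } |
| } |
| ``` |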
| |
| ## Import |
| |
| |
| DataStore can be imported using any of these accepted formats: |
| |
| * `projects/{{project}}/locations/{{location}}/collections/default_collection/dataStores/{{data_store_id}}` |
| * `{{project}}/{{location}}/{{data_store_id}}` |
| * `{{location}}/{{data_store_id}}` |
| |
| |
| In Terraform v1.5.0 and later, use an [`import` block](https://developer.hashicorp.com/terraform/language/import) to import DataStore using one of the formats above. For example: |
| |
| ```tf |
| import { |
| id = "projects/{{project}}/locations/{{location}}/collections/default_collection/dataStores/{{data_store_id}}" |
| to = google_discovery_engine_data_store.default |
| } |
| ``` |
| |
| When using the [`terraform import` command](https://developer.hashicorp.com/terraform/cli/commands/import), DataStore can be imported using one of the formats above. For example: |
| |
| ``` |
| $ terraform import google_discovery_engine_data_store.default projects/{{project}}/locations/{{location}}/collections/default_collection/dataStores/{{data_store_id}} |
| $ terraform import google_discovery_engine_data_store.default {{project}}/{{location}}/{{data_store_id}} |
| $ terraform import google_discovery_engine_data_store.default {{location}}/{{data_store_id}} |
| ``` |
| |
| ## User Project Overrides |
| |
| This resource supports [User Project Overrides](https://registry.terraform.io/providers/hashicorp/google/latest/docs/guides/provider_reference#user_project_override). |
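| |
| As a sketch, user project overrides are enabled on the provider rather than on the resource. The project IDs below are placeholders. |
| |
| ```hcl |
| provider "google" { |
|   project               = "my-project-id"      # placeholder |
|   user_project_override = true |
|   billing_project       = "my-billing-project" # placeholder: project used for quota and billing |
| } |
| ``` |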