---
# ----------------------------------------------------------------------------
#
# *** AUTO GENERATED CODE *** Type: MMv1 ***
#
# ----------------------------------------------------------------------------
#
# This file is automatically generated by Magic Modules and manual
# changes will be clobbered when the file is regenerated.
#
# Please read more about how to change this file in
# .github/CONTRIBUTING.md.
#
# ----------------------------------------------------------------------------
subcategory: "BigQuery"
description: |-
Jobs are actions that BigQuery runs on your behalf to load data, export data, query data, or copy data.
---
# google\_bigquery\_job
Jobs are actions that BigQuery runs on your behalf to load data, export data, query data, or copy data.
Once a BigQuery job is created, it cannot be changed or deleted.
To get more information about Job, see:
* [API documentation](https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs)
* How-to Guides
* [BigQuery Jobs Intro](https://cloud.google.com/bigquery/docs/jobs-overview)
<div class = "oics-button" style="float: right; margin: 0 0 -15px">
<a href="https://console.cloud.google.com/cloudshell/open?cloudshell_git_repo=https%3A%2F%2Fgithub.com%2Fterraform-google-modules%2Fdocs-examples.git&cloudshell_working_dir=bigquery_job_query&cloudshell_image=gcr.io%2Fcloudshell-images%2Fcloudshell%3Alatest&open_in_editor=main.tf&cloudshell_print=.%2Fmotd&cloudshell_tutorial=.%2Ftutorial.md" target="_blank">
<img alt="Open in Cloud Shell" src="//gstatic.com/cloudssh/images/open-btn.svg" style="max-height: 44px; margin: 32px auto; max-width: 100%;">
</a>
</div>
## Example Usage - Bigquery Job Query
```hcl
resource "google_bigquery_table" "foo" {
deletion_protection = false
dataset_id = google_bigquery_dataset.bar.dataset_id
table_id = "job_query_table"
}
resource "google_bigquery_dataset" "bar" {
dataset_id = "job_query_dataset"
friendly_name = "test"
description = "This is a test description"
location = "US"
}
resource "google_bigquery_job" "job" {
job_id = "job_query"
labels = {
"example-label" ="example-value"
}
query {
query = "SELECT state FROM [lookerdata:cdc.project_tycho_reports]"
destination_table {
project_id = google_bigquery_table.foo.project
dataset_id = google_bigquery_table.foo.dataset_id
table_id = google_bigquery_table.foo.table_id
}
allow_large_results = true
flatten_results = true
script_options {
key_result_statement = "LAST"
}
}
}
```
<div class = "oics-button" style="float: right; margin: 0 0 -15px">
<a href="https://console.cloud.google.com/cloudshell/open?cloudshell_git_repo=https%3A%2F%2Fgithub.com%2Fterraform-google-modules%2Fdocs-examples.git&cloudshell_working_dir=bigquery_job_query_table_reference&cloudshell_image=gcr.io%2Fcloudshell-images%2Fcloudshell%3Alatest&open_in_editor=main.tf&cloudshell_print=.%2Fmotd&cloudshell_tutorial=.%2Ftutorial.md" target="_blank">
<img alt="Open in Cloud Shell" src="//gstatic.com/cloudssh/images/open-btn.svg" style="max-height: 44px; margin: 32px auto; max-width: 100%;">
</a>
</div>
## Example Usage - Bigquery Job Query Table Reference
```hcl
resource "google_bigquery_table" "foo" {
deletion_protection = false
dataset_id = google_bigquery_dataset.bar.dataset_id
table_id = "job_query_table"
}
resource "google_bigquery_dataset" "bar" {
dataset_id = "job_query_dataset"
friendly_name = "test"
description = "This is a test description"
location = "US"
}
resource "google_bigquery_job" "job" {
job_id = "job_query"
labels = {
"example-label" ="example-value"
}
query {
query = "SELECT state FROM [lookerdata:cdc.project_tycho_reports]"
destination_table {
table_id = google_bigquery_table.foo.id
}
default_dataset {
dataset_id = google_bigquery_dataset.bar.id
}
allow_large_results = true
flatten_results = true
script_options {
key_result_statement = "LAST"
}
}
}
```
<div class = "oics-button" style="float: right; margin: 0 0 -15px">
<a href="https://console.cloud.google.com/cloudshell/open?cloudshell_git_repo=https%3A%2F%2Fgithub.com%2Fterraform-google-modules%2Fdocs-examples.git&cloudshell_working_dir=bigquery_job_load&cloudshell_image=gcr.io%2Fcloudshell-images%2Fcloudshell%3Alatest&open_in_editor=main.tf&cloudshell_print=.%2Fmotd&cloudshell_tutorial=.%2Ftutorial.md" target="_blank">
<img alt="Open in Cloud Shell" src="//gstatic.com/cloudssh/images/open-btn.svg" style="max-height: 44px; margin: 32px auto; max-width: 100%;">
</a>
</div>
## Example Usage - Bigquery Job Load
```hcl
resource "google_bigquery_table" "foo" {
deletion_protection = false
dataset_id = google_bigquery_dataset.bar.dataset_id
table_id = "job_load_table"
}
resource "google_bigquery_dataset" "bar" {
dataset_id = "job_load_dataset"
friendly_name = "test"
description = "This is a test description"
location = "US"
}
resource "google_bigquery_job" "job" {
job_id = "job_load"
labels = {
"my_job" ="load"
}
load {
source_uris = [
"gs://cloud-samples-data/bigquery/us-states/us-states-by-date.csv",
]
destination_table {
project_id = google_bigquery_table.foo.project
dataset_id = google_bigquery_table.foo.dataset_id
table_id = google_bigquery_table.foo.table_id
}
skip_leading_rows = 1
schema_update_options = ["ALLOW_FIELD_RELAXATION", "ALLOW_FIELD_ADDITION"]
write_disposition = "WRITE_APPEND"
autodetect = true
}
}
```
## Example Usage - Bigquery Job Load Geojson
```hcl
locals {
  project = "my-project-name" # Google Cloud Platform Project ID
}

resource "google_storage_bucket" "bucket" {
  name                        = "${local.project}-bq-geojson" # Every bucket name must be globally unique
  location                    = "US"
  uniform_bucket_level_access = true
}

resource "google_storage_bucket_object" "object" {
  name    = "geojson-data.jsonl"
  bucket  = google_storage_bucket.bucket.name
  content = <<EOF
{"type":"Feature","properties":{"continent":"Europe","region":"Scandinavia"},"geometry":{"type":"Polygon","coordinates":[[[-30.94,53.33],[33.05,53.33],[33.05,71.86],[-30.94,71.86],[-30.94,53.33]]]}}
{"type":"Feature","properties":{"continent":"Africa","region":"West Africa"},"geometry":{"type":"Polygon","coordinates":[[[-23.91,0],[11.95,0],[11.95,18.98],[-23.91,18.98],[-23.91,0]]]}}
EOF
}

resource "google_bigquery_table" "foo" {
  deletion_protection = false
  dataset_id          = google_bigquery_dataset.bar.dataset_id
  table_id            = "job_load_table"
}

resource "google_bigquery_dataset" "bar" {
  dataset_id    = "job_load_dataset"
  friendly_name = "test"
  description   = "This is a test description"
  location      = "US"
}

resource "google_bigquery_job" "job" {
  job_id = "job_load"

  labels = {
    "my_job" = "load"
  }

  load {
    source_uris = [
      "gs://${google_storage_bucket_object.object.bucket}/${google_storage_bucket_object.object.name}"
    ]

    destination_table {
      project_id = google_bigquery_table.foo.project
      dataset_id = google_bigquery_table.foo.dataset_id
      table_id   = google_bigquery_table.foo.table_id
    }

    write_disposition = "WRITE_TRUNCATE"
    autodetect        = true
    source_format     = "NEWLINE_DELIMITED_JSON"
    json_extension    = "GEOJSON"
  }

  depends_on = [google_storage_bucket_object.object]
}
```
<div class = "oics-button" style="float: right; margin: 0 0 -15px">
<a href="https://console.cloud.google.com/cloudshell/open?cloudshell_git_repo=https%3A%2F%2Fgithub.com%2Fterraform-google-modules%2Fdocs-examples.git&cloudshell_working_dir=bigquery_job_load_parquet&cloudshell_image=gcr.io%2Fcloudshell-images%2Fcloudshell%3Alatest&open_in_editor=main.tf&cloudshell_print=.%2Fmotd&cloudshell_tutorial=.%2Ftutorial.md" target="_blank">
<img alt="Open in Cloud Shell" src="//gstatic.com/cloudssh/images/open-btn.svg" style="max-height: 44px; margin: 32px auto; max-width: 100%;">
</a>
</div>
## Example Usage - Bigquery Job Load Parquet
```hcl
resource "google_storage_bucket" "test" {
name = "job_load_bucket"
location = "US"
uniform_bucket_level_access = true
}
resource "google_storage_bucket_object" "test" {
name = "job_load_bucket_object"
source = "./test-fixtures/test.parquet.gzip"
bucket = google_storage_bucket.test.name
}
resource "google_bigquery_dataset" "test" {
dataset_id = "job_load_dataset"
friendly_name = "test"
description = "This is a test description"
location = "US"
}
resource "google_bigquery_table" "test" {
deletion_protection = false
table_id = "job_load_table"
dataset_id = google_bigquery_dataset.test.dataset_id
}
resource "google_bigquery_job" "job" {
job_id = "job_load"
labels = {
"my_job" ="load"
}
load {
source_uris = [
"gs://${google_storage_bucket_object.test.bucket}/${google_storage_bucket_object.test.name}"
]
destination_table {
project_id = google_bigquery_table.test.project
dataset_id = google_bigquery_table.test.dataset_id
table_id = google_bigquery_table.test.table_id
}
schema_update_options = ["ALLOW_FIELD_RELAXATION", "ALLOW_FIELD_ADDITION"]
write_disposition = "WRITE_APPEND"
source_format = "PARQUET"
autodetect = true
parquet_options {
enum_as_string = true
enable_list_inference = true
}
}
}
```
## Example Usage - Bigquery Job Copy
```hcl
resource "google_bigquery_table" "source" {
deletion_protection = false
count = length(google_bigquery_dataset.source)
dataset_id = google_bigquery_dataset.source[count.index].dataset_id
table_id = "job_copy_${count.index}_table"
schema = <<EOF
[
{
"name": "name",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "post_abbr",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "date",
"type": "DATE",
"mode": "NULLABLE"
}
]
EOF
}
resource "google_bigquery_dataset" "source" {
count = 2
dataset_id = "job_copy_${count.index}_dataset"
friendly_name = "test"
description = "This is a test description"
location = "US"
}
resource "google_bigquery_table" "dest" {
deletion_protection = false
dataset_id = google_bigquery_dataset.dest.dataset_id
table_id = "job_copy_dest_table"
schema = <<EOF
[
{
"name": "name",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "post_abbr",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "date",
"type": "DATE",
"mode": "NULLABLE"
}
]
EOF
encryption_configuration {
kms_key_name = google_kms_crypto_key.crypto_key.id
}
depends_on = ["google_project_iam_member.encrypt_role"]
}
resource "google_bigquery_dataset" "dest" {
dataset_id = "job_copy_dest_dataset"
friendly_name = "test"
description = "This is a test description"
location = "US"
}
resource "google_kms_crypto_key" "crypto_key" {
name = "example-key"
key_ring = google_kms_key_ring.key_ring.id
}
resource "google_kms_key_ring" "key_ring" {
name = "example-keyring"
location = "global"
}
data "google_project" "project" {
project_id = "my-project-name"
}
resource "google_project_iam_member" "encrypt_role" {
project = data.google_project.project.project_id
role = "roles/cloudkms.cryptoKeyEncrypterDecrypter"
member = "serviceAccount:bq-${data.google_project.project.number}@bigquery-encryption.iam.gserviceaccount.com"
}
resource "google_bigquery_job" "job" {
job_id = "job_copy"
copy {
source_tables {
project_id = google_bigquery_table.source.0.project
dataset_id = google_bigquery_table.source.0.dataset_id
table_id = google_bigquery_table.source.0.table_id
}
source_tables {
project_id = google_bigquery_table.source.1.project
dataset_id = google_bigquery_table.source.1.dataset_id
table_id = google_bigquery_table.source.1.table_id
}
destination_table {
project_id = google_bigquery_table.dest.project
dataset_id = google_bigquery_table.dest.dataset_id
table_id = google_bigquery_table.dest.table_id
}
destination_encryption_configuration {
kms_key_name = google_kms_crypto_key.crypto_key.id
}
}
depends_on = ["google_project_iam_member.encrypt_role"]
}
```
<div class = "oics-button" style="float: right; margin: 0 0 -15px">
<a href="https://console.cloud.google.com/cloudshell/open?cloudshell_git_repo=https%3A%2F%2Fgithub.com%2Fterraform-google-modules%2Fdocs-examples.git&cloudshell_working_dir=bigquery_job_extract&cloudshell_image=gcr.io%2Fcloudshell-images%2Fcloudshell%3Alatest&open_in_editor=main.tf&cloudshell_print=.%2Fmotd&cloudshell_tutorial=.%2Ftutorial.md" target="_blank">
<img alt="Open in Cloud Shell" src="//gstatic.com/cloudssh/images/open-btn.svg" style="max-height: 44px; margin: 32px auto; max-width: 100%;">
</a>
</div>
## Example Usage - Bigquery Job Extract
```hcl
resource "google_bigquery_table" "source-one" {
deletion_protection = false
dataset_id = google_bigquery_dataset.source-one.dataset_id
table_id = "job_extract_table"
schema = <<EOF
[
{
"name": "name",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "post_abbr",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "date",
"type": "DATE",
"mode": "NULLABLE"
}
]
EOF
}
resource "google_bigquery_dataset" "source-one" {
dataset_id = "job_extract_dataset"
friendly_name = "test"
description = "This is a test description"
location = "US"
}
resource "google_storage_bucket" "dest" {
name = "job_extract_bucket"
location = "US"
force_destroy = true
}
resource "google_bigquery_job" "job" {
job_id = "job_extract"
extract {
destination_uris = ["${google_storage_bucket.dest.url}/extract"]
source_table {
project_id = google_bigquery_table.source-one.project
dataset_id = google_bigquery_table.source-one.dataset_id
table_id = google_bigquery_table.source-one.table_id
}
destination_format = "NEWLINE_DELIMITED_JSON"
compression = "GZIP"
}
}
```
## Argument Reference
The following arguments are supported:
* `job_id` -
(Required)
The ID of the job. The ID must contain only letters (a-z, A-Z), numbers (0-9), underscores (_), or dashes (-). The maximum length is 1,024 characters.
<a name="nested_query"></a>The `query` block supports:
* `query` -
(Required)
SQL query text to execute. The useLegacySql field can be used to indicate whether the query uses legacy SQL or standard SQL.
*NOTE*: queries containing [DML language](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-manipulation-language)
(`DELETE`, `UPDATE`, `MERGE`, `INSERT`) must specify `create_disposition = ""` and `write_disposition = ""`; a sketch follows this argument list.
* `destination_table` -
(Optional)
Describes the table where the query results should be stored.
This property must be set for large results that exceed the maximum response size.
For queries that produce anonymous (cached) results, this field will be populated by BigQuery.
Structure is [documented below](#nested_destination_table).
* `user_defined_function_resources` -
(Optional)
Describes user-defined function resources used in the query.
Structure is [documented below](#nested_user_defined_function_resources).
* `create_disposition` -
(Optional)
Specifies whether the job is allowed to create new tables. The following values are supported:
CREATE_IF_NEEDED: If the table does not exist, BigQuery creates the table.
CREATE_NEVER: The table must already exist. If it does not, a 'notFound' error is returned in the job result.
Creation, truncation and append actions occur as one atomic update upon job completion
Default value is `CREATE_IF_NEEDED`.
Possible values are: `CREATE_IF_NEEDED`, `CREATE_NEVER`.
* `write_disposition` -
(Optional)
Specifies the action that occurs if the destination table already exists. The following values are supported:
WRITE_TRUNCATE: If the table already exists, BigQuery overwrites the table data and uses the schema from the query result.
WRITE_APPEND: If the table already exists, BigQuery appends the data to the table.
WRITE_EMPTY: If the table already exists and contains data, a 'duplicate' error is returned in the job result.
Each action is atomic and only occurs if BigQuery is able to complete the job successfully.
Creation, truncation and append actions occur as one atomic update upon job completion.
Default value is `WRITE_EMPTY`.
Possible values are: `WRITE_TRUNCATE`, `WRITE_APPEND`, `WRITE_EMPTY`.
* `default_dataset` -
(Optional)
Specifies the default dataset to use for unqualified table names in the query. Note that this does not alter the behavior of unqualified dataset names.
Structure is [documented below](#nested_default_dataset).
* `priority` -
(Optional)
Specifies a priority for the query.
Default value is `INTERACTIVE`.
Possible values are: `INTERACTIVE`, `BATCH`.
* `allow_large_results` -
(Optional)
If true and query uses legacy SQL dialect, allows the query to produce arbitrarily large result tables at a slight cost in performance.
Requires destinationTable to be set. For standard SQL queries, this flag is ignored and large results are always allowed.
However, you must still set destinationTable when result size exceeds the allowed maximum response size.
* `use_query_cache` -
(Optional)
Whether to look for the result in the query cache. The query cache is a best-effort cache that will be flushed whenever
tables in the query are modified. Moreover, the query cache is only available when a query does not have a destination table specified.
The default value is true.
* `flatten_results` -
(Optional)
If true and query uses legacy SQL dialect, flattens all nested and repeated fields in the query results.
allowLargeResults must be true if this is set to false. For standard SQL queries, this flag is ignored and results are never flattened.
* `maximum_billing_tier` -
(Optional)
Limits the billing tier for this job. Queries that have resource usage beyond this tier will fail (without incurring a charge).
If unspecified, this will be set to your project default.
* `maximum_bytes_billed` -
(Optional)
Limits the bytes billed for this job. Queries that will have bytes billed beyond this limit will fail (without incurring a charge).
If unspecified, this will be set to your project default.
* `use_legacy_sql` -
(Optional)
Specifies whether to use BigQuery's legacy SQL dialect for this query. The default value is true.
If set to false, the query will use BigQuery's standard SQL.
* `parameter_mode` -
(Optional)
Standard SQL only. Set to POSITIONAL to use positional (?) query parameters or to NAMED to use named (@myparam) query parameters in this query.
* `schema_update_options` -
(Optional)
Allows the schema of the destination table to be updated as a side effect of the query job.
Schema update options are supported in two cases: when writeDisposition is WRITE_APPEND;
when writeDisposition is WRITE_TRUNCATE and the destination table is a partition of a table,
specified by partition decorators. For normal tables, WRITE_TRUNCATE will always overwrite the schema.
One or more of the following values are specified:
ALLOW_FIELD_ADDITION: allow adding a nullable field to the schema.
ALLOW_FIELD_RELAXATION: allow relaxing a required field in the original schema to nullable.
* `destination_encryption_configuration` -
(Optional)
Custom encryption configuration (e.g., Cloud KMS keys)
Structure is [documented below](#nested_destination_encryption_configuration).
* `script_options` -
(Optional)
Options controlling the execution of scripts.
Structure is [documented below](#nested_script_options).
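As noted above, queries containing DML statements must set both dispositions to empty strings. A minimal sketch of such a job, assuming `my-project.my_dataset.my_table` already exists (the names are placeholders):
```hcl
resource "google_bigquery_job" "dml" {
  job_id = "job_dml_example"

  query {
    # Placeholder table; backtick-quoted names require standard SQL.
    query          = "DELETE FROM `my-project.my_dataset.my_table` WHERE state IS NULL"
    use_legacy_sql = false

    # DML statements must not set dispositions, so both are empty strings.
    create_disposition = ""
    write_disposition  = ""
  }
}
```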
<a name="nested_destination_table"></a>The `destination_table` block supports:
* `project_id` -
(Optional)
The ID of the project containing this table.
* `dataset_id` -
(Optional)
The ID of the dataset containing this table.
* `table_id` -
(Required)
The table. Can be specified as `{{table_id}}` if `project_id` and `dataset_id` are also set, or in the form `projects/{{project}}/datasets/{{dataset_id}}/tables/{{table_id}}` if not.
<a name="nested_user_defined_function_resources"></a>The `user_defined_function_resources` block supports:
* `resource_uri` -
(Optional)
A code resource to load from a Google Cloud Storage URI (gs://bucket/path).
* `inline_code` -
(Optional)
An inline resource that contains code for a user-defined function (UDF).
Providing an inline code resource is equivalent to providing a URI for a file containing the same code.
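A sketch of how a UDF resource might be attached to a legacy SQL query; the Cloud Storage path and the `myUdf` function are placeholders, not part of the examples above:
```hcl
resource "google_bigquery_job" "udf_query" {
  job_id = "job_query_with_udf"

  query {
    # Legacy SQL query that calls a UDF defined in the referenced file.
    query          = "SELECT name FROM myUdf([my_dataset.my_table])"
    use_legacy_sql = true

    user_defined_function_resources {
      # Placeholder Cloud Storage object containing the JavaScript UDF.
      resource_uri = "gs://my-bucket/udfs/my_udf.js"
    }
  }
}
```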
<a name="nested_default_dataset"></a>The `default_dataset` block supports:
* `dataset_id` -
(Required)
The dataset. Can be specified as `{{dataset_id}}` if `project_id` is also set, or in the form `projects/{{project}}/datasets/{{dataset_id}}` if not.
* `project_id` -
(Optional)
The ID of the project containing this table.
<a name="nested_destination_encryption_configuration"></a>The `destination_encryption_configuration` block supports:
* `kms_key_name` -
(Required)
Describes the Cloud KMS encryption key that will be used to protect the destination BigQuery table.
The BigQuery Service Account associated with your project requires access to this encryption key.
* `kms_key_version` -
(Output)
Describes the Cloud KMS encryption key version used to protect the destination BigQuery table.
<a name="nested_script_options"></a>The `script_options` block supports:
* `statement_timeout_ms` -
(Optional)
Timeout period for each statement in a script.
* `statement_byte_budget` -
(Optional)
Limit on the number of bytes billed per statement. Exceeding this budget results in an error.
* `key_result_statement` -
(Optional)
Determines which statement in the script represents the "key result",
used to populate the schema and query results of the script job.
Possible values are: `LAST`, `FIRST_SELECT`.
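A sketch combining the per-statement limits above; the values are arbitrary placeholders (`key_result_statement` is shown in the query Example Usage):
```hcl
resource "google_bigquery_job" "script" {
  job_id = "job_query_script_limits"

  query {
    query = "SELECT 1"

    script_options {
      # Placeholder limits: stop a statement after 5 minutes or 1 GiB billed.
      statement_timeout_ms  = "300000"
      statement_byte_budget = "1073741824"
    }
  }
}
```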
<a name="nested_load"></a>The `load` block supports:
* `source_uris` -
(Required)
The fully-qualified URIs that point to your data in Google Cloud.
For Google Cloud Storage URIs: Each URI can contain one '\*' wildcard character
and it must come after the 'bucket' name. Size limits related to load jobs apply
to external data sources. For Google Cloud Bigtable URIs: Exactly one URI can be
specified and it has to be a fully specified and valid HTTPS URL for a Google Cloud Bigtable table.
For Google Cloud Datastore backups: Exactly one URI can be specified. Also, the '\*' wildcard character is not allowed.
* `destination_table` -
(Required)
The destination table to load the data into.
Structure is [documented below](#nested_destination_table).
* `create_disposition` -
(Optional)
Specifies whether the job is allowed to create new tables. The following values are supported:
CREATE_IF_NEEDED: If the table does not exist, BigQuery creates the table.
CREATE_NEVER: The table must already exist. If it does not, a 'notFound' error is returned in the job result.
Creation, truncation and append actions occur as one atomic update upon job completion
Default value is `CREATE_IF_NEEDED`.
Possible values are: `CREATE_IF_NEEDED`, `CREATE_NEVER`.
* `write_disposition` -
(Optional)
Specifies the action that occurs if the destination table already exists. The following values are supported:
WRITE_TRUNCATE: If the table already exists, BigQuery overwrites the table data and uses the schema from the query result.
WRITE_APPEND: If the table already exists, BigQuery appends the data to the table.
WRITE_EMPTY: If the table already exists and contains data, a 'duplicate' error is returned in the job result.
Each action is atomic and only occurs if BigQuery is able to complete the job successfully.
Creation, truncation and append actions occur as one atomic update upon job completion.
Default value is `WRITE_EMPTY`.
Possible values are: `WRITE_TRUNCATE`, `WRITE_APPEND`, `WRITE_EMPTY`.
* `null_marker` -
(Optional)
Specifies a string that represents a null value in a CSV file. For example, if you specify "\N", BigQuery interprets "\N" as a null value
when loading a CSV file. The default value is the empty string. If you set this property to a custom value, BigQuery throws an error if an
empty string is present for all data types except for STRING and BYTE. For STRING and BYTE columns, BigQuery interprets the empty string as
an empty value.
* `field_delimiter` -
(Optional)
The separator for fields in a CSV file. The separator can be any ISO-8859-1 single-byte character.
To use a character in the range 128-255, you must encode the character as UTF8. BigQuery converts
the string to ISO-8859-1 encoding, and then uses the first byte of the encoded string to split the
data in its raw, binary state. BigQuery also supports the escape sequence "\t" to specify a tab separator.
The default value is a comma (',').
* `skip_leading_rows` -
(Optional)
The number of rows at the top of a CSV file that BigQuery will skip when loading the data.
The default value is 0. This property is useful if you have header rows in the file that should be skipped.
When autodetect is on, the behavior is the following:
skipLeadingRows unspecified - Autodetect tries to detect headers in the first row. If they are not detected,
the row is read as data. Otherwise data is read starting from the second row.
skipLeadingRows is 0 - Instructs autodetect that there are no headers and data should be read starting from the first row.
skipLeadingRows = N > 0 - Autodetect skips N-1 rows and tries to detect headers in row N. If headers are not detected,
row N is just skipped. Otherwise row N is used to extract column names for the detected schema.
* `encoding` -
(Optional)
The character encoding of the data. The supported values are UTF-8 or ISO-8859-1.
The default value is UTF-8. BigQuery decodes the data after the raw, binary data
has been split using the values of the quote and fieldDelimiter properties.
* `quote` -
(Optional)
The value that is used to quote data sections in a CSV file. BigQuery converts the string to ISO-8859-1 encoding,
and then uses the first byte of the encoded string to split the data in its raw, binary state.
The default value is a double-quote ('"'). If your data does not contain quoted sections, set the property value to an empty string.
If your data contains quoted newline characters, you must also set the allowQuotedNewlines property to true.
* `max_bad_records` -
(Optional)
The maximum number of bad records that BigQuery can ignore when running the job. If the number of bad records exceeds this value,
an invalid error is returned in the job result. The default value is 0, which requires that all records are valid.
* `allow_quoted_newlines` -
(Optional)
Indicates if BigQuery should allow quoted data sections that contain newline characters in a CSV file.
The default value is false.
* `source_format` -
(Optional)
The format of the data files. For CSV files, specify "CSV". For datastore backups, specify "DATASTORE_BACKUP".
For newline-delimited JSON, specify "NEWLINE_DELIMITED_JSON". For Avro, specify "AVRO". For parquet, specify "PARQUET".
For orc, specify "ORC". [Beta] For Bigtable, specify "BIGTABLE".
The default value is CSV.
* `json_extension` -
(Optional)
If sourceFormat is set to newline-delimited JSON, indicates whether it should be processed as a JSON variant such as GeoJSON.
For a sourceFormat other than JSON, omit this field. If the sourceFormat is newline-delimited JSON, set this to GEOJSON
for newline-delimited GeoJSON.
* `allow_jagged_rows` -
(Optional)
Accept rows that are missing trailing optional columns. The missing values are treated as nulls.
If false, records with missing trailing columns are treated as bad records, and if there are too many bad records,
an invalid error is returned in the job result. The default value is false. Only applicable to CSV, ignored for other formats.
* `ignore_unknown_values` -
(Optional)
Indicates if BigQuery should allow extra values that are not represented in the table schema.
If true, the extra values are ignored. If false, records with extra columns are treated as bad records,
and if there are too many bad records, an invalid error is returned in the job result.
The default value is false. The sourceFormat property determines what BigQuery treats as an extra value:
CSV: Trailing columns
JSON: Named values that don't match any column names
* `projection_fields` -
(Optional)
If sourceFormat is set to "DATASTORE_BACKUP", indicates which entity properties to load into BigQuery from a Cloud Datastore backup.
Property names are case sensitive and must be top-level properties. If no properties are specified, BigQuery loads all properties.
If any named property isn't found in the Cloud Datastore backup, an invalid error is returned in the job result.
* `autodetect` -
(Optional)
Indicates if we should automatically infer the options and schema for CSV and JSON sources.
* `schema_update_options` -
(Optional)
Allows the schema of the destination table to be updated as a side effect of the load job if a schema is autodetected or
supplied in the job configuration. Schema update options are supported in two cases: when writeDisposition is WRITE_APPEND;
when writeDisposition is WRITE_TRUNCATE and the destination table is a partition of a table, specified by partition decorators.
For normal tables, WRITE_TRUNCATE will always overwrite the schema. One or more of the following values are specified:
ALLOW_FIELD_ADDITION: allow adding a nullable field to the schema.
ALLOW_FIELD_RELAXATION: allow relaxing a required field in the original schema to nullable.
* `time_partitioning` -
(Optional)
Time-based partitioning specification for the destination table.
Structure is [documented below](#nested_time_partitioning).
* `destination_encryption_configuration` -
(Optional)
Custom encryption configuration (e.g., Cloud KMS keys)
Structure is [documented below](#nested_destination_encryption_configuration).
* `parquet_options` -
(Optional)
Parquet options for load jobs and external tables.
Structure is [documented below](#nested_parquet_options).
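The CSV-related options above (`null_marker`, `field_delimiter`, `quote`, `allow_quoted_newlines`, `allow_jagged_rows`, `max_bad_records`, `encoding`) can be combined as in this sketch, which reuses the `foo` table from the load example and assumes a placeholder bucket path:
```hcl
resource "google_bigquery_job" "csv_load" {
  job_id = "job_load_csv_options"

  load {
    # Placeholder object path.
    source_uris = ["gs://my-bucket/data/us-states.csv"]

    destination_table {
      project_id = google_bigquery_table.foo.project
      dataset_id = google_bigquery_table.foo.dataset_id
      table_id   = google_bigquery_table.foo.table_id
    }

    source_format         = "CSV"
    skip_leading_rows     = 1
    field_delimiter       = ","
    quote                 = "\""
    null_marker           = "\\N"
    encoding              = "UTF-8"
    allow_quoted_newlines = true
    allow_jagged_rows     = true
    max_bad_records       = 10
    write_disposition     = "WRITE_TRUNCATE"
    autodetect            = true
  }
}
```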
<a name="nested_destination_table"></a>The `destination_table` block supports:
* `project_id` -
(Optional)
The ID of the project containing this table.
* `dataset_id` -
(Optional)
The ID of the dataset containing this table.
* `table_id` -
(Required)
The table. Can be specified as `{{table_id}}` if `project_id` and `dataset_id` are also set, or in the form `projects/{{project}}/datasets/{{dataset_id}}/tables/{{table_id}}` if not.
<a name="nested_time_partitioning"></a>The `time_partitioning` block supports:
* `type` -
(Required)
The only type supported is DAY, which will generate one partition per day. Providing an empty string used to cause an error,
but in OnePlatform the field will be treated as unset.
* `expiration_ms` -
(Optional)
Number of milliseconds for which to keep the storage for a partition. A wrapper is used here because 0 is an invalid value.
* `field` -
(Optional)
If not set, the table is partitioned by pseudo column '_PARTITIONTIME'; if set, the table is partitioned by this field.
The field must be a top-level TIMESTAMP or DATE field. Its mode must be NULLABLE or REQUIRED.
A wrapper is used here because an empty string is an invalid value.
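None of the Example Usage blocks set `time_partitioning`; a sketch of a day-partitioned load, reusing the `foo` table from the load example (the `date` partitioning column is an assumption about the source data):
```hcl
resource "google_bigquery_job" "partitioned_load" {
  job_id = "job_load_partitioned"

  load {
    source_uris = [
      "gs://cloud-samples-data/bigquery/us-states/us-states-by-date.csv",
    ]

    destination_table {
      project_id = google_bigquery_table.foo.project
      dataset_id = google_bigquery_table.foo.dataset_id
      table_id   = google_bigquery_table.foo.table_id
    }

    skip_leading_rows = 1
    autodetect        = true
    write_disposition = "WRITE_APPEND"

    time_partitioning {
      type          = "DAY"
      field         = "date"     # assumed top-level DATE column in the source data
      expiration_ms = "86400000" # keep each partition for one day
    }
  }
}
```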
<a name="nested_destination_encryption_configuration"></a>The `destination_encryption_configuration` block supports:
* `kms_key_name` -
(Required)
Describes the Cloud KMS encryption key that will be used to protect the destination BigQuery table.
The BigQuery Service Account associated with your project requires access to this encryption key.
* `kms_key_version` -
(Output)
Describes the Cloud KMS encryption key version used to protect the destination BigQuery table.
<a name="nested_parquet_options"></a>The `parquet_options` block supports:
* `enum_as_string` -
(Optional)
If sourceFormat is set to PARQUET, indicates whether to infer Parquet ENUM logical type as STRING instead of BYTES by default.
* `enable_list_inference` -
(Optional)
If sourceFormat is set to PARQUET, indicates whether to use schema inference specifically for Parquet LIST logical type.
<a name="nested_copy"></a>The `copy` block supports:
* `source_tables` -
(Required)
Source tables to copy.
Structure is [documented below](#nested_source_tables).
* `destination_table` -
(Optional)
The destination table.
Structure is [documented below](#nested_destination_table).
* `create_disposition` -
(Optional)
Specifies whether the job is allowed to create new tables. The following values are supported:
CREATE_IF_NEEDED: If the table does not exist, BigQuery creates the table.
CREATE_NEVER: The table must already exist. If it does not, a 'notFound' error is returned in the job result.
Creation, truncation and append actions occur as one atomic update upon job completion
Default value is `CREATE_IF_NEEDED`.
Possible values are: `CREATE_IF_NEEDED`, `CREATE_NEVER`.
* `write_disposition` -
(Optional)
Specifies the action that occurs if the destination table already exists. The following values are supported:
WRITE_TRUNCATE: If the table already exists, BigQuery overwrites the table data and uses the schema from the query result.
WRITE_APPEND: If the table already exists, BigQuery appends the data to the table.
WRITE_EMPTY: If the table already exists and contains data, a 'duplicate' error is returned in the job result.
Each action is atomic and only occurs if BigQuery is able to complete the job successfully.
Creation, truncation and append actions occur as one atomic update upon job completion.
Default value is `WRITE_EMPTY`.
Possible values are: `WRITE_TRUNCATE`, `WRITE_APPEND`, `WRITE_EMPTY`.
* `destination_encryption_configuration` -
(Optional)
Custom encryption configuration (e.g., Cloud KMS keys)
Structure is [documented below](#nested_destination_encryption_configuration).
<a name="nested_source_tables"></a>The `source_tables` block supports:
* `project_id` -
(Optional)
The ID of the project containing this table.
* `dataset_id` -
(Optional)
The ID of the dataset containing this table.
* `table_id` -
(Required)
The table. Can be specified as `{{table_id}}` if `project_id` and `dataset_id` are also set, or in the form `projects/{{project}}/datasets/{{dataset_id}}/tables/{{table_id}}` if not.
<a name="nested_destination_table"></a>The `destination_table` block supports:
* `project_id` -
(Optional)
The ID of the project containing this table.
* `dataset_id` -
(Optional)
The ID of the dataset containing this table.
* `table_id` -
(Required)
The table. Can be specified as `{{table_id}}` if `project_id` and `dataset_id` are also set, or in the form `projects/{{project}}/datasets/{{dataset_id}}/tables/{{table_id}}` if not.
<a name="nested_destination_encryption_configuration"></a>The `destination_encryption_configuration` block supports:
* `kms_key_name` -
(Required)
Describes the Cloud KMS encryption key that will be used to protect the destination BigQuery table.
The BigQuery Service Account associated with your project requires access to this encryption key.
* `kms_key_version` -
(Output)
Describes the Cloud KMS encryption key version used to protect the destination BigQuery table.
<a name="nested_extract"></a>The `extract` block supports:
* `destination_uris` -
(Required)
A list of fully-qualified Google Cloud Storage URIs where the extracted table should be written.
* `print_header` -
(Optional)
Whether to print out a header row in the results. Default is true.
* `field_delimiter` -
(Optional)
When extracting data in CSV format, this defines the delimiter to use between fields in the exported data.
Default is ','
* `destination_format` -
(Optional)
The exported file format. Possible values include CSV, NEWLINE_DELIMITED_JSON and AVRO for tables and SAVED_MODEL for models.
The default value for tables is CSV. Tables with nested or repeated fields cannot be exported as CSV.
The default value for models is SAVED_MODEL.
* `compression` -
(Optional)
The compression type to use for exported files. Possible values include GZIP, DEFLATE, SNAPPY, and NONE.
The default value is NONE. DEFLATE and SNAPPY are only supported for Avro.
* `use_avro_logical_types` -
(Optional)
Whether to use logical types when extracting to AVRO format.
* `source_table` -
(Optional)
A reference to the table being exported.
Structure is [documented below](#nested_source_table).
* `source_model` -
(Optional)
A reference to the model being exported.
Structure is [documented below](#nested_source_model).
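A sketch of a CSV extract using the formatting options above, reusing the `source-one` table and `dest` bucket from the extract example:
```hcl
resource "google_bigquery_job" "csv_extract" {
  job_id = "job_extract_csv"

  extract {
    destination_uris = ["${google_storage_bucket.dest.url}/extract-csv/*.csv"]

    source_table {
      project_id = google_bigquery_table.source-one.project
      dataset_id = google_bigquery_table.source-one.dataset_id
      table_id   = google_bigquery_table.source-one.table_id
    }

    destination_format = "CSV"
    field_delimiter    = ";"
    print_header       = false
    compression        = "GZIP"
  }
}
```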
<a name="nested_source_table"></a>The `source_table` block supports:
* `project_id` -
(Optional)
The ID of the project containing this table.
* `dataset_id` -
(Optional)
The ID of the dataset containing this table.
* `table_id` -
(Required)
The table. Can be specified as `{{table_id}}` if `project_id` and `dataset_id` are also set, or in the form `projects/{{project}}/datasets/{{dataset_id}}/tables/{{table_id}}` if not.
<a name="nested_source_model"></a>The `source_model` block supports:
* `project_id` -
(Required)
The ID of the project containing this model.
* `dataset_id` -
(Required)
The ID of the dataset containing this model.
* `model_id` -
(Required)
The ID of the model.
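None of the Example Usage blocks export a model; a sketch assuming a BigQuery ML model named `my_model` already exists in the `source-one` dataset (the model itself is not managed here):
```hcl
resource "google_bigquery_job" "model_extract" {
  job_id = "job_extract_model"

  extract {
    destination_uris = ["${google_storage_bucket.dest.url}/model-export"]

    source_model {
      project_id = google_bigquery_dataset.source-one.project
      dataset_id = google_bigquery_dataset.source-one.dataset_id
      model_id   = "my_model" # placeholder; the model is not managed by Terraform here
    }
  }
}
```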
- - -
* `job_timeout_ms` -
(Optional)
Job timeout in milliseconds. If this time limit is exceeded, BigQuery may attempt to terminate the job.
* `labels` -
(Optional)
The labels associated with this job. You can use these to organize and group your jobs.
**Note**: This field is non-authoritative, and will only manage the labels present in your configuration.
Please refer to the field `effective_labels` for all of the labels present on the resource.
* `query` -
(Optional)
Configures a query job.
Structure is [documented below](#nested_query).
* `load` -
(Optional)
Configures a load job.
Structure is [documented below](#nested_load).
* `copy` -
(Optional)
Copies a table.
Structure is [documented below](#nested_copy).
* `extract` -
(Optional)
Configures an extract job.
Structure is [documented below](#nested_extract).
* `location` -
(Optional)
The geographic location of the job. The default value is US.
* `project` - (Optional) The ID of the project in which the resource belongs.
If it is not provided, the provider project is used.
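A sketch showing where these job-level arguments sit, outside the `query`/`load`/`copy`/`extract` blocks (the timeout value and label are placeholders):
```hcl
resource "google_bigquery_job" "job" {
  job_id         = "job_query_with_job_options"
  location       = "US"
  job_timeout_ms = "600000" # placeholder: ask BigQuery to abandon the job after 10 minutes

  labels = {
    team = "analytics"
  }

  query {
    query = "SELECT 1"
  }
}
```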
## Attributes Reference
In addition to the arguments listed above, the following computed attributes are exported:
* `id` - an identifier for the resource with format `projects/{{project}}/jobs/{{job_id}}`
* `user_email` -
Email address of the user who ran the job.
* `job_type` -
(Output)
The type of the job.
* `terraform_labels` -
(Output)
The combination of labels configured directly on the resource
and default labels configured on the provider.
* `effective_labels` -
(Output)
All labels (key/value pairs) present on the resource in GCP, including the labels configured through Terraform, other clients and services.
* `status` -
The status of this job. Examine this value when polling an asynchronous job to see if the job is complete.
Structure is [documented below](#nested_status).
<a name="nested_status"></a>The `status` block contains:
* `error_result` -
(Output)
Final error result of the job. If present, indicates that the job has completed and was unsuccessful.
Structure is [documented below](#nested_error_result).
* `errors` -
(Output)
The first errors encountered during the running of the job. The final message
includes the number of errors that caused the process to stop. Errors here do
not necessarily mean that the job has not completed or was unsuccessful.
Structure is [documented below](#nested_errors).
* `state` -
(Output)
Running state of the job. Valid states include 'PENDING', 'RUNNING', and 'DONE'.
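The `status` attribute is exposed as a single-element list, so the final state can be surfaced roughly like this:
```hcl
output "bigquery_job_state" {
  # PENDING, RUNNING, or DONE once the API has processed the job.
  value = google_bigquery_job.job.status[0].state
}
```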
<a name="nested_error_result"></a>The `error_result` block contains:
* `reason` -
(Optional)
A short error code that summarizes the error.
* `location` -
(Optional)
Specifies where the error occurred, if present.
* `message` -
(Optional)
A human-readable description of the error.
<a name="nested_errors"></a>The `errors` block contains:
* `reason` -
(Optional)
A short error code that summarizes the error.
* `location` -
(Optional)
Specifies where the error occurred, if present.
* `message` -
(Optional)
A human-readable description of the error.
## Timeouts
This resource provides the following
[Timeouts](https://developer.hashicorp.com/terraform/plugin/sdkv2/resources/retries-and-customizable-timeouts) configuration options:
- `create` - Default is 20 minutes.
- `update` - Default is 20 minutes.
- `delete` - Default is 20 minutes.
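These can be raised with a standard `timeouts` block inside the resource, for example:
```hcl
resource "google_bigquery_job" "job" {
  job_id = "job_query"

  query {
    query = "SELECT 1"
  }

  # Raise the create/delete timeouts from the 20 minute defaults.
  timeouts {
    create = "30m"
    delete = "30m"
  }
}
```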
## Import
Job can be imported using any of these accepted formats:
* `projects/{{project}}/jobs/{{job_id}}/location/{{location}}`
* `projects/{{project}}/jobs/{{job_id}}`
* `{{project}}/{{job_id}}/{{location}}`
* `{{job_id}}/{{location}}`
* `{{project}}/{{job_id}}`
* `{{job_id}}`
In Terraform v1.5.0 and later, use an [`import` block](https://developer.hashicorp.com/terraform/language/import) to import Job using one of the formats above. For example:
```tf
import {
id = "projects/{{project}}/jobs/{{job_id}}/location/{{location}}"
to = google_bigquery_job.default
}
```
When using the [`terraform import` command](https://developer.hashicorp.com/terraform/cli/commands/import), Job can be imported using one of the formats above. For example:
```
$ terraform import google_bigquery_job.default projects/{{project}}/jobs/{{job_id}}/location/{{location}}
$ terraform import google_bigquery_job.default projects/{{project}}/jobs/{{job_id}}
$ terraform import google_bigquery_job.default {{project}}/{{job_id}}/{{location}}
$ terraform import google_bigquery_job.default {{job_id}}/{{location}}
$ terraform import google_bigquery_job.default {{project}}/{{job_id}}
$ terraform import google_bigquery_job.default {{job_id}}
```
## User Project Overrides
This resource supports [User Project Overrides](https://registry.terraform.io/providers/hashicorp/google/latest/docs/guides/provider_reference#user_project_override).