Adds a write option for the underlying data source. For a list of commonly used options, see Options.
Syntax
option(key, value)
Parameters
| Parameter | Type | Description |
|---|---|---|
| key | str | The option key. |
| value | str, int, float, or bool | The option value. |
Options
The following table contains some writer options:
| Key | Formats | Description |
|---|---|---|
| arrayElementName | XML | The element name for array elements that have no explicit name. Default: item. Applies to xml (DataFrameWriter). |
| attributePrefix | XML | The prefix prepended to field names that correspond to XML attributes. Default: _. Applies to xml (DataFrameWriter). |
| avroSchema | Avro | The full Avro schema as a JSON string. Use this option to convert Spark SQL types to specific Avro types. Applies to Avro file. |
| charToEscapeQuoteEscaping | CSV | The character used to escape the escape character when it differs from the quote character. Default: \0 (not enabled). Applies to csv (DataFrameWriter). |
| clusterByAuto | Delta Lake | Whether to enable automatic liquid clustering, where Azure Databricks selects clustering columns based on query patterns. Only valid with mode("overwrite"); cannot be used with append mode. Default: false. Available in Databricks Runtime 16.4 and above. Applies to Use liquid clustering for tables. |
| compression | CSV, JSON, ORC, Parquet, Text, XML | The compression codec to use when writing. Valid values vary by format. Applies to csv (DataFrameWriter), json (DataFrameWriter), orc (DataFrameWriter), parquet (DataFrameWriter), text (DataFrameWriter), xml (DataFrameWriter). |
| dateFormat | CSV, JSON, XML | The format string for date column values. Default: yyyy-MM-dd. Applies to csv (DataFrameWriter), json (DataFrameWriter), xml (DataFrameWriter). |
| declaration | XML | The XML declaration string written at the top of each output file. Set to an empty string to suppress the declaration. Default: version="1.0" encoding="UTF-8" standalone="yes". Applies to xml (DataFrameWriter). |
| emptyValue | CSV | The string written for empty (non-null) values. Default: "". Applies to csv (DataFrameWriter). |
| encoding | CSV, JSON, XML | The character encoding for the output files. Default: UTF-8. Applies to csv (DataFrameWriter), json (DataFrameWriter), xml (DataFrameWriter). |
| escape | CSV | The character used to escape quoted values. Default: \. Applies to csv (DataFrameWriter). |
| escapeQuotes | CSV | Whether to escape quote characters inside quoted field values. Default: true. Applies to csv (DataFrameWriter). |
| header | CSV | Whether to write column names as the first line of the output. Default: false. Applies to csv (DataFrameWriter). |
| ignoreLeadingWhiteSpace | CSV | Whether to trim leading whitespace from values when writing. Default: false. Applies to csv (DataFrameWriter). |
| ignoreNullFields | JSON | Whether to omit fields with null values from the JSON output. Default: the value of spark.sql.jsonGenerator.ignoreNullFields. Applies to json (DataFrameWriter). |
| ignoreTrailingWhiteSpace | CSV | Whether to trim trailing whitespace from values when writing. Default: false. Applies to csv (DataFrameWriter). |
| lineSep | CSV, JSON, Text | The line separator string used between records. Default: \n. Applies to csv (DataFrameWriter), json (DataFrameWriter), text (DataFrameWriter). |
| mergeSchema | Delta Lake | Whether to enable schema evolution for the write operation. New columns in the source DataFrame are added to the target table schema. Applies to batch and streaming appends. Applies to Update table schema. |
| nullValue | CSV | The string written for null values. Default: "". Applies to csv (DataFrameWriter). |
| nullValue | XML | The string written for null values. Default: null. When set to null, attributes and child elements for null fields are omitted. Applies to xml (DataFrameWriter). |
| overwriteSchema | Delta Lake | Whether to replace the table schema and partitioning when overwriting. Requires mode("overwrite") without replaceWhere. Cannot be used with partitionOverwriteMode. Applies to Update table schema. |
| partitionOverwriteMode | Delta Lake | The partition overwrite mode. Set this to dynamic to overwrite only partitions containing new data, leaving all other partitions unchanged. Legacy mode; not supported on serverless compute or Databricks SQL. Applies to Selectively overwrite data with Delta Lake. |
| quote | CSV | The character used to quote field values that contain the separator. Default: ". Applies to csv (DataFrameWriter). |
| quoteAll | CSV | Whether to enclose all field values in quotes regardless of content. Default: false. Applies to csv (DataFrameWriter). |
| recordName | Avro | The top-level record name in the output Avro schema. Default: topLevelRecord. Applies to Avro file. |
| recordNamespace | Avro | The namespace for the top-level record in the output Avro schema. Default: "". Applies to Avro file. |
| replaceWhere | Delta Lake | A predicate expression. Atomically overwrites only the records that match the predicate. Applies to Selectively overwrite data with Delta Lake. |
| rootTag | XML | The root element tag that wraps all row elements in the output. Default: ROWS. Applies to xml (DataFrameWriter). |
| rowTag | XML | The element tag that represents a row in the output. Default: ROW. Applies to xml (DataFrameWriter). |
| sep | CSV | The field delimiter character. Default: ,. Applies to csv (DataFrameWriter). |
| timestampFormat | CSV, JSON, XML | The format string for timestamp column values. Default: yyyy-MM-dd'T'HH:mm:ss[.SSS][XXX]. Applies to csv (DataFrameWriter), json (DataFrameWriter), xml (DataFrameWriter). |
| txnAppId | Delta Lake | A unique string identifying the application for idempotent writes in foreachBatch operations. Use together with txnVersion to ensure exactly-once writes to multiple Delta Lake tables. Applies to Use foreachBatch for idempotent table writes. |
| txnVersion | Delta Lake | A monotonically increasing number used as the transaction version for idempotent writes in foreachBatch operations. Use together with txnAppId to ensure exactly-once writes to multiple Delta Lake tables. Applies to Use foreachBatch for idempotent table writes. |
| userMetadata | Delta Lake, Apache Iceberg | A user-defined string appended to the commit metadata for the write operation. Visible in the output of DESCRIBE HISTORY. Applies to Enrich tables with custom metadata. |
| validateName | XML | Whether to throw an exception if a column name is not a valid XML element identifier. Default: true. Applies to xml (DataFrameWriter). |
| valueTag | XML | The field name used for character data in XML elements that also have attributes or child elements. Default: _VALUE. Applies to xml (DataFrameWriter). |
Returns
DataFrameWriterV2