Del via


to

Returns a new DataFrame where each row is reconciled to match the specified schema.

Syntax

to(schema: StructType)

Parameters

Parameter Type Description
schema StructType Specified schema.

Returns

DataFrame: Reconciled DataFrame.

Notes

  • Reorder columns and/or inner fields by name to match the specified schema.
  • Project away columns and/or inner fields that are not needed by the specified schema. Missing columns and/or inner fields (present in the specified schema but not input DataFrame) lead to failures.
  • Cast the columns and/or inner fields to match the data types in the specified schema, if the types are compatible, e.g., numeric to numeric (error if overflows), but not string to int.
  • Carry over the metadata from the specified schema, while the columns and/or inner fields still keep their own metadata if not overwritten by the specified schema.
  • Fail if the nullability is not compatible. For example, the column and/or inner field is nullable but the specified schema requires them to be not nullable.

Supports Spark Connect.

Examples

from pyspark.sql.types import StructField, StringType
df = spark.createDataFrame([("a", 1)], ["i", "j"])
df.schema
# StructType([StructField('i', StringType(), True), StructField('j', LongType(), True)])

schema = StructType([StructField("j", StringType()), StructField("i", StringType())])
df2 = df.to(schema)
df2.schema
# StructType([StructField('j', StringType(), True), StructField('i', StringType(), True)])
df2.show()
# +---+---+
# |  j|  i|
# +---+---+
# |  1|  a|
# +---+---+