

unpivot

Unpivot a DataFrame from wide format to long format, optionally leaving identifier columns in place. This is the reverse of groupBy(...).pivot(...).agg(...), except for the aggregation, which cannot be reversed.

Added in Databricks Runtime 11.1

Syntax

unpivot(
    ids: Union["ColumnOrName", List["ColumnOrName"], Tuple["ColumnOrName", ...]],
    values: Optional[Union["ColumnOrName", List["ColumnOrName"], Tuple["ColumnOrName", ...]]],
    variableColumnName: str,
    valueColumnName: str,
) -> DataFrame

Parameters

ids (str, Column, list, or tuple): Column(s) to use as identifiers. Can be a single column or column name, or a list or tuple of columns for multiple identifiers.
values (str, Column, list, or tuple; optional): Column(s) to unpivot. Can be a single column or column name, or a list or tuple of columns. If specified, must not be empty. If not specified, all columns that are not set as ids are unpivoted.
variableColumnName (str): Name of the output column that holds the former column names.
valueColumnName (str): Name of the output column that holds the values.
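To make the roles of these parameters concrete, here is a rough plain-Python sketch of the transformation (not PySpark itself, and the helper name is made up for illustration): each input row yields one output row per value column, carrying the id columns along and adding the variable/value pair. Note that real unpivot additionally casts the value columns to their least common type, which this sketch does not do.

```python
def unpivot_rows(rows, ids, values, variable_col, value_col):
    """Illustrative melt: rows is a list of dicts, ids/values are column names."""
    out = []
    for row in rows:
        for col in values:
            rec = {k: row[k] for k in ids}   # keep identifier columns as-is
            rec[variable_col] = col          # former column name
            rec[value_col] = row[col]        # its value in this row
            out.append(rec)
    return out

rows = [
    {"id": 1, "int": 11, "double": 1.1},
    {"id": 2, "int": 12, "double": 1.2},
]
long_rows = unpivot_rows(rows, ["id"], ["int", "double"], "var", "val")
# four rows: one per (input row, value column) combination
```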

Returns

DataFrame: Unpivoted DataFrame.

Notes

Supports Spark Connect.

Examples

df = spark.createDataFrame(
    [(1, 11, 1.1), (2, 12, 1.2)],
    ["id", "int", "double"],
)
df.show()
# +---+---+------+
# | id|int|double|
# +---+---+------+
# |  1| 11|   1.1|
# |  2| 12|   1.2|
# +---+---+------+

from pyspark.sql import functions as sf
df.unpivot(
    "id", ["int", "double"], "var", "val"
).sort("id", sf.desc("var")).show()
# +---+------+----+
# | id|   var| val|
# +---+------+----+
# |  1|   int|11.0|
# |  1|double| 1.1|
# |  2|   int|12.0|
# |  2|double| 1.2|
# +---+------+----+