Merk
Tilgang til denne siden krever autorisasjon. Du kan prøve å logge på eller endre kataloger.
Tilgang til denne siden krever autorisasjon. Du kan prøve å endre kataloger.
Returns a DataFrame containing a new row for each element in the given array or map. The default column name is col for elements in an array and key and value for elements in a map. To use different column names, call toDF() on the returned DataFrame.
Syntax
spark.tvf.explode(collection)
Parameters
| Parameter | Type | Description |
|---|---|---|
collection |
pyspark.sql.Column |
Target column to work on. |
Returns
pyspark.sql.DataFrame: A DataFrame with a new row for each element.
Examples
Example 1: Exploding an array column
import pyspark.sql.functions as sf
spark.tvf.explode(sf.array(sf.lit(1), sf.lit(2), sf.lit(3))).show()
+---+
|col|
+---+
| 1|
| 2|
| 3|
+---+
Example 2: Exploding a map column
import pyspark.sql.functions as sf
spark.tvf.explode(
sf.create_map(sf.lit("a"), sf.lit("b"), sf.lit("c"), sf.lit("d"))
).show()
+---+-----+
|key|value|
+---+-----+
| a| b|
| c| d|
+---+-----+
Example 3: Exploding an array of struct column
import pyspark.sql.functions as sf
spark.tvf.explode(sf.array(
sf.named_struct(sf.lit("a"), sf.lit(1), sf.lit("b"), sf.lit(2)),
sf.named_struct(sf.lit("a"), sf.lit(3), sf.lit("b"), sf.lit(4))
)).select("col.*").show()
+---+---+
| a| b|
+---+---+
| 1| 2|
| 3| 4|
+---+---+
Example 4: Exploding an empty array column
import pyspark.sql.functions as sf
spark.tvf.explode(sf.array()).show()
+---+
|col|
+---+
+---+
Example 5: Exploding an empty map column
import pyspark.sql.functions as sf
spark.tvf.explode(sf.create_map()).show()
+---+-----+
|key|value|
+---+-----+
+---+-----+
Example 6: Overriding the default column names
Because spark.tvf.explode returns a DataFrame, use toDF() to rename the output columns. .alias() has no effect on the exploded columns.
import pyspark.sql.functions as sf
# Array: rename the single output column
spark.tvf.explode(sf.array(sf.lit(1), sf.lit(2), sf.lit(3))).toDF("number").show()
+------+
|number|
+------+
| 1|
| 2|
| 3|
+------+
# Map: rename both output columns
spark.tvf.explode(
sf.create_map(sf.lit("a"), sf.lit("b"), sf.lit("c"), sf.lit("d"))
).toDF("letter", "pair").show()
+------+----+
|letter|pair|
+------+----+
| a| b|
| c| d|
+------+----+