Converts a binary column of Avro format into its corresponding catalyst value. The specified schema must match the data being read; otherwise the behavior is undefined, and the function may fail or return an arbitrary result.
If jsonFormatSchema is not provided but both subject and schemaRegistryAddress are provided, the function converts a binary column of Schema Registry Avro format into its corresponding catalyst value.
Syntax
from pyspark.sql.avro.functions import from_avro
from_avro(data, jsonFormatSchema=None, options=None, subject=None, schemaRegistryAddress=None)
Parameters
| Parameter | Type | Description |
|---|---|---|
| data | pyspark.sql.Column or str | The binary column containing Avro-encoded data. |
| jsonFormatSchema | str, optional | The Avro schema in JSON string format. |
| options | dict, optional | Options to control how the Avro record is parsed and configuration for the schema registry client. |
| subject | str, optional | The subject in Schema Registry that the data belongs to. |
| schemaRegistryAddress | str, optional | The address (host and port) of the Schema Registry. |
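The two calling modes described above (an explicit JSON schema, or a subject plus a Schema Registry address) can be sketched as a small argument-selection helper. This is only an illustration of the documented rule, not part of the API: `from_avro_kwargs`, the subject `orders-value`, and the address `https://registry.example.com:8081` are hypothetical names chosen for this sketch. Giving an explicit schema precedence when both are supplied is likewise a choice made here, not documented behavior.

```python
def from_avro_kwargs(json_format_schema=None, subject=None,
                     schema_registry_address=None, options=None):
    """Hypothetical helper: pick the keyword arguments to pass to from_avro."""
    if json_format_schema is not None:
        # An explicit Avro schema in JSON string format was supplied.
        kwargs = {"jsonFormatSchema": json_format_schema}
    elif subject is not None and schema_registry_address is not None:
        # No schema given: both registry arguments are required together.
        kwargs = {
            "subject": subject,
            "schemaRegistryAddress": schema_registry_address,
        }
    else:
        raise ValueError(
            "Provide jsonFormatSchema, or both subject and schemaRegistryAddress"
        )
    if options is not None:
        kwargs["options"] = options
    return kwargs


# Hypothetical subject and address, used only to show the shape of the call.
registry_kwargs = from_avro_kwargs(
    subject="orders-value",
    schema_registry_address="https://registry.example.com:8081",
)
# The result would then be unpacked into the call, for example:
# df.select(from_avro(df.value, **registry_kwargs).alias("value"))
```

The helper fails early with a `ValueError` when only one of the two registry arguments is given, so an incomplete configuration is caught before the call to `from_avro`.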
Returns
pyspark.sql.Column: A new column containing the deserialized Avro data as the corresponding catalyst value.
Examples
Example 1: Deserializing an Avro binary column using a JSON schema
from pyspark.sql import Row
from pyspark.sql.avro.functions import from_avro, to_avro
data = [(1, Row(age=2, name='Alice'))]
df = spark.createDataFrame(data, ("key", "value"))
avro_df = df.select(to_avro(df.value).alias("avro"))
json_format_schema = '''{"type":"record","name":"topLevelRecord","fields":
[{"name":"avro","type":[{"type":"record","name":"value",
"namespace":"topLevelRecord","fields":[{"name":"age","type":["long","null"]},
{"name":"name","type":["string","null"]}]},"null"]}]}'''
avro_df.select(from_avro(avro_df.avro, json_format_schema).alias("value")).show(truncate=False)
+------------------+
|value             |
+------------------+
|{{2, Alice}}      |
+------------------+
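Because the schema passed as jsonFormatSchema has to match the data, hand-written JSON strings are an easy place for mistakes. As a minimal sketch, the same schema used in Example 1 could be built from a Python dict and serialized with the standard `json` module. The variable names `value_record` and `schema` are illustrative only.

```python
import json

# Inner record: the struct held in the original "value" column.
value_record = {
    "type": "record",
    "name": "value",
    "namespace": "topLevelRecord",
    "fields": [
        {"name": "age", "type": ["long", "null"]},
        {"name": "name", "type": ["string", "null"]},
    ],
}

# Top-level record wrapping the nullable "avro" field, as in Example 1.
schema = {
    "type": "record",
    "name": "topLevelRecord",
    "fields": [{"name": "avro", "type": [value_record, "null"]}],
}

# Compact separators give the same JSON text as Example 1 without the line breaks.
json_format_schema = json.dumps(schema, separators=(",", ":"))
print(json_format_schema)
```

The resulting `json_format_schema` string can be passed to `from_avro` in place of the literal used in Example 1.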