Creates a box-and-whisker plot from DataFrame columns.
A box plot graphically depicts groups of numerical data through their quartiles. The box extends from the first quartile (Q1) to the third quartile (Q3) of the data, with a line at the median (Q2). The whiskers extend from the edges of the box to show the range of the data. By default, they extend no more than 1.5 × IQR (IQR = Q3 - Q1) from the edges of the box, ending at the farthest data point within that interval. Data points beyond the whiskers are outliers and are plotted as separate dots.
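The whisker rule above can be illustrated with a small standalone sketch. It uses only the Python standard library and exact quartiles with linear interpolation, which is an assumption: `box` computes approximate statistics on a Spark DataFrame, so its values may differ slightly. The helper name `box_stats` is hypothetical and is not part of the PySpark API.

```python
from statistics import quantiles


def box_stats(values, whisker=1.5):
    """Return quartiles, whisker ends, and outliers for a list of numbers.

    Hypothetical helper for illustration only; not part of PySpark.
    """
    data = sorted(values)
    # Exact quartiles with linear interpolation (assumed convention).
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Fences: the farthest a whisker is allowed to reach from the box.
    lower_fence = q1 - whisker * iqr
    upper_fence = q3 + whisker * iqr
    # Whiskers end at the farthest data points that lie within the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# The english_score values from the example on this page.
english_scores = [55, 60, 65, 70, 75, 15, 90, 150]
stats = box_stats(english_scores)
print(stats)
```

Under these assumptions, Q1 is 58.75 and Q3 is 78.75, so the IQR is 20 and the fences sit at 28.75 and 108.75. The whiskers therefore end at 55 and 90, and the values 15 and 150 are reported as outliers.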
Syntax
box(column=None, **kwargs)
Parameters
| Parameter | Type | Description |
|---|---|---|
| `column` | str or list of str, optional | Column name or list of names to use for creating the box plot. If `None` (default), all numeric columns are used. |
| `**kwargs` | optional | Additional keyword arguments. Supports `precision`: a float used to compute approximate statistics for the box plot. Default: 0.01. Use smaller values for more precise statistics. |
Returns
plotly.graph_objs.Figure
Examples
from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()
data = [
    ("A", 50, 55),
    ("B", 55, 60),
    ("C", 60, 65),
    ("D", 65, 70),
    ("E", 70, 75),
    ("F", 10, 15),
    ("G", 85, 90),
    ("H", 5, 150),
]
columns = ["student", "math_score", "english_score"]
df = spark.createDataFrame(data, columns)
df.plot.box()