# scatterhist

Scatter plot with marginal histograms

## Description

scatterhist(x,y) creates a 2-D scatter plot of the data in the vectors x and y, and displays the marginal distribution of x as a histogram along the horizontal axis and that of y as a histogram along the vertical axis.

scatterhist(x,y,Name,Value) creates the plot using additional options specified by one or more name-value pair arguments. For example, you can specify a grouping variable or change the display options.

h = scatterhist(___) returns a vector of three axes handles for the scatter plot, the histogram along the horizontal axis, and the histogram along the vertical axis, respectively, using any of the input arguments in the previous syntaxes.

## Examples

### Create a Scatterhist Plot

Load the sample data. Create data vector x from the first column of the data matrix, which contains sepal length measurements from iris flowers. Create data vector y from the second column of the data matrix, which contains sepal width measurements from the same flowers.

```
load fisheriris.mat;
x = meas(:,1);
y = meas(:,2);
```

Create a scatter plot and two marginal histograms to visualize the relationship between sepal length and sepal width.

`scatterhist(x,y)`

### Plot Grouped Data

Load the sample data. Create data vector x from the first column of the data matrix, which contains sepal length measurements from three species of iris flowers. Create data vector y from the second column of the data matrix, which contains sepal width measurements from the same flowers.

```
load fisheriris.mat;
x = meas(:,1);
y = meas(:,2);
```

Create a scatter plot and six kernel density plots to visualize the relationship between sepal length and sepal width, grouped by species.

`scatterhist(x,y,'Group',species)`

The plot shows that the relationship between sepal length and width varies depending on the flower species.

### Customize the Plot Display

Load the sample data. Create data vector x from the first column of the data matrix, which contains sepal length measurements from three different species of iris flowers. Create data vector y from the second column of the data matrix, which contains sepal width measurements from the same flowers.

```
load fisheriris.mat;
x = meas(:,1);
y = meas(:,2);
```

Create a scatter plot and six kernel density plots to visualize the relationship between sepal length and sepal width, grouped by species. Customize the appearance of the plots.

```
scatterhist(x,y,'Group',species,'Location','SouthEast',...
    'Direction','out','Color','kbr','LineStyle',{'-','-.',':'},...
    'LineWidth',[2,2,2],'Marker','+od','MarkerSize',[4,5,6]);
```

### Customize Plots Using Axes Handles

Load the sample data. Create data vector x from the first column of the data matrix, which contains sepal length measurements from three species of iris flowers. Create data vector y from the second column of the data matrix, which contains sepal width measurements from the same flowers.

```
load fisheriris.mat;
x = meas(:,1);
y = meas(:,2);
```

Use the returned axes handles to replace the marginal plots with box plots.

```
h = scatterhist(x,y,'Group',species);
hold on;
boxplot(h(2),x,species,'orientation','horizontal');
boxplot(h(3),y,species,'labelorientation','inline');
hold off;
```

## Input Arguments

### x — Sample data

vector

Sample data, specified as a vector. The data vectors x and y must be the same length. If x or y contains NaN values, then scatterhist:

• Removes rows with NaN values in either x or y from both data vectors when generating the scatter plot

• Removes rows with NaN values only from the corresponding x or y data vector when generating the marginal histograms

Data Types: single | double
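
The two NaN rules above can be pictured with logical masks. The following is a minimal sketch of the described behavior, not the internal code of scatterhist; the data and the variable names (bothValid, xScatter, xHist, and so on) are illustrative assumptions.

```matlab
% Illustrative data with missing values (not taken from the iris data set).
x = [1; 2; NaN; 4; 5];
y = [2; NaN; 3; 5; 6];

% Scatter plot: keep only the rows where both x and y are present.
bothValid = ~isnan(x) & ~isnan(y);
xScatter = x(bothValid);
yScatter = y(bothValid);

% Marginal histograms: each vector drops only its own NaN values.
xHist = x(~isnan(x));
yHist = y(~isnan(y));
```

In this sketch the scatter plot would use three points, while each marginal histogram would use four values.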

### y — Sample data

vector

Sample data, specified as a vector. The data vectors x and y must be the same length.

Data Types: single | double

### Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'Location','SouthEast','Direction','out' specifies a plot with histograms located below and to the right of the scatter plot, with the bars directed away from the scatter plot.

### 'NBins' — Number of bins for histograms

positive integer value | vector

Number of bins for histograms, specified as the comma-separated pair consisting of 'NBins' and a positive integer value greater than or equal to 2, or a vector of two such values. If the number of bins is specified as a positive integer value, that value is the number of bins for both the x and y histograms. If the number of bins is specified by a vector, the first value is the number of bins for the x data, and the second value is the number of bins for the y data. By default, the number of bins is computed based on the sample standard deviation using Scott's rule.

Example: 'NBins',[5,7]

Data Types: single | double
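
As a short usage sketch, the vector form can be applied to the iris data used in the examples above. The choice of 5 and 7 bins is arbitrary and only illustrates the argument order.

```matlab
load fisheriris.mat;
x = meas(:,1);   % sepal length
y = meas(:,2);   % sepal width

% First value: bins for the x histogram; second value: bins for the y histogram.
h = scatterhist(x,y,'NBins',[5,7]);
```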

### 'Location' — Location of marginal histograms

'SouthWest' (default) | 'SouthEast' | 'NorthEast' | 'NorthWest'

Location of the marginal histograms in the figure, specified as the comma-separated pair consisting of 'Location' and one of the following.

• 'SouthWest': Plot the histograms below and to the left of the scatter plot.

• 'SouthEast': Plot the histograms below and to the right of the scatter plot.

• 'NorthEast': Plot the histograms above and to the right of the scatter plot.

• 'NorthWest': Plot the histograms above and to the left of the scatter plot.

Example: 'Location','SouthEast'

### 'Direction' — Direction of marginal histograms

'in' (default) | 'out'

Direction of the marginal histograms, specified as the comma-separated pair consisting of 'Direction' and one of the following.

• 'in': Plot the histograms with the bars directed toward the scatter plot.

• 'out': Plot the histograms with the bars directed away from the scatter plot.

Example: 'Direction','out'
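
'Location' and 'Direction' are often set together. The sketch below, using the iris data from the examples above, places the histograms above and to the right of the scatter plot with the bars pointing away from it; the specific combination is chosen only for illustration.

```matlab
load fisheriris.mat;
x = meas(:,1);   % sepal length
y = meas(:,2);   % sepal width

% Histograms above and to the right, bars directed away from the scatter plot.
h = scatterhist(x,y,'Location','NorthEast','Direction','out');
```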

### 'Group' — Grouping variable

categorical array | logical or numeric vector | cell array of strings

Grouping variable, specified as the comma-separated pair consisting of 'Group' and a categorical array, logical or numeric vector, or cell array of strings. Each unique value in a grouping variable defines a group.

For example, if Gender is a cell array of strings with values 'Male' and 'Female', you can use Gender as a grouping variable to plot your data by gender.

To use multiple grouping variables, specify a cell array containing the grouping variables. Observations are placed in the same group if they have common values of all specified grouping variables.

For example, if Smoker is a logical vector with values 0 for nonsmokers and 1 for smokers, then specifying the cell array {Gender,Smoker} divides observations into four groups: Male Smoker, Male Nonsmoker, Female Smoker, and Female Nonsmoker.

Example: 'Group',{Gender,Smoker}

Data Types: single | double | logical | cell | char
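
The Gender and Smoker case described above can be sketched with synthetic data. Everything in this block (the random values, the repeating label pattern, the sample size) is made up for illustration and does not come from a real data set.

```matlab
rng(0);                                            % fix the seed so the sketch is repeatable
n = 40;
Gender = repmat({'Male';'Female'},n/2,1);          % hypothetical cell array of strings
Smoker = repmat([true;true;false;false],n/4,1);    % hypothetical logical vector
x = randn(n,1);
y = x + 0.5*randn(n,1);

% The two grouping variables combine into four groups:
% Male Smoker, Female Smoker, Male Nonsmoker, Female Nonsmoker.
h = scatterhist(x,y,'Group',{Gender,Smoker});
```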

### 'Kernel' — Grouped kernel density plot indicator

'on' | 'off' | 'overlay'

Grouped kernel density plot indicator, specified as the comma-separated pair consisting of 'Kernel' and one of the following.

• 'on': Display kernel density plots for each group. This is the default if a Group parameter is specified.

• 'off': Display the overall marginal distribution as histograms. This is the default if a Group parameter is not specified.

• 'overlay': Display the overall marginal distribution as kernel density plots overlaid onto histograms, similar to histfit.

Example: 'Kernel','overlay'
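
A brief usage sketch of the 'overlay' option on ungrouped iris data, where the default would otherwise be plain histograms:

```matlab
load fisheriris.mat;
x = meas(:,1);   % sepal length
y = meas(:,2);   % sepal width

% Draw histograms with a kernel density curve overlaid on each.
h = scatterhist(x,y,'Kernel','overlay');
```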

### 'Bandwidth' — Bandwidth of kernel smoothing window

matrix

Bandwidth of kernel smoothing window, specified as the comma-separated pair consisting of 'Bandwidth' and a matrix of size 2-by-K, where K is the number of unique groups. The first row of the matrix gives the bandwidth of each group in x, and the second row gives the bandwidth of each group in y. By default, scatterhist finds the optimal bandwidth for estimating normal densities. Specifying a different bandwidth value changes the smoothing characteristics of the resulting kernel density plot. The value specified is a scaling factor for the normal distribution used to generate the kernel density plot.

Example: 'Bandwidth',[.5,.2,.1;.15,.25,.35]

Data Types: single | double
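
The 2-by-K layout is easiest to see with the three iris species, where K is 3. The bandwidth values below are the ones from the example above and are not tuned for this data.

```matlab
load fisheriris.mat;
x = meas(:,1);   % sepal length
y = meas(:,2);   % sepal width

% Row 1: bandwidths in x for each of the three groups.
% Row 2: bandwidths in y for each of the three groups.
bw = [.5,.2,.1; .15,.25,.35];
h = scatterhist(x,y,'Group',species,'Bandwidth',bw);
```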

### 'Legend' — Legend visibility indicator

'on' | 'off'

Legend visibility indicator, specified as the comma-separated pair consisting of 'Legend' and one of the following.

• 'on': Set legend visible. This is the default if a Group parameter is specified.

• 'off': Set legend invisible. This is the default if a Group parameter is not specified.

Example: 'Legend','on'
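
Because the legend defaults to visible when a grouping variable is given, the more common use of this option is to hide it. A short sketch on the iris data:

```matlab
load fisheriris.mat;
x = meas(:,1);   % sepal length
y = meas(:,2);   % sepal width

% Grouped plot with the legend suppressed.
h = scatterhist(x,y,'Group',species,'Legend','off');
```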

### 'LineStyle' — Style of kernel density plot line

valid line style string | cell array of strings

Style of kernel density plot line, specified as the comma-separated pair consisting of 'LineStyle' and a valid line style string or a cell array of valid line style strings. See plot for valid line style strings. The default is a solid line. Use a cell array to specify different line styles for each group. When the total number of groups exceeds the number of specified values, scatterhist cycles through the specified values.

Example: 'LineStyle',{'-',':','-.'}
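
One way to read the cycling behavior is as wrap-around indexing into the list of specified values. The sketch below shows which style each group would receive under that reading when two styles are given for three groups; it illustrates the wording above and is not taken from the scatterhist source.

```matlab
styles  = {'-',':'};   % two line styles specified
nGroups = 3;           % three groups to draw

% Wrap-around index: group 3 reuses the first style.
idx = mod((1:nGroups)-1, numel(styles)) + 1;
assigned = styles(idx);
```

Here idx is [1 2 1], so the third group is drawn with the solid line again. The same reading applies to the other per-group options that cycle through their values.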

### 'LineWidth' — Width of kernel density plot line

0.5 (default) | nonnegative scalar value | vector

Width of kernel density plot line, specified as the comma-separated pair consisting of 'LineWidth' and a nonnegative scalar value or vector of nonnegative scalar values. The specified value is the width of the kernel density plot line, measured in points. The default width is 0.5 points. Use a vector to specify different line widths for each group. When the total number of groups is greater than the number of specified values, scatterhist cycles through the specified values.

Example: 'LineWidth',[0.5,1,2]

Data Types: single | double

### 'Color' — Marker color for each scatter plot group

valid color designation char | string of chars | matrix of RGB values

Marker color for each scatter plot group, specified as the comma-separated pair consisting of 'Color' and a valid color designation character, a string of valid color designation characters, or a 3-column matrix of RGB values in the range [0,1]. See ColorSpec for predefined colors and their RGB equivalents. If colors are specified using a matrix, each row of the matrix represents a group, and the three columns represent the R value, G value, and B value, respectively. When the total number of groups exceeds the number of specified colors, scatterhist cycles through the specified colors.

Example: 'Color','kcm'

Example: 'Color',[.5,0,1;0,.5,.5]

Data Types: single | double | char
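
The matrix form can be sketched with one RGB row per iris species. The particular color values are arbitrary choices for illustration.

```matlab
load fisheriris.mat;
x = meas(:,1);   % sepal length
y = meas(:,2);   % sepal width

% One row per group; columns are the R, G, and B values in the range [0,1].
rgb = [0    0    0;
       0    0.45 0.74;
       0.85 0.33 0.10];
h = scatterhist(x,y,'Group',species,'Color',rgb);
```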

### 'Marker' — Marker symbol for each scatter plot group

'o' (default) | valid marker symbol | string of valid marker symbols

Marker symbol for each scatter plot group, specified as the comma-separated pair consisting of 'Marker' and a valid marker symbol or string of valid marker symbols. See plot for valid symbols. The default is 'o', a circle. When the total number of groups exceeds the number of specified symbols, scatterhist cycles through the specified symbols.

Example: 'Marker','+do'

### 'MarkerSize' — Marker size for each scatter plot group

6 (default) | nonnegative scalar value | vector

Marker size for each scatter plot group, specified as the comma-separated pair consisting of 'MarkerSize' and a nonnegative scalar value or a vector of nonnegative scalar values, measured in points. When the total number of groups exceeds the number of specified values, scatterhist cycles through the specified values.

Example: 'MarkerSize',10

Data Types: single | double

## Output Arguments

### h — Axes handles

vector

Axes handles for the three plots, returned as a vector. The vector contains the handles for the scatter plot, the histogram along the horizontal axis, and the histogram along the vertical axis, respectively.
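
Because h(1) is the scatter plot axes, standard axes functions can be applied to it after the call. The sketch below relabels the scatter plot; the label text is an illustrative choice.

```matlab
load fisheriris.mat;
x = meas(:,1);   % sepal length
y = meas(:,2);   % sepal width

h = scatterhist(x,y);

% h(1): scatter plot, h(2): horizontal-axis histogram, h(3): vertical-axis histogram.
xlabel(h(1),'Sepal length');
ylabel(h(1),'Sepal width');
```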