This tool lets you paste or type pairs of numeric values and instantly see a scatter plot, the Pearson correlation coefficient r, and a least-squares regression line. It runs entirely in your browser, so your data is not uploaded to a server.
A scatter plot is a graph that shows how two numerical variables relate to each other. Each point on the plot represents a single observation with an x-value and a corresponding y-value. By looking at the pattern of points, you can quickly see whether there is a visible relationship, whether the cloud of points is tightly clustered or widely spread, and whether any obvious outliers stand apart from the rest.
Common examples include:
When the points tend to rise from left to right, the relationship is positive: larger x values usually come with larger y values. When the points tend to fall from left to right, the relationship is negative: larger x values tend to come with smaller y values. A loose cloud with no obvious tilt suggests little or no linear relationship.
This generator computes the Pearson correlation coefficient, often written as r. The value of r is always between -1 and 1:
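Conceptually, Pearson's r compares how much x and y vary together (their covariance) with how much they vary on their own (their standard deviations). The sample correlation formula is:

r = Σ (xi − x̄)(yi − ȳ) / √( Σ (xi − x̄)² · Σ (yi − ȳ)² )

where x̄ and ȳ are the sample means of x and y, and each sum runs over all n points.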
In practice, the calculator uses an equivalent computational formula that avoids rounding errors, but the meaning is the same: it measures the strength and direction of a linear association.
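As a rough illustration, the definition of r can be computed directly from deviation sums: the sum of products of deviations, divided by the square root of the product of the two sums of squared deviations. This is a minimal sketch of the textbook definition, not the calculator's own code, and the function name `pearson_r` is made up for this example.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation computed from deviation sums (textbook form)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how each varies on its own
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when x or y is constant")
    return sxy / math.sqrt(sxx * syy)

r_up = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])    # points on a rising line
r_down = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])  # points on a falling line
print(r_up, r_down)
```

Points that lie exactly on a rising line give r at the top of its range, and points on a falling line give r at the bottom.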
There are no universal cutoffs, but many practitioners use rules of thumb like these:
Always look at the plot itself in addition to the number. A single outlier can change r dramatically, and a curved pattern can produce an r value near zero even when there is a clear nonlinear relationship.
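Both cautions can be checked numerically. The sketch below uses small made-up data sets (they are assumptions for illustration, not data from the tool): a perfect parabola whose r is zero, and a trendless cloud whose r jumps once a single extreme point is added.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation from deviation sums."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Curved pattern: y = x^2 with x symmetric about zero.
# y is fully determined by x, yet there is no linear trend.
curve_x = [-3, -2, -1, 0, 1, 2, 3]
curve_y = [x * x for x in curve_x]
r_curve = pearson_r(curve_x, curve_y)

# Trendless cloud, then the same cloud plus one far-away point.
cloud_x = [1, 2, 3, 4, 5]
cloud_y = [2, 4, 3, 4, 2]
r_without = pearson_r(cloud_x, cloud_y)
r_with = pearson_r(cloud_x + [20], cloud_y + [20])

print(f"parabola: r = {r_curve:.3f}")
print(f"cloud without outlier: r = {r_without:.3f}")
print(f"cloud with outlier:    r = {r_with:.3f}")
```

In this example the single added point moves r from zero to above 0.9, which is why the plot should be read alongside the number.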
Along with the scatter plot and correlation, the tool draws the least-squares regression line. This is the straight line that best summarizes the linear trend in the data by minimizing the sum of squared vertical distances between the points and the line.
The regression line has the equation
y = m x + b
where m is the slope and b is the y-intercept. For a set of n points (xi, yi), the slope and intercept can be written in terms of sample means and sums of squares. In simplified symbolic form:
m = Sxy / Sxx

b = ȳ − m x̄

where x̄ and ȳ are the sample means of x and y, Sxy = Σ (xi − x̄)(yi − ȳ) is the sum of products of deviations, and Sxx = Σ (xi − x̄)² is the sum of squared deviations of x.

On the chart, the regression line gives a quick visual summary of the trend. You can also use it to make rough predictions: plug a new x into the equation to estimate the corresponding y. Remember that such predictions are only reliable within the range of your observed data and only when the linear model is a good fit.
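The slope and intercept formulas translate into a few lines of code. The following is a minimal sketch, assuming the standard definitions Sxy = Σ (xi − x̄)(yi − ȳ) and Sxx = Σ (xi − x̄)²; the function names and the four sample points are hypothetical. It also checks the least-squares property by confirming that nudging the fitted line does not reduce the sum of squared vertical distances.

```python
def least_squares_line(xs, ys):
    """Return (m, b) for the line y = m*x + b that minimizes squared vertical error."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("slope is undefined when all x values are identical")
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b

def sum_squared_residuals(xs, ys, m, b):
    """Sum of squared vertical distances between the points and the line."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))

xs = [0, 1, 2, 3]
ys = [1, 3, 4, 8]
m, b = least_squares_line(xs, ys)
best = sum_squared_residuals(xs, ys, m, b)

# Shifting the slope or intercept away from the fit should never lower the error.
nudged = [
    sum_squared_residuals(xs, ys, m + dm, b + db)
    for dm, db in [(0.1, 0), (-0.1, 0), (0, 0.5), (0, -0.5)]
]

# Rough prediction for a new x inside the observed range (0 to 3).
predicted = m * 1.5 + b
print(f"m = {m:.2f}, b = {b:.2f}, prediction at x = 1.5: {predicted:.2f}")
```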
Suppose you collect data on hours spent studying and exam scores for five students:
| Student | Study time (hours) | Score (%) |
|---|---|---|
| A | 1 | 65 |
| B | 2 | 70 |
| C | 3 | 78 |
| D | 4 | 85 |
| E | 5 | 88 |
You would enter these data as:
1,65
2,70
3,78
4,85
5,88
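Input in this form is straightforward to turn into two lists of numbers. The tool's actual parsing code is not shown here, so the function below (`parse_pairs`, a hypothetical name) is only a sketch of one possible approach. It assumes pairs are separated by spaces or line breaks and that there is no space after the comma.

```python
def parse_pairs(text):
    """Parse 'x,y' pairs separated by spaces or newlines into two lists of floats."""
    xs, ys = [], []
    # str.split() with no argument splits on any whitespace,
    # so blank lines and repeated spaces are skipped automatically.
    for token in text.split():
        parts = token.split(",")
        if len(parts) != 2:
            raise ValueError(f"expected 'x,y' but got {token!r}")
        xs.append(float(parts[0]))
        ys.append(float(parts[1]))
    return xs, ys

xs, ys = parse_pairs("1,65\n2,70\n\n3,78\n4,85\n5,88")
print(xs)
print(ys)
```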
After generating the plot, you would see points that rise from left to right, indicating a positive relationship between study time and score. The calculator will report an r value close to 1 (about 0.99 for these five points), signaling a strong positive linear correlation. The least-squares regression line is approximately:
score = 6.1 × hours + 58.9 (displayed values may differ slightly depending on rounding).
This means that, on average, each extra hour of study time is associated with about 6.1 additional percentage points on the exam. You can visually check how well the line follows the data and whether any point lies unusually far from the trend.
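The figures for this example can be reproduced by hand or with a short script. The sketch below computes the slope, intercept, and r directly from the five pairs in the table using the standard deviation-sum formulas; the variable names are chosen for this illustration only.

```python
import math

hours = [1, 2, 3, 4, 5]
scores = [65, 70, 78, 85, 88]

n = len(hours)
mean_x = sum(hours) / n     # 3.0
mean_y = sum(scores) / n    # 77.2

sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, scores))
sxx = sum((x - mean_x) ** 2 for x in hours)
syy = sum((y - mean_y) ** 2 for y in scores)

m = sxy / sxx                   # slope
b = mean_y - m * mean_x         # intercept
r = sxy / math.sqrt(sxx * syy)  # Pearson correlation

print(f"slope = {m:.2f}")       # slope = 6.10
print(f"intercept = {b:.2f}")   # intercept = 58.90
print(f"r = {r:.3f}")           # r = 0.991
```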
| Feature | What it shows | Best use | Main limitation |
|---|---|---|---|
| Scatter plot | Individual points for each (x,y) pair. | Spot patterns, clusters, outliers, and general shape of the relationship. | Visual only; does not give a single numeric summary. |
| Pearson r | Single number between -1 and 1 summarizing linear association. | Quickly judge strength and direction of a linear relationship. | Insensitive to nonlinear patterns; can be distorted by outliers and small samples. |
| Regression line | Best-fitting straight line through the data points. | Summarize trend and make approximate predictions within the data range. | Assumes a linear relationship and can mislead if the pattern is curved or heavily influenced by outliers. |
To use the scatter plot generator effectively, it is important to understand its assumptions and the situations where its results may be misleading.
Each data point is entered as an x,y pair. Empty lines are ignored.

For classroom work, quick analysis, and exploratory data visualization, these limitations are usually not a problem. For high-stakes decisions or formal statistical studies, consider using specialized statistical software and consulting a statistics reference or expert.