Logistic regression models binary outcomes with the logistic function p(x) = 1 / (1 + e^(−(ax + b))). Parameters a and b control the slope and offset. By adjusting them, the sigmoid curve can represent probabilities between 0 and 1 for any input x.
Given a set of sample points (x_i, y_i) with y_i either 0 or 1, we determine a and b by minimizing the negative log-likelihood L(a, b) = −Σ [y_i log p(x_i) + (1 − y_i) log(1 − p(x_i))].
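As a concrete reference, the model and loss above can be written in a few lines of Python; the names `predict` and `neg_log_likelihood` are illustrative choices, not part of the calculator's source.

```python
import math

def predict(a, b, x):
    """Sigmoid of the linear score: P(y = 1 | x) = 1 / (1 + exp(-(a*x + b)))."""
    return 1.0 / (1.0 + math.exp(-(a * x + b)))

def neg_log_likelihood(a, b, xs, ys):
    """Negative log-likelihood summed over sample points (xs[i], ys[i])."""
    eps = 1e-12  # keeps log() finite when a prediction saturates at 0 or 1
    return -sum(y * math.log(predict(a, b, x) + eps)
                + (1 - y) * math.log(1 - predict(a, b, x) + eps)
                for x, y in zip(xs, ys))
```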
The gradients of L with respect to a and b yield simple update rules: ∂L/∂a = Σ (p(x_i) − y_i) x_i and ∂L/∂b = Σ (p(x_i) − y_i). Starting with zero parameters, gradient descent iteratively subtracts a fraction of these gradients until convergence. This approach works well for small data sets and illustrates how logistic regression learns to separate classes.
This calculator implements a basic gradient descent optimizer. Input your training points one per line, set a learning rate and iteration count, and click "Fit Model". The algorithm repeatedly computes predicted probabilities, accumulates the gradients, and updates a and b. After training, the resulting parameters are displayed along with the probability estimate for each input point.
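A plain-Python sketch of that training loop might look like the following; it mirrors the description above (zero initialization, accumulated gradients, fixed learning rate), though the calculator's actual implementation may differ in details.

```python
import math

def fit_logistic(xs, ys, lr=0.1, iters=500):
    """Fit p(x) = 1 / (1 + exp(-(a*x + b))) by batch gradient descent."""
    a, b = 0.0, 0.0                       # start from zero parameters
    for _ in range(iters):
        grad_a, grad_b = 0.0, 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a * x + b)))
            grad_a += (p - y) * x         # accumulate dL/da over all points
            grad_b += (p - y)             # accumulate dL/db
        a -= lr * grad_a                  # step against the gradient
        b -= lr * grad_b
    return a, b
```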
While simple, logistic regression remains a cornerstone of statistical modeling. It forms the basis for classification tasks from medical testing to marketing. Understanding its mechanics helps demystify more complex machine-learning algorithms that extend or generalize it. By experimenting with this tool, you can see firsthand how varying the learning rate or number of iterations affects convergence and final accuracy.
A small learning rate produces gradual, stable progress, whereas a large value may overshoot the optimum and fail to converge. If the result oscillates or diverges, try lowering the rate or normalizing your input data. The iteration count controls how many passes gradient descent makes over the dataset; more iterations usually improve accuracy but at the cost of computation time.
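If training oscillates, rescaling the inputs as suggested above often helps; a minimal standardization helper is sketched below (remember to apply the same mean and standard deviation to any new points you evaluate later).

```python
def standardize(xs):
    """Shift and scale inputs to zero mean and unit variance."""
    mean = sum(xs) / len(xs)
    std = (sum((x - mean) ** 2 for x in xs) / len(xs)) ** 0.5 or 1.0
    return [(x - mean) / std for x in xs], mean, std
```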
Once the parameters stabilize, the model defines a decision boundary where the predicted probability equals 0.5. For one-dimensional data this boundary occurs at x = –b/a, since p(x) = 0.5 exactly when ax + b = 0. Points on one side are classified as 1 and those on the other as 0. Plotting your data along with this boundary can reveal whether the classes are well separated or whether additional features may be needed.
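In code, the boundary and the resulting hard labels are one-liners; this sketch assumes a nonzero slope a.

```python
def decision_boundary(a, b):
    """x at which the predicted probability crosses 0.5 (requires a != 0)."""
    return -b / a

def classify(a, b, x):
    """Label a point 1 when the linear score is non-negative, else 0."""
    return 1 if a * x + b >= 0 else 0
```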
Beyond inspecting the raw probabilities, you can compute accuracy, precision, or the F1 score to evaluate model performance. A confusion matrix showing true positives, false positives, true negatives, and false negatives provides deeper insight. These metrics help determine whether the model overfits, underfits, or requires more balanced data.
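For the small datasets this tool targets, a hand-rolled evaluation along these lines is enough; the function names are illustrative.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, tn, fn) counts for binary 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall; 0.0 when undefined."""
    tp, fp, _, fn = confusion_matrix(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0
```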
In practice, logistic regression often includes a regularization term such as L2 penalty to discourage extremely large coefficients. Regularization improves generalization on new data by preventing the model from fitting noise. Multivariate logistic regression extends the concept to multiple features, producing a hyperplane decision boundary. Although this calculator focuses on a single predictor, the underlying math scales naturally.
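An L2 penalty changes only the gradient of the slope; a sketch of a single penalized update is shown below, where lam is a hypothetical regularization strength and the intercept is conventionally left unpenalized.

```python
import math

def l2_step(a, b, xs, ys, lr=0.1, lam=0.01):
    """One gradient-descent step with an L2 penalty lam * a**2 on the slope."""
    grad_a, grad_b = 0.0, 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(a * x + b)))
        grad_a += (p - y) * x
        grad_b += (p - y)
    grad_a += 2 * lam * a                 # penalty term shrinks the slope toward zero
    return a - lr * grad_a, b - lr * grad_b
```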
After fitting the model, use the copy button to preserve the coefficients for documentation or reuse in other scripts. Comparing parameter sets across experiments can highlight how different datasets or learning rates influence the final classifier.
Assume three points: (1,0), (2,0), and (3,1). Running the calculator with learning rate 0.1 for 500 iterations yields fitted values for a and b, and the resulting decision boundary x = –b/a equals roughly 2.27, meaning values above this threshold are classified as 1. Probabilities for the inputs might be 0.01, 0.13, and 0.86 respectively, illustrating how the sigmoid smoothly transitions between classes.
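To reproduce this example outside the calculator, a compact, self-contained script like the one below can be used; the exact parameter values and probabilities depend on implementation details, so treat the numbers quoted above as approximate.

```python
import math

def fit_logistic(xs, ys, lr=0.1, iters=500):
    """Compact batch gradient descent for the single-feature model."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        ga = sum((1 / (1 + math.exp(-(a * x + b))) - y) * x for x, y in zip(xs, ys))
        gb = sum(1 / (1 + math.exp(-(a * x + b))) - y for x, y in zip(xs, ys))
        a, b = a - lr * ga, b - lr * gb
    return a, b

a, b = fit_logistic([1, 2, 3], [0, 0, 1])
print("decision boundary:", -b / a)                        # threshold separating the classes
for x in (1, 2, 3):
    print(x, round(1 / (1 + math.exp(-(a * x + b))), 2))   # predicted probabilities
```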
| Approach | When to Use | Pros | Cons |
|---|---|---|---|
| Logistic Regression | Binary outcomes with linear boundary | Fast, interpretable | Struggles with complex relationships |
| Linear Regression | Continuous targets | Simple math | Not bounded 0–1; poor for classification |
| Naive Bayes | Text or categorical data | Handles many features | Assumes independence |
This tool fits a single-feature model without regularization, sample weighting, or intercept constraints. Convergence is not guaranteed for poorly scaled data or extreme learning rates. Real-world problems often require feature engineering, cross-validation, and assessment of class imbalance. Treat outputs as educational estimates rather than production-ready models.