Logistic regression models binary outcomes with the logistic function p(x) = 1 / (1 + e^(−(ax + b))). Parameters a and b control the slope and offset. By adjusting them, the sigmoid curve can represent probabilities between 0 and 1 for any input x.
Given a set of sample points (x_i, y_i) with y_i either 0 or 1, we determine a and b by minimizing the negative log-likelihood L(a, b) = −Σ [y_i log p(x_i) + (1 − y_i) log(1 − p(x_i))].
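As a concrete reference, the model and loss above can be written in a few lines of Python; the names `predict` and `neg_log_likelihood` are illustrative choices, not part of the calculator's source.

```python
import math

def predict(a, b, x):
    """Sigmoid of the linear score: P(y = 1 | x) = 1 / (1 + exp(-(a*x + b)))."""
    return 1.0 / (1.0 + math.exp(-(a * x + b)))

def neg_log_likelihood(a, b, xs, ys):
    """Negative log-likelihood summed over sample points (xs[i], ys[i])."""
    eps = 1e-12  # keeps log() finite when a prediction saturates at 0 or 1
    return -sum(y * math.log(predict(a, b, x) + eps)
                + (1 - y) * math.log(1 - predict(a, b, x) + eps)
                for x, y in zip(xs, ys))
```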
The gradients of L with respect to a and b yield simple update rules: ∂L/∂a = Σ (p(x_i) − y_i) x_i and ∂L/∂b = Σ (p(x_i) − y_i). Starting with zero parameters, gradient descent iteratively subtracts a fraction of these gradients until convergence. This approach works well for small data sets and illustrates how logistic regression learns to separate classes.
This calculator implements a basic gradient descent optimizer. Input your training points one per line, set a learning rate and iteration count, and click "Fit Model". The algorithm repeatedly computes predicted probabilities, accumulates the gradients, and updates a and b. After training, the resulting parameters are displayed along with the probability estimate for each input point.
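A plain-Python sketch of that training loop might look like the following; it mirrors the description above (zero initialization, accumulated gradients, fixed learning rate), though the calculator's actual implementation may differ in details.

```python
import math

def fit_logistic(xs, ys, lr=0.1, iters=500):
    """Fit p(x) = 1 / (1 + exp(-(a*x + b))) by batch gradient descent."""
    a, b = 0.0, 0.0                       # start from zero parameters
    for _ in range(iters):
        grad_a, grad_b = 0.0, 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a * x + b)))
            grad_a += (p - y) * x         # accumulate dL/da over all points
            grad_b += (p - y)             # accumulate dL/db
        a -= lr * grad_a                  # step against the gradient
        b -= lr * grad_b
    return a, b
```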
While simple, logistic regression remains a cornerstone of statistical modeling. It forms the basis for classification tasks from medical testing to marketing. Understanding its mechanics helps demystify more complex machine-learning algorithms that extend or generalize it. By experimenting with this tool, you can see firsthand how varying the learning rate or number of iterations affects convergence and final accuracy.
A small learning rate produces gradual, stable progress, whereas a large value may overshoot the optimum and fail to converge. If the result oscillates or diverges, try lowering the rate or normalizing your input data. The iteration count controls how many passes gradient descent makes over the dataset; more iterations usually improve accuracy but at the cost of computation time.
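If training oscillates, rescaling the inputs as suggested above often helps; a minimal standardization helper is sketched below (remember to apply the same mean and standard deviation to any new points you evaluate later).

```python
def standardize(xs):
    """Shift and scale inputs to zero mean and unit variance."""
    mean = sum(xs) / len(xs)
    std = (sum((x - mean) ** 2 for x in xs) / len(xs)) ** 0.5 or 1.0
    return [(x - mean) / std for x in xs], mean, std
```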
Once the parameters stabilize, the model defines a decision boundary where the predicted probability equals 0.5. For one-dimensional data this boundary occurs at x = –b/a, since p(x) = 0.5 exactly when ax + b = 0. Points on one side are classified as 1 and those on the other as 0. Plotting your data along with this boundary can reveal whether the classes are well separated or whether additional features may be needed.
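In code, the boundary and the resulting hard labels are one-liners; this sketch assumes a nonzero slope a.

```python
def decision_boundary(a, b):
    """x at which the predicted probability crosses 0.5 (requires a != 0)."""
    return -b / a

def classify(a, b, x):
    """Label a point 1 when the linear score is non-negative, else 0."""
    return 1 if a * x + b >= 0 else 0
```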
Beyond inspecting the raw probabilities, you can compute accuracy, precision, or the F1 score to evaluate model performance. A confusion matrix showing true positives, false positives, true negatives, and false negatives provides deeper insight. These metrics help determine whether the model overfits, underfits, or requires more balanced data.
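For the small datasets this tool targets, a hand-rolled evaluation along these lines is enough; the function names are illustrative.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, tn, fn) counts for binary 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall; 0.0 when undefined."""
    tp, fp, _, fn = confusion_matrix(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0
```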
In practice, logistic regression often includes a regularization term such as L2 penalty to discourage extremely large coefficients. Regularization improves generalization on new data by preventing the model from fitting noise. Multivariate logistic regression extends the concept to multiple features, producing a hyperplane decision boundary. Although this calculator focuses on a single predictor, the underlying math scales naturally.
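An L2 penalty changes only the gradient of the slope; a sketch of a single penalized update is shown below, where lam is a hypothetical regularization strength and the intercept is conventionally left unpenalized.

```python
import math

def l2_step(a, b, xs, ys, lr=0.1, lam=0.01):
    """One gradient-descent step with an L2 penalty lam * a**2 on the slope."""
    grad_a, grad_b = 0.0, 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(a * x + b)))
        grad_a += (p - y) * x
        grad_b += (p - y)
    grad_a += 2 * lam * a                 # penalty term shrinks the slope toward zero
    return a - lr * grad_a, b - lr * grad_b
```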
After fitting the model, use the copy button to preserve the coefficients for documentation or reuse in other scripts. Comparing parameter sets across experiments can highlight how different datasets or learning rates influence the final classifier.
Assume three points: (1,0), (2,0), and (3,1). Running the calculator with learning rate 0.1 for 500 iterations yields fitted values for a and b, and the resulting decision boundary x = –b/a equals roughly 2.27, meaning values above this threshold are classified as 1. Probabilities for the inputs might be 0.01, 0.13, and 0.86 respectively, illustrating how the sigmoid smoothly transitions between classes.
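To reproduce this example outside the calculator, a compact, self-contained script like the one below can be used; the exact parameter values and probabilities depend on implementation details, so treat the numbers quoted above as approximate.

```python
import math

def fit_logistic(xs, ys, lr=0.1, iters=500):
    """Compact batch gradient descent for the single-feature model."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        ga = sum((1 / (1 + math.exp(-(a * x + b))) - y) * x for x, y in zip(xs, ys))
        gb = sum(1 / (1 + math.exp(-(a * x + b))) - y for x, y in zip(xs, ys))
        a, b = a - lr * ga, b - lr * gb
    return a, b

a, b = fit_logistic([1, 2, 3], [0, 0, 1])
print("decision boundary:", -b / a)                        # threshold separating the classes
for x in (1, 2, 3):
    print(x, round(1 / (1 + math.exp(-(a * x + b))), 2))   # predicted probabilities
```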
| Approach | When to Use | Pros | Cons |
|---|---|---|---|
| Logistic Regression | Binary outcomes with linear boundary | Fast, interpretable | Struggles with complex relationships |
| Linear Regression | Continuous targets | Simple math | Not bounded 0–1; poor for classification |
| Naive Bayes | Text or categorical data | Handles many features | Assumes independence |
This tool fits a single-feature model without regularization, sample weighting, or intercept constraints. Convergence is not guaranteed for poorly scaled data or extreme learning rates. Real-world problems often require feature engineering, cross-validation, and assessment of class imbalance. Treat outputs as educational estimates rather than production-ready models.