slsqp_jax package

Top-level re-exports. Most users import from this namespace; the submodule pages below describe where each symbol actually lives.

slsqp_jax

SLSQP-JAX: Sequential Least Squares Programming in pure JAX.

This package provides a GPU-compatible implementation of the SLSQP optimization algorithm using JAX and the Optimistix framework. Designed for large-scale problems (n > 5000) using matrix-free L-BFGS Hessian approximation and projected conjugate gradient for the QP subproblem.

class slsqp_jax.SLSQP[source]

Bases: AbstractMinimiser

SLSQP minimiser using Sequential Quadratic Programming.

See README.md and docs/source/index.md for the full algorithmic description. The user-facing API collapses the legacy 40+ flat keyword arguments into a single SLSQPConfig instance grouping the parameters by purpose; consult slsqp_jax.config for the sub-config dataclasses.

Attributes:
config: Aggregate configuration. Defaults to

SLSQPConfig with all sub-config defaults.

eq_constraint_fn: Function (x, args) -> c_eq(x) evaluated

for equality-constraint feasibility c_eq(x) = 0.

ineq_constraint_fn: Function (x, args) -> c_ineq(x)

evaluated for inequality-constraint feasibility c_ineq(x) >= 0.

n_eq_constraints: Number of equality constraints (static). n_ineq_constraints: Number of inequality constraints (static). bounds: Optional (n, 2) array of [lower, upper] per

variable; iterates are projected onto this box after every step. Use -inf / +inf for unbounded dimensions.

obj_grad_fn / eq_jac_fn / ineq_jac_fn / obj_hvp_fn / eq_hvp_fn / ineq_hvp_fn: Optional user-supplied derivative

callables; the AD fallbacks (jax.grad / jax.jacrev / forward-over-reverse jvp(grad(.))) are used when these are None.

inner_solver: Optional pluggable inner equality-constrained

QP solver. None constructs a default ProjectedCGCholesky derived from config.

verbose: True/False or a custom (**kwargs) -> None

callable for per-step diagnostics.

norm() Shaped[Array, '']

Compute the L-infinity norm of a PyTree of arrays.

This is the largest absolute elementwise value. Considering the input x as a flat vector (x_1, …, x_n), then this computes max_i |x_i|.

Parameters:

x (PyTree[jax.Array | numpy.ndarray | numpy.bool | numpy.number | bool | int | float | complex | jax._src.literals.TypedNdArray])

Return type:

Shaped[Array, ‘’]

config: SLSQPConfig
eq_constraint_fn: Callable[[Float[Array, 'n'], Any], Float[Array, 'm']] | None = None
ineq_constraint_fn: Callable[[Float[Array, 'n'], Any], Float[Array, 'm']] | None = None
n_eq_constraints: int = 0
n_ineq_constraints: int = 0
bounds: Float[Array, 'n 2'] | None = None
obj_grad_fn: Callable[[Float[Array, 'n'], Any], Float[Array, 'n']] | None = None
eq_jac_fn: Callable[[Float[Array, 'n'], Any], Float[Array, 'm n']] | None = None
ineq_jac_fn: Callable[[Float[Array, 'n'], Any], Float[Array, 'm n']] | None = None
obj_hvp_fn: Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'n']] | None = None
eq_hvp_fn: Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'm n']] | None = None
ineq_hvp_fn: Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'm n']] | None = None
inner_solver: AbstractInnerSolver | None = None
verbose: Callable = False
property rtol: float
property atol: float
property max_steps: int
property min_steps: int
property stagnation_tol: float
property divergence_factor: float
property divergence_patience: int
property lbfgs_memory: int
property damping_threshold: float
property lbfgs_diag_floor: float
property lbfgs_diag_ceil: float
property line_search_max_steps: int
property armijo_c1: float
property ls_failure_patience: int
property qp_max_iter: int
property qp_max_cg_iter: int
property qp_failure_patience: int
property zero_step_patience: int
property qp_ping_pong_threshold: int
property mult_drop_floor: float
property cg_regularization: float
property use_exact_hvp_in_qp: bool
property proximal_tau: float
property proximal_mu_min: float | None
property proximal_mu_max: float
property use_preconditioner: bool
property preconditioner_type: str
property diagonal_n_probes: int
property active_set_method: str
property lpeca_sigma: float
property lpeca_beta: float | None
property lpeca_use_lp: bool
property lpeca_trust_threshold: float
property lpeca_warmup_steps: int
property lpeca_predict_bounds: bool
property adaptive_cg_tol: bool
property use_inexact_stationarity: bool
init(fn, y, args, options, f_struct, aux_struct, tags)[source]

Perform all initial computation needed to initialise the solver state.

For example, the [optimistix.Chord][] method computes the Jacobian df/dy with respect to the initial guess y, and then uses it throughout the computation.

Arguments:

  • fn: The function to iterate over. This is expected to take two argumetns

    fn(y, args) and return a pytree of arrays in the first element, and any auxiliary data in the second argument.

  • y: The value of y at the current (first) iteration.

  • args: Passed as the args of fn(y, args).

  • options: Individual solvers may accept additional runtime arguments.

    See each individual solver’s documentation for more details.

  • f_struct: A pytree of `jax.ShapeDtypeStruct`s of the same shape as the

    output of fn. This is used to initialise any information in the state which may rely on the pytree structure, array shapes, or dtype of the output of fn.

  • aux_struct: A pytree of `jax.ShapeDtypeStruct`s of the same shape as the

    auxiliary data returned by fn.

  • tags: exact meaning depends on whether this is a fixed point, root find,

    least squares, or minimisation problem; see their relevant entry points.

Returns:

A PyTree representing the initial state of the solver.

Return type:

SLSQPState

Parameters:
step(fn, y, args, options, state, tags)[source]

Perform one step of the iterative solve.

Arguments:

  • fn: The function to iterate over. This is expected to take two argumetns

    fn(y, args) and return a pytree of arrays in the first element, and any auxiliary data in the second argument.

  • y: The value of y at the current (first) iteration.

  • args: Passed as the args of fn(y, args).

  • options: Individual solvers may accept additional runtime arguments.

    See each individual solver’s documentation for more details.

  • state: A pytree representing the state of a solver. The shape of this

    pytree is solver-dependent.

  • tags: exact meaning depends on whether this is a fixed point, root find,

    least squares, or minimisation problem; see their relevant entry points.

Returns:

A 3-tuple containing the new y value in the first element, the next solver state in the second element, and the aux output of fn(y, args) in the third element.

Return type:

tuple[Float[Array, 'n'], SLSQPState, Any]

Parameters:
terminate(fn, y, args, options, state, tags)[source]

Determine whether or not to stop the iterative solve.

Arguments:

  • fn: The function to iterate over. This is expected to take two argumetns

    fn(y, args) and return a pytree of arrays in the first element, and any auxiliary data in the second argument.

  • y: The value of y at the current iteration.

  • args: Passed as the args of fn(y, args).

  • options: Individual solvers may accept additional runtime arguments.

    See each individual solver’s documentation for more details.

  • state: A pytree representing the state of a solver. The shape of this

    pytree is solver-dependent.

  • tags: exact meaning depends on whether this is a fixed point, root find,

    least squares, or minimisation problem; see their relevant entry points.

Returns:

A 2-tuple containing a bool indicating whether or not to stop iterating in the first element, and an [optimistix.RESULTS][] object in the second element.

Return type:

tuple[Bool[Array, ''], Any]

Parameters:
postprocess(fn, y, aux, args, options, state, tags, result)[source]

Any final postprocessing to perform on the result of the solve.

Arguments:

  • fn: The function to iterate over. This is expected to take two argumetns

    fn(y, args) and return a pytree of arrays in the first element, and any auxiliary data in the second argument.

  • y: The value of y at the last iteration.

  • aux: The auxiliary output at the last iteration.

  • args: Passed as the args of fn(y, args).

  • options: Individual solvers may accept additional runtime arguments.

    See each individual solver’s documentation for more details.

  • state: A pytree representing the final state of a solver. The shape of this

    pytree is solver-dependent.

  • tags: exact meaning depends on whether this is a fixed point, root find,

    least squares, or minimisation problem; see their relevant entry points.

  • result: as returned by the final call to terminate.

Returns:

A 3-tuple of:

  • final_y: the final y to return as the solution of the solve.

  • final_aux: the final aux to return as the auxiliary output of the solve.

  • stats: any additional information to place in the sol.stats dictionary.

!!! info

Most solvers will not need to use this, so that this method may be defined as: ```python def postprocess(self, fn, y, aux, args, options, state, tags, result):

return y, aux, {}

```

Return type:

tuple[Float[Array, 'n'], Any, dict[str, Any]]

Parameters:
__init__(norm=<function max_norm>, config=<factory>, eq_constraint_fn=None, ineq_constraint_fn=None, n_eq_constraints=0, n_ineq_constraints=0, bounds=None, _lower_bound_mask=None, _upper_bound_mask=None, _n_lower_bounds=0, _n_upper_bounds=0, _lower_indices=None, _upper_indices=None, obj_grad_fn=None, eq_jac_fn=None, ineq_jac_fn=None, obj_hvp_fn=None, eq_hvp_fn=None, ineq_hvp_fn=None, _grad_impl=None, _eq_jac_impl=None, _ineq_jac_impl=None, _eq_hvp_contrib_impl=None, _ineq_hvp_contrib_impl=None, _obj_hvp_impl=None, inner_solver=None, _stagnation_window=10, _proximal_mu_min=1e-06, _proximal_mu_max=0.1, verbose=False)
Parameters:
  • norm (Callable)

  • config (SLSQPConfig)

  • eq_constraint_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm']] | None)

  • ineq_constraint_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm']] | None)

  • n_eq_constraints (int)

  • n_ineq_constraints (int)

  • bounds (Float[Array, 'n 2'] | None)

  • _lower_bound_mask (tuple[bool, ...] | None)

  • _upper_bound_mask (tuple[bool, ...] | None)

  • _n_lower_bounds (int)

  • _n_upper_bounds (int)

  • _lower_indices (tuple[int, ...] | None)

  • _upper_indices (tuple[int, ...] | None)

  • obj_grad_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'n']] | None)

  • eq_jac_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm n']] | None)

  • ineq_jac_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm n']] | None)

  • obj_hvp_fn (Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'n']] | None)

  • eq_hvp_fn (Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'm n']] | None)

  • ineq_hvp_fn (Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'm n']] | None)

  • _grad_impl (Callable)

  • _eq_jac_impl (Callable)

  • _ineq_jac_impl (Callable)

  • _eq_hvp_contrib_impl (Callable)

  • _ineq_hvp_contrib_impl (Callable)

  • _obj_hvp_impl (Callable | None)

  • inner_solver (AbstractInnerSolver | None)

  • _stagnation_window (int)

  • _proximal_mu_min (float)

  • _proximal_mu_max (float)

  • verbose (Callable)

Return type:

None

class slsqp_jax.SLSQPState[source]

Bases: Module

State for the SLSQP solver.

This is a JAX PyTree (via eqx.Module) that holds all mutable state needed across SLSQP iterations.

Attributes:

step_count: Current iteration number. f_val: Current objective function value f(x_k). grad: Gradient of objective at current point. eq_val: Equality constraint values c_eq(x_k). ineq_val: Inequality constraint values c_ineq(x_k). eq_jac: Jacobian of equality constraints at x_k. ineq_jac: Jacobian of inequality constraints at x_k. lbfgs_history: L-BFGS history for matrix-free Hessian approximation. multipliers_eq_qp: QP-recovered equality multipliers

λ_QP = (A_k A_kᵀ)⁻¹ A_k (B d + g_k) from the active-set QP solver. Consumed by Han-Powell’s penalty rule (update_penalty_parameter), the LPEC-A predictor’s rho_bar proximity measure, and the active-set warm-start of the next QP. Carries O(s_f / s_eq · cond(B)) noise when the L-BFGS Hessian is poorly conditioned or when auto-scaling inflates the multiplier scale; do not use for ∇f Jᵀλ stationarity checks.

multipliers_ineq_qp: QP-recovered inequality multipliers

(general-inequality block + bound block). The bound block comes from the post-line-search recovery in slsqp_jax.slsqp.bounds.recover_bound_multipliers() so it agrees with multipliers_ineq_ls for the bound rows; only the general-inequality block carries the QP noise. Consumed by Han-Powell + LPEC-A + warm-start.

multipliers_eq_ls: Hessian-free least-squares multipliers

λ_LS = (J(x_{k+1}) J(x_{k+1})ᵀ)⁻¹ J(x_{k+1}) ∇f(x_{k+1}) recovered post-line-search at the accepted iterate (see slsqp_jax.slsqp.multipliers). Independent of B, d and α. Consumed by the L-BFGS secant pair (so the same vector is used at both endpoints) and by _terminate_impl’s filterSQP-style stationarity denominator (the multipliers feed both the Lagrangian gradient numerator and the mu_max denominator from slsqp_jax.slsqp.termination.compute_mu_max()). These are the multipliers exposed via Solution.stats["multipliers_eq"] because they are the stationarity-quality estimate. At init() time both _qp and _ls are seeded from the same lstsq solution at x_0.

multipliers_ineq_ls: Hessian-free LS inequality multipliers

(general-inequality block from slsqp_jax.slsqp.multipliers with active-row max(0, ·) clamp + bound block from slsqp_jax.slsqp.bounds.recover_bound_multipliers()). At init() initialised to zeros — bound blocks only get populated once step() runs the post-step recovery.

kkt_residual_grad: Lagrangian-gradient snapshot consumed by the

next QP subproblem as the KKT-residual proxy (kkt_residual = ||state.kkt_residual_grad|| inside SLSQP._solve_qp_subproblem()). Currently always written equal to grad_lagrangian at the end of each step; the field is kept separate from grad_lagrangian purely so that future variants (e.g. carrying a previous iterate’s Lagrangian gradient instead of the current one) can be introduced without re-shaping the state pytree.

grad_lagrangian: Current gradient of the Lagrangian evaluated at

the accepted iterate using the LS multipliers (matching the L-BFGS secant pair). Reused by terminate so the stationarity check does not fall out of sync with the L-BFGS secant pair.

merit_penalty: Current penalty parameter for L1 merit function. bound_jac: Constant Jacobian for bound constraints (computed once in init). qp_iterations: Total accumulated QP active-set iterations across all steps. qp_converged: Whether the most recent QP solve converged. prev_active_set: Active inequality constraint set from the previous QP solve,

used for warm-starting the next QP subproblem.

termination_code: Granular slsqp_jax.RESULTS classification

(successful / merit_stagnation / line_search_failure / iterate_blowup / infeasible / qp_subproblem_failure / nonlinear_max_steps_reached / nonfinite). Surfaced via Solution.stats["slsqp_result"]; Solution.result itself remains the coarse optx.RESULTS code accepted by optimistix’s driver.

step_count: Int[Array, '']
f_val: Float[Array, '']
grad: Float[Array, 'n']
eq_val: Float[Array, 'm_eq']
ineq_val: Float[Array, 'm_ineq']
eq_jac: Float[Array, 'm_eq n']
ineq_jac: Float[Array, 'm_ineq n']
lbfgs_history: LBFGSHistory
multipliers_eq_qp: Float[Array, 'm_eq']
multipliers_ineq_qp: Float[Array, 'm_ineq']
multipliers_eq_ls: Float[Array, 'm_eq']
multipliers_ineq_ls: Float[Array, 'm_ineq']
kkt_residual_grad: Float[Array, 'n']
grad_lagrangian: Float[Array, 'n']
merit_penalty: Float[Array, '']
bound_jac: Float[Array, 'm_bounds n']
qp_iterations: Int[Array, '']
qp_converged: Bool[Array, '']
prev_active_set: Bool[Array, 'm_ineq']
consecutive_qp_failures: Int[Array, '']
consecutive_ls_failures: Int[Array, '']
consecutive_zero_steps: Int[Array, '']
qp_optimal: Bool[Array, '']
best_merit: Float[Array, '']
steps_without_improvement: Int[Array, '']
stagnation: Bool[Array, '']
last_alpha: Float[Array, '']
last_projected_grad_norm: Float[Array, '']
ls_success: Bool[Array, '']
ls_fatal: Bool[Array, '']
qp_fatal: Bool[Array, '']
best_x: Float[Array, 'n']
blowup_count: Int[Array, '']
diverging: Bool[Array, '']
termination_code: Any
diagnostics: SLSQPDiagnostics
__init__(step_count, f_val, grad, eq_val, ineq_val, eq_jac, ineq_jac, lbfgs_history, multipliers_eq_qp, multipliers_ineq_qp, multipliers_eq_ls, multipliers_ineq_ls, kkt_residual_grad, grad_lagrangian, merit_penalty, bound_jac, qp_iterations, qp_converged, prev_active_set, consecutive_qp_failures, consecutive_ls_failures, consecutive_zero_steps, qp_optimal, best_merit, steps_without_improvement, stagnation, last_alpha, last_projected_grad_norm, ls_success, ls_fatal, qp_fatal, best_x, blowup_count, diverging, termination_code, diagnostics)
Parameters:
  • step_count (Int[Array, ''])

  • f_val (Float[Array, ''])

  • grad (Float[Array, 'n'])

  • eq_val (Float[Array, 'm_eq'])

  • ineq_val (Float[Array, 'm_ineq'])

  • eq_jac (Float[Array, 'm_eq n'])

  • ineq_jac (Float[Array, 'm_ineq n'])

  • lbfgs_history (LBFGSHistory)

  • multipliers_eq_qp (Float[Array, 'm_eq'])

  • multipliers_ineq_qp (Float[Array, 'm_ineq'])

  • multipliers_eq_ls (Float[Array, 'm_eq'])

  • multipliers_ineq_ls (Float[Array, 'm_ineq'])

  • kkt_residual_grad (Float[Array, 'n'])

  • grad_lagrangian (Float[Array, 'n'])

  • merit_penalty (Float[Array, ''])

  • bound_jac (Float[Array, 'm_bounds n'])

  • qp_iterations (Int[Array, ''])

  • qp_converged (Bool[Array, ''])

  • prev_active_set (Bool[Array, 'm_ineq'])

  • consecutive_qp_failures (Int[Array, ''])

  • consecutive_ls_failures (Int[Array, ''])

  • consecutive_zero_steps (Int[Array, ''])

  • qp_optimal (Bool[Array, ''])

  • best_merit (Float[Array, ''])

  • steps_without_improvement (Int[Array, ''])

  • stagnation (Bool[Array, ''])

  • last_alpha (Float[Array, ''])

  • last_projected_grad_norm (Float[Array, ''])

  • ls_success (Bool[Array, ''])

  • ls_fatal (Bool[Array, ''])

  • qp_fatal (Bool[Array, ''])

  • best_x (Float[Array, 'n'])

  • blowup_count (Int[Array, ''])

  • diverging (Bool[Array, ''])

  • termination_code (Any)

  • diagnostics (SLSQPDiagnostics)

Return type:

None

class slsqp_jax.SLSQPDiagnostics[source]

Bases: Module

Diagnostic counters and statistics accumulated during an SLSQP run.

These fields are populated inside step() without Python-side branching, and can be inspected by the user after a solve to decide whether the optimistix result code is meaningful (e.g. to detect a RESULTS.successful that actually came from chronic line-search failure rather than real convergence).

Attributes:
n_qp_inner_failures: Number of iterations where the QP solver

reported converged=False.

n_ls_failures: Number of iterations where the line search

reported success=False.

n_lbfgs_skips: Number of iterations where the L-BFGS append

skipped the new curvature pair.

n_nan_directions: Number of iterations where the QP returned

a non-finite search direction.

max_gamma: Maximum L-BFGS gamma observed across iterations. min_diag: Minimum L-BFGS per-variable diagonal entry observed. max_diag: Maximum L-BFGS per-variable diagonal entry observed. eq_jac_min_sv_est: A lower bound on the smallest singular value

of J_eq estimated from the Cholesky factor of J_eq J_eq^T. Small values indicate near rank-deficiency.

ls_alpha_min: Smallest line-search step accepted across the run. tail_ls_failures: Consecutive line-search failure count at

termination. Non-zero values suggest stagnation.

n_bound_fix_solves: Total number of non-trivial bound-fixing inner

solves that actually ran (no-op passes are skipped via jax.lax.cond and do not count). Growing unboundedly is a sign that the bound active set never stabilises.

max_bound_fixed: Largest number of variables pinned to their box

bounds across all iterations (from the final per-iteration free_mask).

max_active_ineq: Largest number of active general inequalities

observed across iterations.

n_merit_regressions: Number of iterations where the L1 merit

function value increased despite the line search reporting success. A non-zero count points to merit-function mis-calibration (too-small rho) or line-search slippage.

n_qp_budget_exhausted: Number of SQP steps where the QP

active-set loop hit qp_max_iter (i.e. exited because the budget ran out, not because of clean convergence or a ping-pong short-circuit). A non-zero count usually indicates degeneracy, multiplier-noise cycling, or an LPEC-A over-prediction that was not caught by the trust gate; consider raising mult_drop_floor or tightening lpeca_trust_threshold.

n_qp_ping_pong: Number of SQP steps where the QP loop’s

anti-cycling ping-pong detector fired. These are good short-circuits (the loop avoided wasting its full budget) but a chronic non-zero value still points to a degenerate constraint pair worth investigating.

max_qp_iterations: Peak active-set iteration count observed

across all SQP steps. Useful for sizing qp_max_iter: if this is consistently equal to qp_max_iter the budget is too tight.

max_qp_active_size: Peak |active_set| (general

inequalities only) observed across all SQP steps.

n_lpeca_bypassed: Number of SQP steps where the LPEC-A

prediction was skipped, either because of the warm-up window (step_count < lpeca_warmup_steps) or because the trust gate vetoed it (rho_bar > lpeca_trust_threshold). Always 0 when active_set_method == "expand".

n_lpeca_capped: Number of SQP steps where the LPEC-A

rank-aware size cap truncated the prediction.

n_lpeca_bounds_prefixed: Cumulative count of box-bound

pre-fixes contributed by LPEC-A across all SQP steps (summed over the bound predictions that survived the trust / warm-up gates and were installed as the initial free_mask for the bound-fixing loop). Always 0 when active_set_method == "expand" or when lpeca_predict_bounds=False.

n_proj_refinements: Cumulative number of M-metric projection

refinement rounds taken across all inner solves performed by MinresQLPSolver over the run. Always 0 for null-space CG / CRAIG.

max_proj_residual: High-water mark of the post-refinement

constraint residual ||A d - b|| reported by MinresQLPSolver on the accepted QP direction across all SQP steps.

n_divergence_blowups: Total number of merit blowup events

observed across the run, whether or not the patience threshold was eventually reached.

divergence_triggered: True when the best-iterate divergence

rollback fired at least once during the run.

min_projected_grad_norm: Low-water mark of the inner solver’s

projected-gradient norm ||W̃_k g_k|| across the run. Always inf for inner solvers other than HRInexactSTCG. Surfaced for post-hoc inspection of how close the run actually got to KKT in the inexact-projection sense, regardless of whether use_inexact_stationarity was on.

n_steps_inexact_below_classical: Number of iterations where the

inner solver’s projected-gradient norm was strictly smaller than the classical Lagrangian gradient norm. A high count indicates that multiplier-recovery noise is the limiting factor and the user might benefit from use_inexact_stationarity=True paired with HRInexactSTCG.

n_qp_inner_failures: Int[Array, '']
n_ls_failures: Int[Array, '']
n_lbfgs_skips: Int[Array, '']
n_nan_directions: Int[Array, '']
max_gamma: Float[Array, '']
min_diag: Float[Array, '']
max_diag: Float[Array, '']
eq_jac_min_sv_est: Float[Array, '']
ls_alpha_min: Float[Array, '']
tail_ls_failures: Int[Array, '']
n_bound_fix_solves: Int[Array, '']
max_bound_fixed: Int[Array, '']
max_active_ineq: Int[Array, '']
n_merit_regressions: Int[Array, '']
n_qp_budget_exhausted: Int[Array, '']
n_qp_ping_pong: Int[Array, '']
max_qp_iterations: Int[Array, '']
max_qp_active_size: Int[Array, '']
n_lpeca_bypassed: Int[Array, '']
n_lpeca_capped: Int[Array, '']
n_lpeca_bounds_prefixed: Int[Array, '']
n_proj_refinements: Int[Array, '']
max_proj_residual: Float[Array, '']
n_divergence_blowups: Int[Array, '']
divergence_triggered: Bool[Array, '']
min_projected_grad_norm: Float[Array, '']
n_steps_inexact_below_classical: Int[Array, '']
__init__(n_qp_inner_failures, n_ls_failures, n_lbfgs_skips, n_nan_directions, max_gamma, min_diag, max_diag, eq_jac_min_sv_est, ls_alpha_min, tail_ls_failures, n_bound_fix_solves, max_bound_fixed, max_active_ineq, n_merit_regressions, n_qp_budget_exhausted, n_qp_ping_pong, max_qp_iterations, max_qp_active_size, n_lpeca_bypassed, n_lpeca_capped, n_lpeca_bounds_prefixed, n_proj_refinements, max_proj_residual, n_divergence_blowups, divergence_triggered, min_projected_grad_norm, n_steps_inexact_below_classical)
Parameters:
  • n_qp_inner_failures (Int[Array, ''])

  • n_ls_failures (Int[Array, ''])

  • n_lbfgs_skips (Int[Array, ''])

  • n_nan_directions (Int[Array, ''])

  • max_gamma (Float[Array, ''])

  • min_diag (Float[Array, ''])

  • max_diag (Float[Array, ''])

  • eq_jac_min_sv_est (Float[Array, ''])

  • ls_alpha_min (Float[Array, ''])

  • tail_ls_failures (Int[Array, ''])

  • n_bound_fix_solves (Int[Array, ''])

  • max_bound_fixed (Int[Array, ''])

  • max_active_ineq (Int[Array, ''])

  • n_merit_regressions (Int[Array, ''])

  • n_qp_budget_exhausted (Int[Array, ''])

  • n_qp_ping_pong (Int[Array, ''])

  • max_qp_iterations (Int[Array, ''])

  • max_qp_active_size (Int[Array, ''])

  • n_lpeca_bypassed (Int[Array, ''])

  • n_lpeca_capped (Int[Array, ''])

  • n_lpeca_bounds_prefixed (Int[Array, ''])

  • n_proj_refinements (Int[Array, ''])

  • max_proj_residual (Float[Array, ''])

  • n_divergence_blowups (Int[Array, ''])

  • divergence_triggered (Bool[Array, ''])

  • min_projected_grad_norm (Float[Array, ''])

  • n_steps_inexact_below_classical (Int[Array, ''])

Return type:

None

slsqp_jax.get_diagnostics(state)[source]

Return the SLSQPDiagnostics accumulator from a final state.

Use this after optimistix.minimise (or a manual solve / step loop) to inspect solver health indicators that the Optimistix result code alone does not expose. See SLSQPDiagnostics for field meanings.

Return type:

SLSQPDiagnostics

Parameters:

state (SLSQPState)

class slsqp_jax.QPResult[source]

Bases: Module

Outer-facing result of solving the SLSQP QP subproblem.

Returned by SLSQP._solve_qp_subproblem() after the bound-fixing loop has post-processed the direction returned by solve_qp().

Attributes:

direction: The search direction d from the QP solution. multipliers_eq: Lagrange multipliers for equality constraints. multipliers_ineq: Lagrange multipliers for inequality constraints. active_set: Boolean mask of active inequality constraints at the solution. converged: Whether the QP solver converged successfully. iterations: Number of active-set iterations taken. bound_fix_solves: Number of non-trivial bound-fixing passes that

actually ran the reduced-space inner solve (passes where the free mask did not change are short-circuited). Useful for debugging bound-handling overhead; 0 when the problem has no bound constraints.

n_bound_fixed: Number of variables pinned to a box bound in the

final QP direction (counts both lower- and upper-bound activations).

ping_ponged: True when the QP active-set loop short-circuited

on a detected add/drop ping-pong cycle.

reached_max_iter: True when the QP active-set loop exhausted

its iteration budget (qp_max_iter).

lpeca_bypassed: True when the LPEC-A prediction was skipped

for this step (warm-up window or trust gate fired).

lpeca_capped: True when the LPEC-A prediction was truncated

by the rank-aware size cap.

n_lpeca_bounds_prefixed: Number of box-bound variables

pre-fixed by LPEC-A before entering the bound-fixing loop. Always 0 when active_set_method == "expand", when lpeca_predict_bounds=False, or when LPEC-A was bypassed by the warm-up / trust gates.

proj_residual: Post-refinement constraint residual ``||A d -

b||`` from the inner solver producing the final QP direction (after bound-fixing). Always 0 for null-space CG / CRAIG; relevant for MinresQLPSolver where it reflects the M-metric range-space projection floor and feeds the outer divergence detector via SLSQPDiagnostics.max_proj_residual.

n_proj_refinements: Cumulative number of M-metric projection

refinement rounds across all inner solves performed for this QP step (active-set loop + bound-fixing loop). Always 0 for null-space solvers.

projected_grad_norm: Latest inner-solver projected-gradient

norm (HRInexactSTCG only; inf otherwise).

direction: Float[Array, 'n']
multipliers_eq: Float[Array, 'm_eq']
multipliers_ineq: Float[Array, 'm_ineq']
active_set: Bool[Array, 'm_ineq']
converged: Bool[Array, '']
iterations: Int[Array, '']
bound_fix_solves: Int[Array, '']
n_bound_fixed: Int[Array, '']
ping_ponged: Bool[Array, '']
reached_max_iter: Bool[Array, '']
lpeca_bypassed: Bool[Array, '']
lpeca_capped: Bool[Array, '']
n_lpeca_bounds_prefixed: Int[Array, '']
proj_residual: Float[Array, '']
n_proj_refinements: Int[Array, '']
projected_grad_norm: Float[Array, '']
__init__(direction, multipliers_eq, multipliers_ineq, active_set, converged, iterations, bound_fix_solves, n_bound_fixed, ping_ponged, reached_max_iter, lpeca_bypassed, lpeca_capped, n_lpeca_bounds_prefixed, proj_residual, n_proj_refinements, projected_grad_norm)
Parameters:
  • direction (Float[Array, 'n'])

  • multipliers_eq (Float[Array, 'm_eq'])

  • multipliers_ineq (Float[Array, 'm_ineq'])

  • active_set (Bool[Array, 'm_ineq'])

  • converged (Bool[Array, ''])

  • iterations (Int[Array, ''])

  • bound_fix_solves (Int[Array, ''])

  • n_bound_fixed (Int[Array, ''])

  • ping_ponged (Bool[Array, ''])

  • reached_max_iter (Bool[Array, ''])

  • lpeca_bypassed (Bool[Array, ''])

  • lpeca_capped (Bool[Array, ''])

  • n_lpeca_bounds_prefixed (Int[Array, ''])

  • proj_residual (Float[Array, ''])

  • n_proj_refinements (Int[Array, ''])

  • projected_grad_norm (Float[Array, ''])

Return type:

None

class slsqp_jax.SLSQPConfig[source]

Bases: Module

Aggregate configuration for SLSQP.

Replaces the legacy 40+ flat keyword arguments with a small set of grouped sub-configs. Pass directly to SLSQP via the config= keyword. All sub-configs default to their dataclass defaults so SLSQPConfig() reproduces the legacy default settings.

tolerance: ToleranceConfig
lbfgs: LBFGSConfig
qp: QPConfig
proximal: ProximalConfig
preconditioner: PreconditionerConfig
lpeca: LPECAConfig
adaptive_cg: AdaptiveCGConfig
__init__(tolerance=<factory>, lbfgs=<factory>, line_search=<factory>, qp=<factory>, proximal=<factory>, preconditioner=<factory>, lpeca=<factory>, adaptive_cg=<factory>)
Parameters:
Return type:

None

class slsqp_jax.ToleranceConfig[source]

Bases: Module

Outer-loop tolerances and iteration / divergence budgets.

Attributes:
rtol: Relative tolerance for the stationarity convergence

check. The test is the filterSQP normalised KKT residual (Fletcher & Leyffer, User manual for filterSQP, eqs. 5 and 6): ||grad_L|| <= rtol * max(mu_max, 1) where mu_max = max_i {||grad_f||_2, |nu_i|, ||a_i||_2 |lambda_i|} is the largest single contributor to the Lagrangian gradient residual (objective-gradient norm, every bound multiplier, and every general-constraint ||row||_2 * |multiplier|). The same test is applied to the inexact projected-gradient numerator ||W_tilde g|| when AdaptiveCGConfig.use_inexact_stationarity is on. Replaces the legacy |L|-based denominator so the test is invariant to absolute objective magnitude and tracks multiplier blow-up under near-rank-deficient active sets. Default 1e-6.

atol: Absolute tolerance for primal feasibility and a number of

internal heuristic floors (steepest-descent fallback, adaptive CG tolerance floor, default proximal mu floor). Default 1e-6.

max_steps: Maximum number of outer SQP iterations. Default 100. min_steps: Minimum iterations before convergence is allowed.

Prevents premature termination at trivial starting points. Default 1.

stagnation_tol: Relative-improvement threshold for the

merit-based stagnation counter. Default 1e-12.

divergence_factor: Best-iterate divergence rollback fires when

the merit grows by more than divergence_factor * max(|best_merit|, 1) for divergence_patience consecutive steps. Default 10.0.

divergence_patience: Number of consecutive blow-up steps required

before the divergence rollback latches. Default 3.

rtol: float = 1e-06
atol: float = 1e-06
max_steps: int = 100
min_steps: int = 1
stagnation_tol: float = 1e-12
divergence_factor: float = 10.0
divergence_patience: int = 3
__init__(rtol=1e-06, atol=1e-06, max_steps=100, min_steps=1, stagnation_tol=1e-12, divergence_factor=10.0, divergence_patience=3)
Parameters:
Return type:

None

class slsqp_jax.LBFGSConfig[source]

Bases: Module

L-BFGS Hessian-approximation parameters.

Attributes:
memory: Number of curvature pairs (s, y) stored in the

ring buffer. Default 10.

damping_threshold: VARCHEN damping threshold applied to each

curvature pair before storage; set to 0.0 to disable. Default 0.2.

diag_floor: Lower clip for the per-variable secant diagonal

B_0 = diag(d). Default 1e-4.

diag_ceil: Upper clip for the per-variable secant diagonal.

Default 1e6.

memory: int = 10
damping_threshold: float = 0.2
diag_floor: float = 0.0001
diag_ceil: float = 1000000.0
__init__(memory=10, damping_threshold=0.2, diag_floor=0.0001, diag_ceil=1000000.0)
Parameters:
Return type:

None

class slsqp_jax.LineSearchConfig[source]

Bases: Module

Backtracking L1-merit line-search parameters.

Attributes:

max_steps: Maximum number of backtracking iterations. Default 20. armijo_c1: Armijo condition coefficient c_1. Default 1e-4. failure_patience: After this many consecutive line-search

failures, the L-BFGS history is hard-reset to identity. Default 3.

max_steps: int = 20
armijo_c1: float = 0.0001
failure_patience: int = 3
__init__(max_steps=20, armijo_c1=0.0001, failure_patience=3)
Parameters:
  • max_steps (int)

  • armijo_c1 (float)

  • failure_patience (int)

Return type:

None

class slsqp_jax.QPConfig[source]

Bases: Module

QP-subproblem parameters.

Attributes:
max_iter: Maximum number of active-set iterations per QP solve.

Default 100.

max_cg_iter: Maximum number of CG iterations per inner solve.

Default 50.

failure_patience: After this many consecutive QP failures, the

L-BFGS history is hard-reset to identity. Default 3.

zero_step_patience: After this many consecutive iterations where

the QP returns ||d|| < atol and primal feasibility holds, convergence is declared via the guarded qp_kkt_success disjunct. Default 3.

ping_pong_threshold: Threshold for the QP add/drop ping-pong

short-circuit. Default 2**31 - 1 (effectively disabled); opt in by setting to 3-8 on degenerate problems.

mult_drop_floor: Floor on the negative-multiplier drop test

inside the QP active-set loop. Default 1e-6.

cg_regularization: Minimum eigenvalue threshold delta**2 for

the CG curvature guard. Default 1e-6.

use_exact_hvp: When True, the QP inner CG uses the exact

Lagrangian HVP (via AD) instead of the L-BFGS approximation. Default False.

max_iter: int = 100
max_cg_iter: int = 50
failure_patience: int = 3
zero_step_patience: int = 3
ping_pong_threshold: int = 2147483647
mult_drop_floor: float = 1e-06
cg_regularization: float = 1e-06
use_exact_hvp: bool = False
__init__(max_iter=100, max_cg_iter=50, failure_patience=3, zero_step_patience=3, ping_pong_threshold=2147483647, mult_drop_floor=1e-06, cg_regularization=1e-06, use_exact_hvp=False)
Parameters:
  • max_iter (int)

  • max_cg_iter (int)

  • failure_patience (int)

  • zero_step_patience (int)

  • ping_pong_threshold (int)

  • mult_drop_floor (float)

  • cg_regularization (float)

  • use_exact_hvp (bool)

Return type:

None

class slsqp_jax.ProximalConfig[source]

Bases: Module

Adaptive proximal multiplier stabilization (sSQP, Wright 2002).

Attributes:
tau: Exponent in mu = clip(kkt_residual^tau, mu_min, mu_max).

Must lie in the half-open interval [0, 1). Set to 0.0 to disable sSQP entirely (equality constraints are then enforced via direct null-space projection). Default 0.5.

mu_min: Floor on the adaptive proximal mu. None resolves to

ToleranceConfig.atol at runtime. Default None.

mu_max: Ceiling on the adaptive proximal mu. Default 0.1.

tau: float = 0.5
mu_min: float | None = None
mu_max: float = 0.1
__init__(tau=0.5, mu_min=None, mu_max=0.1)
Parameters:
Return type:

None

class slsqp_jax.PreconditionerConfig[source]

Bases: Module

QP-inner-solver preconditioner configuration.

Attributes:

enabled: Whether to use a preconditioner at all. Default True. type: Either "lbfgs" (default) or "diagonal". The

diagonal estimator requires an exact HVP (set QPConfig.use_exact_hvp or provide obj_hvp_fn).

diagonal_n_probes: Number of Rademacher probes for the

stochastic diagonal estimator. Default 20.

enabled: bool = True
type: str = 'lbfgs'
diagonal_n_probes: int = 20
__init__(enabled=True, type='lbfgs', diagonal_n_probes=20)
Parameters:
  • enabled (bool)

  • type (str)

  • diagonal_n_probes (int)

Return type:

None

class slsqp_jax.LPECAConfig[source]

Bases: Module

LPEC-A active-set identification (Oberlin & Wright, 2005).

Attributes:
method: One of "expand" (default), "lpeca_init" or

"lpeca". "expand" disables LPEC-A entirely.

sigma: Threshold exponent (sigma_bar in the paper). Must

lie in the open interval (0, 1). Default 0.9.

beta: Threshold scaling factor. None resolves to

1 / (m_ineq + n + m_eq) at runtime. Default None.

use_lp: When True, solve the LPEC-A LP (via mpax.r2HPDHG)

for tighter multiplier estimates. Requires mpax. Default False.

trust_threshold: Trust gate on rho_bar. When rho_bar

exceeds this value the prediction is replaced with an empty set. Default 1.0.

warmup_steps: The first warmup_steps outer SQP iterations

bypass LPEC-A. Default 3.

predict_bounds: When True (default), extend the LPEC-A prediction

to box constraints (warm-start the bound-fixing loop).

method: str = 'expand'
sigma: float = 0.9
beta: float | None = None
use_lp: bool = False
trust_threshold: float = 1.0
warmup_steps: int = 3
predict_bounds: bool = True
__init__(method='expand', sigma=0.9, beta=None, use_lp=False, trust_threshold=1.0, warmup_steps=3, predict_bounds=True)
Parameters:
Return type:

None

class slsqp_jax.AdaptiveCGConfig[source]

Bases: Module

Adaptive CG / inexact stationarity configuration.

Attributes:
enabled: When True, the CG convergence tolerance is adapted from

the outer KKT residual (Eisenstat-Walker style). Default False to preserve baseline behaviour.

use_inexact_stationarity: When True, the projected-gradient norm

from a noise-aware inner solver (e.g. HRInexactSTCG) is added as a logical-OR disjunct to the classical stationarity test. Default False.

enabled: bool = False
use_inexact_stationarity: bool = False
__init__(enabled=False, use_inexact_stationarity=False)
Parameters:
  • enabled (bool)

  • use_inexact_stationarity (bool)

Return type:

None

class slsqp_jax.RESULTS[source]

Bases: RESULTS

SLSQP-specific termination codes.

Subclasses optimistix.RESULTS and adds finer classifications for the failure modes that the upstream solver lumps into optimistix.RESULTS.nonlinear_divergence.

Note

equinox.Enumeration reports its concrete subclasses as @final via the metaclass, but optimistix itself does the same trick (optx.RESULTS subclasses lx.RESULTS) and the docstring example for Enumeration.promote shows the pattern is supported. ty: ignore[subclass-of-final-class] suppresses the false positive.

slsqp_jax.is_successful(result)[source]

Return True if result represents successful convergence.

Robust to both optimistix.RESULTS and slsqp_jax.RESULTS inputs by promoting the value before comparison. Useful for downstream code that may receive results from either enumeration:

from slsqp_jax import is_successful
if is_successful(sol.result):
    ...
Return type:

bool

Parameters:

result (Any)

Args:
result: A RESULTS member from either optx.RESULTS or

its RESULTS subclass.

Returns:

True iff the result equals RESULTS.successful.

class slsqp_jax.ParsedConstraints[source]

Bases: object

Result of converting SciPy-style constraints for use with SLSQP.

All fields map directly to the corresponding SLSQP constructor arguments.

eq_constraint_fn: Callable[[Float[Array, 'n'], Any], Float[Array, 'm']] | None = None
ineq_constraint_fn: Callable[[Float[Array, 'n'], Any], Float[Array, 'm']] | None = None
n_eq_constraints: int = 0
n_ineq_constraints: int = 0
eq_jac_fn: Callable[[Float[Array, 'n'], Any], Float[Array, 'm n']] | None = None
ineq_jac_fn: Callable[[Float[Array, 'n'], Any], Float[Array, 'm n']] | None = None
eq_hvp_fn: Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'm n']] | None = None
ineq_hvp_fn: Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'm n']] | None = None
__init__(eq_constraint_fn=None, ineq_constraint_fn=None, n_eq_constraints=0, n_ineq_constraints=0, eq_jac_fn=None, ineq_jac_fn=None, eq_hvp_fn=None, ineq_hvp_fn=None)
Parameters:
  • eq_constraint_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm']] | None)

  • ineq_constraint_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm']] | None)

  • n_eq_constraints (int)

  • n_ineq_constraints (int)

  • eq_jac_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm n']] | None)

  • ineq_jac_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm n']] | None)

  • eq_hvp_fn (Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'm n']] | None)

  • ineq_hvp_fn (Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'm n']] | None)

Return type:

None

slsqp_jax.parse_constraints(constraints, x0)[source]

Convert SciPy-style constraints into SLSQP-JAX constraint functions.

Parameters:
  • constraints (dict | list | LinearConstraint | NonlinearConstraint | tuple) – Any form accepted by scipy.optimize.minimize: a dict, list of dicts, LinearConstraint, NonlinearConstraint, or a list/tuple mixing those types. An empty tuple/list means “no constraints”.

  • x0 (Array) – Initial guess – used to evaluate dict constraint functions once to determine their output size.

Returns:

Dataclass whose fields map to SLSQP constructor arguments.

Return type:

ParsedConstraints

slsqp_jax.minimize_like_scipy(fun, x0, args=(), *, jac=None, hessp=None, bounds=None, constraints=(), options=None, has_aux=False, throw=False, verbose=False, auto_scale=True, auto_scale_target_gradient=None, auto_scale_max_factor=None)[source]

Minimise a function using SLSQP with a SciPy-like interface.

This is a convenience wrapper that accepts SciPy-style arguments, converts them for the SLSQP solver, and delegates to optimistix.minimise.

Parameters:
  • fun (Callable) – Objective function. Signature (x, *args) -> scalar or, when has_aux is True, (x, *args) -> (scalar, aux).

  • x0 (Any) – Initial guess (array-like).

  • args (tuple, default: ()) – Extra positional arguments forwarded to fun (unpacked).

  • jac (Callable | bool | None, default: None) – Gradient of fun. A callable (x, *args) -> array or True to indicate that fun returns (f, g) (or ((f, g), aux) when has_aux is set).

  • hessp (Callable | None, default: None) – Hessian-vector product (x, p, *args) -> array.

  • bounds (Bounds | list | tuple | None, default: None) – Variable bounds – None, Bounds, or sequence of (min, max) pairs.

  • constraints (dict | list | LinearConstraint | NonlinearConstraint | tuple, default: ()) – SciPy-style constraints (dict / list-of-dicts / LinearConstraint / NonlinearConstraint). A NonlinearConstraint may carry a user-attached hessp attribute (non-standard; not part of SciPy’s API) that, if callable, is used as the per-component constraint HVP with precedence over hess. See the module-level docstring for the full contract.

  • options (dict[str, Any] | None, default: None) –

    Solver options dict. The following keys are popped with the listed defaults (which match the SLSQP constructor defaults):

    • rtol (1e-6) – relative tolerance for stationarity.

    • atol (1e-6) – absolute tolerance for feasibility.

    • max_steps or maxiter (100) – maximum outer iterations.

    • min_steps (1) – minimum iterations before convergence is allowed.

    • lbfgs_memory (10) – number of L-BFGS pairs.

    • line_search_max_steps (20) – backtracking steps.

    • armijo_c1 (1e-4) – Armijo sufficient decrease.

    • qp_max_iter (100) – active-set iteration budget.

    • qp_max_cg_iter (50) – CG iterations per QP step.

    Any remaining keys are forwarded as **kwargs to the SLSQP constructor, so any SLSQP attribute can be set here (e.g. proximal_tau, proximal_mu_min, proximal_mu_max, use_preconditioner, adaptive_cg_tol, cg_regularization, stagnation_tol).

  • has_aux (bool, default: False) – If True, fun returns (value, aux).

  • throw (bool, default: False) – Whether to raise on solver failure.

  • verbose (bool | Callable[..., None], default: False) – Passed to the SLSQP constructor. False (default) for silent, True to print all diagnostics, or a custom callable. When auto_scale is on the built-in printer is wrapped to show user-unit values for f / |c| / |grad_f| / |grad_L| / |d|; merit / rho / gamma / L-BFGS internals keep a (s) suffix on their label to flag scaled units. See slsqp_jax.wrap_verbose_for_scaling().

  • auto_scale (bool | str, default: True) –

    Automatic problem scaling at the initial point (gradient-based, IPOPT/KNITRO-style). On by default as of this release.

    • True (default) -> "uniform" (target_gradient=1.0, max_factor=1e3, uniform=True). A single shared scalar s_c is applied to every constraint row (equality + general inequality) and a separate s_f to the objective, both symmetrically clipped to [1/max_factor, max_factor]. Preserves inter-row magnitude ratios (the right default for budget-style problems where one constraint is intentionally orders of magnitude larger than the others); fully fixes the documented ||J_eq|| >> ||grad_f|| divergence cascade. Note that atol_internal = s_c * atol_user (no min(., 1.0) cap, so the feasibility tolerance handed to the inner solver can exceed atol_user when s_c > 1).

    • "balanced" -> target_gradient=1.0, max_factor=1e3, uniform=False. The legacy per-row default. Each constraint row gets its own factor driving ||grad c_i||_inf -> 1. Flattens inter-row magnitudes; opt-in when one row’s gradient is vastly out of band and that’s not a meaningful spread.

    • False -> no wrapping (pre-feature behaviour).

    • "knitro" -> target=1.0, max_factor=1.0 (strict shrink-only per-row; opt-in for users who want zero amplification).

    • "ipopt" -> target=100.0, max_factor=1.0 (very conservative per-row; may not fix all cascades).

    • "aggressive" -> target=1.0, max_factor=1e6 (per-row, pushes amplification to the noise-floor limit).

    When scaling is applied, sol.stats carries a scale_factors entry plus _user-suffixed copies of the multiplier vectors and the Lagrangian gradient norm. atol is auto-compensated so the user-perceived feasibility tolerance is preserved (uniform mode does this via atol_internal = s_c * atol_user; per-row modes via atol_internal = atol_user * min(min(s_eq), min(s_ineq), 1.0)).

  • auto_scale_target_gradient (float | None, default: None) – Optional explicit override of the mode’s target_gradient. Under uniform mode this value is consumed by both the s_f derivation (against ||grad_f||_inf) and the s_c derivation (against the cross-row max max_i ||grad c_i||_inf); under per-row modes it drives every row’s individual factor.

  • auto_scale_max_factor (float | None, default: None) – Optional explicit override of the mode’s max_factor. Under uniform mode the bound is symmetric so the scale factor lives in [1/max_factor, max_factor] and the value must satisfy max_factor >= 1.0 (smaller raises ValueError; max_factor == 1.0 is legal but emits a UserWarning because it disables scaling). Under per-row modes the bound is one-sided (s in [eps, max_factor]); max_factor == 1.0 means shrink-only.

Return type:

optimistix.Solution

class slsqp_jax.ScaleFactors[source]

Bases: object

The factors compute_scale_factors_at_x0() (per-row) or compute_uniform_scale_factors_at_x0() (uniform) returned.

Attributes:
s_f: Scalar objective-scaling factor (f_scaled = s_f * f).

1.0 when no scaling was applied or the objective gradient was below grad_floor.

s_eq: Equality-constraint factors, shape (m_eq,). Empty

array when m_eq == 0. Per-row varying under uniform=False; constant-valued (every entry equals the shared s_c) under uniform=True.

s_ineq: Inequality-constraint factors, shape (m_ineq,).

Empty array when m_ineq == 0. Note that this is the general inequality count – bound constraints are scaled separately (and trivially) by the bound-handling machinery. Per-row under uniform=False; constant and equal to the same shared s_c as s_eq under uniform=True.

atol_user: The user-supplied feasibility tolerance. atol_internal: The compensated tolerance handed to the inner

solver. Under uniform=False this is atol_user * min(min(s_eq), min(s_ineq), 1.0) (worst-row conservative). Under uniform=True this is s_c * atol_user exactly (can exceed atol_user when s_c > 1).

target_gradient: Echo of the ScalingConfig field

actually used.

max_factor: Echo of the ScalingConfig field

actually used.

grad_floor: Echo of the ScalingConfig field

actually used.

n_skipped_eq: Number of equality rows whose

||grad_eq[i]||_inf was below grad_floor. Always 0 under uniform=True (uniform mode does not skip individual rows).

n_skipped_ineq: Number of inequality rows whose

||grad_ineq[i]||_inf was below grad_floor. Always 0 under uniform=True.

skipped_obj: True when ||grad_f||_inf was below

grad_floor (a UserWarning was emitted).

uniform: True when the factors were produced by

compute_uniform_scale_factors_at_x0() (single shared s_c across constraints, symmetric max_factor clipping, exact-equivalence atol_internal). False for the per-row compute_scale_factors_at_x0().

s_f: float
s_eq: Array
s_ineq: Array
atol_user: float
atol_internal: float
target_gradient: float
max_factor: float
grad_floor: float
n_skipped_eq: int = 0
n_skipped_ineq: int = 0
skipped_obj: bool = False
uniform: bool = False
__init__(s_f, s_eq, s_ineq, atol_user, atol_internal, target_gradient, max_factor, grad_floor, n_skipped_eq=0, n_skipped_ineq=0, skipped_obj=False, uniform=False)
Parameters:
Return type:

None

class slsqp_jax.ScaledProblem[source]

Bases: object

Bundle of scaled callables + the ScaleFactors they came from.

Attributes mirror the SLSQP constructor’s per-callable slots so the wrapper can be threaded through auto_scaled_minimise() or slsqp_jax.minimize_like_scipy() with minimal plumbing.

The objective fn returns (s_f * value, aux) to match the has_aux=True convention used inside optx.minimise. Constraint callables, Jacobians, and HVPs return scaled values directly.

fn: Callable
eq_constraint_fn: Callable[[Float[Array, 'n'], Any], Float[Array, 'm']] | None
ineq_constraint_fn: Callable[[Float[Array, 'n'], Any], Float[Array, 'm']] | None
obj_grad_fn: Callable[[Float[Array, 'n'], Any], Float[Array, 'n']] | None
eq_jac_fn: Callable[[Float[Array, 'n'], Any], Float[Array, 'm n']] | None
ineq_jac_fn: Callable[[Float[Array, 'n'], Any], Float[Array, 'm n']] | None
obj_hvp_fn: Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'n']] | None
eq_hvp_fn: Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'm n']] | None
ineq_hvp_fn: Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'm n']] | None
factors: ScaleFactors
__init__(fn, eq_constraint_fn, ineq_constraint_fn, obj_grad_fn, eq_jac_fn, ineq_jac_fn, obj_hvp_fn, eq_hvp_fn, ineq_hvp_fn, factors)
Parameters:
  • fn (Callable)

  • eq_constraint_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm']] | None)

  • ineq_constraint_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm']] | None)

  • obj_grad_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'n']] | None)

  • eq_jac_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm n']] | None)

  • ineq_jac_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm n']] | None)

  • obj_hvp_fn (Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'n']] | None)

  • eq_hvp_fn (Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'm n']] | None)

  • ineq_hvp_fn (Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'm n']] | None)

  • factors (ScaleFactors)

Return type:

None

class slsqp_jax.ScalingConfig[source]

Bases: object

Parameters for compute_scale_factors_at_x0() (per-row) and compute_uniform_scale_factors_at_x0() (uniform).

Attributes:
target_gradient: Desired ||grad||_inf after scaling.

s = target_gradient / ||grad|| (clipped). Under uniform=True this same target is consumed by both the objective scalar s_f and the single shared constraint scalar s_c.

max_factor: Bound on the scale factor. Under uniform=False

(per-row modes) this is a one-sided amplification cap, so s in [eps, max_factor]; max_factor=1.0 means “shrink-only” (KNITRO/IPOPT). Under uniform=True the bound is symmetric so s in [1/max_factor, max_factor] and the value must satisfy max_factor >= 1.0; passing max_factor=1.0 under uniform disables scaling entirely and emits a UserWarning. The project default max_factor=1e3 is well below typical AD relative-error noise floors (~1e-12) so amplification cannot promote roundoff to signal under any reasonable AD.

grad_floor: Rows whose ||grad||_inf falls below this

value are skipped (left at s = 1.0). 1e-12 is the right default: it is comfortably above eps so machine-zero is caught, but small enough that genuinely tiny-but-non-degenerate gradients still get scaled.

uniform: When True, apply a single shared scalar s_c

across all constraint rows (equality and inequality unioned) preserving inter-row ratios; clip s_c and s_f symmetrically by max_factor; set atol_internal = s_c * atol_user exactly (no min(., 1.0) cap, so atol_internal can exceed atol_user). When False (the legacy default) each constraint row gets its own factor and atol_internal = atol_user * min(min(s_eq), min(s_ineq), 1.0).

target_gradient: float = 1.0
max_factor: float = 1000.0
grad_floor: float = 1e-12
uniform: bool = False
__init__(target_gradient=1.0, max_factor=1000.0, grad_floor=1e-12, uniform=False)
Parameters:
Return type:

None

slsqp_jax.auto_scaled_minimise(fn, solver, x0, args=None, *, auto_scale=True, auto_scale_target_gradient=None, auto_scale_max_factor=None, has_aux=False, options=None, max_steps=256, throw=True, tags=frozenset({}), adjoint=None)[source]

Convenience wrapper around optx.minimise with auto-scaling.

Mirrors the slsqp_jax.minimize_like_scipy() default-on auto-scaling path for users who construct SLSQP directly. Builds a ScaledProblem from the user’s callables, replaces the solver’s eq/ineq/jac/hvp slots with the scaled wrappers, overrides the SLSQPConfig.atol with atol_internal, and unscales the returned optx.Solution.

When auto_scale=False the call passes through to optx.minimise unchanged.

Return type:

Solution

Parameters:
Args:
fn: Objective. Same convention as optx.minimise

(x, args) -> value with has_aux=False or (x, args) -> (value, aux) with has_aux=True.

solver: An SLSQP instance. The constraint,

Jacobian, HVP, and verbose slots are read off and replaced with their scaled counterparts.

x0: Initial iterate. args: Extra payload threaded through fn / constraints. auto_scale: True (default) -> "uniform" mode (a

single shared s_c across all constraint rows + independent s_f for the objective, both symmetrically clipped by max_factor). False -> no scaling (passthrough). String -> explicit mode name; pass "balanced" to recover the legacy per-row default. See resolve_scaling_mode() for the full table.

auto_scale_target_gradient: Optional explicit target

gradient override. Under uniform mode this is consumed by both s_f and s_c derivations.

auto_scale_max_factor: Optional explicit max-factor

override. Under uniform mode the symmetric bound requires max_factor >= 1.0 (smaller raises ValueError; == 1.0 warns).

has_aux: Whether fn returns (value, aux). options: Forwarded to optx.minimise. max_steps: Forwarded to optx.minimise. throw: Forwarded to optx.minimise. tags: Forwarded to optx.minimise. adjoint: Forwarded to optx.minimise when not None.

Returns:

An optx.Solution; when scaling was applied, the stats dict carries scale_factors and the _user suffixed fields documented on unscale_solution(). Under uniform mode the ScaleFactors instance has uniform=True and s_eq / s_ineq constant-valued and equal to the same shared s_c.

slsqp_jax.auto_scaled_problem(fn, x0, args, has_aux, *, eq_constraint_fn=None, ineq_constraint_fn=None, obj_grad_fn=None, eq_jac_fn=None, ineq_jac_fn=None, obj_hvp_fn=None, eq_hvp_fn=None, ineq_hvp_fn=None, scaling_config, atol_user=1e-06)[source]

Build a ScaledProblem from the user’s callables and x0.

Computes scale factors via either compute_scale_factors_at_x0() (when scaling_config.uniform is False — the legacy per-row behavior) or compute_uniform_scale_factors_at_x0() (when True — a single shared scalar across all constraint rows), wraps every supplied callable, and returns the bundle. Callables left as None stay as None – the SLSQP solver will fall back to its AD paths, which automatically pick up the scaling from the wrapped fn / constraint_fn callables.

The returned ScaledProblem.fn adheres to the has_aux=True convention (returning (value, aux) even when the user’s fn returned just a value), matching what optimistix.minimise expects on the SLSQP path.

Return type:

ScaledProblem

Parameters:
  • fn (Callable)

  • x0 (Array)

  • args (Any)

  • has_aux (bool)

  • eq_constraint_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm']] | None)

  • ineq_constraint_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm']] | None)

  • obj_grad_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'n']] | None)

  • eq_jac_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm n']] | None)

  • ineq_jac_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm n']] | None)

  • obj_hvp_fn (Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'n']] | None)

  • eq_hvp_fn (Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'm n']] | None)

  • ineq_hvp_fn (Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'm n']] | None)

  • scaling_config (ScalingConfig)

  • atol_user (float)

slsqp_jax.compute_scale_factors_at_x0(fn, x0, args, has_aux, *, eq_constraint_fn=None, ineq_constraint_fn=None, obj_grad_fn=None, eq_jac_fn=None, ineq_jac_fn=None, target_gradient=1.0, max_factor=1000.0, grad_floor=1e-12, atol_user=1e-06)[source]

Evaluate gradients at x0 and return per-component scale factors.

Return type:

ScaleFactors

Parameters:
  • fn (Callable)

  • x0 (Array)

  • args (Any)

  • has_aux (bool)

  • eq_constraint_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm']] | None)

  • ineq_constraint_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm']] | None)

  • obj_grad_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'n']] | None)

  • eq_jac_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm n']] | None)

  • ineq_jac_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm n']] | None)

  • target_gradient (float)

  • max_factor (float)

  • grad_floor (float)

  • atol_user (float)

Args:
fn: Objective. (x, args) -> value or

(x, args) -> (value, aux) when has_aux=True.

x0: Initial iterate. args: Extra payload threaded through fn / constraints. has_aux: Whether fn returns (value, aux). eq_constraint_fn: Optional equality constraint function. ineq_constraint_fn: Optional inequality constraint function. obj_grad_fn: Optional user-supplied objective gradient.

When None we fall back to jax.grad(fn).

eq_jac_fn: Optional user-supplied equality Jacobian.

jax.jacrev fallback.

ineq_jac_fn: Optional user-supplied inequality Jacobian.

jax.jacrev fallback.

target_gradient: See ScalingConfig. max_factor: See ScalingConfig. grad_floor: See ScalingConfig. atol_user: User-perceived feasibility tolerance. The

returned ScaleFactors.atol_internal compensates for the constraint-shrinking factors.

Returns:

A ScaleFactors instance.

Notes:

Emits a UserWarning per component whose gradient is below grad_floor (objective and any constraint rows). Skipped rows keep s = 1.0; the user is expected to either pick a different x0 or pass auto_scale=False and provide their own scaling.

slsqp_jax.compute_uniform_scale_factors_at_x0(fn, x0, args, has_aux, *, eq_constraint_fn=None, ineq_constraint_fn=None, obj_grad_fn=None, eq_jac_fn=None, ineq_jac_fn=None, target_gradient=1.0, max_factor=1000.0, grad_floor=1e-12, atol_user=1e-06)[source]

Evaluate gradients at x0 and return uniform scale factors.

Uniform mode applies a single shared scalar s_c to every constraint row (equality and inequality unioned) and an independent scalar s_f to the objective; both are clipped symmetrically by max_factor so they can amplify or shrink. The feasibility tolerance is propagated as atol_internal = s_c * atol_user exactly, so the inner solver’s feasibility test corresponds 1-1 to the user-unit test |c_user[i]| <= atol_user regardless of which direction s_c moved.

Return type:

ScaleFactors

Parameters:
  • fn (Callable)

  • x0 (Array)

  • args (Any)

  • has_aux (bool)

  • eq_constraint_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm']] | None)

  • ineq_constraint_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm']] | None)

  • obj_grad_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'n']] | None)

  • eq_jac_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm n']] | None)

  • ineq_jac_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm n']] | None)

  • target_gradient (float)

  • max_factor (float)

  • grad_floor (float)

  • atol_user (float)

Args:
fn: Objective. (x, args) -> value or

(x, args) -> (value, aux) when has_aux=True.

x0: Initial iterate. args: Extra payload threaded through fn / constraints. has_aux: Whether fn returns (value, aux). eq_constraint_fn: Optional equality constraint function. ineq_constraint_fn: Optional inequality constraint function. obj_grad_fn: Optional user-supplied objective gradient.

When None we fall back to jax.grad(fn).

eq_jac_fn: Optional user-supplied equality Jacobian.

jax.jacrev fallback.

ineq_jac_fn: Optional user-supplied inequality Jacobian.

jax.jacrev fallback.

target_gradient: See ScalingConfig. Consumed by both

s_f and s_c derivations.

max_factor: See ScalingConfig. Must be >= 1.0.

max_factor == 1.0 is legal but disables scaling and emits a UserWarning.

grad_floor: See ScalingConfig. atol_user: User-perceived feasibility tolerance. The

returned ScaleFactors.atol_internal is s_c * atol_user and may exceed atol_user when s_c > 1.

Returns:

A ScaleFactors instance with uniform=True, s_eq = jnp.full((m_eq,), s_c), and s_ineq = jnp.full((m_ineq,), s_c).

Raises:
ValueError: If max_factor < 1.0 (the symmetric interval

[1/max_factor, max_factor] would be empty).

Notes:

Emits a UserWarning when ||grad_f(x0)||_inf is below grad_floor (s_f = 1.0), when every constraint row’s gradient inf-norm is below grad_floor (s_c = 1.0), or when max_factor == 1.0 (scaling disabled). No per-row warnings are emitted under uniform mode – the cross-row max dominates the per-row check, and per-row magnitudes are intentionally preserved.

slsqp_jax.resolve_scaling_mode(mode, *, target_gradient=None, max_factor=None)[source]

Map a user-facing auto_scale argument to a ScalingConfig.

Return type:

Optional[ScalingConfig]

Parameters:
Args:
mode: True -> "uniform" (the default), False ->

None (no scaling), or one of the string aliases "uniform", "balanced", "knitro", "ipopt", "aggressive".

target_gradient: Optional explicit override of the mode’s

default target. None uses the mode default.

max_factor: Optional explicit override of the mode’s default

cap. None uses the mode default.

Returns:

ScalingConfig or None if mode is False.

Raises:
ValueError: If mode is a string that is not one of the

recognised aliases.

TypeError: If mode is neither a bool nor a string.

slsqp_jax.unscale_solution(sol, factors)[source]

Post-process sol so user-facing stats use unscaled units.

The primary iterate sol.value lives in x-space, which is not scaled by this module (variable scaling is out-of-scope – see the deferred section of the auto-scaling plan). Multipliers and the Lagrangian gradient norm are converted from scaled to user units via lambda_user = (s / s_f) * lambda_scaled.

The merit penalty rho and its history live entirely in scaled units; we leave them as-is and add a merit_penalty_note entry flagging the unit.

Return type:

Solution

Parameters:
Args:
sol: Raw solution returned by optimistix.minimise (or

built post-hoc by the diagnostics layer).

factors: The ScaleFactors used to wrap the problem.

Returns:

A new optx.Solution with the augmented stats dict. sol is not mutated.

slsqp_jax.wrap_verbose_for_scaling(user_verbose, factors)[source]

Adapt a verbose callback to print user-unit values when scaling is on.

Return type:

Callable

Parameters:
Args:
user_verbose: Either a boolean (True / False) or a

user-supplied callable. Booleans select the built-in printer (slsqp_verbose from slsqp_jax.slsqp.verbose); a callable is forwarded after the tuple values have been rewritten.

factors: The ScaleFactors to undo.

Returns:

A callback with the same (**kwargs) signature as slsqp_jax.slsqp.verbose.slsqp_verbose(). Quantities with a clean unscaled equivalent are converted to user units in place; quantities that live in scaled space (merit, rho, L-BFGS internals) are printed with the suffix (s) appended to their label so the reader knows the unit.

class slsqp_jax.DebugReport[source]

Bases: object

User-facing report produced by the diagnostics layer.

Phase 1 instances carry signals=[] and diagnoses=[]; the renderer still emits a useful report from the termination metadata + diagnostics counters + trajectory. Phase 2 / 3 fill the signal and diagnosis sections in.

Attributes:
run: The DebugRunResult the report was built from.

Carries the SLSQPState, the StepSummary trajectory, and any signals/diagnoses already wired by later phases.

signals: Convenience accessor exposing

run.fired_signals keyed by name. Phase 1: empty.

diagnoses: List of multi-signal diagnoses produced by the

playbook. Phase 1: empty.

run: DebugRunResult
signals: dict[str, Any]
diagnoses: list[Any]
scale_factors: Any | None = None
classmethod from_run(run)[source]

Build a DebugReport from a DebugRunResult.

Return type:

DebugReport

Parameters:

run (DebugRunResult)

render()[source]

Return the full report as a single string.

Return type:

str

print_summary(*, file=None)[source]

Write the report to file (default sys.stdout).

Uses io.IOBase.write() rather than print() so the call passes the project-wide no-print-statements pre-commit hook (the hook is intentionally strict; presenting a multi-paragraph diagnostics report is one of the few legitimate exceptions and we route through write for clarity). The trailing newline matches what print() would emit so users do not have to special-case the output in a notebook.

Return type:

None

Parameters:

file (Any)

to_dict()[source]

Return a JSON-serialisable dict of the rendered fields.

Useful for piping the report into downstream tooling (CI, notebooks, dashboards). Heavy artifacts on signals (np.ndarray instances) are not included; call signal.artifacts directly when you want them.

Return type:

dict[str, Any]

__init__(run, signals=<factory>, diagnoses=<factory>, scale_factors=None)
Parameters:
Return type:

None

class slsqp_jax.DebugRunResult[source]

Bases: object

Aggregate result of a manual-loop debug run.

Carries everything a downstream signal evaluator or report renderer needs to do its job, plus the handles (solver, fn, x0, args, has_aux) required to re-execute the run for ad-hoc inspection via slsqp_jax.diagnostics.runner.capture_state_at_step().

Attributes:

solver: The slsqp_jax.SLSQP instance the run used. fn: The objective callable passed to debug_run. x0: The initial iterate. args: Extra positional payload threaded through fn /

constraint callables.

has_aux: Whether fn returns (value, aux). final_state: The terminal SLSQPState after the manual

loop exited. Carries the full SLSQPDiagnostics accumulator and every device array the end-of-run evaluators need.

final_y: The terminal iterate y returned by step(). final_result: The granular slsqp_jax.RESULTS termination

code stored on final_state.termination_code.

coarse_result: The coarse optimistix.RESULTS code that

terminate() returned (always a member of the parent enum, suitable for the optimistix driver-style check).

summaries: Per-step StepSummary records, one per

iteration the loop actually executed.

terminated_at_step: Index of the iteration where the loop

exited (0-based; len(summaries) - 1 on success).

max_steps_reached: True iff the loop ran out of iterations

without terminate() returning done.

fired_signals: Signals emitted by the per-step + end-of-run

evaluators. Empty list when the diagnostics layer is run without signals (Phase 1 default).

solver: SLSQP
fn: Any
x0: Any
args: Any
has_aux: bool
final_state: SLSQPState
final_y: Any
final_result: Any
coarse_result: Any
summaries: list[StepSummary]
terminated_at_step: int
max_steps_reached: bool
fired_signals: list[Any]
property diagnostics: SLSQPDiagnostics

Convenience accessor for final_state.diagnostics.

property n_steps: int

Number of iterations the loop actually executed.

property terminated_successfully: bool

True iff the run actually converged.

Both conditions must hold: the granular final_result is RESULTS.successful and the manual loop did not run out of its iteration budget without terminate() returning done=True.

The second clause matters because RESULTS.successful is the default termination_code on a fresh SLSQPState (it means “no failure has latched yet”) — so a truncated run with no failure flag set will report successful even though it never actually converged.

__init__(solver, fn, x0, args, has_aux, final_state, final_y, final_result, coarse_result, summaries, terminated_at_step, max_steps_reached, fired_signals=<factory>)
Parameters:
Return type:

None

class slsqp_jax.Diagnosis[source]

Bases: object

Named multi-signal diagnosis surfaced in the report.

Attributes:

name: Stable machine-readable identifier. cause: One-paragraph mechanism explanation. suggestions: Concrete starting-point fixes. related_signals: Names of the fired signals the rule combined.

name: str
cause: str
suggestions: list[str]
related_signals: list[str]
__init__(name, cause, suggestions=<factory>, related_signals=<factory>)
Parameters:
Return type:

None

class slsqp_jax.DiagnosticContext[source]

Bases: object

Container yielded by diagnostic_run()’s __enter__.

Carries every DebugRunResult and matching DebugReport produced inside the with block, plus counters distinguishing intercepted SLSQP calls from passthrough calls (i.e. optx.minimise invocations whose solver is something other than SLSQP).

Attributes:
runs: One DebugRunResult per intercepted call, in

execution order.

reports: One DebugReport per intercepted call, in

the same order as runs. reports[i] was built from runs[i].

intercepted_calls: Count of optx.minimise calls whose

solver was an SLSQP instance and were re-routed through the debug loop.

passthrough_calls: Count of optx.minimise calls whose

solver was not SLSQP and were forwarded unchanged to the original optx.minimise.

runs: list[DebugRunResult]
reports: list[DebugReport]
intercepted_calls: int = 0
passthrough_calls: int = 0
property n_failing: int

Number of captured runs that did not terminate successfully.

print_summary(idx=None, *, file=None)[source]

Print one or all captured reports to file (default stdout).

Pass an integer idx to print exactly that report (negative indices count from the end, matching list semantics). With idx=None (the default) every captured report is printed in order, separated by a blank line. Useful from a notebook or an except handler when the auto print on exit was suppressed (print_on_exit="never") or when the user wants to see a successful run’s report that the auto-suppress skipped.

Return type:

None

Parameters:
__init__(runs=<factory>, reports=<factory>, intercepted_calls=0, passthrough_calls=0)
Parameters:
Return type:

None

class slsqp_jax.Signal[source]

Bases: object

Single evidence-backed hypothesis emitted by an evaluator.

The wording contract is hypothesis with evidence, not verdict. summary reads “X looks suspicious because Y”; detail expands on the mechanism; suggestions are starting points for the user to investigate, not prescriptions to apply blindly.

Attributes:
name: Stable machine-readable identifier (used for the

playbook scope filter and the cap-policy enforcement test).

specificity: How uniquely this pattern points at one cause.

specific = essentially diagnostic for one cause; ambiguous = consistent with two or three named causes; generic = consistent with many causes. Static editorial tag set at registration.

magnitude: How far past threshold the evidence landed.

marginal = within 10x of threshold; moderate = 10x-100x past threshold; extreme = >100x past threshold. Computed from evidence at fire time.

confidence: Derived from (specificity, magnitude) via the

lookup table in slsqp_jax.diagnostics.playbook.confidence_for(). The single ranking axis the report sorts by.

summary: One-line “X looks suspicious because Y” framing. detail: Multi-paragraph mechanism explanation. evidence: Numeric evidence keyed by short name. Each value

is the actual measurement; the threshold is documented in the evaluator’s docstring.

suggestions: Concrete starting-point fixes (config knobs,

alternative inner solvers, etc).

artifacts: Heavy linalg objects the evaluator computed

immediately from the live state when it fired. May be empty.

offending_step: 1-indexed step the signal fired on, or None

for end-of-run evaluators that scope over the whole run.

name: str
specificity: Literal['specific', 'ambiguous', 'generic']
magnitude: Literal['marginal', 'moderate', 'extreme']
confidence: Literal['low', 'medium', 'high']
summary: str
detail: str
evidence: dict[str, float | int]
suggestions: list[str]
artifacts: dict[str, ndarray]
offending_step: int | None = None
__init__(name, specificity, magnitude, confidence, summary, detail, evidence=<factory>, suggestions=<factory>, artifacts=<factory>, offending_step=None)
Parameters:
Return type:

None

class slsqp_jax.StepSummary[source]

Bases: object

Host-resident scalar summary of a single SLSQP iteration.

Every field is a plain Python int / float / bool so the diagnostics loop can branch on it without an extra device sync. The only device → host transfer happens once inside StepSummary.from_state().

Attributes:

step_count: Iteration number after this step (1-indexed). f_val: Objective value at the post-step iterate. merit: L1 merit value at the post-step iterate. last_alpha: Line-search step size accepted at this step. qp_iterations_total: Cumulative QP active-set iterations across

all steps so far (matches state.qp_iterations).

qp_iterations_step: QP active-set iterations consumed by this

step alone (delta of qp_iterations_total).

qp_converged: Whether the QP solver reported success this step. qp_real_failure: True iff the QP did not converge AND did

not exhaust its iteration budget (the “real failure” distinction the L-BFGS reset chain uses; see AGENTS.md “Real-vs-budget QP failure distinction”).

qp_reached_max_iter: Whether the QP active-set loop exhausted

qp_max_iter this step (delta of diagnostics.n_qp_budget_exhausted).

qp_ping_ponged: Whether the QP ping-pong short-circuit fired

this step (delta of diagnostics.n_qp_ping_pong).

ls_success: Whether the line search reported success. consecutive_qp_failures: Outer-loop failure-streak counter. consecutive_ls_failures: Outer-loop failure-streak counter. consecutive_zero_steps: Zero-step convergence-detection counter. grad_norm: Euclidean norm of the objective gradient at the

post-step iterate.

grad_lagrangian_norm: Euclidean norm of the Lagrangian gradient

at the post-step iterate (used by the classical stationarity criterion).

lagrangian_value: Value of the Lagrangian

L = f - lambda_eq^T c_eq - mu_ineq^T c_ineq at the post-step iterate.

rel_kkt: grad_lagrangian_norm / max(|L|, 1) — the legacy

relative-stationarity proxy. Surfaced for backward- compatible diagnostics; note that the live convergence check now compares grad_lagrangian_norm / max(mu_max, 1) against rtol (filterSQP eqs. 5–6, see slsqp_jax.slsqp.termination.compute_mu_max()), so rel_kkt and the actual termination criterion may differ when |L| and mu_max disagree.

kkt_scale: filterSQP mu_max denominator at this step.

Because state.ineq_jac already contains the general inequality rows plus the unit-norm bound rows, this summary computes the same maximum by treating the whole inequality block as row-norm-scaled terms.

kkt_ratio: Active relative-stationarity ratio used by the

convergence test: grad_lagrangian_norm / max(kkt_scale, 1).

gamma: Scalar L-BFGS initial-Hessian scaling. min_diag: Minimum entry of the L-BFGS per-variable diagonal. max_diag: Maximum entry of the L-BFGS per-variable diagonal. diag_kappa: max_diag / max(min_diag, 1e-30), the L-BFGS

initial-Hessian condition-number proxy.

lbfgs_count: Number of curvature pairs currently stored in the

ring buffer.

lbfgs_skipped: Whether the L-BFGS append skipped its curvature

pair this step (delta of diagnostics.n_lbfgs_skips).

max_abs_mult_eq: max(|lambda_eq_ls|) at the post-step

iterate (the stationarity-quality multiplier — the same vector exposed via Solution.stats["multipliers_eq"]).

max_abs_mult_ineq: max(|mu_ineq_ls|) at the post-step

iterate.

qp_vs_ls_multiplier_ratio: Maximum

|λ_qp[i]| / max(|λ_ls[i]|, eps) across active equality + general-inequality rows. Surfaces “the QP multiplier was N x larger than the stationarity multiplier this step”, which is the textbook signature of QP-multiplier inflation (poor L-BFGS conditioning + auto-scaling). Always 1.0 on the very first step where both are seeded from the same lstsq initialisation; settles around 1.0 near a clean KKT point and grows on noisy / ill-conditioned iterates.

n_active_ineq: Number of active inequality constraints (general
  • bounds) at the post-step iterate.

eq_jac_min_sv_est: Lower bound on the smallest singular value

of J_eq estimated from the Cholesky of J_eq J_eq^T (cumulative low-water mark from diagnostics.eq_jac_min_sv_est).

projected_grad_norm: Norm of the inner solver’s projected

gradient (HRInexactSTCG only; inf otherwise).

merit_penalty: L1 merit penalty parameter rho after this

step.

max_eq_violation: max|c_eq| at the post-step iterate. max_ineq_violation: max(0, -c_ineq) at the post-step

iterate.

proj_residual_high_water: Cumulative high-water mark of the

MinresQLPSolver M-metric projection residual. Always 0.0 for null-space solvers.

diverging: Whether the best-iterate divergence rollback fired

this step.

blowup_count: Consecutive blowup events at this step. merit_regression_step: Whether compute_merit exceeded the

best merit despite the line search reporting success (delta of diagnostics.n_merit_regressions).

step_count: int
f_val: float
merit: float
last_alpha: float
qp_iterations_total: int
qp_iterations_step: int
qp_converged: bool
qp_real_failure: bool
qp_reached_max_iter: bool
qp_ping_ponged: bool
ls_success: bool
consecutive_qp_failures: int
consecutive_ls_failures: int
consecutive_zero_steps: int
grad_norm: float
grad_lagrangian_norm: float
lagrangian_value: float
rel_kkt: float
gamma: float
min_diag: float
max_diag: float
diag_kappa: float
lbfgs_count: int
lbfgs_skipped: bool
max_abs_mult_eq: float
max_abs_mult_ineq: float
qp_vs_ls_multiplier_ratio: float
n_active_ineq: int
eq_jac_min_sv_est: float
projected_grad_norm: float
merit_penalty: float
max_eq_violation: float
max_ineq_violation: float
proj_residual_high_water: float
diverging: bool
blowup_count: int
merit_regression_step: bool
kkt_scale: float = 0.0
kkt_ratio: float = inf
classmethod from_state(state, *, prev_state=None)[source]

Materialise a StepSummary from a live SLSQPState.

This is the single device → host transfer per debug-loop iteration. Everything the runner inspects for control flow below this point reads from the returned StepSummary.

prev_state is the SLSQPState from the previous iteration (or the initial-state from solver.init for step 1). It is used solely to compute single-step deltas of the cumulative diagnostics counters (n_qp_budget_exhausted, n_qp_ping_pong, n_lbfgs_skips, n_merit_regressions) and state.qp_iterations. Passing None treats every cumulative counter as starting from 0, which matches the behaviour of solver.init (which produces a state with all-zero diagnostics).

Return type:

StepSummary

Parameters:
reproducibility_digest()[source]

Return a short hex digest used by capture_state_at_step.

Two StepSummary instances produced by independent runs of the same (solver, x0, args) should hash to the same digest on the same hardware / jax.config settings. Floats are rounded to _HASH_PRECISION_DECIMALS decimals before hashing so a last-bit difference (e.g. from a fused-multiply-add reorder) does not false-positive on nondeterminism.

Return type:

str

__init__(step_count, f_val, merit, last_alpha, qp_iterations_total, qp_iterations_step, qp_converged, qp_real_failure, qp_reached_max_iter, qp_ping_ponged, ls_success, consecutive_qp_failures, consecutive_ls_failures, consecutive_zero_steps, grad_norm, grad_lagrangian_norm, lagrangian_value, rel_kkt, gamma, min_diag, max_diag, diag_kappa, lbfgs_count, lbfgs_skipped, max_abs_mult_eq, max_abs_mult_ineq, qp_vs_ls_multiplier_ratio, n_active_ineq, eq_jac_min_sv_est, projected_grad_norm, merit_penalty, max_eq_violation, max_ineq_violation, proj_residual_high_water, diverging, blowup_count, merit_regression_step, kkt_scale=0.0, kkt_ratio=inf)
Parameters:
  • step_count (int)

  • f_val (float)

  • merit (float)

  • last_alpha (float)

  • qp_iterations_total (int)

  • qp_iterations_step (int)

  • qp_converged (bool)

  • qp_real_failure (bool)

  • qp_reached_max_iter (bool)

  • qp_ping_ponged (bool)

  • ls_success (bool)

  • consecutive_qp_failures (int)

  • consecutive_ls_failures (int)

  • consecutive_zero_steps (int)

  • grad_norm (float)

  • grad_lagrangian_norm (float)

  • lagrangian_value (float)

  • rel_kkt (float)

  • gamma (float)

  • min_diag (float)

  • max_diag (float)

  • diag_kappa (float)

  • lbfgs_count (int)

  • lbfgs_skipped (bool)

  • max_abs_mult_eq (float)

  • max_abs_mult_ineq (float)

  • qp_vs_ls_multiplier_ratio (float)

  • n_active_ineq (int)

  • eq_jac_min_sv_est (float)

  • projected_grad_norm (float)

  • merit_penalty (float)

  • max_eq_violation (float)

  • max_ineq_violation (float)

  • proj_residual_high_water (float)

  • diverging (bool)

  • blowup_count (int)

  • merit_regression_step (bool)

  • kkt_scale (float)

  • kkt_ratio (float)

Return type:

None

slsqp_jax.capture_state_at_step(solver, fn, x0, step, *, args=None, has_aux=False, expected_summary=None)[source]

Re-run solver to step step and return the live SLSQPState.

This is the public ad-hoc inspection tool: signals already build their artifacts inline at the moment they fire, so most users will never call this. It exists for the case where the user wants to poke at a step their signals did not preserve a snapshot of.

Return type:

SLSQPState

Parameters:
  • solver (SLSQP)

  • fn (Callable)

  • x0 (Any)

  • step (int)

  • args (Any)

  • has_aux (bool)

  • expected_summary (Optional[StepSummary])

Args:

solver: The slsqp_jax.SLSQP instance to re-run. fn: Objective callable (same shape as for debug_run()). x0: Initial iterate. step: Target iteration to stop at (1-indexed; step=k

returns the state immediately after the k-th call to solver.step).

args: Extra payload threaded through fn. has_aux: Whether fn returns (value, aux). expected_summary: Optional StepSummary from the

original debug_run() at the same step. When supplied, the recovered state’s summary is hashed and compared against it; a mismatch raises RuntimeError. This is the load-bearing reproducibility check that prevents the tool from silently lying about which iterate it is showing.

Returns:

The live SLSQPState immediately after step step.

Raises:

ValueError: If step is non-positive. RuntimeError: If expected_summary is supplied and the

recovered state’s reproducibility digest does not match.

slsqp_jax.debug_run(solver, fn, x0, *, args=None, max_steps=None, has_aux=False, per_step_evaluators=(), end_of_run_evaluators=())[source]

Run solver under a manual Python loop and return a DebugRunResult.

The loop reproduces what optimistix.minimise would do, iterating solver.initsolver.stepsolver.terminate on each step. After every step the live SLSQPState is summarised into a StepSummary and (when Phase 2 wires them) the per-step evaluators in per_step_evaluators are run against (state, summary, summaries_so_far). After the loop exits, end_of_run_evaluators are run against (final_state, summaries, final_result).

Return type:

DebugRunResult

Parameters:
Args:

solver: The slsqp_jax.SLSQP instance to run. fn: Objective callable. (x, args) -> value by default,

or (x, args) -> (value, aux) when has_aux=True.

x0: Initial iterate. args: Extra payload threaded through fn and the constraint

callables on solver.

max_steps: Optional iteration budget override. Defaults to

solver.max_steps.

has_aux: Whether fn returns (value, aux). per_step_evaluators: Tuple of callables

(state, summary, summaries) -> Signal | None invoked after each step. Empty by default (Phase 1 ships without signals).

end_of_run_evaluators: Tuple of callables

(final_state, summaries, result) -> Signal | None invoked once after the loop exits. Empty by default.

Returns:

DebugRunResult with the per-step StepSummary trajectory, the terminal state, the granular and coarse termination codes, and any signals fired during the run.

slsqp_jax.diagnose(solver, fn, x0, *, args=None, max_steps=None, has_aux=False)[source]

Run solver under a manual loop and return a DebugReport.

Convenience wrapper around debug_run() + DebugReport. Uses the registered PER_STEP_EVALUATORS and END_OF_RUN_EVALUATORS; the report is always produced (even on a successful run), so the user can request a diagnosis when they suspect slow-but-converged behaviour, not just on hard failures.

Return type:

DebugReport

Parameters:
  • solver (SLSQP)

  • fn (Callable)

  • x0 (Any)

  • args (Any)

  • max_steps (Optional[int])

  • has_aux (bool)

slsqp_jax.diagnose_minimize_like_scipy(fun, x0, args=(), *, jac=None, hessp=None, bounds=None, constraints=(), options=None, has_aux=False, throw=False, verbose=False, max_steps_override=None)[source]

Drop-in replacement for slsqp_jax.minimize_like_scipy() with diagnostics.

Mirrors slsqp_jax.minimize_like_scipy()’s public signature 1:1 so callers can swap one import + one call name without any other code changes:

# before
from slsqp_jax import minimize_like_scipy
sol = minimize_like_scipy(fun, x0, ...)

# after
from slsqp_jax import diagnose_minimize_like_scipy
sol, report = diagnose_minimize_like_scipy(fun, x0, ...)
report.print_summary()

Unlike diagnostic_run(), this function performs no monkey-patching — it is the recommended path for users whose wrapper code does from optimistix import minimise (which bypasses the context manager’s patch), as well as for tests and CI where global state must be avoided.

The extra keyword max_steps_override accepts an integer to cap the diagnostic loop independently of the options['maxiter'] that builds the underlying SLSQP; None (default) honours the option dict’s own value.

Return type:

tuple[Solution, DebugReport]

Parameters:
Returns:

A (sol, report) tuple. sol is a normal optx.Solution indistinguishable from what slsqp_jax.minimize_like_scipy() would have returned (modulo the device-resident state the diagnostics layer keeps around); report is the populated DebugReport with all signals + diagnoses already evaluated.

Raises:

Whatever minimize_like_scipy would raise plus, when throw=True and the run did not converge, optimistix.RESULTS-flavoured runtime errors via EnumerationItem.error_if.

slsqp_jax.diagnostic_run(*, max_steps=None, print_on_exit='auto', output=None)[source]

Context manager that intercepts optimistix.minimise calls.

Inside the with block, any optimistix.minimise(fn, solver, ...) whose solver is an slsqp_jax.SLSQP instance is re-routed through slsqp_jax.diagnostics.debug_run(), the resulting DebugRunResult and DebugReport are stashed on the yielded DiagnosticContext, and a fully populated optx.Solution is returned to the caller — so the user’s wrapper code (whether it goes through slsqp_jax.minimize_like_scipy() or builds its own SLSQP) runs unchanged. Calls whose solver is something other than SLSQP are forwarded to the original optimistix.minimise unchanged, with the only side effect being a bump on DiagnosticContext.passthrough_calls.

Return type:

Iterator[DiagnosticContext]

Parameters:
  • max_steps (int | None)

  • print_on_exit (Literal['auto', 'always', 'never'])

  • output (Any)

Args:
max_steps: Optional iteration-budget override forwarded to

every intercepted debug_run. None (the default) uses each call’s own max_steps argument, falling back to solver.max_steps.

print_on_exit: When the context exits, print captured reports

to output. "auto" (default): only failing runs plus a trailer; "always": every run plus the trailer; "never": nothing.

output: File-like sink for the on-exit print. Defaults to

sys.stdout.

Yields:

DiagnosticContext collecting the runs and reports for every intercepted call.

Raises:
RuntimeError: If another diagnostic_run() is already

active (re-entrant or concurrent enter). The patch is a global mutation; serialising would mask bugs.

Example:

with slsqp_jax.diagnostic_run() as ctx:
    sol = slsqp_jax.minimize_like_scipy(fun, x0, ...)
# On exit, any failing report has already been printed.
# The Solution is the real optimistix.Solution.
ctx.runs[0].diagnostics  # SLSQPDiagnostics counters
class slsqp_jax.AbstractInnerSolver[source]

Bases: Module

Strategy for solving the equality-constrained QP subproblem.

Subclasses implement solve to compute the search direction d and Lagrange multipliers for the active constraints.

abstractmethod solve(hvp_fn, g, A, b, active_mask, precond_fn=None, free_mask=None, d_fixed=None, adaptive_tol=None)[source]

Solve the equality-constrained QP subproblem.

Solves:

minimize    (1/2) d^T B d + g^T d
subject to  A[active] d = b[active]
            d[i] = d_fixed[i]  for i where free_mask[i] is False

where B is given implicitly via hvp_fn(v) = B @ v.

Return type:

InnerSolveResult

Parameters:
  • hvp_fn (Callable[[Float[Array, 'n']], Float[Array, 'n']])

  • g (Float[Array, 'n'])

  • A (Float[Array, 'm n'])

  • b (Float[Array, 'm'])

  • active_mask (Bool[Array, 'm'])

  • precond_fn (Callable[[Float[Array, 'n']], Float[Array, 'n']] | None)

  • free_mask (Bool[Array, 'n'] | None)

  • d_fixed (Float[Array, 'n'] | None)

  • adaptive_tol (Float[Array, ''] | float | None)

Args:

hvp_fn: Hessian-vector product function v -> B @ v. g: Linear term (gradient of objective). A: Combined constraint matrix (m x n). b: Combined RHS vector (m,). active_mask: Boolean mask (m,) indicating active constraints. precond_fn: Optional preconditioner v -> M @ v where M ~ B^{-1}. free_mask: Optional boolean mask (n,). When provided, only

variables with free_mask[i] = True are optimized.

d_fixed: Values for fixed variables (n,). Required when

free_mask is provided.

adaptive_tol: Optional Eisenstat-Walker tolerance override.

When provided, overrides the solver’s default convergence tolerance for this call only.

Returns:

InnerSolveResult with the direction, multipliers, and convergence flag.

build_projection_context(hvp_fn, g, A, b, active_mask, precond_fn=None, free_mask=None, d_fixed=None)[source]

Build a reusable projector + multiplier-recovery context.

Composed strategies (e.g. HRInexactSTCG) call this on the underlying inner solver to obtain its null-space projector, particular solution and multiplier-recovery closure without running the projector’s own CG loop.

The default implementation raises NotImplementedError so full-KKT solvers (MinresQLPSolver) cleanly opt out — they have no separate projection step and therefore cannot supply the inexact-projector W̃_k that HR Algorithm 4.5 needs.

Return type:

ProjectionContext

Parameters:
  • hvp_fn (Callable[[Float[Array, 'n']], Float[Array, 'n']])

  • g (Float[Array, 'n'])

  • A (Float[Array, 'm n'])

  • b (Float[Array, 'm'])

  • active_mask (Bool[Array, 'm'])

  • precond_fn (Callable[[Float[Array, 'n']], Float[Array, 'n']] | None)

  • free_mask (Bool[Array, 'n'] | None)

  • d_fixed (Float[Array, 'n'] | None)

__init__()
Return type:

None

class slsqp_jax.HRInexactSTCG[source]

Bases: AbstractInnerSolver

Heinkenschloss-Ridzal (2014) Algorithm 4.5 — STCG with inexact null-space projections.

Composes an existing null-space inner solver (ProjectedCGCholesky or ProjectedCGCraig) to obtain its projector W̃_k, particular solution d_p and multiplier-recovery closure, then runs a separate CG iteration on top whose three textbook three-term-recurrence cancellations are replaced by full H-conjugacy reorthogonalisation against every previous search direction.

See AGENTS.md (“Pluggable Inner QP Solvers” → HRInexactSTCG) for the full algorithmic discussion and references.

Attributes:
inner: Composed null-space inner solver supplying the

projector and multiplier-recovery infrastructure. Must implement build_projection_context; the saddle-point MinresQLPSolver does not and will raise on the first solve call.

max_cg_iter: Static upper bound on the number of inner CG

iterations. Determines the size of the reorth buffers.

cg_tol: Relative convergence tolerance for the projected

residual ‖z̃_i‖ tol · ‖r̃_0‖.

cg_regularization: Curvature-guard threshold δ² used by

the SNOPT-style scale-invariant short-circuit ⟨p̃, H p̃⟩ δ² ‖p̃‖². Defaults to 1e-6; set to 0.0 to disable.

inner: AbstractInnerSolver
max_cg_iter: int
cg_tol: Float[Array, ''] | float
cg_regularization: float = 1e-06
build_projection_context(hvp_fn, g, A, b, active_mask, precond_fn=None, free_mask=None, d_fixed=None)[source]

Build a reusable projector + multiplier-recovery context.

Composed strategies (e.g. HRInexactSTCG) call this on the underlying inner solver to obtain its null-space projector, particular solution and multiplier-recovery closure without running the projector’s own CG loop.

The default implementation raises NotImplementedError so full-KKT solvers (MinresQLPSolver) cleanly opt out — they have no separate projection step and therefore cannot supply the inexact-projector W̃_k that HR Algorithm 4.5 needs.

Return type:

ProjectionContext

Parameters:
  • hvp_fn (Callable[[Float[Array, 'n']], Float[Array, 'n']])

  • g (Float[Array, 'n'])

  • A (Float[Array, 'm n'])

  • b (Float[Array, 'm'])

  • active_mask (Bool[Array, 'm'])

  • precond_fn (Callable[[Float[Array, 'n']], Float[Array, 'n']] | None)

  • free_mask (Bool[Array, 'n'] | None)

  • d_fixed (Float[Array, 'n'] | None)

solve(hvp_fn, g, A, b, active_mask, precond_fn=None, free_mask=None, d_fixed=None, adaptive_tol=None)[source]

Solve the equality-constrained QP subproblem.

Solves:

minimize    (1/2) d^T B d + g^T d
subject to  A[active] d = b[active]
            d[i] = d_fixed[i]  for i where free_mask[i] is False

where B is given implicitly via hvp_fn(v) = B @ v.

Return type:

InnerSolveResult

Parameters:
  • hvp_fn (Callable[[Float[Array, 'n']], Float[Array, 'n']])

  • g (Float[Array, 'n'])

  • A (Float[Array, 'm n'])

  • b (Float[Array, 'm'])

  • active_mask (Bool[Array, 'm'])

  • precond_fn (Callable[[Float[Array, 'n']], Float[Array, 'n']] | None)

  • free_mask (Bool[Array, 'n'] | None)

  • d_fixed (Float[Array, 'n'] | None)

  • adaptive_tol (Float[Array, ''] | float | None)

Args:

hvp_fn: Hessian-vector product function v -> B @ v. g: Linear term (gradient of objective). A: Combined constraint matrix (m x n). b: Combined RHS vector (m,). active_mask: Boolean mask (m,) indicating active constraints. precond_fn: Optional preconditioner v -> M @ v where M ~ B^{-1}. free_mask: Optional boolean mask (n,). When provided, only

variables with free_mask[i] = True are optimized.

d_fixed: Values for fixed variables (n,). Required when

free_mask is provided.

adaptive_tol: Optional Eisenstat-Walker tolerance override.

When provided, overrides the solver’s default convergence tolerance for this call only.

Returns:

InnerSolveResult with the direction, multipliers, and convergence flag.

__init__(inner, max_cg_iter, cg_tol, cg_regularization=1e-06)
Parameters:
Return type:

None

class slsqp_jax.InnerSolveResult[source]

Bases: NamedTuple

Result from an inner equality-constrained QP solve.

Attributes:

d: Search direction. multipliers: Lagrange multipliers (shape (m,); entries for

inactive constraints are zero).

converged: True when the inner Krylov / projection iteration

satisfied its tolerance.

proj_residual: Post-solve constraint residual ||A d - b||

(Euclidean norm, restricted to active rows). Always 0 for null-space solvers (CG / CRAIG) where feasibility is enforced structurally; non-zero for MinresQLPSolver where it reflects the floor of the M-metric range-space projection after iterative refinement.

n_proj_refinements: Number of M-metric projection refinement

rounds actually applied. Always 0 for null-space solvers. At most MinresQLPSolver.proj_refine_max_iter.

projected_grad_norm: Norm of the projected initial gradient

W̃_k g that the inner solver actually iterated against (HR 2014, Theorem 3.5). This is the noise-aware stationarity proxy: when the outer SQP enables use_inexact_stationarity, the run is allowed to converge once this value drops below rtol * max(mu_max, 1) (filterSQP eq. 6 with the shared denominator from slsqp_jax.slsqp.termination.compute_mu_max()). Defaults to inf so that solvers which do not produce this quantity (i.e. anything other than HRInexactSTCG) cannot accidentally satisfy a < rtol test even if the user toggles the flag — the inexact path silently degrades to “never converges this way”.

d: Float[Array, 'n']

Alias for field number 0

multipliers: Float[Array, 'm']

Alias for field number 1

converged: Bool[Array, '']

Alias for field number 2

proj_residual: Float[Array, '']

Alias for field number 3

n_proj_refinements: Int[Array, '']

Alias for field number 4

projected_grad_norm: Float[Array, '']

Alias for field number 5

class slsqp_jax.MinresQLPSolver[source]

Bases: AbstractInnerSolver

Preconditioned MINRES-QLP on the full saddle-point KKT system.

Solves the KKT system directly:

[B    A^T] [d]       [-g]
[A    0  ] [lambda] = [b ]

using PMINRES-QLP (Choi, Paige & Saunders, SISC 2011, Table 3.5) with a block-diagonal SPD preconditioner:

M = [B_diag^{-1}    0      ]
    [0              S^{-1} ]

where B_diag = diag(B_0) (L-BFGS diagonal) and S = A B_diag^{-1} A^T is the Schur complement.

After PMINRES-QLP returns the iterate d, an M-metric range-space projection drives A d = b on the active rows. The single shot is followed by up to proj_refine_max_iter rounds of iterative refinement, each costing one matvec + one Schur back-solve (no refactorisation). Refinement squares the relative feasibility error per round. See HR (2014, Algorithm 4.18 step 1(a)) for the motivation.

max_iter: int = 200
tol: float = 1e-10
max_cg_iter: int = 50
proj_refine_max_iter: int = 3
proj_refine_rtol: float = 1e-10
proj_refine_atol: float = 1e-14
solve(hvp_fn, g, A, b, active_mask, precond_fn=None, free_mask=None, d_fixed=None, adaptive_tol=None)[source]

Solve the equality-constrained QP subproblem.

Solves:

minimize    (1/2) d^T B d + g^T d
subject to  A[active] d = b[active]
            d[i] = d_fixed[i]  for i where free_mask[i] is False

where B is given implicitly via hvp_fn(v) = B @ v.

Return type:

InnerSolveResult

Parameters:
  • hvp_fn (Callable[[Float[Array, 'n']], Float[Array, 'n']])

  • g (Float[Array, 'n'])

  • A (Float[Array, 'm n'])

  • b (Float[Array, 'm'])

  • active_mask (Bool[Array, 'm'])

  • precond_fn (Callable[[Float[Array, 'n']], Float[Array, 'n']] | None)

  • free_mask (Bool[Array, 'n'] | None)

  • d_fixed (Float[Array, 'n'] | None)

  • adaptive_tol (Float[Array, ''] | float | None)

Args:

hvp_fn: Hessian-vector product function v -> B @ v. g: Linear term (gradient of objective). A: Combined constraint matrix (m x n). b: Combined RHS vector (m,). active_mask: Boolean mask (m,) indicating active constraints. precond_fn: Optional preconditioner v -> M @ v where M ~ B^{-1}. free_mask: Optional boolean mask (n,). When provided, only

variables with free_mask[i] = True are optimized.

d_fixed: Values for fixed variables (n,). Required when

free_mask is provided.

adaptive_tol: Optional Eisenstat-Walker tolerance override.

When provided, overrides the solver’s default convergence tolerance for this call only.

Returns:

InnerSolveResult with the direction, multipliers, and convergence flag.

__init__(max_iter=200, tol=1e-10, max_cg_iter=50, proj_refine_max_iter=3, proj_refine_rtol=1e-10, proj_refine_atol=1e-14)
Parameters:
  • max_iter (int)

  • tol (float)

  • max_cg_iter (int)

  • proj_refine_max_iter (int)

  • proj_refine_rtol (float)

  • proj_refine_atol (float)

Return type:

None

class slsqp_jax.ProjectedCGCholesky[source]

Bases: AbstractInnerSolver

Projected CG with Cholesky-based null-space projection.

This is the original implementation: Cholesky-factor A A^T (with regularization), use it for the null-space projector and particular solution, run CG in the null space, and recover multipliers via iterative refinement.

When use_constraint_preconditioner is True and a preconditioner is provided, the constraint preconditioner (Gould, Hribar & Nocedal, 2001) is used instead of the naive P(M(r)).

max_cg_iter: int
cg_tol: Float[Array, ''] | float
cg_regularization: float = 1e-06
use_constraint_preconditioner: bool = False
solve(hvp_fn, g, A, b, active_mask, precond_fn=None, free_mask=None, d_fixed=None, adaptive_tol=None)[source]

Solve the equality-constrained QP subproblem.

Solves:

minimize    (1/2) d^T B d + g^T d
subject to  A[active] d = b[active]
            d[i] = d_fixed[i]  for i where free_mask[i] is False

where B is given implicitly via hvp_fn(v) = B @ v.

Return type:

InnerSolveResult

Parameters:
  • hvp_fn (Callable[[Float[Array, 'n']], Float[Array, 'n']])

  • g (Float[Array, 'n'])

  • A (Float[Array, 'm n'])

  • b (Float[Array, 'm'])

  • active_mask (Bool[Array, 'm'])

  • precond_fn (Callable[[Float[Array, 'n']], Float[Array, 'n']] | None)

  • free_mask (Bool[Array, 'n'] | None)

  • d_fixed (Float[Array, 'n'] | None)

  • adaptive_tol (Float[Array, ''] | float | None)

Args:

hvp_fn: Hessian-vector product function v -> B @ v. g: Linear term (gradient of objective). A: Combined constraint matrix (m x n). b: Combined RHS vector (m,). active_mask: Boolean mask (m,) indicating active constraints. precond_fn: Optional preconditioner v -> M @ v where M ~ B^{-1}. free_mask: Optional boolean mask (n,). When provided, only

variables with free_mask[i] = True are optimized.

d_fixed: Values for fixed variables (n,). Required when

free_mask is provided.

adaptive_tol: Optional Eisenstat-Walker tolerance override.

When provided, overrides the solver’s default convergence tolerance for this call only.

Returns:

InnerSolveResult with the direction, multipliers, and convergence flag.

build_projection_context(hvp_fn, g, A, b, active_mask, precond_fn=None, free_mask=None, d_fixed=None)[source]

Build a reusable projector + multiplier-recovery context.

Composed strategies (e.g. HRInexactSTCG) call this on the underlying inner solver to obtain its null-space projector, particular solution and multiplier-recovery closure without running the projector’s own CG loop.

The default implementation raises NotImplementedError so full-KKT solvers (MinresQLPSolver) cleanly opt out — they have no separate projection step and therefore cannot supply the inexact-projector W̃_k that HR Algorithm 4.5 needs.

Return type:

ProjectionContext

Parameters:
  • hvp_fn (Callable[[Float[Array, 'n']], Float[Array, 'n']])

  • g (Float[Array, 'n'])

  • A (Float[Array, 'm n'])

  • b (Float[Array, 'm'])

  • active_mask (Bool[Array, 'm'])

  • precond_fn (Callable[[Float[Array, 'n']], Float[Array, 'n']] | None)

  • free_mask (Bool[Array, 'n'] | None)

  • d_fixed (Float[Array, 'n'] | None)

__init__(max_cg_iter, cg_tol, cg_regularization=1e-06, use_constraint_preconditioner=False)
Parameters:
  • max_cg_iter (int)

  • cg_tol (Float[Array, ''] | float)

  • cg_regularization (float)

  • use_constraint_preconditioner (bool)

Return type:

None

class slsqp_jax.ProjectedCGCraig[source]

Bases: AbstractInnerSolver

Projected CG with CRAIG-based iterative null-space projection.

Replaces the Cholesky factorization of A A^T with iterative CRAIG solves (Golub-Kahan bidiagonalization). This eliminates the O(m^3) factorization cost and the 1e-8 diagonal regularization, at the cost of an iterative solve per projection.

For multiplier recovery (done once after the CG loop), CG on the normal equations A A^T y = rhs is used, reusing the existing solve_unconstrained_cg infrastructure.

max_cg_iter: int
cg_tol: Float[Array, ''] | float
cg_regularization: float = 1e-06
use_constraint_preconditioner: bool = False
craig_tol: float = 1e-10
craig_max_iter: int = 200
mult_recovery_tol: float = 1e-12
mult_recovery_max_iter: int = 200
solve(hvp_fn, g, A, b, active_mask, precond_fn=None, free_mask=None, d_fixed=None, adaptive_tol=None)[source]

Solve the equality-constrained QP subproblem.

Solves:

minimize    (1/2) d^T B d + g^T d
subject to  A[active] d = b[active]
            d[i] = d_fixed[i]  for i where free_mask[i] is False

where B is given implicitly via hvp_fn(v) = B @ v.

Return type:

InnerSolveResult

Parameters:
  • hvp_fn (Callable[[Float[Array, 'n']], Float[Array, 'n']])

  • g (Float[Array, 'n'])

  • A (Float[Array, 'm n'])

  • b (Float[Array, 'm'])

  • active_mask (Bool[Array, 'm'])

  • precond_fn (Callable[[Float[Array, 'n']], Float[Array, 'n']] | None)

  • free_mask (Bool[Array, 'n'] | None)

  • d_fixed (Float[Array, 'n'] | None)

  • adaptive_tol (Float[Array, ''] | float | None)

Args:

hvp_fn: Hessian-vector product function v -> B @ v. g: Linear term (gradient of objective). A: Combined constraint matrix (m x n). b: Combined RHS vector (m,). active_mask: Boolean mask (m,) indicating active constraints. precond_fn: Optional preconditioner v -> M @ v where M ~ B^{-1}. free_mask: Optional boolean mask (n,). When provided, only

variables with free_mask[i] = True are optimized.

d_fixed: Values for fixed variables (n,). Required when

free_mask is provided.

adaptive_tol: Optional Eisenstat-Walker tolerance override.

When provided, overrides the solver’s default convergence tolerance for this call only.

Returns:

InnerSolveResult with the direction, multipliers, and convergence flag.

build_projection_context(hvp_fn, g, A, b, active_mask, precond_fn=None, free_mask=None, d_fixed=None)[source]

Build a reusable projector + multiplier-recovery context.

Composed strategies (e.g. HRInexactSTCG) call this on the underlying inner solver to obtain its null-space projector, particular solution and multiplier-recovery closure without running the projector’s own CG loop.

The default implementation raises NotImplementedError so full-KKT solvers (MinresQLPSolver) cleanly opt out — they have no separate projection step and therefore cannot supply the inexact-projector W̃_k that HR Algorithm 4.5 needs.

Return type:

ProjectionContext

Parameters:
  • hvp_fn (Callable[[Float[Array, 'n']], Float[Array, 'n']])

  • g (Float[Array, 'n'])

  • A (Float[Array, 'm n'])

  • b (Float[Array, 'm'])

  • active_mask (Bool[Array, 'm'])

  • precond_fn (Callable[[Float[Array, 'n']], Float[Array, 'n']] | None)

  • free_mask (Bool[Array, 'n'] | None)

  • d_fixed (Float[Array, 'n'] | None)

__init__(max_cg_iter, cg_tol, cg_regularization=1e-06, use_constraint_preconditioner=False, craig_tol=1e-10, craig_max_iter=200, mult_recovery_tol=1e-12, mult_recovery_max_iter=200)
Parameters:
  • max_cg_iter (int)

  • cg_tol (Float[Array, ''] | float)

  • cg_regularization (float)

  • use_constraint_preconditioner (bool)

  • craig_tol (float)

  • craig_max_iter (int)

  • mult_recovery_tol (float)

  • mult_recovery_max_iter (int)

Return type:

None

class slsqp_jax.ProjectionContext[source]

Bases: NamedTuple

Reusable bundle of projector + particular-solution + multiplier-recovery closures for an active equality system A_active d = b_active.

Existing strategies (ProjectedCGCholesky, ProjectedCGCraig) build these inline inside their solve methods. Composed strategies (e.g. HRInexactSTCG) call inner.build_projection_context(...) to reuse the projector and multiplier-recovery infrastructure of an underlying null-space solver while running their own CG loop on top.

Attributes:
project: Inexact null-space projector W̃_k(v). Maps a vector

in the (free) ambient space to its projection onto null(A_work) using whatever inner approximation the underlying strategy provides (Cholesky for ProjectedCGCholesky, CRAIG for ProjectedCGCraig).

d_p: Particular solution. A_work @ d_p == b_work to inner

solver precision; d_p already incorporates d_fixed on the bound-fixed coordinates, so it lives in the full n- dimensional space.

recover_multipliers: Closure mapping (B d + g) to a length-m

multiplier vector with zeros on inactive rows. Encapsulates both the inversion of A_work A_workᵀ and the active-mask zeroing. HR Algorithm 4.5 calls this once per outer step (after the modified-residual CG iteration converges).

hvp_work: Working-subspace HVP. Equal to hvp_fn when no bound

fixing is in effect; otherwise v -> _free * hvp_fn(_free * v) so the iteration only sees the free coordinates.

g_eff: Effective gradient g + B @ d_p evaluated against

hvp_work. HR’s notation calls this g_k; it is the input to the projected-residual recurrence.

A_work: Already-masked working constraint matrix (active rows,

free columns). Surfaced primarily so callers can sanity-check the residual ||A_work @ d - b_work||; project and recover_multipliers already incorporate it.

free_mask: Boolean mask of free variables (always present;

equals ones(n) when no bound-fixing is in effect).

d_fixed: Fixed-variable values on bound-active coordinates

(zeros elsewhere; zeros everywhere when no bound-fixing).

has_fixed: True iff any coordinate is bound-fixed. Cheap

indicator so consumers do not have to redo the mask check.

converged: Convergence flag of the inner projector solve that

built the context (always True for Cholesky; carries the CRAIG breakdown / convergence flag for the iterative projector). Composed strategies AND this with their own convergence to surface inner-projector failures upstream.

project: Callable[[Float[Array, 'n']], Float[Array, 'n']]

Alias for field number 0

d_p: Float[Array, 'n']

Alias for field number 1

recover_multipliers: Callable[[Float[Array, 'n']], Float[Array, 'm']]

Alias for field number 2

hvp_work: Callable[[Float[Array, 'n']], Float[Array, 'n']]

Alias for field number 3

g_eff: Float[Array, 'n']

Alias for field number 4

A_work: Float[Array, 'm n']

Alias for field number 5

free_mask: Bool[Array, 'n']

Alias for field number 6

d_fixed: Float[Array, 'n']

Alias for field number 7

has_fixed: bool

Alias for field number 8

converged: Bool[Array, '']

Alias for field number 9

slsqp_jax.solve_qp(hvp_fn, g, A_eq, b_eq, A_ineq, b_ineq, max_iter=100, max_cg_iter=50, tol=1e-08, expand_factor=1.0, initial_active_set=None, kkt_residual=0.0, proximal_mu=0.0, prev_multipliers_eq=None, precond_fn=None, cg_tol=None, cg_regularization=1e-06, use_proximal=True, predicted_active_set=None, active_set_method='expand', use_constraint_preconditioner=False, inner_solver=None, mult_drop_floor=1e-06, ping_pong_threshold=2147483647)[source]

Solve a QP with equality and inequality constraints.

Solves:

minimize    (1/2) d^T H d + g^T d
subject to  A_eq d = b_eq
            A_ineq d >= b_ineq

where H is provided implicitly via hvp_fn(v) = H @ v.

The QP active-set loop dispatches to one of three strategies:

  • Proximal sSQP (m_eq > 0 and use_proximal=True): equality constraints absorbed into the objective via augmented-Lagrangian penalty. See slsqp_jax.qp.proximal.

  • Direct projection (m_eq > 0 and use_proximal=False): equality constraints enforced via null-space projection in the inner solver. See slsqp_jax.qp.direct.

  • Inequality-only (m_eq == 0): no equality block; just an active-set loop on inequality constraints. See slsqp_jax.qp.inequality.

All three share the same EXPAND / ping-pong active-set loop body (see slsqp_jax.qp.active_set).

Args:

hvp_fn: Hessian-vector product function v -> H @ v. g: Linear term of the objective (gradient). A_eq: Equality constraint matrix (m_eq x n). b_eq: Equality constraint RHS (m_eq,). A_ineq: Inequality constraint matrix (m_ineq x n). b_ineq: Inequality constraint RHS (m_ineq,). max_iter: Maximum active-set iterations. max_cg_iter: Maximum CG iterations per active-set step. tol: Feasibility and optimality tolerance. expand_factor: EXPAND tolerance growth rate. initial_active_set: Optional warm-start active set from a

previous QP solve.

kkt_residual: Norm of the KKT residual from the outer solver. proximal_mu: Adaptive proximal parameter for sSQP. prev_multipliers_eq: Equality multipliers from the previous

outer iteration (proximal centre).

precond_fn: Optional preconditioner v -> M @ v. cg_tol: Optional CG convergence tolerance overriding tol

for the inner solver only.

cg_regularization: Curvature-guard threshold for CG. use_proximal: When True, equality constraints go through the

sSQP proximal path. When False, direct projection.

predicted_active_set: Optional LPEC-A predicted active set

for warm-start.

active_set_method: "expand", "lpeca_init", or "lpeca". use_constraint_preconditioner: Used only when constructing a

default inner_solver.

inner_solver: Pluggable strategy for the inner equality-

constrained QP solve. Defaults to ProjectedCGCholesky.

mult_drop_floor: Floor on the negative-multiplier drop test. ping_pong_threshold: Threshold for the explicit ping-pong

short-circuit. Defaults to 2**31 - 1 (effectively disabled).

Returns:

QPSolverResult containing the solution, multipliers, active set, and convergence info.

Return type:

QPSolverResult

Parameters:
  • hvp_fn (Callable)

  • g (Float[Array, 'n'])

  • A_eq (Float[Array, 'm_eq n'])

  • b_eq (Float[Array, 'm_eq'])

  • A_ineq (Float[Array, 'm_ineq n'])

  • b_ineq (Float[Array, 'm_ineq'])

  • max_iter (int)

  • max_cg_iter (int)

  • tol (Float[Array, ''] | float)

  • expand_factor (float)

  • initial_active_set (Bool[Array, 'm_ineq'] | None)

  • kkt_residual (Float[Array, ''] | float)

  • proximal_mu (Float[Array, ''] | float)

  • prev_multipliers_eq (Float[Array, 'm_eq'] | None)

  • precond_fn (Callable | None)

  • cg_tol (Float[Array, ''] | float | None)

  • cg_regularization (float)

  • use_proximal (bool)

  • predicted_active_set (Bool[Array, 'm_ineq'] | None)

  • active_set_method (str)

  • use_constraint_preconditioner (bool)

  • inner_solver (AbstractInnerSolver | None)

  • mult_drop_floor (float)

  • ping_pong_threshold (int)

class slsqp_jax.LBFGSHistory[source]

Bases: Module

L-BFGS history buffer for matrix-free Hessian approximation.

Stores the last k (s, y) pairs in a circular buffer and provides efficient Hessian-vector products via the compact representation.

Attributes:

s_history: Stored step vectors s_i = x_{i+1} - x_i. y_history: Stored (damped) gradient differences y_i. gamma: Scalar summary of the initial Hessian scaling. diagonal: Per-variable initial Hessian scaling (B_0 = diag(d)).

During normal operation this equals gamma * ones(n). After an SNOPT-style reset it stores per-variable curvature.

count: Number of valid pairs stored (0 to memory size). next_idx: Next write position in the circular buffer. eig_lower: Estimate of lambda_min(H) from VARCHEN Theorem 2. eig_upper: Estimate of lambda_max(H) from VARCHEN Theorem 2.

s_history: Float[Array, 'memory n']
y_history: Float[Array, 'memory n']
gamma: Float[Array, '']
diagonal: Float[Array, 'n']
count: Int[Array, '']
next_idx: Int[Array, '']
eig_lower: Float[Array, '']
eig_upper: Float[Array, '']
__init__(s_history, y_history, gamma, diagonal, count, next_idx, eig_lower, eig_upper)
Parameters:
  • s_history (Float[Array, 'memory n'])

  • y_history (Float[Array, 'memory n'])

  • gamma (Float[Array, ''])

  • diagonal (Float[Array, 'n'])

  • count (Int[Array, ''])

  • next_idx (Int[Array, ''])

  • eig_lower (Float[Array, ''])

  • eig_upper (Float[Array, ''])

Return type:

None

slsqp_jax.lbfgs_init(n, memory)[source]

Initialize an empty L-BFGS history buffer.

Return type:

LBFGSHistory

Parameters:
Args:

n: Dimension of the parameter space. memory: Maximum number of (s, y) pairs to store (typically 5-20).

Returns:

An initialized LBFGSHistory with no stored pairs and gamma=1.

slsqp_jax.lbfgs_hvp(history, v)[source]

Compute B @ v using the L-BFGS compact representation.

Uses the compact form with diagonal initial Hessian B_0 = diag(d) (Byrd, Nocedal & Schnabel, 1994, Theorem 2.2):

B = B_0 - [B_0 S, Y] @ M^{-1} @ [S^T B_0; Y^T]

where:

M = [[S^T B_0 S, L], [L^T, -D_sy]] (2k x 2k) L_{ij} = s_i^T y_j for i > j (strictly lower triangular) D_sy = diag(s_i^T y_i) (diagonal)

When no pairs are stored (count=0), this reduces to B = diag(d).

Complexity: O(k^2 n) where k is the number of stored pairs.

Args:

history: L-BFGS history buffer. v: Vector to multiply by the Hessian approximation.

Returns:

B @ v, the Hessian-vector product.

Return type:

Float[Array, 'n']

Parameters:
slsqp_jax.lbfgs_inverse_hvp(history, v)[source]

Compute H @ v = B^{-1} @ v via the L-BFGS two-loop recursion.

Implements Nocedal & Wright Algorithm 7.4 with diagonal initial scaling H_0 = diag(1/diagonal) instead of scalar (1/gamma) I.

Complexity: O(kn) where k is the number of stored pairs.

Return type:

Float[Array, 'n']

Parameters:
Args:

history: L-BFGS history buffer. v: Vector to multiply by the inverse Hessian approximation.

Returns:

H @ v = B^{-1} @ v, the inverse Hessian-vector product.

slsqp_jax.lbfgs_append(history, s, y, damping_threshold=0.2, skip_threshold=1e-08, diag_floor=0.0001, diag_ceil=1000000.0)[source]

Append a new (s, y) pair to the L-BFGS history with Powell damping.

Uses VARCHEN-style damping toward B0 (Lotfi et al., 2020, eq 7-8): damping is computed against the diagonal initial Hessian B_0 = diag(diagonal) instead of the full L-BFGS approximation B. This is O(n) instead of O(k^2 n), always well-conditioned (the diagonal is clipped to [diag_floor, diag_ceil]), and avoids the circular dependency where a badly-conditioned B poisons its own damping.

Return type:

LBFGSHistory

Parameters:
The damped gradient difference is:

y_damped = theta * y + (1 - theta) * B0 s

where theta in [0, 1] is chosen to satisfy:

s^T y_damped >= threshold * s^T B0 s

If ||s|| is too small or the curvature ratio is too extreme, the update is skipped entirely to avoid numerical issues. The curvature_ratio bracket is [1e-8, 1e8] (more permissive than the previous [1e-6, 1e6]) so we no longer drop noisy-but- informative pairs on high-dimensional problems; these pairs instead flow through Powell damping. The relative_curvature < 1e-8 skip is now gated on sᵀy > 0 (positive curvature) so negative- curvature pairs are handled by damping rather than dropped.

After appending, the scalar gamma is updated to y_damped^T y_damped / (y_damped^T s) and clipped to [diag_floor, diag_ceil]. The per-variable diagonal is updated using a component-wise secant: d[i] = |y_damped[i] * s[i]| / (s[i]^2 + eps) and clipped to the same bracket. This gives d[i] |H_{ii}| for diagonal Hessians regardless of step direction, unlike the classical Shanno-Phua formula d[i] = y_i^2 / (y^T s) which produces d h_i^2 for multi-component steps.

The old 1e-6 floor on clip_lo allowed the inverse-Hessian diagonal H_0 = 1/d to balloon to 1e6, producing very large CG directions that the line search could not backtrack. Raising the floor to 1e-4 keeps the inverse diagonal bounded by 1e4 and dramatically reduces the “non-descent direction + chronic line- search backtracking” failure mode.

Args:

history: Current L-BFGS history. s: Step vector s = x_{k+1} - x_k. y: Gradient difference y = nabla L_{k+1} - nabla L_k. damping_threshold: Powell damping threshold (default 0.2). skip_threshold: Minimum step norm for update (default 1e-8). diag_floor: Minimum per-variable diagonal entry (default 1e-4). diag_ceil: Maximum per-variable diagonal entry (default 1e6).

Returns:

Updated L-BFGS history with the new pair appended.

slsqp_jax.lbfgs_compute_diagonal(history)[source]

Extract diag(B_k) from the L-BFGS compact representation.

From the compact form B = B_0 - W M^{-1} W^T, the diagonal is

diag(B_k) = diagonal - diag(W M^{-1} W^T)

where W = [B_0 S, Y] is (n, 2k) and M is the 2k x 2k inner matrix. The correction term is computed in O(k^2 n) by forming Q = W M^{-1} and summing (Q * W) row-wise.

This is used by lbfgs_reset() to implement the SNOPT diagonal reset strategy (Gill, Murray & Saunders, 2005, Section 3.3).

Return type:

Float[Array, 'n']

Parameters:

history (LBFGSHistory)

slsqp_jax.lbfgs_estimate_condition(history, damping_threshold=0.2)[source]

Estimate eigenvalue bounds of the inverse Hessian approximation H.

Uses the diagonal B_0 = diag(d) as a proxy for the full L-BFGS condition number. The inverse Hessian eigenvalues are bounded by [1/max(d), 1/min(d)], giving kappa(H) = max(d) / min(d).

This is a practical simplification of the VARCHEN Theorem 2 bounds (Lotfi et al., 2020). The full recursive bounds are theoretically tight but overly pessimistic in practice because they propagate worst-case Lipschitz estimates, leading to false soft-reset triggers on moderately conditioned problems. Since we damp toward B0 (not the full B), the diagonal condition number is the most relevant quantity.

Return type:

tuple[Float[Array, ''], Float[Array, '']]

Parameters:
Returns:

(lambda_min_est, lambda_max_est) bounding the eigenvalues of H.

slsqp_jax.lbfgs_reset(history, diag_floor=0.0001, diag_ceil=1000000.0)[source]

SNOPT-style diagonal reset of the L-BFGS history.

Extracts diag(B_k) from the current approximation, discards all stored (s, y) pairs, and restarts with B_0 = diag(diag(B_k)). This preserves per-variable curvature information across the reset, preventing the “everything is flat” effect that occurs when the scalar gamma becomes very small.

The extracted diagonal is clipped to [diag_floor, diag_ceil] to keep the reset point aligned with the clipping used inside lbfgs_append(); previous versions used a hardcoded [1e-2, 1e6] that diverged from the append-path floor.

Based on the SNOPT limited-memory reset strategy (Gill, Murray & Saunders, SIAM Review, 47(1), 2005, Section 3.3).

Return type:

LBFGSHistory

Parameters:
slsqp_jax.lbfgs_should_skip(s, y, skip_threshold=1e-08)[source]

Return the scalar should_skip decision that lbfgs_append uses.

Exposed so callers can observe the skip decision without having to diff fields on the returned history (which becomes meaningless once the circular buffer saturates, since count stays at memory from then on regardless of whether a real append happened).

Return type:

Bool[Array, '']

Parameters:
  • s (Float[Array, 'n'])

  • y (Float[Array, 'n'])

  • skip_threshold (float)

slsqp_jax.lbfgs_curvature_diagnostics(s, y, skip_threshold=1e-08)[source]

Return (s.y, relative_curvature, skipped) for the curvature pair.

Exposes the same intermediate quantities lbfgs_should_skip uses so the verbose callback can surface them without re-implementing the skip predicate. relative_curvature = |s.y| / (||s|| * ||y||) is the cosine of the angle between s and y and the dominant contributor to the skip decision (a near-zero value indicates the curvature pair is at the floating-point noise floor and gets skipped by the relative_curvature < skip_threshold branch of lbfgs_should_skip).

Return type:

tuple[Float[Array, ''], Float[Array, ''], Bool[Array, '']]

Parameters:
  • s (Float[Array, 'n'])

  • y (Float[Array, 'n'])

  • skip_threshold (float)

Args:

s: Step vector x_{k+1} - x_k. y: Gradient difference ∇L_{k+1} - ∇L_k (Lagrangian gradient

difference for SLSQP).

skip_threshold: Threshold passed to lbfgs_should_skip.

Returns:

(sty, relative_curvature, skipped), where sty = s.y is the raw curvature inner product, relative_curvature is the normalised version used in the skip predicate, and skipped is the same boolean lbfgs_should_skip returns.

slsqp_jax.lbfgs_soft_reset(history)[source]

VARCHEN-style soft reset: keep only the most recent (s, y) pair.

When the estimated condition number of the inverse Hessian exceeds a threshold, this drops all but the newest curvature pair. This is less aggressive than lbfgs_reset() (which extracts the diagonal and drops everything) or lbfgs_identity_reset() (which restores B = I), preserving the most relevant curvature information.

Based on VARCHEN Algorithm 1, Step 7 (Lotfi et al., 2020).

Return type:

LBFGSHistory

Parameters:

history (LBFGSHistory)

slsqp_jax.lbfgs_identity_reset(history)[source]

Hard reset of L-BFGS history to the identity Hessian.

Unlike lbfgs_reset() which preserves per-variable curvature, this discards everything and restarts with B_0 = I. Used as an escalation when repeated SNOPT-style diagonal resets fail to break an ill-conditioning cycle (e.g. consecutive QP failures where the extracted diagonal perpetuates the same problematic scaling).

Return type:

LBFGSHistory

Parameters:

history (LBFGSHistory)

slsqp_jax.estimate_hessian_diagonal(hvp_fn, n, key, n_probes=20)[source]

Estimate the diagonal of a matrix given only its HVP.

Uses the Bekas-Kokiopoulou-Saad (2007) stochastic estimator: for Rademacher random vectors z (entries i.i.d. from {-1, +1}), E[z * (H z)] = diag(H). Averaging over n_probes samples gives an unbiased estimate with variance O(||off-diag||^2 / k).

The estimate is accurate when the matrix is diagonally dominant, which is common for Hessians of smooth functions in the natural coordinate basis.

Return type:

Float[Array, 'n']

Parameters:
Args:

hvp_fn: Function v -> H @ v (Hessian-vector product). n: Dimension of the matrix. key: JAX PRNG key for generating random probes. n_probes: Number of Rademacher probes (default 20). More

probes reduce variance at the cost of more HVP calls. Each probe costs one forward-over-reverse AD pass.

Returns:

Estimated diagonal of H, shape (n,).

class slsqp_jax.LineSearchResult[source]

Bases: NamedTuple

Result from the line search.

Attributes:

alpha: The step size found. f_val: Function value at new point. eq_val: Equality constraint values at new point. ineq_val: Inequality constraint values at new point. success: Whether the line search succeeded. n_evals: Number of function evaluations.

alpha: Float[Array, '']

Alias for field number 0

f_val: Float[Array, '']

Alias for field number 1

eq_val: Float[Array, 'm_eq']

Alias for field number 2

ineq_val: Float[Array, 'm_ineq']

Alias for field number 3

success: Bool[Array, '']

Alias for field number 4

n_evals: Int[Array, '']

Alias for field number 5

slsqp_jax.compute_merit(f_val, eq_val, ineq_val, penalty)[source]

Compute the L1-exact penalty merit function value.

Return type:

Float[Array, '']

Parameters:
  • f_val (Float[Array, ''])

  • eq_val (Float[Array, 'm_eq'])

  • ineq_val (Float[Array, 'm_ineq'])

  • penalty (Float[Array, ''])

The merit function is:

φ(x; ρ) = f(x) + ρ * (‖c_eq(x)‖_1 + ‖max(0, -c_ineq(x))‖_1)

Args:

f_val: Objective function value f(x). eq_val: Equality constraint values c_eq(x). ineq_val: Inequality constraint values c_ineq(x). penalty: Penalty parameter ρ.

Returns:

Merit function value φ(x; ρ).

Perform backtracking line search with the L1 merit function.

Return type:

LineSearchResult

Parameters:
  • fn (Callable)

  • eq_constraint_fn (Callable | None)

  • ineq_constraint_fn (Callable | None)

  • x (Float[Array, 'n'])

  • direction (Float[Array, 'n'])

  • args (Any)

  • f_val (Float[Array, ''])

  • eq_val (Float[Array, 'm_eq'])

  • ineq_val (Float[Array, 'm_ineq'])

  • penalty (Float[Array, ''])

  • grad (Float[Array, 'n'])

  • c1 (float)

  • rho (float)

  • max_iter (int)

  • alpha_init (float)

  • bounds (Float[Array, 'n 2'] | None)

  • lower_bound_mask (tuple[bool, ...] | None)

  • upper_bound_mask (tuple[bool, ...] | None)

  • eq_jac (Float[Array, 'm_eq n'] | None)

  • ineq_jac (Float[Array, 'm_ineq_general n'] | None)

Finds α such that the Armijo condition is satisfied:

φ(x + α*d; ρ) ≤ φ(x; ρ) + c1 * α * φ’(x; d, ρ)

where φ is the L1 merit function and φ’ is the proper directional derivative including constraint Jacobian terms:

φ’(x; d, ρ) = ∇f·d + ρ Σ sign(c_eq_i)(J_eq d)_i
  • ρ Σ_{j: c_ineq_j<0} (J_ineq d)_j

When constraint Jacobians are not provided, falls back to the simpler ∇f·d approximation.

Args:

fn: Objective function fn(x, args) -> (f_val, aux). eq_constraint_fn: Equality constraint function or None. ineq_constraint_fn: Inequality constraint function or None. x: Current point. direction: Search direction. args: Arguments to pass to functions. f_val: Current objective value. eq_val: Current equality constraint values. ineq_val: Current inequality constraint values (including bounds). penalty: Penalty parameter. grad: Gradient of objective at x. c1: Armijo condition parameter (default 1e-4). rho: Step reduction factor (default 0.5). max_iter: Maximum number of iterations. alpha_init: Initial step size (default 1.0). bounds: Optional box constraints, shape (n, 2) with [lower, upper] per variable. lower_bound_mask: Tuple of bools indicating which lower bounds are finite. upper_bound_mask: Tuple of bools indicating which upper bounds are finite. eq_jac: Equality constraint Jacobian at x (m_eq, n), or None. ineq_jac: General inequality constraint Jacobian at x

(m_ineq_general, n), or None. Does NOT include bound constraint rows.

Returns:

LineSearchResult with the found step size and function values.

slsqp_jax.update_penalty_parameter(current_penalty, multipliers_eq, multipliers_ineq, margin=1.1)[source]

Update the penalty parameter based on Lagrange multipliers.

The penalty should be larger than the maximum absolute multiplier to ensure the merit function provides a descent direction.

ρ >= max(abs(λ_i), abs(μ_j)) + margin

Return type:

Float[Array, '']

Parameters:
  • current_penalty (Float[Array, ''])

  • multipliers_eq (Float[Array, 'm_eq'])

  • multipliers_ineq (Float[Array, 'm_ineq'])

  • margin (float)

Args:

current_penalty: Current penalty parameter. multipliers_eq: Lagrange multipliers for equality constraints. multipliers_ineq: Lagrange multipliers for inequality constraints. margin: Safety margin factor (default 1.1).

Returns:

Updated penalty parameter.

slsqp_jax.compute_lagrangian_gradient(grad_f, eq_jac, ineq_jac, multipliers_eq, multipliers_ineq)[source]

Compute the gradient of the Lagrangian function.

Return type:

Float[Array, 'n']

Parameters:
  • grad_f (Float[Array, 'n'])

  • eq_jac (Float[Array, 'm_eq n'])

  • ineq_jac (Float[Array, 'm_ineq n'])

  • multipliers_eq (Float[Array, 'm_eq'])

  • multipliers_ineq (Float[Array, 'm_ineq'])

The Lagrangian is:

L(x, lambda, mu) = f(x) - lambda^T c_eq(x) - mu^T c_ineq(x)

Its gradient with respect to x is:

nabla_x L = nabla f(x) - J_eq^T lambda - J_ineq^T mu

ineq_jac and multipliers_ineq must include the bound block when bounds are present (the ineq layout used by solver.py is [general; lower_bound; upper_bound]). Internally this is built on top of compute_partial_lagrangian_gradient() so the bound-multiplier recovery in the outer solver shares the same code path for the non-bound contribution.

Args:

grad_f: Gradient of objective function nabla f(x). eq_jac: Jacobian of equality constraints (m_eq x n). ineq_jac: Jacobian of inequality constraints (m_ineq x n),

including bound rows.

multipliers_eq: Lagrange multipliers for equality constraints. multipliers_ineq: Lagrange multipliers for inequality

constraints, including bound multipliers.

Returns:

Gradient of Lagrangian nabla_x L.

slsqp_jax.compute_partial_lagrangian_gradient(grad_f, eq_jac, multipliers_eq, gen_jac, multipliers_gen)[source]

Compute the partial Lagrangian gradient (without the bound block).

Returns nabla f(x) - J_eq(x)^T lambda - J_gen(x)^T mu_gen, where J_gen is the Jacobian of the general (nonlinear) inequality constraints — i.e. the inequality block excluding the constant identity-style rows that come from box bounds.

This is the core helper used by both compute_lagrangian_gradient (which adds the bound contribution on top) and the post-line-search NLP-level bound-multiplier recovery in solver.py: by reading off the partial Lagrangian gradient at indices that are at a bound at x_{k+1}, the recovery picks the bound multiplier that exactly zeros the corresponding component of the full Lagrangian gradient.

Return type:

Float[Array, 'n']

Parameters:
  • grad_f (Float[Array, 'n'])

  • eq_jac (Float[Array, 'm_eq n'])

  • multipliers_eq (Float[Array, 'm_eq'])

  • gen_jac (Float[Array, 'm_gen n'])

  • multipliers_gen (Float[Array, 'm_gen'])

Args:

grad_f: Gradient of the objective function nabla f(x). eq_jac: Jacobian of equality constraints (m_eq x n). multipliers_eq: Lagrange multipliers for equality constraints. gen_jac: Jacobian of general inequality constraints

(m_gen x n). Must exclude bound rows.

multipliers_gen: Lagrange multipliers for general inequality

constraints.

Returns:

Partial Lagrangian gradient with no bound contribution.

slsqp_jax.compute_rho_bar(c_ineq, c_eq, grad, A_ineq, A_eq, lambda_ineq, mu_eq)[source]

Compute the LPEC-A proximity measure rho_bar.

This implements Eq. 36 of Oberlin & Wright (2005), adapted to the c_ineq >= 0 (feasible) sign convention.

For feasible inequality constraints (c_ineq_i > 0), the contribution is sqrt(c_ineq_i * lambda_ineq_i); for violated constraints (c_ineq_i <= 0), it is -c_ineq_i (the violation magnitude).

Return type:

Float[Array, '']

Parameters:
  • c_ineq (Float[Array, 'm_ineq'])

  • c_eq (Float[Array, 'm_eq'])

  • grad (Float[Array, 'n'])

  • A_ineq (Float[Array, 'm_ineq n'])

  • A_eq (Float[Array, 'm_eq n'])

  • lambda_ineq (Float[Array, 'm_ineq'])

  • mu_eq (Float[Array, 'm_eq'])

Args:

c_ineq: Inequality constraint values at current point. c_eq: Equality constraint values at current point. grad: Objective gradient at current point. A_ineq: Inequality constraint Jacobian. A_eq: Equality constraint Jacobian. lambda_ineq: Multiplier estimates for inequality constraints. mu_eq: Multiplier estimates for equality constraints.

Returns:

The scalar proximity measure rho_bar >= 0.

slsqp_jax.identify_active_set_lpeca(c_ineq, c_eq, grad, A_ineq, A_eq, lambda_ineq, mu_eq, sigma=0.9, beta=None, trust_threshold=1.0)[source]

Predict the active inequality set using the LPEC-A threshold test.

Applies Eq. 43 of Oberlin & Wright (2005), adapted to our sign convention. An inequality constraint i is predicted active when c_ineq_i <= (beta * rho_bar) ** sigma and rho_bar is below the trust threshold (otherwise the prediction is empty).

The result is wrapped in an LPECAResult so the caller can distinguish “no constraints predicted active” (a valid prediction) from “LPEC-A bypassed” (valid=False) or “size cap fired” (capped=True).

Return type:

LPECAResult

Parameters:
  • c_ineq (Float[Array, 'm_ineq'])

  • c_eq (Float[Array, 'm_eq'])

  • grad (Float[Array, 'n'])

  • A_ineq (Float[Array, 'm_ineq n'])

  • A_eq (Float[Array, 'm_eq n'])

  • lambda_ineq (Float[Array, 'm_ineq'])

  • mu_eq (Float[Array, 'm_eq'])

  • sigma (float)

  • beta (float | None)

  • trust_threshold (float)

Args:

c_ineq: Inequality constraint values at current point. c_eq: Equality constraint values at current point. grad: Objective gradient at current point. A_ineq: Inequality constraint Jacobian. A_eq: Equality constraint Jacobian. lambda_ineq: Multiplier estimates for inequality constraints. mu_eq: Multiplier estimates for equality constraints. sigma: Threshold exponent (sigma_bar in the paper).

Must be in (0, 1). Default 0.9 per paper recommendation.

beta: Threshold scaling factor. Default None uses the

paper’s recommendation 1 / (m_ineq + n + m_eq).

trust_threshold: Maximum rho_bar for which the LPEC-A

prediction is trusted. When rho_bar > trust_threshold, the predicted set is empty (the iterate is considered too far from the solution for the asymptotic guarantees to apply). Default 1.0.

Returns:

LPECAResult containing the boolean prediction mask and the diagnostic flags valid / capped / rho_bar.

slsqp_jax.compute_lpeca_active_set(c_ineq, c_eq, grad, A_ineq, A_eq, lambda_ineq, mu_eq, sigma=0.9, beta=None, trust_threshold=1.0, use_lp=False, lp_lambda_bound=1000000.0, lp_eps=1e-06, lp_max_iter=1000)[source]

Compute the LPEC-A predicted active set, optionally refining multipliers.

This is the main entry point for LPEC-A active set identification. It optionally solves the LPEC-A LP to obtain tighter multiplier estimates, then applies the threshold test (with the trust gate and rank-aware size cap from identify_active_set_lpeca()).

Return type:

LPECAResult

Parameters:
  • c_ineq (Float[Array, 'm_ineq'])

  • c_eq (Float[Array, 'm_eq'])

  • grad (Float[Array, 'n'])

  • A_ineq (Float[Array, 'm_ineq n'])

  • A_eq (Float[Array, 'm_eq n'])

  • lambda_ineq (Float[Array, 'm_ineq'])

  • mu_eq (Float[Array, 'm_eq'])

  • sigma (float)

  • beta (float | None)

  • trust_threshold (float)

  • use_lp (bool)

  • lp_lambda_bound (float)

  • lp_eps (float)

  • lp_max_iter (int)

Args:

c_ineq: Inequality constraint values at current point. c_eq: Equality constraint values at current point. grad: Objective gradient at current point. A_ineq: Inequality constraint Jacobian. A_eq: Equality constraint Jacobian. lambda_ineq: Current multiplier estimates for inequalities. mu_eq: Current multiplier estimates for equalities. sigma: Threshold exponent (default 0.9). beta: Threshold scaling factor (default: paper recommendation). trust_threshold: Maximum rho_bar for which the prediction

is trusted (see identify_active_set_lpeca()).

use_lp: If True, solve the LPEC-A LP to refine multiplier

estimates before the threshold test. Requires mpax.

lp_lambda_bound: Upper bound K_1 on lambda in the LP. lp_eps: Tolerance for the LP solver. lp_max_iter: Maximum LP solver iterations.

Returns:

LPECAResult with the predicted active mask and diagnostic flags.

class slsqp_jax.LPECAResult[source]

Bases: NamedTuple

LPEC-A active-set identification result.

Attributes:
predicted: Boolean mask of shape (m_ineq,). True means

the constraint is predicted active. When valid=False this is the all-False mask (the trust gate fired).

valid: True when rho_bar is below the trust threshold,

so the prediction is theoretically meaningful. False indicates the iterate is too far from the solution for LPEC-A to be reliable.

capped: True when the rank-aware size cap truncated the

prediction (the raw threshold predicted more than n - m_eq - 1 constraints active, so the most confident entries were kept).

rho_bar: The scalar proximity measure used for the trust gate.

predicted: Bool[Array, 'm_ineq']

Alias for field number 0

valid: Bool[Array, '']

Alias for field number 1

capped: Bool[Array, '']

Alias for field number 2

rho_bar: Float[Array, '']

Alias for field number 3