slsqp_jax package¶
Top-level re-exports. Most users import from this namespace; the submodule pages below describe where each symbol actually lives.
slsqp_jax
SLSQP-JAX: Sequential Least Squares Programming in pure JAX.
This package provides a GPU-compatible implementation of the SLSQP optimization algorithm using JAX and the Optimistix framework. Designed for large-scale problems (n > 5000) using matrix-free L-BFGS Hessian approximation and projected conjugate gradient for the QP subproblem.
- class slsqp_jax.SLSQP[source]¶
Bases:
AbstractMinimiserSLSQP minimiser using Sequential Quadratic Programming.
See
README.mdanddocs/source/index.mdfor the full algorithmic description. The user-facing API collapses the legacy 40+ flat keyword arguments into a singleSLSQPConfiginstance grouping the parameters by purpose; consultslsqp_jax.configfor the sub-config dataclasses.- Attributes:
- config: Aggregate configuration. Defaults to
SLSQPConfigwith all sub-config defaults.- eq_constraint_fn: Function
(x, args) -> c_eq(x)evaluated for equality-constraint feasibility
c_eq(x) = 0.- ineq_constraint_fn: Function
(x, args) -> c_ineq(x) evaluated for inequality-constraint feasibility
c_ineq(x) >= 0.
n_eq_constraints: Number of equality constraints (static). n_ineq_constraints: Number of inequality constraints (static). bounds: Optional
(n, 2)array of[lower, upper]pervariable; iterates are projected onto this box after every step. Use
-inf/+inffor unbounded dimensions.obj_grad_fn / eq_jac_fn / ineq_jac_fn / obj_hvp_fn / eq_hvp_fn / ineq_hvp_fn: Optional user-supplied derivative
callables; the AD fallbacks (
jax.grad/jax.jacrev/ forward-over-reversejvp(grad(.))) are used when these areNone.- inner_solver: Optional pluggable inner equality-constrained
QP solver.
Noneconstructs a defaultProjectedCGCholeskyderived fromconfig.- verbose:
True/Falseor a custom(**kwargs) -> None callable for per-step diagnostics.
- norm() Shaped[Array, '']¶
Compute the L-infinity norm of a PyTree of arrays.
This is the largest absolute elementwise value. Considering the input x as a flat vector (x_1, …, x_n), then this computes max_i |x_i|.
- Parameters:
x (PyTree[jax.Array | numpy.ndarray | numpy.bool | numpy.number | bool | int | float | complex | jax._src.literals.TypedNdArray])
- Return type:
Shaped[Array, ‘’]
- config: SLSQPConfig¶
- eq_hvp_fn: Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'm n']] | None = None¶
- ineq_hvp_fn: Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'm n']] | None = None¶
- inner_solver: AbstractInnerSolver | None = None¶
- init(fn, y, args, options, f_struct, aux_struct, tags)[source]¶
Perform all initial computation needed to initialise the solver state.
For example, the [optimistix.Chord][] method computes the Jacobian df/dy with respect to the initial guess y, and then uses it throughout the computation.
Arguments:
- fn: The function to iterate over. This is expected to take two argumetns
fn(y, args) and return a pytree of arrays in the first element, and any auxiliary data in the second argument.
y: The value of y at the current (first) iteration.
args: Passed as the args of fn(y, args).
- options: Individual solvers may accept additional runtime arguments.
See each individual solver’s documentation for more details.
- f_struct: A pytree of `jax.ShapeDtypeStruct`s of the same shape as the
output of fn. This is used to initialise any information in the state which may rely on the pytree structure, array shapes, or dtype of the output of fn.
- aux_struct: A pytree of `jax.ShapeDtypeStruct`s of the same shape as the
auxiliary data returned by fn.
- tags: exact meaning depends on whether this is a fixed point, root find,
least squares, or minimisation problem; see their relevant entry points.
Returns:
A PyTree representing the initial state of the solver.
- step(fn, y, args, options, state, tags)[source]¶
Perform one step of the iterative solve.
Arguments:
- fn: The function to iterate over. This is expected to take two argumetns
fn(y, args) and return a pytree of arrays in the first element, and any auxiliary data in the second argument.
y: The value of y at the current (first) iteration.
args: Passed as the args of fn(y, args).
- options: Individual solvers may accept additional runtime arguments.
See each individual solver’s documentation for more details.
- state: A pytree representing the state of a solver. The shape of this
pytree is solver-dependent.
- tags: exact meaning depends on whether this is a fixed point, root find,
least squares, or minimisation problem; see their relevant entry points.
Returns:
A 3-tuple containing the new y value in the first element, the next solver state in the second element, and the aux output of fn(y, args) in the third element.
- terminate(fn, y, args, options, state, tags)[source]¶
Determine whether or not to stop the iterative solve.
Arguments:
- fn: The function to iterate over. This is expected to take two argumetns
fn(y, args) and return a pytree of arrays in the first element, and any auxiliary data in the second argument.
y: The value of y at the current iteration.
args: Passed as the args of fn(y, args).
- options: Individual solvers may accept additional runtime arguments.
See each individual solver’s documentation for more details.
- state: A pytree representing the state of a solver. The shape of this
pytree is solver-dependent.
- tags: exact meaning depends on whether this is a fixed point, root find,
least squares, or minimisation problem; see their relevant entry points.
Returns:
A 2-tuple containing a bool indicating whether or not to stop iterating in the first element, and an [optimistix.RESULTS][] object in the second element.
- postprocess(fn, y, aux, args, options, state, tags, result)[source]¶
Any final postprocessing to perform on the result of the solve.
Arguments:
- fn: The function to iterate over. This is expected to take two argumetns
fn(y, args) and return a pytree of arrays in the first element, and any auxiliary data in the second argument.
y: The value of y at the last iteration.
aux: The auxiliary output at the last iteration.
args: Passed as the args of fn(y, args).
- options: Individual solvers may accept additional runtime arguments.
See each individual solver’s documentation for more details.
- state: A pytree representing the final state of a solver. The shape of this
pytree is solver-dependent.
- tags: exact meaning depends on whether this is a fixed point, root find,
least squares, or minimisation problem; see their relevant entry points.
result: as returned by the final call to terminate.
Returns:
A 3-tuple of:
final_y: the final y to return as the solution of the solve.
final_aux: the final aux to return as the auxiliary output of the solve.
stats: any additional information to place in the sol.stats dictionary.
!!! info
- __init__(norm=<function max_norm>, config=<factory>, eq_constraint_fn=None, ineq_constraint_fn=None, n_eq_constraints=0, n_ineq_constraints=0, bounds=None, _lower_bound_mask=None, _upper_bound_mask=None, _n_lower_bounds=0, _n_upper_bounds=0, _lower_indices=None, _upper_indices=None, obj_grad_fn=None, eq_jac_fn=None, ineq_jac_fn=None, obj_hvp_fn=None, eq_hvp_fn=None, ineq_hvp_fn=None, _grad_impl=None, _eq_jac_impl=None, _ineq_jac_impl=None, _eq_hvp_contrib_impl=None, _ineq_hvp_contrib_impl=None, _obj_hvp_impl=None, inner_solver=None, _stagnation_window=10, _proximal_mu_min=1e-06, _proximal_mu_max=0.1, verbose=False)¶
- Parameters:
norm (Callable)
config (SLSQPConfig)
eq_constraint_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm']] | None)
ineq_constraint_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm']] | None)
n_eq_constraints (int)
n_ineq_constraints (int)
bounds (Float[Array, 'n 2'] | None)
_n_lower_bounds (int)
_n_upper_bounds (int)
obj_grad_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'n']] | None)
eq_jac_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm n']] | None)
ineq_jac_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm n']] | None)
obj_hvp_fn (Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'n']] | None)
eq_hvp_fn (Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'm n']] | None)
ineq_hvp_fn (Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'm n']] | None)
_grad_impl (Callable)
_eq_jac_impl (Callable)
_ineq_jac_impl (Callable)
_eq_hvp_contrib_impl (Callable)
_ineq_hvp_contrib_impl (Callable)
_obj_hvp_impl (Callable | None)
inner_solver (AbstractInnerSolver | None)
_stagnation_window (int)
_proximal_mu_min (float)
_proximal_mu_max (float)
verbose (Callable)
- Return type:
None
- class slsqp_jax.SLSQPState[source]¶
Bases:
ModuleState for the SLSQP solver.
This is a JAX PyTree (via eqx.Module) that holds all mutable state needed across SLSQP iterations.
- Attributes:
step_count: Current iteration number. f_val: Current objective function value f(x_k). grad: Gradient of objective at current point. eq_val: Equality constraint values c_eq(x_k). ineq_val: Inequality constraint values c_ineq(x_k). eq_jac: Jacobian of equality constraints at x_k. ineq_jac: Jacobian of inequality constraints at x_k. lbfgs_history: L-BFGS history for matrix-free Hessian approximation. multipliers_eq_qp: QP-recovered equality multipliers
λ_QP = (A_k A_kᵀ)⁻¹ A_k (B d + g_k)from the active-set QP solver. Consumed by Han-Powell’s penalty rule (update_penalty_parameter), the LPEC-A predictor’srho_barproximity measure, and the active-set warm-start of the next QP. CarriesO(s_f / s_eq · cond(B))noise when the L-BFGS Hessian is poorly conditioned or when auto-scaling inflates the multiplier scale; do not use for∇f − Jᵀλstationarity checks.- multipliers_ineq_qp: QP-recovered inequality multipliers
(general-inequality block + bound block). The bound block comes from the post-line-search recovery in
slsqp_jax.slsqp.bounds.recover_bound_multipliers()so it agrees withmultipliers_ineq_lsfor the bound rows; only the general-inequality block carries the QP noise. Consumed by Han-Powell + LPEC-A + warm-start.- multipliers_eq_ls: Hessian-free least-squares multipliers
λ_LS = (J(x_{k+1}) J(x_{k+1})ᵀ)⁻¹ J(x_{k+1}) ∇f(x_{k+1})recovered post-line-search at the accepted iterate (seeslsqp_jax.slsqp.multipliers). Independent ofB,dandα. Consumed by the L-BFGS secant pair (so the same vector is used at both endpoints) and by_terminate_impl’s filterSQP-style stationarity denominator (the multipliers feed both the Lagrangian gradient numerator and themu_maxdenominator fromslsqp_jax.slsqp.termination.compute_mu_max()). These are the multipliers exposed viaSolution.stats["multipliers_eq"]because they are the stationarity-quality estimate. Atinit()time both_qpand_lsare seeded from the samelstsqsolution atx_0.- multipliers_ineq_ls: Hessian-free LS inequality multipliers
(general-inequality block from
slsqp_jax.slsqp.multiplierswith active-rowmax(0, ·)clamp + bound block fromslsqp_jax.slsqp.bounds.recover_bound_multipliers()). Atinit()initialised to zeros — bound blocks only get populated oncestep()runs the post-step recovery.- kkt_residual_grad: Lagrangian-gradient snapshot consumed by the
next QP subproblem as the KKT-residual proxy (
kkt_residual = ||state.kkt_residual_grad||insideSLSQP._solve_qp_subproblem()). Currently always written equal tograd_lagrangianat the end of each step; the field is kept separate fromgrad_lagrangianpurely so that future variants (e.g. carrying a previous iterate’s Lagrangian gradient instead of the current one) can be introduced without re-shaping the state pytree.- grad_lagrangian: Current gradient of the Lagrangian evaluated at
the accepted iterate using the LS multipliers (matching the L-BFGS secant pair). Reused by
terminateso the stationarity check does not fall out of sync with the L-BFGS secant pair.
merit_penalty: Current penalty parameter for L1 merit function. bound_jac: Constant Jacobian for bound constraints (computed once in init). qp_iterations: Total accumulated QP active-set iterations across all steps. qp_converged: Whether the most recent QP solve converged. prev_active_set: Active inequality constraint set from the previous QP solve,
used for warm-starting the next QP subproblem.
- termination_code: Granular
slsqp_jax.RESULTSclassification (
successful/merit_stagnation/line_search_failure/iterate_blowup/infeasible/qp_subproblem_failure/nonlinear_max_steps_reached/nonfinite). Surfaced viaSolution.stats["slsqp_result"];Solution.resultitself remains the coarseoptx.RESULTScode accepted by optimistix’s driver.
- step_count: Int[Array, '']¶
- f_val: Float[Array, '']¶
- grad: Float[Array, 'n']¶
- eq_val: Float[Array, 'm_eq']¶
- ineq_val: Float[Array, 'm_ineq']¶
- eq_jac: Float[Array, 'm_eq n']¶
- ineq_jac: Float[Array, 'm_ineq n']¶
- lbfgs_history: LBFGSHistory¶
- multipliers_eq_qp: Float[Array, 'm_eq']¶
- multipliers_ineq_qp: Float[Array, 'm_ineq']¶
- multipliers_eq_ls: Float[Array, 'm_eq']¶
- multipliers_ineq_ls: Float[Array, 'm_ineq']¶
- kkt_residual_grad: Float[Array, 'n']¶
- grad_lagrangian: Float[Array, 'n']¶
- merit_penalty: Float[Array, '']¶
- bound_jac: Float[Array, 'm_bounds n']¶
- qp_iterations: Int[Array, '']¶
- qp_converged: Bool[Array, '']¶
- prev_active_set: Bool[Array, 'm_ineq']¶
- consecutive_qp_failures: Int[Array, '']¶
- consecutive_ls_failures: Int[Array, '']¶
- consecutive_zero_steps: Int[Array, '']¶
- qp_optimal: Bool[Array, '']¶
- best_merit: Float[Array, '']¶
- steps_without_improvement: Int[Array, '']¶
- stagnation: Bool[Array, '']¶
- last_alpha: Float[Array, '']¶
- last_projected_grad_norm: Float[Array, '']¶
- ls_success: Bool[Array, '']¶
- ls_fatal: Bool[Array, '']¶
- qp_fatal: Bool[Array, '']¶
- best_x: Float[Array, 'n']¶
- blowup_count: Int[Array, '']¶
- diverging: Bool[Array, '']¶
- diagnostics: SLSQPDiagnostics¶
- __init__(step_count, f_val, grad, eq_val, ineq_val, eq_jac, ineq_jac, lbfgs_history, multipliers_eq_qp, multipliers_ineq_qp, multipliers_eq_ls, multipliers_ineq_ls, kkt_residual_grad, grad_lagrangian, merit_penalty, bound_jac, qp_iterations, qp_converged, prev_active_set, consecutive_qp_failures, consecutive_ls_failures, consecutive_zero_steps, qp_optimal, best_merit, steps_without_improvement, stagnation, last_alpha, last_projected_grad_norm, ls_success, ls_fatal, qp_fatal, best_x, blowup_count, diverging, termination_code, diagnostics)¶
- Parameters:
step_count (Int[Array, ''])
f_val (Float[Array, ''])
grad (Float[Array, 'n'])
eq_val (Float[Array, 'm_eq'])
ineq_val (Float[Array, 'm_ineq'])
eq_jac (Float[Array, 'm_eq n'])
ineq_jac (Float[Array, 'm_ineq n'])
lbfgs_history (LBFGSHistory)
multipliers_eq_qp (Float[Array, 'm_eq'])
multipliers_ineq_qp (Float[Array, 'm_ineq'])
multipliers_eq_ls (Float[Array, 'm_eq'])
multipliers_ineq_ls (Float[Array, 'm_ineq'])
kkt_residual_grad (Float[Array, 'n'])
grad_lagrangian (Float[Array, 'n'])
merit_penalty (Float[Array, ''])
bound_jac (Float[Array, 'm_bounds n'])
qp_iterations (Int[Array, ''])
qp_converged (Bool[Array, ''])
prev_active_set (Bool[Array, 'm_ineq'])
consecutive_qp_failures (Int[Array, ''])
consecutive_ls_failures (Int[Array, ''])
consecutive_zero_steps (Int[Array, ''])
qp_optimal (Bool[Array, ''])
best_merit (Float[Array, ''])
steps_without_improvement (Int[Array, ''])
stagnation (Bool[Array, ''])
last_alpha (Float[Array, ''])
last_projected_grad_norm (Float[Array, ''])
ls_success (Bool[Array, ''])
ls_fatal (Bool[Array, ''])
qp_fatal (Bool[Array, ''])
best_x (Float[Array, 'n'])
blowup_count (Int[Array, ''])
diverging (Bool[Array, ''])
termination_code (Any)
diagnostics (SLSQPDiagnostics)
- Return type:
None
- class slsqp_jax.SLSQPDiagnostics[source]¶
Bases:
ModuleDiagnostic counters and statistics accumulated during an SLSQP run.
These fields are populated inside
step()without Python-side branching, and can be inspected by the user after a solve to decide whether theoptimistixresult code is meaningful (e.g. to detect aRESULTS.successfulthat actually came from chronic line-search failure rather than real convergence).- Attributes:
- n_qp_inner_failures: Number of iterations where the QP solver
reported
converged=False.- n_ls_failures: Number of iterations where the line search
reported
success=False.- n_lbfgs_skips: Number of iterations where the L-BFGS append
skipped the new curvature pair.
- n_nan_directions: Number of iterations where the QP returned
a non-finite search direction.
max_gamma: Maximum L-BFGS
gammaobserved across iterations. min_diag: Minimum L-BFGS per-variable diagonal entry observed. max_diag: Maximum L-BFGS per-variable diagonal entry observed. eq_jac_min_sv_est: A lower bound on the smallest singular valueof
J_eqestimated from the Cholesky factor ofJ_eq J_eq^T. Small values indicate near rank-deficiency.ls_alpha_min: Smallest line-search step accepted across the run. tail_ls_failures: Consecutive line-search failure count at
termination. Non-zero values suggest stagnation.
- n_bound_fix_solves: Total number of non-trivial bound-fixing inner
solves that actually ran (no-op passes are skipped via
jax.lax.condand do not count). Growing unboundedly is a sign that the bound active set never stabilises.- max_bound_fixed: Largest number of variables pinned to their box
bounds across all iterations (from the final per-iteration
free_mask).- max_active_ineq: Largest number of active general inequalities
observed across iterations.
- n_merit_regressions: Number of iterations where the L1 merit
function value increased despite the line search reporting success. A non-zero count points to merit-function mis-calibration (too-small
rho) or line-search slippage.- n_qp_budget_exhausted: Number of SQP steps where the QP
active-set loop hit
qp_max_iter(i.e. exited because the budget ran out, not because of clean convergence or a ping-pong short-circuit). A non-zero count usually indicates degeneracy, multiplier-noise cycling, or an LPEC-A over-prediction that was not caught by the trust gate; consider raisingmult_drop_flooror tighteninglpeca_trust_threshold.- n_qp_ping_pong: Number of SQP steps where the QP loop’s
anti-cycling ping-pong detector fired. These are good short-circuits (the loop avoided wasting its full budget) but a chronic non-zero value still points to a degenerate constraint pair worth investigating.
- max_qp_iterations: Peak active-set iteration count observed
across all SQP steps. Useful for sizing
qp_max_iter: if this is consistently equal toqp_max_iterthe budget is too tight.- max_qp_active_size: Peak
|active_set|(general inequalities only) observed across all SQP steps.
- n_lpeca_bypassed: Number of SQP steps where the LPEC-A
prediction was skipped, either because of the warm-up window (
step_count < lpeca_warmup_steps) or because the trust gate vetoed it (rho_bar > lpeca_trust_threshold). Always 0 whenactive_set_method == "expand".- n_lpeca_capped: Number of SQP steps where the LPEC-A
rank-aware size cap truncated the prediction.
- n_lpeca_bounds_prefixed: Cumulative count of box-bound
pre-fixes contributed by LPEC-A across all SQP steps (summed over the bound predictions that survived the trust / warm-up gates and were installed as the initial
free_maskfor the bound-fixing loop). Always 0 whenactive_set_method == "expand"or whenlpeca_predict_bounds=False.- n_proj_refinements: Cumulative number of M-metric projection
refinement rounds taken across all inner solves performed by
MinresQLPSolverover the run. Always 0 for null-space CG / CRAIG.- max_proj_residual: High-water mark of the post-refinement
constraint residual
||A d - b||reported byMinresQLPSolveron the accepted QP direction across all SQP steps.- n_divergence_blowups: Total number of merit blowup events
observed across the run, whether or not the patience threshold was eventually reached.
- divergence_triggered: True when the best-iterate divergence
rollback fired at least once during the run.
- min_projected_grad_norm: Low-water mark of the inner solver’s
projected-gradient norm
||W̃_k g_k||across the run. Alwaysinffor inner solvers other thanHRInexactSTCG. Surfaced for post-hoc inspection of how close the run actually got to KKT in the inexact-projection sense, regardless of whetheruse_inexact_stationaritywas on.- n_steps_inexact_below_classical: Number of iterations where the
inner solver’s projected-gradient norm was strictly smaller than the classical Lagrangian gradient norm. A high count indicates that multiplier-recovery noise is the limiting factor and the user might benefit from
use_inexact_stationarity=Truepaired withHRInexactSTCG.
- n_qp_inner_failures: Int[Array, '']¶
- n_ls_failures: Int[Array, '']¶
- n_lbfgs_skips: Int[Array, '']¶
- n_nan_directions: Int[Array, '']¶
- max_gamma: Float[Array, '']¶
- min_diag: Float[Array, '']¶
- max_diag: Float[Array, '']¶
- eq_jac_min_sv_est: Float[Array, '']¶
- ls_alpha_min: Float[Array, '']¶
- tail_ls_failures: Int[Array, '']¶
- n_bound_fix_solves: Int[Array, '']¶
- max_bound_fixed: Int[Array, '']¶
- max_active_ineq: Int[Array, '']¶
- n_merit_regressions: Int[Array, '']¶
- n_qp_budget_exhausted: Int[Array, '']¶
- n_qp_ping_pong: Int[Array, '']¶
- max_qp_iterations: Int[Array, '']¶
- max_qp_active_size: Int[Array, '']¶
- n_lpeca_bypassed: Int[Array, '']¶
- n_lpeca_capped: Int[Array, '']¶
- n_lpeca_bounds_prefixed: Int[Array, '']¶
- n_proj_refinements: Int[Array, '']¶
- max_proj_residual: Float[Array, '']¶
- n_divergence_blowups: Int[Array, '']¶
- divergence_triggered: Bool[Array, '']¶
- min_projected_grad_norm: Float[Array, '']¶
- n_steps_inexact_below_classical: Int[Array, '']¶
- __init__(n_qp_inner_failures, n_ls_failures, n_lbfgs_skips, n_nan_directions, max_gamma, min_diag, max_diag, eq_jac_min_sv_est, ls_alpha_min, tail_ls_failures, n_bound_fix_solves, max_bound_fixed, max_active_ineq, n_merit_regressions, n_qp_budget_exhausted, n_qp_ping_pong, max_qp_iterations, max_qp_active_size, n_lpeca_bypassed, n_lpeca_capped, n_lpeca_bounds_prefixed, n_proj_refinements, max_proj_residual, n_divergence_blowups, divergence_triggered, min_projected_grad_norm, n_steps_inexact_below_classical)¶
- Parameters:
n_qp_inner_failures (Int[Array, ''])
n_ls_failures (Int[Array, ''])
n_lbfgs_skips (Int[Array, ''])
n_nan_directions (Int[Array, ''])
max_gamma (Float[Array, ''])
min_diag (Float[Array, ''])
max_diag (Float[Array, ''])
eq_jac_min_sv_est (Float[Array, ''])
ls_alpha_min (Float[Array, ''])
tail_ls_failures (Int[Array, ''])
n_bound_fix_solves (Int[Array, ''])
max_bound_fixed (Int[Array, ''])
max_active_ineq (Int[Array, ''])
n_merit_regressions (Int[Array, ''])
n_qp_budget_exhausted (Int[Array, ''])
n_qp_ping_pong (Int[Array, ''])
max_qp_iterations (Int[Array, ''])
max_qp_active_size (Int[Array, ''])
n_lpeca_bypassed (Int[Array, ''])
n_lpeca_capped (Int[Array, ''])
n_lpeca_bounds_prefixed (Int[Array, ''])
n_proj_refinements (Int[Array, ''])
max_proj_residual (Float[Array, ''])
n_divergence_blowups (Int[Array, ''])
divergence_triggered (Bool[Array, ''])
min_projected_grad_norm (Float[Array, ''])
n_steps_inexact_below_classical (Int[Array, ''])
- Return type:
None
- slsqp_jax.get_diagnostics(state)[source]¶
Return the
SLSQPDiagnosticsaccumulator from a final state.Use this after
optimistix.minimise(or a manualsolve / steploop) to inspect solver health indicators that the Optimistix result code alone does not expose. SeeSLSQPDiagnosticsfor field meanings.- Return type:
- Parameters:
state (SLSQPState)
- class slsqp_jax.QPResult[source]¶
Bases:
ModuleOuter-facing result of solving the SLSQP QP subproblem.
Returned by
SLSQP._solve_qp_subproblem()after the bound-fixing loop has post-processed the direction returned bysolve_qp().- Attributes:
direction: The search direction d from the QP solution. multipliers_eq: Lagrange multipliers for equality constraints. multipliers_ineq: Lagrange multipliers for inequality constraints. active_set: Boolean mask of active inequality constraints at the solution. converged: Whether the QP solver converged successfully. iterations: Number of active-set iterations taken. bound_fix_solves: Number of non-trivial bound-fixing passes that
actually ran the reduced-space inner solve (passes where the free mask did not change are short-circuited). Useful for debugging bound-handling overhead; 0 when the problem has no bound constraints.
- n_bound_fixed: Number of variables pinned to a box bound in the
final QP direction (counts both lower- and upper-bound activations).
- ping_ponged: True when the QP active-set loop short-circuited
on a detected add/drop ping-pong cycle.
- reached_max_iter: True when the QP active-set loop exhausted
its iteration budget (
qp_max_iter).- lpeca_bypassed: True when the LPEC-A prediction was skipped
for this step (warm-up window or trust gate fired).
- lpeca_capped: True when the LPEC-A prediction was truncated
by the rank-aware size cap.
- n_lpeca_bounds_prefixed: Number of box-bound variables
pre-fixed by LPEC-A before entering the bound-fixing loop. Always 0 when
active_set_method == "expand", whenlpeca_predict_bounds=False, or when LPEC-A was bypassed by the warm-up / trust gates.- proj_residual: Post-refinement constraint residual ``||A d -
b||`` from the inner solver producing the final QP direction (after bound-fixing). Always 0 for null-space CG / CRAIG; relevant for
MinresQLPSolverwhere it reflects the M-metric range-space projection floor and feeds the outer divergence detector viaSLSQPDiagnostics.max_proj_residual.- n_proj_refinements: Cumulative number of M-metric projection
refinement rounds across all inner solves performed for this QP step (active-set loop + bound-fixing loop). Always 0 for null-space solvers.
- projected_grad_norm: Latest inner-solver projected-gradient
norm (
HRInexactSTCGonly;infotherwise).
- direction: Float[Array, 'n']¶
- multipliers_eq: Float[Array, 'm_eq']¶
- multipliers_ineq: Float[Array, 'm_ineq']¶
- active_set: Bool[Array, 'm_ineq']¶
- converged: Bool[Array, '']¶
- iterations: Int[Array, '']¶
- bound_fix_solves: Int[Array, '']¶
- n_bound_fixed: Int[Array, '']¶
- ping_ponged: Bool[Array, '']¶
- reached_max_iter: Bool[Array, '']¶
- lpeca_bypassed: Bool[Array, '']¶
- lpeca_capped: Bool[Array, '']¶
- n_lpeca_bounds_prefixed: Int[Array, '']¶
- proj_residual: Float[Array, '']¶
- n_proj_refinements: Int[Array, '']¶
- projected_grad_norm: Float[Array, '']¶
- __init__(direction, multipliers_eq, multipliers_ineq, active_set, converged, iterations, bound_fix_solves, n_bound_fixed, ping_ponged, reached_max_iter, lpeca_bypassed, lpeca_capped, n_lpeca_bounds_prefixed, proj_residual, n_proj_refinements, projected_grad_norm)¶
- Parameters:
direction (Float[Array, 'n'])
multipliers_eq (Float[Array, 'm_eq'])
multipliers_ineq (Float[Array, 'm_ineq'])
active_set (Bool[Array, 'm_ineq'])
converged (Bool[Array, ''])
iterations (Int[Array, ''])
bound_fix_solves (Int[Array, ''])
n_bound_fixed (Int[Array, ''])
ping_ponged (Bool[Array, ''])
reached_max_iter (Bool[Array, ''])
lpeca_bypassed (Bool[Array, ''])
lpeca_capped (Bool[Array, ''])
n_lpeca_bounds_prefixed (Int[Array, ''])
proj_residual (Float[Array, ''])
n_proj_refinements (Int[Array, ''])
projected_grad_norm (Float[Array, ''])
- Return type:
None
- class slsqp_jax.SLSQPConfig[source]¶
Bases:
ModuleAggregate configuration for
SLSQP.Replaces the legacy 40+ flat keyword arguments with a small set of grouped sub-configs. Pass directly to
SLSQPvia theconfig=keyword. All sub-configs default to their dataclass defaults soSLSQPConfig()reproduces the legacy default settings.- tolerance: ToleranceConfig¶
- lbfgs: LBFGSConfig¶
- line_search: LineSearchConfig¶
- proximal: ProximalConfig¶
- preconditioner: PreconditionerConfig¶
- lpeca: LPECAConfig¶
- adaptive_cg: AdaptiveCGConfig¶
- __init__(tolerance=<factory>, lbfgs=<factory>, line_search=<factory>, qp=<factory>, proximal=<factory>, preconditioner=<factory>, lpeca=<factory>, adaptive_cg=<factory>)¶
- Parameters:
tolerance (ToleranceConfig)
lbfgs (LBFGSConfig)
line_search (LineSearchConfig)
qp (QPConfig)
proximal (ProximalConfig)
preconditioner (PreconditionerConfig)
lpeca (LPECAConfig)
adaptive_cg (AdaptiveCGConfig)
- Return type:
None
- class slsqp_jax.ToleranceConfig[source]¶
Bases:
ModuleOuter-loop tolerances and iteration / divergence budgets.
- Attributes:
- rtol: Relative tolerance for the stationarity convergence
check. The test is the filterSQP normalised KKT residual (Fletcher & Leyffer, User manual for filterSQP, eqs. 5 and 6):
||grad_L|| <= rtol * max(mu_max, 1)wheremu_max = max_i {||grad_f||_2, |nu_i|, ||a_i||_2 |lambda_i|}is the largest single contributor to the Lagrangian gradient residual (objective-gradient norm, every bound multiplier, and every general-constraint||row||_2 * |multiplier|). The same test is applied to the inexact projected-gradient numerator||W_tilde g||whenAdaptiveCGConfig.use_inexact_stationarityis on. Replaces the legacy|L|-based denominator so the test is invariant to absolute objective magnitude and tracks multiplier blow-up under near-rank-deficient active sets. Default1e-6.- atol: Absolute tolerance for primal feasibility and a number of
internal heuristic floors (steepest-descent fallback, adaptive CG tolerance floor, default proximal mu floor). Default
1e-6.
max_steps: Maximum number of outer SQP iterations. Default 100. min_steps: Minimum iterations before convergence is allowed.
Prevents premature termination at trivial starting points. Default 1.
- stagnation_tol: Relative-improvement threshold for the
merit-based stagnation counter. Default
1e-12.- divergence_factor: Best-iterate divergence rollback fires when
the merit grows by more than
divergence_factor * max(|best_merit|, 1)fordivergence_patienceconsecutive steps. Default10.0.- divergence_patience: Number of consecutive blow-up steps required
before the divergence rollback latches. Default 3.
- class slsqp_jax.LBFGSConfig[source]¶
Bases:
ModuleL-BFGS Hessian-approximation parameters.
- Attributes:
- memory: Number of curvature pairs
(s, y)stored in the ring buffer. Default 10.
- damping_threshold: VARCHEN damping threshold applied to each
curvature pair before storage; set to
0.0to disable. Default0.2.- diag_floor: Lower clip for the per-variable secant diagonal
B_0 = diag(d). Default1e-4.- diag_ceil: Upper clip for the per-variable secant diagonal.
Default
1e6.
- memory: Number of curvature pairs
- class slsqp_jax.LineSearchConfig[source]¶
Bases:
ModuleBacktracking L1-merit line-search parameters.
- Attributes:
max_steps: Maximum number of backtracking iterations. Default 20. armijo_c1: Armijo condition coefficient
c_1. Default1e-4. failure_patience: After this many consecutive line-searchfailures, the L-BFGS history is hard-reset to identity. Default 3.
- class slsqp_jax.QPConfig[source]¶
Bases:
ModuleQP-subproblem parameters.
- Attributes:
- max_iter: Maximum number of active-set iterations per QP solve.
Default 100.
- max_cg_iter: Maximum number of CG iterations per inner solve.
Default 50.
- failure_patience: After this many consecutive QP failures, the
L-BFGS history is hard-reset to identity. Default 3.
- zero_step_patience: After this many consecutive iterations where
the QP returns
||d|| < atoland primal feasibility holds, convergence is declared via the guardedqp_kkt_successdisjunct. Default 3.- ping_pong_threshold: Threshold for the QP add/drop ping-pong
short-circuit. Default
2**31 - 1(effectively disabled); opt in by setting to3-8on degenerate problems.- mult_drop_floor: Floor on the negative-multiplier drop test
inside the QP active-set loop. Default
1e-6.- cg_regularization: Minimum eigenvalue threshold
delta**2for the CG curvature guard. Default
1e-6.- use_exact_hvp: When True, the QP inner CG uses the exact
Lagrangian HVP (via AD) instead of the L-BFGS approximation. Default False.
- __init__(max_iter=100, max_cg_iter=50, failure_patience=3, zero_step_patience=3, ping_pong_threshold=2147483647, mult_drop_floor=1e-06, cg_regularization=1e-06, use_exact_hvp=False)¶
- class slsqp_jax.ProximalConfig[source]¶
Bases:
ModuleAdaptive proximal multiplier stabilization (sSQP, Wright 2002).
- Attributes:
- tau: Exponent in
mu = clip(kkt_residual^tau, mu_min, mu_max). Must lie in the half-open interval
[0, 1). Set to0.0to disable sSQP entirely (equality constraints are then enforced via direct null-space projection). Default0.5.- mu_min: Floor on the adaptive proximal mu.
Noneresolves to ToleranceConfig.atolat runtime. DefaultNone.
mu_max: Ceiling on the adaptive proximal mu. Default
0.1.- tau: Exponent in
- class slsqp_jax.PreconditionerConfig[source]¶
Bases:
ModuleQP-inner-solver preconditioner configuration.
- Attributes:
enabled: Whether to use a preconditioner at all. Default True. type: Either
"lbfgs"(default) or"diagonal". Thediagonal estimator requires an exact HVP (set
QPConfig.use_exact_hvpor provideobj_hvp_fn).- diagonal_n_probes: Number of Rademacher probes for the
stochastic diagonal estimator. Default 20.
- class slsqp_jax.LPECAConfig[source]¶
Bases:
ModuleLPEC-A active-set identification (Oberlin & Wright, 2005).
- Attributes:
- method: One of
"expand"(default),"lpeca_init"or "lpeca"."expand"disables LPEC-A entirely.- sigma: Threshold exponent (
sigma_barin the paper). Must lie in the open interval
(0, 1). Default0.9.- beta: Threshold scaling factor.
Noneresolves to 1 / (m_ineq + n + m_eq)at runtime. DefaultNone.- use_lp: When True, solve the LPEC-A LP (via
mpax.r2HPDHG) for tighter multiplier estimates. Requires
mpax. Default False.- trust_threshold: Trust gate on
rho_bar. Whenrho_bar exceeds this value the prediction is replaced with an empty set. Default
1.0.- warmup_steps: The first
warmup_stepsouter SQP iterations bypass LPEC-A. Default 3.
- predict_bounds: When True (default), extend the LPEC-A prediction
to box constraints (warm-start the bound-fixing loop).
- method: One of
- class slsqp_jax.AdaptiveCGConfig[source]¶
Bases:
ModuleAdaptive CG / inexact stationarity configuration.
- Attributes:
- enabled: When True, the CG convergence tolerance is adapted from
the outer KKT residual (Eisenstat-Walker style). Default False to preserve baseline behaviour.
- use_inexact_stationarity: When True, the projected-gradient norm
from a noise-aware inner solver (e.g.
HRInexactSTCG) is added as a logical-OR disjunct to the classical stationarity test. Default False.
- class slsqp_jax.RESULTS[source]¶
Bases:
RESULTSSLSQP-specific termination codes.
Subclasses
optimistix.RESULTSand adds finer classifications for the failure modes that the upstream solver lumps intooptimistix.RESULTS.nonlinear_divergence.Note
equinox.Enumerationreports its concrete subclasses as@finalvia the metaclass, but optimistix itself does the same trick (optx.RESULTSsubclasseslx.RESULTS) and the docstring example forEnumeration.promoteshows the pattern is supported.ty: ignore[subclass-of-final-class]suppresses the false positive.
- slsqp_jax.is_successful(result)[source]¶
Return
Trueifresultrepresents successful convergence.Robust to both
optimistix.RESULTSandslsqp_jax.RESULTSinputs by promoting the value before comparison. Useful for downstream code that may receive results from either enumeration:from slsqp_jax import is_successful if is_successful(sol.result): ...
- Args:
- result: A
RESULTSmember from eitheroptx.RESULTSor its
RESULTSsubclass.
- result: A
- Returns:
Trueiff the result equalsRESULTS.successful.
- class slsqp_jax.ParsedConstraints[source]¶
Bases:
objectResult of converting SciPy-style constraints for use with SLSQP.
All fields map directly to the corresponding
SLSQPconstructor arguments.- eq_hvp_fn: Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'm n']] | None = None¶
- ineq_hvp_fn: Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'm n']] | None = None¶
- __init__(eq_constraint_fn=None, ineq_constraint_fn=None, n_eq_constraints=0, n_ineq_constraints=0, eq_jac_fn=None, ineq_jac_fn=None, eq_hvp_fn=None, ineq_hvp_fn=None)¶
- Parameters:
eq_constraint_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm']] | None)
ineq_constraint_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm']] | None)
n_eq_constraints (int)
n_ineq_constraints (int)
eq_jac_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm n']] | None)
ineq_jac_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm n']] | None)
eq_hvp_fn (Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'm n']] | None)
ineq_hvp_fn (Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'm n']] | None)
- Return type:
None
- slsqp_jax.parse_constraints(constraints, x0)[source]¶
Convert SciPy-style constraints into SLSQP-JAX constraint functions.
- Parameters:
constraints (
dict|list|LinearConstraint|NonlinearConstraint|tuple) – Any form accepted byscipy.optimize.minimize: a dict, list of dicts,LinearConstraint,NonlinearConstraint, or a list/tuple mixing those types. An empty tuple/list means “no constraints”.x0 (
Array) – Initial guess – used to evaluate dict constraint functions once to determine their output size.
- Returns:
Dataclass whose fields map to
SLSQPconstructor arguments.- Return type:
- slsqp_jax.minimize_like_scipy(fun, x0, args=(), *, jac=None, hessp=None, bounds=None, constraints=(), options=None, has_aux=False, throw=False, verbose=False, auto_scale=True, auto_scale_target_gradient=None, auto_scale_max_factor=None)[source]¶
Minimise a function using SLSQP with a SciPy-like interface.
This is a convenience wrapper that accepts SciPy-style arguments, converts them for the SLSQP solver, and delegates to
optimistix.minimise.- Parameters:
fun (
Callable) – Objective function. Signature(x, *args) -> scalaror, when has_aux isTrue,(x, *args) -> (scalar, aux).x0 (
Any) – Initial guess (array-like).args (
tuple, default:()) – Extra positional arguments forwarded to fun (unpacked).jac (
Callable|bool|None, default:None) – Gradient of fun. A callable(x, *args) -> arrayorTrueto indicate that fun returns(f, g)(or((f, g), aux)when has_aux is set).hessp (
Callable|None, default:None) – Hessian-vector product(x, p, *args) -> array.bounds (
Bounds|list|tuple|None, default:None) – Variable bounds –None,Bounds, or sequence of(min, max)pairs.constraints (
dict|list|LinearConstraint|NonlinearConstraint|tuple, default:()) – SciPy-style constraints (dict / list-of-dicts /LinearConstraint/NonlinearConstraint). ANonlinearConstraintmay carry a user-attachedhesspattribute (non-standard; not part of SciPy’s API) that, if callable, is used as the per-component constraint HVP with precedence overhess. See the module-level docstring for the full contract.options (
dict[str,Any] |None, default:None) –Solver options dict. The following keys are popped with the listed defaults (which match the
SLSQPconstructor defaults):rtol(1e-6) – relative tolerance for stationarity.atol(1e-6) – absolute tolerance for feasibility.max_stepsormaxiter(100) – maximum outer iterations.min_steps(1) – minimum iterations before convergence is allowed.lbfgs_memory(10) – number of L-BFGS pairs.line_search_max_steps(20) – backtracking steps.armijo_c1(1e-4) – Armijo sufficient decrease.qp_max_iter(100) – active-set iteration budget.qp_max_cg_iter(50) – CG iterations per QP step.
Any remaining keys are forwarded as
**kwargsto theSLSQPconstructor, so anySLSQPattribute can be set here (e.g.proximal_tau,proximal_mu_min,proximal_mu_max,use_preconditioner,adaptive_cg_tol,cg_regularization,stagnation_tol).has_aux (
bool, default:False) – IfTrue, fun returns(value, aux).throw (
bool, default:False) – Whether to raise on solver failure.verbose (
bool|Callable[...,None], default:False) – Passed to theSLSQPconstructor.False(default) for silent,Trueto print all diagnostics, or a custom callable. Whenauto_scaleis on the built-in printer is wrapped to show user-unit values forf/|c|/|grad_f|/|grad_L|/|d|; merit / rho / gamma / L-BFGS internals keep a(s)suffix on their label to flag scaled units. Seeslsqp_jax.wrap_verbose_for_scaling().auto_scale (
bool|str, default:True) –Automatic problem scaling at the initial point (gradient-based, IPOPT/KNITRO-style). On by default as of this release.
True(default) ->"uniform"(target_gradient=1.0, max_factor=1e3, uniform=True). A single shared scalars_cis applied to every constraint row (equality + general inequality) and a separates_fto the objective, both symmetrically clipped to[1/max_factor, max_factor]. Preserves inter-row magnitude ratios (the right default for budget-style problems where one constraint is intentionally orders of magnitude larger than the others); fully fixes the documented||J_eq|| >> ||grad_f||divergence cascade. Note thatatol_internal = s_c * atol_user(nomin(., 1.0)cap, so the feasibility tolerance handed to the inner solver can exceedatol_userwhens_c > 1)."balanced"->target_gradient=1.0, max_factor=1e3, uniform=False. The legacy per-row default. Each constraint row gets its own factor driving||grad c_i||_inf -> 1. Flattens inter-row magnitudes; opt-in when one row’s gradient is vastly out of band and that’s not a meaningful spread.False-> no wrapping (pre-feature behaviour)."knitro"->target=1.0, max_factor=1.0(strict shrink-only per-row; opt-in for users who want zero amplification)."ipopt"->target=100.0, max_factor=1.0(very conservative per-row; may not fix all cascades)."aggressive"->target=1.0, max_factor=1e6(per-row, pushes amplification to the noise-floor limit).
When scaling is applied,
sol.statscarries ascale_factorsentry plus_user-suffixed copies of the multiplier vectors and the Lagrangian gradient norm.atolis auto-compensated so the user-perceived feasibility tolerance is preserved (uniform mode does this viaatol_internal = s_c * atol_user; per-row modes viaatol_internal = atol_user * min(min(s_eq), min(s_ineq), 1.0)).auto_scale_target_gradient (
float|None, default:None) – Optional explicit override of the mode’starget_gradient. Underuniformmode this value is consumed by both thes_fderivation (against||grad_f||_inf) and thes_cderivation (against the cross-row maxmax_i ||grad c_i||_inf); under per-row modes it drives every row’s individual factor.auto_scale_max_factor (
float|None, default:None) – Optional explicit override of the mode’smax_factor. Underuniformmode the bound is symmetric so the scale factor lives in[1/max_factor, max_factor]and the value must satisfymax_factor >= 1.0(smaller raisesValueError;max_factor == 1.0is legal but emits aUserWarningbecause it disables scaling). Under per-row modes the bound is one-sided (s in [eps, max_factor]);max_factor == 1.0means shrink-only.
- Return type:
- class slsqp_jax.ScaleFactors[source]¶
Bases:
objectThe factors
compute_scale_factors_at_x0()(per-row) orcompute_uniform_scale_factors_at_x0()(uniform) returned.- Attributes:
- s_f: Scalar objective-scaling factor (
f_scaled = s_f * f). 1.0when no scaling was applied or the objective gradient was belowgrad_floor.- s_eq: Equality-constraint factors, shape
(m_eq,). Empty array when
m_eq == 0. Per-row varying underuniform=False; constant-valued (every entry equals the shareds_c) underuniform=True.- s_ineq: Inequality-constraint factors, shape
(m_ineq,). Empty array when
m_ineq == 0. Note that this is the general inequality count – bound constraints are scaled separately (and trivially) by the bound-handling machinery. Per-row underuniform=False; constant and equal to the same shareds_cass_equnderuniform=True.
atol_user: The user-supplied feasibility tolerance. atol_internal: The compensated tolerance handed to the inner
solver. Under
uniform=Falsethis isatol_user * min(min(s_eq), min(s_ineq), 1.0)(worst-row conservative). Underuniform=Truethis iss_c * atol_userexactly (can exceedatol_userwhens_c > 1).- target_gradient: Echo of the
ScalingConfigfield actually used.
- max_factor: Echo of the
ScalingConfigfield actually used.
- grad_floor: Echo of the
ScalingConfigfield actually used.
- n_skipped_eq: Number of equality rows whose
||grad_eq[i]||_infwas belowgrad_floor. Always0underuniform=True(uniform mode does not skip individual rows).- n_skipped_ineq: Number of inequality rows whose
||grad_ineq[i]||_infwas belowgrad_floor. Always0underuniform=True.- skipped_obj:
Truewhen||grad_f||_infwas below grad_floor(aUserWarningwas emitted).- uniform:
Truewhen the factors were produced by compute_uniform_scale_factors_at_x0()(single shareds_cacross constraints, symmetricmax_factorclipping, exact-equivalenceatol_internal).Falsefor the per-rowcompute_scale_factors_at_x0().
- s_f: Scalar objective-scaling factor (
- __init__(s_f, s_eq, s_ineq, atol_user, atol_internal, target_gradient, max_factor, grad_floor, n_skipped_eq=0, n_skipped_ineq=0, skipped_obj=False, uniform=False)¶
- class slsqp_jax.ScaledProblem[source]¶
Bases:
objectBundle of scaled callables + the
ScaleFactorsthey came from.Attributes mirror the
SLSQPconstructor’s per-callable slots so the wrapper can be threaded throughauto_scaled_minimise()orslsqp_jax.minimize_like_scipy()with minimal plumbing.The objective
fnreturns(s_f * value, aux)to match thehas_aux=Trueconvention used insideoptx.minimise. Constraint callables, Jacobians, and HVPs return scaled values directly.- factors: ScaleFactors¶
- __init__(fn, eq_constraint_fn, ineq_constraint_fn, obj_grad_fn, eq_jac_fn, ineq_jac_fn, obj_hvp_fn, eq_hvp_fn, ineq_hvp_fn, factors)¶
- Parameters:
fn (Callable)
eq_constraint_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm']] | None)
ineq_constraint_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm']] | None)
obj_grad_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'n']] | None)
eq_jac_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm n']] | None)
ineq_jac_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm n']] | None)
obj_hvp_fn (Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'n']] | None)
eq_hvp_fn (Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'm n']] | None)
ineq_hvp_fn (Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'm n']] | None)
factors (ScaleFactors)
- Return type:
None
- class slsqp_jax.ScalingConfig[source]¶
Bases:
objectParameters for
compute_scale_factors_at_x0()(per-row) andcompute_uniform_scale_factors_at_x0()(uniform).- Attributes:
- target_gradient: Desired
||grad||_infafter scaling. s = target_gradient / ||grad||(clipped). Underuniform=Truethis same target is consumed by both the objective scalars_fand the single shared constraint scalars_c.- max_factor: Bound on the scale factor. Under
uniform=False (per-row modes) this is a one-sided amplification cap, so
s in [eps, max_factor];max_factor=1.0means “shrink-only” (KNITRO/IPOPT). Underuniform=Truethe bound is symmetric sos in [1/max_factor, max_factor]and the value must satisfymax_factor >= 1.0; passingmax_factor=1.0under uniform disables scaling entirely and emits a UserWarning. The project defaultmax_factor=1e3is well below typical AD relative-error noise floors (~1e-12) so amplification cannot promote roundoff to signal under any reasonable AD.- grad_floor: Rows whose
||grad||_inffalls below this value are skipped (left at
s = 1.0).1e-12is the right default: it is comfortably aboveepsso machine-zero is caught, but small enough that genuinely tiny-but-non-degenerate gradients still get scaled.- uniform: When
True, apply a single shared scalars_c across all constraint rows (equality and inequality unioned) preserving inter-row ratios; clip
s_cands_fsymmetrically bymax_factor; setatol_internal = s_c * atol_userexactly (nomin(., 1.0)cap, soatol_internalcan exceedatol_user). WhenFalse(the legacy default) each constraint row gets its own factor andatol_internal = atol_user * min(min(s_eq), min(s_ineq), 1.0).
- target_gradient: Desired
- slsqp_jax.auto_scaled_minimise(fn, solver, x0, args=None, *, auto_scale=True, auto_scale_target_gradient=None, auto_scale_max_factor=None, has_aux=False, options=None, max_steps=256, throw=True, tags=frozenset({}), adjoint=None)[source]¶
Convenience wrapper around
optx.minimisewith auto-scaling.Mirrors the
slsqp_jax.minimize_like_scipy()default-on auto-scaling path for users who constructSLSQPdirectly. Builds aScaledProblemfrom the user’s callables, replaces the solver’s eq/ineq/jac/hvp slots with the scaled wrappers, overrides the SLSQPConfig.atol withatol_internal, and unscales the returnedoptx.Solution.When
auto_scale=Falsethe call passes through tooptx.minimiseunchanged.- Return type:
- Parameters:
- Args:
- fn: Objective. Same convention as
optx.minimise– (x, args) -> valuewithhas_aux=Falseor(x, args) -> (value, aux)withhas_aux=True.- solver: An
SLSQPinstance. The constraint, Jacobian, HVP, and verbose slots are read off and replaced with their scaled counterparts.
x0: Initial iterate. args: Extra payload threaded through
fn/ constraints. auto_scale:True(default) ->"uniform"mode (asingle shared
s_cacross all constraint rows + independents_ffor the objective, both symmetrically clipped bymax_factor).False-> no scaling (passthrough). String -> explicit mode name; pass"balanced"to recover the legacy per-row default. Seeresolve_scaling_mode()for the full table.- auto_scale_target_gradient: Optional explicit target
gradient override. Under
uniformmode this is consumed by boths_fands_cderivations.- auto_scale_max_factor: Optional explicit max-factor
override. Under
uniformmode the symmetric bound requiresmax_factor >= 1.0(smaller raisesValueError;== 1.0warns).
has_aux: Whether
fnreturns(value, aux). options: Forwarded tooptx.minimise. max_steps: Forwarded tooptx.minimise. throw: Forwarded tooptx.minimise. tags: Forwarded tooptx.minimise. adjoint: Forwarded tooptx.minimisewhen notNone.- fn: Objective. Same convention as
- Returns:
An
optx.Solution; when scaling was applied, thestatsdict carriesscale_factorsand the_usersuffixed fields documented onunscale_solution(). Underuniformmode theScaleFactorsinstance hasuniform=Trueands_eq/s_ineqconstant-valued and equal to the same shareds_c.
- slsqp_jax.auto_scaled_problem(fn, x0, args, has_aux, *, eq_constraint_fn=None, ineq_constraint_fn=None, obj_grad_fn=None, eq_jac_fn=None, ineq_jac_fn=None, obj_hvp_fn=None, eq_hvp_fn=None, ineq_hvp_fn=None, scaling_config, atol_user=1e-06)[source]¶
Build a
ScaledProblemfrom the user’s callables andx0.Computes scale factors via either
compute_scale_factors_at_x0()(whenscaling_config.uniformisFalse— the legacy per-row behavior) orcompute_uniform_scale_factors_at_x0()(whenTrue— a single shared scalar across all constraint rows), wraps every supplied callable, and returns the bundle. Callables left asNonestay asNone– the SLSQP solver will fall back to its AD paths, which automatically pick up the scaling from the wrappedfn/constraint_fncallables.The returned
ScaledProblem.fnadheres to thehas_aux=Trueconvention (returning(value, aux)even when the user’sfnreturned just a value), matching whatoptimistix.minimiseexpects on the SLSQP path.- Return type:
- Parameters:
fn (Callable)
x0 (Array)
args (Any)
has_aux (bool)
eq_constraint_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm']] | None)
ineq_constraint_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm']] | None)
obj_grad_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'n']] | None)
eq_jac_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm n']] | None)
ineq_jac_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm n']] | None)
obj_hvp_fn (Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'n']] | None)
eq_hvp_fn (Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'm n']] | None)
ineq_hvp_fn (Callable[[Float[Array, 'n'], Float[Array, 'n'], Any], Float[Array, 'm n']] | None)
scaling_config (ScalingConfig)
atol_user (float)
- slsqp_jax.compute_scale_factors_at_x0(fn, x0, args, has_aux, *, eq_constraint_fn=None, ineq_constraint_fn=None, obj_grad_fn=None, eq_jac_fn=None, ineq_jac_fn=None, target_gradient=1.0, max_factor=1000.0, grad_floor=1e-12, atol_user=1e-06)[source]¶
Evaluate gradients at
x0and return per-component scale factors.- Return type:
- Parameters:
fn (Callable)
x0 (Array)
args (Any)
has_aux (bool)
eq_constraint_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm']] | None)
ineq_constraint_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm']] | None)
obj_grad_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'n']] | None)
eq_jac_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm n']] | None)
ineq_jac_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm n']] | None)
target_gradient (float)
max_factor (float)
grad_floor (float)
atol_user (float)
- Args:
- fn: Objective.
(x, args) -> valueor (x, args) -> (value, aux)whenhas_aux=True.
x0: Initial iterate. args: Extra payload threaded through
fn/ constraints. has_aux: Whetherfnreturns(value, aux). eq_constraint_fn: Optional equality constraint function. ineq_constraint_fn: Optional inequality constraint function. obj_grad_fn: Optional user-supplied objective gradient.When
Nonewe fall back tojax.grad(fn).- eq_jac_fn: Optional user-supplied equality Jacobian.
jax.jacrevfallback.- ineq_jac_fn: Optional user-supplied inequality Jacobian.
jax.jacrevfallback.
target_gradient: See
ScalingConfig. max_factor: SeeScalingConfig. grad_floor: SeeScalingConfig. atol_user: User-perceived feasibility tolerance. Thereturned
ScaleFactors.atol_internalcompensates for the constraint-shrinking factors.- fn: Objective.
- Returns:
A
ScaleFactorsinstance.- Notes:
Emits a
UserWarningper component whose gradient is belowgrad_floor(objective and any constraint rows). Skipped rows keeps = 1.0; the user is expected to either pick a differentx0or passauto_scale=Falseand provide their own scaling.
- slsqp_jax.compute_uniform_scale_factors_at_x0(fn, x0, args, has_aux, *, eq_constraint_fn=None, ineq_constraint_fn=None, obj_grad_fn=None, eq_jac_fn=None, ineq_jac_fn=None, target_gradient=1.0, max_factor=1000.0, grad_floor=1e-12, atol_user=1e-06)[source]¶
Evaluate gradients at
x0and return uniform scale factors.Uniform mode applies a single shared scalar
s_cto every constraint row (equality and inequality unioned) and an independent scalars_fto the objective; both are clipped symmetrically bymax_factorso they can amplify or shrink. The feasibility tolerance is propagated asatol_internal = s_c * atol_userexactly, so the inner solver’s feasibility test corresponds 1-1 to the user-unit test|c_user[i]| <= atol_userregardless of which directions_cmoved.- Return type:
- Parameters:
fn (Callable)
x0 (Array)
args (Any)
has_aux (bool)
eq_constraint_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm']] | None)
ineq_constraint_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm']] | None)
obj_grad_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'n']] | None)
eq_jac_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm n']] | None)
ineq_jac_fn (Callable[[Float[Array, 'n'], Any], Float[Array, 'm n']] | None)
target_gradient (float)
max_factor (float)
grad_floor (float)
atol_user (float)
- Args:
- fn: Objective.
(x, args) -> valueor (x, args) -> (value, aux)whenhas_aux=True.
x0: Initial iterate. args: Extra payload threaded through
fn/ constraints. has_aux: Whetherfnreturns(value, aux). eq_constraint_fn: Optional equality constraint function. ineq_constraint_fn: Optional inequality constraint function. obj_grad_fn: Optional user-supplied objective gradient.When
Nonewe fall back tojax.grad(fn).- eq_jac_fn: Optional user-supplied equality Jacobian.
jax.jacrevfallback.- ineq_jac_fn: Optional user-supplied inequality Jacobian.
jax.jacrevfallback.- target_gradient: See
ScalingConfig. Consumed by both s_fands_cderivations.- max_factor: See
ScalingConfig. Must be>= 1.0. max_factor == 1.0is legal but disables scaling and emits aUserWarning.
grad_floor: See
ScalingConfig. atol_user: User-perceived feasibility tolerance. Thereturned
ScaleFactors.atol_internaliss_c * atol_userand may exceedatol_userwhens_c > 1.- fn: Objective.
- Returns:
A
ScaleFactorsinstance withuniform=True,s_eq = jnp.full((m_eq,), s_c), ands_ineq = jnp.full((m_ineq,), s_c).- Raises:
- ValueError: If
max_factor < 1.0(the symmetric interval [1/max_factor, max_factor]would be empty).
- ValueError: If
- Notes:
Emits a
UserWarningwhen||grad_f(x0)||_infis belowgrad_floor(s_f = 1.0), when every constraint row’s gradient inf-norm is belowgrad_floor(s_c = 1.0), or whenmax_factor == 1.0(scaling disabled). No per-row warnings are emitted under uniform mode – the cross-row max dominates the per-row check, and per-row magnitudes are intentionally preserved.
- slsqp_jax.resolve_scaling_mode(mode, *, target_gradient=None, max_factor=None)[source]¶
Map a user-facing
auto_scaleargument to aScalingConfig.- Return type:
- Parameters:
- Args:
- mode:
True->"uniform"(the default),False-> None(no scaling), or one of the string aliases"uniform","balanced","knitro","ipopt","aggressive".- target_gradient: Optional explicit override of the mode’s
default target.
Noneuses the mode default.- max_factor: Optional explicit override of the mode’s default
cap.
Noneuses the mode default.
- mode:
- Returns:
ScalingConfigorNoneifmodeisFalse.- Raises:
- ValueError: If
modeis a string that is not one of the recognised aliases.
TypeError: If
modeis neither a bool nor a string.- ValueError: If
- slsqp_jax.unscale_solution(sol, factors)[source]¶
Post-process
solso user-facingstatsuse unscaled units.The primary iterate
sol.valuelives inx-space, which is not scaled by this module (variable scaling is out-of-scope – see the deferred section of the auto-scaling plan). Multipliers and the Lagrangian gradient norm are converted from scaled to user units vialambda_user = (s / s_f) * lambda_scaled.The merit penalty
rhoand its history live entirely in scaled units; we leave them as-is and add amerit_penalty_noteentry flagging the unit.- Return type:
- Parameters:
sol (Solution)
factors (ScaleFactors)
- Args:
- sol: Raw solution returned by
optimistix.minimise(or built post-hoc by the diagnostics layer).
factors: The
ScaleFactorsused to wrap the problem.- sol: Raw solution returned by
- Returns:
A new
optx.Solutionwith the augmentedstatsdict.solis not mutated.
- slsqp_jax.wrap_verbose_for_scaling(user_verbose, factors)[source]¶
Adapt a verbose callback to print user-unit values when scaling is on.
- Return type:
- Parameters:
factors (ScaleFactors)
- Args:
- user_verbose: Either a boolean (
True/False) or a user-supplied callable. Booleans select the built-in printer (
slsqp_verbosefromslsqp_jax.slsqp.verbose); a callable is forwarded after the tuple values have been rewritten.
factors: The
ScaleFactorsto undo.- user_verbose: Either a boolean (
- Returns:
A callback with the same
(**kwargs)signature asslsqp_jax.slsqp.verbose.slsqp_verbose(). Quantities with a clean unscaled equivalent are converted to user units in place; quantities that live in scaled space (merit,rho, L-BFGS internals) are printed with the suffix(s)appended to their label so the reader knows the unit.
- class slsqp_jax.DebugReport[source]¶
Bases:
objectUser-facing report produced by the diagnostics layer.
Phase 1 instances carry
signals=[]anddiagnoses=[]; the renderer still emits a useful report from the termination metadata + diagnostics counters + trajectory. Phase 2 / 3 fill the signal and diagnosis sections in.- Attributes:
- run: The
DebugRunResultthe report was built from. Carries the
SLSQPState, theStepSummarytrajectory, and any signals/diagnoses already wired by later phases.- signals: Convenience accessor exposing
run.fired_signalskeyed by name.Phase 1: empty.- diagnoses: List of multi-signal diagnoses produced by the
playbook.
Phase 1: empty.
- run: The
- run: DebugRunResult¶
- classmethod from_run(run)[source]¶
Build a
DebugReportfrom aDebugRunResult.- Return type:
- Parameters:
run (DebugRunResult)
- print_summary(*, file=None)[source]¶
Write the report to
file(defaultsys.stdout).Uses
io.IOBase.write()rather thanprint()so the call passes the project-wideno-print-statementspre-commit hook (the hook is intentionally strict; presenting a multi-paragraph diagnostics report is one of the few legitimate exceptions and we route throughwritefor clarity). The trailing newline matches whatprint()would emit so users do not have to special-case the output in a notebook.
- class slsqp_jax.DebugRunResult[source]¶
Bases:
objectAggregate result of a manual-loop debug run.
Carries everything a downstream signal evaluator or report renderer needs to do its job, plus the handles
(solver, fn, x0, args, has_aux)required to re-execute the run for ad-hoc inspection viaslsqp_jax.diagnostics.runner.capture_state_at_step().- Attributes:
solver: The
slsqp_jax.SLSQPinstance the run used. fn: The objective callable passed todebug_run. x0: The initial iterate. args: Extra positional payload threaded throughfn/constraint callables.
has_aux: Whether
fnreturns(value, aux). final_state: The terminalSLSQPStateafter the manualloop exited. Carries the full
SLSQPDiagnosticsaccumulator and every device array the end-of-run evaluators need.final_y: The terminal iterate
yreturned bystep(). final_result: The granularslsqp_jax.RESULTSterminationcode stored on
final_state.termination_code.- coarse_result: The coarse
optimistix.RESULTScode that terminate()returned (always a member of the parent enum, suitable for the optimistix driver-style check).- summaries: Per-step
StepSummaryrecords, one per iteration the loop actually executed.
- terminated_at_step: Index of the iteration where the loop
exited (0-based;
len(summaries) - 1on success).- max_steps_reached:
Trueiff the loop ran out of iterations without
terminate()returningdone.- fired_signals: Signals emitted by the per-step + end-of-run
evaluators. Empty list when the diagnostics layer is run without signals (Phase 1 default).
- coarse_result: The coarse
- final_state: SLSQPState¶
- summaries: list[StepSummary]¶
- property diagnostics: SLSQPDiagnostics¶
Convenience accessor for
final_state.diagnostics.
- property terminated_successfully: bool¶
Trueiff the run actually converged.Both conditions must hold: the granular
final_resultisRESULTS.successfuland the manual loop did not run out of its iteration budget withoutterminate()returningdone=True.The second clause matters because
RESULTS.successfulis the default termination_code on a freshSLSQPState(it means “no failure has latched yet”) — so a truncated run with no failure flag set will reportsuccessfuleven though it never actually converged.
- __init__(solver, fn, x0, args, has_aux, final_state, final_y, final_result, coarse_result, summaries, terminated_at_step, max_steps_reached, fired_signals=<factory>)¶
- class slsqp_jax.Diagnosis[source]¶
Bases:
objectNamed multi-signal diagnosis surfaced in the report.
- Attributes:
name: Stable machine-readable identifier. cause: One-paragraph mechanism explanation. suggestions: Concrete starting-point fixes. related_signals: Names of the fired signals the rule combined.
- class slsqp_jax.DiagnosticContext[source]¶
Bases:
objectContainer yielded by
diagnostic_run()’s__enter__.Carries every
DebugRunResultand matchingDebugReportproduced inside thewithblock, plus counters distinguishing intercepted SLSQP calls from passthrough calls (i.e.optx.minimiseinvocations whosesolveris something other thanSLSQP).- Attributes:
- runs: One
DebugRunResultper intercepted call, in execution order.
- reports: One
DebugReportper intercepted call, in the same order as
runs.reports[i]was built fromruns[i].- intercepted_calls: Count of
optx.minimisecalls whose solverwas anSLSQPinstance and were re-routed through the debug loop.- passthrough_calls: Count of
optx.minimisecalls whose solverwas notSLSQPand were forwarded unchanged to the originaloptx.minimise.
- runs: One
- runs: list[DebugRunResult]¶
- reports: list[DebugReport]¶
- print_summary(idx=None, *, file=None)[source]¶
Print one or all captured reports to
file(default stdout).Pass an integer
idxto print exactly that report (negative indices count from the end, matching list semantics). Withidx=None(the default) every captured report is printed in order, separated by a blank line. Useful from a notebook or anexcepthandler when the auto print on exit was suppressed (print_on_exit="never") or when the user wants to see a successful run’s report that the auto-suppress skipped.
- __init__(runs=<factory>, reports=<factory>, intercepted_calls=0, passthrough_calls=0)¶
- Parameters:
runs (list[DebugRunResult])
reports (list[DebugReport])
intercepted_calls (int)
passthrough_calls (int)
- Return type:
None
- class slsqp_jax.Signal[source]¶
Bases:
objectSingle evidence-backed hypothesis emitted by an evaluator.
The wording contract is hypothesis with evidence, not verdict.
summaryreads “X looks suspicious because Y”;detailexpands on the mechanism;suggestionsare starting points for the user to investigate, not prescriptions to apply blindly.- Attributes:
- name: Stable machine-readable identifier (used for the
playbook scope filter and the cap-policy enforcement test).
- specificity: How uniquely this pattern points at one cause.
specific= essentially diagnostic for one cause;ambiguous= consistent with two or three named causes;generic= consistent with many causes. Static editorial tag set at registration.- magnitude: How far past threshold the evidence landed.
marginal= within 10x of threshold;moderate= 10x-100x past threshold;extreme= >100x past threshold. Computed fromevidenceat fire time.- confidence: Derived from
(specificity, magnitude)via the lookup table in
slsqp_jax.diagnostics.playbook.confidence_for(). The single ranking axis the report sorts by.
summary: One-line “X looks suspicious because Y” framing. detail: Multi-paragraph mechanism explanation. evidence: Numeric evidence keyed by short name. Each value
is the actual measurement; the threshold is documented in the evaluator’s docstring.
- suggestions: Concrete starting-point fixes (config knobs,
alternative inner solvers, etc).
- artifacts: Heavy linalg objects the evaluator computed
immediately from the live state when it fired. May be empty.
- offending_step: 1-indexed step the signal fired on, or
None for end-of-run evaluators that scope over the whole run.
- __init__(name, specificity, magnitude, confidence, summary, detail, evidence=<factory>, suggestions=<factory>, artifacts=<factory>, offending_step=None)¶
- class slsqp_jax.StepSummary[source]¶
Bases:
objectHost-resident scalar summary of a single SLSQP iteration.
Every field is a plain Python
int/float/boolso the diagnostics loop can branch on it without an extra device sync. The only device → host transfer happens once insideStepSummary.from_state().- Attributes:
step_count: Iteration number after this step (1-indexed). f_val: Objective value at the post-step iterate. merit: L1 merit value at the post-step iterate. last_alpha: Line-search step size accepted at this step. qp_iterations_total: Cumulative QP active-set iterations across
all steps so far (matches
state.qp_iterations).- qp_iterations_step: QP active-set iterations consumed by this
step alone (delta of
qp_iterations_total).
qp_converged: Whether the QP solver reported success this step. qp_real_failure:
Trueiff the QP did not converge AND didnot exhaust its iteration budget (the “real failure” distinction the L-BFGS reset chain uses; see
AGENTS.md“Real-vs-budget QP failure distinction”).- qp_reached_max_iter: Whether the QP active-set loop exhausted
qp_max_iterthis step (delta ofdiagnostics.n_qp_budget_exhausted).- qp_ping_ponged: Whether the QP ping-pong short-circuit fired
this step (delta of
diagnostics.n_qp_ping_pong).
ls_success: Whether the line search reported success. consecutive_qp_failures: Outer-loop failure-streak counter. consecutive_ls_failures: Outer-loop failure-streak counter. consecutive_zero_steps: Zero-step convergence-detection counter. grad_norm: Euclidean norm of the objective gradient at the
post-step iterate.
- grad_lagrangian_norm: Euclidean norm of the Lagrangian gradient
at the post-step iterate (used by the classical stationarity criterion).
- lagrangian_value: Value of the Lagrangian
L = f - lambda_eq^T c_eq - mu_ineq^T c_ineqat the post-step iterate.- rel_kkt:
grad_lagrangian_norm / max(|L|, 1)— the legacy relative-stationarity proxy. Surfaced for backward- compatible diagnostics; note that the live convergence check now compares
grad_lagrangian_norm / max(mu_max, 1)againstrtol(filterSQP eqs. 5–6, seeslsqp_jax.slsqp.termination.compute_mu_max()), sorel_kktand the actual termination criterion may differ when|L|andmu_maxdisagree.- kkt_scale: filterSQP
mu_maxdenominator at this step. Because
state.ineq_jacalready contains the general inequality rows plus the unit-norm bound rows, this summary computes the same maximum by treating the whole inequality block as row-norm-scaled terms.- kkt_ratio: Active relative-stationarity ratio used by the
convergence test:
grad_lagrangian_norm / max(kkt_scale, 1).
gamma: Scalar L-BFGS initial-Hessian scaling. min_diag: Minimum entry of the L-BFGS per-variable diagonal. max_diag: Maximum entry of the L-BFGS per-variable diagonal. diag_kappa:
max_diag / max(min_diag, 1e-30), the L-BFGSinitial-Hessian condition-number proxy.
- lbfgs_count: Number of curvature pairs currently stored in the
ring buffer.
- lbfgs_skipped: Whether the L-BFGS append skipped its curvature
pair this step (delta of
diagnostics.n_lbfgs_skips).- max_abs_mult_eq:
max(|lambda_eq_ls|)at the post-step iterate (the stationarity-quality multiplier — the same vector exposed via
Solution.stats["multipliers_eq"]).- max_abs_mult_ineq:
max(|mu_ineq_ls|)at the post-step iterate.
- qp_vs_ls_multiplier_ratio: Maximum
|λ_qp[i]| / max(|λ_ls[i]|, eps)across active equality + general-inequality rows. Surfaces “the QP multiplier wasN xlarger than the stationarity multiplier this step”, which is the textbook signature of QP-multiplier inflation (poor L-BFGS conditioning + auto-scaling). Always1.0on the very first step where both are seeded from the samelstsqinitialisation; settles around1.0near a clean KKT point and grows on noisy / ill-conditioned iterates.- n_active_ineq: Number of active inequality constraints (general
bounds) at the post-step iterate.
- eq_jac_min_sv_est: Lower bound on the smallest singular value
of
J_eqestimated from the Cholesky ofJ_eq J_eq^T(cumulative low-water mark fromdiagnostics.eq_jac_min_sv_est).- projected_grad_norm: Norm of the inner solver’s projected
gradient (
HRInexactSTCGonly;infotherwise).- merit_penalty: L1 merit penalty parameter
rhoafter this step.
max_eq_violation:
max|c_eq|at the post-step iterate. max_ineq_violation:max(0, -c_ineq)at the post-stepiterate.
- proj_residual_high_water: Cumulative high-water mark of the
MinresQLPSolverM-metric projection residual. Always0.0for null-space solvers.- diverging: Whether the best-iterate divergence rollback fired
this step.
blowup_count: Consecutive blowup events at this step. merit_regression_step: Whether
compute_meritexceeded thebest merit despite the line search reporting success (delta of
diagnostics.n_merit_regressions).
- classmethod from_state(state, *, prev_state=None)[source]¶
Materialise a
StepSummaryfrom a liveSLSQPState.This is the single device → host transfer per debug-loop iteration. Everything the runner inspects for control flow below this point reads from the returned
StepSummary.prev_stateis theSLSQPStatefrom the previous iteration (or the initial-state fromsolver.initfor step 1). It is used solely to compute single-step deltas of the cumulative diagnostics counters (n_qp_budget_exhausted,n_qp_ping_pong,n_lbfgs_skips,n_merit_regressions) andstate.qp_iterations. PassingNonetreats every cumulative counter as starting from 0, which matches the behaviour ofsolver.init(which produces a state with all-zero diagnostics).- Return type:
- Parameters:
state (SLSQPState)
prev_state (SLSQPState | None)
- reproducibility_digest()[source]¶
Return a short hex digest used by
capture_state_at_step.Two
StepSummaryinstances produced by independent runs of the same(solver, x0, args)should hash to the same digest on the same hardware /jax.configsettings. Floats are rounded to_HASH_PRECISION_DECIMALSdecimals before hashing so a last-bit difference (e.g. from a fused-multiply-add reorder) does not false-positive on nondeterminism.- Return type:
- __init__(step_count, f_val, merit, last_alpha, qp_iterations_total, qp_iterations_step, qp_converged, qp_real_failure, qp_reached_max_iter, qp_ping_ponged, ls_success, consecutive_qp_failures, consecutive_ls_failures, consecutive_zero_steps, grad_norm, grad_lagrangian_norm, lagrangian_value, rel_kkt, gamma, min_diag, max_diag, diag_kappa, lbfgs_count, lbfgs_skipped, max_abs_mult_eq, max_abs_mult_ineq, qp_vs_ls_multiplier_ratio, n_active_ineq, eq_jac_min_sv_est, projected_grad_norm, merit_penalty, max_eq_violation, max_ineq_violation, proj_residual_high_water, diverging, blowup_count, merit_regression_step, kkt_scale=0.0, kkt_ratio=inf)¶
- Parameters:
step_count (int)
f_val (float)
merit (float)
last_alpha (float)
qp_iterations_total (int)
qp_iterations_step (int)
qp_converged (bool)
qp_real_failure (bool)
qp_reached_max_iter (bool)
qp_ping_ponged (bool)
ls_success (bool)
consecutive_qp_failures (int)
consecutive_ls_failures (int)
consecutive_zero_steps (int)
grad_norm (float)
grad_lagrangian_norm (float)
lagrangian_value (float)
rel_kkt (float)
gamma (float)
min_diag (float)
max_diag (float)
diag_kappa (float)
lbfgs_count (int)
lbfgs_skipped (bool)
max_abs_mult_eq (float)
max_abs_mult_ineq (float)
qp_vs_ls_multiplier_ratio (float)
n_active_ineq (int)
eq_jac_min_sv_est (float)
projected_grad_norm (float)
merit_penalty (float)
max_eq_violation (float)
max_ineq_violation (float)
proj_residual_high_water (float)
diverging (bool)
blowup_count (int)
merit_regression_step (bool)
kkt_scale (float)
kkt_ratio (float)
- Return type:
None
- slsqp_jax.capture_state_at_step(solver, fn, x0, step, *, args=None, has_aux=False, expected_summary=None)[source]¶
Re-run
solverto stepstepand return the liveSLSQPState.This is the public ad-hoc inspection tool: signals already build their artifacts inline at the moment they fire, so most users will never call this. It exists for the case where the user wants to poke at a step their signals did not preserve a snapshot of.
- Return type:
- Parameters:
solver (SLSQP)
fn (Callable)
x0 (Any)
step (int)
args (Any)
has_aux (bool)
expected_summary (Optional[StepSummary])
- Args:
solver: The
slsqp_jax.SLSQPinstance to re-run. fn: Objective callable (same shape as fordebug_run()). x0: Initial iterate. step: Target iteration to stop at (1-indexed;step=kreturns the state immediately after the
k-th call tosolver.step).args: Extra payload threaded through
fn. has_aux: Whetherfnreturns(value, aux). expected_summary: OptionalStepSummaryfrom theoriginal
debug_run()at the same step. When supplied, the recovered state’s summary is hashed and compared against it; a mismatch raisesRuntimeError. This is the load-bearing reproducibility check that prevents the tool from silently lying about which iterate it is showing.- Returns:
The live
SLSQPStateimmediately after stepstep.- Raises:
ValueError: If
stepis non-positive. RuntimeError: Ifexpected_summaryis supplied and therecovered state’s reproducibility digest does not match.
- slsqp_jax.debug_run(solver, fn, x0, *, args=None, max_steps=None, has_aux=False, per_step_evaluators=(), end_of_run_evaluators=())[source]¶
Run
solverunder a manual Python loop and return aDebugRunResult.The loop reproduces what
optimistix.minimisewould do, iteratingsolver.init→solver.step→solver.terminateon each step. After every step the liveSLSQPStateis summarised into aStepSummaryand (when Phase 2 wires them) the per-step evaluators inper_step_evaluatorsare run against(state, summary, summaries_so_far). After the loop exits,end_of_run_evaluatorsare run against(final_state, summaries, final_result).- Return type:
- Parameters:
- Args:
solver: The
slsqp_jax.SLSQPinstance to run. fn: Objective callable.(x, args) -> valueby default,or
(x, args) -> (value, aux)whenhas_aux=True.x0: Initial iterate. args: Extra payload threaded through
fnand the constraintcallables on
solver.- max_steps: Optional iteration budget override. Defaults to
solver.max_steps.
has_aux: Whether
fnreturns(value, aux). per_step_evaluators: Tuple of callables(state, summary, summaries) -> Signal | Noneinvoked after each step. Empty by default (Phase 1 ships without signals).- end_of_run_evaluators: Tuple of callables
(final_state, summaries, result) -> Signal | Noneinvoked once after the loop exits. Empty by default.
- Returns:
DebugRunResultwith the per-stepStepSummarytrajectory, the terminal state, the granular and coarse termination codes, and any signals fired during the run.
- slsqp_jax.diagnose(solver, fn, x0, *, args=None, max_steps=None, has_aux=False)[source]¶
Run
solverunder a manual loop and return aDebugReport.Convenience wrapper around
debug_run()+DebugReport. Uses the registeredPER_STEP_EVALUATORSandEND_OF_RUN_EVALUATORS; the report is always produced (even on a successful run), so the user can request a diagnosis when they suspect slow-but-converged behaviour, not just on hard failures.- Return type:
- Parameters:
- slsqp_jax.diagnose_minimize_like_scipy(fun, x0, args=(), *, jac=None, hessp=None, bounds=None, constraints=(), options=None, has_aux=False, throw=False, verbose=False, max_steps_override=None)[source]¶
Drop-in replacement for
slsqp_jax.minimize_like_scipy()with diagnostics.Mirrors
slsqp_jax.minimize_like_scipy()’s public signature 1:1 so callers can swap one import + one call name without any other code changes:# before from slsqp_jax import minimize_like_scipy sol = minimize_like_scipy(fun, x0, ...) # after from slsqp_jax import diagnose_minimize_like_scipy sol, report = diagnose_minimize_like_scipy(fun, x0, ...) report.print_summary()
Unlike
diagnostic_run(), this function performs no monkey-patching — it is the recommended path for users whose wrapper code doesfrom optimistix import minimise(which bypasses the context manager’s patch), as well as for tests and CI where global state must be avoided.The extra keyword
max_steps_overrideaccepts an integer to cap the diagnostic loop independently of theoptions['maxiter']that builds the underlyingSLSQP;None(default) honours the option dict’s own value.- Return type:
- Parameters:
- Returns:
A
(sol, report)tuple.solis a normaloptx.Solutionindistinguishable from whatslsqp_jax.minimize_like_scipy()would have returned (modulo the device-resident state the diagnostics layer keeps around);reportis the populatedDebugReportwith all signals + diagnoses already evaluated.- Raises:
Whatever
minimize_like_scipywould raise plus, whenthrow=Trueand the run did not converge,optimistix.RESULTS-flavoured runtime errors viaEnumerationItem.error_if.
- slsqp_jax.diagnostic_run(*, max_steps=None, print_on_exit='auto', output=None)[source]¶
Context manager that intercepts
optimistix.minimisecalls.Inside the
withblock, anyoptimistix.minimise(fn, solver, ...)whosesolveris anslsqp_jax.SLSQPinstance is re-routed throughslsqp_jax.diagnostics.debug_run(), the resultingDebugRunResultandDebugReportare stashed on the yieldedDiagnosticContext, and a fully populatedoptx.Solutionis returned to the caller — so the user’s wrapper code (whether it goes throughslsqp_jax.minimize_like_scipy()or builds its ownSLSQP) runs unchanged. Calls whosesolveris something other thanSLSQPare forwarded to the originaloptimistix.minimiseunchanged, with the only side effect being a bump onDiagnosticContext.passthrough_calls.- Return type:
- Parameters:
- Args:
- max_steps: Optional iteration-budget override forwarded to
every intercepted
debug_run.None(the default) uses each call’s ownmax_stepsargument, falling back tosolver.max_steps.- print_on_exit: When the context exits, print captured reports
to
output."auto"(default): only failing runs plus a trailer;"always": every run plus the trailer;"never": nothing.- output: File-like sink for the on-exit print. Defaults to
sys.stdout.
- Yields:
DiagnosticContextcollecting the runs and reports for every intercepted call.- Raises:
- RuntimeError: If another
diagnostic_run()is already active (re-entrant or concurrent enter). The patch is a global mutation; serialising would mask bugs.
- RuntimeError: If another
Example:
with slsqp_jax.diagnostic_run() as ctx: sol = slsqp_jax.minimize_like_scipy(fun, x0, ...) # On exit, any failing report has already been printed. # The Solution is the real optimistix.Solution. ctx.runs[0].diagnostics # SLSQPDiagnostics counters
- class slsqp_jax.AbstractInnerSolver[source]¶
Bases:
ModuleStrategy for solving the equality-constrained QP subproblem.
Subclasses implement
solveto compute the search directiondand Lagrange multipliers for the active constraints.- abstractmethod solve(hvp_fn, g, A, b, active_mask, precond_fn=None, free_mask=None, d_fixed=None, adaptive_tol=None)[source]¶
Solve the equality-constrained QP subproblem.
Solves:
minimize (1/2) d^T B d + g^T d subject to A[active] d = b[active] d[i] = d_fixed[i] for i where free_mask[i] is False
where B is given implicitly via
hvp_fn(v) = B @ v.- Return type:
- Parameters:
hvp_fn (Callable[[Float[Array, 'n']], Float[Array, 'n']])
g (Float[Array, 'n'])
A (Float[Array, 'm n'])
b (Float[Array, 'm'])
active_mask (Bool[Array, 'm'])
precond_fn (Callable[[Float[Array, 'n']], Float[Array, 'n']] | None)
free_mask (Bool[Array, 'n'] | None)
d_fixed (Float[Array, 'n'] | None)
adaptive_tol (Float[Array, ''] | float | None)
- Args:
hvp_fn: Hessian-vector product function v -> B @ v. g: Linear term (gradient of objective). A: Combined constraint matrix (m x n). b: Combined RHS vector (m,). active_mask: Boolean mask (m,) indicating active constraints. precond_fn: Optional preconditioner v -> M @ v where M ~ B^{-1}. free_mask: Optional boolean mask (n,). When provided, only
variables with
free_mask[i] = Trueare optimized.- d_fixed: Values for fixed variables (n,). Required when
free_maskis provided.- adaptive_tol: Optional Eisenstat-Walker tolerance override.
When provided, overrides the solver’s default convergence tolerance for this call only.
- Returns:
InnerSolveResultwith the direction, multipliers, and convergence flag.
- build_projection_context(hvp_fn, g, A, b, active_mask, precond_fn=None, free_mask=None, d_fixed=None)[source]¶
Build a reusable projector + multiplier-recovery context.
Composed strategies (e.g.
HRInexactSTCG) call this on the underlying inner solver to obtain its null-space projector, particular solution and multiplier-recovery closure without running the projector’s own CG loop.The default implementation raises
NotImplementedErrorso full-KKT solvers (MinresQLPSolver) cleanly opt out — they have no separate projection step and therefore cannot supply the inexact-projectorW̃_kthat HR Algorithm 4.5 needs.- Return type:
- Parameters:
- __init__()¶
- Return type:
None
- class slsqp_jax.HRInexactSTCG[source]¶
Bases:
AbstractInnerSolverHeinkenschloss-Ridzal (2014) Algorithm 4.5 — STCG with inexact null-space projections.
Composes an existing null-space inner solver (
ProjectedCGCholeskyorProjectedCGCraig) to obtain its projectorW̃_k, particular solutiond_pand multiplier-recovery closure, then runs a separate CG iteration on top whose three textbook three-term-recurrence cancellations are replaced by full H-conjugacy reorthogonalisation against every previous search direction.See
AGENTS.md(“Pluggable Inner QP Solvers” →HRInexactSTCG) for the full algorithmic discussion and references.- Attributes:
- inner: Composed null-space inner solver supplying the
projector and multiplier-recovery infrastructure. Must implement
build_projection_context; the saddle-pointMinresQLPSolverdoes not and will raise on the firstsolvecall.- max_cg_iter: Static upper bound on the number of inner CG
iterations. Determines the size of the reorth buffers.
- cg_tol: Relative convergence tolerance for the projected
residual
‖z̃_i‖ ≤ tol · ‖r̃_0‖.- cg_regularization: Curvature-guard threshold
δ²used by the SNOPT-style scale-invariant short-circuit
⟨p̃, H p̃⟩ ≤ δ² ‖p̃‖². Defaults to1e-6; set to0.0to disable.
- inner: AbstractInnerSolver¶
- build_projection_context(hvp_fn, g, A, b, active_mask, precond_fn=None, free_mask=None, d_fixed=None)[source]¶
Build a reusable projector + multiplier-recovery context.
Composed strategies (e.g.
HRInexactSTCG) call this on the underlying inner solver to obtain its null-space projector, particular solution and multiplier-recovery closure without running the projector’s own CG loop.The default implementation raises
NotImplementedErrorso full-KKT solvers (MinresQLPSolver) cleanly opt out — they have no separate projection step and therefore cannot supply the inexact-projectorW̃_kthat HR Algorithm 4.5 needs.- Return type:
- Parameters:
- solve(hvp_fn, g, A, b, active_mask, precond_fn=None, free_mask=None, d_fixed=None, adaptive_tol=None)[source]¶
Solve the equality-constrained QP subproblem.
Solves:
minimize (1/2) d^T B d + g^T d subject to A[active] d = b[active] d[i] = d_fixed[i] for i where free_mask[i] is False
where B is given implicitly via
hvp_fn(v) = B @ v.- Return type:
- Parameters:
hvp_fn (Callable[[Float[Array, 'n']], Float[Array, 'n']])
g (Float[Array, 'n'])
A (Float[Array, 'm n'])
b (Float[Array, 'm'])
active_mask (Bool[Array, 'm'])
precond_fn (Callable[[Float[Array, 'n']], Float[Array, 'n']] | None)
free_mask (Bool[Array, 'n'] | None)
d_fixed (Float[Array, 'n'] | None)
adaptive_tol (Float[Array, ''] | float | None)
- Args:
hvp_fn: Hessian-vector product function v -> B @ v. g: Linear term (gradient of objective). A: Combined constraint matrix (m x n). b: Combined RHS vector (m,). active_mask: Boolean mask (m,) indicating active constraints. precond_fn: Optional preconditioner v -> M @ v where M ~ B^{-1}. free_mask: Optional boolean mask (n,). When provided, only
variables with
free_mask[i] = Trueare optimized.- d_fixed: Values for fixed variables (n,). Required when
free_maskis provided.- adaptive_tol: Optional Eisenstat-Walker tolerance override.
When provided, overrides the solver’s default convergence tolerance for this call only.
- Returns:
InnerSolveResultwith the direction, multipliers, and convergence flag.
- __init__(inner, max_cg_iter, cg_tol, cg_regularization=1e-06)¶
- Parameters:
inner (AbstractInnerSolver)
max_cg_iter (int)
cg_tol (Float[Array, ''] | float)
cg_regularization (float)
- Return type:
None
- class slsqp_jax.InnerSolveResult[source]¶
Bases:
NamedTupleResult from an inner equality-constrained QP solve.
- Attributes:
d: Search direction. multipliers: Lagrange multipliers (shape
(m,); entries forinactive constraints are zero).
- converged: True when the inner Krylov / projection iteration
satisfied its tolerance.
- proj_residual: Post-solve constraint residual
||A d - b|| (Euclidean norm, restricted to active rows). Always
0for null-space solvers (CG / CRAIG) where feasibility is enforced structurally; non-zero forMinresQLPSolverwhere it reflects the floor of the M-metric range-space projection after iterative refinement.- n_proj_refinements: Number of M-metric projection refinement
rounds actually applied. Always
0for null-space solvers. At mostMinresQLPSolver.proj_refine_max_iter.- projected_grad_norm: Norm of the projected initial gradient
W̃_k gthat the inner solver actually iterated against (HR 2014, Theorem 3.5). This is the noise-aware stationarity proxy: when the outer SQP enablesuse_inexact_stationarity, the run is allowed to converge once this value drops belowrtol * max(mu_max, 1)(filterSQP eq. 6 with the shared denominator fromslsqp_jax.slsqp.termination.compute_mu_max()). Defaults toinfso that solvers which do not produce this quantity (i.e. anything other thanHRInexactSTCG) cannot accidentally satisfy a< rtoltest even if the user toggles the flag — the inexact path silently degrades to “never converges this way”.
- d: Float[Array, 'n']¶
Alias for field number 0
- multipliers: Float[Array, 'm']¶
Alias for field number 1
- converged: Bool[Array, '']¶
Alias for field number 2
- proj_residual: Float[Array, '']¶
Alias for field number 3
- n_proj_refinements: Int[Array, '']¶
Alias for field number 4
- projected_grad_norm: Float[Array, '']¶
Alias for field number 5
- class slsqp_jax.MinresQLPSolver[source]¶
Bases:
AbstractInnerSolverPreconditioned MINRES-QLP on the full saddle-point KKT system.
Solves the KKT system directly:
[B A^T] [d] [-g] [A 0 ] [lambda] = [b ]
using PMINRES-QLP (Choi, Paige & Saunders, SISC 2011, Table 3.5) with a block-diagonal SPD preconditioner:
M = [B_diag^{-1} 0 ] [0 S^{-1} ]
where
B_diag = diag(B_0)(L-BFGS diagonal) andS = A B_diag^{-1} A^Tis the Schur complement.After PMINRES-QLP returns the iterate
d, an M-metric range-space projection drivesA d = bon the active rows. The single shot is followed by up toproj_refine_max_iterrounds of iterative refinement, each costing one matvec + one Schur back-solve (no refactorisation). Refinement squares the relative feasibility error per round. See HR (2014, Algorithm 4.18 step 1(a)) for the motivation.- solve(hvp_fn, g, A, b, active_mask, precond_fn=None, free_mask=None, d_fixed=None, adaptive_tol=None)[source]¶
Solve the equality-constrained QP subproblem.
Solves:
minimize (1/2) d^T B d + g^T d subject to A[active] d = b[active] d[i] = d_fixed[i] for i where free_mask[i] is False
where B is given implicitly via
hvp_fn(v) = B @ v.- Return type:
- Parameters:
hvp_fn (Callable[[Float[Array, 'n']], Float[Array, 'n']])
g (Float[Array, 'n'])
A (Float[Array, 'm n'])
b (Float[Array, 'm'])
active_mask (Bool[Array, 'm'])
precond_fn (Callable[[Float[Array, 'n']], Float[Array, 'n']] | None)
free_mask (Bool[Array, 'n'] | None)
d_fixed (Float[Array, 'n'] | None)
adaptive_tol (Float[Array, ''] | float | None)
- Args:
hvp_fn: Hessian-vector product function v -> B @ v. g: Linear term (gradient of objective). A: Combined constraint matrix (m x n). b: Combined RHS vector (m,). active_mask: Boolean mask (m,) indicating active constraints. precond_fn: Optional preconditioner v -> M @ v where M ~ B^{-1}. free_mask: Optional boolean mask (n,). When provided, only
variables with
free_mask[i] = Trueare optimized.- d_fixed: Values for fixed variables (n,). Required when
free_maskis provided.- adaptive_tol: Optional Eisenstat-Walker tolerance override.
When provided, overrides the solver’s default convergence tolerance for this call only.
- Returns:
InnerSolveResultwith the direction, multipliers, and convergence flag.
- class slsqp_jax.ProjectedCGCholesky[source]¶
Bases:
AbstractInnerSolverProjected CG with Cholesky-based null-space projection.
This is the original implementation: Cholesky-factor
A A^T(with regularization), use it for the null-space projector and particular solution, run CG in the null space, and recover multipliers via iterative refinement.When
use_constraint_preconditionerisTrueand a preconditioner is provided, the constraint preconditioner (Gould, Hribar & Nocedal, 2001) is used instead of the naiveP(M(r)).- solve(hvp_fn, g, A, b, active_mask, precond_fn=None, free_mask=None, d_fixed=None, adaptive_tol=None)[source]¶
Solve the equality-constrained QP subproblem.
Solves:
minimize (1/2) d^T B d + g^T d subject to A[active] d = b[active] d[i] = d_fixed[i] for i where free_mask[i] is False
where B is given implicitly via
hvp_fn(v) = B @ v.- Return type:
- Parameters:
hvp_fn (Callable[[Float[Array, 'n']], Float[Array, 'n']])
g (Float[Array, 'n'])
A (Float[Array, 'm n'])
b (Float[Array, 'm'])
active_mask (Bool[Array, 'm'])
precond_fn (Callable[[Float[Array, 'n']], Float[Array, 'n']] | None)
free_mask (Bool[Array, 'n'] | None)
d_fixed (Float[Array, 'n'] | None)
adaptive_tol (Float[Array, ''] | float | None)
- Args:
hvp_fn: Hessian-vector product function v -> B @ v. g: Linear term (gradient of objective). A: Combined constraint matrix (m x n). b: Combined RHS vector (m,). active_mask: Boolean mask (m,) indicating active constraints. precond_fn: Optional preconditioner v -> M @ v where M ~ B^{-1}. free_mask: Optional boolean mask (n,). When provided, only
variables with
free_mask[i] = Trueare optimized.- d_fixed: Values for fixed variables (n,). Required when
free_maskis provided.- adaptive_tol: Optional Eisenstat-Walker tolerance override.
When provided, overrides the solver’s default convergence tolerance for this call only.
- Returns:
InnerSolveResultwith the direction, multipliers, and convergence flag.
- build_projection_context(hvp_fn, g, A, b, active_mask, precond_fn=None, free_mask=None, d_fixed=None)[source]¶
Build a reusable projector + multiplier-recovery context.
Composed strategies (e.g.
HRInexactSTCG) call this on the underlying inner solver to obtain its null-space projector, particular solution and multiplier-recovery closure without running the projector’s own CG loop.The default implementation raises
NotImplementedErrorso full-KKT solvers (MinresQLPSolver) cleanly opt out — they have no separate projection step and therefore cannot supply the inexact-projectorW̃_kthat HR Algorithm 4.5 needs.- Return type:
- Parameters:
- class slsqp_jax.ProjectedCGCraig[source]¶
Bases:
AbstractInnerSolverProjected CG with CRAIG-based iterative null-space projection.
Replaces the Cholesky factorization of
A A^Twith iterative CRAIG solves (Golub-Kahan bidiagonalization). This eliminates theO(m^3)factorization cost and the1e-8diagonal regularization, at the cost of an iterative solve per projection.For multiplier recovery (done once after the CG loop), CG on the normal equations
A A^T y = rhsis used, reusing the existingsolve_unconstrained_cginfrastructure.- solve(hvp_fn, g, A, b, active_mask, precond_fn=None, free_mask=None, d_fixed=None, adaptive_tol=None)[source]¶
Solve the equality-constrained QP subproblem.
Solves:
minimize (1/2) d^T B d + g^T d subject to A[active] d = b[active] d[i] = d_fixed[i] for i where free_mask[i] is False
where B is given implicitly via
hvp_fn(v) = B @ v.- Return type:
- Parameters:
hvp_fn (Callable[[Float[Array, 'n']], Float[Array, 'n']])
g (Float[Array, 'n'])
A (Float[Array, 'm n'])
b (Float[Array, 'm'])
active_mask (Bool[Array, 'm'])
precond_fn (Callable[[Float[Array, 'n']], Float[Array, 'n']] | None)
free_mask (Bool[Array, 'n'] | None)
d_fixed (Float[Array, 'n'] | None)
adaptive_tol (Float[Array, ''] | float | None)
- Args:
hvp_fn: Hessian-vector product function v -> B @ v. g: Linear term (gradient of objective). A: Combined constraint matrix (m x n). b: Combined RHS vector (m,). active_mask: Boolean mask (m,) indicating active constraints. precond_fn: Optional preconditioner v -> M @ v where M ~ B^{-1}. free_mask: Optional boolean mask (n,). When provided, only
variables with
free_mask[i] = Trueare optimized.- d_fixed: Values for fixed variables (n,). Required when
free_maskis provided.- adaptive_tol: Optional Eisenstat-Walker tolerance override.
When provided, overrides the solver’s default convergence tolerance for this call only.
- Returns:
InnerSolveResultwith the direction, multipliers, and convergence flag.
- build_projection_context(hvp_fn, g, A, b, active_mask, precond_fn=None, free_mask=None, d_fixed=None)[source]¶
Build a reusable projector + multiplier-recovery context.
Composed strategies (e.g.
HRInexactSTCG) call this on the underlying inner solver to obtain its null-space projector, particular solution and multiplier-recovery closure without running the projector’s own CG loop.The default implementation raises
NotImplementedErrorso full-KKT solvers (MinresQLPSolver) cleanly opt out — they have no separate projection step and therefore cannot supply the inexact-projectorW̃_kthat HR Algorithm 4.5 needs.- Return type:
- Parameters:
- __init__(max_cg_iter, cg_tol, cg_regularization=1e-06, use_constraint_preconditioner=False, craig_tol=1e-10, craig_max_iter=200, mult_recovery_tol=1e-12, mult_recovery_max_iter=200)¶
- class slsqp_jax.ProjectionContext[source]¶
Bases:
NamedTupleReusable bundle of projector + particular-solution + multiplier-recovery closures for an active equality system
A_active d = b_active.Existing strategies (
ProjectedCGCholesky,ProjectedCGCraig) build these inline inside theirsolvemethods. Composed strategies (e.g.HRInexactSTCG) callinner.build_projection_context(...)to reuse the projector and multiplier-recovery infrastructure of an underlying null-space solver while running their own CG loop on top.- Attributes:
- project: Inexact null-space projector
W̃_k(v). Maps a vector in the (free) ambient space to its projection onto
null(A_work)using whatever inner approximation the underlying strategy provides (Cholesky forProjectedCGCholesky, CRAIG forProjectedCGCraig).- d_p: Particular solution.
A_work @ d_p == b_workto inner solver precision;
d_palready incorporatesd_fixedon the bound-fixed coordinates, so it lives in the fulln- dimensional space.- recover_multipliers: Closure mapping
(B d + g)to a length-m multiplier vector with zeros on inactive rows. Encapsulates both the inversion of
A_work A_workᵀand the active-mask zeroing. HR Algorithm 4.5 calls this once per outer step (after the modified-residual CG iteration converges).- hvp_work: Working-subspace HVP. Equal to
hvp_fnwhen no bound fixing is in effect; otherwise
v -> _free * hvp_fn(_free * v)so the iteration only sees the free coordinates.- g_eff: Effective gradient
g + B @ d_pevaluated against hvp_work. HR’s notation calls thisg_k; it is the input to the projected-residual recurrence.- A_work: Already-masked working constraint matrix (active rows,
free columns). Surfaced primarily so callers can sanity-check the residual
||A_work @ d - b_work||;projectandrecover_multipliersalready incorporate it.- free_mask: Boolean mask of free variables (always present;
equals
ones(n)when no bound-fixing is in effect).- d_fixed: Fixed-variable values on bound-active coordinates
(zeros elsewhere; zeros everywhere when no bound-fixing).
- has_fixed:
Trueiff any coordinate is bound-fixed. Cheap indicator so consumers do not have to redo the mask check.
- converged: Convergence flag of the inner projector solve that
built the context (always
Truefor Cholesky; carries the CRAIG breakdown / convergence flag for the iterative projector). Composed strategies AND this with their own convergence to surface inner-projector failures upstream.
- project: Inexact null-space projector
- d_p: Float[Array, 'n']¶
Alias for field number 1
- g_eff: Float[Array, 'n']¶
Alias for field number 4
- A_work: Float[Array, 'm n']¶
Alias for field number 5
- free_mask: Bool[Array, 'n']¶
Alias for field number 6
- d_fixed: Float[Array, 'n']¶
Alias for field number 7
- converged: Bool[Array, '']¶
Alias for field number 9
- slsqp_jax.solve_qp(hvp_fn, g, A_eq, b_eq, A_ineq, b_ineq, max_iter=100, max_cg_iter=50, tol=1e-08, expand_factor=1.0, initial_active_set=None, kkt_residual=0.0, proximal_mu=0.0, prev_multipliers_eq=None, precond_fn=None, cg_tol=None, cg_regularization=1e-06, use_proximal=True, predicted_active_set=None, active_set_method='expand', use_constraint_preconditioner=False, inner_solver=None, mult_drop_floor=1e-06, ping_pong_threshold=2147483647)[source]¶
Solve a QP with equality and inequality constraints.
Solves:
minimize (1/2) d^T H d + g^T d subject to A_eq d = b_eq A_ineq d >= b_ineq
where H is provided implicitly via
hvp_fn(v) = H @ v.The QP active-set loop dispatches to one of three strategies:
Proximal sSQP (
m_eq > 0anduse_proximal=True): equality constraints absorbed into the objective via augmented-Lagrangian penalty. Seeslsqp_jax.qp.proximal.Direct projection (
m_eq > 0anduse_proximal=False): equality constraints enforced via null-space projection in the inner solver. Seeslsqp_jax.qp.direct.Inequality-only (
m_eq == 0): no equality block; just an active-set loop on inequality constraints. Seeslsqp_jax.qp.inequality.
All three share the same EXPAND / ping-pong active-set loop body (see
slsqp_jax.qp.active_set).- Args:
hvp_fn: Hessian-vector product function v -> H @ v. g: Linear term of the objective (gradient). A_eq: Equality constraint matrix (m_eq x n). b_eq: Equality constraint RHS (m_eq,). A_ineq: Inequality constraint matrix (m_ineq x n). b_ineq: Inequality constraint RHS (m_ineq,). max_iter: Maximum active-set iterations. max_cg_iter: Maximum CG iterations per active-set step. tol: Feasibility and optimality tolerance. expand_factor: EXPAND tolerance growth rate. initial_active_set: Optional warm-start active set from a
previous QP solve.
kkt_residual: Norm of the KKT residual from the outer solver. proximal_mu: Adaptive proximal parameter for sSQP. prev_multipliers_eq: Equality multipliers from the previous
outer iteration (proximal centre).
precond_fn: Optional preconditioner v -> M @ v. cg_tol: Optional CG convergence tolerance overriding
tolfor the inner solver only.
cg_regularization: Curvature-guard threshold for CG. use_proximal: When True, equality constraints go through the
sSQP proximal path. When False, direct projection.
- predicted_active_set: Optional LPEC-A predicted active set
for warm-start.
active_set_method:
"expand","lpeca_init", or"lpeca". use_constraint_preconditioner: Used only when constructing adefault
inner_solver.- inner_solver: Pluggable strategy for the inner equality-
constrained QP solve. Defaults to
ProjectedCGCholesky.
mult_drop_floor: Floor on the negative-multiplier drop test. ping_pong_threshold: Threshold for the explicit ping-pong
short-circuit. Defaults to
2**31 - 1(effectively disabled).- Returns:
QPSolverResultcontaining the solution, multipliers, active set, and convergence info.
- Return type:
- Parameters:
hvp_fn (Callable)
g (Float[Array, 'n'])
A_eq (Float[Array, 'm_eq n'])
b_eq (Float[Array, 'm_eq'])
A_ineq (Float[Array, 'm_ineq n'])
b_ineq (Float[Array, 'm_ineq'])
max_iter (int)
max_cg_iter (int)
tol (Float[Array, ''] | float)
expand_factor (float)
initial_active_set (Bool[Array, 'm_ineq'] | None)
kkt_residual (Float[Array, ''] | float)
proximal_mu (Float[Array, ''] | float)
prev_multipliers_eq (Float[Array, 'm_eq'] | None)
precond_fn (Callable | None)
cg_tol (Float[Array, ''] | float | None)
cg_regularization (float)
use_proximal (bool)
predicted_active_set (Bool[Array, 'm_ineq'] | None)
active_set_method (str)
use_constraint_preconditioner (bool)
inner_solver (AbstractInnerSolver | None)
mult_drop_floor (float)
ping_pong_threshold (int)
- class slsqp_jax.LBFGSHistory[source]¶
Bases:
ModuleL-BFGS history buffer for matrix-free Hessian approximation.
Stores the last k (s, y) pairs in a circular buffer and provides efficient Hessian-vector products via the compact representation.
- Attributes:
s_history: Stored step vectors s_i = x_{i+1} - x_i. y_history: Stored (damped) gradient differences y_i. gamma: Scalar summary of the initial Hessian scaling. diagonal: Per-variable initial Hessian scaling (B_0 = diag(d)).
During normal operation this equals
gamma * ones(n). After an SNOPT-style reset it stores per-variable curvature.count: Number of valid pairs stored (0 to memory size). next_idx: Next write position in the circular buffer. eig_lower: Estimate of lambda_min(H) from VARCHEN Theorem 2. eig_upper: Estimate of lambda_max(H) from VARCHEN Theorem 2.
- s_history: Float[Array, 'memory n']¶
- y_history: Float[Array, 'memory n']¶
- gamma: Float[Array, '']¶
- diagonal: Float[Array, 'n']¶
- count: Int[Array, '']¶
- next_idx: Int[Array, '']¶
- eig_lower: Float[Array, '']¶
- eig_upper: Float[Array, '']¶
- __init__(s_history, y_history, gamma, diagonal, count, next_idx, eig_lower, eig_upper)¶
- Parameters:
s_history (Float[Array, 'memory n'])
y_history (Float[Array, 'memory n'])
gamma (Float[Array, ''])
diagonal (Float[Array, 'n'])
count (Int[Array, ''])
next_idx (Int[Array, ''])
eig_lower (Float[Array, ''])
eig_upper (Float[Array, ''])
- Return type:
None
- slsqp_jax.lbfgs_init(n, memory)[source]¶
Initialize an empty L-BFGS history buffer.
- Return type:
- Parameters:
- Args:
n: Dimension of the parameter space. memory: Maximum number of (s, y) pairs to store (typically 5-20).
- Returns:
An initialized LBFGSHistory with no stored pairs and gamma=1.
- slsqp_jax.lbfgs_hvp(history, v)[source]¶
Compute B @ v using the L-BFGS compact representation.
Uses the compact form with diagonal initial Hessian B_0 = diag(d) (Byrd, Nocedal & Schnabel, 1994, Theorem 2.2):
B = B_0 - [B_0 S, Y] @ M^{-1} @ [S^T B_0; Y^T]
- where:
M = [[S^T B_0 S, L], [L^T, -D_sy]] (2k x 2k) L_{ij} = s_i^T y_j for i > j (strictly lower triangular) D_sy = diag(s_i^T y_i) (diagonal)
When no pairs are stored (count=0), this reduces to B = diag(d).
Complexity: O(k^2 n) where k is the number of stored pairs.
- Args:
history: L-BFGS history buffer. v: Vector to multiply by the Hessian approximation.
- Returns:
B @ v, the Hessian-vector product.
- Return type:
Float[Array, 'n']- Parameters:
history (LBFGSHistory)
v (Float[Array, 'n'])
- slsqp_jax.lbfgs_inverse_hvp(history, v)[source]¶
Compute H @ v = B^{-1} @ v via the L-BFGS two-loop recursion.
Implements Nocedal & Wright Algorithm 7.4 with diagonal initial scaling H_0 = diag(1/diagonal) instead of scalar (1/gamma) I.
Complexity: O(kn) where k is the number of stored pairs.
- Return type:
Float[Array, 'n']- Parameters:
history (LBFGSHistory)
v (Float[Array, 'n'])
- Args:
history: L-BFGS history buffer. v: Vector to multiply by the inverse Hessian approximation.
- Returns:
H @ v = B^{-1} @ v, the inverse Hessian-vector product.
- slsqp_jax.lbfgs_append(history, s, y, damping_threshold=0.2, skip_threshold=1e-08, diag_floor=0.0001, diag_ceil=1000000.0)[source]¶
Append a new (s, y) pair to the L-BFGS history with Powell damping.
Uses VARCHEN-style damping toward B0 (Lotfi et al., 2020, eq 7-8): damping is computed against the diagonal initial Hessian
B_0 = diag(diagonal)instead of the full L-BFGS approximation B. This is O(n) instead of O(k^2 n), always well-conditioned (the diagonal is clipped to[diag_floor, diag_ceil]), and avoids the circular dependency where a badly-conditioned B poisons its own damping.- Return type:
- Parameters:
history (LBFGSHistory)
s (Float[Array, 'n'])
y (Float[Array, 'n'])
damping_threshold (float)
skip_threshold (float)
diag_floor (float)
diag_ceil (float)
- The damped gradient difference is:
y_damped = theta * y + (1 - theta) * B0 s
- where theta in [0, 1] is chosen to satisfy:
s^T y_damped >= threshold * s^T B0 s
If
||s||is too small or the curvature ratio is too extreme, the update is skipped entirely to avoid numerical issues. Thecurvature_ratiobracket is[1e-8, 1e8](more permissive than the previous[1e-6, 1e6]) so we no longer drop noisy-but- informative pairs on high-dimensional problems; these pairs instead flow through Powell damping. Therelative_curvature < 1e-8skip is now gated onsᵀy > 0(positive curvature) so negative- curvature pairs are handled by damping rather than dropped.After appending, the scalar
gammais updated toy_damped^T y_damped / (y_damped^T s)and clipped to[diag_floor, diag_ceil]. The per-variable diagonal is updated using a component-wise secant:d[i] = |y_damped[i] * s[i]| / (s[i]^2 + eps)and clipped to the same bracket. This givesd[i] ≈ |H_{ii}|for diagonal Hessians regardless of step direction, unlike the classical Shanno-Phua formulad[i] = y_i^2 / (y^T s)which producesd ∝ h_i^2for multi-component steps.The old
1e-6floor onclip_loallowed the inverse-Hessian diagonalH_0 = 1/dto balloon to1e6, producing very large CG directions that the line search could not backtrack. Raising the floor to1e-4keeps the inverse diagonal bounded by1e4and dramatically reduces the “non-descent direction + chronic line- search backtracking” failure mode.- Args:
history: Current L-BFGS history. s: Step vector s = x_{k+1} - x_k. y: Gradient difference y = nabla L_{k+1} - nabla L_k. damping_threshold: Powell damping threshold (default 0.2). skip_threshold: Minimum step norm for update (default 1e-8). diag_floor: Minimum per-variable diagonal entry (default 1e-4). diag_ceil: Maximum per-variable diagonal entry (default 1e6).
- Returns:
Updated L-BFGS history with the new pair appended.
- slsqp_jax.lbfgs_compute_diagonal(history)[source]¶
Extract diag(B_k) from the L-BFGS compact representation.
From the compact form
B = B_0 - W M^{-1} W^T, the diagonal isdiag(B_k) = diagonal - diag(W M^{-1} W^T)
where
W = [B_0 S, Y]is(n, 2k)andMis the2k x 2kinner matrix. The correction term is computed in O(k^2 n) by formingQ = W M^{-1}and summing(Q * W)row-wise.This is used by
lbfgs_reset()to implement the SNOPT diagonal reset strategy (Gill, Murray & Saunders, 2005, Section 3.3).- Return type:
Float[Array, 'n']- Parameters:
history (LBFGSHistory)
- slsqp_jax.lbfgs_estimate_condition(history, damping_threshold=0.2)[source]¶
Estimate eigenvalue bounds of the inverse Hessian approximation H.
Uses the diagonal
B_0 = diag(d)as a proxy for the full L-BFGS condition number. The inverse Hessian eigenvalues are bounded by[1/max(d), 1/min(d)], givingkappa(H) = max(d) / min(d).This is a practical simplification of the VARCHEN Theorem 2 bounds (Lotfi et al., 2020). The full recursive bounds are theoretically tight but overly pessimistic in practice because they propagate worst-case Lipschitz estimates, leading to false soft-reset triggers on moderately conditioned problems. Since we damp toward B0 (not the full B), the diagonal condition number is the most relevant quantity.
- Return type:
tuple[Float[Array, ''],Float[Array, '']]- Parameters:
history (LBFGSHistory)
damping_threshold (float)
- Returns:
(lambda_min_est, lambda_max_est) bounding the eigenvalues of H.
- slsqp_jax.lbfgs_reset(history, diag_floor=0.0001, diag_ceil=1000000.0)[source]¶
SNOPT-style diagonal reset of the L-BFGS history.
Extracts
diag(B_k)from the current approximation, discards all stored(s, y)pairs, and restarts withB_0 = diag(diag(B_k)). This preserves per-variable curvature information across the reset, preventing the “everything is flat” effect that occurs when the scalargammabecomes very small.The extracted diagonal is clipped to
[diag_floor, diag_ceil]to keep the reset point aligned with the clipping used insidelbfgs_append(); previous versions used a hardcoded[1e-2, 1e6]that diverged from the append-path floor.Based on the SNOPT limited-memory reset strategy (Gill, Murray & Saunders, SIAM Review, 47(1), 2005, Section 3.3).
- Return type:
- Parameters:
history (LBFGSHistory)
diag_floor (float)
diag_ceil (float)
- slsqp_jax.lbfgs_should_skip(s, y, skip_threshold=1e-08)[source]¶
Return the scalar
should_skipdecision thatlbfgs_appenduses.Exposed so callers can observe the skip decision without having to diff fields on the returned history (which becomes meaningless once the circular buffer saturates, since
countstays atmemoryfrom then on regardless of whether a real append happened).- Return type:
Bool[Array, '']- Parameters:
s (Float[Array, 'n'])
y (Float[Array, 'n'])
skip_threshold (float)
- slsqp_jax.lbfgs_curvature_diagnostics(s, y, skip_threshold=1e-08)[source]¶
Return
(s.y, relative_curvature, skipped)for the curvature pair.Exposes the same intermediate quantities
lbfgs_should_skipuses so the verbose callback can surface them without re-implementing the skip predicate.relative_curvature = |s.y| / (||s|| * ||y||)is the cosine of the angle betweensandyand the dominant contributor to the skip decision (a near-zero value indicates the curvature pair is at the floating-point noise floor and gets skipped by therelative_curvature < skip_thresholdbranch oflbfgs_should_skip).- Return type:
tuple[Float[Array, ''],Float[Array, ''],Bool[Array, '']]- Parameters:
s (Float[Array, 'n'])
y (Float[Array, 'n'])
skip_threshold (float)
- Args:
s: Step vector
x_{k+1} - x_k. y: Gradient difference∇L_{k+1} - ∇L_k(Lagrangian gradientdifference for SLSQP).
skip_threshold: Threshold passed to
lbfgs_should_skip.- Returns:
(sty, relative_curvature, skipped), wheresty = s.yis the raw curvature inner product,relative_curvatureis the normalised version used in the skip predicate, andskippedis the same booleanlbfgs_should_skipreturns.
- slsqp_jax.lbfgs_soft_reset(history)[source]¶
VARCHEN-style soft reset: keep only the most recent (s, y) pair.
When the estimated condition number of the inverse Hessian exceeds a threshold, this drops all but the newest curvature pair. This is less aggressive than
lbfgs_reset()(which extracts the diagonal and drops everything) orlbfgs_identity_reset()(which restoresB = I), preserving the most relevant curvature information.Based on VARCHEN Algorithm 1, Step 7 (Lotfi et al., 2020).
- Return type:
- Parameters:
history (LBFGSHistory)
- slsqp_jax.lbfgs_identity_reset(history)[source]¶
Hard reset of L-BFGS history to the identity Hessian.
Unlike
lbfgs_reset()which preserves per-variable curvature, this discards everything and restarts withB_0 = I. Used as an escalation when repeated SNOPT-style diagonal resets fail to break an ill-conditioning cycle (e.g. consecutive QP failures where the extracted diagonal perpetuates the same problematic scaling).- Return type:
- Parameters:
history (LBFGSHistory)
- slsqp_jax.estimate_hessian_diagonal(hvp_fn, n, key, n_probes=20)[source]¶
Estimate the diagonal of a matrix given only its HVP.
Uses the Bekas-Kokiopoulou-Saad (2007) stochastic estimator: for Rademacher random vectors z (entries i.i.d. from {-1, +1}), E[z * (H z)] = diag(H). Averaging over
n_probessamples gives an unbiased estimate with variance O(||off-diag||^2 / k).The estimate is accurate when the matrix is diagonally dominant, which is common for Hessians of smooth functions in the natural coordinate basis.
- Return type:
Float[Array, 'n']- Parameters:
- Args:
hvp_fn: Function v -> H @ v (Hessian-vector product). n: Dimension of the matrix. key: JAX PRNG key for generating random probes. n_probes: Number of Rademacher probes (default 20). More
probes reduce variance at the cost of more HVP calls. Each probe costs one forward-over-reverse AD pass.
- Returns:
Estimated diagonal of H, shape (n,).
- class slsqp_jax.LineSearchResult[source]¶
Bases:
NamedTupleResult from the line search.
- Attributes:
alpha: The step size found. f_val: Function value at new point. eq_val: Equality constraint values at new point. ineq_val: Inequality constraint values at new point. success: Whether the line search succeeded. n_evals: Number of function evaluations.
- alpha: Float[Array, '']¶
Alias for field number 0
- f_val: Float[Array, '']¶
Alias for field number 1
- eq_val: Float[Array, 'm_eq']¶
Alias for field number 2
- ineq_val: Float[Array, 'm_ineq']¶
Alias for field number 3
- success: Bool[Array, '']¶
Alias for field number 4
- n_evals: Int[Array, '']¶
Alias for field number 5
- slsqp_jax.compute_merit(f_val, eq_val, ineq_val, penalty)[source]¶
Compute the L1-exact penalty merit function value.
- Return type:
Float[Array, '']- Parameters:
f_val (Float[Array, ''])
eq_val (Float[Array, 'm_eq'])
ineq_val (Float[Array, 'm_ineq'])
penalty (Float[Array, ''])
- The merit function is:
φ(x; ρ) = f(x) + ρ * (‖c_eq(x)‖_1 + ‖max(0, -c_ineq(x))‖_1)
- Args:
f_val: Objective function value f(x). eq_val: Equality constraint values c_eq(x). ineq_val: Inequality constraint values c_ineq(x). penalty: Penalty parameter ρ.
- Returns:
Merit function value φ(x; ρ).
- slsqp_jax.backtracking_line_search(fn, eq_constraint_fn, ineq_constraint_fn, x, direction, args, f_val, eq_val, ineq_val, penalty, grad, c1=0.0001, rho=0.5, max_iter=20, alpha_init=1.0, bounds=None, lower_bound_mask=None, upper_bound_mask=None, eq_jac=None, ineq_jac=None)[source]¶
Perform backtracking line search with the L1 merit function.
- Return type:
- Parameters:
fn (Callable)
eq_constraint_fn (Callable | None)
ineq_constraint_fn (Callable | None)
x (Float[Array, 'n'])
direction (Float[Array, 'n'])
args (Any)
f_val (Float[Array, ''])
eq_val (Float[Array, 'm_eq'])
ineq_val (Float[Array, 'm_ineq'])
penalty (Float[Array, ''])
grad (Float[Array, 'n'])
c1 (float)
rho (float)
max_iter (int)
alpha_init (float)
bounds (Float[Array, 'n 2'] | None)
eq_jac (Float[Array, 'm_eq n'] | None)
ineq_jac (Float[Array, 'm_ineq_general n'] | None)
- Finds α such that the Armijo condition is satisfied:
φ(x + α*d; ρ) ≤ φ(x; ρ) + c1 * α * φ’(x; d, ρ)
where φ is the L1 merit function and φ’ is the proper directional derivative including constraint Jacobian terms:
- φ’(x; d, ρ) = ∇f·d + ρ Σ sign(c_eq_i)(J_eq d)_i
ρ Σ_{j: c_ineq_j<0} (J_ineq d)_j
When constraint Jacobians are not provided, falls back to the simpler
∇f·dapproximation.- Args:
fn: Objective function fn(x, args) -> (f_val, aux). eq_constraint_fn: Equality constraint function or None. ineq_constraint_fn: Inequality constraint function or None. x: Current point. direction: Search direction. args: Arguments to pass to functions. f_val: Current objective value. eq_val: Current equality constraint values. ineq_val: Current inequality constraint values (including bounds). penalty: Penalty parameter. grad: Gradient of objective at x. c1: Armijo condition parameter (default 1e-4). rho: Step reduction factor (default 0.5). max_iter: Maximum number of iterations. alpha_init: Initial step size (default 1.0). bounds: Optional box constraints, shape (n, 2) with [lower, upper] per variable. lower_bound_mask: Tuple of bools indicating which lower bounds are finite. upper_bound_mask: Tuple of bools indicating which upper bounds are finite. eq_jac: Equality constraint Jacobian at x (m_eq, n), or None. ineq_jac: General inequality constraint Jacobian at x
(m_ineq_general, n), or None. Does NOT include bound constraint rows.
- Returns:
LineSearchResult with the found step size and function values.
- slsqp_jax.update_penalty_parameter(current_penalty, multipliers_eq, multipliers_ineq, margin=1.1)[source]¶
Update the penalty parameter based on Lagrange multipliers.
The penalty should be larger than the maximum absolute multiplier to ensure the merit function provides a descent direction.
ρ >= max(abs(λ_i), abs(μ_j)) + margin- Return type:
Float[Array, '']- Parameters:
current_penalty (Float[Array, ''])
multipliers_eq (Float[Array, 'm_eq'])
multipliers_ineq (Float[Array, 'm_ineq'])
margin (float)
- Args:
current_penalty: Current penalty parameter. multipliers_eq: Lagrange multipliers for equality constraints. multipliers_ineq: Lagrange multipliers for inequality constraints. margin: Safety margin factor (default 1.1).
- Returns:
Updated penalty parameter.
- slsqp_jax.compute_lagrangian_gradient(grad_f, eq_jac, ineq_jac, multipliers_eq, multipliers_ineq)[source]¶
Compute the gradient of the Lagrangian function.
- Return type:
Float[Array, 'n']- Parameters:
grad_f (Float[Array, 'n'])
eq_jac (Float[Array, 'm_eq n'])
ineq_jac (Float[Array, 'm_ineq n'])
multipliers_eq (Float[Array, 'm_eq'])
multipliers_ineq (Float[Array, 'm_ineq'])
- The Lagrangian is:
L(x, lambda, mu) = f(x) - lambda^T c_eq(x) - mu^T c_ineq(x)
- Its gradient with respect to x is:
nabla_x L = nabla f(x) - J_eq^T lambda - J_ineq^T mu
ineq_jacandmultipliers_ineqmust include the bound block when bounds are present (the ineq layout used bysolver.pyis[general; lower_bound; upper_bound]). Internally this is built on top ofcompute_partial_lagrangian_gradient()so the bound-multiplier recovery in the outer solver shares the same code path for the non-bound contribution.- Args:
grad_f: Gradient of objective function nabla f(x). eq_jac: Jacobian of equality constraints (m_eq x n). ineq_jac: Jacobian of inequality constraints (m_ineq x n),
including bound rows.
multipliers_eq: Lagrange multipliers for equality constraints. multipliers_ineq: Lagrange multipliers for inequality
constraints, including bound multipliers.
- Returns:
Gradient of Lagrangian nabla_x L.
- slsqp_jax.compute_partial_lagrangian_gradient(grad_f, eq_jac, multipliers_eq, gen_jac, multipliers_gen)[source]¶
Compute the partial Lagrangian gradient (without the bound block).
Returns
nabla f(x) - J_eq(x)^T lambda - J_gen(x)^T mu_gen, whereJ_genis the Jacobian of the general (nonlinear) inequality constraints — i.e. the inequality block excluding the constant identity-style rows that come from box bounds.This is the core helper used by both
compute_lagrangian_gradient(which adds the bound contribution on top) and the post-line-search NLP-level bound-multiplier recovery insolver.py: by reading off the partial Lagrangian gradient at indices that are at a bound atx_{k+1}, the recovery picks the bound multiplier that exactly zeros the corresponding component of the full Lagrangian gradient.- Return type:
Float[Array, 'n']- Parameters:
grad_f (Float[Array, 'n'])
eq_jac (Float[Array, 'm_eq n'])
multipliers_eq (Float[Array, 'm_eq'])
gen_jac (Float[Array, 'm_gen n'])
multipliers_gen (Float[Array, 'm_gen'])
- Args:
grad_f: Gradient of the objective function
nabla f(x). eq_jac: Jacobian of equality constraints (m_eq x n). multipliers_eq: Lagrange multipliers for equality constraints. gen_jac: Jacobian of general inequality constraints(m_gen x n). Must exclude bound rows.
- multipliers_gen: Lagrange multipliers for general inequality
constraints.
- Returns:
Partial Lagrangian gradient with no bound contribution.
- slsqp_jax.compute_rho_bar(c_ineq, c_eq, grad, A_ineq, A_eq, lambda_ineq, mu_eq)[source]¶
Compute the LPEC-A proximity measure rho_bar.
This implements Eq. 36 of Oberlin & Wright (2005), adapted to the
c_ineq >= 0(feasible) sign convention.For feasible inequality constraints (
c_ineq_i > 0), the contribution issqrt(c_ineq_i * lambda_ineq_i); for violated constraints (c_ineq_i <= 0), it is-c_ineq_i(the violation magnitude).- Return type:
Float[Array, '']- Parameters:
c_ineq (Float[Array, 'm_ineq'])
c_eq (Float[Array, 'm_eq'])
grad (Float[Array, 'n'])
A_ineq (Float[Array, 'm_ineq n'])
A_eq (Float[Array, 'm_eq n'])
lambda_ineq (Float[Array, 'm_ineq'])
mu_eq (Float[Array, 'm_eq'])
- Args:
c_ineq: Inequality constraint values at current point. c_eq: Equality constraint values at current point. grad: Objective gradient at current point. A_ineq: Inequality constraint Jacobian. A_eq: Equality constraint Jacobian. lambda_ineq: Multiplier estimates for inequality constraints. mu_eq: Multiplier estimates for equality constraints.
- Returns:
The scalar proximity measure rho_bar >= 0.
- slsqp_jax.identify_active_set_lpeca(c_ineq, c_eq, grad, A_ineq, A_eq, lambda_ineq, mu_eq, sigma=0.9, beta=None, trust_threshold=1.0)[source]¶
Predict the active inequality set using the LPEC-A threshold test.
Applies Eq. 43 of Oberlin & Wright (2005), adapted to our sign convention. An inequality constraint
iis predicted active whenc_ineq_i <= (beta * rho_bar) ** sigmaandrho_baris below the trust threshold (otherwise the prediction is empty).The result is wrapped in an
LPECAResultso the caller can distinguish “no constraints predicted active” (a valid prediction) from “LPEC-A bypassed” (valid=False) or “size cap fired” (capped=True).- Return type:
- Parameters:
- Args:
c_ineq: Inequality constraint values at current point. c_eq: Equality constraint values at current point. grad: Objective gradient at current point. A_ineq: Inequality constraint Jacobian. A_eq: Equality constraint Jacobian. lambda_ineq: Multiplier estimates for inequality constraints. mu_eq: Multiplier estimates for equality constraints. sigma: Threshold exponent (
sigma_barin the paper).Must be in (0, 1). Default 0.9 per paper recommendation.
- beta: Threshold scaling factor. Default
Noneuses the paper’s recommendation
1 / (m_ineq + n + m_eq).- trust_threshold: Maximum
rho_barfor which the LPEC-A prediction is trusted. When
rho_bar > trust_threshold, the predicted set is empty (the iterate is considered too far from the solution for the asymptotic guarantees to apply). Default1.0.
- beta: Threshold scaling factor. Default
- Returns:
LPECAResultcontaining the boolean prediction mask and the diagnostic flagsvalid/capped/rho_bar.
- slsqp_jax.compute_lpeca_active_set(c_ineq, c_eq, grad, A_ineq, A_eq, lambda_ineq, mu_eq, sigma=0.9, beta=None, trust_threshold=1.0, use_lp=False, lp_lambda_bound=1000000.0, lp_eps=1e-06, lp_max_iter=1000)[source]¶
Compute the LPEC-A predicted active set, optionally refining multipliers.
This is the main entry point for LPEC-A active set identification. It optionally solves the LPEC-A LP to obtain tighter multiplier estimates, then applies the threshold test (with the trust gate and rank-aware size cap from
identify_active_set_lpeca()).- Return type:
- Parameters:
c_ineq (Float[Array, 'm_ineq'])
c_eq (Float[Array, 'm_eq'])
grad (Float[Array, 'n'])
A_ineq (Float[Array, 'm_ineq n'])
A_eq (Float[Array, 'm_eq n'])
lambda_ineq (Float[Array, 'm_ineq'])
mu_eq (Float[Array, 'm_eq'])
sigma (float)
beta (float | None)
trust_threshold (float)
use_lp (bool)
lp_lambda_bound (float)
lp_eps (float)
lp_max_iter (int)
- Args:
c_ineq: Inequality constraint values at current point. c_eq: Equality constraint values at current point. grad: Objective gradient at current point. A_ineq: Inequality constraint Jacobian. A_eq: Equality constraint Jacobian. lambda_ineq: Current multiplier estimates for inequalities. mu_eq: Current multiplier estimates for equalities. sigma: Threshold exponent (default 0.9). beta: Threshold scaling factor (default: paper recommendation). trust_threshold: Maximum
rho_barfor which the predictionis trusted (see
identify_active_set_lpeca()).- use_lp: If True, solve the LPEC-A LP to refine multiplier
estimates before the threshold test. Requires
mpax.
lp_lambda_bound: Upper bound K_1 on lambda in the LP. lp_eps: Tolerance for the LP solver. lp_max_iter: Maximum LP solver iterations.
- Returns:
LPECAResultwith the predicted active mask and diagnostic flags.
- class slsqp_jax.LPECAResult[source]¶
Bases:
NamedTupleLPEC-A active-set identification result.
- Attributes:
- predicted: Boolean mask of shape
(m_ineq,).Truemeans the constraint is predicted active. When
valid=Falsethis is the all-Falsemask (the trust gate fired).- valid:
Truewhenrho_baris below the trust threshold, so the prediction is theoretically meaningful.
Falseindicates the iterate is too far from the solution for LPEC-A to be reliable.- capped:
Truewhen the rank-aware size cap truncated the prediction (the raw threshold predicted more than
n - m_eq - 1constraints active, so the most confident entries were kept).
rho_bar: The scalar proximity measure used for the trust gate.
- predicted: Boolean mask of shape
- predicted: Bool[Array, 'm_ineq']¶
Alias for field number 0
- valid: Bool[Array, '']¶
Alias for field number 1
- capped: Bool[Array, '']¶
Alias for field number 2
- rho_bar: Float[Array, '']¶
Alias for field number 3