GPU Solver Jacobian: Autodiff vs Analytical¶

Historical document

This documents a solved investigation. The files referenced below (dc_gpu.py, transient_gpu.py, benchmarks/c6288.py) no longer exist. DC analysis is now in analysis/dc.py, transient in analysis/transient/, and all device models use OpenVAF analytical Jacobians.

Status: SOLVED¶

The GPU solver has been fully migrated to analytical Jacobians, eliminating the convergence issues. This document describes the original problem and solution for reference.

The Original Problem¶

The GPU solver originally used JAX autodiff (via sparsejac) to compute Jacobians. This caused convergence issues for circuits with floating nodes (like series NMOS stacks in AND gates).

Root Cause¶

When a MOSFET is in cutoff or deep saturation, the autodiff-computed dIds/dVds (output conductance) can be extremely small: - GPU autodiff: gds ≈ 1e-16 S in cutoff - VACASK analytical: gds = 1e-9 S minimum (enforced)

This creates nearly-singular Jacobians that cause Newton-Raphson to diverge.

Comparison at AND Gate `int` Node (Floating)¶

Property	GPU (autodiff)	VACASK (analytical)
gds in cutoff	~1e-16 S	1e-9 S (min enforced)
Convergence	Fails (oscillates)	Succeeds (34 iters)
Final `int` voltage	Diverges to ±2.4V	Settles to 0.6V

How openvaf_jax Works¶

The openvaf_jax.py module translates Verilog-A models (via OpenVAF MIR) to JAX functions that return both residual and analytical Jacobian:

# Generated function signature
def device_eval(inputs: List[float]) -> Tuple[Dict, Dict]:
    """
    Returns:
        residuals: Dict[node, {'resist': float, 'react': float}]
        jacobian: Dict[(row, col), {'resist': float, 'react': float}]
    """

Key Code Location¶

In openvaf_jax.py lines 262-277:

# Build output expressions
lines.append("    residuals = {")
for node, res in self.dae_data['residuals'].items():
    resist_val = res['resist'] if res['resist'] in defined_vars else '0.0'
    react_val = res['react'] if res['react'] in defined_vars else '0.0'
    lines.append(f"        '{node}': {{'resist': {resist_val}, 'react': {react_val}}},")

lines.append("    jacobian = {")
for entry in self.dae_data['jacobian']:
    key = f"('{entry['row']}', '{entry['col']}')"
    resist_val = entry['resist'] if entry['resist'] in defined_vars else '0.0'
    react_val = entry['react'] if entry['resist'] in defined_vars else '0.0'
    lines.append(f"        {key}: {{'resist': {resist_val}, 'react': {react_val}}},")

The Jacobian entries come from dae_data['jacobian'] which is extracted from the OpenVAF MIR - these are analytical derivatives computed by OpenVAF's symbolic differentiation of the Verilog-A model.

Current State¶

VACASK Path (works)¶

c6288.sim → parser → psp103n/psp103p model
                          ↓
              create_simple_mosfet_eval() ← Uses explicit gm, gds stamps
                          ↓
              build_sparse_jacobian_and_residual() ← Stamps conductances into J

GPU Path (broken)¶

c6288.sim → parser → psp103n/psp103p model
                          ↓
              create_simple_mosfet_eval() ← Same simplified model
                          ↓
              build_circuit_residual_fn() ← Residual only, no J
                          ↓
              sparsejac.jacrev(residual_fn) ← Autodiff gives wrong gds

Solution Options¶

Option 1: Add gds_min to GPU Model (Partial Fix)¶

Add minimum conductance leakage: Ids += gds_min * Vds

Status: Implemented but insufficient - gds_min only helps in cutoff, not saturation.

Option 2: Use Analytical Jacobian from openvaf_jax (Preferred)¶

Create PSP103 device via openvaf_jax and use its analytical Jacobian:

from openvaf_jax import OpenVAFToJAX

# Compile PSP103 model to JAX
translator = OpenVAFToJAX.from_file("psp103v4.va")
psp103_eval = translator.translate()

# In GPU residual function:
def residual_fn(V):
    for device in mosfets:
        inputs = build_inputs(V, device)
        residuals, jacobian = psp103_eval(inputs)  # Both from same call!
        # Stamp residuals into f
        # Jacobian available for building J analytically

This requires restructuring the GPU solver to: 1. Build residual AND Jacobian together (not separately via autodiff) 2. Use JAX's sparse matrix operations with explicit stamp values

Option 3: Hybrid Approach¶

Use autodiff for most of the circuit but override MOSFET gds with minimum value:

# After autodiff Jacobian computation
J_diag = jnp.diag(J)
J_diag = jnp.where(jnp.abs(J_diag) < gds_min, jnp.sign(J_diag) * gds_min, J_diag)

Files Involved¶

vajax/analysis/dc.py - DC operating point solver
vajax/analysis/transient/ - Transient analysis (scan/loop strategies)
vajax/analysis/solver.py - Newton-Raphson with lax.while_loop
openvaf_jax/ - OpenVAF→JAX translator
vajax/devices/verilog_a.py - VerilogADevice wrapper

Test Case¶

The and_test circuit in c6288.sim is a good minimal test: - 6 MOSFETs (NAND + inverter) - Floating int node between series NMOS mn1/mn2 - VACASK converges in 34 iterations - GPU solver diverges

Verification: PSP103 via openvaf_jax Works¶

Confirmed that PSP103 can be compiled and evaluated via openvaf_jax:

import openvaf_py
import openvaf_jax

va_path = 'vendor/OpenVAF/integration_tests/PSP103/psp103.va'
modules = openvaf_py.compile_va(va_path)
module = modules[0]

translator = openvaf_jax.OpenVAFToJAX(module)
eval_fn = translator.translate()

# Evaluate - returns BOTH residual and analytical Jacobian
residuals, jacobian = eval_fn(inputs)

Results: - Module: PSP103VA - Nodes: 13 (internal nodes for the compact model) - Parameters: 2616 - Jacobian entries: 56 (analytical, not autodiff!)

The 56 Jacobian entries are computed symbolically by OpenVAF from the Verilog-A model equations - these are the well-conditioned conductance stamps that SPICE simulators rely on.

How OpenVAF Handles Hidden States (Cache Slots)¶

OpenVAF's compilation pipeline distinguishes between: 1. Model/Instance Parameters: User-specified values (like VTH0, KP, W, L) 2. Voltages: Circuit node voltages from the solver 3. Hidden States (Cache Slots): Intermediate values computed by the device model

The Init/Eval Split in OSDI¶

OSDI models have two functions: - Init function: Runs once per operating point, computes parameter-dependent values - Eval function: Runs every Newton iteration, uses init outputs + voltages

The init function computes "cached" values that don't depend on voltages:

// Example from a resistor model
// Init computes temperature-adjusted resistance:
res = R * pow(temperature / tnom, zeta);  // Cached, doesn't change during NR

These cached values are stored in "cache slots" and passed to the eval function.

OpenVAF MIR Structure¶

In the MIR (Mid-level IR), cached values appear as:

// From openvaf/sim_back/src/init.rs
pub struct CacheSlot(u32);

// The init function outputs to cache slots
// The eval function reads from cache slots

The mapping between init outputs and eval inputs is tracked in cache_mapping:

# From openvaf_py
cache_mapping = [
    {'init_value': 'init_v30', 'eval_param': 'v86'},  # sign
    {'init_value': 'init_v31', 'eval_param': 'v37'},  # beta
]

How openvaf_jax Handles This¶

The JAX translator currently requires users to provide ALL parameters as inputs, including those that should be computed by init:

# What the user provides:
inputs = [
    0,        # PMOS flag
    1.0,      # sign (ideally computed from PMOS)
    0.0,      # V(s)
    1.2,      # V(g)
    1.2,      # Vgs (should be computed: V(g) - V(s))
    1.2,      # V(d)
    1.2,      # Vds (should be computed: V(d) - V(s))
    # ... etc
]

The eval function DOES compute voltage-derived values (Vgs, Vds) internally from the terminal voltages. But it expects the init-computed values (like beta) to be provided.

Key Insight: Voltage-Derived Values¶

Looking at the MOSFET model:

// These are computed EVERY iteration from voltages
Vgs = V(g) - V(s);
Vds = V(d) - V(s);
Vov = Vgs - VTH0;

// These are computed ONCE in init
beta = KP * W / L;  // Doesn't depend on voltages

The JAX-generated code correctly computes Vgs, Vds, Vov from the input voltages on every call. The issue is that values like beta need to be computed once and passed in.

Solution for GPU Solver Integration¶

To use openvaf_jax models in the GPU solver:

Run init once at start of Newton iteration to compute cache slots
Pass cache values as part of the inputs array
Eval function computes residuals and Jacobians from voltages + cache

This matches how VACASK works: it runs init once, then calls eval repeatedly with the cached values until convergence.

Solution Implemented¶

The GPU solvers now use analytical Jacobians computed via the Shichman-Hodges MOSFET model:

dc_gpu.py: build_analytical_residual_and_jacobian_fn() computes both residual and Jacobian in one pass
transient_gpu.py: build_transient_residual_and_jacobian_fn() adds capacitor handling for transient analysis

Key Implementation Details¶

# Level-1 Shichman-Hodges MOSFET with explicit gm/gds computation
vth0 = 0.4
kp = 200e-6
lambda_ = 0.01
gds_min = 1e-9  # Minimum conductance for cutoff

# Analytical Jacobian stamps (Y-matrix)
stamps = {
    ('D', 'D'): gds,
    ('D', 'G'): gm,
    ('D', 'S'): -gds - gm,
    ('S', 'D'): -gds,
    ('S', 'G'): -gm,
    ('S', 'S'): gds + gm,
}

Results¶

Circuit	Before (autodiff)	After (analytical)
Inverter	6 iterations	5 iterations
AND gate	FAILED	78 iterations
NOR gate	~15 iterations	~10 iterations

The sparsejac dependency has been removed from the GPU solver code path.