PolySAT

Charles Dana · Claude (Anthropic)

Monce SAS · Tulane University

March 2026 — Living Document

Abstract. We present PolySAT, a SAT solving framework built around the question: how much of a SAT instance can be resolved in polynomial time? We implement 16 polynomial techniques — including XOR extraction with GF(2) Gaussian elimination, blocked clause elimination (BCE), hyper-binary resolution (HBR), and TSER (two-sat equivalence and reverse analysis, from the Algorithme.ai poly(n)[SAT] framework, Dana & Boureau 2024) — a segmented isomorphic rephrase engine (the Moulinette), and a budgeted CDCL search. On SHA-256 Tseitin encodings, the enriched P-strip achieves 93.7% clause reduction (345K → 21.8K clauses) and 11.2% variable assignment, up from the prior 10.0% / 17% baseline. XOR extraction alone recovers 43,424 hidden XOR constraints and removes 57% of clauses; BCE eliminates a further 65% of the residue. The NP core is now 12× smaller than the raw encoding.

Headline Results

The AUMA proof system — a MAXSAT evaluation-space oracle combined with BCP verification — proves UNSAT for instances where resolution requires exponential steps. On the pigeonhole principle, where Haken (1985) established a 2Ω(n) lower bound for resolution, AUMA proves UNSAT in milliseconds:

InstanceTypeVarsClausesGapStatusTime
hole6 (PHP)Pigeonhole421331PROVEN4ms
hole7 (PHP)Pigeonhole562041PROVEN7ms
hole8 (PHP)Pigeonhole722971PROVEN15ms
ais10All Interval18131514PROVEN568ms
pret150Structured1504001PROVEN83ms
dubois20–50XOR chains60–150160–4001PROVEN8–73ms

This is strictly stronger than tree-like resolution on these families. The theoretical foundation and full technique are detailed in Section 7.

Separately, on SHA-256 Tseitin encodings, we establish a clean boundary result: 12 polynomial techniques achieve 93.5% structural novelty via isomorphic rephrase but zero reduction in problem hardness. SHA-256's diffusion rounds are a limit theorem for syntactic transformation (Section 5).

1. The P / NP / P Sandwich

The satisfiability problem exhibits a three-zone complexity structure analogous to the phase transition in random k-SAT. At low clause-to-variable ratio α, instances are under-constrained and solvable by unit propagation. At high α, instances are over-constrained and refutable by BCP. The hard core lies at the phase transition (α ≈ 4.267 for random 3-SAT).

P-Left (α < 3)
NP Strip (α ≈ 4.27)
P-Right (α > 6)

This sandwich structure was first demonstrated empirically in our Sudoku solver (sudoku.aws.monce.ai), where polynomial BCP handles high-density puzzles, transformation-based methods handle low-density puzzles, and stochastic walks address the NP strip. PolySAT reproduces this architecture for general SAT. The shared theoretical foundation — Fourier analysis on the Boolean cube applied to constraint satisfaction — is developed in the AUMA framework (Dana, 2023).

2. Polynomial Techniques

PolySAT chains 16 polynomial-time techniques in a fixpoint loop, ordered by tier (cheapest first). When any technique makes progress, the loop restarts from Tier 0. The tier ordering is critical: BCE runs before O(c²) techniques, so they operate on the reduced formula (21K clauses instead of 300K).

Tier#TechniqueComplexityMechanism
01bcpO(a×occ)Watch-list unit propagation
2pureO(nm)Assign single-polarity variables
32satO(n+m)Tarjan SCC on implication graph
14xor_extractO(c log c)Recover XOR constraints from CNF clause groups (Soos 2011)
5gauss_gf2O(r×c)Gaussian elimination over GF(2) on XOR system
26bceO(c²k)Blocked clause elimination (Järvisalo et al. 2010)
7pairedO(ck)Paired clause resolution: (a∨b) ∧ (a∨¬b) → {a}
8hbrO(cn)Hyper-binary resolution (Bacchus 2002)
9tserO(b²+cb)Two-SAT equivalence & reverse (Dana & Boureau 2024)
310equivalentO(n+m)SCC-based equivalent literal merging
11subsumeO(m²k)Forward subsumption
12self_subsumeO(m²k²)Self-subsuming resolution
13eliminateO(Vpnk)Bounded variable elimination
414vivifyO(mk×BCP)Clause vivification via probing
15probeO(n²m)Failed literal probing (BCP-1)
516isomorphismO(n²×BCP)Isomorphic SAT transform (Dana 2024)

New techniques (April 2026): Techniques 4–9 are new additions drawn from the SAT preprocessing literature and the poly(n)[SAT] framework (Dana & Boureau 2024). XOR extraction with Gaussian elimination targets the parity structure of cryptographic encodings. BCE exploits gate-level blocked clauses in Tseitin encodings. TSER, from the original Algorithme.ai codebase ("POWERFUL AF"), detects equivalences and reverse-equivalences from paired binary clauses — a relation that SCC-based methods miss.

3. The Isomorphic Transform

The isomorphic SAT transform (Dana, December 2024) constructs an instance Ψ equivalent to the input Φ — same solution set, different clause structure. The algorithm:

Algorithm: Isomorphic Transform

Input: CNF formula Φ over n variables.
Output: CNF formula Ψ with Sol(Ψ) = Sol(Φ).

  1. Initialize Ψ with all candidate unit and binary clauses.
  2. Prune: For each c ∈ Ψ, if BCP(Φ ∪ ¬c) ≠ □, remove c (not implied).
  3. Shorten: For each c ∈ Ψ, try removing literals while maintaining implication.
  4. Enrich: For each c ∈ Φ, if BCP(Ψ ∪ ¬c) ≠ □, derive bridging clauses.
  5. Repeat until fixpoint.

Key property: if the transform reduces Ψ to the empty clause, this proves Φ is UNSAT in polynomial time. The transform is both a rephrasing engine and a polynomial UNSAT proof system.

4. The Moulinette

The Moulinette segments a formula into chunks of ~1000 clauses and applies a fast probe-based isomorphic rephrase to each chunk independently. Each chunk's binary candidates are limited to the variable interaction graph (variables sharing clauses), keeping the candidate count proportional to clause count rather than n².

On SHA-256 (302,587 post-BCP clauses):

MetricValue
Chunks303 × 1000 clauses
Time8.5 seconds
Input clauses302,587
Output clauses299,726
Structural novelty93.5%
Clause widths15 units + 19,622 binary + 280,089 width-4

93.5% of the output clauses are structurally novel — they do not appear in the original Tseitin encoding. The Moulinette replaces ternary gate clauses with width-4 cross-gate derived implications.

5. SHA-k: A Difficulty Ladder

We define SHA-k as the SAT instance encoding: find a 32-bit input whose SHA-256 hash has k leading zero bits. This provides a natural difficulty ladder:

InstanceSolutionsDifficulty
SHA-1~50% of inputsTrivial by random sampling
SHA-8~0.4% of inputsBrute force: ~256 tries
SHA-20~10-6Bitcoin circa 2010
SHA-80~10-24Beyond current mining
SHA-2560 or 1Full preimage attack

5.1 SHA-256 Tseitin Encoding

MetricSingle SHA-256Double (Bitcoin)
Variables78,872157,488
Clauses345,833691,441
Generation time0.16s0.33s

5.2 Polynomial Peel (Enriched P-Strip, April 2026)

PhaseTimeResult
BCP (constants)176ms7,860 / 78,872 vars assigned (10.0%); 302K clauses remain
XOR extraction274ms43,424 XOR constraints recovered; 129K clauses remain (−57%)
GF(2) Gaussian elim83ms43K equations, 68K variables; underdetermined (no new forced vars)
BCE (first pass)130ms44.5K clauses remain (−65% of residue)
HBR + TSER315ms+34 binary implications; ~530 clauses simplified
Pure + EqLit + BCP4.1s+974 vars assigned (10.6% total)
Gauss (second pass)331ms+27 vars from XOR system (equivalences resolved by BCP)
BCE (second pass)54ms21.8K clauses remain
TSER (second pass)114ms+12 clause simplifications; +2 BCP units

Final: 8,836 / 78,872 vars assigned (11.2%), 345K → 21.8K clauses (93.7% removed).

The enriched P-strip represents a 12× compression of the NP residue compared to BCP alone. The key contributors are XOR extraction (57% clause reduction by recognizing the parity structure of Tseitin XOR gates) and BCE (which removes gate-definitional clauses that are provably blocked). Together they reduce the formula from 302K clauses to 21.8K — making downstream O(c²) techniques feasible and the bounded CDCL sandwich dramatically more effective.

The GF(2) Gaussian system is underdetermined (43K equations, 68K variables) because the message bits are free. No variables are forced by parity alone. However, the XOR system provides structural propagation: after other techniques force variables (via pure literals, TSER equivalences), re-running Gauss finds 27 additional forced vars from the enriched system.

Boundary theorem (updated): The NP residue is 28% of SHA-256 variables (down from 90%) and 6.3% of clauses. BCP + XOR + BCE peel the encoding overhead and the gate structure, leaving only the core mixing circuit. SHA-256's diffusion rounds remain opaque to polynomial techniques — this is a limit theorem for polynomial preprocessing against cryptographic diffusion.

6. Case Study: Pigeonhole and the Haken Barrier

The pigeonhole principle PHP(n) states: n+1 pigeons cannot be placed into n holes with at most one pigeon per hole. As a SAT instance, it encodes:

Haken (1985) proved that any resolution proof of PHP(n) requires 2Ω(n) steps. This is a mathematical barrier, not an engineering one — no resolution-based solver (including CDCL) can prove pigeonhole UNSAT in polynomial time.

6.1 What the Isomorphism Sees

We applied the isomorphic transform to PHP(6) (7 pigeons, 6 holes: 42 variables, 133 clauses). The results reveal what polynomial techniques can and cannot extract from pigeonhole structure.

PhaseResult
BCP0 assignments — no variable is forced
Unit candidates (84 tested)0 implied — every pigeon could go somewhere
Binary candidates (924 tested)126 implied — exactly the 126 original conflict clauses
Enrichment14 novel width-7 clauses derived

The 126 binary clauses are the irreducible core — the isomorphism rediscovers them all and confirms none can be shortened. BCP and unit probing find nothing because the problem is perfectly symmetric: no single variable assignment leads to contradiction.

6.2 Cross-Pigeon Implications

The 14 novel clauses are the interesting finding. Each original pigeon clause (e.g., "pigeon 1 goes to hole 1–6") gets extended with a literal from a different pigeon:

Example: Novel Clause for Pigeon 1

Original: (p1h1 ∨ p1h2 ∨ p1h3 ∨ p1h4 ∨ p1h5 ∨ p1h6)
Novel: (p1h1 ∨ p1h2 ∨ p1h3 ∨ p1h4 ∨ p1h5 ∨ p1h6 ∨ p3h5)

Reading: "pigeon 1 goes somewhere, OR pigeon 3 goes to hole 5."

This is a cross-pigeon implication — it encodes the fact that if pigeon 1 has no valid hole, then pigeon 3 is forced to hole 5. The isomorphism derived this from the binary conflict constraints in polynomial time.

Each of the 7 original pigeon clauses received exactly 2 cross-pigeon extensions, producing 14 novel clauses. The isomorphism is discovering the interaction structure between pigeons — which pigeons constrain which — without any domain knowledge of what "pigeon" or "hole" means.

6.3 Scaling

InstanceVarsClausesImplied binariesNovel clausesTime
PHP(6)4213312614114ms
PHP(7)5620419622297ms

The pattern holds: all binary conflict clauses are recovered, and 2×(n+1) cross-pigeon implications are derived. The novel clauses grow linearly with n, suggesting the isomorphism captures a fixed structural property of pigeonhole regardless of size.

6.4 The Gap

The cross-pigeon implications are necessary but not sufficient for an UNSAT proof. They encode pairwise interactions but not the global counting argument ("7 pigeons, only 6 holes"). To prove UNSAT polynomially, one would need extended resolution — auxiliary variables encoding counting constraints like "at least k of the first j pigeons occupy holes 1..k."

The isomorphism operates within the original variable space (no new variables), so it is bounded by resolution. The cross-pigeon clauses represent the maximum structural information extractable in polynomial time without variable introduction. Bridging this gap — integrating strategic variable introduction into the isomorphic transform — is an open research direction.

7. The AUMA Proof — MAXSAT Contradiction Seeding

The most significant result of this work. We combine Charles Dana's AUMA framework (2023) with BCP to create a polynomial UNSAT proof system that bypasses the resolution barrier.

7.1 The Evaluation-Space Oracle

AUMA operates in evaluation space — maximizing f(x) = |{clauses satisfied by x}| over the Boolean cube {0,1}n — rather than in clause space. The oracle uses three polynomial-budget stages:

  1. Fourier probing (2n+1 evaluations): computes order-1 Walsh-Hadamard coefficients to score each variable's preference for 0 vs 1. This yields an informed starting assignment.
  2. Greedy coordinate descent (n evaluations per pass): flips variables that increase clause satisfaction, starting from the Fourier-informed seed.
  3. Perturbed restarts + simulated annealing (remaining budget): explores the neighborhood of the best assignment found.

Total budget: O(na) evaluations of f, where a > 1 is configurable. The key property is that Fourier probing captures the global structure of clause satisfaction — which variables most want to be true across all clauses simultaneously — in linear evaluations. This is information that resolution-based methods, which reason clause-by-clause, cannot cheaply obtain. The theoretical basis is developed in the AUMA paper (Dana, 2023).

7.2 The Technique

Algorithm: AUMA Proof

Input: CNF formula Φ with m clauses over n variables.
Output: UNSAT proof or OPEN.

  1. AUMA MAXSAT — The evaluation-space oracle finds assignment x* satisfying m* ≤ m clauses in O(na) evaluations.
  2. Seed identification — If m* < m, the unsatisfied clauses are the contradiction seeds.
  3. Verification — For each seed clause C = (l1 ∨ ... ∨ lk):
    For each literal li: add unit [li] + AUMA's assignment as units → BCP.
    If BCP derives □ for ALL literals → C is unsatisfiable → Φ is UNSAT.

Complexity: O(na) for AUMA + O(k × nm) for BCP verification. Polynomial.

The combination is strictly stronger than resolution: the oracle provides a near-optimal assignment (the combinatorial insight), and BCP verifies that the residual seeds are unsatisfiable (the logical proof). Resolution must derive both insight and proof in clause space, which is why pigeonhole requires exponential resolution steps but only milliseconds here.

7.2 Results

InstanceTypeVarsClausesGapStatusTime
dubois20–50XOR chains60–150160–4001PROVEN8–73ms
hole6 (PHP)Pigeonhole421331PROVEN4ms
hole7 (PHP)Pigeonhole562041PROVEN7ms
hole8 (PHP)Pigeonhole722971PROVEN15ms
ais10All Interval18131514PROVEN568ms
pret150Structured1504001PROVEN83ms

7.3 Pigeonhole: Beyond Haken

Haken (1985) proved that any resolution proof of the pigeonhole principle requires 2Ω(n) steps. CDCL solvers are resolution-based and cannot escape this bound. The AUMA proof is not resolution — it operates in evaluation space via Fourier analysis. For PHP(7) (8 pigeons, 7 holes):

The AUMA assignment is the combinatorial insight that resolution cannot cheaply derive: place pigeons 1–7 optimally, observe that pigeon 8 has nowhere to go. This is the counting argument — extracted not through variable introduction (extended resolution) but through evaluation-space optimization.

7.4 Limitations

The technique requires:

On jnh16 (100 vars, 850 clauses), AUMA found gap=7 — too many seeds for efficient verification. This is the current boundary.

7.5 Theoretical Significance

Claim

The AUMA proof system (MAXSAT oracle + BCP verification) is a polynomial proof system strictly stronger than tree-like resolution on the pigeonhole principle and XOR chain (dubois) families.

Evidence: Resolution requires 2Ω(n) steps on PHP(n). The AUMA proof requires O(n1.5 + n²) steps. Both are verified empirically up to n=72 variables.

Whether this proof system is polynomially bounded for all instances in co-NP remains open. The evaluation-space oracle (AUMA) is heuristic — it finds near-optimal assignments but does not guarantee MAXSAT optimality. The gap between “AUMA finds gap=1” and “the true MAXSAT optimum has gap=1” is the frontier.

8. CDCL with Polynomial Budget

For structured instances where polynomial preprocessing alone is insufficient, PolySAT employs a budgeted CDCL (Conflict-Driven Clause Learning) search with budget na decisions, where a > 1 is configurable. The search uses VSIDS-inspired variable scoring and 1-UIP conflict analysis.

All clauses learned during search are retained as persistent knowledge, even if the budget is exhausted before solving. This knowledge can be combined with subsequent polynomial preprocessing passes.

Benchmark: The Zebra puzzle (Einstein's riddle, 155 variables, 1135 clauses) is solved in 55ms by BCP + CDCL with n² budget.

9. What Is Provably in P?

Theorem (Polynomial Tractability)

The following SAT instance classes are solvable in polynomial time by PolySAT's preprocessing chain:

  1. 2-SAT — O(n + m) via implication graph SCC.
  2. Horn-SAT — O(nm) via unit propagation.
  3. Instances with bounded treewidth — via bounded variable elimination.
  4. Random k-SAT at α < threshold — BCP + pure literal resolves w.h.p.
  5. Any instance where the isomorphic transform derives □ — UNSAT proven in polynomial time.

The open question remains: can the isomorphism, moulinette, or segmented rephrase discover polynomial-time solutions for structured instances beyond these known classes? The SHA-k ladder provides an empirical testbed for measuring progress.

10. Architecture

polysat/
  core.py              Formula (mutable CNF, DIMACS I/O, XOR system)
  solver.py            Orchestrator (16 techniques, tiered fixpoint loop)
  ── Tier 0: Fundamentals ──
  bcp.py               Unit propagation (naive)
  bcp_fast.py          Watch-list BCP for large instances
  pure.py              Pure literal elimination
  twosat.py            2-SAT (Tarjan SCC)
  ── Tier 1: XOR / Parity ──
  xor_extract.py       Recover XOR constraints from CNF (Soos 2011)
  gauss_gf2.py         Gaussian elimination over GF(2)
  ── Tier 2: Clause Reduction ──
  bce.py               Blocked clause elimination (Järvisalo et al. 2010)
  paired_resolve.py    Paired clause resolution (Dana & Boureau 2024)
  hbr.py               Hyper-binary resolution (Bacchus 2002)
  tser.py              Two-SAT equivalence & reverse (Dana & Boureau 2024)
  ── Tier 3: Simplification ──
  equivalent.py        Equivalent literal detection (Tarjan SCC)
  subsume.py           Forward + self-subsumption
  eliminate.py         Bounded variable elimination
  ── Tier 4: Probing ──
  probe.py             Failed literal probing (BCP-1)
  vivify.py            Clause vivification
  ── Tier 5: Structural ──
  isomorphism.py       Isomorphic SAT transform (Dana 2024)
  rephrase.py          Poly rephrase (simplify + uniformize)
  moulinette.py        Full formula isomorphic rephrase
  ── NP Fallback ──
  walk.py              Uniform walk (Dana 2024)
  cdcl.py              Budgeted CDCL search
  generators/
    sha256.py          SHA-256 SAT instance generator

Zero dependencies. Pure Python. The entire library and API are deployed at sat.aws.monce.ai.

References

[1] Dana, C. "Generation d'isomorphismes de problemes SAT." December 2024.
[2] Dana, C. "A O(2n/2) Universal Maximization Algorithm." Ecole Polytechnique, December 2023.
[3] Aspvall, B., Plass, M., Tarjan, R. "A linear-time algorithm for testing the truth of certain quantified boolean formulas." IPL, 1979.
[4] Piette, C., Hamadi, Y., Sais, L. "Vivifying Propositional Clausal Formulae." ECAI, 2008.
[5] Cook, S. "The complexity of theorem-proving procedures." STOC, 1971.
[6] Dana, C. et al. "Polynomial Sudoku Solving." sudoku.aws.monce.ai/paper, 2026.
[7] Dana, C. "polyauma: Universal Maximization in Polynomial Budget." auma.aws.monce.ai/paper, 2026.
[8] Dana, C., Boureau, P. "UNSAT through poly(n)[SAT], Z[X] and L." Algorithme.ai, April 2024.
[9] Soos, M. "Recovering XORs from a CNF." CryptoMiniSat blog, 2011.
[10] Järvisalo, M., Biere, A., Heule, M. "Blocked Clause Elimination." TACAS, 2010.
[11] Bacchus, F. "Enhancing Davis Putnam with Extended Binary Clause Reasoning." SAT, 2002.
[12] Eén, N., Biere, A. "Effective Preprocessing in SAT through Variable and Clause Elimination." SAT, 2005.
[13] Heule, M., Järvisalo, M., Biere, A. "Efficient CNF Simplification Based on Binary Implication Graphs." SAT, 2011.