Merge pull request #10 from bsubercaseaux/symm-break-discussion

Add discussion of formalization difficulties in symmetry breaking
bsubercaseaux · Jun 11, 2024 · 34edc98
2 parents 6a4f60b + 4551f83
commit 34edc98

Showing 4 changed files with 50 additions and 11 deletions.
diff --git a/ITP/conclusions.tex b/ITP/conclusions.tex
@@ -35,12 +35,13 @@
 that did not match the (correct) code of Heule and Scheucher.
 Our formalization corrects this.
 
-In terms of future work,
-we hope to formally verify the result $h(7) = \infty$ due to Horton~\cite{hortonSetsNoEmpty1983},
+\subparagraph*{Future Work}
+We hope to formally verify the result $h(7) = \infty$ due to Horton~\cite{hortonSetsNoEmpty1983},
 and other results in Erd\H{o}s-Szekeres style problems.
-A key challenge for the community
-is to improve the connection between verified SAT tools and ITPs.
-This presents a significant engineering task
-for proofs that are hundreds of terabytes long (as in this result).
+
+We also want to improve the trust story of importing ``cube and conquer''-style results into an ITP.
+Importing these kinds of proofs is a significant engineering task
+when the proof certificate is hundreds of terabytes in size,
+as it was for this result (see \Cref{sec:running_cnf}).
 Although we are confident that our results are correct,
-the trust story at this connection point has room for improvement.
+more work needs to be done to strengthen the trust in this connection point.
diff --git a/ITP/encoding.tex b/ITP/encoding.tex
@@ -105,7 +105,7 @@
 which can be assembled into a $6$-hole $ap'ed'cb'$.
 
 Justifying this formally turned out to be complex,
-requring a fair bit of reasoning about point \lstinline|Arc|s
+requiring a fair bit of reasoning about point \lstinline|Arc|s
 and \lstinline|σCCWPoints|: lists of points winding around a convex polygon.
 Luckily, the main argument can be summarized in terms of two facts:
 (a) any triangle $abc$ contains an empty triangle $ab'c$; and
@@ -154,7 +154,8 @@
 are both empty shapes in $P$,
 then $S$ is an empty shape in $P$.
 
-\subparagraph*{Running the CNF.}
+\subsection{Running the CNF.}
+\label{sec:running_cnf}
 Having now shown that our main result follows if $\phi_{30}$ is unsatisfiable,
 we run a distributed computation to check its unsatisfiability.
 We solve the SAT formula~$\phi_{30}$ produced by Lean using the same setup as
@@ -164,17 +165,29 @@
 we partition the problem into 312\,418 subproblems.
 Each of these subproblems was
 solved using {\tt CaDiCaL} version 1.9.5.
-The solver produced an LRAT proof for each execution,
+{\tt CaDiCaL} produced an LRAT proof for each execution,
 which was validated using the {\tt cake\_lpr} verified checker on-the-fly
 in order to avoid writing/storing/reading large files.
 The total runtime was 25\,876.5 CPU hours, or roughly 3 CPU years.
 The difference in runtime relative to Heule and Scheucher's original run
 is purely due to the difference in hardware.
 Additionally,
-we validated that the subproblems cover the entire search space as Heule and Scheucher did~\cite[Section 7.3]{emptyHexagonNumber}.
+we validated that the subproblems cover the entire search space
+as Heule and Scheucher did~\cite[Section 7.3]{emptyHexagonNumber}.
 This was done by verifying the unsatisfiability
 of another formula that took 20 seconds to solve.
 
+The artifact for this paper includes scripts to validate any individual subproblem,
+as well as the summary proof that the subproblems cover the search space.
+However, the unsatisfiability of $\phi_{30}$ depends on
+the unsatisfiability of \textit{all} (hundreds of thousands of) subproblems.
+A skeptical reader might wish to examine the proof files for all subproblems,
+but we estimated the total proof size to be tens or hundreds of terabytes,
+far too much to reasonably store and distribute.
+Instead, the skeptical reader must run the entire 3 CPU year computation.
+We believe this trust story can be somewhat improved,
+but we leave such a challenge to future work.
+
 % file-local attic:
 
 % By using the triangulation lemma repeatedly,

diff --git a/ITP/main.bib b/ITP/main.bib
@@ -671,3 +671,21 @@ @InProceedings{sbva
   IGNOREURN =		{urn:nbn:de:0030-drops-184736},
   doi =		{10.4230/LIPIcs.SAT.2023.11},
 }
+
+@InProceedings{cube_and_conquer,
+author="Heule, Marijn J. H.
+and Kullmann, Oliver
+and Wieringa, Siert
+and Biere, Armin",
+editor="Eder, Kerstin
+and Louren{\c{c}}o, Jo{\~a}o
+and Shehory, Onn",
+title="Cube and Conquer: Guiding CDCL SAT Solvers by Lookaheads",
+booktitle="Hardware and Software: Verification and Testing",
+year="2012",
+publisher="Springer Berlin Heidelberg",
+address="Berlin, Heidelberg",
+pages="50--65",
+abstract="Satisfiability (SAT) is considered as one of the most important core technologies in formal verification and related areas. Even though there is steady progress in improving practical SAT solving, there are limits on scalability of SAT solvers. We address this issue and present a new approach, called cube-and-conquer, targeted at reducing solving time on hard instances. This two-phase approach partitions a problem into many thousands (or millions) of cubes using lookahead techniques. Afterwards, a conflict-driven solver tackles the problem, using the cubes to guide the search. On several hard competition benchmarks, our hybrid approach outperforms both lookahead and conflict-driven solvers. Moreover, because cube-and-conquer is natural to parallelize, it is a competitive alternative for solving SAT problems in parallel.",
+isbn="978-3-642-34188-5"
+}
diff --git a/ITP/symmetry-breaking.tex b/ITP/symmetry-breaking.tex
@@ -131,3 +131,10 @@
 \end{align*}
 This concludes the proof.
 \end{proof}
+
+Compared to the symmetry-breaking transformation described by Heule and Scheucher,
+our transformation is simpler.
+Nonetheless, proving the above theorem in Lean was tedious,
+as we had to show that the properties from the previous steps were preserved at each new step,
+which carried substantial proof burden.
+In particular, steps 3 through 6 required careful bookkeeping and special handling of the distinguished point~$p_1$.