The following instructions were prepared using
versioninfo()
Julia Version 1.8.5 Commit 17cfb8e65ea (2023-01-08 06:45 UTC) Platform Info: OS: Linux (x86_64-linux-gnu) CPU: 16 × AMD Ryzen 7 PRO 4750U with Radeon Graphics WORD_SIZE: 64 LIBM: libopenlibm LLVM: libLLVM-13.0.1 (ORCJIT, znver2) Threads: 8 on 16 virtual cores Environment: JULIA_NUM_THREADS = 8
using Pkg
Pkg.activate(".")
using Dates
now()
Activating project at `~/Mathematics/Teaching/2022/Computational Group Theory/2022 WS Karlsruhe/Symbolic SOS`
2023-02-17T12:04:31.847
using Plots
using LinearAlgebra
using Groups
using PropertyT
using PropertyT.StarAlgebras
We will show that $$\Delta^2 - 0.2\Delta \in \operatorname{SL}(3, \mathbb{Z})$$ is very close to being a sum of squares.
So far we only made the needed packages available in the notebook.
In the next cell we define G
to be the set of all $3\times 3$ matrices over $\mathbb Z$.
N = 3
G = Groups.MatrixGroups.SpecialLinearGroup{N}(Int8)
special linear group of 3×3 matrices over Int8
Now we create the elementary matrices $E_{i,j}$. The set of all such matrices and their inverses is denoted by S
.
S = gens(G)
S = union!(S, inv.(S))
12-element Vector{FPGroupElement{Groups.MatrixGroups.SpecialLinearGroup{3, Int8, DataType, Alphabet{Groups.MatrixGroups.ElementaryMatrix{3, Int8}}, Vector{Groups.MatrixGroups.ElementaryMatrix{3, Int8}}}, …}}: E₁₂ E₁₃ E₂₁ E₂₃ E₃₁ E₃₂ E₁₂^-1 E₁₃^-1 E₂₁^-1 E₂₃^-1 E₃₁^-1 E₃₂^-1
Now we will generate the ball E_R
of radius $R=4$ in $\operatorname{SL}(N,\mathbb{Z})$ and use this as a (partial) basis in a group ring (denoted by RG
below). Such group ring also needs a multiplication table (pm
, which is actually a division table) which is created as follows: when $x,y$ reside at positions i
-th and j
-th in E_R
, then pm[i,j] = k
, where k
is the position of $x^{-1}y$ in E_R
.
RG, S, sizes = PropertyT.group_algebra(G, S; halfradius=2, twisted=true)
[ Info: generating wl-metric ball of radius 4
0.347782 seconds (871.81 k allocations: 30.018 MiB, 92.16% compilation time)
[ Info: sizes = [13, 121, 883, 5455] [ Info: computing the *-algebra structure for G
0.053713 seconds (158.85 k allocations: 7.217 MiB, 93.01% compilation time)
(*-algebra of special linear group of 3×3 matrices over Int8, FPGroupElement{Groups.MatrixGroups.SpecialLinearGroup{3, Int8, DataType, Alphabet{Groups.MatrixGroups.ElementaryMatrix{3, Int8}}, Vector{Groups.MatrixGroups.ElementaryMatrix{3, Int8}}}, …}[E₁₂, E₁₃, E₂₁, E₂₃, E₃₁, E₃₂, E₁₂^-1, E₁₃^-1, E₂₁^-1, E₂₃^-1, E₃₁^-1, E₃₂^-1], [13, 121, 883, 5455])
sizes
4-element Vector{Int64}: 13 121 883 5455
Δ = length(S)*one(RG) - sum(RG(s) for s in S)
# Δ.coeffs
12·(id) -1·E₁₂ -1·E₁₃ -1·E₂₁ -1·E₂₃ -1·E₃₁ -1·E₃₂ -1·E₁₂^-1 -1·E₁₃^-1 -1·E₂₁^-1 -1·E₂₃^-1 -1·E₃₁^-1 -1·E₃₂^-1
using JuMP
using SCS
UB = 0.2
SDP_problem = PropertyT.sos_problem_primal(Δ^2, Δ, upper_bound=UB, augmented=false);
SDP_problem
A JuMP Model Maximization problem with: Variables: 7382 Objective function type: VariableRef `AffExpr`-in-`MathOptInterface.EqualTo{Float64}`: 5455 constraints `AffExpr`-in-`MathOptInterface.LessThan{Float64}`: 1 constraint `Vector{VariableRef}`-in-`MathOptInterface.PositiveSemidefiniteConeTriangle`: 1 constraint Model mode: AUTOMATIC CachingOptimizer state: NO_OPTIMIZER Solver name: No optimizer attached. Names registered in the model: P, psd, λ
function scs_optimizer(;
accel = 50,
alpha = 1.95,
eps = 1e-10,
max_iters = 10_000,
verbose = true,
linear_solver = SCS.DirectSolver,
)
return JuMP.optimizer_with_attributes(
SCS.Optimizer,
"acceleration_lookback" => accel,
"acceleration_interval" => 10,
"alpha" => alpha,
"eps_abs" => eps,
"eps_rel" => eps,
"linear_solver" => linear_solver,
"max_iters" => max_iters,
"warm_start" => true,
"verbose" => verbose,
)
end
@time status, _ = PropertyT.solve(SDP_problem, scs_optimizer());
@info status
25.111573 seconds (25.65 M allocations: 1.333 GiB, 1.55% gc time, 62.95% compilation time: 0% of which was recompilation) ------------------------------------------------------------------ SCS v3.2.1 - Splitting Conic Solver (c) Brendan O'Donoghue, Stanford University, 2012 ------------------------------------------------------------------ problem: variables n: 7382, constraints m: 12837 cones: z: primal zero / dual free vars: 5455 l: linear vars: 1 s: psd vars: 7381, ssize: 1 settings: eps_abs: 1.0e-10, eps_rel: 1.0e-10, eps_infeas: 1.0e-07 alpha: 1.95, scale: 1.00e-01, adaptive_scale: 1 max_iters: 10000, normalize: 1, rho_x: 1.00e-06 acceleration_lookback: 50, acceleration_interval: 10 lin-sys: sparse-direct-amd-qdldl nnz(A): 22036, nnz(P): 0 ------------------------------------------------------------------ iter | pri res | dua res | gap | obj | scale | time (s) ------------------------------------------------------------------ 0| 1.56e+02 1.00e+00 5.29e+01 2.20e+01 1.00e-01 2.38e-02 250| 1.96e-04 2.35e-05 1.69e-03 -1.99e-01 1.00e-01 8.39e-01 500| 1.83e-04 8.40e-06 3.12e-04 -2.00e-01 5.42e-03 1.67e+00 750| 1.27e-04 4.40e-06 1.99e-04 -2.00e-01 5.42e-03 2.54e+00 1000| 7.03e-05 2.86e-06 1.29e-04 -2.00e-01 5.42e-03 3.43e+00 1250| 6.36e-05 2.20e-06 8.56e-05 -2.00e-01 5.42e-03 4.27e+00 1500| 5.24e-05 1.80e-06 6.80e-05 -2.00e-01 5.42e-03 5.10e+00 1750| 5.35e-05 1.38e-06 5.63e-05 -2.00e-01 5.42e-03 5.92e+00 2000| 2.78e-03 2.23e-05 4.87e-04 -2.00e-01 5.42e-03 6.77e+00 2250| 1.87e-04 8.19e-07 1.07e-06 -2.00e-01 5.42e-03 7.64e+00 2500| 1.59e-07 8.62e-10 3.97e-08 -2.00e-01 5.42e-03 8.40e+00 2725| 9.50e-10 4.10e-12 8.27e-11 -2.00e-01 5.42e-03 9.08e+00 ------------------------------------------------------------------ status: solved timings: total: 9.08e+00s = setup: 1.62e-02s + solve: 9.06e+00s lin-sys: 7.26e-01s, cones: 7.68e+00s, accel: 1.63e-01s ------------------------------------------------------------------ objective = -0.200000 ------------------------------------------------------------------
[ Info: OPTIMAL
λ = value(SDP_problem[:λ])
0.19999999990216422
P = value.(SDP_problem[:P])
Q = real.(sqrt(P))
heatmap(Q, clim=(-0.4,0.4), color=:bluesreds, yflip=true, aspect_ratio=:equal, size=(500,500), legend=false)
sos = PropertyT.compute_sos(RG, Q, augmented=false)
residual = (Δ^2 - λ*Δ) - sos
coeffs(residual)
5455-element Vector{Float64}: -8.526512829121202e-13 1.9184653865522705e-13 2.2737367544323206e-13 2.6290081223123707e-13 1.9184653865522705e-13 1.8474111129762605e-13 1.9895196601282805e-13 1.9184653865522705e-13 2.2737367544323206e-13 2.5579538487363607e-13 1.8118839761882555e-13 1.9184653865522705e-13 2.0250467969162855e-13 ⋮ 9.159550025439237e-14 9.206124132886958e-14 1.9525493143152618e-13 9.287367375738445e-14 -2.16886519241217e-13 9.212404629560732e-14 9.307885538200859e-14 1.3393404144724133e-13 9.173053169614453e-14 9.447255742562001e-14 -2.174385935898179e-13 1.9500200065802886e-13
norm(residual, 1)
8.013365126577617e-10
So now we're in position where $$\Delta^2 - \lambda \Delta = \sum \xi_i^*\xi_i + residual$$
and we would like to dominate $residual$ by $R\Delta$ for some $R$ depending on $|residual|_1$. The problem is that we can do it only if $residual$ belongs to the augmentation ideal, however...
StarAlgebras.aug(residual)
-1.928179000086013e-10
StarAlgebras.aug(sos)
1.9280101511025753e-10
There are two ways to solve this:
0.1 + 0.2 - 0.3
5.551115123125783e-17
using PropertyT.IntervalArithmetic
u = @interval(0.1)
[0.0999999, 0.100001]
r = @interval 0.1 + @interval 0.2 - @interval 0.3
[-6.9389e-17, 2.77556e-17]
0 ∈ r
true
PropertyT.augment_columns!(Q)
Q_aug = Interval.(Q)
121×121 Matrix{Interval{Float64}}: [2.00364, 2.00365] … [-0.00705892, -0.00705891] [-0.309393, -0.309392] [0.0593636, 0.0593637] [-0.309393, -0.309392] [0.0552353, 0.0552354] [-0.309393, -0.309392] [0.0552353, 0.0552354] [-0.309393, -0.309392] [-0.0250792, -0.0250791] [-0.309393, -0.309392] … [0.0593636, 0.0593637] [-0.309393, -0.309392] [-0.0237578, -0.0237577] [-0.309393, -0.309392] [0.0593636, 0.0593637] [-0.309393, -0.309392] [0.0552353, 0.0552354] [-0.309393, -0.309392] [0.0552353, 0.0552354] [-0.309393, -0.309392] … [0.030087, 0.0300871] [-0.309393, -0.309392] [0.0593636, 0.0593637] [-0.309393, -0.309392] [-0.218148, -0.218147] ⋮ ⋱ ⋮ [0.0226735, 0.0226736] [0.0399426, 0.0399427] [-0.0078798, -0.00787979] … [0.0079027, 0.00790271] [0.0226735, 0.0226736] [-0.0333089, -0.0333088] [-0.00705891, -0.0070589] [-0.0585376, -0.0585375] [0.00680343, 0.00680344] [-0.132314, -0.132313] [0.0226735, 0.0226736] [0.0248844, 0.0248845] [0.0226735, 0.0226736] … [0.0248844, 0.0248845] [0.0530596, 0.0530597] [0.116407, 0.116408] [0.0226735, 0.0226736] [0.0248844, 0.0248845] [0.0226735, 0.0226736] [0.0248844, 0.0248845] [-0.0078798, -0.00787979] [0.105081, 0.105082] [-0.00705891, -0.0070589] … [0.339944, 0.339945]
0 in sum(Q_aug[:, 1])
true
sos_int = PropertyT.compute_sos(RG, Q_aug, augmented=true);
residual_int = Δ^2 - @interval(λ)*Δ - sos_int;
aug(residual_int)
[-1.75794e-10, 1.72987e-10]
0 in aug(residual_int)
true
Recall: if $x \in I[G]$, then there exists $R\in \mathbb{R}$ such that $x + R\Delta$ is a sum of squares.
It can be shown that for our $residual_{int}$ it is enough to take $R >= 2|residual_{int}|_1$.
Therefore
\begin{align*} \Delta^2 - \lambda \Delta & = \sum_i\xi_i^*\xi_i + residual\\ \Delta^2 - \lambda \Delta + 2|residual|_1\Delta & = \sum_i\xi_i^*\xi_i + \underbrace{residual + 2|residual|_1\Delta}_{\in \Sigma^2 I[G]} \end{align*}i.e. $\Delta^2 - (\lambda - 2|residual|_1)\Delta \in \Sigma^2I[G]$
λ_certified = @interval(λ) - 2norm(residual_int, 1)
[0.199995, 0.199996]
round(λ_certified.lo, digits=7, RoundDown)
0.1999958
κ = sqrt(2λ_certified/length(S))
[0.182572, 0.182573]
0.1825740
0.182574
Previously the best known bound was $0.00017...$ by M. Kassabov.
Now something happens: in the next cell we split the subspace of $\mathbb{R} \operatorname{SL}(N, \mathbb{Z})$ supported on E_R
into irreducible representations of the wreath product $\mathbb Z / 2 \mathbb Z \wr \operatorname{Sym}_N$. The action of wreath product on the elements of the matrix space is by conjugation, i.e. permutation of rows and columns.
We also compute projections on the invariant subspaces to later speed up the optimisation step.
using Groups
using PermutationGroups
using SymbolicWedderburn
wd = let RG = RG, N = N
G = StarAlgebras.object(RG)
P = PermGroup(perm"(1,2)", Perm(circshift(1:N, -1)))
Σ = Groups.Constructions.WreathProduct(PermGroup(perm"(1,2)"), P)
act = PropertyT.action_by_conjugation(G, Σ)
wdfl = @time SymbolicWedderburn.WedderburnDecomposition(
Float64,
Σ,
act,
basis(RG),
StarAlgebras.Basis{UInt16}(@view basis(RG)[1:sizes[2]]),
)
end
17.944732 seconds (43.99 M allocations: 2.208 GiB, 5.26% gc time, 99.84% compilation time)
Wedderburn Decomposition into 247 orbits and 5 simple summands of sizes [14, 9, 6, 14, 12]
elt = Δ^2
order_unit = Δ
UB = 0.200001
SDP_problem, varPs = PropertyT.sos_problem_primal(elt, order_unit, wd, upper_bound=UB, augmented=false);
SDP_problem
A JuMP Model Maximization problem with: Variables: 355 Objective function type: VariableRef `AffExpr`-in-`MathOptInterface.EqualTo{Float64}`: 247 constraints `AffExpr`-in-`MathOptInterface.LessThan{Float64}`: 1 constraint `Vector{VariableRef}`-in-`MathOptInterface.PositiveSemidefiniteConeTriangle`: 5 constraints Model mode: AUTOMATIC CachingOptimizer state: NO_OPTIMIZER Solver name: No optimizer attached. Names registered in the model: λ
@time status, _ = PropertyT.solve(SDP_problem, scs_optimizer());
@info status
0.528419 seconds (19.09 k allocations: 1.431 MiB) ------------------------------------------------------------------ SCS v3.2.1 - Splitting Conic Solver (c) Brendan O'Donoghue, Stanford University, 2012 ------------------------------------------------------------------ problem: variables n: 355, constraints m: 602 cones: z: primal zero / dual free vars: 247 l: linear vars: 1 s: psd vars: 354, ssize: 5 settings: eps_abs: 1.0e-10, eps_rel: 1.0e-10, eps_infeas: 1.0e-07 alpha: 1.95, scale: 1.00e-01, adaptive_scale: 1 max_iters: 10000, normalize: 1, rho_x: 1.00e-06 acceleration_lookback: 50, acceleration_interval: 10 lin-sys: sparse-direct-amd-qdldl nnz(A): 4132, nnz(P): 0 ------------------------------------------------------------------ iter | pri res | dua res | gap | obj | scale | time (s) ------------------------------------------------------------------ 0| 1.56e+02 1.00e+00 2.57e+01 9.66e+00 1.00e-01 2.89e-03 250| 2.28e-04 4.35e-05 7.65e-04 -2.00e-01 1.00e-01 4.01e-02 500| 1.24e-04 2.01e-05 3.00e-04 -2.00e-01 3.08e-02 7.77e-02 750| 4.08e-04 8.17e-06 1.39e-04 -2.00e-01 5.76e-03 1.16e-01 1000| 3.27e-04 5.15e-06 7.08e-05 -2.00e-01 5.76e-03 1.56e-01 1250| 2.98e-04 3.91e-06 5.14e-05 -2.00e-01 5.76e-03 1.95e-01 1500| 3.03e-04 2.93e-06 3.72e-05 -2.00e-01 5.76e-03 2.34e-01 1750| 3.16e-04 2.10e-06 2.55e-05 -2.00e-01 5.76e-03 2.73e-01 2000| 3.30e-04 1.43e-06 1.73e-05 -2.00e-01 5.76e-03 3.12e-01 2250| 3.39e-04 8.61e-07 1.12e-05 -2.00e-01 5.76e-03 3.51e-01 2500| 3.42e-04 3.29e-07 6.55e-06 -2.00e-01 5.76e-03 3.90e-01 2750| 1.65e-03 9.81e-06 4.10e-05 -2.00e-01 5.76e-03 4.31e-01 3000| 2.60e-04 4.02e-06 2.21e-05 -2.00e-01 5.76e-03 4.73e-01 3250| 2.43e-08 3.61e-10 3.34e-09 -2.00e-01 5.76e-03 5.14e-01 3325| 4.54e-09 1.19e-11 1.19e-10 -2.00e-01 5.76e-03 5.25e-01 ------------------------------------------------------------------ status: solved timings: total: 5.25e-01s = setup: 2.52e-03s + solve: 5.23e-01s lin-sys: 7.01e-02s, cones: 4.03e-01s, accel: 1.49e-02s ------------------------------------------------------------------ objective = -0.200001 ------------------------------------------------------------------
[ Info: OPTIMAL
λ = value(SDP_problem[:λ])
0.2000009999714222
using SparseArrays
let Ps = [value.(P) for P in varPs]
Qs = real.(sqrt.(Ps));
heatmap(Matrix(blockdiag(sparse.(Qs)...)), clim=(-0.4,0.4), color=:bluesreds, yflip=true, aspect_ratio=:equal, size=(500,500), legend=false)
end
Q_new = let Ps = [value.(P) for P in varPs]
Qs = real.(sqrt.(Ps));
PropertyT.reconstruct(Qs, wd)
end
# p1 = heatmap(Q_new, clim=(-0.4,0.4), color=:bluesreds, yflip=true, aspect_ratio=:equal, size=(500,500), legend=false)
# p2 = heatmap(Q, clim=(-0.4,0.4), color=:bluesreds, yflip=true, aspect_ratio=:equal, size=(500,500), legend=false)
# plot(p1, p1, layout=(1,2), legend = false, size=(1000, 500))
heatmap(Q-Q_new, clim=(-0.4,0.4), color=:bluesreds, yflip=true, aspect_ratio=:equal, size=(500,500), legend=false)
# project to the augmentation ideal
Q_aug = Interval.(PropertyT.augment_columns!(Q_new));
# compute sum of squares in RG defined by projected Q
sos_int = PropertyT.compute_sos(RG, Q_aug, augmented=false);
# compute the residual
residual_int = Δ^2 - @interval(λ)*Δ - sos_int;
# @assert 0 in StarAlgebras.aug(residual_int)
norm(residual_int, 1)
[1.71435e-08, 1.71651e-08]
λ_certified = (@interval(λ) - 2norm(residual_int, 1)).lo
0.20000096564128686
round(λ_certified, digits=6, RoundDown)
0.2