Training

Loss Functions

ParametricDFT.L2NormType
L2Norm

Squared L2 norm of the transformed signal: sum(|T(x)|^2).

Warning

Unitary transforms preserve the L2 norm (Parseval's theorem), so this loss is constant with respect to the circuit parameters and its gradient is zero. It should NOT be used as a training objective. Use L1Norm or MSELoss instead. This type is retained for backward compatibility but may be removed in a future release.

source
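The invariance is easy to check numerically. A minimal sketch, using a random unitary obtained from a QR factorization as a stand-in for the circuit transform:

```julia
using LinearAlgebra

# Random unitary as a stand-in for a parametrized unitary circuit T.
Q = Matrix(qr(randn(ComplexF64, 8, 8)).Q)

x = randn(ComplexF64, 8)

# Parseval: any unitary preserves the squared L2 norm, so this loss
# carries no gradient signal with respect to the circuit parameters.
@show sum(abs2, Q * x) ≈ sum(abs2, x)   # true
```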
ParametricDFT.MSELossType

MSE loss with top-k truncation: ||x - T⁻¹(truncate(T(x), k))||². Field k is the number of kept coefficients.

source
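A minimal sketch of this objective, again using a random unitary as a stand-in for the parametrized transform T (all names here are illustrative, not the package's internals):

```julia
using LinearAlgebra

n, k = 16, 4
T = Matrix(qr(randn(ComplexF64, n, n)).Q)   # stand-in for the circuit
x = randn(ComplexF64, n)

y = T * x
# keep the k largest-magnitude coefficients, zero the rest
thresh = partialsort(abs.(y), k; rev=true)
y_trunc = ifelse.(abs.(y) .>= thresh, y, zero(eltype(y)))

mse = sum(abs2, x - T' * y_trunc)   # T⁻¹ = T' for a unitary
```

The loss is small exactly when the learned basis concentrates the image's energy in few coefficients, which is what makes it a useful compression objective.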
ParametricDFT._topk_maskMethod
_topk_mask(x::AbstractMatrix, k::Int) -> BitMatrix

Compute a boolean mask selecting the k elements of x with largest absolute value. Uses quickselect (O(n) average) to find the threshold, then a broadcast comparison.

source
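A standalone sketch of the same idea: `partialsort` performs the quickselect step (O(n) average) to find the k-th largest magnitude, and a single broadcast builds the mask.

```julia
# Magnitude top-k mask: quickselect finds the threshold, broadcast compares.
function topk_mask(x::AbstractMatrix, k::Int)
    a = abs.(x)
    thresh = partialsort(vec(a), k; rev=true)
    return a .>= thresh
end

x = [3.0 -1.0; 0.5 -7.0]
topk_mask(x, 2)   # selects |-7.0| and |3.0|
```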
ParametricDFT.loss_functionMethod
loss_function(tensors, m, n, optcode, pic::AbstractMatrix, loss; inverse_code=nothing)

Compute loss for a single image pic (2^m x 2^n) under the given circuit parameters.

source
ParametricDFT.loss_functionMethod
loss_function(tensors, m, n, optcode, pics::Vector{<:AbstractMatrix}, loss; inverse_code=nothing, batched_optcode=nothing)

Average loss over a batch of images. Uses batched einsum if batched_optcode is provided.

source
ParametricDFT.topk_truncateMethod
topk_truncate(x::AbstractMatrix, k::Integer)

Magnitude-based top-k truncation: keeps the k coefficients with largest absolute value, zeroing the rest. This is basis-agnostic — it does not assume any particular frequency layout.

source
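A small usage sketch (a hypothetical standalone reimplementation mirroring the documented behavior, not the package's code):

```julia
# keep the k largest-magnitude entries, zero the rest
function topk_truncate(x::AbstractMatrix, k::Integer)
    thresh = partialsort(vec(abs.(x)), k; rev=true)
    return ifelse.(abs.(x) .>= thresh, x, zero(eltype(x)))
end

topk_truncate([3.0 -1.0; 0.5 -7.0], 2)   # → [3.0 0.0; 0.0 -7.0]
```

Because the rule only looks at magnitudes, it applies unchanged whether the coefficients come from a DFT, a learned circuit basis, or any other transform.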

Manifolds

ParametricDFT.batched_invMethod
batched_inv(A::AbstractArray{T,3})

Batched matrix inverse: C[:,:,k] = inv(A[:,:,k]) for each slice k. Uses LU factorization for general matrices.

source
ParametricDFT.batched_matmulMethod
batched_matmul(A::AbstractArray{T,3}, B::AbstractArray{T,3})

Batched matrix multiply: C[:,:,k] = A[:,:,k] * B[:,:,k] for each slice k.

source
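The reference semantics of both batched ops can be written as a plain loop over slices (a sketch, not the package's implementation):

```julia
using LinearAlgebra

A = randn(3, 3, 4)
B = randn(3, 3, 4)

# batched matmul: one multiply per slice k
C = similar(A)
for k in axes(A, 3)
    C[:, :, k] = A[:, :, k] * B[:, :, k]
end

# batched inverse via LU, one factorization per slice
Ainv = similar(A)
for k in axes(A, 3)
    Ainv[:, :, k] = inv(lu(A[:, :, k]))
end

@show C[:, :, 1] ≈ A[:, :, 1] * B[:, :, 1]   # true
```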
ParametricDFT.projectFunction
project(m::AbstractRiemannianManifold, points, euclidean_grads)

Project Euclidean gradients onto the tangent space at points. Batched over last dim.

source
ParametricDFT.retractFunction
retract(m::AbstractRiemannianManifold, points, tangent_vec, α)

Retract from points along tangent_vec with step size α. Batched over last dim.

source
ParametricDFT.transportFunction
transport(m::AbstractRiemannianManifold, old_points, new_points, vec)

Parallel transport vec from old_points to new_points. Batched over last dim.

source
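For intuition, here is a minimal sketch of these three operations for the unitary (Stiefel) manifold, a common choice for circuit tensors. Assumptions, not the package's definitions: canonical-metric projection, QR-based retraction, and projection-based vector transport.

```julia
using LinearAlgebra

sym(M) = (M + M') / 2

# Tangent-space projection at a unitary point X.
project(X, G) = G - X * sym(X' * G)

# QR-based retraction: step, then re-orthonormalize (sign fix keeps it smooth).
function retract(X, V, α)
    F = qr(X + α * V)
    return Matrix(F.Q) * Diagonal(sign.(real(diag(F.R))))
end

# Simple vector transport: re-project the tangent vector at the new point.
transport(Xold, Xnew, V) = project(Xnew, V)

X = Matrix(qr(randn(ComplexF64, 4, 4)).Q)
V = project(X, randn(ComplexF64, 4, 4))
Xnew = retract(X, V, 0.1)
@show Xnew' * Xnew ≈ I   # the retraction stays on the manifold
```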

Optimizers

ParametricDFT._batched_projectMethod
_batched_project(manifold_groups, point_batches, grad_buf_batches, euclidean_grads)

Batched Riemannian projection. Returns (rg_batches, grad_norm).

source
ParametricDFT.optimize!Method
optimize!(opt::RiemannianAdam, tensors, loss_fn, grad_fn; max_iter=100, tol=1e-6, loss_trace=nothing)

Run Riemannian Adam with momentum transport. Returns optimized tensors. When loss_trace::Vector{Float64} is provided, per-iteration losses are appended to it.

source
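The "momentum transport" idea, sketched on the unit sphere for simplicity (a toy illustration with bias correction omitted, not the package's optimizer):

```julia
using LinearAlgebra

unit(x) = x / norm(x)
proj(x, v) = v - (x ⋅ v) * x          # tangent projection on the unit sphere
sphere_retract(x, v, α) = unit(x + α * v)

# One Riemannian Adam step: project the gradient, update the moments,
# retract, then transport the momentum to the new point.
function adam_step(x, g, m, v; α=0.01, β1=0.9, β2=0.999, ϵ=1e-8)
    rg = proj(x, g)                    # Riemannian gradient
    m = β1 .* m .+ (1 - β1) .* rg
    v = β2 * v + (1 - β2) * sum(abs2, rg)
    xnew = sphere_retract(x, -m ./ (sqrt(v) + ϵ), α)
    m = proj(xnew, m)                  # momentum transport
    return xnew, m, v
end

x = unit([1.0, 1.0, 0.0])
m, v = zeros(3), 0.0
x, m, v = adam_step(x, [0.2, -0.1, 0.4], m, v)
@show norm(x) ≈ 1.0   # true
```

Transporting the first moment is what keeps the accumulated momentum a valid tangent vector as the iterate moves across the manifold.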
ParametricDFT.optimize!Method
optimize!(opt::RiemannianGD, tensors, loss_fn, grad_fn; max_iter=100, tol=1e-6, loss_trace=nothing)

Run Riemannian gradient descent with Armijo line search. Returns optimized tensors. When loss_trace::Vector{Float64} is provided, per-iteration losses are appended to it.

source
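The Armijo backtracking rule can be sketched generically (a Euclidean version for illustration; the actual line search steps via `retract` instead of `x .- α .* g`):

```julia
# Shrink α until the sufficient-decrease condition
# f(x - α g) ≤ f(x) - c · α · ‖g‖² holds.
function armijo_step(f, x, g; α=1.0, c=1e-4, shrink=0.5, max_tries=50)
    fx = f(x)
    g2 = sum(abs2, g)
    for _ in 1:max_tries
        f(x .- α .* g) <= fx - c * α * g2 && return α
        α *= shrink
    end
    return α
end

f(x) = sum(abs2, x)          # toy quadratic
x = [1.0, -2.0]
g = 2 .* x                   # its gradient
α = armijo_step(f, x, g)     # → 0.5 for this quadratic
```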

Training Pipeline

ParametricDFT._train_basis_coreMethod

Core training loop shared by all basis types. Returns (final_tensors, best_val_loss, train_losses, val_losses, step_train_losses). Uses optimize! from optimizers.jl for all optimization (GPU and CPU). Supports optimizers: RiemannianGD(), RiemannianAdam(), or symbols :gradient_descent, :adam.

source
ParametricDFT.train_basisMethod
train_basis(::Type{EntangledQFTBasis}, dataset; m, n, entangle_phases, ...)

Train an EntangledQFTBasis on images. Same kwargs as QFTBasis plus entangle_phases.

source
ParametricDFT.train_basisMethod
train_basis(::Type{QFTBasis}, dataset; m, n, loss, epochs, steps_per_image,
            optimizer, batch_size, device, ...)

Train a QFTBasis on images. Returns (basis, history). Key kwargs: optimizer (RiemannianGD()/RiemannianAdam()/:gradient_descent/:adam), batch_size, device (:cpu/:gpu).

source
ParametricDFT.train_basisMethod
train_basis(::Type{TEBDBasis}, dataset; m, n, phases, ...)

Train a TEBDBasis on images. Same kwargs as QFTBasis plus phases (initial TEBD gate phases).

source

Einsum Cache

ParametricDFT.optimize_code_cachedFunction
optimize_code_cached(flat_code, size_dict, optimizer=TreeSA())

Like optimize_code(flat_code, size_dict, optimizer) but caches the result to disk. On cache hit, returns immediately without running the optimizer.

source
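The general disk-cache pattern behind this can be sketched with the `Serialization` stdlib (hypothetical key scheme and cache path, for illustration only):

```julia
using Serialization

# Hash the inputs, serialize the result, and short-circuit on a cache hit.
function cached(f, args...; dir=joinpath(tempdir(), "einsum_cache"))
    mkpath(dir)
    path = joinpath(dir, string(hash(args), ".jls"))
    isfile(path) && return deserialize(path)   # cache hit: skip the optimizer
    result = f(args...)
    serialize(path, result)
    return result
end

slow_square(x) = (sleep(0.1); x^2)
cached(slow_square, 7)    # first call runs slow_square and writes the cache
cached(slow_square, 7)    # second call reads from disk
```

Caching pays off here because contraction-order search (e.g. `TreeSA`) is far more expensive than hashing the code and size dictionary.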