Training

Loss Functions

ParametricDFT.L2NormType
L2Norm

Squared L2 norm of the transformed signal: sum(|T(x)|^2).

Warning

Unitary transforms preserve the L2 norm (Parseval's theorem), so this loss is constant with respect to the circuit parameters and its gradient is zero. It should NOT be used as a training objective. Use L1Norm or MSELoss instead. This type is retained for backward compatibility but may be removed in a future release.

source
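The invariance is easy to check numerically. A minimal sketch, using a random unitary obtained from a QR factorization as a stand-in for the circuit transform:

```julia
using LinearAlgebra

# Random unitary as a stand-in for a parametrized unitary circuit T.
Q = Matrix(qr(randn(ComplexF64, 8, 8)).Q)

x = randn(ComplexF64, 8)

# Parseval: any unitary preserves the squared L2 norm, so this loss
# carries no gradient signal with respect to the circuit parameters.
@show sum(abs2, Q * x) ≈ sum(abs2, x)   # true
```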
ParametricDFT.MSELossType

MSE loss with top-k truncation: ||x - T⁻¹(truncate(T(x), k))||². Field k is the number of kept coefficients.

source
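A minimal sketch of this objective, again using a random unitary as a stand-in for the parametrized transform T (all names here are illustrative, not the package's internals):

```julia
using LinearAlgebra

n, k = 16, 4
T = Matrix(qr(randn(ComplexF64, n, n)).Q)   # stand-in for the circuit
x = randn(ComplexF64, n)

y = T * x
# keep the k largest-magnitude coefficients, zero the rest
thresh = partialsort(abs.(y), k; rev=true)
y_trunc = ifelse.(abs.(y) .>= thresh, y, zero(eltype(y)))

mse = sum(abs2, x - T' * y_trunc)   # T⁻¹ = T' for a unitary
```

The loss is small exactly when the learned basis concentrates the image's energy in few coefficients, which is what makes it a useful compression objective.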
ParametricDFT._topk_maskMethod
_topk_mask(x::AbstractMatrix, k::Int) -> BitMatrix

Compute a boolean mask selecting the k elements of x with largest absolute value. Uses quickselect (O(n) average) to find the threshold, then a broadcast comparison.

source
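A standalone sketch of the same idea: `partialsort` performs the quickselect step (O(n) average) to find the k-th largest magnitude, and a single broadcast builds the mask.

```julia
# Magnitude top-k mask: quickselect finds the threshold, broadcast compares.
function topk_mask(x::AbstractMatrix, k::Int)
    a = abs.(x)
    thresh = partialsort(vec(a), k; rev=true)
    return a .>= thresh
end

x = [3.0 -1.0; 0.5 -7.0]
topk_mask(x, 2)   # selects |-7.0| and |3.0|
```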
ParametricDFT.loss_functionMethod
loss_function(tensors, m, n, optcode, pic::AbstractMatrix, loss; inverse_code=nothing)

Compute loss for a single image pic (2^m x 2^n) under the given circuit parameters.

source
ParametricDFT.loss_functionMethod
loss_function(tensors, m, n, optcode, pics::Vector{<:AbstractMatrix}, loss; inverse_code=nothing, batched_optcode=nothing)

Average loss over a batch of images. Uses batched einsum if batched_optcode is provided.

source
ParametricDFT.topk_truncateMethod
topk_truncate(x::AbstractMatrix, k::Integer)

Magnitude-based top-k truncation: keeps the k coefficients with largest absolute value, zeroing the rest. This is basis-agnostic — it does not assume any particular frequency layout.

source
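A small usage sketch (a hypothetical standalone reimplementation mirroring the documented behavior, not the package's code):

```julia
# keep the k largest-magnitude entries, zero the rest
function topk_truncate(x::AbstractMatrix, k::Integer)
    thresh = partialsort(vec(abs.(x)), k; rev=true)
    return ifelse.(abs.(x) .>= thresh, x, zero(eltype(x)))
end

topk_truncate([3.0 -1.0; 0.5 -7.0], 2)   # → [3.0 0.0; 0.0 -7.0]
```

Because the rule only looks at magnitudes, it applies unchanged whether the coefficients come from a DFT, a learned circuit basis, or any other transform.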

Manifolds

ParametricDFT.batched_invMethod
batched_inv(A::AbstractArray{T,3})

Batched matrix inverse: C[:,:,k] = inv(A[:,:,k]) for each slice k. Uses LU factorization for general matrices.

source
ParametricDFT.batched_matmulMethod
batched_matmul(A::AbstractArray{T,3}, B::AbstractArray{T,3})

Batched matrix multiply: C[:,:,k] = A[:,:,k] * B[:,:,k] for each slice k.

source
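The reference semantics of both batched ops can be written as a plain loop over slices (a sketch, not the package's implementation):

```julia
using LinearAlgebra

A = randn(3, 3, 4)
B = randn(3, 3, 4)

# batched matmul: one multiply per slice k
C = similar(A)
for k in axes(A, 3)
    C[:, :, k] = A[:, :, k] * B[:, :, k]
end

# batched inverse via LU, one factorization per slice
Ainv = similar(A)
for k in axes(A, 3)
    Ainv[:, :, k] = inv(lu(A[:, :, k]))
end

@show C[:, :, 1] ≈ A[:, :, 1] * B[:, :, 1]   # true
```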
ParametricDFT.projectFunction
project(m::AbstractRiemannianManifold, points, euclidean_grads)

Project Euclidean gradients onto the tangent space at points. Batched over last dim.

source
ParametricDFT.retractFunction
retract(m::AbstractRiemannianManifold, points, tangent_vec, α)

Retract from points along tangent_vec with step size α. Batched over last dim.

source
ParametricDFT.transportFunction
transport(m::AbstractRiemannianManifold, old_points, new_points, vec)

Parallel transport vec from old_points to new_points. Batched over last dim.

source
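For intuition, here is a minimal sketch of these three operations for the unitary (Stiefel) manifold, a common choice for circuit tensors. Assumptions, not the package's definitions: canonical-metric projection, QR-based retraction, and projection-based vector transport.

```julia
using LinearAlgebra

sym(M) = (M + M') / 2

# Tangent-space projection at a unitary point X.
project(X, G) = G - X * sym(X' * G)

# QR-based retraction: step, then re-orthonormalize (sign fix keeps it smooth).
function retract(X, V, α)
    F = qr(X + α * V)
    return Matrix(F.Q) * Diagonal(sign.(real(diag(F.R))))
end

# Simple vector transport: re-project the tangent vector at the new point.
transport(Xold, Xnew, V) = project(Xnew, V)

X = Matrix(qr(randn(ComplexF64, 4, 4)).Q)
V = project(X, randn(ComplexF64, 4, 4))
Xnew = retract(X, V, 0.1)
@show Xnew' * Xnew ≈ I   # the retraction stays on the manifold
```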

Optimizers

ParametricDFT._batched_projectMethod
_batched_project(manifold_groups, point_batches, grad_buf_batches, euclidean_grads)

Batched Riemannian projection. Returns (rg_batches, grad_norm).

source
ParametricDFT.optimize!Method
optimize!(opt::RiemannianAdam, tensors, loss_fn, grad_fn; max_iter=100, tol=1e-6, loss_trace=nothing)

Run Riemannian Adam with momentum transport. Returns optimized tensors. When loss_trace::Vector{Float64} is provided, per-iteration losses are appended to it.

source
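The "momentum transport" idea, sketched on the unit sphere for simplicity (a toy illustration with bias correction omitted, not the package's optimizer):

```julia
using LinearAlgebra

unit(x) = x / norm(x)
proj(x, v) = v - (x ⋅ v) * x          # tangent projection on the unit sphere
sphere_retract(x, v, α) = unit(x + α * v)

# One Riemannian Adam step: project the gradient, update the moments,
# retract, then transport the momentum to the new point.
function adam_step(x, g, m, v; α=0.01, β1=0.9, β2=0.999, ϵ=1e-8)
    rg = proj(x, g)                    # Riemannian gradient
    m = β1 .* m .+ (1 - β1) .* rg
    v = β2 * v + (1 - β2) * sum(abs2, rg)
    xnew = sphere_retract(x, -m ./ (sqrt(v) + ϵ), α)
    m = proj(xnew, m)                  # momentum transport
    return xnew, m, v
end

x = unit([1.0, 1.0, 0.0])
m, v = zeros(3), 0.0
x, m, v = adam_step(x, [0.2, -0.1, 0.4], m, v)
@show norm(x) ≈ 1.0   # true
```

Transporting the first moment is what keeps the accumulated momentum a valid tangent vector as the iterate moves across the manifold.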
ParametricDFT.optimize!Method
optimize!(opt::RiemannianGD, tensors, loss_fn, grad_fn; max_iter=100, tol=1e-6, loss_trace=nothing)

Run Riemannian gradient descent with Armijo line search. Returns optimized tensors. When loss_trace::Vector{Float64} is provided, per-iteration losses are appended to it.

source
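The Armijo backtracking rule can be sketched generically (a Euclidean version for illustration; the actual line search steps via `retract` instead of `x .- α .* g`):

```julia
# Shrink α until the sufficient-decrease condition
# f(x - α g) ≤ f(x) - c · α · ‖g‖² holds.
function armijo_step(f, x, g; α=1.0, c=1e-4, shrink=0.5, max_tries=50)
    fx = f(x)
    g2 = sum(abs2, g)
    for _ in 1:max_tries
        f(x .- α .* g) <= fx - c * α * g2 && return α
        α *= shrink
    end
    return α
end

f(x) = sum(abs2, x)          # toy quadratic
x = [1.0, -2.0]
g = 2 .* x                   # its gradient
α = armijo_step(f, x, g)     # → 0.5 for this quadratic
```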

Training Pipeline

ParametricDFT._train_basis_coreMethod

Core training loop shared by all basis types. Returns (final_tensors, best_val_loss, train_losses, val_losses, step_train_losses). Uses optimize! from optimizers.jl for all optimization (GPU and CPU). Supports optimizers: RiemannianGD(), RiemannianAdam(), or symbols :gradient_descent, :adam.

source
ParametricDFT.train_basisMethod
train_basis(::Type{EntangledQFTBasis}, dataset; m, n, entangle_phases, ...)

Train an EntangledQFTBasis on images. Same kwargs as QFTBasis plus entangle_phases.

source
ParametricDFT.train_basisMethod
train_basis(::Type{QFTBasis}, dataset; m, n, loss, epochs, steps_per_image,
            optimizer, batch_size, device, ...)

Train a QFTBasis on images. Returns (basis, history). Key kwargs: optimizer (RiemannianGD()/RiemannianAdam()/:gradient_descent/:adam), batch_size, device (:cpu/:gpu).

source
ParametricDFT.train_basisMethod
train_basis(::Type{TEBDBasis}, dataset; m, n, phases, ...)

Train a TEBDBasis on images. Same kwargs as QFTBasis plus phases (initial TEBD gate phases).

source

Einsum Cache

ParametricDFT.optimize_code_cachedFunction
optimize_code_cached(flat_code, size_dict, optimizer=TreeSA())

Like optimize_code(flat_code, size_dict, optimizer) but caches the result to disk. On cache hit, returns immediately without running the optimizer.

source
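The general disk-cache pattern behind this can be sketched with the `Serialization` stdlib (hypothetical key scheme and cache path, for illustration only):

```julia
using Serialization

# Hash the inputs, serialize the result, and short-circuit on a cache hit.
function cached(f, args...; dir=joinpath(tempdir(), "einsum_cache"))
    mkpath(dir)
    path = joinpath(dir, string(hash(args), ".jls"))
    isfile(path) && return deserialize(path)   # cache hit: skip the optimizer
    result = f(args...)
    serialize(path, result)
    return result
end

slow_square(x) = (sleep(0.1); x^2)
cached(slow_square, 7)    # first call runs slow_square and writes the cache
cached(slow_square, 7)    # second call reads from disk
```

Caching pays off here because contraction-order search (e.g. `TreeSA`) is far more expensive than hashing the code and size dictionary.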