Training
Loss Functions
ParametricDFT.AbstractLoss — Type
Abstract base type for loss functions. Subtypes implement _loss_function.
ParametricDFT.L1Norm — Type
L1 norm loss: minimizes sum(|T(x)|) to encourage sparsity.
ParametricDFT.MSELoss — Type
MSE loss with top-k truncation: ||x - T⁻¹(truncate(T(x), k))||². Field k is the number of kept coefficients.
ParametricDFT._ensure_tuple — Method
Convert AbstractVector tensors to Tuple for stable Zygote AD tangent types.
ParametricDFT._loss_function_batched — Method
Dispatch batched loss computation to the appropriate loss-specific function.
ParametricDFT._topk_mask — Method
_topk_mask(x::AbstractMatrix, k::Int) -> BitMatrix
Compute a boolean mask selecting the k elements of x with largest absolute value. Uses quickselect (O(n) average) to find the threshold, then a broadcast comparison.
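The quickselect-plus-broadcast scheme can be sketched in plain Julia (the name is illustrative, not the package's code; note that ties at the threshold value may keep slightly more than k entries):

```julia
# Hypothetical re-implementation of the documented behaviour.
function topk_mask_sketch(x::AbstractMatrix, k::Int)
    k >= length(x) && return trues(size(x)...)
    # partialsort is quickselect-based: O(n) average to find the k-th magnitude.
    thresh = partialsort(vec(abs.(x)), k; rev=true)
    return abs.(x) .>= thresh   # broadcast comparison against the threshold
end
```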
ParametricDFT.batched_forward — Method
Apply circuit to a pre-stacked image batch. Returns (2,...,2,B) tensor.
ParametricDFT.batched_forward — Method
Apply circuit to B images in a single einsum call. Returns (2,...,2,B) tensor.
ParametricDFT.batched_loss_l1 — Method
Batched L1 loss: (1/B) * sum(|forward(images)|).
ParametricDFT.batched_loss_mse — Method
Batched MSE loss: batched forward, batched top-k truncation, batched inverse.
ParametricDFT.batched_topk_truncate — Method
batched_topk_truncate(x_batched::AbstractArray{T,N}, m::Int, n::Int, k::Integer)
Apply per-image top-k truncation to a batched frequency-domain tensor of shape (2, 2, …, 2, B) (with m + n qubit dims). Returns a tensor of the same shape with all but the k largest-magnitude entries of each image zeroed.
The mask is content-dependent — each image can keep a different set of coefficients — so on CPU this falls back to a per-image loop. GPU specialisations in ext/CUDAExt.jl compute all B masks in a single sort call.
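A minimal sketch of that CPU fallback, assuming the batched tensor is reshaped to one column per image (the name and the reshape strategy are illustrative, not the package's implementation):

```julia
# Per-image truncation over a (2,…,2,B) tensor with m + n qubit dims.
function batched_topk_truncate_sketch(x_batched::AbstractArray, m::Int, n::Int, k::Int)
    B = size(x_batched, ndims(x_batched))
    X = reshape(x_batched, 2^(m + n), B)
    out = similar(X)
    for b in 1:B          # content-dependent masks => one pass per image on CPU
        col = X[:, b]
        thresh = partialsort(abs.(col), k; rev=true)
        out[:, b] = ifelse.(abs.(col) .>= thresh, col, zero(eltype(col)))
    end
    return reshape(out, size(x_batched))
end
```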
ParametricDFT.loss_function — Method
loss_function(tensors, m, n, optcode, pic::AbstractMatrix, loss; inverse_code=nothing)
Compute loss for a single image pic (2^m x 2^n) under the given circuit parameters.
ParametricDFT.loss_function — Method
loss_function(tensors, m, n, optcode, pics::Vector{<:AbstractMatrix}, loss; inverse_code=nothing, batched_optcode=nothing)
Average loss over a batch of images. Uses batched einsum if batched_optcode is provided.
ParametricDFT.make_batched_code — Method
Add a batch label to image input/output indices. Returns (batched_flat_code, batch_label).
ParametricDFT.optimize_batched_code — Method
Optimize contraction order for batched einsum. batch_size guides optimization but result works at any runtime batch size.
ParametricDFT.stack_image_batch — Method
Stack B images into a single (2, ..., 2, B) tensor for batched einsum.
ParametricDFT.topk_truncate — Method
topk_truncate(x::AbstractMatrix, k::Integer)
Magnitude-based top-k truncation: keeps the k coefficients with largest absolute value, zeroing the rest. This is basis-agnostic — it does not assume any particular frequency layout.
Manifolds
ParametricDFT.AbstractRiemannianManifold — Type
Abstract base type for Riemannian manifolds used in circuit optimization.
ParametricDFT.PhaseManifold — Type
U(1)^d manifold: each element is a unit complex number.
ParametricDFT.UnitaryManifold — Type
U(n) unitary group manifold. Tensors are n x n unitary matrices.
ParametricDFT._make_identity_batch — Method
_make_identity_batch(::Type{T}, d::Int, n::Int) -> Array{T,3}
Create a (d, d, n) array of identity matrices. Used by optimizers for Cayley retraction pre-allocation and as fallback in retract(::UnitaryManifold, ...).
ParametricDFT.batched_adjoint — Method
batched_adjoint(A::AbstractArray{T,3})
Batched conjugate transpose: C[:,:,k] = A[:,:,k]' for each slice k.
ParametricDFT.batched_inv — Method
batched_inv(A::AbstractArray{T,3})
Batched matrix inverse: C[:,:,k] = inv(A[:,:,k]) for each slice k. Uses LU factorization for general matrices.
ParametricDFT.batched_matmul — Method
batched_matmul(A::AbstractArray{T,3}, B::AbstractArray{T,3})
Batched matrix multiply: C[:,:,k] = A[:,:,k] * B[:,:,k] for each slice k.
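Per-slice loops like the following illustrate what these batched kernels compute (a sketch only; the actual implementations, and any BLAS-level batching, may differ):

```julia
using LinearAlgebra

# C[:,:,k] = A[:,:,k]' for each slice k (conjugate transpose).
function batched_adjoint_sketch(A::AbstractArray{<:Number,3})
    C = similar(A, size(A, 2), size(A, 1), size(A, 3))
    for k in axes(A, 3)
        C[:, :, k] = A[:, :, k]'
    end
    return C
end

# C[:,:,k] = A[:,:,k] * B[:,:,k] for each slice k, written into views.
function batched_matmul_sketch(A::AbstractArray{<:Number,3}, B::AbstractArray{<:Number,3})
    C = similar(A, size(A, 1), size(B, 2), size(A, 3))
    for k in axes(A, 3)
        mul!(view(C, :, :, k), view(A, :, :, k), view(B, :, :, k))
    end
    return C
end
```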
ParametricDFT.classify_manifold — Function
Classify a tensor as UnitaryManifold or PhaseManifold based on unitarity.
ParametricDFT.group_by_manifold — Method
Group tensor indices by manifold type. Returns Dict{AbstractRiemannianManifold, Vector{Int}}.
ParametricDFT.is_unitary_general — Method
is_unitary_general(t::AbstractMatrix)
Check if a square matrix satisfies U*U' ≈ I. Returns false for non-square matrices.
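The check amounts to something like the following sketch (the tolerance is an assumption; the package may choose differently):

```julia
using LinearAlgebra

# Square and U*U' ≈ I within tolerance; non-square is never unitary here.
function is_unitary_sketch(t::AbstractMatrix; atol=1e-10)
    size(t, 1) == size(t, 2) || return false
    return isapprox(t * t', Matrix{eltype(t)}(I, size(t)...); atol=atol)
end
```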
ParametricDFT.project — Function
project(m::AbstractRiemannianManifold, points, euclidean_grads)
Project Euclidean gradients onto the tangent space at points. Batched over last dim.
ParametricDFT.project — Method
Batched U(1)^d projection: im * imag(conj(z).*g) .* z.
ParametricDFT.project — Method
Batched U(n) Riemannian projection: U * skew(U'G) on (d,d,n) arrays.
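Per slice, this is the standard "multiply by the skew-Hermitian part" formula; a single-matrix sketch (illustrative names), which the batched version maps over the last dimension of the (d,d,n) arrays:

```julia
# Project a Euclidean gradient G onto the tangent space of U(n) at U.
skew(X) = (X - X') / 2
project_unitary_sketch(U::AbstractMatrix, G::AbstractMatrix) = U * skew(U' * G)
```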
ParametricDFT.retract — Function
retract(m::AbstractRiemannianManifold, points, tangent_vec, α)
Retract from points along tangent_vec with step size α. Batched over last dim.
ParametricDFT.retract — Method
Batched U(1)^d retraction: normalize z + alpha*xi.
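A one-line sketch of that retraction (illustrative name): step in the tangent direction, then renormalise each entry back onto the unit circle.

```julia
# z .+ α .* ξ leaves the unit circle; dividing by the magnitude returns to it.
phase_retract_sketch(z, ξ, α) = (w = z .+ α .* ξ; w ./ abs.(w))
```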
ParametricDFT.retract — Method
Batched Cayley retraction on U(n): (I - α/2·W)⁻¹(I + α/2·W)·U where W = Ξ·U'. Pass I_batch to reuse a pre-allocated identity tensor and avoid repeated allocations.
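A single-slice sketch of this Cayley step (no pre-allocated identity, purely illustrative). Because W is skew-Hermitian when Ξ is a tangent vector, the Cayley transform is exactly unitary, so the result stays on U(n):

```julia
using LinearAlgebra

# (I - α/2·W)⁻¹ (I + α/2·W) · U with W = Ξ·U', solved rather than inverted.
function cayley_retract_sketch(U::AbstractMatrix, Ξ::AbstractMatrix, α::Real)
    W = Ξ * U'
    Id = Matrix{eltype(U)}(I, size(U)...)
    return (Id - (α / 2) * W) \ ((Id + (α / 2) * W) * U)
end
```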
ParametricDFT.stack_tensors! — Method
In-place version: pack into pre-allocated (d1, d2, n) array.
ParametricDFT.stack_tensors — Method
Pack selected matrices into a batched (d1, d2, n) array.
ParametricDFT.transport — Function
transport(m::AbstractRiemannianManifold, old_points, new_points, vec)
Parallel transport vec from old_points to new_points. Batched over last dim.
ParametricDFT.transport — Method
Batched U(1)^d parallel transport via re-projection.
ParametricDFT.transport — Method
Batched U(n) parallel transport via re-projection.
ParametricDFT.unstack_tensors! — Method
Unpack (d1, d2, n) array back into individual matrices.
Optimizers
ParametricDFT.AbstractRiemannianOptimizer — Type
Abstract base type for Riemannian optimizers.
ParametricDFT.OptimizationState — Type
OptimizationState{ET, RT}
Bundles shared loop state built by _common_setup. Holds manifold groupings, batched point/gradient buffers, the identity-batch cache for Cayley retraction, and a per-tensor Euclidean-gradient buffer that is reused across iterations.
ParametricDFT.RiemannianAdam — Type
Riemannian Adam optimizer (Becigneul & Ganea, 2019) with batched manifold operations.
ParametricDFT.RiemannianGD — Type
Riemannian gradient descent with Armijo backtracking line search.
ParametricDFT._batched_project — Method
_batched_project(manifold_groups, point_batches, grad_buf_batches, euclidean_grads)
Batched Riemannian projection. Returns (rg_batches, grad_norm).
ParametricDFT._common_setup — Method
_common_setup(tensors)
Build an OptimizationState from the initial tensor list. Groups tensors by manifold, stacks into batched arrays, allocates gradient buffers, and creates identity-batch caches for UnitaryManifold groups via _make_identity_batch.
ParametricDFT._compute_gradients! — Method
_compute_gradients!(buf, grad_fn, tensors)
Compute Euclidean gradients via grad_fn, writing into the pre-allocated buf::Vector{AbstractMatrix} (typically state.euclidean_grads_buf) to avoid per-iteration wrapper allocation. Returns buf on success, nothing on NaN/Inf (after logging which tensor carried the non-finite value).
ParametricDFT._compute_gradients — Method
_compute_gradients(grad_fn, tensors)
Allocating wrapper used outside the main optimization loop. Prefer _compute_gradients! inside the loop, where a buffer is already available.
ParametricDFT._init_optimizer_state — Method
_init_optimizer_state(opt::AbstractRiemannianOptimizer, state::OptimizationState)
Initialize per-optimizer state. Returns nothing for GD, a NamedTuple of moment/direction buffers for Adam.
ParametricDFT._optimization_loop — Method
_optimization_loop(opt, tensors, loss_fn, grad_fn; max_iter, tol, loss_trace)
Shared optimization loop. Delegates setup/gradient/convergence logic once, then calls _update_step! for per-optimizer behavior each iteration. Returns the optimized tensor vector.
ParametricDFT._update_step! — Method
_update_step!(opt, state, rg_batches, loss_fn, grad_norm_sq, opt_state, iter; cached_loss)
Per-optimizer update dispatch. Returns cached_loss::RT (NaN when not evaluated).
RiemannianGD: Armijo backtracking line search; evaluates the loss multiple times. Returns the accepted candidate loss, or RT(NaN) after exhausting line-search steps.
RiemannianAdam: moment update + retract; does not evaluate the loss. Returns RT(NaN).
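The Armijo rule used by RiemannianGD can be illustrated with a generic Euclidean sketch (names and constants are assumptions, and the real code retracts on the manifold rather than taking x .- α .* g):

```julia
# Shrink the step until f decreases by at least c1 * α * ‖g‖².
function armijo_search(f, x, g; α=1.0, c1=1e-4, shrink=0.5, max_steps=20)
    f0 = f(x)
    gnorm2 = sum(abs2, g)
    for _ in 1:max_steps
        cand = x .- α .* g
        if f(cand) <= f0 - c1 * α * gnorm2   # sufficient-decrease condition
            return cand, α
        end
        α *= shrink
    end
    return x, 0.0    # exhausted the line search: report failure, like RT(NaN)
end
```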
ParametricDFT.optimize! — Method
optimize!(opt::AbstractRiemannianOptimizer, tensors, loss_fn, grad_fn; max_iter=100, tol=1e-6, loss_trace=nothing)
Run Riemannian optimization on circuit tensors. Dispatches to _optimization_loop which uses per-optimizer hooks (_init_optimizer_state, _update_step!). Returns optimized tensors.
When loss_trace::Vector{Float64} is provided, per-iteration losses are appended to it.
Training Pipeline
ParametricDFT._compute_validation_loss — Method
Compute average loss over validation set. Uses batched path when available.
ParametricDFT._cosine_with_warmup — Method
_cosine_with_warmup(step, total_steps; warmup_frac, lr_peak, lr_final)
Linear warmup followed by cosine decay. step is 0-indexed global step; warmup_frac ∈ (0, 1) sets the warmup portion of total steps.
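A sketch of such a schedule; the exact ramp endpoints and the (step + 1) offset are assumptions, not necessarily what the package does:

```julia
# Linear warmup to lr_peak, then cosine decay down to lr_final.
function cosine_with_warmup(step::Integer, total_steps::Integer;
                            warmup_frac=0.1, lr_peak=1e-2, lr_final=1e-4)
    warmup = max(1, round(Int, warmup_frac * total_steps))
    if step < warmup
        return lr_peak * (step + 1) / warmup       # linear ramp up to the peak
    end
    t = (step - warmup) / max(1, total_steps - warmup)  # decay progress in [0,1]
    return lr_final + (lr_peak - lr_final) * (1 + cos(π * t)) / 2
end
```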
ParametricDFT._save_checkpoint — Method
Save a training checkpoint (basis + loss history).
ParametricDFT._save_loss_history — Method
Save training loss history to JSON.
ParametricDFT._train_basis_core — Method
Core training loop shared by all basis types. Returns (final_tensors, best_val_loss, train_losses, val_losses, step_train_losses). Uses optimize! from optimizers.jl for all optimization (GPU and CPU). Supports optimizers: RiemannianGD(), RiemannianAdam(), or symbols :gradient_descent, :adam.
ParametricDFT.load_loss_history — Method
Load training loss history from JSON. Returns a TrainingHistory object.
ParametricDFT.save_loss_history — Method
Save training history (from train_basis) to a JSON file.
ParametricDFT.to_cpu — Method
Move array to CPU.
ParametricDFT.to_device — Method
Move array to device. :gpu requires CUDA.jl via CUDAExt.
ParametricDFT.train_basis — Method
train_basis(::Type{B}, dataset; m, n, loss, epochs, steps_per_image, optimizer, batch_size, device, ...)
Train any AbstractSparseBasis subtype on images. Returns (basis, history). Basis-specific kwargs (e.g. phases, entangle_phases) are forwarded to _init_circuit and _build_basis.
ParametricDFT.train_basis_from_files — Function
Train QFTBasis from image files. Requires Images.jl.
Einsum Cache
ParametricDFT._cache_key — Method
_cache_key(flat_code, size_dict)
Compute a stable hash key for an einsum code + size dict combination.
ParametricDFT.optimize_code_cached — Function
optimize_code_cached(flat_code, size_dict, optimizer=TreeSA())
Like optimize_code(flat_code, size_dict, optimizer) but caches the result to disk. On cache hit, returns immediately without running the optimizer.
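The hit/miss pattern can be sketched with a generic hash-keyed disk memoiser (Serialization-based; the key derivation and file layout here are illustrative, not the package's):

```julia
using Serialization

# Run f() once per key; later calls with the same key read the stored result.
function cached_compute(f, key::UInt, dir::String)
    mkpath(dir)
    path = joinpath(dir, string(key, base=16) * ".jls")
    isfile(path) && return deserialize(path)   # cache hit: skip recomputation
    result = f()
    serialize(path, result)
    return result
end
```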
ParametricDFT.set_einsum_cache_dir! — Method
set_einsum_cache_dir!(dir::String)
Set the directory used for caching optimized einsum codes.