Training

Loss Functions

ParametricDFT.MSELossType

MSE loss with top-k truncation: ||x - T⁻¹(truncate(T(x), k))||². Field k is the number of kept coefficients.

source
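
To make the objective concrete, here is a self-contained illustration (not package code): an ordinary FFT stands in for the learned transform T, and the magnitude-based truncation is written inline, mirroring topk_truncate documented below.

using FFTW, LinearAlgebra

# Illustration of ‖x − T⁻¹(truncate(T(x), k))‖² with fft/ifft standing in for T.
x = rand(8, 8)
k = 10
X = fft(x)                                         # forward transform T(x)
thresh = partialsort(vec(abs.(X)), k; rev=true)    # k-th largest coefficient magnitude
Xk = X .* (abs.(X) .>= thresh)                     # zero all but the k largest coefficients
mse = norm(x .- real(ifft(Xk)))^2                  # reconstruction error the loss measures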
ParametricDFT._topk_maskMethod
_topk_mask(x::AbstractMatrix, k::Int) -> BitMatrix

Compute a boolean mask selecting the k elements of x with the largest absolute values. Uses quickselect (O(n) average) to find the threshold magnitude, then a broadcast comparison.

source
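
A minimal sketch of the idea, using Base's partialsort in place of whatever quickselect routine the package uses; names are illustrative, not the package's implementation.

# Quickselect finds the k-th largest magnitude in O(n) average time,
# then a broadcast comparison builds the mask.
function topk_mask_sketch(x::AbstractMatrix, k::Int)
    thresh = partialsort(vec(abs.(x)), k; rev=true)   # k-th largest |entry|
    return abs.(x) .>= thresh                         # BitMatrix; ties at the threshold may keep a few extra entries
end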
ParametricDFT.batched_topk_truncateMethod
batched_topk_truncate(x_batched::AbstractArray{T,N}, m::Int, n::Int, k::Integer)

Apply per-image top-k truncation to a batched frequency-domain tensor of shape (2, 2, …, 2, B) (with m + n qubit dims). Returns a tensor of the same shape with all but the k largest-magnitude entries of each image zeroed.

The mask is content-dependent — each image can keep a different set of coefficients — so on CPU this falls back to a per-image loop. GPU specializations in ext/CUDAExt.jl compute all B masks in a single sort call.

source
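
A sketch of the CPU fallback described above; it only illustrates the per-image loop, not the package's actual code.

# Each image gets its own content-dependent mask, so truncation is applied
# in a loop over the batch (last) dimension.
function batched_topk_truncate_sketch(x_batched::AbstractArray, m::Int, n::Int, k::Integer)
    B   = size(x_batched)[end]
    mat = reshape(x_batched, 2^m * 2^n, B)            # flatten the qubit dims per image
    out = zeros(eltype(mat), size(mat))
    for b in 1:B
        col    = view(mat, :, b)
        thresh = partialsort(abs.(col), k; rev=true)  # k-th largest magnitude of this image
        keep   = abs.(col) .>= thresh
        out[keep, b] .= col[keep]
    end
    return reshape(out, size(x_batched))
end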
ParametricDFT.loss_functionMethod
loss_function(tensors, m, n, optcode, pic::AbstractMatrix, loss; inverse_code=nothing)

Compute loss for a single image pic (2^m x 2^n) under the given circuit parameters.

source
ParametricDFT.loss_functionMethod
loss_function(tensors, m, n, optcode, pics::Vector{<:AbstractMatrix}, loss; inverse_code=nothing, batched_optcode=nothing)

Average loss over a batch of images. Uses batched einsum if batched_optcode is provided.

source
ParametricDFT.topk_truncateMethod
topk_truncate(x::AbstractMatrix, k::Integer)

Magnitude-based top-k truncation: keeps the k coefficients with the largest absolute values, zeroing the rest. This is basis-agnostic — it does not assume any particular frequency layout.

source
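
For example (assuming ParametricDFT is loaded; the result is shown as a comment):

using ParametricDFT

x = [3.0 -0.1; 0.5 -2.0]
ParametricDFT.topk_truncate(x, 2)   # keeps 3.0 and -2.0, zeros the two smaller entries
# 2×2 Matrix{Float64}:
#  3.0   0.0
#  0.0  -2.0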

Manifolds

ParametricDFT._make_identity_batchMethod
_make_identity_batch(::Type{T}, d::Int, n::Int) -> Array{T,3}

Create a (d, d, n) array of identity matrices. Used by optimizers for Cayley retraction pre-allocation and as a fallback in retract(::UnitaryManifold, ...).

source
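
A one-line sketch of the helper's effect (assumption: plain replication of a dense identity; not the package code).

using LinearAlgebra

# A (d, d, n) stack of identity matrices of element type T.
make_identity_batch_sketch(::Type{T}, d::Int, n::Int) where {T} =
    repeat(reshape(Matrix{T}(I, d, d), d, d, 1), 1, 1, n)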
ParametricDFT.batched_invMethod
batched_inv(A::AbstractArray{T,3})

Batched matrix inverse: C[:,:,k] = inv(A[:,:,k]) for each slice k. Uses LU factorization for general matrices.

source
ParametricDFT.batched_matmulMethod
batched_matmul(A::AbstractArray{T,3}, B::AbstractArray{T,3})

Batched matrix multiply: C[:,:,k] = A[:,:,k] * B[:,:,k] for each slice k.

source
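
A slice-by-slice reference sketch of the semantics above; the package, and especially its GPU extension, may use a fused batched kernel instead.

using LinearAlgebra

# C[:,:,k] = A[:,:,k] * B[:,:,k] for each slice k.
function batched_matmul_sketch(A::AbstractArray{T,3}, B::AbstractArray{T,3}) where {T}
    C = similar(A, size(A, 1), size(B, 2), size(A, 3))
    for k in axes(A, 3)
        mul!(view(C, :, :, k), view(A, :, :, k), view(B, :, :, k))
    end
    return C
end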
ParametricDFT.projectFunction
project(m::AbstractRiemannianManifold, points, euclidean_grads)

Project Euclidean gradients onto the tangent space at points. Batched over last dim.

source
ParametricDFT.retractFunction
retract(m::AbstractRiemannianManifold, points, tangent_vec, α)

Retract from points along tangent_vec with step size α. Batched over last dim.

source
ParametricDFT.retractMethod

Batched Cayley retraction on U(n): (I - α/2·W)⁻¹(I + α/2·W)·U where W = Ξ·U'. Pass I_batch to reuse a pre-allocated identity tensor and avoid repeated allocations.

source
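
A single-matrix sketch of the quoted formula (the package version operates on (d, d, B) batches and can reuse I_batch).

using LinearAlgebra

# Cayley retraction on U(n) for one point U and tangent vector Ξ:
#   W = Ξ·U',   retract = (I - α/2·W)⁻¹ (I + α/2·W) U
function cayley_retract_sketch(U::AbstractMatrix, Ξ::AbstractMatrix, α::Real)
    W  = Ξ * U'
    Id = Matrix{eltype(U)}(I, size(U)...)
    return (Id - (α / 2) * W) \ ((Id + (α / 2) * W) * U)
end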
ParametricDFT.transportFunction
transport(m::AbstractRiemannianManifold, old_points, new_points, vec)

Parallel transport vec from old_points to new_points. Batched over last dim.

source

Optimizers

ParametricDFT.OptimizationStateType
OptimizationState{ET, RT}

Bundles shared loop state built by _common_setup. Holds manifold groupings, batched point/gradient buffers, the identity-batch cache for Cayley retraction, and a per-tensor Euclidean-gradient buffer that is reused across iterations.

source
ParametricDFT._batched_projectMethod
_batched_project(manifold_groups, point_batches, grad_buf_batches, euclidean_grads)

Batched Riemannian projection. Returns (rg_batches, grad_norm).

source
ParametricDFT._common_setupMethod
_common_setup(tensors)

Build an OptimizationState from the initial tensor list. Groups tensors by manifold, stacks into batched arrays, allocates gradient buffers, and creates identity-batch caches for UnitaryManifold groups via _make_identity_batch.

source
ParametricDFT._compute_gradients!Method
_compute_gradients!(buf, grad_fn, tensors)

Compute Euclidean gradients via grad_fn, writing into the pre-allocated buf::Vector{AbstractMatrix} (typically state.euclidean_grads_buf) to avoid per-iteration wrapper allocation. Returns buf on success, nothing on NaN/Inf (after logging which tensor carried the non-finite value).

source
ParametricDFT._compute_gradientsMethod
_compute_gradients(grad_fn, tensors)

Allocating wrapper used outside the main optimization loop. Prefer _compute_gradients! inside the loop, where a buffer is already available.

source
ParametricDFT._init_optimizer_stateMethod
_init_optimizer_state(opt::AbstractRiemannianOptimizer, state::OptimizationState)

Initialize per-optimizer state. Returns nothing for GD and a NamedTuple of moment/direction buffers for Adam.

source
ParametricDFT._optimization_loopMethod
_optimization_loop(opt, tensors, loss_fn, grad_fn; max_iter, tol, loss_trace)

Shared optimization loop. Handles setup, gradient evaluation, and convergence checking in one place, then calls _update_step! each iteration for per-optimizer behavior. Returns the optimized tensor vector.

source
ParametricDFT._update_step!Method
_update_step!(opt, state, rg_batches, loss_fn, grad_norm_sq, opt_state, iter; cached_loss)

Per-optimizer update dispatch. Returns cached_loss::RT (NaN when not evaluated).

  • RiemannianGD: Armijo backtracking line search (a generic sketch follows this entry); evaluates the loss multiple times. Returns the accepted candidate loss, or RT(NaN) after exhausting line-search steps.
  • RiemannianAdam: moment update + retract, does not evaluate loss. Returns RT(NaN).
source
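
A generic sketch of the Armijo backtracking used by RiemannianGD, with the loss and retraction passed in as closures; the function name and keyword defaults are illustrative, not the package's internals.

# Accepts a descent direction and its squared gradient norm; shrinks the step until
# the Armijo sufficient-decrease condition holds, mirroring the behavior described above.
function armijo_step_sketch(loss, retract, points, direction, grad_norm_sq;
                            α0=1.0, shrink=0.5, c=1e-4, max_backtracks=20)
    f0 = loss(points)
    α  = α0
    for _ in 1:max_backtracks
        candidate = retract(points, direction, α)
        f = loss(candidate)
        if f <= f0 - c * α * grad_norm_sq      # sufficient decrease relative to ‖grad‖²
            return candidate, f                # accepted step and its loss
        end
        α *= shrink                            # shrink the step and retry
    end
    return points, NaN                         # line search exhausted
end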
ParametricDFT.optimize!Method
optimize!(opt::AbstractRiemannianOptimizer, tensors, loss_fn, grad_fn; max_iter=100, tol=1e-6, loss_trace=nothing)

Run Riemannian optimization on circuit tensors. Dispatches to _optimization_loop which uses per-optimizer hooks (_init_optimizer_state, _update_step!). Returns optimized tensors.

When loss_trace::Vector{Float64} is provided, per-iteration losses are appended to it.

source

Training Pipeline

ParametricDFT._cosine_with_warmupMethod
_cosine_with_warmup(step, total_steps; warmup_frac, lr_peak, lr_final)

Linear warmup followed by cosine decay. step is the 0-indexed global step; warmup_frac ∈ (0, 1) sets the fraction of total_steps spent in warmup.

source
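
A generic sketch of such a schedule; the argument names follow the docstring, while the default values are purely illustrative.

# Linear warmup to lr_peak over warmup_frac of the run, then cosine decay to lr_final.
function cosine_with_warmup_sketch(step, total_steps; warmup_frac=0.1, lr_peak=1e-3, lr_final=1e-5)
    warmup_steps = max(1, round(Int, warmup_frac * total_steps))
    if step < warmup_steps
        return lr_peak * (step + 1) / warmup_steps                  # linear warmup (step is 0-indexed)
    end
    t = (step - warmup_steps) / max(1, total_steps - warmup_steps)  # decay progress in [0, 1]
    return lr_final + (lr_peak - lr_final) * (1 + cos(π * t)) / 2   # cosine decay to lr_final
end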
ParametricDFT._train_basis_coreMethod

Core training loop shared by all basis types. Returns (final_tensors, best_val_loss, train_losses, val_losses, step_train_losses). Uses optimize! from optimizers.jl for all optimization (GPU and CPU). Accepts the optimizers RiemannianGD() and RiemannianAdam(), or the symbols :gradient_descent and :adam.

source
ParametricDFT.train_basisMethod
train_basis(::Type{B}, dataset; m, n, loss, epochs, steps_per_image,
            optimizer, batch_size, device, ...)

Train any AbstractSparseBasis subtype on images. Returns (basis, history). Basis-specific kwargs (e.g. phases, entangle_phases) are forwarded to _init_circuit and _build_basis.

source

Einsum Cache

ParametricDFT.optimize_code_cachedFunction
optimize_code_cached(flat_code, size_dict, optimizer=TreeSA())

Like optimize_code(flat_code, size_dict, optimizer) but caches the result to disk. On cache hit, returns immediately without running the optimizer.

source
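
A hedged usage sketch, assuming flat_code is an OMEinsum einsum specification as accepted by optimize_code; OMEinsum's ein string macro, uniformsize, and TreeSA are used only for illustration.

using OMEinsum, ParametricDFT

# The first call runs TreeSA and writes the result to the on-disk cache; later calls
# with the same code and size dictionary return the cached contraction order.
code      = ein"ij,jk,kl->il"
size_dict = uniformsize(code, 2)
optcode   = ParametricDFT.optimize_code_cached(code, size_dict, TreeSA())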