# Number-Theoretic Transform (NTT)

This competition track required us to build a Number Theoretic Transform (NTT) accelerator capable of performing transforms of size $2^24$. NTTs are conceptually similar to the Fourier Transforms - working over a finite field instead of complex numbers. For this project, the finite field contains values of size 64 bits modulo a so called Solinas prime equal to $2^64 - 2^32 + 1$.

The mathematical formulation of the NTT is as the following:

\[X_{i} = ∑↙{j=0}↖{N-1} x_{j} w^{ij} (mod P)\]where $N$ is the size of the input vector $x$, $P$ is the Solinas prime and $w$ is the $N$th root of unity. $X_{i}$ is defined for $0 ≤ i < N$.

The main design challenge is that such large transforms cannot fit within a single FPGA’s on-chip RAM resources, so we need to make efficient use of large external DRAM resources.

The platform targeted is the Xilinx Varium C1100 accelerator card. The card contains a Virtex UltrasScale+ FPGA with HBM2.

# Design overview

Our work is built around the 4-step algorithm to break down a large NTT into much smaller NTTs. The smaller NTTs are computed using the well-known decimation in time Cooley-Tukey FFT algorithm. The following pages discuss the algorithms and design considerations:

- 4-step details the 4-step algorithm
- Bandwidth Considerations

You can read more about our FPGA implementation in the following pages:

- core NTT design
- scaling Up performance with shared controllers and a wide memory bus
- top level Vitis design

We present the performance results for our design and show how you can build the design.

# Hardcaml on the Web

Configure our designs, download RTL and perform simulations all within your browser:

- Core NTT design which includes the IO RAMs, datapath and controller
- Top level design as a Vitis kernel which performs the full 4-step algorithm

# Code structure

The code is built from a couple of libraries within our submission repository:

The `Hardcaml_ntt`

library provides the base implementation of the NTT core and includes a software model and various unit tests.

`Zprize_ntt`

contains the code for both the top-level
Hardcaml NTT kernel and
tests, the
C++ HLS DMA kernel,
host benchmarking software, and
build scripts.