Field_ops_lib.Bram_reduce
Module to reduce numbers modulo a fixed constant with just two subtraction stages.
Given a fixed modular base \(p\) with \(n\) bits and a multiplicative error \(E\) with \(e\) bits, this module can reduce any input \(N\in[0, Ep)\) to an equivalent value modulo \(p\) in the range \([0, p)\) with just two subtraction stages.
The module accomplishes this by precomputing and loading a ROM with \(2^e\) entries. We set \(R[0] = 0\). Then, for \(i\geq1\), entry \(R[i]\) holds the final \(n\) bits of the largest multiple of \(p\) that has \(i-1\) as its \(e\)-bit prefix when written with \(e+n\) bits. In particular,
$$R[i] = \left\lfloor \frac{(i-1)\cdot 2^n}{p} \right\rfloor\pmod {2^{n}} $$Then, given an input \(N\in[0,Ep)\) with \(e+n\) bits, we can lookup its \(e\)-bit prefix in the ROM and subtract the resulting value from the \(n\)-bit suffix of our input value:
$$ N[n-1:0] - R\left[N[e-1+n:n]\right] $$(note that we only need to use the lower \(n\) bits because we know what the higher bits are by construction). Then, when the lookup index is nonzero, we have to invert the MSB of the signed result to get an unsigned reduction \(n^\prime\) to the range \([0,3p)\).
After this, we do one more subtraction stage where we just multiplex between \(\{n^\prime, n^\prime-p, n^\prime-2p\}\) to choose the final result.
This module can be generalized over any modular base \(p\) (not necessarily prime), rather than just our given prime. However, this requires a bit of care to double check the bounds above - in our case, the first ROM reduction goes to the range \([0,3p)\), but for some bases, it is possible that the reduction only goes to \([0,4p)\) and requires one more subtractor in the subsequent stage.