257 lines
8.1 KiB
Plaintext
257 lines
8.1 KiB
Plaintext
|
|
|
|
(* Copyright (c) 2011 Stefan Krah. All rights reserved. *)
|
|
|
|
|
|
The Matrix Fourier Transform:
|
|
=============================
|
|
|
|
In libmpdec, the Matrix Fourier Transform [1] is called four-step transform
|
|
after a variant that appears in [2]. The algorithm requires that the input
|
|
array can be viewed as an R*C matrix.
|
|
|
|
All operations are done modulo p. For readability, the proofs drop all
|
|
instances of (mod p).
|
|
|
|
|
|
Algorithm four-step (forward transform):
|
|
----------------------------------------
|
|
|
|
a := input array
|
|
d := len(a) = R * C
|
|
p := prime
|
|
w := primitive root of unity of the prime field
|
|
r := w**((p-1)/d)
|
|
A := output array
|
|
|
|
1) Apply a length R FNT to each column.
|
|
|
|
2) Multiply each matrix element (addressed by j*C+m) by r**(j*m).
|
|
|
|
3) Apply a length C FNT to each row.
|
|
|
|
4) Transpose the matrix.
|
|
|
|
|
|
Proof (forward transform):
|
|
--------------------------
|
|
|
|
The algorithm can be derived starting from the regular definition of
|
|
the finite-field transform of length d:
|
|
|
|
d-1
|
|
,----
|
|
\
|
|
A[k] = | a[l] * r**(k * l)
|
|
/
|
|
`----
|
|
l = 0
|
|
|
|
|
|
The sum can be rearranged into the sum of the sums of columns:
|
|
|
|
C-1 R-1
|
|
,---- ,----
|
|
\ \
|
|
= | | a[i * C + j] * r**(k * (i * C + j))
|
|
/ /
|
|
`---- `----
|
|
j = 0 i = 0
|
|
|
|
|
|
Extracting a constant from the inner sum:
|
|
|
|
C-1 R-1
|
|
,---- ,----
|
|
\ \
|
|
= | r**k*j * | a[i * C + j] * r**(k * i * C)
|
|
/ /
|
|
`---- `----
|
|
j = 0 i = 0
|
|
|
|
|
|
Without any loss of generality, let k = n * R + m,
|
|
where n < C and m < R:
|
|
|
|
C-1 R-1
|
|
,---- ,----
|
|
\ \
|
|
A[n*R+m] = | r**(R*n*j) * r**(m*j) * | a[i*C+j] * r**(R*C*n*i) * r**(C*m*i)
|
|
/ /
|
|
`---- `----
|
|
j = 0 i = 0
|
|
|
|
|
|
Since r = w ** ((p-1) / (R*C)):
|
|
|
|
a) r**(R*C*n*i) = w**((p-1)*n*i) = 1
|
|
|
|
b) r**(C*m*i) = w**((p-1) / R) ** (m*i) = r_R ** (m*i)
|
|
|
|
c) r**(R*n*j) = w**((p-1) / C) ** (n*j) = r_C ** (n*j)
|
|
|
|
r_R := root of the subfield of length R.
|
|
r_C := root of the subfield of length C.
|
|
|
|
|
|
C-1 R-1
|
|
,---- ,----
|
|
\ \
|
|
A[n*R+m] = | r_C**(n*j) * [ r**(m*j) * | a[i*C+j] * r_R**(m*i) ]
|
|
/ ^ /
|
|
`---- | `---- 1) transform the columns
|
|
j = 0 | i = 0
|
|
^ |
|
|
| `-- 2) multiply
|
|
|
|
|
`-- 3) transform the rows
|
|
|
|
|
|
Note that the entire RHS is a function of n and m and that the results
|
|
for each pair (n, m) are stored in Fortran order.
|
|
|
|
Let the term in square brackets be f(m, j). Step 1) and 2) precalculate
|
|
the term for all (m, j). After that, the original matrix is now a lookup
|
|
table with the mth element in the jth column at location m * C + j.
|
|
|
|
Let the complete RHS be g(m, n). Step 3) does an in-place transform of
|
|
length n on all rows. After that, the original matrix is now a lookup
|
|
table with the mth element in the nth column at location m * C + n.
|
|
|
|
But each (m, n) pair should be written to location n * R + m. Therefore,
|
|
step 4) transposes the result of step 3).
|
|
|
|
|
|
|
|
Algorithm four-step (inverse transform):
|
|
----------------------------------------
|
|
|
|
A := input array
|
|
d := len(A) = R * C
|
|
p := prime
|
|
d' := d**(p-2) # inverse of d
|
|
w := primitive root of unity of the prime field
|
|
r := w**((p-1)/d) # root of the subfield
|
|
r' := w**((p-1) - (p-1)/d) # inverse of r
|
|
a := output array
|
|
|
|
0) View the matrix as a C*R matrix.
|
|
|
|
1) Transpose the matrix, producing an R*C matrix.
|
|
|
|
2) Apply a length C FNT to each row.
|
|
|
|
3) Multiply each matrix element (addressed by i*C+n) by r**(i*n).
|
|
|
|
4) Apply a length R FNT to each column.
|
|
|
|
|
|
Proof (inverse transform):
|
|
--------------------------
|
|
|
|
The algorithm can be derived starting from the regular definition of
|
|
the finite-field inverse transform of length d:
|
|
|
|
d-1
|
|
,----
|
|
\
|
|
a[k] = d' * | A[l] * r' ** (k * l)
|
|
/
|
|
`----
|
|
l = 0
|
|
|
|
|
|
The sum can be rearranged into the sum of the sums of columns. Note
|
|
that at this stage we still have a C*R matrix, so C denotes the number
|
|
of rows:
|
|
|
|
R-1 C-1
|
|
,---- ,----
|
|
\ \
|
|
= d' * | | a[j * R + i] * r' ** (k * (j * R + i))
|
|
/ /
|
|
`---- `----
|
|
i = 0 j = 0
|
|
|
|
|
|
Extracting a constant from the inner sum:
|
|
|
|
R-1 C-1
|
|
,---- ,----
|
|
\ \
|
|
= d' * | r' ** (k*i) * | a[j * R + i] * r' ** (k * j * R)
|
|
/ /
|
|
`---- `----
|
|
i = 0 j = 0
|
|
|
|
|
|
Without any loss of generality, let k = m * C + n,
|
|
where m < R and n < C:
|
|
|
|
R-1 C-1
|
|
,---- ,----
|
|
\ \
|
|
A[m*C+n] = d' * | r' ** (C*m*i) * r' ** (n*i) * | a[j*R+i] * r' ** (R*C*m*j) * r' ** (R*n*j)
|
|
/ /
|
|
`---- `----
|
|
i = 0 j = 0
|
|
|
|
|
|
Since r' = w**((p-1) - (p-1)/d) and d = R*C:
|
|
|
|
a) r' ** (R*C*m*j) = w**((p-1)*R*C*m*j - (p-1)*m*j) = 1
|
|
|
|
b) r' ** (C*m*i) = w**((p-1)*C - (p-1)/R) ** (m*i) = r_R' ** (m*i)
|
|
|
|
c) r' ** (R*n*j) = r_C' ** (n*j)
|
|
|
|
d) d' = d**(p-2) = (R*C) ** (p-2) = R**(p-2) * C**(p-2) = R' * C'
|
|
|
|
r_R' := inverse of the root of the subfield of length R.
|
|
r_C' := inverse of the root of the subfield of length C.
|
|
R' := inverse of R
|
|
C' := inverse of C
|
|
|
|
|
|
R-1 C-1
|
|
,---- ,---- 2) transform the rows of a^T
|
|
\ \
|
|
A[m*C+n] = R' * | r_R' ** (m*i) * [ r' ** (n*i) * C' * | a[j*R+i] * r_C' ** (n*j) ]
|
|
/ ^ / ^
|
|
`---- | `---- |
|
|
i = 0 | j = 0 |
|
|
^ | `-- 1) Transpose input matrix
|
|
| `-- 3) multiply to address elements by
|
|
| i * C + j
|
|
`-- 3) transform the columns
|
|
|
|
|
|
|
|
Note that the entire RHS is a function of m and n and that the results
|
|
for each pair (m, n) are stored in C order.
|
|
|
|
Let the term in square brackets be f(n, i). Without step 1), the sum
|
|
would perform a length C transform on the columns of the input matrix.
|
|
This is a) inefficient and b) the results are needed in C order, so
|
|
step 1) exchanges rows and columns.
|
|
|
|
Step 2) and 3) precalculate f(n, i) for all (n, i). After that, the
|
|
original matrix is now a lookup table with the ith element in the nth
|
|
column at location i * C + n.
|
|
|
|
Let the complete RHS be g(m, n). Step 4) does an in-place transform of
|
|
length m on all columns. After that, the original matrix is now a lookup
|
|
table with the mth element in the nth column at location m * C + n,
|
|
which means that all A[k] = A[m * C + n] are in the correct order.
|
|
|
|
|
|
--
|
|
|
|
[1] Joerg Arndt: "Matters Computational"
|
|
http://www.jjj.de/fxt/
|
|
[2] David H. Bailey: FFTs in External or Hierarchical Memory
|
|
http://crd.lbl.gov/~dhbailey/dhbpapers/
|
|
|
|
|
|
|