Info Node: (fftw3.info)Real even/odd DFTs (cosine/sine transforms)

fftw3.info: Real even/odd DFTs (cosine/sine transforms)
More DFTs of Real Data
The Discrete Hartley Transform
The Halfcomplex-format DFT
Back to Software Index
2.5.2 Real even/odd DFTs (cosine/sine transforms)
-------------------------------------------------
The Fourier transform of a real-even function f(-x) = f(x) is
real-even, and i times the Fourier transform of a real-odd function
f(-x) = -f(x) is real-odd. Similar results hold for a discrete Fourier
transform, and thus for these symmetries the need for complex
inputs/outputs is entirely eliminated. Moreover, one gains a factor of
two in speed/space from the fact that the data are real, and an
additional factor of two from the even/odd symmetry: only the
non-redundant (first) half of the array need be stored. The result is
the real-even DFT ("REDFT") and the real-odd DFT ("RODFT"), also known
as the discrete cosine and sine transforms ("DCT" and "DST"),
respectively.
(In this section, we describe the 1d transforms; multi-dimensional
transforms are just a separable product of these transforms operating
along each dimension.)
Because of the discrete sampling, one has an additional choice: is
the data even/odd around a sampling point, or around the point halfway
between two samples? The latter corresponds to _shifting_ the samples
by _half_ an interval, and gives rise to several transform variants
denoted by REDFTab and RODFTab: a and b are 0 or 1, and indicate
whether the input (a) and/or output (b) are shifted by half a sample (1
means it is shifted). These are also known as types I-IV of the DCT
and DST, and all four types are supported by FFTW's r2r interface.(1)
The r2r kinds for the various REDFT and RODFT types supported by
FFTW, along with the boundary conditions at both ends of the _input_
array (`n' real numbers `in[j=0..n-1]'), are:
* `FFTW_REDFT00' (DCT-I): even around j=0 and even around j=n-1.
* `FFTW_REDFT10' (DCT-II, "the" DCT): even around j=-0.5 and even
around j=n-0.5.
* `FFTW_REDFT01' (DCT-III, "the" IDCT): even around j=0 and odd
around j=n.
* `FFTW_REDFT11' (DCT-IV): even around j=-0.5 and odd around j=n-0.5.
* `FFTW_RODFT00' (DST-I): odd around j=-1 and odd around j=n.
* `FFTW_RODFT10' (DST-II): odd around j=-0.5 and odd around j=n-0.5.
* `FFTW_RODFT01' (DST-III): odd around j=-1 and even around j=n-1.
* `FFTW_RODFT11' (DST-IV): odd around j=-0.5 and even around j=n-0.5.
Note that these symmetries apply to the "logical" array being
transformed; *there are no constraints on your physical input data*.
So, for example, if you specify a size-5 REDFT00 (DCT-I) of the data
abcde, it corresponds to the DFT of the logical even array abcdedcb of
size 8. A size-4 REDFT10 (DCT-II) of the data abcd corresponds to the
size-8 logical DFT of the even array abcddcba, shifted by half a sample.
All of these transforms are invertible. The inverse of R*DFT00 is
R*DFT00; of R*DFT10 is R*DFT01 and vice versa (these are often called
simply "the" DCT and IDCT, respectively); and of R*DFT11 is R*DFT11.
However, the transforms computed by FFTW are unnormalized, exactly like
the corresponding real and complex DFTs, so computing a transform
followed by its inverse yields the original array scaled by N, where N
is the _logical_ DFT size. For REDFT00, N=2(n-1); for RODFT00,
N=2(n+1); otherwise, N=2n.
Note that the boundary conditions of the transform output array are
given by the input boundary conditions of the inverse transform. Thus,
the above transforms are all inequivalent in terms of input/output
boundary conditions, even neglecting the 0.5 shift difference.
FFTW is most efficient when N is a product of small factors; note
that this _differs_ from the factorization of the physical size `n' for
REDFT00 and RODFT00! There is another oddity: `n=1' REDFT00 transforms
correspond to N=0, and so are _not defined_ (the planner will return
`NULL'). Otherwise, any positive `n' is supported.
For the precise mathematical definitions of these transforms as used
by FFTW, see Note: What FFTW Really Computes. (For people accustomed
to the DCT/DST, FFTW's definitions have a coefficient of 2 in front of
the cos/sin functions so that they correspond precisely to an even/odd
DFT of size N. Some authors also include additional multiplicative
factors of sqrt(2) for selected inputs and outputs; this makes the
transform orthogonal, but sacrifices the direct equivalence to a
symmetric DFT.)
Which type do you need?
.......................
Since the required flavor of even/odd DFT depends upon your problem,
you are the best judge of this choice, but we can make a few comments
on relative efficiency to help you in your selection. In particular,
R*DFT01 and R*DFT10 tend to be slightly faster than R*DFT11 (especially
for odd sizes), while the R*DFT00 transforms are sometimes
significantly slower (especially for even sizes).(2)
Thus, if only the boundary conditions on the transform inputs are
specified, we generally recommend R*DFT10 over R*DFT00 and R*DFT01 over
R*DFT11 (unless the half-sample shift or the self-inverse property is
significant for your problem).
If performance is important to you and you are using only small sizes
(say n<200), e.g. for multi-dimensional transforms, then you might
consider generating hard-coded transforms of those sizes and types that
you are interested in (Note: Generating your own code).
We are interested in hearing what types of symmetric transforms you
find most useful.
---------- Footnotes ----------
(1) There are also type V-VIII transforms, which correspond to a
logical DFT of _odd_ size N, independent of whether the physical size
`n' is odd, but we do not support these variants.
(2) R*DFT00 is sometimes slower in FFTW because we discovered that
the standard algorithm for computing this by a pre/post-processed real
DFT--the algorithm used in FFTPACK, Numerical Recipes, and other
sources for decades now--has serious numerical problems: it already
loses several decimal places of accuracy for 16k sizes. There seem to
be only two alternatives in the literature that do not suffer
similarly: a recursive decomposition into smaller DCTs, which would
require a large set of codelets for efficiency and generality, or
sacrificing a factor of 2 in speed to use a real DFT of twice the size.
We currently employ the latter technique for general n, as well as a
limited form of the former method: a split-radix decomposition when n
is odd (N a multiple of 4). For N containing many factors of 2, the
split-radix method seems to recover most of the speed of the standard
algorithm without the accuracy tradeoff.
automatically generated by info2www version 1.2