# Implementation of High Throughput FFT for Communication

Leenu Mathew Thomas<sup>1</sup>, R.Ramya<sup>2</sup>

<sup>1</sup>Student, Department of ECE, Sree Buddha College of Engineering, Kerala, India. <sup>2</sup>Associate Professor, Department of ECE, Sree Buddha College of Engineering, Kerala, India.

*Abstract*— Most of the wireless standards utilize OFDM technique to enhance the transmission rate which is the vital requirement that is to be met today. The increases in the throughput of FFT block in OFDM system which in turn improves the transmission rate of the system. This paper presents high throughput FFT processor based on Multi-path delay feedback architecture for OFDM systems. Mixed Radix- $2/2^2/2^3/2^4$  DIF FFT algorithm is used. A comparison is done for different pipeline architecture such as MDF, MDC and SDF using the same algorithm with respect to throughput. The design is fully coded in VHDL, simulated in Xilinx ISE 13.2 and is implemented on Virtex-5 FPGA.

#### Keywords— OFDM, FFT, FPGA

#### I. INTRODUCTION

In recent years the advancement in wireless communication is high. This necessitated the invention of communication system with higher data transmission rate, which is the key factor influencing the performance of communication systems. Transmission rate is directly associated with the system throughput. The demand for very high data rates has motivated the use of OFDM in wireless standards like IEEE 802.11, IEEE 802.16. Orthogonal Frequency Division Multiplexing is a multi-carrier digital transmission scheme that combines modulation and multiplexing. In an OFDM system, the input data sequence is first baseband modulated using modulation schemes like BPSK. The coded data symbols are then parallelised into N sub-streams. Each substream of coded data will then modulate a separate carrier through the Inverse Fast Fourier Transform (IFFT) modulation block. At the receiver the data symbols are obtained from the stream by using a FFT demodulator block. The FFT block is the main block of an OFDM system. The size of FFT depends on the number of subcarriers used in the wireless standard. Hence efficient implementation of FFT processor which have higher throughput is essential which in turn will improve the transmission rate of the system.

Among the various FFT architectures, pipeline architecture is the best choice for wireless system since it can provide high throughput rate with acceptable hardware cost. It is characterized with real-time, continuous processing of the data sequence. In addition to this, pipeline structure is highly regular, which can be easily scaled and parameterized when Hardware Description Language (HDL) is used. Pipeline FFT architecture can be divided into Single-path Delay Feedback [1]-[4] and Multi-path Delay Commutator Architectures [5], [6]. The SDF architecture has reduced hardware complexity and requires less memory, but hardware complexity is more. Contrary to single-path, the multi-path architecture has more number of input or output paths. Multi-path architecture achieves higher parallelism than regular pipeline architecture. MDC architecture can achieve higher throughput. This paper presents implementation of Multi-path Delay Feedback (MDF) architecture which combines SDF and MDC style. It is compared with other architectures based on throughput.

# II. MIXED RADIX FFT ALGORITHM

Mixed Radix- $2/2^2/2^3/2^4$  DIF FFT algorithm is used. The DIF algorithm takes complex number inputs in natural order and outputs in bit-reversed order. Radix- $2^4$ , a high radix algorithm, is the dominant kernel used which can reduce hardware complexity. 32-point FFT can't be realized using radix- $2^4$  only and so radix 2 algorithm is also used. Likewise, 64-point FFT by radix- $2^2/2^4$ , 128-point FFT by radix- $2^3/2^4$  and 256-point FFT by radix- $2^4/2^4$ . The mixed radix algorithm provides more flexibility in the selection of data points, regular butterfly structure and can be easily implemented.

# III. MDF FFT ARCHITECTURE

Fig.1 shows the block diagram of 256-point MDF architecture. The MDF architecture based FFT processor can deal with 1-2 simultaneous data sequence. The architecture includes 2 types of butterfly units BF1 and BF2, ROM-based multiplier, CSD1 multiplier, CSD2 multiplier, Shift registers and Switch unit. It consists of two paths. The upper path has 8 butterfly stages and lower path has 8 butterfly stages.

# A. Butterfly Unit

The BF1 and BF2 structure is shown in Fig.2 and Fig 3 respectively. The butterfly units does the calculation of the sum and difference of the 2 values given as input. In addition, BF2 does pre-multiplication of the data with -j when needed. It uses a simple swapping and negating logic of the real and imaginary parts of the input. Both units are controlled by signals for either storing the input into registers or performing the computation. Both butterflies share a control signal s, which controls four multiplexers. The signal s is driven by a bit in the counter output. This signal switches the butterfly between two different modes. When s is zero, the butterfly is in passing mode. When in passing mode, x1 and x2 are passed to z2 and z1 respectively. When s is one, the butterfly is in computation mode. In this mode, the butterfly operation is performed and the results are sent to the output signals.



Fig.1 256-point MDF Architecture.

Compared with BF1, BF2 has another control signal. The control signal t is used in BF2 to add some extra functionality to the butterfly when multiplication by -j is required.







Fig.3 BF2 structure

# B. CSD Multiplier

The FFT architecture needs fixed-coefficient multiplications in the case of radix-2<sup>4</sup> algorithm. If fixedcoefficient multiplications is realised with general multipliers, then extra costs is to be paid for chip area and power consumption. This means that implementing a constant multiplier will be enough to eliminate the need for the whole complex multipliers and the ROM to store the twiddle factor coefficients. The multiplication with a constant can be carried out by shifting and adding operation. To reduce the area and power consumption, the constant coefficient can be encoded such that it contains the fewest number of nonzero bits, which can be accomplished using CSD representation. Canonic signed digit is based on a ternary number system (1, 0, -1). It has 2 properties. One is the number of non zero digit is minimal and the other is no two consecutive digits are nonzero.

#### IV. RESULTS AND DISCUSSION

The simulation result for 32, 64,128 and 256 point FFT are shown in Fig.4, Fig.5, Fig.6, Fig.7 respectively.

# International Journal of Engineering Trends and Technology (IJETT) – Volume 14 Number 6 – Aug 2014

| Neve           | Ur.              | 1.00    |       | 90,000 m.   | 1.80    | Sec.            | PURY | 1     | 400  | 1      | 11,000   | 8.00                                     | here   | £200 H | P.      |
|----------------|------------------|---------|-------|-------------|---------|-----------------|------|-------|------|--------|----------|------------------------------------------|--------|--------|---------|
| g 616          | 1                |         |       |             | 1       | Π.              | 17   | m     |      | m.     | 11.1     |                                          | 111    | 11     |         |
| aren.          | 6.1              |         |       |             |         |                 |      |       |      |        |          |                                          |        |        |         |
| a tal,máti     | 10. C            |         |       |             |         |                 |      |       |      |        |          |                                          |        |        |         |
| - M - 04       | 1                | -       |       |             |         |                 |      | _     | 1    | 1      |          |                                          |        |        |         |
| N 10.03        | 4                |         |       |             |         |                 |      |       | T    |        |          |                                          |        |        |         |
| 1 m.con        | 1                |         | _     |             |         | _               |      |       | 1    |        |          |                                          |        |        |         |
| No.024         | 12 1             |         |       |             |         |                 |      |       | 1    |        |          |                                          |        |        | -       |
| 1 W            | 103000           |         | _     | . 1         |         |                 |      | -     | 281  | 1      |          |                                          | 1      |        | -       |
| Britten H +    | 100010           |         | -     | 4           |         |                 |      | -     | 311  | 1      |          |                                          |        |        | -       |
| · Went and     | 4                |         |       |             | 1       |                 |      |       |      |        |          |                                          | 1      |        |         |
| - M and (1294) | 1.               |         |       |             | 8       |                 |      |       |      | Τ.     |          |                                          | 1      |        |         |
| a Mentelle     | ditti            | . (61)  | 100   | 100 100     | i her   | 1000            | 0.50 | 10.05 | 1221 | 1 100  | 100 1    | 100 1 100                                | V INT  | 1051   | 100     |
| W reserved in  |                  | SHUL    | 24.0  | iunateuro   | 000     | (LOIL)          | 111  | 500   | 500  | (TAUA) | 0.000.00 | 0.000                                    | 00.001 | in the |         |
| Winds E.R.     |                  | ABAN    | 1110  | a a success | 000     | (3334           | ERU: | Jui   | 500  | TANK   |          | und root                                 | eciau  | E-MORT | 1000100 |
| All store W    | maannaannaa      | and a   | (IIII | in an an    | 200     | 1000            |      | -     | -    | 144    | سنبره    | un u |        | i.u.   | receip  |
| N Room Mark    | anternation int  | Ration. | 1000  |             | ine:    | Junit           | u    | (IIII | 500  | 1000   |          | u din                                    |        | i iu   | mu      |
| W mast fill    | TRADER OF THE OF | TORNE . | -     |             | 1 100-2 | 31.00           | 117  | -     |      | 1983   | Sec. ()  | NI 1981                                  | 0.00   | OK)    | 28 5    |
| Kinnell A      | DOMESTIC, MARK   |         | -     | THE         | 0.000   | 66 <sup>°</sup> |      |       | -    | (100)  | 510.15   | in 15 in                                 | 04     | 110    | 11.1    |
| a lease wat    | m.               | -       |       |             | 1.11    |                 |      |       | 18   |        | CALCU -  |                                          |        |        | 1       |

Fig.4 Simulation Result for 32-point MDF architecture

|             | 1-                   | 1.40m     | 1.004      | pure .      | put a          | 0.08%      | P.F.          | K.Mini    | p1.80%     | NRN        |
|-------------|----------------------|-----------|------------|-------------|----------------|------------|---------------|-----------|------------|------------|
| 101         | 4                    | 30.00     |            | 100         | 01 02          | 20 10      | 14 10         | 10.00     | 10 10      | 10 20      |
| 1.00        | 4                    |           |            |             |                |            | -             |           |            | 1.1.1      |
| la latinen  | à -                  |           |            |             |                | 1.00       |               |           |            |            |
| 194.00      | (A)                  |           |            |             |                | 1.1        |               |           |            |            |
| Name        | a                    |           |            |             |                | 1          |               |           |            |            |
| Maria       | 4                    |           |            |             |                | 1          |               |           |            |            |
| N           |                      | -         |            |             |                | 1.1        |               |           |            |            |
| M met.clist | a loop of            |           |            |             | 1              |            |               | 100.1     | 1          |            |
| Monthles.   | 100000               |           |            |             | 1              |            |               | Add (     | T          |            |
| M =1.054    | 1                    |           |            |             |                |            |               | 110X      |            |            |
| Mana, com   |                      |           |            |             | 1              |            |               | - 0       | 1          |            |
| M units     | ALL N                | 1041 (123 | 1002 (00.0 | les: les    | 10 1000 11 00  | 1102 (10.0 | (10 mg (24.8) | 122.2     | 1001.000   | (1994) (10 |
| Manager     | IDALAR MARK          | Second    | 0.00       | 64.00. 70.0 | III OLIVER AND | date: date | ADD DOL       | An das    | ALC: UNK   | 10.0000    |
| Managia     | Investmenter         | Summer    | 0.00 00    | R. (9       | 100 00 10 30 0 | 0000. (000 | ABRE: \$14.   | Att. (100 |            |            |
| ¥ mon, 1918 | International Sector | Summer    | 0.00 00    | alan da     |                | un (au     | and inc       | ALL GUL   |            | 10000      |
| M monital   | (CONTRACTOR)         | annuar    | 0.00       | COLUMN DA   |                | ALC: COM   | ARE - 144     | 146 (000  | to the set | NUCCESSI I |
| Manager     | INCOMPANY:           | 3000      | 0.300000   | 1           |                | 0.00       | iðt -         |           | \$61.108   | (30. 75    |
| Manacala    | 10000001,000         |           |            | [10         | 000.2000       |            |               | (00)      | (but (100  | De 11      |
| S were we   | 111                  | -         |            |             |                | 38         |               |           |            |            |
| Sec.        |                      | -         |            |             |                | - 10       |               |           |            |            |

Fig.5 Simulation Result for 64-point MDF architecture

| here          | 144                                     | picking policy size     | NY IINT          | (DA)ED a    | IX MIN       | 121,211 1 | (1000 tr                                  | 12.00in  |
|---------------|-----------------------------------------|-------------------------|------------------|-------------|--------------|-----------|-------------------------------------------|----------|
| 8.04          | 4                                       |                         | 10 10 10         | 12.12       | <b>B</b> (3) | 1 12      | 13 15                                     |          |
| 1.000         | +                                       |                         |                  |             |              |           |                                           |          |
| a bet make    | 1                                       |                         |                  |             |              |           |                                           |          |
| \$ 10,08      | 1. C                                    |                         |                  | 111         |              |           |                                           |          |
| Tank          | 1                                       |                         |                  | 1           |              |           |                                           |          |
| Magen.        | 1                                       |                         |                  | 1           |              |           |                                           |          |
| No.01         | 4                                       |                         |                  | I.          |              |           |                                           |          |
| Margaret.     | Discontante -                           |                         | 1                | -           |              | 1.682     | 1                                         | -        |
| W HALES       | 1100000000                              |                         | 1                |             |              | Sell )    | 1                                         |          |
| W mat (Site   | 1                                       |                         | 1                |             |              | 1         | 10                                        |          |
| West, and     |                                         |                         |                  |             |              | - X       | 1.1                                       |          |
| W among       | RAILS.                                  | MIN. OCK. SHEN. OPI. DR | \$ 400. WALL 191 | 10.01.1003  | 301.031      | HILLOW    | 100.100.1                                 | 101.38   |
| Wittener (114 | 100000000000000000000000000000000000000 | (Incompany)             | Companies        | utan (tee   | un (m        | .us. (au  | daman                                     | ntwinter |
| Witness W     | 1000000000                              | dan na dan manada       | . diminina a     | ahar pas    | dan yan      | 107 (10   | time of the                               | 100 AU   |
| Wann, tall    | 10.000                                  | 1.11000.0000            | Continues.       | 1000 1000   | unu din.     | 100 (000  | di ta | LEUXINI, |
| Mana G.A      | IRWINING                                | farmen and              | Continuent       | NUMBER (NOR | unu (na.     | .us :/uu  | damaa                                     | COLUMN   |
| \$ man, \$2   | (resource)                              | anasocodi xonoonin      | sty. (2000)      | sel montes  | (000000)     | (50       | 30 (8)                                    | 10. jii  |
| W how Ed.     | INNORO                                  | 303                     | and the second   | 1000        |              | 194       | 1811 (22.1                                | 00.0     |
| a test with   | auto:                                   |                         |                  | 103         |              |           |                                           |          |
| Lafe .        | 1100                                    |                         |                  | 10.0        |              |           |                                           |          |

Fig.6 Simulation Result for 128-point MDF architecture



Fig.7 Simulation Result for 256-point MDF architecture

For a 256-point FFT, inputs xin(0) ,xin(2)....xin(254) are given to the upper path and xin(1), xin(3), ....., xin(255) are given to lower path. An 8 bit counter controls the entire structure. There is 8 stages of butterfly unit in each of the parallel path. The first set of input sample x(0) is fed to upper path and x(1) is fed to the lower path in the first clock cycle. In the second clock cycle x(2) and x(3) is fed. Then in consecutive clock cycle rest of the samples are given and the last set of samples x(254) and x(255) is given at  $127^{th}$  clock cycle. The starting clock cycle at which output is obtained from each butterfly unit is as follows: For first butterfly unit at 65<sup>th</sup> clock cycle, For second butterfly unit at 97<sup>th</sup> clock cycle, For third butterfly unit at 113<sup>th</sup> clock cycle, For fourth butterfly unit at 121<sup>th</sup> clock cycle, For fifth butterfly unit at 125<sup>th</sup> clock cycle, For sixth butterfly unit at 127<sup>th</sup> clock cycle, For seventh butterfly unit at 128th clock cycle, For eighth butterfly unit at 128<sup>th</sup> clock cycle. The first output sample is obtained at 128<sup>th</sup> clock cycle and rest of the output samples are obtained at consecutive clock cycle and last output sample is obtained at 255<sup>th</sup> clock cycle.

Fig.8 shows the implementation setup of 256-point FFT architecture on Virtex-5 board



Fig.8 Implementation of 256-point MDF architecture

Table I shows the comparison of different architectures-SDF, MDC, MDF.

TABLE I COMPARISON OF DIFFERENT ARCHITECTURES

| Architecture | FFT<br>size | Min.<br>Period | Frequency<br>(MHz) | Throughput<br>(MS/s) |
|--------------|-------------|----------------|--------------------|----------------------|
| CDE          | 22          | (118)          | 26.70              | 26.70                |
| SDF          | 32          | 27.18          | 36.79              | 36.79                |
|              | 64          | 28.58          | 34.99              | 34.99                |
|              | 128         | 36.07          | 27.72              | 27.72                |
|              | 256         | 43.32          | 23.08              | 23.08                |
| MDC          | 32          | 22.21          | 45.03              | 90.05                |
|              | 64          | 25.04          | 39.94              | 79.88                |
|              | 128         | 28.83          | 34.68              | 69.36                |
|              | 256         | 35.58          | 28.11              | 56.22                |
| MDF          | 32          | 18.9           | 52.9               | 105.66               |
|              | 64          | 20.43          | 48.94              | 97.88                |
|              | 128         | 27.08          | 36.92              | 73.84                |
|              | 256         | 34.65          | 28.86              | 57.72                |

Fig.9 shows the comparison of different architecture based on throughput.



Fig.9 Comparison of different architectures

#### V. CONCLUSION

The Fast Fourier Transform is an inevitable module in OFDM systems, so it is to be designed in efficient way. The MDF architecture have high throughput than SDF and MDC architecture and it is found to meet the requirement of wireless standards. The architectures are fully coded in VHDL language. The simulation in done in ISim and a netlist is generated for FPGA configuration and is implemented in Virtex 5 FPGA board.

# ACKNOWLEDGMENT

Author wish to remark the great task carried out by the Xilinx and Virtex-5 user guide; and the author wish to thank Prof. Somi Sebastian and R. Ramya for their contribution in the design process.

#### REFERENCES

- [1] Shousheng He and Mats Torkelson, "Designing Pipeline FFT Processor for OFDM (de)Modulation", *IEEE International Symposium* on Signals, Systems, and Electronics, pp.257–262, 1998.
- [2] C. Fan, M. Lee, G. Su, "Efficient low multiplier cost 256-point FFT design with radix 2<sup>4</sup> SDF architecture", *IEEE Asia Pacific Conference* on Circuits and Systems, pp.1935–1938, 2006.
- [3] M.J.S Rangachar, "VLSI Implementation and Performance Analysis of Efficient Mixed-radix 8-2 FFT Algorithm with Bit Reversal for the Output Sequences", *Journal of Theoretical and Applied Information Technology* 2010.
- [4] C.Vennila, G.Lakshminarayanan, Seok-Bum Ko, "Dynamic Partial Reconfigurable FFT for OFDM based Communication Systems", *Springer Circuit System Signal processing*, p 1-18, October 2011.
- [5] Y.-N. Chang, "An efficient VLSI architecture for normal I/O order pipeline FFT design", *IEEE Trans. Circuits Syst. II, Exp. Briefs*, Vol. 55, No. 12, pp. 1234-1238, Dec. 2008.
- [6] Yang K.J, and Chuang G.C.H, "A MDC FFT Processor with Variable Length for MIMO-OFDM", IEEE *Transactions on VLSI systems*, Vol. 21, pp.720-731, 2013.