# A Critical Review on FPGA Implementation of Discrete Hartley Transform for Digital Image Processing Applications Manoj Kumar Gole<sup>1</sup>, Ambresh Patel<sup>2</sup>, and Sachin Bandewar<sup>3</sup> 1,2,3</sup>Department of ECE, Sri Satya Sai College of Engineering, RKDF University, Bhopal, M.P ABSTRACT: A Discrete Hartley Transform (DHT) algorithm can be efficiently implemented on a highly modular and parallel architecture having a regular structure is consisting of multiplier and adder. Discrete Hartley Transform (DHT) is one of the transform used for converting data in time domain into frequency domain using only real values. Many technologies have been investigated so far to reduce delay and area of the DHT system. This paper presents a comprehensive review of DHT for digital image processing using maximum combinational path delay (MCPD) and number of slice count of Discrete Hartley transform. A VLSI hardware comparison is also being presnted with respect to the state-of-the-art architectures. **KEYWORDS**: Discrete Hartley Transform (DHT), Urdhwa Multiplier, MCPD, and Number of slice ## I. INTRODUCTION Digital signal processing (DSP) includes processing of data in various domains based on their applications.DSP has vast applications in various fields such as space, medical, commercial, industrial and scientific [1]. Each requires processing of vast data for collecting useful information. Transform is a technique used in DSP for converting one form of data in another. A family of transform is available in DSP for data processing .Fourier analysis one of the oldest technique used in this family [2]. Fourier analysis is named after Jean baptiste joseph Fourier (1768-1830) a French mathematician and physicist. It was used for periodic continuous signals. Fourier series is a technique which decomposes a signal in time domain into a no. of sine and cosine waves in frequency domain. But it was not applicable for non-periodic signals. Then came Fourier transform into existence which removes the drawback of Fourier series and thus can be used for non-periodic continuous signals [3]. Fourier transform is a mathematical tool using integrals. But Fourier transform is not suitable for non-stationary signals. Since both transforms are not applicable for discrete signals, so there is a need for new transform for discrete signals. Discrete time Fourier transform (DTFT) is used for signals that extend from positive to negative infinity but are not periodic. DTFT is not used for periodic discrete signals so discrete Fourier transform (DFT) can into existence. DFT is a discrete numerical equivalent of FT using summation instead of integrals. DFT is used for signals that repeat themselves in periodic fashion extending from positive to negative infinity. FFT is improvement of DFT in which computation has becomes faster [4]. All the family members of Fourier till now works on complex values which requires large storage space and computationally complex in nature. So, now comes a new member of transform called Discrete Hartley transform (DHT) which converts real values into real values. Therefore, it needs lesser storage space and less computational complexity. There has been an expected rapidly growing interest in, and development of, secure communication techniques in relation to the activities of military services, banking systems and other systems where degree of secured speech signal transmission plays a major role. Scrambling is used to keep the secrecy of speech signal over unauthorized listeners. It is simply disordering of the speech signal so that it is no longer intelligible. The original speech signal can be recovered by the intended receiver through appropriate descrambling technique. Among speech scramblers, analog speech scramblers are considered due to their wide applicability. The scrambling techniques could be classified as time-domain and frequency-domain scrambling. In time-domain scrambling, speech signals are divided into small time interval units and these units are permuted [5]. As these units could be as small as just one sample, scrambling results in bandwidth expansion. This can lead to loss of signal out of band of the channel and thereby degrading the speech quality. In frequency-domain scrambling, speech signals are separated into several subbands and these subbands are then permuted. It ensures the original bandwidth is kept unchanged. In the frequency- domain, the first algorithms used were based on Fast Fourier Transform (FFT) technique, where the FFT coefficients are permuted frame to frame [6]. Techniques based on Discrete Cosine Transform [7], Hadamard Transform [8], Wavelet Transform [9], Principal Component Analysis [10] etc have also been subject of studies. The paper is organized as follows: Section II presents the discrete Hartley transform System Model, section III presents the Challenges and issues, section IV presents Literature review, Results and discussions are provided in Section V. Finally, in section VI conclusions are summarized. Discrete Hartley Transform is abbreviated for DHT and this transform was proposed by R. V. L. Hartley in 1942. DHT is the analogous to Fast Fourier transform which provides the only real value at any cost. The main difference from the DFT is that it transforms the real inputs to real outputs with no intrinsic involvement of complex value. DFT can be used to compute the DHT, and vice versa. In other words, Discrete Hartley transform is used to convert real values into real ones. It requires decomposition of data into stages using butterfly similar to FFT. But the butterfly used in DHT is quite different in terms of coefficients or multipliers. With the increase in number of DHT sequence length the number of coefficients is also increased simultaneously. Figure 1: Block Diagram of DHT Let $N \ge 4$ be a power of two. For any real input sequence $\{x(i) : i = 0,1,2,...$ $$X(k) = DHT(N)\{x(i)\}$$ $$= \sum_{i=0}^{N-1} x(i).cas[2ki\pi/N] \quad for k = 0,1.....N-1$$ Where cas(x) = cos(x) + sin(x) for k = 0,1,...N - 1 # • ALGORITHM FOR 16 POINT DHT- We present an implementation of fast DHT algorithm for a length N=1. There are six stages required to complete the butterfly design of N=16 length DHT. These stages include summing stages and coefficient multiplying stages. Before first stage the data sequence are arranged in bit reversed pattern by using any method like permutation. Then in the first stage the pairs of bit reversed patterns are added to form eight terms. In the second stage, one third of the terms are again added and subtracted to form further three terms. First two stages do not include any multiplication. Remaining terms are multiplied by the first coefficient. In the next stage again two new coefficients are introduced which is multiplied by the lower half of the third stage. In each stage multiplying of coefficients stage precedes its summing stage. After coefficient multiplication it is preceded by its summing stage to form the common terms used in the final stage. Last stage includes only summing of terms. Finally we get the transformed data sequence in order and do not need any permutation. #### II. CHALLENGES AND ISSUES Novel Structures for Cyclic Convolution Using Improved First-Order Moment Algorithm This paper first presented a decomposition scheme to reduce the computation time and make the firstorder moment-based cyclic convolution well suited for hardware implementation. By decomposing the fixed convolution kernel into similar subparts and using their preprocessing results as control signals, each subpart of cyclic convolution can be calculated with a basic computing substructure [11]. A Factorization Scheme for Some Discrete Hartley Transform Matrices. Discrete transforms such as the discrete Fourier transform (DFT) and the discrete Hartley transform (DHT) are important tools in numerical analysis, signal processing, and statistical methods. The successful application of transform techniques relies on the existence of efficient fast transforms [12]. Features extraction based on the discrete hartley transform for closed contour. In this paper the authors proposed a new closed contour descriptor that could be seen as a Feature Extractor of closed contours based on the Discrete Hartley Transform (DHT), its main characteristic is that uses only half of the coefficients required by Elliptical Fourier Descriptors (EFD) to obtain a contour approximation with similar error measure [13]. Discrete transforms such as the discrete Fourier transform (DFT) and the discrete Hartley transform (DHT) are important tools in numerical analysis, signal processing, and statistical methods. The successful application of transform techniques relies on the existence of efficient fast transforms. So, there is a strong need of algorithm for DHT which reduces delay and complexity as well as area. ### 3.1 Urdhwa Multiplier Vedic mathematics is an ancient fast calculation mathematics technique which is taken from historical ancient book of wisdom. Vedic mathematics is an ancient Vedic mathematics which provides the unique technique of mental calculation with the help of simple rules and principles. Swami Bharati Krishna Tirtha (1884-1960), former Jagadguru Sankaracharya of Puri culled set of 16 Sutras (aphorisms) and 13 Sub-Sutras (corollaries) from the Atharva Veda. He developed methods and techniques for amplifying the principles contained in the formulas and their sub-formulas, and called it Vedic Mathematics. According to him, there has been considerable literature on Mathematics in the Veda- sakhas. We have presented three designs for Urdhwa multiplier. These three designs vary in terms of 5:3 compressor designs. First design uses the simple 5:3 compressor which uses two full adders. Second design uses four XOR gates and two multiplexers (2×1). Third design of 5:3 compressors is two XOR-XNOR gates and four multiplexers (2×1). First design of 5:3 compressors is simplest one, but more delay. Next design has lesser delay as compared to the first design at the cost of higher complexity. But the last which is our proposed design has least amount of delay but has greatest complexity and occupies maximum space among all of them. # • 5:3 Compressor To add binary numbers with minimal carry propagation we use compressor adder instead of other adder. Compressor is a digital modern circuit which is used for high speed with minimum gates requires designing technique. This compressor becomes the essential tool for fast multiplication adding technique by keeping an eye on fast processor and lesser area. Figure 3: Block Diagram of 5:3 Compressors 5:3 compressors are capable of adding 4 bits and one carry, in turn producing a 3 bit output. The 5:3 compressors has 4 inputs $X_1$ , $X_2$ , $X_3$ and $X_4$ and 2 outputs Sum and Carry along with a Carry-in (Cin) and a Carry-out (Cout) as shown in Figure 1. The input Cin is the output from the previous lower significant compressor. # A. DELAY AND AREA CALCULATE The delay and area (cost) evaluation methodology considers all logic gates is consist of AND, OR, and Inverter (AOI), each having delay equal to 1 unit and area equal to 1 unit. The delay evaluation methodology adds up the number of gates in the longest path of a logic block that contributes to the maximum delay. The area evaluation is done by counting the total number of AOI gates required for each logic block. Table 1: Delay and Area Count of the Basic Blocks of DHT | Adder Blocks | Delay | Area | | |--------------|-------|------|--| | AND Gate | 1 | 1 | | | OR Gate | 1 | 1 | | | NOT Gate | 1 | 1 | | | XOR Gate | 3 | 5 | | | XNOR Gate | 3 | 5 | | | 2:1 MUX | 3 | 4 | | | Half Adder | 3 | 6 | | | Full Adder | 6 | 13 | | #### III. LITERATURE REVIEW In Labros Pygras et al. [1] The discrete Hartley transform finds numerous applications in signal and image processing. An efficient Field Programmable Gate Array implementation for the 64-point. Two-Band Fast Discrete Hartley Transform is proposed in this communication. The architecture requires 57 clock cycles to compute the 64-point Two-Band Fast Discrete Hartley Transform and reaches a rate of up to 103.82 million samples per second at a 92 MHz clock frequency. The architecture has been implemented using VHDL and realized on a Cyclone IV FPGA of Altera. In Shirali Parsai et.al. [2], performance Comparison of Multi-resolution Wavelet Transforms to Hybrid Transforms for Image Compression, this paper proposed multi-resolution based image compression method. Wavelets have ability to analyze the signal at different resolutions to give different levels of details in the image. This property has been used in image compression and performance is compared with hybrid transform and hybrid wavelets generated using respective component transforms. Kekre transform (DKT) is paired with Hartley transform, Discrete Sine Transform and Real Fourier Transform to obtain respective hybrid wavelet transforms. Experiments showed that DKT-Real DFT gives better image compression than DKT-Hartley and DKT-DST pair. **In Doru Florin Chiper et.al.** [3], study on Underwater Navigation System Using Discrete Hartley Transform Unscented Kalman Filter for Attitude Estimation. To reduce the attitude estimate error and improve the calculation efficiency for navigation system applied to underwater glider; this paper proposed the discrete Hartley transform unscented Kalman filter (DUKF) utilizing inertial measuring units (IMUs). The architecture presented here uses high power consuming multipliers. In Sushma R. Huddar et.al. [4], the use of Hartley transform to solve the integral equation of convolution type. The authors have presented the new formula provided for the solution of the integral equation of convolution type in the Hartley basis. The advantage of discrete Hartley transform compared to the DFT is proved and illustrated and is also used for image compression. The results of the simulation and computational efficiency evaluation of the algorithms are presented. In S. S. Kerur et.al. [5], novel Structures for Cyclic Convolution Using Improved First-Order Moment Algorithm This paper first presented a decomposition scheme to reduce the computation time and make the first-order moment-based cyclic convolution well suited for hardware implementation. By decomposing the fixed convolution kernel into similar subparts and using their preprocessing results as control signals, each subpart of cyclic convolution can be calculated with a basic computing substructure. Due to the flexibility of decomposition, a trade-off between computation time and hardware complexity exists. Comparisons in terms of area-delay product, areatime product and power consumption with the existing memory-based structures have been made to demonstrate the efficiency and effectiveness of the proposed structures. Using the same metrics, the comparison results further show significant improvement of the proposed designs over the previous first-order moment-based structure. In Jagadguru Swami et.al. [6], a Factorization Scheme for Some Discrete Hartley Transform Matrices. Discrete transforms such as the discrete Fourier transform (DFT) and the discrete Hartley transform (DHT) are important tools in digtal image processing, numerical analysis, signal processing, and statistical methods. The successful application of transform techniques relies on the existence of efficient fast transforms. In this paper some fast algorithms are derived. The theoretical lower bounds on the multiplicative complexity for the DFT/DHT are achieved. The approach is based on the factorization of DHT matrices. TABLE II: Comparisons Result for 8-point Discrete Hartley Transform (DHT) with word length 16. | Parameter | S. S. Kerur | Sushma R | Doru et al. | Shirali et al. [2] | |-----------------------------------------|-------------|-----------|-------------|--------------------| | | [5] | [4] | [3] | | | No. of Slices | 635 | 612 | 601 | 599 | | No. of 4 input LUTs | 1125 | 1093 | 1059 | 1032 | | No. of bounded IOBs | 256 | 256 | 256 | 256 | | Maximum combinational path delay(in ns) | 29.626 ns | 26.977 ns | 25.783 ns | 24.388 ns | ## IV. CONCLUSION DHT is a new transform used for real value to real value conversion. This paper presented a literature review of DHT used for digital image processing in variaous applications. Existing power consuming multipliers is an ancient technique for multiplication. DHT is used in various fields such as image processing, space science, scientific applications etc. Delay provided and area required by hardware are the two key factors which are need to be consider. Here we present DHT with 8×8 Urdhwa multiplier by using full adder, OR and XNOR gates in different types of compressors. The effect of delay and area using DHT are also discussed in the present work. The VLSI implementation of DHT using efficient multiplication can be carried out on FPGA using XILINX ISE Desugn simulator for hardware acceleration of the algorithm. #### REFRENCES - [1] Pyrgas, Labros, Paris Kitsos, and Athanassios N. Skodras. "An FPGA design for the two-band fast discrete hartley transform." 2016 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT). IEEE, 2016. - [2] Shirali Parsai and Swapnil Jain, "VHDL Implementation of Discrete Hartley Transform Using Urdhwa Multiplier", 978-1-4673-9542- 7/15/\$31.00 ©2015 IEEE. - [3] Doru Florin Chiper, Senior Member, IEEE, "A Novel VLSI DHT Algorithm for a Highly Modular and Parallel Architecture", IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—II: EXPRESS BRIEFS, VOL. 60, NO. 5, MAY 2013. - [4] Sushma R. Huddar and Sudhir Rao, Kalpana M., "Novel High Speed Vedic Mathematics Multiplier using Compressors", 978-1 - 4673-5090- 7/13/\$31.00 ⊚2013 - IEEE. - [5] S. S. Kerur, Prakash Narchi, Jayashree C N, Harish M Kittur and Girish V A, "Implementation of Vedic multiplier for Digital Signal Processing", International Conference on VLSI, Communication & Instrumentation (ICVCI) 2013, Proceedings published by International Joural of Computer Applications® (IJCA), pp.1-6. - [6] Jagadguru Swami Sri Bharati Krishna Tirthaji Maharaja, "Vedic Mathematics: Sixteen simple Mathematical Formulae from the Veda", Delhi (2011). - [7] Sumit Vaidya and Depak Dandekar. "Delay-power perfor-mance comparison of multipliers in VLSI circuit design". International Journal of Computer Networks & Communications (IJCNC), Vol.2, No.4, July 2012. - [8] SumitVaidya and DepakDandekar. "Delay-power perfor-mance comparison of multipliers in VLSI circuit design", International Journal of Computer Networks & Communications (IJCNC), Vol.2, No.4, July 2012. - [9] P. D. Chidgupkar and M. T. Karad, "The Implementation of Vedic Algorithms in Digital Signal Procesing", Global J. of Eng. Edu, Vol.8, No.2, 204, UICEE Published in Australia. - [10] S. Bouguezel, M. O. Ahmad, and M. N. S. Swamy, "New parametric discrete Fourier and Hartley transforms, and algorithms for fast computation," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 58, no. 3, pp. 562–575, Mar. 2011. - [11] J. S. Wu, H. Z. Shu, L. Senhadji, and L. M. Luo, "Radix 3 × 3 algorithm for the 2-D discrete Hartley transform," IEEE Trans. Circuits Syst. II, Exp.Briefs, vol. 55, no. 6, pp. 566–570, Jun. 2008. - [12] S. Bouguezel, M. O. Ahmad, and M. N. S. Swamy, "A split vector-radix algorithm for the 3-D discrete Hartley transform," IEEE Trans. CircuitsSyst. I, Reg. Papers, vol. 53, no. 9, pp. 1966–1976, Sep. 2006. - [13] D. F. Chiper, "Radix-2 fast algorithm for computing discrete Hartley transform of type III," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 59, no. 5, pp. 297– 301. May 2012. - [14] H. Z. Shu, J. S. Wu, C. F. Yang, and L. Senhadji, "Fast radix-3 algorithm for the generalized discrete Hartley transform of type II," IEEE SignalProcess. Lett., vol. 19, no. 6, pp. 348–351, Jun. 2012. - [15] D. F. Chiper, "Fast radix-2 algorithm for the discrete Hartley transform of type II," IEEE Signal Process. Lett., vol. 18, no. 11, pp. 687–689, Nov. 2011. - [16] R.N. Bracewell, The Discrete Hartley Transform, J. Opt. Soc. Amer., vol.73, pp.1832-1835,Dec., 1983. - [17] R.V.L. Hartley, A More Symmetrical Fourier Analysis Applied to Transmission Problems, *Proc. IRE*,vol.30, pp.144-150, Mar., 1942. - [18] R.N. Bracewell, The Hartley Transform, Oxford University Press, 1986. - [19] K.J. Olejniczak, G.T. Heydt Eds., Proc. of the IEEE, Section on the Hartley Transform, vol.82, n.3, Mar., pp.372-447, 1994. - [20] J.-L. Wu and Shiu, Discrete Hartley Transform in Error Control Coding, *IEEE Trans. Acoust., Speech, Signal Processing*, ASSP-39, pp.2356-2359, Oct., 1991. - [21] R.E. Blahut, Fast Algorithms for Digital Signal Processing, Addison-Wesley, 1985. - [22] G. Bi and Y.Q. Chen, Fast DHT Algorithms for Length N=q\*2m, *IEEE Trans. on Signal Processing*, vol. 47,n.3,Mar., pp.900-903, 1999.