# A Novel Approach for Improvement of Power and Delay on Various Domino Logic Circuits

Jyoti Shrivastava<sup>#1</sup>, Paresh Rawat

<sup>#</sup> PG Student [VLSI], Dept. of ECE, Truba College of Science and Technology Bhopal

Bhopal, India

Abstract— Leakage power consumption is a major technical problem in nanometre CMOS circuits in deep-submicron technology. Domino logic is a CMOS-based evolution of dynamic logic techniques built on either PMOS or NMOS transistors. Dynamic logic circuits are used for their high performance, but their high noise sensitivity and extensive leakage cause problems, and dynamic CMOS circuits are inherently less resistant to noise than static CMOS circuits. In this paper we propose domino logic styles that improve performance compared with existing domino logic styles. According to HSPICE simulations in 90 nm and 65 nm CMOS technology, the proposed circuit improves average power consumption by up to 30% for an 8-input OR gate compared with existing domino logic styles. The control circuit produces a small voltage at the source of the pull-down network in standby mode, which improves the noise immunity of the domino circuits. The performance of these circuits has been evaluated in HSPICE using a BSIM4 model. Finally, the average power dissipation characteristics are plotted and comparisons are made between the different logic families.

Keywords—Low Power, High Speed, CKD, UNG.

# I. Introduction

The continuous advancement of semiconductor technology over the years has resulted in better performance and higher circuit densities in electronic devices. However, as feature sizes shrink and integration density increases, rising power dissipation has become a primary concern for the further development of VLSI circuit technology. The two main types of power dissipation in semiconductor devices are static and dynamic power dissipation. Dynamic power dissipation is due to the energy lost in charging and discharging the output capacitance during transistor switching activity, while static power dissipation is caused by internal leakage in the devices when the circuit is in the off state [1].

The dynamic logic circuit requires two phases. The first phase, when the clock is low, is called the precharge phase, and the second phase, when the clock is high, is called the evaluation phase. In the precharge phase, the output is driven high unconditionally (regardless of the values of the inputs A and B). The capacitor, which represents the load capacitance of this gate, becomes charged [1-2]. Because the transistor at the bottom is turned off, it is impossible for the output to be driven low during this phase, as shown in Fig.1.



Fig.1: Footed domino logic circuit

During the evaluation phase clock is high. If inputs A and B are also high, the output will be pulled low. Otherwise, the output stays high (due to the load capacitance).
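The two-phase behaviour described above can be illustrated with a small behavioural model. This is only a sketch in Python, not a circuit simulation: the function name `dynamic_nand` and the boolean encoding of the clock and inputs are assumptions introduced here for illustration.

```python
def dynamic_nand(clk: bool, a: bool, b: bool, prev_out: bool) -> bool:
    """Behavioural model of a footed dynamic NAND stage (hypothetical helper)."""
    if not clk:
        # Precharge phase: the output is driven high regardless of the inputs.
        return True
    if a and b:
        # Evaluation phase with a conducting pull-down path: the output goes low.
        return False
    # Evaluation phase without a pull-down path: the load capacitance holds the value.
    return prev_out


# Two clock cycles: one with A = B = 1, one with B = 0.
out = True
for clk, a, b in [(False, True, True), (True, True, True),
                  (False, True, False), (True, True, False)]:
    out = dynamic_nand(clk, a, b, out)
    print(f"clk={int(clk)} A={int(a)} B={int(b)} -> out={int(out)}")
```

The model ignores charge leakage, so a held value never decays; in a real gate the held level is only valid for a limited time.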

Dynamic logic has a few potential problems that static logic does not. For example, if the clock speed is too slow, the output will decay too quickly to be of use. Also, the output is only valid for part of each clock cycle, so the device connected to it must sample it synchronously during the time that it is valid[3-4].

Also, when both A and B are high, so that the output is low, the circuit pumps one capacitor-load of charge from the supply to ground in every clock cycle, by first charging and then discharging the capacitor. This makes the circuit (with its output connected to a high impedance) less efficient than the static version (which theoretically should not allow any current to flow except through the output). When the A and B inputs are constant and both high, the dynamic NAND gate uses power in proportion to the clock rate for as long as it functions correctly [5]. The power dissipation can be minimized by keeping the load capacitance low, but this in turn reduces the maximum cycle time, requiring a higher minimum clock frequency; the higher frequency then increases power consumption by the relation just mentioned. Therefore, it is impossible to reduce the idle power consumption (when both inputs are high) below a certain limit, which derives from an equilibrium between clock speed and load capacitance.
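The proportionality between power and clock rate follows from the standard switching-power relation P = C_L · V_DD² · f for a node that is fully charged and discharged once per cycle. The sketch below evaluates it in Python. The 1 pF load and 1 V supply match values quoted in the simulation section of this paper, while the clock frequencies are illustrative assumptions.

```python
def dynamic_power(c_load: float, vdd: float, f_clk: float) -> float:
    """Switching power in watts for a node charged to vdd and discharged once per cycle."""
    return c_load * vdd ** 2 * f_clk


C_L = 1e-12   # load capacitance in farads (1 pF)
VDD = 1.0     # supply voltage in volts

# Assumed clock frequencies, chosen only to show the linear dependence on f.
for f_clk in (100e6, 500e6, 1e9):
    p = dynamic_power(C_L, VDD, f_clk)
    print(f"f = {f_clk / 1e6:.0f} MHz -> P = {p * 1e6:.1f} uW")
```

Halving the load capacitance halves the power at a fixed clock rate, which is the trade-off against the minimum usable clock frequency discussed above.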

#### **II. DOMINO LOGIC**

Domino logic is obtained by adding a static CMOS inverter to the output of a basic dynamic CMOS logic gate. Domino logic gates are non-inverting because of this output inverter. The main idea behind domino logic is to limit charge sharing and charge leakage by feeding back the inverter output through a charge keeper circuit, so that the potential at the dynamic node (the output node of the dynamic CMOS logic) is retained. Dynamic domino logic is typically used in the design of high-speed applications. Unfortunately, the clock distribution network dissipates 45% of the overall consumed power, which hinders the use of dynamic domino circuits in low-power applications. Moreover, distributing the clock signal involves non-trivial design issues, such as controlling skew and jitter.

Footed domino logic is a general form of domino logic circuit. It is so called because a footer transistor, generally an nMOS transistor, is placed in the circuit. The footer transistor gives better noise and leakage tolerance because the stacking effect reduces leakage. The circuit diagram of the footed domino logic circuit is shown in Fig.1.

For comparison, static CMOS logic has the following characteristics:

- Good noise margin: when noise does not exceed the margins, the gate eventually settles to the correct logic level.
- Robustness against voltage scaling and transistor sizing.
- The number of transistors required to implement an N fan-in gate is 2N, which can result in a significantly larger implementation area.
- It requires both nMOS and pMOS transistors on each input, and the pMOS transistors add significant capacitance with relatively large logical effort.
- It suffers from lower performance, especially for large fan-in gates.
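The 2N transistor count listed above can be compared with the count for a domino gate, which needs only one pull-down transistor per input plus a fixed overhead. The Python sketch below assumes an overhead of five transistors for a footed domino gate (precharge pMOS, footer nMOS, keeper pMOS and a two-transistor output inverter). That overhead and the function names are illustrative assumptions, not figures taken from this paper's area table.

```python
def static_cmos_transistors(fan_in: int) -> int:
    """Static CMOS gate: one nMOS and one pMOS per input."""
    return 2 * fan_in


def footed_domino_transistors(fan_in: int, overhead: int = 5) -> int:
    """Footed domino gate: one pull-down nMOS per input plus an assumed fixed overhead."""
    return fan_in + overhead


for n in (8, 16, 32):
    print(f"N={n}: static={static_cmos_transistors(n)}, "
          f"footed domino={footed_domino_transistors(n)}")
```

The gap grows linearly with fan-in, which is why wide gates are the usual motivation for dynamic styles.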

## **II.** Literature Survey

Static logic circuits require a large number of transistors to implement a function and also introduce considerable delay. In high-density, high-performance applications, where reducing circuit delay and silicon area is a major objective, dynamic logic circuits are preferred because they offer significant advantages over static CMOS logic circuits. The operation of all dynamic logic gates depends on temporary storage of charge in parasitic node capacitances, instead of relying on steady-state circuit behaviour as static logic does. Dynamic logic circuits require periodic clock signals to control charge refreshing. The ability to store a state temporarily allows simple sequential circuits with a memory function to be implemented, and the use of common clock signals throughout the system makes it possible to synchronize the operations of the various circuit blocks.

The basic domino circuits are footed domino logic (FDL) and footless domino logic (FLDL).

#### High speed Domino logic circuit

High-speed domino is another domino logic circuit. The current drawn through the keeper transistor and the NMOS transistors of the pull-down network at the beginning of the evaluation phase can be reduced by applying a delayed clock in the circuit. This does not affect the leakage current in the circuit, as shown in Fig.2.



Fig.2. High speed domino logic circuit

However, the extra clock delay consumes additional area and power, which is a major drawback of the circuit. It nevertheless gives an effective way to increase the robustness of the circuit.

In the high-speed domino logic circuit, when the clock goes high, $M_{n1}$ is still off and $M_{p2}$ is still on. Therefore $M_{p2}$ turns off the keeper transistor. After the inverter delay, $M_{p2}$ turns off. If the dynamic node then remains high during the evaluation phase, the NMOS transistor turns on, which turns on the keeper transistor. Hence at the beginning of the evaluation phase the dynamic node is floating, so in the absence of the keeper transistor the dynamic node may be discharged by any noise at the inputs. Also, the voltage at the gate of the keeper transistor would be $V_{DD} - V_{tMn1}$, which allows a DC current to flow through the PMOS keeper transistor and the NMOS network.

#### **Diode Footed Domino (DFD)**

Precharge phase: During this phase the clock is low. Transistor $M_3$ is therefore ON and turns off the mirror transistor $M_2$, preventing any short-circuit current through $M_2$ during this phase, as shown in Fig.3 [7-8].



Fig.3. Diode Footed Domino Logic

Evaluation phase: In this phase the clock is high and any of the inputs may switch to the high level. Due to the leakage current of the evaluation transistors, some voltage drops across the diode footer $M_1$. Thus a negative voltage exists between the gate and source of the evaluation transistors that are in OFF mode. This negative voltage exponentially reduces the subthreshold current. Moreover, the voltage drop across the diode increases the body effect of the evaluation transistors, which also helps to reduce subthreshold leakage. The switching threshold voltage of the gate is increased by the threshold voltage of the nMOS devices, so noise immunity improves through the higher gate switching voltage at the expense of speed. Speed is recovered by adding a mirror transistor $M_2$, as shown in Fig.3, whose size relative to the diode footer is given by the mirror ratio $M$:

$$M = \frac{(\frac{W}{L})_{mirror \ transistor}}{(\frac{W}{L})_{diode \ footer}}$$
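The exponential leakage reduction caused by the negative gate-source voltage can be estimated with the usual first-order subthreshold model, in which the current scales as exp(V_GS / (n · V_T)) with V_T = kT/q. The Python sketch below computes the ratio of leakage with and without the footer voltage drop. The slope factor n = 1.5 and the example voltage drops are assumptions, and body effect and DIBL are ignored.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
Q_E = 1.602176634e-19     # elementary charge, C


def thermal_voltage(temp_k: float = 300.0) -> float:
    """Thermal voltage kT/q in volts."""
    return K_B * temp_k / Q_E


def subthreshold_reduction(v_drop: float, n: float = 1.5, temp_k: float = 300.0) -> float:
    """Ratio I_sub(V_GS = -v_drop) / I_sub(V_GS = 0) for an OFF transistor.

    First-order model only: n is an assumed slope factor, and the extra
    reduction from body effect is not included.
    """
    return math.exp(-v_drop / (n * thermal_voltage(temp_k)))


for v_drop in (0.05, 0.10, 0.15):   # assumed footer voltage drops in volts
    ratio = subthreshold_reduction(v_drop)
    print(f"drop = {v_drop * 1e3:.0f} mV -> leakage scaled by {ratio:.3f}")
```

Even a drop of roughly 100 mV cuts the modelled leakage by more than an order of magnitude, which is consistent with the qualitative argument above.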

#### Leakage Current Replica (LCR)

In Leakage Current Replica (LCR) domino logic, one extra pMOS transistor $M_{K1}$ is stacked above the keeper transistor, as shown in Fig.4 [6]. In addition to this pMOS transistor, a replica current mirror is connected to $M_{K1}$. The main function of the current mirror is to track the leakage current and copy it into the dynamic gate through transistor $M_{K1}$. The current mirror is constructed so that it draws a current $sf \cdot I_{leak}$, where $sf$ is a safety factor and $I_{leak}$ is the leakage current of the dynamic gate. An extra nMOS transistor $M_2$ is used in the current mirror. Transistor $M_2$ is diode-connected and works as a replica of the worst-case leakage current; hence its width is set equal to the sum of the widths of the nMOS transistors in the PDN times the safety factor.
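The sizing rule for the replica transistor can be written as a one-line calculation. The Python sketch below applies it to an 8-input pull-down network. The per-transistor width of 2.5 µm and the safety factor of 1.2 are assumed example values, and `replica_width` is a hypothetical helper name.

```python
def replica_width(pdn_widths_um, safety_factor: float) -> float:
    """Width of the diode-connected replica nMOS: safety factor times the total PDN width."""
    return safety_factor * sum(pdn_widths_um)


pdn = [2.5] * 8          # assumed: eight parallel nMOS devices, 2.5 um each
sf = 1.2                 # assumed safety factor
print(f"replica width = {replica_width(pdn, sf):.2f} um")
```

A larger safety factor makes the keeper stronger against leakage at the cost of more contention during evaluation.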



Fig.4 Leakage Current Replica Domino Logic

#### **Current-Comparison Domino**

This technique uses the difference between, and comparison of, the leakage current of the OFF transistors and the switching current of the ON transistors of the pull-down network to control the PMOS keeper transistor. This reduces the contention between the keeper transistor and the pull-down network from which previously proposed techniques have suffered. Moreover, the stacking effect reduces leakage current and improves the performance of the current mirror, as shown in Fig.5.

In this circuit, the reference current is compared with the pull-down network current. If there is no conducting path from the dynamic node to ground and the only current in the PDN is the leakage current, the keeper transistor will not turn off, because the reference current is greater than the leakage current. In effect there is a race between the pull-down network current and the reference current. The larger of the two currents wins the race; when the pull-down network current is larger, the keeper PMOS transistor is turned off. Transistor $M_{pre2}$ is removed so that node K is discharged, turning on the keeper transistor in the precharge phase. This results in improved noise immunity. Therefore, unlike circuit designs such as HS domino, in which the keeper transistor is off at the beginning of the evaluation phase, the keeper transistor is on in this design.
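The comparison that controls the keeper can be summarised as a simple decision rule. The Python sketch below is a behavioural illustration only: the function name `keeper_stays_on` and the example current values are assumptions, and the real circuit performs the comparison in the analogue domain through the current mirror.

```python
def keeper_stays_on(i_pdn: float, i_ref: float) -> bool:
    """Return True when the reference current exceeds the pull-down network current."""
    return i_ref > i_pdn


I_REF = 1e-6        # assumed reference current, 1 uA
I_LEAK = 10e-9      # assumed PDN leakage with all inputs low, 10 nA
I_SWITCH = 50e-6    # assumed PDN current with one input high, 50 uA

print("all inputs low :", "keeper on" if keeper_stays_on(I_LEAK, I_REF) else "keeper off")
print("one input high :", "keeper on" if keeper_stays_on(I_SWITCH, I_REF) else "keeper off")
```

The reference current therefore has to sit above the worst-case leakage and below the smallest switching current for the rule to separate the two cases.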



Fig. 5. An n-Input Current Comparison Domino OR gate

#### **III. Proposed Work**

The proposed circuit is implemented in 65 nm and 90 nm CMOS technology in HSPICE with a power supply of 1 V. It is based on the footed domino logic circuit and offers better leakage tolerance and improved noise immunity. The circuit is shown in Fig.6 and has been tested for 8- and 16-input OR gates. The circuit has two modes of operation, namely the precharge mode and the evaluation mode. During the precharge phase, the dynamic nodes of all the gates are charged to $V_{DD}$, which causes the inverter outputs to go to 0 V. During the evaluation phase the logic signals applied to the pull-down network are evaluated, and each inverter output either changes from 0 to $V_{DD}$ or remains at ground, depending on the logic signals provided to the pull-down network.

In the proposed circuit an NMOS transistor is added, as shown in Fig.6. The main function of this transistor is to draw the contention current of the PMOS keeper; it also helps to speed up the discharging of the capacitor at the dynamic node. At the beginning of the precharge mode the precharge device becomes active. At this instant the voltage at the dynamic node is at 0 V, so the inverter output is at $V_{DD}$. Consequently the added transistor turns on, and at the beginning of the precharge phase there is contention between the currents drawn by the added transistor and the keeper transistor, because the precharge device tries to charge the capacitor $C_L$ while the added NMOS tries to discharge it.

During the evaluation phase the precharge device turns OFF because the clock switches from logic 0 to logic 1. If the inputs are such that the capacitor $C_L$ must retain its charge, the output remains at zero. If the inputs are such that the pull-down network must discharge $C_L$, the dynamic node voltage starts to decrease. At the beginning of the discharging process the inverter output is at zero, which keeps the added NMOS inactive. The added NMOS compensates the keeper current and speeds up the discharging of capacitor $C_L$.



Fig.6. Proposed Diode Footed Domino Circuit

The transient response of the conventional footless domino logic is shown in Fig.7. The output depends on the clock and, because the output signal is inverted, it has the same waveform as the clock signal. Only one keeper is used to maintain the charge at the dynamic node and to avoid contention between the keeper and the pull-down network.



Fig.7. Output transient response of proposed domino circuit.

# **IV Simulation and Results**

The circuits were simulated using HSPICE in 65 nm and 90 nm technology with supply voltages of 1 V and 1.2 V. The OR gate was implemented because it is a typical example of a wide pull-down network. The proposed circuit was implemented as an OR gate, compared with OR gates built in the reference styles, and investigated for different values of fan-in. It was found that the proposed circuit performs better than the previously proposed circuits. We evaluated the leakage power consumption, active-mode power consumption and AC noise margin of the proposed topology and compared them with the reported circuits by simulation in 0.065-µm CMOS technology. Parameters such as delay, average power and unity noise gain (UNG) were calculated for the 8-input OR gate, and the proposed circuit shows an improvement in these parameters compared with the previous domino logic circuits. Tables I, II and III compare power dissipation, delay and power-delay product (PDP), and the subsequent tables compare UNG and transistor count, for the proposed circuit and the footed, footless, high-speed and other domino logic circuits. The circuit was simulated in 65 nm and 90 nm technology in Cadence Spectre at 1 V. To compare the 8- and 16-input OR gates with well-known existing circuit design techniques, we performed several simulations to obtain UNG and dissipated power, as shown in the tables. The simulations were performed with keeper (W/L) = 0.25u, PMOS (W/L) = 5u, NMOS (W/L) = 2.5u and $C_L = 1\,\text{pF}$.

| S. No. | LOGIC<br>STYLE   | 8 INPUT |       | 16 INPUT |       | 32 INPUT |       |
|--------|------------------|---------|-------|----------|-------|----------|-------|
|        |                  | 65nm    | 90nm  | 65nm     | 90nm  | 65nm     | 90nm  |
| 1.     | SFLD             | 7.200   | 18.84 | 9.586    | 14.01 | 12.33    | 16.29 |
| 2.     | FLD              | 8.227   | 13.44 | 14.00    | 22.97 | 17.04    | 23.83 |
| 3.     | HSD              | 494.1   | 835.5 | 495.6    | 4220  | 5633     | 5762  |
| 4.     | CKD              | 264.5   | 496.6 | 266.8    | 499.4 | 298.3    | 301.2 |
| 5.     | WFD              | 8.908   | 13.69 | 14.30    | 14.30 | 19.38    | 27.91 |
| 6.     | LCR              | 6.039   | 9.015 | 8.210    | 11.85 | 12.39    | 15.27 |
| 7.     | CCD              | 6.125   | 11.75 | 14.86    | 16.93 | 19.34    | 26.20 |
| 8.     | Proposed Circuit | 2.793   | 4.239 | 4.376    | 6.945 | 7.021    | 9.023 |

TABLE I. COMPARISON OF POWER DISSIPATION (in µW)

TABLE II. COMPARISON OF DELAY (in ps)

| S. No.       | LOGIC<br>STYLE   | 8 INPUT |       | 16 INPUT |       | 32 INPUT |       |
|--------------|------------------|---------|-------|----------|-------|----------|-------|
|              |                  | 65nm    | 90nm  | 65nm     | 90nm  | 65nm     | 90nm  |
| 1.           | SFLD             | 8.243   | 8.644 | 11.58    | 12.92 | 16.29    | 17.28 |
| 2.           | FLD              | 13.53   | 24.96 | 32.02    | 38.37 | 39.20    | 42.29 |
| 3.           | HSD              | 5.882   | 6.699 | 12.50    | 13.83 | 17.33    | 19.38 |
| 4.           | CKD              | 14.48   | 16.26 | 22.20    | 25.02 | 21.28    | 25.39 |
| 5.           | WFD              | 21.15   | 22.39 | 29.92    | 31.36 | 38.46    | 45.39 |
| 6.           | LCR              | 13.08   | 16.29 | 21.83    | 26.34 | 28.25    | 32.34 |
| 7.           | CCD              | 14.61   | 18.27 | 25.01    | 28.33 | 30.62    | 38.48 |
| 8.           | Proposed Circuit | 10.01   | 13.35 | 19.27    | 23.45 | 24.94    | 29.36 |

| S. No. | LOGIC<br>STYLE      | 8 INPUT               |       | 16 INPUT |       | 32 INPUT |       |
|--------|---------------------|-----------------------|-------|----------|-------|----------|-------|
|        |                     | 65nm                  | 90nm  | 65nm     | 90nm  | 65nm     | 90nm  |
| 1.     | SFLD                | 59.34                 | 162.8 | 111.0    | 181.0 | 200.8    | 281.4 |
| 2.     | FLD                 | 111.3                 | 335.4 | 448.2    | 881.3 | 667.9    | 1007  |
| 3.     | HSD                 | 2906                  | 5593  | 6195     | 5836  | 98139    | 11166 |
| 4.     | CKD                 | 3829                  | 8074  | 5922     | 1249  | 6347     | 7647  |
| 5.     | WFD                 | 188.4                 | 305.1 | 427.9    | 448.4 | 745.3    | 1266  |
| 6.     | LCR                 | 78.99                 | 146.8 | 179.2    | 312.1 | 350.0    | 493.8 |
| 7.     | CCD                 | 89.48                 | 214.6 | 371.6    | 479.6 | 592.1    | 1008  |
| 8.     | Proposed<br>Circuit | 27.95                 | 56.59 | 84.32    | 162.8 | 175.1    | 264.9 |

TABLE III. COMPARISON OF PDP (in aJ)
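The PDP entries can be cross-checked against Tables I and II, because 1 µW × 1 ps = 1 aJ. The Python sketch below recomputes the 8-input PDP for three of the logic styles from the tabulated power and delay. The dictionary layout and the helper name `pdp_aj` are illustrative choices, and only a subset of rows is copied.

```python
# 8-input OR gate values copied from Table I (power, uW) and Table II (delay, ps),
# each given as (65 nm, 90 nm).
power_uw = {"SFLD": (7.200, 18.84), "LCR": (6.039, 9.015), "Proposed": (2.793, 4.239)}
delay_ps = {"SFLD": (8.243, 8.644), "LCR": (13.08, 16.29), "Proposed": (10.01, 13.35)}


def pdp_aj(p_uw: float, d_ps: float) -> float:
    """Power-delay product in attojoules: 1 uW * 1 ps = 1e-18 J = 1 aJ."""
    return p_uw * d_ps


for style in power_uw:
    pdp_65 = pdp_aj(power_uw[style][0], delay_ps[style][0])
    pdp_90 = pdp_aj(power_uw[style][1], delay_ps[style][1])
    print(f"{style:9s} 65nm: {pdp_65:7.2f} aJ   90nm: {pdp_90:7.2f} aJ")
```

The recomputed products agree with the corresponding Table III entries to within the last printed digit, which suggests the table values were truncated rather than rounded.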

TABLE IV. COMPARISON OF UNG

| S. No. | LOGIC<br>STYLE   | 8 INPUT |       | 16 INPUT |       | 32 INPUT |       |
|--------|------------------|---------|-------|----------|-------|----------|-------|
|        |                  | 65nm    | 90nm  | 65nm     | 90nm  | 65nm     | 90nm  |
| 1.     | SFLD             | 0.2984  | 0.312 | 0.272    | 0.289 | 0.253    | 0.273 |
| 2.     | FLD              | 0.3273  | 0.349 | 0.316    | 0.324 | 0.302    | 0.316 |
| 3.     | HSD              | 0.2962  | 0.314 | 0.277    | 0.293 | 0.256    | 0.267 |
| 4.     | CKD              | 0.3079  | 0.328 | 0.279    | 0.293 | 0.249    | 0.267 |
| 5.     | WFD              | 0.3293  | 0.346 | 0.317    | 0.324 | 0.302    | 0.317 |
| 6.     | LCR              | 0.3441  | 0.365 | 0.323    | 0.337 | 0.309    | 0.324 |
| 7.     | CCD              | 0.3572  | 0.389 | 0.339    | 0.356 | 0.314    | 0.336 |
| 8.     | Proposed Circuit | 0.3821  | 0.421 | 0.367    | 0.382 | 0.346    | 0.268 |

TABLE V. COMPARISON OF AREA (No. of Transistors)

| S. No. | LOGIC<br>STYLE   | 8 INPUT | 16 INPUT | 32 INPUT |
|--------|------------------|---------|----------|----------|
| 1.     | SFLD             | 12      | 18       | 34       |
| 2.     | FLD              | 13      | 21       | 37       |
| 3.     | HSD              | 18      | 26       | 42       |
| 4.     | CKD              | 23      | 29       | 45       |
| 5.     | WFD              | 16      | 24       | 40       |
| 6.     | LCR              | 15      | 23       | 39       |
| 7.     | CCD              | 20      | 28       | 36       |
| 8.     | Proposed Circuit | 17      | 25       | 33       |

# V. Conclusion

Subthreshold and gate oxide leakage currents need to be suppressed in 65 nm CMOS technology. In this paper, a new circuit is proposed to reduce both subthreshold and gate oxide leakage currents simultaneously. The proposed circuit employs a PMOS sleep switch transistor between the power supply and the output node, together with dual-threshold-voltage CMOS technology, to suppress both subthreshold and gate oxide leakage currents. The sleep transistor, the source of the pull-down network and the source of the NMOS transistor of the output inverter are controlled by an additional sleep signal. The simulation has been done using HSPICE for OR8 and OR16 gates at 25°C. The proposed circuit reduces the total leakage power consumption by up to 99.41% and 99.51% compared with standard dual-threshold-voltage footless domino circuits at 25°C and 110°C, respectively, and by up to 93.79% and 97.98% compared with sleep control techniques at 25°C and 110°C, respectively.

#### REFERENCES

- [1] Farshad Moradi, Tuan Vu Cao, "Domino Logic Design for High-Performance and Leakage-Tolerant Applications," VLSI Journal, Elsevier, 2011.
- [2] Kawaguchi H., Sakurai T., "A Reduced Clock-Swing Flip-Flop (RCSFF) for 63% Power Reduction," IEEE J. Solid-State Circuits, 1998, 33, (5), pp. 807–811.
- [3] Tam S., Rusu S., Nagarji Desai U., Kim R., Zhang J., Young I., "Clock Generation and Distribution for the First IA-64 Microprocessor," IEEE J. Solid-State Circuits, 2000, 35, (11), pp. 1545–1552.
- [4] R. K. Krishnamurthy, A. Alvandpour, G. Balamurugan, N. R. Shanbhag, K. Soumyanath and S. Y. Borkar, "A 130-nm 6-GHz 256 × 32 Bit Leakage-Tolerant Register File," IEEE Journal of Solid-State Circuits, vol. 37, No. 5, pp. 624-632, May 2002.
- [5] M.W. Allam, M.H. Anis, M.I. Elmasry, "High Speed Dynamic Logic Style for Scaled-Down CMOS and MTCMOS Technologies," Proceedings of The International Symposium on Low Power Electronics and Design, 2000, pp. 155–160.
- [6] A. Alvandpour, R.K. Krishnamurthy, K. Soumyanath, S.Y. Borkar, "A Sub-130-nm Conditional Keeper Technique," IEEE Journal of Solid-State Circuits, 37 (2002), pp. 633–638.
- [7] R. H. Krambeck, C. M. Lee, and H.-F. S. Law, "High-Speed Compact Circuits with CMOS," IEEE Journal of Solid-State Circuits, vol. 17, no. 3, pp.614–619, June 1982.
- [8] A. Alvandpour, P. Larsson-Edefors, and C. Svensson, "A Leakage-Tolerant Multi-Phase Keeper for Wide Domino Circuits," in Proceedings of the 1999 IEEE International Conference on Electronics, Circuits and Systems, 1999.
- [9] K. Bernstein, K. M. Carrig, C. M. Durham, P. R. Hansen, D. Hogenmiller, E. J. Nowak, and N. J. Rohrer, High-Speed CMOS Design Styles, Kluwer Academic Publishers, first edition, 1999.
- [10] T. Sakurai and A. R. Newton, "Delay Analysis of Series-Connected MOSFET Circuits," IEEE Journal of Solid-State Circuits, vol. 26, no. 2, pp. 122–131, Feb. 1991.
- [11] Jan M. Rabaey and Massoud Pedram. "Low Power Design Methodologies". Kluwer Academic Publisher, 1996.
- [12] Anantha P. Chandrakasan and Robert W. Brodersen. "Low Power Digital CMOS Design," Kluwer Academic Publisher, 1995.
- [13] H. Veendrick, "Short Circuit Dissipation of Static CMOS Circuitry and Its Impact on the Design of Buffer Circuits," *IEEE Journal of Solid-State Circuits*, vol. 19, no. 4, pp. 468–473, Aug. 1984.
- [14] N. Hedenstierna and K. Jeppson, "CMOS Circuit Speed and Buffer Optimization," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 6, pp. 270–281, Mar. 1987.
- [15] K. Nose and T. Sakurai, "Analysis and Future Trend of Short-Circuit Power," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 19, no. 9, pp. 1023–1030, Sept. 2000.