Low power high speed 40-bit Tag comparator using Domino logic for modern power efficient processors

Deepak Agarwal\textsuperscript{#1}, Ankit Singhal\textsuperscript{#2}, R Ravikumar\textsuperscript{#3}, Sriadibhatla Sridevi\textsuperscript{#4}

\textsuperscript{#}School of Electronics Engineering, VIT University, Vellore, India

\textsuperscript{1}agarwal30.deepak@gmail.com

\textsuperscript{2}ankitsinghaliter@gmail.com

\textsuperscript{3}ravi10ee052@hotmail.com

\textsuperscript{4}sridevi@vit.ac.in

Abstract - In this paper, a 40-bit tag comparator is proposed. The proposed tag comparator has minimal leakage current and provides greater noise immunity, without any reduction in speed, when compared to high-speed domino logic, leakage-current replica keeper domino logic, diode-footed domino logic and current-comparison based dual-rail domino logic. The proposed circuit works by taking into account the relation between the mirrored current of the pull-down network (PDN) and the leakage current flowing in the worst case. The proposed 40-bit tag comparator is simulated using the gpdk_90 nm CMOS process technology. The simulation results show a 70% reduction in power consumption, an 8% reduction in required area and a 22% reduction in propagation delay compared with the conventional standard footless domino under equal robustness conditions. The PMOS keeper transistor of conventional structures is eliminated, and the figure of merit of the proposed circuit improves by a factor of 6.45 relative to the conventional domino circuit. The proposed circuit consumes less power with improved speed and is suitable for fully-associative caches containing a large number of tag comparators.

Keywords - Domino logic, leakage-tolerant, noise immunity, wide fan-in, CMOS, power consumption.

I. INTRODUCTION

Domino logic is one of the most popular dynamic logic families and is used in numerous applications to achieve performance that cannot be obtained with static logic styles [1]. However, the fundamental drawback of dynamic logic families is their vulnerability to noise. As technology scales down, supply voltages are reduced and, in turn, the threshold voltage (Vth) is also reduced; lowering the threshold voltage, however, results in an exponential increase in the sub-threshold leakage current.

Leakage current is one of the major factors that reduce the robustness of a circuit, because it degrades noise immunity. In modern processors, the supply voltage keeps decreasing, which causes a corresponding reduction in threshold voltage and an increase in leakage current; this increases power consumption, mainly in the static mode. An increase in the fan-in of the gates also reduces the speed of the system.

Moreover, to increase functionality, the number of transistors in modern processors keeps growing, so the number of switching events needed to implement the logic also increases, resulting in more power dissipation. Accordingly, reducing leakage current while improving noise immunity is a significant concern in high-speed, robust designs in current technology, particularly for wide fan-in dynamic gates [2].

The processor cache is fast memory that the processor can access with very low latency. Because it resides on the same die as the processor, it is far more expensive to produce than ordinary RAM. The cache is faster than the system RAM and holds information that the processor will access immediately and repeatedly. If the memory runs at a rate closer to the CPU’s clock speed, the number of wasted cycles can be minimized. Since modern microprocessors working at high frequencies demand a high-performance cache, researchers have been working on the design of cache memory with a high hit rate. The tag comparator, which generates the hit/miss control signal for the cache controller, plays a major role in the performance of the cache memory. Highly set-associative caches or content addressable memory (CAM) [1] tags are employed to increase the hit rate at the expense of more power consumption.

In this paper, the design of a 40-bit tag comparator for a 64-bit processor with a 50-bit physical address is described. In the proposed circuit, a cascode current mirror is added at the footer of the evaluation network, which reduces the sub-threshold leakage current due to the stacking effect [21]. The voltage drop across the cascode mirror transistors decreases the sub-threshold leakage by establishing a negative potential between the gate and source of the pull-down network transistors. Furthermore, the body effect of the evaluation transistors increases, which in turn increases the threshold voltage of each transistor in the pull-down network. The drain-to-source voltage, and hence the drain-induced barrier lowering (DIBL) current, of the pull-down network transistors also decreases, which in turn reduces the overall power consumption of the circuit.

This paper is organized as follows: Section II deals with the sources of power dissipation in microprocessors, Section III reviews the literature, Section IV describes low-power designs for tag comparators, and Section V presents the proposed design. Section VI presents the simulation results and compares them with conventional domino designs. Section VII concludes the paper.

II. SOURCES OF POWER DISSIPATION

Electronic devices keep shrinking, and most of them are becoming portable. As portability relies on battery power and life, designing an energy-efficient system becomes an important issue. To reduce power dissipation, it is necessary to identify the parts of the system that consume a high percentage of the power compared to others, so that optimization can be focused on those parts and the overall power consumption of the system can be reduced.

The power consumption of a logic gate can be described by

\[ P_{\text{avg/gate}} = P_{\text{switching}} + P_{\text{short-circuit}} + P_{\text{leakage}} \]  

where \( P_{\text{switching}} \) is the dynamic power consumed during the switching of the transistors, i.e. the charging and discharging of capacitances; the higher the number of transistors, the more switching events occur and the higher the dynamic power consumption. \( P_{\text{short-circuit}} \) is the power dissipated through a direct path between the supply and ground. In CMOS technology, the combination of well and substrate forms parasitic p-n-p and n-p-n transistors; if leakage current drives either parasitic transistor above its turn-on threshold, that transistor turns on and in turn switches on the other, and the current increases until the circuit fails, which results in a short-circuit path between the supply and ground. The factors that contribute to the leakage current of a transistor, and hence to \( P_{\text{leakage}} \), include weak-inversion (sub-threshold) current, gate-induced drain leakage (GIDL), punch-through leakage and drain-induced barrier lowering (DIBL). \( P_{\text{leakage}} \) is a function of temperature and increases as the temperature rises. It also increases with technology scaling and the accompanying reduction in threshold voltage. It is therefore necessary to reduce the leakage current in order to reduce the power consumption of the system.
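The dependence of sub-threshold leakage on threshold voltage and temperature can be illustrated with the standard first-order exponential model, I_sub ≈ I0 · exp((V_GS − V_th)/(n · kT/q)). The following is a minimal sketch and is not taken from the simulations of this paper: the prefactor `i0`, the slope factor `n` and the example threshold voltages are assumed values, and the temperature dependence of V_th itself is ignored.

```python
import math

BOLTZMANN = 1.380649e-23           # Boltzmann constant, J/K
ELECTRON_CHARGE = 1.602176634e-19  # elementary charge, C


def thermal_voltage(temp_celsius):
    """Return the thermal voltage kT/q in volts."""
    return BOLTZMANN * (temp_celsius + 273.15) / ELECTRON_CHARGE


def subthreshold_current(v_gs, v_th, temp_celsius=110.0, n=1.5, i0=1e-7):
    """First-order sub-threshold current in amperes.

    n (slope factor) and i0 (prefactor) are assumed, illustrative values.
    """
    return i0 * math.exp((v_gs - v_th) / (n * thermal_voltage(temp_celsius)))


if __name__ == "__main__":
    # Hypothetical threshold voltages: lowering Vth raises the leakage exponentially.
    for v_th in (0.35, 0.30, 0.25):
        i_sub = subthreshold_current(0.0, v_th)
        print(f"Vth = {v_th:.2f} V -> I_sub = {i_sub:.3e} A")
```

In this model a negative gate-to-source voltage also lowers the current exponentially, which is the mechanism exploited by a footer that raises the source potential of the pull-down transistors.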

In the history of computing, the invention of caches set a new milestone in processing speed. Modern CPU cores, from ultra-low-power chips to high-end cores (e.g. from the ARM Cortex-A5 to the Intel Core i7), use caches. Even ultra-low-power designs and high-end microcontrollers use small caches for their performance benefits. The Intel Pentium 4 processor employs an 8 KB L1 data cache and an integrated 8-way set-associative L2 cache with a 256-bit bus that works at full core frequency. It also has a 128-byte cache line, as shown in Fig. 1.

![Fig.1. Block diagram of Intel Pentium 4 processor [4]](image)

The performance impact of adding a CPU cache is directly related to its efficiency, or hit rate. Repeated cache misses can have a catastrophic impact on CPU performance. Each additional level of cache reduces the need to go to main memory and can improve execution in specific cases. Fig. 2 shows the Pentium 4 chip, where the entire left side of the die is dedicated to the L2 cache.

Tag RAM, a specific type of RAM, holds the record of all the memory locations that can map to any given block of cache. A fully associative cache has the benefit of a high hit rate, since any block of RAM data can be stored in any block of cache. Its major disadvantage, however, is that the search time is extremely long: the CPU has to search its entire cache to find out whether the data is available there before it starts searching main memory. Another kind of cache, the direct-mapped cache, can be searched rapidly, but since each memory location maps to exactly one cache block, its hit rate is low. On-chip caches consume 15% of the total power in the Alpha 21364 [6], 43% in the ARM SA110 [7] and 50% in a 300 MHz bipolar CPU [8].

III. LITERATURE REVIEW

The standard footless domino (SFLD) circuit, shown in Fig. 3, is based on conventional dynamic logic and is very commonly used. In this domino circuit, a keeper, consisting of a PMOS transistor driven by the output, is employed to protect the dynamic node of the evaluation network from undesired discharge during the evaluation phase and to prevent charge sharing in the pull-down network, which enhances robustness. The keeper ratio is defined as follows:

$$K = \frac{\mu_p (W/L)_{\text{keeper}}}{\mu_n (W/L)_{\text{evaluation}}} \tag{2}$$

where W and L denote the transistor width and length, and µn and µp are the electron and hole mobilities, respectively. Although upsizing the keeper transistor enhances noise immunity, it also increases the contention current between the pull-down network (PDN) and the keeper transistor, which in turn increases the power consumption and evaluation delay in conventional standard domino logic circuits.
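As a small numerical illustration of the keeper ratio in Eq. (2), the sketch below evaluates K for hypothetical device sizes. The mobility values and W/L figures are assumed for illustration only and are not taken from the simulated designs.

```python
def keeper_ratio(mu_p, mu_n, wl_keeper, wl_eval):
    """Keeper ratio K = (mu_p * (W/L)_keeper) / (mu_n * (W/L)_evaluation)."""
    if mu_n <= 0 or wl_eval <= 0:
        raise ValueError("mu_n and wl_eval must be positive")
    return (mu_p * wl_keeper) / (mu_n * wl_eval)


if __name__ == "__main__":
    # Assumed example: electron mobility 2.5x the hole mobility,
    # keeper W/L of 2 and evaluation-transistor W/L of 4.
    k_small = keeper_ratio(mu_p=1.0, mu_n=2.5, wl_keeper=2.0, wl_eval=4.0)
    k_large = keeper_ratio(mu_p=1.0, mu_n=2.5, wl_keeper=4.0, wl_eval=4.0)
    print(f"K (small keeper) = {k_small:.2f}")
    print(f"K (upsized keeper) = {k_large:.2f}")
```

Doubling the keeper W/L doubles K, which corresponds to the stronger keeper (better noise immunity, more contention) discussed in the text.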

These problems are more pronounced in wide fan-in dynamic logic gates, where a large number of leaky NMOS transistors connected to the dynamic node are used to implement the evaluation PDN. The conventional keeper approach is less effective in recent CMOS technologies, since there is a trade-off between performance and robustness, and the number of transistors in the pull-down network is limited. Many circuits have been proposed to deal with these problems. One major class of circuit techniques modifies the circuit that controls the gate voltage of the PMOS keeper transistor [14-17]; another modifies the circuit topology of the footer transistor or re-engineers the evaluation (pull-down) network in different ways [13][18].

A leakage-current replica (LCR) keeper, proposed in [10], tracks process, voltage and temperature (P-V-T) variations in the circuit, as shown in Fig. 8. The leakage current of the pull-down network is mirrored by a simple current mirror, which controls the strength of the keeper transistor. The diode-partitioned domino (D-PD) circuit, shown in Fig. 5 [14], uses an enhanced diode that increases the gate voltage of the NMOS transistors in the evaluation network, which in turn improves the delay. In [13], a controlled keeper by current-comparison domino (CK-CCD), shown in Fig. 6, is developed. In this circuit, to improve delay and power consumption, the evaluation network current is compared with a reference current, and this comparison controls the strength of the keeper transistor. In [19], a comparison-based dual-rail domino (CDRL) is presented; it uses voltage comparison to produce the output and its complement at the same instant, as shown in Fig. 7. Current-comparison based domino (CCBD), shown in Fig. 8, is another circuit technique used to decrease the contention current and power consumption [20].

IV. LOW POWER DESIGNS

Leakage in the evaluation network of a domino circuit increases with technology scaling, especially in wide fan-in gates, yielding both higher power consumption and lower noise immunity. The voltage swing at the dynamic node reduces at each hit/miss. Therefore, new and effective circuit designs are needed to obtain higher noise immunity and performance in wide fan-in circuits. Increasing the number of inputs not only increases the current contention between the evaluation network and the keeper transistor but also increases the worst-case delay and power consumption. The main idea of the proposed circuit is to compare the current flowing through the evaluation network with a reference current by means of a cascode current mirror, and to remove the keeper transistor, which consumes power whenever a hit/miss occurs.

As the keeper has been removed, power consumption is reduced because the dynamic node is no longer connected to the supply during the evaluation period. The mirrored PDN current is compared with the reference current, and the resulting logic value is observed.

**Tag comparator design**

Cache memory bridges the mismatch between the high speed of the processor and the low speed of off-chip memory. Modern processors therefore need a good cache controller with a high hit rate in order to increase their performance. Much research has produced high-performance cache memory, but the gap between the performance of the processor and that of main memory is still growing as technology advances, due to several factors. Of all the parameters, those of the tag comparator are the most crucial, because the cache controller is kept on hold while the hit/miss signal is generated and cannot complete its task until then.

Wide fan-in domino logic is used to implement the tag comparator. With advances in technology, the number of input lines and the memory size increase. The number of tag bits and the physical address space therefore also increase, which in turn increases the delay as well as the power dissipation in tag comparators. This implies that high fan-in domino circuits have to be implemented in order to handle such parameters. The need for high fan-in domino circuits has grown particularly in deep sub-micron (DSM) technologies with higher sub-threshold leakage. Using content addressable memory (CAM) tags to increase the hit rate of the tag comparator in turn increases the power consumption of the cache memory. The CAM cell circuitry and the CAM-tag cache are shown in Figs. 10 and 11, respectively.

In this paper, the standard ten-transistor CAM cell is used. The CAM cell is built from an SRAM cell and is used to increase the hit rate, while a dynamic XOR gate is implemented for comparison.

![Fig.10. CAM cell circuitry [1]](image)

![Fig.11. Architecture of CAM-tag cache memory [1]](image)

![Fig.12. Tag comparator and tag SRAM[14]](image)

The design is such that the match line is connected to all the CAM cells in a row; the match lines are first pre-charged to VDD and then discharged to zero on a mismatch. A 64-bit microprocessor with a 50-bit physical address uses a 40-bit tag comparator.
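The relation between the 50-bit physical address and the 40-bit tag can be sketched as a simple bit-field split. This is only an illustration: the text does not state how the remaining 10 address bits are divided between index and offset, so they are kept together here as a single hypothetical `low` field.

```python
from typing import Tuple

PHYS_ADDR_BITS = 50                    # physical address width given in the text
TAG_BITS = 40                          # tag width given in the text
LOW_BITS = PHYS_ADDR_BITS - TAG_BITS   # 10 bits; index/offset split is not specified


def split_address(addr: int) -> Tuple[int, int]:
    """Split a 50-bit physical address into (tag, low) fields."""
    if not 0 <= addr < (1 << PHYS_ADDR_BITS):
        raise ValueError("address does not fit in 50 bits")
    tag = addr >> LOW_BITS                 # upper 40 bits go to the tag comparator
    low = addr & ((1 << LOW_BITS) - 1)     # lower 10 bits (index and/or offset)
    return tag, low


if __name__ == "__main__":
    example = (0xABCDE12345 << LOW_BITS) | 0x3FF
    tag, low = split_address(example)
    print(f"tag = {tag:#012x}, low = {low:#05x}")
```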

This tag comparator is implemented using 40-bit OR gate logic and 2-input XOR logic, as shown in Fig. 12 [14]. Each input bit is compared with its tag address bit by the tag CAM; that is, the 40 inputs are compared with the 40-bit tag address in parallel, but this speed is achieved at the expense of more power dissipation within the circuit. If a match is found, the cache hit/miss signal is generated in association with the word stored in the DATA SRAM of the microprocessor for reading. Otherwise, a cache line replacement generally takes place.
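At the behavioral level, the XOR-then-OR structure described above can be sketched as follows. This is a functional model only, with hypothetical function names; it says nothing about timing, power or the dynamic circuit implementation.

```python
TAG_BITS = 40
TAG_MASK = (1 << TAG_BITS) - 1


def tag_mismatch(incoming_tag: int, stored_tag: int) -> bool:
    """Return True if any of the 40 tag bits differ.

    The bitwise XOR models the 40 two-input XOR gates; the comparison
    with zero models the 40-input OR that merges their outputs.
    """
    diff = (incoming_tag ^ stored_tag) & TAG_MASK
    return diff != 0


def cache_hit(incoming_tag: int, stored_tag: int) -> bool:
    """A hit is signalled only when all 40 bits match."""
    return not tag_mismatch(incoming_tag, stored_tag)


if __name__ == "__main__":
    print(cache_hit(0x12345, 0x12345))   # identical tags
    print(cache_hit(0x12345, 0x12344))   # tags differ in the least significant bit
```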

The tag comparator design has a large number of branches connected to the dynamic node, which increases the parasitic effects to a great extent. For the 40-bit design, an 80-input AND-OR gate is implemented, with 160 transistors connected such that every branch has exactly two transistors in series. There are 80 such parallel branches in total, because each two-input XOR is made of two branches.
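The branch count of this pull-down network can be checked with a small structural model. The sketch below assumes the usual dynamic XOR arrangement, in which one branch of each bit is driven by a and the complement of b and the other by the complement of a and b; this gate assignment is an assumption, since the text only gives the number of branches and transistors.

```python
TAG_BITS = 40
TAG_MASK = (1 << TAG_BITS) - 1


def pdn_branches(a: int, b: int) -> list:
    """Return 80 booleans, one per two-transistor series branch.

    A branch conducts only when both of its series transistors are ON.
    """
    branches = []
    for i in range(TAG_BITS):
        a_i = (a >> i) & 1
        b_i = (b >> i) & 1
        branches.append(bool(a_i and not b_i))   # branch driven by a_i and NOT b_i
        branches.append(bool(b_i and not a_i))   # branch driven by NOT a_i and b_i
    return branches


def dynamic_node_discharges(a: int, b: int) -> bool:
    """The dynamic node is pulled down if at least one branch conducts."""
    return any(pdn_branches(a, b))


if __name__ == "__main__":
    branches = pdn_branches(0x1, 0x0)    # tags differ in exactly one bit
    print(len(branches), "branches,", 2 * len(branches), "transistors")
    print(sum(branches), "conducting branch(es)")
```

With tags that differ in a single bit, exactly one branch conducts, which is the single-path discharge case used as the worst case for delay.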

![Fig.13. The 40-bit SFLD tag comparator [5].](image)

For the worst-case input pattern, only one of the 80 branches conducts and discharges the dynamic node, yielding lower noise robustness, lower speed and higher power consumption. This loss becomes crucial as the number of input bits in the tag comparator increases, as in future microprocessors. Therefore, there is a great need for multiple design stages along with a keeper to avoid noise without increasing the area overhead, power consumption and delay. Thus, it can be concluded that the tag comparators are the main consumers of total power in the cache memory, particularly in a CAM-tag cache, because of their large number of inputs.

In this paper, the tag comparators are implemented using a current-comparison based domino logic, called CCL and shown in Fig. 14, in order to reduce the delay and the power consumption of wide fan-in tag comparators while maintaining adequate noise robustness. To reduce the power consumption caused by the substantial switching capacitance, the keeper circuit has been removed from the dynamic node. The output hit/miss signal is generated by mirroring the PDN current and comparing it with the reference current by means of a current sense amplifier. Owing to the reduced power consumption of the tag comparator, CAM cache memory can be implemented in embedded systems to gain the advantages of higher speed and low power consumption.

V. PROPOSED CCL DESIGN

A conventional domino circuit with wide fan-in presents a large capacitance at the dynamic node, which causes considerable delay in charging and discharging and in turn reduces the speed drastically. In addition, wide gates contain many parallel leakage paths, which leads to reduced noise immunity. Although increasing the size of the keeper transistor can enhance noise tolerance, the delay and power consumption increase due to substantial contention. This issue can be overcome by isolating the PDN, which implements the logic function, from the keeper transistor using a comparison circuit in which the current in the pull-down network (PDN) is compared with the worst-case leakage current. This idea is proposed in CCD [20], which uses the PUN rather than the PDN. In fact, a race condition occurs between the reference current and the PUN.

The circuit adopted for generating the reference current [5], which is shared by all gates, is shown in Fig. 9. This circuit is similar to the replica leakage circuit proposed in [10]. The reference circuit replicates the worst-case leakage current of the PDN in order to track leakage current variations caused by process variations effectively.

In the proposed CCL, the circuit uses NMOS transistors to implement the logic function and has a shared reference circuit along with seven extra transistors compared to the standard footless domino (SFLD), as shown in Fig. 14. The current of the PDN is mirrored by the cascode current mirror circuit and compared with the reference current, which replicates the leakage current of the PDN. The topology used for the reference circuit, which is shared by all gates as proposed in [5], effectively tracks process, voltage and temperature variations. With an appropriate choice of the mirror ratio [3], the proposed circuit improves the speed.

The proposed circuit operates in two phases: a pre-charge phase and an evaluation phase. The pre-charge stage consists of the PDN and transistors $M_{Pre}$, $M_{eval}$ and $M_1$. The evaluation stage includes the cascode current mirror circuit and resembles a footed domino with one input, A, as shown in Fig. 14. The input signal at this node is controlled by the first stage. Unlike in conventional dynamic logic circuits, the PDN that implements the desired logic function is separated from the dynamic node $D_n$ and changes the dynamic voltage only indirectly.

In pre-charge mode, the dynamic node $D_n$ and the node $A$ are charged to $V_{DD}$ by $M_{Pre}$ and $M_1$ respectively, keeping the output low. During evaluation mode, depending on the inputs, either a low or a high voltage drop is established across transistors $M4$ and $M5$. If all inputs are at the same level, either high or low, only leakage current flows in the PDN and the cascode current mirror transistors $M4$ and $M5$. The voltage across these transistors is then low, which keeps the output low. If at least one of the inputs is different, a conductive path is created between the dynamic node $D_n$ and the mirror transistors; the voltage drop across $M4$ and $M5$ increases, they turn on, and the output voltage changes. The voltage drop across the current mirror transistors $M4$ and $M5$ becomes positive, yielding an exponential reduction in sub-threshold leakage because of the phenomenon called the stacking effect [10]. Upsizing transistors $M6$ and $M7$ increases the speed at the expense of more power, i.e. higher leakage current and larger deviation under process variations. The mirror ratio is defined as the ratio of the size of transistor $M7$ to the size of transistor $M5$:

$$M = \frac{W_{M7}}{W_{M5}}$$
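The current comparison can be summarized with a purely behavioral sketch. It assumes an ideal mirror in which the mirrored current is M times the PDN current and an ideal comparator that raises the output when the mirrored current exceeds the reference; the function names and the current and width values are hypothetical, and transient effects, channel-length modulation and process variation are all ignored.

```python
def mirror_ratio(w_m7: float, w_m5: float) -> float:
    """Mirror ratio M = W_M7 / W_M5."""
    if w_m5 <= 0:
        raise ValueError("w_m5 must be positive")
    return w_m7 / w_m5


def ccl_output_high(i_pdn: float, i_ref: float, m: float) -> bool:
    """Idealized comparison: output goes high when M * I_PDN exceeds I_ref."""
    return m * i_pdn > i_ref


if __name__ == "__main__":
    m = mirror_ratio(w_m7=0.5e-6, w_m5=0.25e-6)   # assumed widths, giving M = 2
    i_ref = 1e-6                                   # assumed reference current (A)
    print(ccl_output_high(i_pdn=1e-8, i_ref=i_ref, m=m))   # leakage only
    print(ccl_output_high(i_pdn=1e-5, i_ref=i_ref, m=m))   # one conducting branch
```

A larger M makes the mirrored current cross the reference sooner, which is consistent with the speed gain from upsizing described in the text, at the cost of also amplifying leakage.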

Unlike traditional domino logic circuits, the proposed circuit does not require a PMOS keeper transistor, since there is no need to hold the dynamic node $D_n$ high whenever the output goes low; this in turn reduces the power consumption, the total number of transistors and the area.

the transistor in order to make the comparisons fair. For the worst-case scenario of a substantial load due to large fan-out, the output load capacitance is maintained at 5 fF. The NMOS transistor width in the evaluation network is made twice the minimum width, where \( W_{\text{min}} = 1.4 \times L_{\text{min}} \) and \( L_{\text{min}} = 90 \) nm. The inverters are sized to have a trip point of half \( V_{\text{DD}} \), with the exception of CCBD, in which the inverter width ratio is 1. The widths and lengths of the transistors are set to minimum size and are changed only as needed to achieve the desired noise robustness.

**VI. SIMULATION RESULTS AND COMPARISONS**

A 40-bit tag comparator is implemented in the conventional standard footless domino (SFLD) style. In [14], the tag comparator uses diode-partitioned domino (D-PD), partitioned into 20 segments with four paths per segment, which is the fastest configuration. The tag comparator based on comparison-based dual-rail domino (CDL) is implemented as proposed in [5]. The proposed current-comparison based domino comparator circuit is simulated in Cadence Virtuoso 6.1.2 with the gpdk_90nm technology at all process corners, at 110 °C with a supply voltage of 1 V. To quantify the noise rejection capability of the circuits, and to account for the other circuit factors, the following noise metric and figure of merit are used.

**Noise metric**

Many metrics are available to determine the noise margin of a domino circuit; the unity-noise-gain (UNG) is used in this paper. In simple terms, the UNG is the amplitude of input noise that produces the same noise amplitude at the output.

\[
\text{UNG} = \{V_{\text{in}}: V_{\text{NOISE}} = V_{\text{OUTPUT}}\}
\]

The amount of noise can be altered by setting the desired amplitude and duration of the pulse. In the simulations of this paper, worst-case noise, i.e. 20% noise, has been added to the input signal to compare robustness.

**Figure of merit**

Figure of merit (FOM) is a measure used to compare the performance of the circuit with all other existing techniques. In this paper, a comprehensive figure of merit is used that accounts for area, power, delay and noise together.

\[
\text{FOM} = \frac{\text{UNG}_{\text{norm}}}{P_{\text{norm}} \times \tau_{\text{Prop norm}} \times \tau_{\text{Delay norm}} \times A_{\text{norm}}}
\]

where \( \text{UNG}_{\text{norm}} \) is the normalized unity noise gain, \( P_{\text{norm}} \) is the normalized power consumption, \( \tau_{\text{Prop norm}} \) and \( \tau_{\text{Delay norm}} \) are the normalized delay terms (the worst-case propagation delay is used), and \( A_{\text{norm}} \) is the normalized area; each quantity is normalized with respect to SFLD. Table 1 summarizes the FOM of the 40-bit tag comparators implemented using the conventional and proposed designs. The simulation is done with the output driving a 5 fF capacitive load at 110 °C with a supply voltage of 1 V.
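As a consistency check, the FOM row of Table 1 can be recomputed from the normalized rows of the same table. The sketch below assumes that the denominator is the product of normalized power, two delay terms (both taken equal to the single normalized delay listed in the table) and normalized area; under this assumption the tabulated FOM values are reproduced to two decimals.

```python
# Normalized rows of Table 1, all relative to SFLD.
TABLE_1 = {
    "SFLD": {"power": 1.00, "delay": 1.00, "area": 1.000, "ung": 1.0},
    "DPD":  {"power": 0.99, "delay": 0.78, "area": 1.277, "ung": 1.0},
    "CDL":  {"power": 0.59, "delay": 0.78, "area": 0.925, "ung": 1.0},
    "CCL":  {"power": 0.28, "delay": 0.78, "area": 0.910, "ung": 1.0},
}


def figure_of_merit(ung: float, power: float, delay: float, area: float) -> float:
    """FOM = UNG / (P * tau * tau * A), all quantities normalized to SFLD.

    Assumption: both delay terms equal the single normalized delay of Table 1.
    """
    return ung / (power * delay * delay * area)


def fom_table() -> dict:
    """Return the FOM of every design, rounded to two decimals."""
    return {name: round(figure_of_merit(**row), 2) for name, row in TABLE_1.items()}


if __name__ == "__main__":
    for name, fom in fom_table().items():
        print(f"{name}: FOM = {fom}")
```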

In this work, the worst-case propagation delay is taken as the case in which only one of the 80 parallel paths in the pull-down network is available to discharge the dynamic node Dn. In the remaining parallel paths, to model the worst case, one of the two series-connected transistors is ON and the other is OFF. Fig. 15 shows the simulated waveforms of the CCL circuit under the worst-case condition at 110 °C in the typical process corner. For proper comparison, all the data are normalized to those of the SFLD. Fig. 15 demonstrates that the area and power consumption of the proposed CCL for the 40-bit tag comparator are lower than those of the SFLD, DPD and CDL. There is a 70% reduction in power consumption, a 22% reduction in delay and an 8% reduction in area compared to SFLD, and the PMOS keeper transistor is eliminated under the same robustness condition, yielding a 6.45-fold improvement in FOM. The results show that the figure of merit (FOM) of the proposed design is higher than that of all the other conventional designs.

The impact of the number of inputs on the power consumption and delay of the proposed CCL circuit is shown in Fig. 19, where the unity noise gain is maintained at 0.2 V. As the figure indicates, the power consumption of the CCL circuit remains low even with more than 64 inputs, and the delay decreases as the number of inputs increases.

Figs. 17 and 18 show the power consumption and the delay at three different temperatures for the typical process at 1 V, and for variable VDD at 110 °C in the typical process. The effect of process variation on the normalized power consumption and delay of CCL is simulated at four process corners at 1 V and 110 °C, as shown in Fig. 16.

Table 1. Figure of merit comparisons of the tag comparators

<table>
<thead>
<tr>
<th></th>
<th>SFLD</th>
<th>DPD</th>
<th>CDL</th>
<th>CCL</th>
</tr>
</thead>
<tbody>
<tr>
<td>No. of transistors</td>
<td>184</td>
<td>166</td>
<td>176</td>
<td>169</td>
</tr>
<tr>
<td>Area(WminxLmin)</td>
<td>372</td>
<td>472</td>
<td>344</td>
<td>340</td>
</tr>
<tr>
<td>Normalized area</td>
<td>1</td>
<td>1.277</td>
<td>0.925</td>
<td>0.91</td>
</tr>
<tr>
<td>Power (µW)</td>
<td>51</td>
<td>50.4</td>
<td>30</td>
<td>14</td>
</tr>
<tr>
<td>Normalized power</td>
<td>1</td>
<td>0.99</td>
<td>0.59</td>
<td>0.28</td>
</tr>
<tr>
<td>Delay (ps)</td>
<td>217</td>
<td>169</td>
<td>169</td>
<td>169</td>
</tr>
<tr>
<td>Normalized delay</td>
<td>1</td>
<td>0.78</td>
<td>0.78</td>
<td>0.78</td>
</tr>
<tr>
<td>UNG (V)</td>
<td>0.2</td>
<td>0.2</td>
<td>0.2</td>
<td>0.2</td>
</tr>
<tr>
<td>Normalized UNG</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>FOM</td>
<td>1</td>
<td>1.3</td>
<td>3.01</td>
<td>6.45</td>
</tr>
</tbody>
</table>

Fig.16. Normalized power and delay of CCL at various process corners

**VII. CONCLUSION**

The tag comparators used in cache memory are designed using wide fan-in dynamic gates. Aggressive scaling of dynamic gates leads to lower noise immunity and higher leakage current, even though the power consumption is reduced. The demand for high noise immunity under low power consumption has therefore increased sharply. This paper presented an efficient way of designing a 40-bit tag comparator based on current-comparison based domino logic (CCL), which compares the mirrored current of the evaluation network (PDN) with the reference current and performs the logic function with the lowest miss rate. The proposed design is simulated using gpdk_90 nm CMOS technology in the typical process corner at 110 ºC. The results show 70% lower power consumption compared to the conventional standard footless domino logic (SFLD) [13].
REFERENCES


AUTHOR PROFILE

Author1 Deepak Agarwal: He is currently pursuing M.Tech. in VLSI Domain. He completed his B.Tech. in Electronics and Communication. His current area of research includes low power IC design, high speed data processing and mixed signal processing.

Author2 Ankit Singhal: He is currently pursuing M.Tech. in VLSI Domain. He completed his B.Tech. in Instrumentation. His current area of research includes low power IC design and memory design.

Author3 Ravi Kumar: He is currently pursuing M.Tech. in VLSI Domain. He completed his B.Tech. in Electrical and Electronics. His current area of research includes high speed data processing and low power analog integrated circuits and high speed data converters.

Author4 Sriadibhatla Sridevi: She completed her PhD and is currently working as a professor at VIT University. Her areas of research include ultra-low-power integrated circuit design and digital signal processing.