#### Vol. 2, Special Issue 10, March 2016

# LOW POWER MULTIPLIER DESIGN USING DADDA MULTIPLIER

V.Sudha<sup>1</sup>, Vishnu Priya<sup>2</sup>, C.Yasotha<sup>3</sup>, T.Rajaganapathi<sup>4</sup>

UG Scholar, Electronics and Communication Engineering, EBET Group of Institutions, Tirupur, India<sup>1,2,3</sup>

Assistant Professor, Electronics and Communication Engineering, EBET Group of Institutions, Tirupur, India<sup>4</sup>

### ABSTRACT

In this paper, we propose a reliable low-power algorithmic noise tolerant (ANT) multiplier design that uses a Dadda multiplier together with a fixed-width multiplier to build the reduced-precision replica (RPR) redundancy block. The ANT architecture can meet the demands of high precision, low power consumption, and area efficiency. We replace the array multiplier with a Dadda multiplier to reduce the number of gates in the multiplication process, which reduces the hardware complexity. In a $12 \times 12$ bit ANT multiplier, the circuit area of our fixed-width RPR can be lowered by 44.55% and the power consumption of our design can be reduced by 23%.

**KEYWORDS-** Algorithmic noise tolerant (ANT), fixed-width multiplier, reduced-precision replica (RPR), Dadda multiplier.

### **INTRODUCTION**

The rapid growth of portable and wireless computing systems in recent years drives the need for ultra-low-power systems. To lower power dissipation, supply voltage scaling is widely used as an effective low-power technique, since the power consumption of CMOS circuits is proportional to the square of the supply voltage [1]. However, in deep-submicrometer process technologies, noise interference makes it difficult to design reliable and efficient microelectronic systems; hence, design techniques that enhance noise tolerance have been widely developed. The RPR designs in the ANT designs of [5]-[7] are built in a customized manner and are not easily adopted or repeated. These RPR designs can operate very fast, but their hardware complexity is too high. As a result, the RPR design in the ANT design of [2] is still the most popular because of its simplicity. However, adopting this RPR still incurs extra area overhead and power consumption.

In this paper, we further propose an easy way to replace the full-width RPR block with a fixed-width RPR. Using the fixed-width RPR, the computation error can be corrected with lower power consumption and lower area overhead. We make use of probability, statistics, and partial product weight analysis to find an approximate compensation vector for a more precise RPR design. In order not to increase the critical path delay, we require that the compensation circuit in the RPR not be located on the critical path. As a result, we can realize the ANT design with smaller circuit area, lower power consumption, and lower critical supply voltage.

### ANT ARCHITECTURE DESIGNS

The ANT technique includes both a main digital signal processor (MDSP) and an error correction (EC) block. To meet the ultra-low-power demand, voltage overscaling (VOS) is used in the MDSP. However, under VOS, once the critical path delay $T_{cp}$ of the system becomes greater than the sampling period $T_{samp}$, soft errors will occur. In the ANT technique, a replica of the MDSP with reduced-precision operands and shorter computation delay is used as the EC block. Under VOS, there are a number of input-dependent soft errors in the MDSP output $y_{a}[n]$; however, the RPR output $y_{r}[n]$ is still correct since the critical path delay of the replica is smaller than $T_{samp}$. Therefore, $y_{r}[n]$ is used to detect errors in the MDSP output $y_{a}[n]$. Error detection is accomplished by comparing the difference $|y_{a}[n] - y_{r}[n]|$ against a threshold $Th$. Once the difference between $y_{a}[n]$ and $y_{r}[n]$ is larger than $Th$, the output $y[n]$ is $y_{r}[n]$ instead of $y_{a}[n]$. As a result, $y[n]$ can be expressed as

$$y[n] = \begin{cases} y_{a}[n], & \text{if } |y_{a}[n] - y_{r}[n]| \le Th \\ y_{r}[n], & \text{if } |y_{a}[n] - y_{r}[n]| > Th \end{cases} \quad (1)$$

#### International Journal of Advanced Research in Biology Engineering Science and Technology (IJARBEST)


Th is determined by

$$Th = \max_{\forall \text{input}} |y_{o}[n] - y_{r}[n]| \quad (2)$$

where $y_{o}[n]$ is the error-free output signal. In this way, the power consumption can be greatly lowered while the SNR is maintained without severe degradation.
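The decision rule in (1) and the threshold in (2) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `ant_threshold` and `ant_select` and all sample values are hypothetical, and the threshold is taken over a small list of samples rather than over all possible inputs.

```python
from typing import Sequence


def ant_threshold(y_o: Sequence[int], y_r: Sequence[int]) -> int:
    """Eq. (2): largest |y_o[n] - y_r[n]| over the supplied error-free samples."""
    return max(abs(o - r) for o, r in zip(y_o, y_r))


def ant_select(y_a: int, y_r: int, th: int) -> int:
    """Eq. (1): keep the MDSP output unless it differs from the replica by more than Th."""
    return y_a if abs(y_a - y_r) <= th else y_r


# Hypothetical sample values, chosen only for illustration.
y_o = [1000, 2040, 515]   # error-free MDSP outputs
y_r = [992, 2048, 512]    # reduced-precision replica outputs
th = ant_threshold(y_o, y_r)

y_a = [1000, 6136, 515]   # MDSP outputs under VOS; the second sample has a soft error
y = [ant_select(a, r, th) for a, r in zip(y_a, y_r)]
print(th, y)
```

In this example the second sample deviates from the replica by far more than the threshold, so the replica value is selected for it, while the other two samples keep the full-precision MDSP output.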

Fig. 1. ANT architecture with fixed-width RPR.

### FIXED WIDTH MULTIPLIER

Fixed-width multipliers have been widely used in the design of digital signal processors because of their smaller area and lower power dissipation. To reduce the chip area of channel detectors for cognitive radio, many fixed-width Booth multipliers have been used. However, they reduce the detection accuracy because of the truncated partial products. This method can reduce the truncation error by using a variable compensation value. The third category is hybrid error compensation, which uses both constant and adaptive QEC techniques together to reduce the truncation error. To overcome this disadvantage, a method has been presented that divides the truncated partial products into the major truncated section and the minor truncated section.



Fig. 2. $12 \times 12$ bit ANT multiplier implemented with the six-bit fixed-width replica redundancy block.

To evaluate the accuracy of a fixed-width RPR, we can take the difference between the (n/2)-bit fixed-width RPR output and the 2n-bit full-length MDSP output, which is expressed as

$$\varepsilon = P - P_t$$

where $P$ is the output of the complete multiplier in the MDSP and $P_t$ is the output of the fixed-width multiplier in the RPR. $P_t$ can be expressed as

$$P_t = \sum_{j=n/2+1}^{n-1} y_j 2^{j} \sum_{i=3n/2-j}^{n-1} x_i 2^{i}$$

The error generated in the fixed-width RPR is dominated by the bit products of the input correction vector (ICV), since they have the largest weight. In [8], it is reported that a low-cost EC circuit can be easily designed.
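As a rough illustration of the error $\varepsilon = P - Pt$, the following Python sketch evaluates the full product and an uncompensated fixed-width output bit by bit. It assumes that the index ranges in the expression for Pt read as $n/2+1 \le j \le n-1$ and $3n/2-j \le i \le n-1$, that n is even, and that the operands are unsigned; the function names are hypothetical.

```python
def full_product(x: int, y: int) -> int:
    """P: the complete 2n-bit product produced by the MDSP multiplier."""
    return x * y


def fixed_width_rpr(x: int, y: int, n: int) -> int:
    """Pt: sum of x_i * y_j * 2^(i+j) over the retained partial products.

    Assumed index ranges: n/2+1 <= j <= n-1 and 3n/2-j <= i <= n-1.
    No compensation term is applied here.
    """
    total = 0
    for j in range(n // 2 + 1, n):
        y_j = (y >> j) & 1
        for i in range(3 * n // 2 - j, n):
            x_i = (x >> i) & 1
            total += (x_i & y_j) << (i + j)
    return total


n = 12
x, y = 0xFFF, 0xFFF                      # worst case: all operand bits set
p = full_product(x, y)
p_t = fixed_width_rpr(x, y, n)
epsilon = p - p_t                        # truncation error before compensation
print(p, p_t, epsilon)
```

Because every retained partial product has weight of at least $2^{3n/2}$, the uncompensated output is always a multiple of $2^{18}$ for n = 12, and the error is never negative for unsigned operands.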

### **REDUCED PRECISION REDUNDANCY**

The MDSP block is subject to VOS, which results in soft errors in its output. When a soft error in the MDSP is detected using an error control (EC) block, the RPR output is used as the output. Next, we describe the error characteristics of a system under VOS and then present the proposed error control algorithm.

### DADDA MULTIPLIER

In the Dadda multiplier, the number of stages is smaller, so summation of the partial products is faster. In fact, the Dadda multiplier is time-optimal, having $T = O(\log_2 n)$, and this limit is reached very quickly, so the scheme is suitable for small word lengths.

The carry-save array has $T = O(n)$ but a more regular structure. Both the carry-save array multiplier and the Dadda multiplier may be pipelined by inserting registers between each stage of processing elements.

However, if the system is pipelined at every stage, the Dadda multiplier will require fewer register stages since it has fewer processing stages. The carry-save array requires fewer registers at each stage compared with the Dadda scheme; this is offset, however, by the fact that more register stages are required for the array to operate at the same clock frequency.

Overall, the latency of a carry-save array will be longer than that of the Dadda multiplier if both systems are pipelined so that each has the same delay at each stage.

The number of gate delays in the carry-save scheme can be compared with that in the Dadda scheme by assuming that both multipliers are constructed using only NAND gates and that both schemes use a carry-lookahead adder at the final stage.
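The logarithmic growth of the Dadda reduction depth can be illustrated with the standard height sequence 2, 3, 4, 6, 9, 13, ..., in which each target column height is the floor of 1.5 times the previous one. The sketch below only counts reduction stages ahead of the final adder; it does not model gate delays, and the function names are hypothetical.

```python
from typing import List


def dadda_heights(n: int) -> List[int]:
    """Target column heights 2, 3, 4, 6, 9, ... that are smaller than n."""
    heights = []
    d = 2
    while d < n:
        heights.append(d)
        d = (3 * d) // 2  # next height is floor(1.5 * d)
    return heights


def dadda_stages(n: int) -> int:
    """Number of reduction stages before the final carry-propagate adder."""
    return len(dadda_heights(n))


for n in (4, 8, 12, 16, 32):
    # Heights are listed in the order the reduction visits them (largest first).
    print(n, dadda_stages(n), dadda_heights(n)[::-1])
```

For the 12-bit operands considered in this paper, the sequence gives five reduction stages (heights 9, 6, 4, 3, 2), and doubling the word length from 16 to 32 bits adds only two more stages.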





### TRUNCATION ERROR IN THE LSP BLOCK

The error generated in the fixed-width RPR is dominated by the bit products of the ICV, since they have the largest weight. It has been reported that a low-cost EC circuit can be designed if a simple relationship between $f(EC)$ and $\beta$ is found, where $\beta$ is the summation of all partial products of the ICV. By statistically analyzing the truncated difference between the MDSP and the fixed-width RPR with a uniform input distribution, we can find the relationship between $f(EC)$ and $\beta$. The statistical results show that the average truncation error in the fixed-width RPR multiplier lies approximately between $\beta$ and $\beta + 1$. More precisely, when $\beta = 0$, the average truncation error is close to $\beta + 1$; when $\beta > 0$, it is very close to $\beta$. If we select $\beta$ as the compensation vector, it can be injected directly into the fixed-width RPR as compensation, which does not need extra compensation logic gates.




Fig. 4. Truncation error in LSB block.

### ERROR COMPENSATION VECTOR FOR FIXED WIDTH RPR DESIGN

In the ANT design, the function of the RPR is to correct the errors occurring in the output of the MDSP and to maintain the SNR of the whole system while the supply voltage is lowered. By using a fixed-width RPR to realize the ANT architecture, we not only lower the circuit area and power consumption but also accelerate the computation compared with the conventional full-length RPR. However, we need to compensate for the large truncation error caused by cutting off many hardware elements in the LSB part of the MDSP. The error generated in the fixed-width RPR is dominated by the bit products of the ICV, since they have the largest weight.

The (n/2)-bit unsigned full-width Baugh–Wooley partial product array can be divided into four subsets: the most significant part (MSP), the input correction vector [ICV($\beta$)], the minor ICV [MICV($\alpha$)], and the least significant part (LSP). In the fixed-width RPR, only the MSP is kept and the other parts are removed. Therefore, ICV($\beta$), MICV($\alpha$), and the LSP are called the truncated part. The truncated ICV($\beta$) and MICV($\alpha$) are the most important parts because they have the highest weights. Therefore, they can be used to construct the truncation error compensation algorithm.
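The four-way partition can be made concrete with a short Python sketch. The boundaries used here are assumptions made for illustration, not taken from the text: the RPR is assumed to use the upper n/2 bits of each operand, the MSP is assumed to hold the partial products with $i + j \ge 3n/2$, ICV($\beta$) the diagonal $i + j = 3n/2 - 1$, MICV($\alpha$) the diagonal $i + j = 3n/2 - 2$, and the LSP everything below. The function name `partition_rpr` is hypothetical.

```python
from typing import Dict, Tuple


def partition_rpr(x: int, y: int, n: int) -> Tuple[Dict[str, int], int]:
    """Split the (n/2)x(n/2) partial product array of the upper operand halves.

    Returns the weighted sum of each subset and beta, the number of
    nonzero partial products on the assumed ICV diagonal.
    """
    half = n // 2
    cut = 3 * n // 2                      # assumed lowest weight kept in the MSP
    parts = {"MSP": 0, "ICV": 0, "MICV": 0, "LSP": 0}
    beta = 0
    for j in range(half, n):
        for i in range(half, n):
            bit = ((x >> i) & 1) & ((y >> j) & 1)
            weight = i + j
            if weight >= cut:
                name = "MSP"
            elif weight == cut - 1:
                name = "ICV"
                beta += bit               # beta sums the ICV partial products
            elif weight == cut - 2:
                name = "MICV"
            else:
                name = "LSP"
            parts[name] += bit << weight
    return parts, beta


parts, beta = partition_rpr(0xFFF, 0xFFF, 12)
truncated = parts["ICV"] + parts["MICV"] + parts["LSP"]
print(parts, beta, truncated)
```

Under these assumptions the four subsets always add up to the product of the two upper operand halves shifted left by n bits, so the truncated part is exactly what the fixed-width RPR loses relative to a full-width replica.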







Fig. 6. High-accuracy fixed-width RPR multiplier with compensation.


Before directly injecting the compensation vector into the fixed-width RPR, we first double-check the weight of each partial product term in the ICV. Because these terms carry the same weight, we apply the same weight of unity to each input correction vector element. This conclusion allows us to inject the compensation vector into the fixed-width RPR directly.

In this way, no extra compensation logic gates are needed for this part of the compensation; only wire connections are needed. We can lower the compensation error effectively, and no additional compensation error is generated.

### **CONCLUSION**

A low-error and area-efficient fixed-width RPR-based ANT multiplier design is implemented. Noise sources such as cosmic rays and alpha particles can impact the error control blocks as well. Under a 0.6 V supply voltage and a 200 MHz operating frequency, the power consumption is 0.393 mW. In the presented 12-bit by 12-bit ANT multiplier, the circuit area of our fixed-width RPR can be reduced by 45%, the lowest reliable operating supply voltage of our ANT design can be lowered to 0.623 $V_{DD}$, and the power consumption of our ANT design can be reduced by 23% compared with the state-of-the-art ANT design.

### REFERENCES

[1] (2009). The International Technology Roadmap for Semiconductors [Online]. Available: http://public.itrs.net/

[2] B. Shim, S. Sridhara, and N. R. Shanbhag, "Reliable low-power digital signal processing via reduced precision redundancy," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 12, no. 5, pp. 497–510, May 2004.

[3] B. Shim and N. R. Shanbhag, "Energy-efficient soft-error tolerant digital signal processing," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 14, no. 4, pp. 336–348, Apr. 2006.

[4] R. Hegde and N. R. Shanbhag, "Energy-efficient signal processing via algorithmic noise-tolerance," in *Proc. IEEE Int. Symp. Low Power Electron. Des.*, Aug. 1999, pp. 30–35.

[5] V. Gupta, D. Mohapatra, A. Raghunathan, and K. Roy, "Low-power digital signal processing using approximate adders," *IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst.*, vol. 32, no. 1, pp. 124–137, Jan. 2013.

[6] Y. Liu, T. Zhang, and K. K. Parhi, "Computation error analysis in digital signal processing systems with overscaled supply voltage," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 18, no. 4, pp. 517–526, Apr. 2010.

[7] J. N. Chen, J. H. Hu, and S. Y. Li, "Low power digital signal processing scheme via stochastic logic protection," in *Proc. IEEE Int. Symp. Circuits Syst.*, May 2012, pp. 3077–3080.

[8] J. N. Chen and J. H. Hu, "Energy-efficient digital signal processing via voltage-overscaling-based residue number system," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 21, no. 7, pp. 1322–1332, Jul. 2013.

[9] P. N. Whatmough, S. Das, D. M. Bull, and I. Darwazeh, "Circuit-level timing error tolerance for low-power DSP filters and transforms," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 21, no. 6, pp. 12–18, Feb. 2012.

[10] G. Karakonstantis, D. Mohapatra, and K. Roy, "Logic and memory design based on unequal error protection for voltage-scalable, robust and adaptive DSP systems," *J. Signal Process. Syst.*, vol. 68, no. 3, pp. 415–431, 2012.