High-efficiency carry skip adder in various multiplier structures.
Low-power arithmetic circuits have become very significant in the VLSI industry. The adder is the main building block of many DSP systems and the foremost component of the arithmetic unit. Addition is an important operation in various digital systems, such as digital signal processing and control systems. The speed and accuracy of a digital system are influenced by the efficiency of its adders. Adders are also significant because they are widely used in other basic digital operations such as multiplication, subtraction and division. Hence, improving the performance of the digital adder would greatly advance the execution of binary operations inside a circuit composed of such blocks. The performance of a digital circuit block is characterized by its power dissipation, area and operating speed. Numerous works on improving the speed and power of these units have been reported. It is highly desirable to achieve higher speeds at low power/energy consumption, which is a challenge for the designers of general-purpose processors.
In the existing method, the carry skip adder uses multiplexer logic, which requires a larger gate count. As a result, the power consumed by the multiplexer-based carry skip adder is higher and its critical path delay is longer. The ripple carry adder (RCA) is built by cascading full adder (FA) blocks in series [10-15]. One full adder is responsible for the addition of two binary digits at any stage of the ripple carry. The carry-out of one stage is fed directly to the carry-in of the next stage. Further full adders may be appended to the ripple carry adder, or ripple carry adders of different sizes may be cascaded, in order to accommodate binary vectors of greater width. An n-bit parallel adder needs n computational elements (FAs). The carry propagates serially, so the delay of the RCA grows as the number of bits increases.
The rest of this paper is organized as follows. Section II discusses related work on the proposed CSKA structure for improving the speed and efficiency of adders. In Section III, the ripple carry structure is explained. The incrementation block structure is presented in Section IV. Section V describes the Baugh-Wooley multiplier, while Section VI describes the Wallace tree multiplier. The results and discussion are given in Section VII. Finally, the conclusion is given in Section VIII.
This paper focuses on the speed of the carry skip adder structure and on reducing its critical path delay. The proposed system comprises ripple carry adder blocks with AOI and OAI compound gates for the skip logic. The skip logic decreases the gate count of the structure, increases its speed and reduces its delay. The conventional structure of the CSKA consists of stages, each containing a chain of full adders and a 2:1 multiplexer. The RCA blocks are linked to each other through 2:1 multiplexers, which can be placed into one or more level structures.
The CSKA configuration (i.e., the number of FAs per stage) has a great impact on the speed of this type of adder. Many approaches have been proposed for finding the optimal number of FAs.
Figure 1 shows that the adder comprises two N-bit inputs, A and B, and Q stages. Each stage consists of an RCA block of size Mj (j = 1, ..., Q). In this structure, the carry input of all the RCA blocks, except for the first block, whose carry input is Ci, is zero (i.e., concatenation of the RCA blocks). Consequently, all the blocks of the structure perform their jobs simultaneously.
Adders are a key building block in arithmetic and logic units. Hence, increasing their speed and reducing their power/energy consumption strongly affect the speed and power consumption of processors. Several works on improving the speed and power of these units have been reported. It is highly desirable to achieve higher speeds at low power/energy consumption, which is a challenge for the designers of general-purpose processors.
In this structure, when the first block computes the summation of its corresponding input bits (i.e., SM1, ..., S1) and C1, the other blocks simultaneously compute the intermediate results and also the Cj signals. In the proposed structure, the first stage has only one block, which is an RCA. Stages 2 to Q consist of two blocks: an RCA block and an incrementation block.
The incrementation block uses the intermediate results generated by the RCA block and the carry output of the previous stage to compute the final summation of the stage.
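The stage organization described above can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: the function name ci_cska_add and the stage_sizes parameter are hypothetical, each RCA block is modeled with integer addition rather than gate by gate, and the stage carry is formed as the RCA carry output OR (propagate AND incoming carry), which is one common way to express skip logic when the RCA carry inputs are zero.

```python
def ci_cska_add(a, b, stage_sizes, cin=0):
    """Behavioral model of a concatenation/incrementation carry skip adder.

    stage_sizes lists the RCA block width of each stage, LSB stage first
    (hypothetical parameter name). Returns (sum, carry_out).
    """
    result = 0
    carry = cin
    offset = 0
    for index, m in enumerate(stage_sizes):
        mask = (1 << m) - 1
        a_blk = (a >> offset) & mask
        b_blk = (b >> offset) & mask
        if index == 0:
            # First stage: a plain RCA that consumes the external carry input.
            raw = a_blk + b_blk + carry
            stage_sum = raw & mask
            carry = raw >> m
        else:
            # RCA block with carry input tied to zero (concatenation).
            raw = a_blk + b_blk
            intermediate = raw & mask
            rca_cout = raw >> m
            # Incrementation block: add the previous stage's carry.
            stage_sum = (intermediate + carry) & mask
            # Propagate signal: every bit position has a_i XOR b_i = 1.
            propagate = int((a_blk ^ b_blk) == mask)
            # Stage carry does not use the incrementation block's carry output.
            carry = rca_cout | (propagate & carry)
        result |= stage_sum << offset
        offset += m
    return result, carry
```

For example, ci_cska_add(200, 100, [2, 3, 3]) models an 8-bit adder with three stages; the sum and carry together represent 300.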
All the RCA blocks, except for the first block, have zero as their carry input. The output carries of the RCA blocks are computed in parallel. The skip logic is built from OAI (OR-AND-Invert) and AOI (AND-OR-Invert) compound gates. These gates, which consist of fewer transistors, have lower delay, area and power consumption compared with the 2:1 multiplexer. In this structure, as the carry propagates through the skip logic, it becomes complemented.
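The following sketch checks, by exhaustive enumeration, that compound gates can take the place of the multiplexer once the RCA carry inputs are tied to zero. The wiring is an assumption made for illustration: the AOI gate receives the true carry and emits the complemented stage carry, while the OAI gate receives complemented inputs and emits the true stage carry. All function names are hypothetical.

```python
from itertools import product


def aoi(a, b, c):
    """AND-OR-Invert gate: NOT((a AND b) OR c)."""
    return 1 - ((a & b) | c)


def oai(a, b, c):
    """OR-AND-Invert gate: NOT((a OR b) AND c)."""
    return 1 - ((a | b) & c)


def ripple_carry_out(a_bits, b_bits, cin):
    """Carry output of an RCA block, bits given LSB first."""
    carry = cin
    for a, b in zip(a_bits, b_bits):
        carry = (a & b) | (carry & (a ^ b))
    return carry


def skip_logic_matches(m):
    """Compare mux, AOI and OAI skip logic for every m-bit block input."""
    for bits in product((0, 1), repeat=2 * m):
        a_bits, b_bits = bits[:m], bits[m:]
        p = int(all(a ^ b for a, b in zip(a_bits, b_bits)))
        c_rca = ripple_carry_out(a_bits, b_bits, 0)  # carry input tied to zero
        for cin in (0, 1):
            true_cout = ripple_carry_out(a_bits, b_bits, cin)
            # Conventional skip: the mux bypasses the block when p = 1.
            if (cin if p else true_cout) != true_cout:
                return False
            # AOI stage: true carry in, complemented carry out.
            if aoi(p, cin, c_rca) != 1 - true_cout:
                return False
            # OAI stage: complemented inputs, true carry out.
            if oai(1 - p, 1 - cin, 1 - c_rca) != true_cout:
                return False
    return True
```

The check works because a block whose bits all propagate cannot generate a carry on its own when its carry input is zero, so the multiplexer expression reduces to a single AND-OR term.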
Ripple Carry Adder:
The ripple carry adder is formed by cascading full adder blocks in series. One full adder is responsible for the addition of two binary digits at any stage of the ripple carry. The carry-out of one stage is fed directly to the carry-in of the next stage.
Further full adders may be appended to the ripple carry adder, or ripple carry adders of different sizes may be cascaded, in order to accommodate binary vectors of greater width. An n-bit parallel adder needs n computational elements (full adders).
The worst-case delay of the RCA occurs when a carry signal transition ripples through all stages of the adder chain, from the least significant bit to the most significant bit, and is approximated by:
T = (n - 1)tc + ts (1)
where tc is the delay through the carry stage of a full adder and ts is the delay to compute the sum of the last stage.
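A minimal bit-level sketch of the ripple carry adder and of equation 1 is given here. The function names are hypothetical, and the delay values passed to rca_worst_case_delay are placeholders chosen by the caller, not figures from this paper.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(a, b, n, cin=0):
    """Add two n-bit unsigned integers by chaining n full adders."""
    carry = cin
    total = 0
    for i in range(n):
        # The carry-out of stage i feeds the carry-in of stage i + 1.
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        total |= s << i
    return total, carry


def rca_worst_case_delay(n, tc, ts):
    """Equation 1: T = (n - 1) * tc + ts."""
    return (n - 1) * tc + ts
```

Because the loop visits the bits strictly in order, the model mirrors the serial carry propagation that makes the delay grow linearly with n.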
The internal structure of the incrementation block, which comprises a chain of half-adders (HAs), is shown in Figure 3. Note that, to reduce the delay significantly, the carry output of the incrementation block is not used for calculating the carry output of the stage. Each cell of the incrementation block contains an AND gate and an XOR gate. The carry input is fed into the structure, and each cell operates like a half adder.
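The half-adder chain can be sketched as follows. The function names and the LSB-first list representation are assumptions made for illustration only.

```python
def half_adder(a, b):
    """Half adder: XOR gives the sum bit, AND gives the carry bit."""
    return a ^ b, a & b


def incrementation_block(bits, carry_in):
    """Add a single carry bit to an intermediate result.

    bits holds the RCA block's intermediate sum, LSB first.
    Returns (final_sum_bits, carry_out_of_chain).
    """
    out = []
    carry = carry_in
    for bit in bits:
        s, carry = half_adder(bit, carry)
        out.append(s)
    return out, carry
```

For instance, incrementation_block([1, 1, 0, 1], 1) increments the value 11 and returns ([0, 0, 1, 1], 0), i.e. 12 with no carry out of the chain.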
In signed multiplication, the length and the number of the partial products are very high. Hence, an algorithm for signed multiplication, named the Baugh-Wooley algorithm, was introduced. Baugh-Wooley multiplication is one of the cost-effective ways to handle the sign bits. This technique was developed to design regular multipliers suited to 2's complement numbers. The Baugh-Wooley multiplier hardware architecture is shown in Figure 3. It follows a left-shift algorithm. A MUX selects which bit is multiplied. The carry skip adder structure is applied to the multiplier and a comparison is made.
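The way sign bits are absorbed into the partial-product array can be shown with a short sketch. It follows the commonly used modified Baugh-Wooley formulation, which is an assumption here and may differ in detail from the array drawn in the paper's figure: partial products that involve exactly one sign bit are complemented, and constant 1s are added at bit positions n and 2n - 1. The function name is hypothetical.

```python
def baugh_wooley_multiply(a, b, n=4):
    """Multiply two n-bit two's complement integers.

    Uses the modified Baugh-Wooley partial-product array (assumed form).
    Returns the signed 2n-bit product.
    """
    mask = (1 << n) - 1
    a_bits = [((a & mask) >> i) & 1 for i in range(n)]
    b_bits = [((b & mask) >> j) & 1 for j in range(n)]
    total = 0
    for i in range(n):
        for j in range(n):
            pp = a_bits[i] & b_bits[j]
            # Complement terms that involve exactly one sign bit.
            if (i == n - 1) != (j == n - 1):
                pp ^= 1
            total += pp << (i + j)
    # Correction constants at bit positions n and 2n - 1.
    total += (1 << n) + (1 << (2 * n - 1))
    total &= (1 << (2 * n)) - 1
    # Reinterpret the 2n-bit pattern as a signed value.
    if total >> (2 * n - 1):
        total -= 1 << (2 * n)
    return total
```

Since every partial product is now a plain bit, the array can be summed with unsigned adders only, which is what allows an adder such as the CSKA to be reused inside the multiplier.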
RESULT AND DISCUSSION
The design proposed in this paper has been developed using MODEL SIMULATOR. The area, delay, power and energy are compared for a 16-bit adder and a 32-bit adder. The adder structure was implemented using a fixed stage size and a variable stage size method. The carry skip adder is applied to a multiplier structure, and its area, delay, power and energy are compared.
In this paper, a carry skip adder (CSKA) structure was proposed, which shows lower energy consumption and higher speed compared with the conventional structure. The speed is achieved through the concatenation scheme and the efficiency through the incrementation scheme. In addition, AOI and OAI compound gates were used for the carry skip logic. The efficiency of the proposed structure was studied for both variable stage size (VSS) and fixed stage size (FSS). The results also suggest that the CSKA structure is a very good adder for applications where both speed and energy consumption are critical. In addition, a hybrid variable latency extension of the structure was proposed, which decreases the power without affecting the speed of the structure.
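The comparison can be quantified as a relative reduction between the existing and proposed methods. The sketch below uses delay and power figures copied from Table I; the helper names are hypothetical, and reading the energy column as power multiplied by delay (mW x nsec = pJ) is an assumption that holds for the entries checked here, not a statement made in the paper.

```python
def percent_reduction(existing, proposed):
    """Relative improvement of the proposed value over the existing one."""
    return 100.0 * (existing - proposed) / existing


def energy_pj(power_mw, delay_ns):
    """Energy as power times delay; mW * ns gives pJ (assumed unit)."""
    return power_mw * delay_ns


# Variable stage size delays from Table I, in nsec: (existing, proposed).
vss_delay_ns = {16: (31.988, 22.456), 32: (39.478, 30.927)}

delay_gain = {
    width: percent_reduction(existing, proposed)
    for width, (existing, proposed) in vss_delay_ns.items()
}
```

With these inputs, delay_gain holds the percentage delay reduction of the proposed variable stage size adder for each word length.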
[1.] Koren, I., 2002. Computer Arithmetic Algorithms, 2nd ed. Natick, MA, USA: A K Peters, Ltd.
[2.] Zlatanovici, R., S. Kao and B. Nikolic, 2009. "Energy-delay optimization of 64-bit carry-lookahead adders with a 240 ps 90 nm CMOS design example," IEEE J. Solid-State Circuits, 44(2): 569-583.
[3.] Mathew, S.K., M.A. Anders, B. Bloechel, T. Nguyen, R.K. Krishnamurthy and S. Borkar, 2005. "A 4-GHz 300-mW 64-bit integer execution ALU with dual supply voltages in 90-nm CMOS," IEEE J. Solid-State Circuits, 40(1): 44-51.
[4.] Oklobdzija, V.G., B.R. Zeydel, H.Q. Dao, S. Mathew and R. Krishnamurthy, 2005. "Comparison of high-performance VLSI adders in the energy-delay space," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., 13(6): 754-758.
[5.] Ramkumar, B. and H.M. Kittur, 2012. "Low-power and area-efficient carry select adder," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., 20(2): 371-375.
[6.] Vratonjic, M., B.R. Zeydel and V.G. Oklobdzija, 2005. "Low- and ultra low-power arithmetic units: Design and comparison," in Proc. IEEE Int. Conf. Comput. Design, VLSI Comput. Process. (ICCD), pp: 249-252.
[7.] Nagendra, C., M.J. Irwin and R.M. Owens, 1999. "Area-time-power tradeoffs in parallel adders," IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., 43(10): 689-702.
[8.] He, Y. and C.-H. Chang, 2008. "A power-delay efficient hybrid carry-lookahead/carry-select based redundant binary to two's complement converter," IEEE Trans. Circuits Syst. I, Reg. Papers, 55(1): 336-346.
[9.] Chang, C.-H., J. Gu and M. Zhang, 2005. "A review of 0.18 [micro]m full adder performances for tree structured arithmetic circuits," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., 13(6): 686-695.
[10.] Markovic, D., C.C. Wang, L.P. Alarcon, T.-T. Liu and J.M. Rabaey, 2010. "Ultralow-power design in near-threshold region," Proc. IEEE, 98(2): 237-252.
[11.] Dreslinski, R.G., M. Wieckowski, D. Blaauw, D. Sylvester and T. Mudge, 2010. "Near-threshold computing: Reclaiming Moore's law through energy efficient integrated circuits," Proc. IEEE, 98(2): 253-266.
[12.] Jain, S. et al., 2012. "A 280 mV-to-1.2 V wide-operating-range IA-32 processor in 32 nm CMOS," in IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers (ISSCC), pp: 66-68.
[13.] Zimmermann, R., 1998. "Binary adder architectures for cell-based VLSI and their synthesis," Ph.D. dissertation, Dept. Inf. Technol. Elect. Eng., Swiss Federal Inst. Technol. (ETH), Zurich, Switzerland.
[14.] Harris, D., 2003. "A taxonomy of parallel prefix networks," in Proc. IEEE Conf. Rec. 37th Asilomar Conf. Signals, Syst., Comput., 2: 2213-2217.
[15.] Kogge, P.M. and H.S. Stone, 1973. "A parallel algorithm for the efficient solution of a general class of recurrence equations," IEEE Trans. Comput., C-22(8): 786-793.
(1) Arun Sekar. R, (2) Naveen Balaji. G, (3) A. Gautami, (4) B. Sivasankari
(1, 2, 3, 4) Assistant Professor, Department of Electronics and communication Engineering, SNS College of Technology, Coimbatore, India
Received 7 June 2016; Accepted 12 October 2016; Available 20 October 2016
Address For Correspondence:
Arun Sekar. R, Assistant Professor, Department of Electronics and communication Engineering, SNS College of Technology, Coimbatore, India.
Caption: Fig. 1: Proposed Structure Of CSKA
Caption: Fig. 2:
Caption: Fig. 4: Structure of 8*8 Wallace Tree Multiplier
Table I: Comparison of area, delay, power and energy of CSKA structure

EXISTING METHOD
                             FIXED STAGE SIZE    VARIABLE STAGE SIZE
16 BIT   AREA (GATE COUNT)   240                 318
         DELAY               33.221 nsec         31.988 nsec
         POWER               23.95 mW            24.60 mW
         ENERGY              795.65              797.15
32 BIT   AREA (GATE COUNT)   623                 678
         DELAY               40.48 nsec          39.478 nsec
         POWER               24.72 mW            24.74 mW
         ENERGY              1000.66             976.68

PROPOSED METHOD
                             FIXED STAGE SIZE    VARIABLE STAGE SIZE
16 BIT   AREA (GATE COUNT)   276                 300
         DELAY               23.882 nsec         22.456 nsec
         POWER               22.46 mW            22.52 mW
         ENERGY              536.38              505.70
32 BIT   AREA (GATE COUNT)   600                 660
         DELAY               38.682 nsec         30.927 nsec
         POWER               23.31 mW            23.37 mW
         ENERGY              901.67              722.76

Table III: Comparison of area, delay, power and energy of multipliers

EXISTING SYSTEM
MULTIPLIERS     AREA (GATE COUNT)   DELAY (nsec)   POWER (mW)   ENERGY
BAUGH-WOOLEY    1342                51.72          23.72        1226.79
WALLACE TREE    5230                7.327          27.44        201.05

PROPOSED SYSTEM
MULTIPLIERS     AREA (GATE COUNT)   DELAY (nsec)   POWER (mW)   ENERGY
BAUGH-WOOLEY    1164                48.60          21.34        1037.12
WALLACE TREE    4940                5.347          25.32        135.38

Fig. 3: Structure of BAUGH-WOOLEY Multiplier

A = a3 a2 a1 a0
B = b3 b2 b1 b0

                               1         ~(a3b0)   a2b0   a1b0   a0b0
                               ~(a3b1)   a2b1      a1b1   a0b1
                     ~(a3b2)   a2b2      a1b2      a0b2
1         ~(a3b3)    a2b3      a1b3      a0b3
p7        p6         p5        p4        p3        p2     p1     p0

(~(x) denotes the complement of x.)
|Author:||Arun, Sekar R.; Naveen, Balaji G.; Gautami, A.; Sivasankari, B.|
|Publication:||Advances in Natural and Applied Sciences|
|Date:||Sep 15, 2016|