Implementation of low-power and area-efficient carry save adder.


Most Very Large Scale Integration (VLSI) applications, such as digital signal processing and microprocessors, rely on arithmetic operations. Addition, subtraction and multiplication are the most widely used. The full adder is the main building block of these operation modules, and a 1-bit carry save adder is the same circuit as a full adder. Such an adder can be implemented using the pass-transistor technique. Among the available logic styles, pass-transistor logic is found to enhance circuit performance.

Nowadays, integrated circuits (ICs) are fabricated with a large number of very small components. In modern VLSI design, a circuit may integrate millions or billions of transistors on a single chip. Such a high level of integration leads to increased power consumption and circuit area, and increased power consumption reduces battery life. NMOS transistors are mostly used in VLSI design [1]. Dynamic logic (e.g. [R]-n network, [PHI]-p network, or domino logic) requires fewer transistors, thus leading to a more area-efficient design [2]. However, the outputs of dynamic circuits are not always available: these circuits alternate between a pre-charge phase and an evaluation phase, and the outputs are valid only in the evaluation phase. Dynamic circuits are also affected by charge leakage. A combination of NMOS and PMOS networks is called a CMOS circuit. When the input is "0", the nMOS transistor is turned off and the pMOS transistor pre-charges the output to VDD, so the load capacitance is charged from the supply (VDD). When the input is "1", the pMOS transistor is turned off and the nMOS transistor evaluates the output to VSS.

The load capacitance is then discharged to ground (VSS). Pass-transistor logic (PTL) offers a minimum transistor count, smaller circuit area and lower energy consumption [3]. It does not suffer from the charge leakage problem of dynamic logic, and its outputs remain valid at all times, as in static CMOS. Thus pass-transistor logic can be a good choice for low-power VLSI design. It has been widely used in low-power VLSI design, nano-electronics, optical computing and quantum computing. The number of transistors integrated into a single chip continues to grow.

II. Review Of Full Adder Design:

A. Full Adder:


A full adder adds three bits, the third being the carry produced by a previous addition. It is a combinational circuit that takes three 1-bit inputs (A, B and C_in) and generates a 1-bit sum (S) and a 1-bit carry (C_out). As Fig. 1 shows, a 1-bit carry save adder block is the same circuit as a full adder.
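The behaviour of the full adder can be illustrated with a minimal Python sketch. The function name `full_adder` and the argument name `c_in` are illustrative choices, not taken from the paper; the Boolean equations are the standard ones for a full adder (sum as a three-input XOR, carry as a majority function).

```python
from itertools import product


def full_adder(a: int, b: int, c_in: int) -> tuple[int, int]:
    """Return (sum, carry_out) for three 1-bit inputs."""
    s = a ^ b ^ c_in                            # sum: XOR of the three inputs
    c_out = (a & b) | (b & c_in) | (a & c_in)   # carry: majority of the inputs
    return s, c_out


def truth_table() -> list[tuple[int, int, int, int, int]]:
    """List every input combination together with its outputs."""
    rows = []
    for a, b, c_in in product((0, 1), repeat=3):
        s, c_out = full_adder(a, b, c_in)
        rows.append((a, b, c_in, s, c_out))
    return rows
```

For every row of the truth table, the two outputs read as a 2-bit number (`2 * c_out + s`) equal the count of ones among the inputs.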

B. Basic operation of Pass transistor:

A pass transistor is a fast-switching device whose conduction is controlled by its gate input. It acts as a switch and can be either NMOS or PMOS. NMOS switches faster than PMOS, so NMOS transistors are mostly used in arithmetic circuits; PMOS transistors increase the delay and are therefore mostly avoided in arithmetic operations. Pass transistors reduce the transistor count needed to build different logic gates by eliminating redundant transistors. A set of pass transistors is controlled by the gate signals (V_1, V_2, ... V_n): the inputs are applied at Pi and the outputs are combined to produce the logic function (F_n). An NMOS pass transistor passes a strong "0" and a PMOS pass transistor passes a strong "1". Because the pass-transistor circuit minimizes the transistor count, the total chip area is reduced, and power consumption and delay are reduced as well. In this paper, a 3-bit carry save adder (CSA) is designed using a parallel prefix adder circuit. The carry save adder is used in both addition and multiplication, and addition with a carry save adder saves time and logic. The final result is the addition of three inputs, controlled by the gate inputs, which produces SUM and CARRY.

III. Carry Save Adder:

A. Carry save Adder:

In a carry save adder (CSA), three or more n-bit binary numbers are added in parallel at a time.

A carry save adder reduces the addition of three inputs to two outputs, the sum and the carry-out. During the addition the carry can be saved in a register: instead of being propagated within the present stage, it is stored and passed on to the next stage. Hence the carry-out delay is reduced in this scheme. Fig. 2 shows the circuit diagram of the carry save adder, which is similar to a full adder: it has three inputs (A_i, B_i and C_i) and two outputs, the sum (S) and the carry-out (C_o).
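The 3-to-2 reduction can be sketched in Python by applying the full-adder equations to every bit position at once. The function name `csa_3to2` is hypothetical, and Python integers stand in for n-bit hardware words; this is an illustration of the principle, not the authors' circuit.

```python
def csa_3to2(a: int, b: int, c: int) -> tuple[int, int]:
    """Reduce three non-negative operands to a sum word and a carry word.

    No carry travels between bit positions: each position is an
    independent full adder, so the delay does not grow with word length.
    """
    sum_word = a ^ b ^ c                       # per-bit sum
    carry_word = (a & b) | (b & c) | (a & c)   # per-bit carry, not yet shifted
    return sum_word, carry_word


# Example with three 4-bit operands: 11 + 6 + 13 = 30
s, c = csa_3to2(0b1011, 0b0110, 0b1101)
# The carry word has weight 2, so s + 2 * c reproduces the total.
total = s + 2 * c
```

Here `s` is `0b0000`, `c` is `0b1111`, and `total` is 30. The carry word still has to be shifted one position left and added to the sum word by a carry-propagating adder to obtain an ordinary binary result.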


If an adder is only required to add two numbers and produce a result, carry-save addition is useless, since the result still has to be converted back into binary and the carry still has to propagate from right to left. But in large-integer arithmetic, a single stand-alone addition is rare; adders are mostly used to accumulate partial sums in a multiplication.
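To illustrate why carry-save form pays off when partial sums are accumulated, the following Python sketch multiplies two non-negative integers by add-and-shift, keeping the running total as a (sum, carry) pair and performing only one carry-propagating addition at the end. The names `csa` and `multiply_carry_save` are hypothetical, and Python's `+` stands in for the final adder.

```python
def csa(a: int, b: int, c: int) -> tuple[int, int]:
    """3-to-2 carry-save step; the carry word is returned already shifted."""
    s = a ^ b ^ c
    carry = ((a & b) | (b & c) | (a & c)) << 1
    return s, carry


def multiply_carry_save(x: int, y: int) -> int:
    """Multiply non-negative integers, accumulating partial products
    in carry-save form."""
    acc_sum, acc_carry = 0, 0
    shift = 0
    while y >> shift:
        if (y >> shift) & 1:
            # Add the shifted multiplicand without propagating carries.
            acc_sum, acc_carry = csa(acc_sum, acc_carry, x << shift)
        shift += 1
    # The only carry-propagating addition in the whole multiplication.
    return acc_sum + acc_carry
```

For example, `multiply_carry_save(13, 11)` returns 143. Each loop iteration costs one full-adder delay regardless of word length, which is the saving the text refers to.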

B. Structure of Carry save Adder:

The carry save adder structure has a longest (critical) path through its stages, and this path dominates the delay of the addition. To reduce the delay, the critical path is divided into two short latency paths. The carry, one of the inputs to the carry save adder, is stored in a register. The concepts of variable latency adders, adaptive clock stretching and supply voltage scaling in an N-bit CSA may be explained using Fig. 3.


C. Structure of Hybrid Variable Latency Carry Save Adder:

The structure of the hybrid variable latency CSA is shown in Fig. 4. Addition is performed on three inputs (A_i, B_i and C_i). The three inputs are summed, and the carry is stored in the present stage and passed on to the next stage. The signal P8:1 is used in the skip logic to determine whether the carry output of the previous stage (i.e., CO,p-1) should be skipped or not. This signal also serves as the predictor signal in the variable latency adder. All of these operations are performed in parallel with the other stages. When P8:1 is one, CO,p-1 should skip this stage, which predicts that some critical paths are activated.
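The skip decision can be sketched in Python under the usual assumption that the group signal is the AND of the per-bit propagate signals (a_i XOR b_i) of one stage. The function names, the two-operand stage and the `width` parameter are hypothetical; the paper does not give the exact equations, so this only illustrates the general skip principle.

```python
def group_propagate(a: int, b: int, width: int) -> int:
    """Return 1 if every bit position of the stage propagates a carry."""
    mask = (1 << width) - 1
    return int(((a ^ b) & mask) == mask)


def stage_carry_out(a: int, b: int, c_in: int, width: int) -> int:
    """Carry out of one stage, taking the skip path when it applies."""
    if group_propagate(a, b, width):
        # Every position propagates, so the incoming carry is
        # forwarded directly instead of rippling through the stage.
        return c_in
    mask = (1 << width) - 1
    # Otherwise compute the carry inside the stage (modelled here with
    # integer addition rather than a gate-level ripple chain).
    return ((a & mask) + (b & mask) + c_in) >> width
```

With an 8-bit stage, `stage_carry_out(0b10101010, 0b01010101, 1, 8)` takes the skip path and returns 1, because every bit position propagates.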

The incrementation blocks are used to determine the sizes of the stages in the hybrid variable latency carry save adder structure. The parallel prefix adder structure is used because it is efficient relative to its size. The nucleus stage reduces the number of stages and gives smaller delays for the short latency paths SLP1 and SLP2, so the overall delay is reduced. The simulation results are reported in Section IV. Compared with the existing method, the proposed method offers higher speed, lower power and smaller area.

D. Computation flow of Carry save Adder:

The carry save adder has three inputs (A, B, C) and two outputs (Sum and Carry-out). Several binary inputs are added in the carry save adder, which operates in a number of stages. The outputs of the carry save adder are then added in the parallel prefix adder, which produces SUM and CARRY. Carry save multiplication likewise performs add-and-shift operations. Finally, the sum and carry are combined to compute the resulting output. Fig. 5 shows the computation flow of the carry save adder.



The final addition is then computed as:

1. Shifting the carry sequence one position to the left.

2. Placing a 0 at the front (MSB) of the partial sum sequence S.

3. Finally, adding these two sequences together, here with the parallel prefix adder, to compute the resulting sum.
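The three steps can be traced on bit strings with a short Python sketch. The function name `final_addition` and the MSB-first string representation are illustrative assumptions, and Python's integer addition stands in for the carry-propagating adder of the last step.

```python
def final_addition(sum_bits: str, carry_bits: str) -> str:
    """Combine the n-bit sum and carry sequences (MSB first) produced by
    a carry-save stage into an ordinary (n + 2)-bit binary result."""
    if len(sum_bits) != len(carry_bits):
        raise ValueError("sum and carry sequences must have equal length")
    shifted_carry = carry_bits + "0"   # step 1: shift the carry left
    padded_sum = "0" + sum_bits        # step 2: place a 0 at the MSB
    # Step 3: one carry-propagating addition of the two sequences.
    total = int(shifted_carry, 2) + int(padded_sum, 2)
    # Three n-bit operands need at most n + 2 result bits.
    return format(total, "b").zfill(len(sum_bits) + 2)
```

For the 4-bit operands 1011, 0110 and 1101, the carry-save stage yields the sum sequence `0000` and the carry sequence `1111`; `final_addition("0000", "1111")` then returns `011110`, i.e. 30.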


Fig. 6 shows the computation of the parallel prefix adder. Three n-bit binary numbers are added. During the addition, the longest delay arises in the carry-out path; the parallel prefix adder reduces this delay. The longest latency path is separated into short latency path 1 and short latency path 2, which reduces the delay of the addition. In the carry save multiplier, the multiplication can also be divided into short latency paths, but there the delay increases, and so do the power and area.
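As an illustration of how a parallel prefix adder shortens the carry path, the sketch below uses the Kogge-Stone arrangement. The paper does not say which prefix network it uses, so the choice of Kogge-Stone, the function name `kogge_stone_add` and the `width` parameter are all assumptions. Generate and propagate signals are merged over spans that double at each level, so the carries are available after about log2(width) levels instead of width ripple steps.

```python
def kogge_stone_add(a: int, b: int, width: int) -> int:
    """Add two width-bit numbers with a Kogge-Stone prefix network.

    Returns a (width + 1)-bit result that includes the carry-out.
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    g = a & b            # generate: position i creates a carry
    p = a ^ b            # propagate: position i passes a carry on
    half_sum = p         # kept for the final sum bits
    dist = 1
    while dist < width:
        # Merge each span with the span `dist` positions below it.
        g = g | (p & (g << dist))
        p = p & (p << dist)
        dist *= 2
    carries = g << 1     # carry into position i + 1 is the group generate of bits i..0
    return (half_sum ^ carries) & ((1 << (width + 1)) - 1)
```

For example, `kogge_stone_add(0b1011, 0b0110, 4)` returns 17 after two prefix levels.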

IV. Simulation Results:

The simulation results show that the carry save adder approximately achieves the expected delay and efficiency. They also suggest the CSA structure as a very good adder for applications where both speed and energy consumption are critical. The design exploits a parallel prefix adder structure at the middle stage to increase the slack time, which provides the opportunity to lower the energy consumption by reducing the supply voltage.

The advantage of this method is the reduced delay in the critical path. The parallel prefix adder is placed between the stages, where it increases the slack time of the adder. The combination of the carry save adder with the parallel prefix adder is intended to achieve minimum delay. The simulations were performed using Xilinx ISE 12.1. The simulation output of the carry save adder with parallel prefix adder is shown in Fig. 7.


The parametric analysis of the carry save adder is performed for parallel operation. Compared with other adders such as the full adder, ripple carry adder and carry look-ahead adder, the carry save adder reduces the delay and increases the efficiency. The addition can be performed using the same memory. The synthesis summary of the carry save adder with parallel prefix adder is shown in Table 1.

The combination of the carry save adder with the parallel prefix adder has advantages over the other adders. The carry save adder reduces the delay, as shown in Fig. 8, and the same register can be reused as memory. The efficiency of the carry save adder is increased, as shown in Fig. 10. The results are obtained from the synthesis output of the Xilinx simulations, which show that the parallel prefix adder improves the delay of the carry save adder. With further modifications to the structure, the adder could be made more efficient still.



An FPGA is a device that contains configurable logic blocks, input/output blocks and programmable interconnect. When an FPGA is configured, the internal circuitry is connected in a way that creates a hardware implementation of the software application. Fig. 11 shows a photographic view of the Xilinx Spartan 3E FPGA module. Data is transferred through a port or an RS-232 cable. The board provides eight switches, SW0-SW3 and SW4-SW7; the inputs are assigned to the hardware by setting these switches. The outputs are shown on the 4-digit 7-segment display.

VI. Result & Conclusion:

In this paper, a Hybrid Variable Latency Carry Save Adder structure was proposed that achieves higher speed and lower energy consumption. The delay reduction was achieved by modifying the structure with the parallel prefix adder and incrementation blocks. These techniques are applied within the carry save adder operation to improve its overall efficiency. In addition, a hybrid variable latency carry save multiplier structure was proposed; for multiplication, add-and-shift incurs more delay and area. The carry save adder was implemented in Xilinx ISE, and a Xilinx Spartan 3 FPGA module was used to implement the Hybrid Variable Latency Carry Save Adder. The delay of the carry save adder was reduced by up to 1.27 ns. The results also suggest the CSA structure as a very good adder with respect to both delay and energy consumption.


The author would particularly like to thank the director, K. Saravana Kumar, and R. Dhayabarani for suggesting the idea of low-power VLSI design. Many thanks also go to the departments of ECE and EEE, V.S.B. Engineering College, Karur.


[1] Bahadori, M., 2015. "High-speed and energy-efficient carry skip adder operating under a wide range of supply voltage levels," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., pp: 99.

[2] Ramkumar, B. and H.M. Kittur, 2012. "Low-power and area-efficient carry save adder," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., 20(2): 371-375.

[3] Chang, C.-H., J. Gu and M. Zhang, 2005. "A review of 0.18 µm full adder performances for tree structured arithmetic circuits," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., 13(6): 686-695.

[4] Nagendra, C., M.J. Irwin and R.M. Owens, 1996. "Area-time-power tradeoffs in parallel adders," IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., 43(10): 689-702.

[5] Markovic, D., C.C. Wang, L.P. Alarcon, T.-T. Liu and J.M. Rabaey, 2010. "Ultralow-power design in near-threshold region," Proc. IEEE, 98(2): 237-252.

[6] Suzuki, H., W. Jeong and K. Roy, 2003. "Low power adder with adaptive supply voltage," in Proc. 21st Int. Conf. Comput. Design, pp: 103-106.

[7] Kantabutra, V., 1993. "Accelerated two-level carry-skip adders: A type of very fast adders," IEEE Trans. Comput., 42(11): 1389-1393.

[8] Alioto, M. and G. Palumbo, 2003. "A simple strategy for optimized design of one-level carry-skip adders," IEEE Trans. Circuits Syst. I, Fundam. Theory Appl., 50(1): 141-148.

[9] Vratonjic, M., B.R. Zeydel and V.G. Oklobdzija, 2005. "Low- and ultra low-power arithmetic units: Design and comparison," in Proc. IEEE Int. Conf. Comput. Design, VLSI Comput. Process. (ICCD), pp: 249-252.

[10] Dreslinski, R.G., M. Wieckowski, D. Blaauw, D. Sylvester and T. Mudge, 2010. "Near-threshold computing: Reclaiming Moore's law through energy efficient integrated circuits," Proc. IEEE, 98(2): 253-266.

[11] Zlatanovici, R., S. Kao and B. Nikolic, 2009. "Energy-delay optimization of 64-bit carry-lookahead adders with a 240 ps 90 nm CMOS design example," IEEE J. Solid-State Circuits, 44(2): 569-583.

[12] He, Y. and C.-H. Chang, 2008. "A power-delay efficient hybrid carry-lookahead/carry-save based redundant binary to two's complement converter," IEEE Trans. Circuits Syst. I, Reg. Papers, 55(1): 336-346.

(1) K. Boopathi, (2) R. Dhayabarani, (3) S. Raja

(1) P.G. Scholar, (2) Associate Professor and (3) Assistant Professor, (1,2,3) Department of Electronics and Communication Engineering, V.S.B. Engineering College, Karur-639 111, India.

Received 25 January 2016; Accepted 18 April 2016; Available 28 April 2016

Address For Correspondence:

K. Boopathi, Department of Electronics and Communication Engineering, V.S.B. Engineering College, Karur-639 111, India. E-mail:

Table 1: Device utilization summary of carry save adder with PPA
Device utilization summary (estimated values)

Logic Utilization        Used    Available    Utilization

Number of Slices           47          960            4%
Number of 4 input LUTs     81         1920            4%
Number of bonded IOBs      65           66           98%

Figure 8. Comparison of delay across adders.

Carry Save Adder            6.236
Full Adder                  7.506
Ripple Carry Adder          9.926
Carry look-ahead adder     11.048

Note: Table made from bar graph.

Figure 10. Comparison of memory across adders.

Carry Save Adder            83.3%
Full Adder                  78.6%
Ripple Carry Adder          73.6%
Carry look-ahead adder      72.5%

Note: Table made from bar graph.
COPYRIGHT 2016 American-Eurasian Network for Scientific Information

Article Details
Author: Boopathi, K.; Dhayabarani, R.; Raja, S.
Publication: Advances in Natural and Applied Sciences
Article Type: Report
Date: Apr 1, 2016