
FPGA implementation of area efficient and reduced delay 64x64 Vedic Multiplier.


Multiplication is one of the most important fundamental arithmetic operations. Computation-Intensive Arithmetic Functions (CIAF) such as Multiply and Accumulate (MAC) and inner product are widely used in Digital Signal Processing (DSP) applications such as convolution, Fast Fourier Transform (FFT) and filtering. Because multiplication dominates the execution time of most DSP algorithms, a high-speed multiplier is needed; the multiplication time is the dominant factor in determining the instruction cycle time of a DSP chip.

Vedic Mathematics is the name given to the ancient Indian system of mathematics that was rediscovered in the early twentieth century. It is based on sixteen principles, termed sutras. A simple digital multiplier architecture, the Vedic Multiplier (VM), based on the Urdhva Tiryakbhyam (Vertically and Crosswise) sutra (the 3rd sutra of Vedic Mathematics) with a 64-bit Ripple Carry Adder (RCA) has been discussed by Sree Hare Priya, V.T. et al. (2014). To improve the speed of that VM, we propose two adders with reduced delay; the main aim of the proposed VM is to reduce the area and logic delay of the VM discussed by Sree Hare Priya, V.T. et al. (2014). Sweta Khatri and Ghanshyam Jangid (2014) proposed a 64-bit fast VM using a barrel shifter based on the Nikhilam sutra (the 2nd sutra of Vedic Mathematics). The performance of the proposed VM shows that Urdhva Tiryakbhyam is better suited than the Nikhilam sutra to digital multipliers. Kandimalla Rajaneesh and M. Bharathi (2014) proposed a 64-bit Multiply and Accumulate (MAC) unit with a 64-bit VM using a 64-bit RCA; our proposed VM shows an area improvement over their multiplier. According to Pranali Thakre et al. (2014), when Urdhva Tiryakbhyam is used, the carry is added to the product term each time by a 64-bit RCA, so the propagation delay is minimized, which reduces the overall delay of the VM.

Vedic multiplication:

Vedic mathematics is mainly based on 16 sutras. The VM is based on the 3rd sutra, Urdhva Tiryakbhyam (Vertically and Crosswise). The basic block of the VM is the 2x2 multiplier, which is shown in Fig 1.


A. Urdhva Tiryakbhyam for 2x2:

* Consider two 2-bit binary numbers A1A0 and B1B0, as shown in Fig 1. The result of this 2x2-bit multiplication is 4 bits, namely C2, S2, S1 and S0, as shown in Fig 2.

* The least significant bit (LSB) A0 of the multiplicand is multiplied vertically by the LSB B0 of the multiplier; their product S0 is the LSB of the result.

* Then A1 and B0, and A0 and B1, are multiplied crosswise and the two products are added to give sum1 (S1) and carry1 (C1). The sum bit S1 is the second bit of the result.

* Then A1 and B1 are multiplied vertically and the product is added to the previous carry (C1) to give sum2 (S2) and carry2 (C2). The sum bit S2 is the third bit of the result.

* Finally, carry2 (C2) is the most significant bit of the result (S3).
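The steps above can be sketched in software as follows. This is a minimal behavioural model, not the authors' Verilog; the function names `half_adder` and `vedic_2x2` are illustrative, and each multiplication of two bits is modelled as a bitwise AND.

```python
def half_adder(a, b):
    """Return (sum, carry) for two input bits."""
    return a ^ b, a & b


def vedic_2x2(a, b):
    """Multiply two 2-bit unsigned integers (0..3) using the
    vertical-and-crosswise steps; returns the 4-bit product."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1

    s0 = a0 & b0                            # vertical: A0 x B0
    s1, c1 = half_adder(a1 & b0, a0 & b1)   # crosswise: A1 x B0 + A0 x B1
    s2, c2 = half_adder(a1 & b1, c1)        # vertical: A1 x B1 + C1

    # Result bits, MSB first: C2 S2 S1 S0
    return (c2 << 3) | (s2 << 2) | (s1 << 1) | s0
```

Under this model the block needs four AND gates and two half adders; checking all 16 input pairs against ordinary integer multiplication is a quick way to confirm the wiring.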


B. Urdhva Tiryakbhyam for 4x4:

In Urdhva Tiryakbhyam, the partial products of a 4-digit multiplication and their sums are obtained in parallel, as shown in Fig 3.

The algorithm for a 4x4 VM using Urdhva Tiryakbhyam for two binary numbers is shown below.

            a3 a2 a1 a0   Multiplicand
            b3 b2 b1 b0   Multiplier
r7 r6 r5 r4 r3 r2 r1 r0

r0 = (a0 x b0)
r1 = (a1 x b0) + (a0 x b1)
r2 = (a2 x b0) + (a0 x b2) + (a1 x b1)
r3 = (a3 x b0) + (b3 x a0) + (a2 x b1) + (a1 x b2)
r4 = (a3 x b1) + (a1 x b3) + (a2 x b2)
r5 = (a3 x b2) + (a2 x b3)
r6 = (a3 x b3)
r7 = carry propagated from the previous bits
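As an illustration of how these column expressions combine into a product, the sketch below forms each column sum r0 to r6 and then propagates carries from the least significant column upward. It is a behavioural model under the assumption that each column's carry is simply added into the next column; the name `vedic_4x4` is hypothetical and the hardware in Fig 3 may organise the additions differently.

```python
def vedic_4x4(a, b):
    """Multiply two 4-bit unsigned integers via vertical-and-crosswise
    column sums followed by carry propagation."""
    abit = [(a >> i) & 1 for i in range(4)]
    bbit = [(b >> i) & 1 for i in range(4)]

    # Column k collects every product a_i x b_j with i + j == k (r0..r6).
    columns = [
        sum(abit[i] & bbit[k - i] for i in range(4) if 0 <= k - i < 4)
        for k in range(7)
    ]

    result, carry = 0, 0
    for k, col in enumerate(columns):
        total = col + carry
        result |= (total & 1) << k   # bit k of the product
        carry = total >> 1           # carried into column k + 1

    # r7 is whatever carry remains after column 6.
    return result | (carry << 7)
```

All seven column sums depend only on the input bits, which is why they can be generated in parallel; only the carry chain is sequential.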

Conventional 64x64 VM and its adders:

The architecture of the conventional 64x64 VM is shown in Fig 4. To design the 64-bit multiplier, the basic 2x2-bit multiplier block (Fig 2) is designed first; a 4x4 block is then built from 2x2 blocks, an 8x8 block from 4x4 blocks, a 16x16 block from 8x8 blocks, a 32x32 block from 16x16 blocks and, finally, the 64x64-bit multiplier from 32x32 blocks. A 64x64-bit Vedic Multiplier therefore requires four 32x32 VMs and three 64-bit Ripple Carry Adders (RCAs).
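The hierarchy described above is naturally recursive, and a short sketch may make that clearer. The function `vedic_multiply` below is a hypothetical behavioural model: it splits each operand in half, calls itself four times, and bottoms out at a gate-level 2x2 block. As a simplification, the three adders at each level are replaced by ordinary integer addition, and which cross product is labelled pp1 or pp2 is an assumption.

```python
def vedic_multiply(a, b, n=64):
    """Multiply two n-bit unsigned integers (n a power of two, n >= 2)
    by recursive decomposition into four (n/2 x n/2) multipliers."""
    if n == 2:
        # Base 2x2 block: AND gates plus two half adders.
        a0, a1, b0, b1 = a & 1, a >> 1, b & 1, b >> 1
        s0 = a0 & b0
        x, y = a1 & b0, a0 & b1
        s1, c1 = x ^ y, x & y
        z = a1 & b1
        s2, c2 = z ^ c1, z & c1
        return (c2 << 3) | (s2 << 2) | (s1 << 1) | s0

    h = n // 2
    mask = (1 << h) - 1
    al, ah = a & mask, a >> h
    bl, bh = b & mask, b >> h

    pp0 = vedic_multiply(al, bl, h)   # low  x low
    pp1 = vedic_multiply(ah, bl, h)   # high x low
    pp2 = vedic_multiply(al, bh, h)   # low  x high
    pp3 = vedic_multiply(ah, bh, h)   # high x high

    # Stand-in for the three n-bit adders that merge the partial products.
    return pp0 + ((pp1 + pp2) << h) + (pp3 << n)
```

For n = 64 the recursion reaches the 2x2 base case 4^5 = 1024 times, which matches the five doubling steps from 2x2 up to 64x64.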


A. 64-Bit Ripple Carry Adder (RCA):

The architecture of the 64-bit RCA is shown in Fig 5. It adds two 64-bit binary numbers and produces a 65-bit binary number as the result. It consists of 1 Half Adder (HA) and 63 Full Adders (FAs).
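A bit-level sketch of such an adder is given below, assuming the standard half-adder and full-adder equations. The names `half_adder`, `full_adder` and `ripple_carry_add` are illustrative and not taken from the authors' design.

```python
def half_adder(a, b):
    """Return (sum, carry) for two input bits."""
    return a ^ b, a & b


def full_adder(a, b, cin):
    """Return (sum, carry_out) for two input bits and a carry-in."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(x, y, n=64):
    """Add two n-bit unsigned integers; the result has n + 1 bits."""
    # Bit 0 has no carry-in, so a half adder is enough.
    s, carry = half_adder(x & 1, y & 1)
    result = s
    # Bits 1..n-1 each take the carry from the previous stage (n - 1 FAs).
    for i in range(1, n):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    # The final carry-out becomes bit n of the result.
    return result | (carry << n)
```

The loop makes the delay structure visible: each stage waits on the carry of the one before it, so the worst-case path grows linearly with n.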


B. Urdhva Tiryakbhyam Algorithm for 64X64 VM:

* Consider two 64-bit binary numbers. The result of this 64x64-bit multiplication is 128 bits.

* Divide each 64-bit number into two 32-bit halves: the upper 32 bits (MSB half) and the lower 32 bits (LSB half).

* Using the 32x32 VMs, find the partial products as shown in Fig 4. Partial products pp0, pp1, pp2 and pp3 are the 64-bit outputs of the 32x32 VM1, VM2, VM3 and VM4 respectively.

* RCA1 adds pp1 and pp2 and produces s1 [64:0].

* The lower 32 bits of pp0, i.e. pp0 [31:0], are passed directly to the lower 32 bits of the resultant product, i.e. p [31:0]. This leaves the upper 32 bits, pp0 [63:32].

* To add the 64 bits of RCA1, i.e. s1 [63:0], to the remaining 32 bits of VM1 using RCA2, pp0 [63:32] is extended to 64 bits with 32 leading zeros.

* RCA2 adds these two 64-bit values and produces s2 [64:0].

* The lower 32 bits of RCA2, i.e. s2 [31:0], are passed to the next 32 bits of the resultant product, i.e. p [63:32]. This leaves s2 [63:32].

* The MSBs of RCA1 and RCA2, i.e. s1 [64] and s2 [64], are fed to an OR gate, and the output is placed ahead of s2 [63:32] as its new MSB, giving 33 bits.

* To add the 64-bit pp3 to these 33 bits, 31 leading zeros are added, giving 64 bits.

* RCA3 adds these 64 bits to the 64-bit pp3 and produces s3 [64:0].

* The 64 bits s3 [63:0], i.e. all but the MSB, are passed to the upper 64 bits of the resultant product, i.e. p [127:64]. The final product thus has 128 bits.
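The dataflow in this list can be checked with a short behavioural sketch. The assumptions are stated plainly: Python's built-in multiplication stands in for the four 32x32 VM blocks, integer addition stands in for the RCAs, and the last adder is taken to add pp3 (the VM4 output), which is what the arithmetic requires. The function name `vm64_conventional` is hypothetical.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def vm64_conventional(a, b):
    """Combine four 32x32 partial products into a 128-bit product
    following the RCA1 / RCA2 / OR gate / RCA3 steps."""
    al, ah = a & MASK32, (a >> 32) & MASK32
    bl, bh = b & MASK32, (b >> 32) & MASK32

    # Stand-ins for VM1..VM4 (each output is 64 bits wide).
    pp0, pp1, pp2, pp3 = al * bl, ah * bl, al * bh, ah * bh

    s1 = pp1 + pp2                          # RCA1: s1[64:0]
    p_low = pp0 & MASK32                    # p[31:0] = pp0[31:0]

    s2 = (s1 & MASK64) + (pp0 >> 32)        # RCA2: s1[63:0] + pp0[63:32]
    p_mid = s2 & MASK32                     # p[63:32] = s2[31:0]

    carry = (s1 >> 64) | (s2 >> 64)         # OR of s1[64] and s2[64]
    upper = (carry << 32) | ((s2 >> 32) & MASK32)   # 33-bit operand

    s3 = pp3 + upper                        # RCA3: s3[64:0]
    p_high = s3 & MASK64                    # p[127:64] = s3[63:0]

    return (p_high << 64) | (p_mid << 32) | p_low
```

The OR gate is sufficient because s1[64] and s2[64] cannot both be 1: the sum of s1 and pp0[63:32] is always below 2^65, so a carry out of RCA1 rules out a carry out of RCA2.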

Proposed 64x64 VM and its adders:

A. Adder 1 for 64X64 VM:

Adder 1 of the proposed NxN VM is the same N-bit RCA that is used in the conventional VM. The 64-bit RCA, whose architecture is shown in Fig 5, uses 63 Full Adders (FAs) and one Half Adder (HA).

B. Proposed Adder 2 for 64X64 VM:

In the conventional 64x64 VM, 32 zeros are appended to the 32 bits of pp0 so that the two operands can be added by RCA2. Adder 2 is proposed in place of RCA2 to avoid appending these 32 zeros. The architecture of the proposed Adder 2 is shown in Fig 6. The conventional 64-bit RCA uses 63 FAs and one HA, as shown in Fig 5, whereas the proposed Adder 2 uses 31 FAs and 33 HAs.



C. Proposed Adder 3 for 64X64 VM:

In the conventional 64x64 VM, 31 zeros are appended to the 33 bits from RCA2 so that the two operands can be added by RCA3. Adder 3 is proposed in place of RCA3 to avoid appending these 31 zeros. The architecture of the proposed Adder 3 is shown in Fig 7. The 64-bit RCA consists of 63 FAs and 1 HA, whereas the proposed Adder 3 has 32 FAs and 32 HAs.
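The idea behind Adder 2 and Adder 3 can be illustrated with one generic sketch: wherever the narrower operand has no bit, the stage only has to add one operand bit and the incoming carry, so a half adder suffices. The function `add_wide_narrow` and its counters are hypothetical and serve only to reproduce the FA/HA counts quoted above; the actual structures are those in Fig 6 and Fig 7.

```python
def half_adder(a, b):
    """Return (sum, carry) for two input bits."""
    return a ^ b, a & b


def full_adder(a, b, cin):
    """Return (sum, carry_out) for two input bits and a carry-in."""
    return a ^ b ^ cin, (a & b) | (cin & (a ^ b))


def add_wide_narrow(wide, narrow, narrow_bits, n=64):
    """Add an n-bit value and a narrower value without zero padding.

    Returns (sum, fa_count, ha_count); the sum has n + 1 bits.
    """
    # Bit 0: two operand bits, no carry-in.
    s, carry = half_adder(wide & 1, narrow & 1)
    result, fa, ha = s, 0, 1
    for i in range(1, n):
        w = (wide >> i) & 1
        if i < narrow_bits:
            # Both operands have a bit here: full adder.
            s, carry = full_adder(w, (narrow >> i) & 1, carry)
            fa += 1
        else:
            # Only the wide operand and the carry remain: half adder.
            s, carry = half_adder(w, carry)
            ha += 1
        result |= s << i
    return result | (carry << n), fa, ha
```

With `narrow_bits=32` the model uses 31 FAs and 33 HAs, as stated for Adder 2, and with `narrow_bits=33` it uses 32 FAs and 32 HAs, as stated for Adder 3.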

D. Proposed 64X64 VM:

The architecture of our proposed 64x64 VM is shown in Fig 8. The conventional 64x64 VM uses four 32x32 VMs and three 64-bit RCAs. Our proposed 64x64 VM uses one 64-bit RCA and our two proposed adders instead of three 64-bit RCAs. This reduces the adder area required to obtain the product terms from the partial products.



A. Area Evaluation and Comparison:

The area evaluation method treats all adder blocks as being made up of AND, OR and INVERTER (AOI) gates, each of which operates in parallel and has an area of one unit. The area is evaluated by counting the total number of AOI gates required for each adder block. On this basis, the areas of the VM blocks are evaluated and listed in Table 1.

Using Table 1, the areas of the conventional VM and the proposed VM have been calculated and are listed in Table 2.
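As a worked illustration of this counting method, the sketch below applies the HA and FA gate counts from Table 1 to the adder compositions given in the text. It covers only the three adders at the 64-bit level, not the whole multiplier, so its totals are not the figures in Table 2; the variable names are illustrative.

```python
# AOI gate counts per block, taken from Table 1.
AREA = {"HA": 6, "FA": 13}


def adder_area(fa_count, ha_count):
    """Area of an adder in AOI gate units."""
    return fa_count * AREA["FA"] + ha_count * AREA["HA"]


rca64_area = adder_area(63, 1)     # 64-bit RCA: 63 FAs + 1 HA
adder2_area = adder_area(31, 33)   # proposed Adder 2: 31 FAs + 33 HAs
adder3_area = adder_area(32, 32)   # proposed Adder 3: 32 FAs + 32 HAs

# Adder area at the top (64-bit) level only.
conventional_adders = 3 * rca64_area
proposed_adders = rca64_area + adder2_area + adder3_area

print(rca64_area, adder2_area, adder3_area)
print(conventional_adders, proposed_adders)
```

On these counts a 64-bit RCA costs 825 units, Adder 2 costs 601 and Adder 3 costs 608, so the top-level adder area falls from 2475 to 2034 units.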

B. Comparison of Logic Delay:

The proposed VM and the conventional VM are coded in Verilog, then synthesized and simulated using the ISE simulator. The maximum combinational path delay of both multipliers has been determined and is listed in Table 3.
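The percentage reductions reported in Tables 2 and 3 follow from the usual relative-difference formula. The sketch below recomputes them from the tabulated area and delay values; the helper name `percent_reduction` is an assumption for illustration.

```python
def percent_reduction(conventional, proposed):
    """Reduction of the proposed value relative to the conventional one, in %."""
    return 100.0 * (conventional - proposed) / conventional


# Area in AOI gate units (Table 2) and path delay in ns (Table 3).
area_reduction = percent_reduction(86939, 75438)
delay_reduction = percent_reduction(81.058, 64.603)

print(f"area reduction:  {area_reduction:.2f} %")
print(f"delay reduction: {delay_reduction:.1f} %")
```

Rounded to the precision used in the tables, these come to 13.23% for area and 20.3% for delay.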


In our design, efforts have been made to reduce the area and propagation delay. The proposed 64x64 VM achieves a 13.23% reduction in area and a 20.3% reduction in delay compared with the conventional 64x64 VM. The propagation delay, or gate delay, is the length of time from when the input to a logic gate becomes stable and valid to when the output of that logic gate becomes stable and valid. Reducing gate delays allows digital circuits to process data at a faster rate and improves overall performance. High-speed implementation of such a multiplier has a wide range of applications in image processing, arithmetic logic units and VLSI signal processing.


Article history:

Received 3 September 2014

Received in revised form 30 October 2014

Accepted 4 November 2014


Gianluca Cornetta and Jordi Cortadella, 2001. Asynchronous Multipliers with Variable-Delay Counters. IEEE Conference, pp: 701-705.

Jagadguru Swami, Sri Bharati Krisna and Tirthaji Maharaja, 1965. Vedic Mathematics or Sixteen Simple Mathematical Formulae from the Veda, Delhi. Motilal Banarsidas, Varanasi, India.

Jung-Yup Kang and Jean-Luc Gaudiot, 2006. A Simple High-Speed Multiplier Design. IEEE Transactions on Computers, 55: 1253-1258.

Kandimalla Rajaneesh and M. Bharathi, 2014. A Novel High Performance Implementation of 64 Bit MAC Units and Their Delay Comparison. International Journal of Engineering Research and Applications, 4: 122-127.

Morris Mano, M., 1993. Computer System Architecture. 3rd edition, Prentice-Hall, New Jersey, USA.

Poornima, M., Shivaraj Kumar Patil, Shivukumar, KP. Shridhar and H. Sanjay, 2013. Implementation of Multiplier using Vedic Algorithm. International Journal of Innovative Technology and Exploring Engineering, 2: 219-223.

Prabir Saha, Arindham Banerjee, Partha Battacharyya and Anup Dhandapat, 2011. High speed design of complex multiplier using Vedic mathematics. IEEE students' technology symposium, IIT Kharagpur, pp: 237-241.

Pranali Thakre, Dr. Sanjay Dorle and Prof. Vipin Bhure, 2014. Low Power 64bit Multiplier Design by Vedic Mathematics. International Journal of Application or Innovation in Engineering & Management, 3: 393-396.

Priya, R. and J. Senthil Kumar, 2013. Implementation and Comparison of Vedic Multiplier using Area Efficient CSLA Architectures. International Journal of Computer Applications, 73(10): 22-29.

Ramachandran, S. and S. Kirti Pande, 2012. Design, Implementation and Performance Analysis of an Integrated Vedic Multiplier Architecture. International Journal of Computational Engineering Research, 2(3): 697-703.

Ramalatha, M. Dayalan, KD. Dharani, P. Priya and S. Deoborah, 2009. High Speed Energy Efficient ALU Design using Vedic Multiplication Techniques. International Conference on Advances in Computational Tools for Engineering Applications, pp: 600-603.

Sree Hare Priya, V.T., S. Vidhya and P. Vennila, 2014. Implementation of 64x64 Vedic Multiplier using Urdhva Tiryakbhyam Technique. International Journal of Advanced Information Science and Technology, 27(17): 147-150.

Sweta Khatri and Ghanshyam Jangid, 2014. FPGA Implementation of 64-bit fast multiplier using barrel shifter. International Journal for Research in Applied Science and Engineering Technology, 2(VII): 344-348.

Tariquzzaman, Syed Rizwan Ali and Nahid Kausar, 2014. FPGA implementation of 64 bit RISC processor with Vedic multiplier using VHDL. IOSR Journal of Electrical and Electronics Engineering, pp: 12-16.

Paldurai K and Hariharan K

Thiagarajar College of Engineering, Department of ECE, Madurai-625015, Tamil Nadu, India

Corresponding Author: Paldurai, K., Embedded Systems Laboratory, Department of ECE, Thiagarajar College of Engineering, Madurai--625015, Tamil Nadu, India.

Tel: +91-8973277670, E-mail:
Table 1: Area count of VM blocks.

Blocks   Area count (AOI gates)

XOR       5
HA        6
FA       13

Table 2: Area comparison of conventional and proposed VM (area in AOI gates).

Multiplier   Proposed VM   Conventional VM   Reduced Area (%)

64 x 64         75438           86939             13.23

Table 3: Path delay comparison of conventional and proposed VM.

Multiplier   Proposed VM (ns)   Conventional VM (ns)   Reduced Delay (%)

64 x 64          64.603               81.058                20.3
COPYRIGHT 2014 American-Eurasian Network for Scientific Information
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2014 Gale, Cengage Learning. All rights reserved.

Article Details
Author: Paldurai, K.; Hariharan, K.
Publication: Advances in Natural and Applied Sciences
Article Type: Report
Geographic Code: 9INDI
Date: Nov 1, 2014