Performance & analysis of high efficient CAM using parity bit and gated circuits.

INTRODUCTION

Digital VLSI circuits are predominantly CMOS based. Standard blocks such as latches and gates are implemented differently from the forms students have seen so far, but their behavior remains the same. Several factors shape the design. Large, complicated circuits running at very high frequencies face one major problem: the delay in propagating signals through gates and wires. The clock period is so short that, as these delays add up, they can become comparable to it. High operating frequencies also increase power consumption. This has a two-fold effect: devices drain batteries faster, and heat dissipation increases while the surface area available to dissipate it shrinks. Heat poses a major threat to the stability of the circuit itself. Laying out circuit components is a task common to all branches of electronics; what is special in VLSI is that there are many possible ways to do it, and multiple materials can coexist on the same silicon.

Power dissipation and speed in a circuit present a trade-off: optimizing one affects the other. The choice between the two is determined by how the circuit components are laid out on the silicon. Layout can also affect the fabrication of VLSI chips, making it either easy or difficult to implement the components on the silicon. The key advantage of VHDL when used for systems design is that it allows the behavior of the required system to be described (modeled) and verified (simulated) before synthesis tools translate the design. Another benefit is that VHDL allows the description of a concurrent system (many parts, each with its own sub-behavior, working together at the same time). VHDL is a dataflow language, unlike procedural computing languages such as BASIC, C, and assembly code, which all run sequentially, one instruction at a time. A final point is that when a VHDL model is translated into the "gates and wires" that are mapped onto a programmable logic device such as a CPLD or FPGA, it is the actual hardware that is configured, rather than the VHDL code being executed as if on some form of processor chip.

Literature Review:

Dong Hyuk Woo and Hsien-Hsin S [1] proposed a heterogeneous DRAM chip that improves both performance and energy efficiency. The proposed method has a novel floor plan and several architectural techniques that fully exploit the benefits of 3-D stacking technology. By tightly integrating a small row cache with its corresponding DRAM array, performance can be improved by 30% while dynamic energy is reduced by 31%. Nitin Mohan et al. [2] presented a 20-kilobit TCAM featuring two match-line sense amplifiers (MLSAs) with positive-feedback techniques. The proposed circuits were fabricated on a test chip in 0.18-µm CMOS technology. Energy measurements of the two MLSAs show reductions of 56% and 48%, respectively, over the conventional current-race MLSA. Po-Tsang Huang and Wei Hwang [3] proposed a TCAM design that uses two power-gating techniques, namely super cut-off power gating and multi-mode data-retention power gating, to reduce the increasing leakage power in advanced technologies. An energy-efficient 256 × 144 TCAM macro was implemented in UMC 65 nm CMOS technology, and the experimental results demonstrate a leakage power reduction of 19.3% and an energy metric of 0.165 fJ/bit/search for the TCAM macro.

Shuhong Gao and Todd Mateer [4] proposed a new additive fast Fourier transform based on Taylor expansion over finite fields. The algorithm works with a cyclic multiplicative group of order n. Its advantages are simple multiplication and reduced power consumption. Yi Min Lin et al. [5] proposed a CMOS design supporting 21 modes in the DVB-S2 system that achieves a 300 MHz operating frequency with a gate count of 32.4k. Its advantage is a reduced number of electronic components, which lowers space complexity. Yue Zhang [6] presented a design of NOR-type CAM based on domain wall (DW) motion in PMA magnetic tracks. The CMOS switching and sensing circuits are globally shared to optimize the cell area down to 6 F²/bit; the complementary dual track allows local sensing and faster data search while keeping power low. The average power consumed is reported as 30%.

Anto Bennet et al. [7] presented a novel, low-energy content addressable memory (CAM) structure that achieves an approximately four-fold improvement in energy per access. It exploits the address patterns commonly found in application programs, where testing the four least significant bits of the tag is sufficient to determine over 90% of the tag mismatches; the proposed CAM checks those bits first and evaluates the remainder of the tag only if they match. The energy savings come at the cost of a 25% increase in search time; the proposed CAM organization also supports a parallel operating mode without a speed loss but with reduced energy savings. Anto Bennet et al. [8] presented a novel VLSI architecture for a fully parallel pre-computation-based content-addressable memory (PB-CAM) with low-power, low-cost, and low-voltage features. This design is based on a pre-computation approach that not only saves power in the CAM system but also reduces the transistor count and operating voltage of the CAM cell. With a CAM size of 128 words by 30 bits, the measurement results indicate that the proposed circuit works up to 100 MHz with a power consumption of 33 mW at a 3.3-V supply voltage and works up to 30 MHz under a 1.5-V supply voltage.

Experimental Section:

New techniques are proposed to reduce power consumption and increase speed. These goals are achieved by the gated-power ML technique and the parity bit, respectively. Since all stored words in the CAM are compared in parallel, the result can be obtained in a single clock cycle. Experimental results show that the proposed techniques can effectively reduce power consumption in network routers and other applications. The ability of the conventional designs to work at low supply voltage was evaluated by re-implementing them in 65-nm technology. They demonstrate poor adaptability to voltage scaling and cannot be operated at a supply voltage lower than 0.9 V. The proposed design uses 45-nm CMOS technology. The smaller feature size allows more transistors on the chip, which can increase performance, and it also reduces chip area; this is the benefit of integration in VLSI. Power consumption is also reduced compared with the conventional CAM. The proposed design is implemented using the Tanner tool, which turns design ideas into schematics and simulates large circuits quickly and with a high degree of accuracy. It provides fast, accurate, and precise simulation options to balance accuracy and performance, and it links syntax errors to the SPICE deck with a double click.

Block Diagram of the High-Efficiency CAM:

[FIGURE 1 OMITTED]

Input Data:

The input data given to the CAM is the content whose storage location is to be identified. Input data can also be stored in the CAM cells, which are formed at the intersections of rows and columns.

Search Word Register:

A compare operation begins by loading an n-bit input search word into the search data register. The search data are then broadcast into the memory banks through n pairs of complementary search lines (SL) and directly compared with every bit of the stored words using comparison circuits. Each stored word has a match line (ML) that is shared between its bits to convey the comparison result.
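The compare operation can be modelled behaviourally as follows. This is only a sketch: the function name `search` and the integer encoding of words are assumptions made for illustration, and the loop visits rows one after another, whereas the hardware evaluates every row at the same time.

```python
def search(stored_words, search_word, n_bits):
    """Return one match-line flag per stored word (True means match)."""
    mask = (1 << n_bits) - 1
    match_lines = []
    for word in stored_words:
        # XOR marks every bit position where the stored bit differs
        # from the broadcast search bit.
        mismatches = (word ^ search_word) & mask
        # The shared ML reports a match only if no cell sees a mismatch.
        match_lines.append(mismatches == 0)
    return match_lines


stored = [0b1010, 0b0110, 0b1111, 0b0110]
print(search(stored, 0b0110, n_bits=4))  # [False, True, False, True]
```

Rows 1 and 3 hold the searched word, so only their match lines report a match.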

Parity Bit:

The parity-bit-based CAM design consists of the original data segment and an extra one-bit segment derived from the actual data bits. Only the parity bit is computed, i.e., whether the number of "1"s is odd or even, as shown in fig 2. The parity bit is attached directly to the corresponding word and ML. Thus the new architecture has the same interface as the conventional CAM, with one extra bit. During the search operation there is only a single stage, as in a conventional CAM. Hence, the parity bit does not improve power performance; its purpose is to increase search speed.
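A minimal sketch of the parity computation is given below. The helper names and the choice to append the parity bit as the least significant bit are assumptions for illustration; the paper does not specify the bit position. The property shown is purely arithmetic: two words that differ in exactly one data bit must have different parity, so with the parity bit appended they differ in two positions. How this affects sensing speed depends on the circuit and is not modelled here.

```python
def parity_bit(word):
    """Even-parity bit: 1 if the word holds an odd number of 1s, else 0."""
    return bin(word).count("1") % 2


def with_parity(word):
    """Append the parity bit as an extra least significant bit (assumed position)."""
    return (word << 1) | parity_bit(word)


def mismatch_count(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


stored, searched = 0b1011, 0b1010  # differ in exactly one data bit
print(mismatch_count(stored, searched))                            # 1
print(mismatch_count(with_parity(stored), with_parity(searched)))  # 2
```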

[FIGURE 2 OMITTED]

Match Line:

The match-line (ML) sensing scheme distinguishes a match from a miss by first shunting every ML with a fixed negative resistance, then exciting the MLs with an initial charge, and subsequently observing their voltage development. The voltage on a matched ML grows to VDD, as in an unstable system, whereas the voltage on a missed ML decays to zero, as in a stable system. Since the initial excitation charge on the MLs can be as low as the noise level in the system, this scheme can approach the minimum possible energy consumption for match-line sensing.

Match-Line Energy Consumption:

Match-line energy in a CAM is due to the repeated pre-charging and discharging of all but one of the match lines in each access. This results from the "parallel" (or NOR-type) implementation of the match operation. "Serial" (or NAND-type) CAM designs search one bit at a time (for each row), so they do not discharge a single large capacitance when there is no match. They are generally slower than parallel CAMs, as their search speed depends on the number of cells in a row. The two approaches can be combined in a mixed serial-parallel matching method: after a small number of bits are searched serially, the remainder is evaluated in parallel only for those lines that could still match.
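The mixed serial-parallel method can be sketched behaviourally as a two-stage comparison. The function name, the use of the least significant bits for the first stage, and the counter of second-stage evaluations are assumptions for illustration; the counter is only a rough proxy for how many match lines would be activated, not an energy model.

```python
def mixed_search(stored_words, search_word, n_bits, k_serial):
    """Return (match flags, number of rows that reached the parallel stage)."""
    low_mask = (1 << k_serial) - 1
    full_mask = (1 << n_bits) - 1
    flags, parallel_evals = [], 0
    for word in stored_words:
        # Stage 1: compare only the k least significant bits.
        if (word ^ search_word) & low_mask:
            flags.append(False)  # early miss: remaining bits are not evaluated
            continue
        # Stage 2: evaluate the full word for rows that could still match.
        parallel_evals += 1
        flags.append(((word ^ search_word) & full_mask) == 0)
    return flags, parallel_evals


stored = [0b10100110, 0b10100111, 0b00010110, 0b11110000]
print(mixed_search(stored, 0b10100110, n_bits=8, k_serial=4))
# ([True, False, False, False], 2)
```

In this example two of the four rows are rejected by the first stage, so only two rows reach the second stage.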

Gated-Power ML Sense Amplifier Design:

The CAM cells are organized into rows (words) and columns (bits). Each cell has the same number of transistors as the conventional P-type NOR CAM cell and uses a similar ML structure. However, the comparison unit, i.e., transistors M1, M4, and the "SRAM" unit are operated independently. The comparison unit's VDD rail is independently controlled by a power transistor Px, as shown in fig 3, and a feedback loop that can automatically turn off the current to save power. The purpose of having two separate power rails is to completely isolate the SRAM cell from any power disturbance during the compare cycle.

[FIGURE 3 OMITTED]

Priority Encoder:

A priority encoder is a circuit or algorithm that compresses multiple binary inputs into a smaller number of outputs. Its output is the binary representation of the index, counting from zero, of the most significant active input bit. Priority encoders are often used to control interrupt requests by acting on the highest-priority request.

If two or more inputs are asserted at the same time, the input with the highest priority takes precedence. An example of a single-bit 4-to-2 encoder is shown in fig 4, where the highest-priority inputs are to the left and "x" indicates an irrelevant value, i.e. any input value there yields the same output since it is superseded by a higher-priority input. The output V indicates whether the input is valid.
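The behaviour of such an encoder can be sketched in software. Since fig 4 is not reproduced here, the conventions below are assumptions: inputs are given as a list whose index is the input number, the highest index has the highest priority, and an all-zero input returns index 0 with the valid flag cleared.

```python
def priority_encode(inputs):
    """Encode a sequence of 0/1 flags into (index, valid).

    Index 0 is the lowest-priority input; the highest asserted index wins.
    """
    for index in range(len(inputs) - 1, -1, -1):
        if inputs[index]:
            return index, True
    return 0, False  # no input asserted: output index is meaningless


print(priority_encode([0, 1, 0, 1]))  # (3, True): input 3 overrides input 1
print(priority_encode([1, 1, 0, 0]))  # (1, True)
print(priority_encode([0, 0, 0, 0]))  # (0, False)
```

For a 4-input encoder the returned index fits in two bits, which corresponds to the 4-to-2 case described above.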

[FIGURE 4 OMITTED]

Output Data:

The output data is obtained from the priority encoder, which selects the stored entry that best matches the input given to the CAM cells. Mismatched entries are ignored by the priority encoder. When no exact match exists, the entry that most nearly matches the input can sometimes be accessed instead.

Search Line:

In the routing process for VLSI and PCB design, the "gridless router" executes routing without using a grid. The improved line search algorithm is one such gridless router: it extends the line search algorithm so that the route search is executed in a linear memory space in polynomial time, and it guarantees finding the route with the minimum number of bends. However, the algorithm has the drawback that the route is obtained by iterating a search over two-dimensional figures. An algorithm for the improved line search using associative memory has been presented whose time and space complexity is O(n). That algorithm was implemented on an associative processor, and the result was compared with an implementation on a general-purpose computer.

Search-Line Energy Consumption:

Traditional CAM cells combine the search lines with the bit lines. This increases the capacitance of each search line, as an extra transistor drain per cell is present (though it could be shared in the cell layout). Even after separating search and bit lines, driving the search lines accounts for almost half of the energy in search operations. Apart from having a relatively high capacitance, in parallel CAMs one search line per bit switches at every search, even when the same value is searched each time. This is because the search lines must be driven low while the match lines are being pre-charged, to avoid a direct short circuit from supply to ground through the cells that do not match. Serial CAMs, on the other hand, form chains of transistors that propagate a value when all the cells match; evaluation is coordinated by pre-charging the intermediate nodes, so the search lines do not have to be pre-charged for every search.

Power Gating Transistors:

Power gating is a technique used in integrated circuit design to reduce power consumption by shutting off the current to blocks of the circuit that are not in use, which reduces stand-by (leakage) power. Power gating affects design architecture more than clock gating does. It increases time delays, as power-gated modes have to be safely entered and exited. Architectural trade-offs exist between the amount of leakage power saved in low-power modes and the energy dissipated in entering and exiting them.
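One common way to reason about this trade-off is a break-even idle time: gating a block pays off only if the leakage energy saved while it is off exceeds the energy spent entering and exiting the gated mode. The sketch below illustrates the arithmetic; the function names and the example figures are hypothetical and are not taken from the paper.

```python
def break_even_time(leakage_power_saved_w, transition_energy_j):
    """Idle time in seconds beyond which power gating saves net energy."""
    if leakage_power_saved_w <= 0:
        raise ValueError("leakage power saved must be positive")
    return transition_energy_j / leakage_power_saved_w


def net_saving(idle_time_s, leakage_power_saved_w, transition_energy_j):
    """Net energy saved in joules (negative means gating cost more than it saved)."""
    return leakage_power_saved_w * idle_time_s - transition_energy_j


# Hypothetical block: 2 uW of leakage saved while gated, 4 nJ to enter and exit.
t_be = break_even_time(2e-6, 4e-9)
print(f"break-even idle time: {t_be * 1e3:.1f} ms")
print(f"net saving over 5 ms idle: {net_saving(5e-3, 2e-6, 4e-9) * 1e9:.1f} nJ")
```

Idle periods shorter than the break-even time make gating counterproductive, which is why the entry and exit overhead has to be considered at the architectural level.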

Dynamic Power Consumption:

Because the power-gated transistor is turned off after the output is obtained at the sense amplifier, the proposed technique achieves a lower average power consumption. This is mainly due to the reduced voltage swing on the ML bus. Another contributing factor is that the new design does not need to pre-charge the ML buses: the EN signal turns off transistor Px of each row, and hence the SL buses do not need to be pre-charged, which in turn saves power on the SL buses.

Existing Method:
Table 1: Voltage and power consumption of the existing method

TECHNOLOGY    INPUT VOLTAGE    NUMBER OF TRANSISTORS    POWER CONSUMED

90 nm         1.8 V            572                      1.77 × 10^-2
65 nm         1 V              572                      4.43 × 10^-3


Proposed Method:
Table 2: Voltage and power consumption of the proposed method

TECHNOLOGY    INPUT VOLTAGE    NUMBER OF TRANSISTORS    POWER CONSUMED

45 nm         0.8 V            845                      4.43 × 10^-6


The existing method uses 65 nm and 90 nm technology; its voltage and power values are shown in table 1. The proposed method uses 45 nm technology, whose power consumption is lower, as shown in table 2. Its input voltage is also lower than that of the other technologies.
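The size of the reduction can be read off the two tables by taking ratios. The short sketch below does this arithmetic; the dictionary layout and function name are illustrative assumptions, and because the tables do not state a unit for the power figures, only the unitless ratios are computed.

```python
# Values copied from table 1 and table 2; power units are not stated in the tables.
table = {
    "90nm": {"voltage_v": 1.8, "transistors": 572, "power": 1.77e-2},
    "65nm": {"voltage_v": 1.0, "transistors": 572, "power": 4.43e-3},
    "45nm": {"voltage_v": 0.8, "transistors": 845, "power": 4.43e-6},
}


def power_ratio(baseline, proposed="45nm"):
    """How many times more power the baseline entry consumes than the proposed one."""
    return table[baseline]["power"] / table[proposed]["power"]


for node in ("90nm", "65nm"):
    print(f"{node} / 45nm power ratio: {power_ratio(node):.0f}")
```

On the tabulated figures, the 65 nm entry consumes about 1000 times, and the 90 nm entry roughly 4000 times, the power of the 45 nm entry.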

RESULTS AND DISCUSSIONS

Conventional memory consists of memory cells that return the stored data when the memory address is provided. Data can therefore be found only if its address is known, which increases the complexity of the memory bank. Using conventional memory for search also increases delay, and the delay in turn raises the power consumption of the memory banks. Content Addressable Memory (CAM) was introduced to overcome these complexities. CAM cells are used in network routers and data compressors, as CAMs are faster than other hardware and software search systems. CAM cells are grouped together to form a CAM array. When the data is provided, the memory address at which it is stored can be obtained. The contents at various memory locations can thus be updated simultaneously. CAM therefore reduces delay and, with it, power consumption.

In 65 nm technology, each CAM cell is built with 10 transistors, whereas in 90 nm technology each cell is built with 11 transistors. The CAM cells are arranged in matrix form, with m rows and n columns. On average, the CAM array consists of 784 transistors.

Block Diagram Of 65nm Technology:

The CAM arrays are powered by voltage sources, as shown in fig 5. Each row of the CAM array has its own source. Each CAM cell in the array consists of 10 transistors. Finally, the outputs of all CAM arrays are fed to the encoder.

[FIGURE 5 OMITTED]

[FIGURE 6 OMITTED]

Block Diagram Of 45nm Technology:

Each CAM cell in the array consists of 9 transistors, which reduces power consumption compared with the 65 nm technology. The enable signal is used to control each CAM array independently, as shown in fig 6.

Conclusion:

An effective gated-power technique and a parity-bit-based architecture are presented that offer several major advantages, namely reduced peak current (and thus IR drop), reduced average power consumption, boosted search speed, and improved process variation tolerance. The design is much more stable than recently published designs while maintaining their low power consumption. Compared with the conventional design, its stability is degraded only at extremely low supply voltages; at a 1 V operating condition, both designs are equally stable with no sensing errors. The design is implemented in 45-nm CMOS technology and consumes power as low as 4.43 × 10^-6. This paper thus achieves both low power consumption and high speed. The proposed design has a smaller pull-up current due to the gated-power transistor Px, and hence sensing errors sometimes occur. In future work, errors can be reduced by a feedback loop structure in which decisions are made at the very beginning of the sensing cycle.

REFERENCES

[1.] Dong Hyuk Woo and Nak Hee Seong, 2013. 'Pragmatic Integration of an SRAM Row Cache in Heterogeneous 3-D DRAM Architecture Using TSV', 21(1): 678-682.

[2.] Nitin Mohan and Wilson Fung, 2009. 'Functional Implementation Techniques for CPU Cache Memories', 48(2): 65-71.

[3.] Po-Tsang Huang and Wei Hwang, 2011. 'A 65 nm 0.165 fJ/Bit/Search 256 × 144 TCAM Macro Design for IPv6 Lookup Tables', 46(2): 2431-2438.

[4.] Shuhong Gao & Todd Mateer, 2010. 'Additive fast Fourier transforms over finite fields,' 12(13): 1-11.

[5.] Yi Min Lin, Chih Lung Chen and Chen Yi Lee, 2010. 'A 26.k 314 mb/s soft (32400,32208) BCH decoder chip for DVB S2 system,' 45(11): 2330-2340.

[6.] Yue Zhang and Weisheng Zhao, 2012. Ultra-High Density Content Addressable Memory Based on Current Induced Domain Wall Motion in Magnetic Track, 48(11): 34-39.

[7.] Dr. AntoBennet, M., M. Manimaraboopathy, P. Maragathavalli, T.R. Dinesh Kumar, 2014. Low Complexity Multiplier For Gf(2m) Based All One Polynomial, Middle-East Journal of Scientific Research, 21(11): 2064-2071.

[8.] Dr. AntoBennet, M., 2015. Power Optimization Techniques for sequential elements using pulse triggered flipflops, International Journal of Computer & Modern Technology, 1(1): 29-40.

[9.] Dr. AntoBennet, M., G. Sankar Babu, R. Suresh, S. Mohammed Sulaiman, M. Sheriff, G. Janakiraman, S. Natarajan, 2015. Design & Testing of Tcam Faults Using TH Algorithm, Middle-East Journal of Scientific Research, 23(08): 1921-1929.

(1) Dr. M. Anto Bennet, (2) R. Suresh, (3) Vivin Christopher, (4) Kelwin Inasu, (5) S. Vignesh

(1) Professor, (2) Assistant Professor, Department of Electronics and Communication Engineering, (3,4,5) UG STUDENTS, Department of Electronics and Communication Engineering, VELTECH, Avadi-Chennai-600062, Tamilnadu, India.

Received 25 January 2016; Accepted 18 April 2016; Available 28 April 2016

Address For Correspondence:

Dr. M. Anto Bennet, Professor, Assistant Professor, Department of Electronics and Communication Engineering, E-mail: bennetmab@gmail.com
COPYRIGHT 2016 American-Eurasian Network for Scientific Information
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2016 Gale, Cengage Learning. All rights reserved.

Article Details
Title Annotation:Content Addressable Memory
Author:Bennet, M. Anto; Suresh, R.; Christopher, Vivin; Inasu, Kelwin; Vignesh, S.
Publication:Advances in Natural and Applied Sciences
Article Type:Report
Date:Apr 1, 2016
Words:3298