An Energy-Efficient Mechanism for Cache Memory Using the ETA Method
In present-day computer systems the cache memory is an important component because it reduces the effective memory access time seen by the processor. Processor performance is related to the cache size, so increasing the cache size improves processor performance. Several techniques can be used to reduce the power dissipation of cache memory structures. The Early Tag Access (ETA) method is one of the important techniques used to improve speed and power consumption; it is designed to eliminate unnecessary data movement in the system. The main sources of power consumption lie at the application and program level, the operating-system level and the architecture level. The majority of architecture-level power consumption is due to the memory subsystem, in particular cache memory activity. Cache power consumption can be reduced by lowering the miss rate, the miss penalty, the latency per access and the power consumption per access. By improving the cache-hit rate, the system can reduce both the effective cache access time and the overall power consumption.
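The relation between hit rate, access time and energy can be sketched with the standard average-memory-access-time model; the numbers below are illustrative, not measurements from the paper.

```python
# Sketch of the standard AMAT and per-access energy models.
# All parameter values are illustrative assumptions.

def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time in cycles."""
    return hit_time + miss_rate * miss_penalty

def avg_energy(hit_energy, miss_rate, miss_energy):
    """Average energy per cache access (arbitrary units)."""
    return hit_energy + miss_rate * miss_energy

# Halving the miss rate lowers both latency and energy per access.
print(amat(1, 0.10, 20))  # 3.0 cycles
print(amat(1, 0.05, 20))  # 2.0 cycles
```

This makes concrete why the text targets miss rate, miss penalty and per-access energy as the levers for reducing cache power.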
Cache Memory Design:
The cache memory receives the requested address from the microprocessor, and the controller then searches the tag cache for that address. The cache memory design is shown in Fig. 1 and Fig. 2. Two outcomes are possible: a cache hit and a cache miss. If the address is present, the data is supplied from the corresponding location in the data cache (a cache hit); otherwise the access is a cache miss. The mapping function determines which cache lines a memory block may occupy. There are three types of mapping function: direct mapping maps each memory block to a single cache line; set-associative mapping maps each memory block to a set of cache lines; and fully associative mapping allows any cache line to hold any memory block.
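The hit/miss decision above can be illustrated with a minimal direct-mapped lookup. This is a hedged sketch assuming a 32-bit address and illustrative sizes (16-byte lines, 256 sets), not the paper's exact design.

```python
# Minimal direct-mapped cache lookup: the address is split into
# tag, index and byte-offset fields, and the tag stored at the
# indexed line decides hit vs. miss.
LINE = 16          # bytes per cache line (assumed)
SETS = 256         # number of cache lines (assumed)
OFFSET_BITS = 4    # log2(LINE)
INDEX_BITS = 8     # log2(SETS)

def split(addr):
    offset = addr & (LINE - 1)
    index = (addr >> OFFSET_BITS) & (SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset

cache = [None] * SETS  # each entry holds the tag of the resident line

def access(addr):
    tag, index, _ = split(addr)
    if cache[index] == tag:
        return "hit"
    cache[index] = tag  # on a miss, fetch the line and record its tag
    return "miss"

print(access(0x1234))  # miss (cold cache)
print(access(0x1238))  # hit (same 16-byte line)
```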
The cache controller consists of the tag cache, a general multiplexer (GMux), an LRU controller unit and a finite state machine (FSM). The FSM controls the read and write data signals for both the cache and main memory. When a set-associative organization is selected, it indicates which cache set holds the requested address by sending signals to the GMux and the LRU controller unit, whereas when direct mapping is selected only a single cache line needs to be checked.
Each data cache line has an entry in the tag cache consisting of 19 tag bits, 1 valid bit, 1 dirty bit and 7 LRU (Least Recently Used) bits (shown in Fig. 3). The tag bits hold the tag field of the address being accessed, supplied by the cache controller unit; the valid bit indicates whether the cache line is valid; and the dirty bit is set when the cache line is written without updating the corresponding main-memory line. When the machine restarts, all valid and dirty bits are reset. The general multiplexer (GMux) connects the input and output data buses to and from the cache sets according to the signal sent by the FSM. When a set-associative organization is selected, the FSM sends the read and write data signals for multiple sets and the LRU controller unit indicates which way is least recently used.
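A minimal software model of one tag-cache entry, with the field widths taken from the text (19 tag bits, 1 valid bit, 1 dirty bit, 7 LRU bits), might look as follows; the field names are illustrative, not taken from the VHDL design.

```python
# Software model of a tag-cache entry; widths follow the text,
# names are hypothetical.
from dataclasses import dataclass

@dataclass
class TagEntry:
    tag: int = 0        # 19-bit tag of the cached line
    valid: bool = False # line holds valid data
    dirty: bool = False # line written but not yet copied to memory
    lru: int = 0        # LRU state (7 bits in the design)

def reset(entries):
    """On restart, all valid and dirty bits are cleared."""
    for e in entries:
        e.valid = e.dirty = False

ways = [TagEntry() for _ in range(2)]
ways[0] = TagEntry(tag=0x1A2B3, valid=True, dirty=True)
reset(ways)
print(ways[0].valid, ways[0].dirty)  # False False
```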
Early Tag Access:
Early Tag Access is used to determine the destination way of a memory instruction before the data access begins. The cache hierarchy consists of an L1 and an L2 cache. The tag information of L2 is sent to the L1 cache when data is loaded from the L2 cache, so energy efficiency improves without performance degradation. A prediction technique is used to complete the operation: the tag and data arrays of the predicted way are checked first. If the prediction is correct, the operation completes immediately; a misprediction leads to a performance penalty, since the remaining ways must then be searched for the data. A related method, the way-halting cache, filters accesses to the data array through a tag-matching process. In some cases these techniques lengthen the critical path.
In this paper we propose a new cache technique, the ETA cache, to improve the energy efficiency of L1 data caches. The cache controller receives the requested address and compares its index and tag fields with the tag cache contents to check whether the requested data is already cached. If the data is present, no main-memory access is needed and the data is delivered directly to the processor; otherwise the cache fetches it from main memory using the ETA method.
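The ETA flow described above can be sketched for a hypothetical 2-way set-associative L1: the tag cache is probed early and, on a match, only the predicted way's tag and data arrays are enabled. This is a simplified illustration, not the paper's VHDL implementation.

```python
# Sketch of early tag access (ETA) way prediction for a 2-way L1.
# On a tag-cache match, only one way is enabled, saving the energy
# of reading the other way; otherwise all ways are accessed.
WAYS = 2
tag_cache = [{} for _ in range(WAYS)]  # per-way: {index: tag}

def early_tag_access(index, tag):
    """Return the predicted way, or None if no way matches."""
    for way in range(WAYS):
        if tag_cache[way].get(index) == tag:
            return way
    return None

def l1_access(index, tag):
    way = early_tag_access(index, tag)
    if way is not None:
        return f"enable way {way} only"  # energy saved
    # Prediction failed: fall back to accessing all ways in parallel.
    tag_cache[0][index] = tag            # fill on miss (simplified)
    return "enable all ways"

print(l1_access(5, 0x7))  # enable all ways (cold)
print(l1_access(5, 0x7))  # enable way 0 only
```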
The block diagram (Fig. 4) shows the address path of the design, in which an adder such as a ripple-carry or carry-lookahead adder may be used. In the uncorrected part of the design, the controller unit and the carry-enabled addition unit estimate the carry propagated from one bit to the next.
VHDL components of a pipelined MIPS processor were designed and combined with the VHDL components of this design using Xilinx ISE Design Suite 12.1; all of these components are connected together to compose the top level. Table 1 shows how the processor manages the cache sets: each set is controlled by the cache controller depending on the cache size and the placement of cache lines.
For replacement, ETA is combined with the least recently used (LRU) algorithm, which replaces the cache line that has not been accessed for the longest period.
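A minimal software sketch of LRU replacement for one cache set follows; an ordered dictionary stands in for the hardware LRU bits described earlier.

```python
# LRU replacement for one cache set: on a miss with a full set,
# the least recently accessed line is evicted.
from collections import OrderedDict

class LRUSet:
    def __init__(self, ways):
        self.ways = ways
        self.lines = OrderedDict()  # tag -> data, oldest first

    def access(self, tag):
        if tag in self.lines:
            self.lines.move_to_end(tag)      # mark most recently used
            return "hit"
        if len(self.lines) >= self.ways:
            self.lines.popitem(last=False)   # evict LRU line
        self.lines[tag] = None               # install the new line
        return "miss"

s = LRUSet(ways=2)
print([s.access(t) for t in [1, 2, 1, 3, 2]])
# ['miss', 'miss', 'hit', 'miss', 'miss'] -- tag 2 was evicted by 3
```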
Figs. 5 and 6 show the simulation of the design when a 64 Kbyte, 2-way set-associative cache is chosen and the processor requests a read from a location (re = 1) while a miss occurs (hit0 = hit1 = 0) with dirty0 = dirty1 = 0 and LRU = 1. The cache controller stalls the processor for 8 clock cycles while it stores the previous data line that has the same index value and reads the requested address from main memory into way 1 of the cache.
Table 2(a)-(d) lists the thermal properties, the supply power and related quantities, measured with the XPower Analyzer of the Xilinx software; they represent the power usage of the processor design. From these tabulations we conclude that the power consumption of the system is reduced by about 40%.
Cache memory has a large impact on overall system performance. The way to improve cache memory performance is to improve the cache hit rate, which can be achieved with the ETA replacement method. This paper has given a survey of cache replacement strategies for reducing system power. Software-assisted cache replacement strategies achieve lower cache miss rates than hardware strategies, but they require cooperation among an intelligent compiler, the operating system and the cache controller. A VHDL design of a resizable cache memory has been implemented in this paper.
References:
[1] Mostafa Abd-El-Barr and Hesham El-Rewini, 2005. Fundamentals of Computer Organization and Architecture. New Jersey, USA: John Wiley & Sons.
[2] Sivarama P. Dandamudi, 2002. Fundamentals of Computer Organization and Design. 2nd ed. New York, USA: Springer.
[3] Albonesi, D.H., 2002. "Selective cache ways: on-demand cache resource allocation". Journal of Instruction Level Parallelism.
[4] Kim, H., A.K. Somani and A. Tyagi, 2001. "A Reconfigurable Multifunction Computing Cache Architecture". IEEE Transactions on VLSI, 9(4): 509-523.
[5] Ranganathan, P., S. Adve and N.P. Jouppi, 2002. "Reconfigurable caches and their application to media processing". ACM SIGARCH Computer Architecture News, 28(2): 214-224.
[6] Omran, S.S. and I.A. Amory, 2016. "Design and Implementation of Resizable Cache Memory using FPGA". The Second International Engineering Conference (IEC), Ishik University, Erbil, Iraq.
[7] Omran, S.S. and I.A. Amory, 2016. "Implementation of Reconfigurable Cache Memory using FPGA". First Engineering Conference for Graduate Research, Technical College, Baghdad, Iraq.
[8] Jesung Kim and Soonate Kim, 2015. "Exploiting same tag bits to improve the reliability of the cache memories". IEEE Transactions on VLSI Systems, 23(2).
[9] Jianwei Dai, Menglong Guan and Lei Wang, 2014. "Exploiting Early Tag Access for Reducing L1 Data Cache Energy in Embedded Processors". IEEE Transactions on VLSI Systems, 22(2).
[10] Nicolaescu, D., A. Veidenbaum and A. Nicolau, 2003. "Reducing power consumption for high-associativity data caches in embedded processors". In Proc. Design, Automation and Test in Europe Conference and Exhibition, pp. 1064-1068.
[11] Zhang, C., F. Vahid, J. Yang and W. Najjar, 2004. "A way-halting cache for low-energy high-performance systems". In Proc. Int. Symp. on Low Power Electronics and Design, pp. 126-131.
[12] Ishihara, T. and F. Fallah, 2005. "A way memoization technique for reducing power consumption of caches in application specific integrated processors". In Proc. Design, Automation and Test in Europe Conference, pp. 358-363.
(1) Manikandan P, (2) Mangayarkarasi P, (3) Hemadevi N
(1,2,3) A.C. College of Engineering and Technology, Karaikudi-630 003.
Received 28 February 2017; Accepted 29 April 2017; Available online 2 May 2017
Address For Correspondence:
Manikandan P, A.C. College of Engineering and Technology, Karaikudi-630 003.
Caption: Fig. 1: Cache Memory Design
Caption: Fig. 2: Cache design
Caption: Fig. 3: Block of Bit address storage system
Caption: Fig. 4: ETA block diagram
Caption: Fig. 5: Simulation results for 64k byte data cache
Caption: Fig. 6: Simulation results for 64K byte cached data
Table 1: Truth Table of the Cache Controller Unit

Cache size  Mapping function        No. of index bits  No. of tag bits
512K        Direct mapping          15                 13
            2-way set associative   14                 14
            4-way set associative   13                 15
            8-way set associative   12                 16
256K        Direct mapping          14                 14
            2-way set associative   13                 15
            4-way set associative   12                 16
            8-way set associative   11                 17
128K        Direct mapping          13                 15
            2-way set associative   12                 16
            4-way set associative   11                 17
            8-way set associative   10                 18
64K         Direct mapping          12                 16
            2-way set associative   11                 17
            4-way set associative   10                 18
            8-way set associative   9                  19

Table 2: Thermal Properties and Supply Power

(a) Thermal properties: Effective thermal resistance 37.0 C/W; Max ambient 83.1 C; Junction temperature 26.9 C

(b) Supply summary:
Source   Voltage (V)  Total current (A)  Dynamic current (A)  Quiescent current (A)
Vccint   1.200        0.015              0.000                0.015
Vccaux   2.500        0.012              0.000                0.012
Vcco25   2.500        0.002              0.000                0.002

(c) Supply power (W): Total 0.052; Dynamic 0.000; Quiescent 0.052

(d) On-chip power:
Item     Power (W)  Used   Available  Utilization (%)
Clocks   0.000      2      --         --
Logic    0.000      1402   4896       28.6
Signals  0.000      3475   --         --
IOs      0.000      24     158        15.2
Leakage  0.052      --     --         --
Total    0.052      --     --         --
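The index/tag widths in Table 1 are consistent with a 32-bit address and a 16-byte cache line; the line size is an inference from the table, not stated explicitly in the text. A small check:

```python
# Recompute Table 1's index/tag widths from cache size and
# associativity, assuming a 32-bit address and 16-byte lines.
from math import log2

ADDR_BITS, LINE_BYTES = 32, 16  # assumptions inferred from Table 1

def bit_split(cache_bytes, ways):
    sets = cache_bytes // (LINE_BYTES * ways)
    offset = int(log2(LINE_BYTES))
    index = int(log2(sets))
    tag = ADDR_BITS - index - offset
    return index, tag

print(bit_split(64 * 1024, 1))   # (12, 16) -- 64K direct-mapped row
print(bit_split(512 * 1024, 1))  # (15, 13) -- 512K direct-mapped row
```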
Publication: Advances in Natural and Applied Sciences
Date: May 1, 2017