#### AN ADDRESS DECODER FOR VARIABILITY CHARACTERIZATION FOR 65nm MOS TRANSISTORS

Felipe Correa Werle<sup>1</sup>, Giovano da Rosa Camaratta<sup>1</sup>, Juan Pablo Martinez Brito<sup>2</sup>, Sergio Bampi<sup>1,2</sup> <sup>1</sup>GME, Microelectronics Group <sup>2</sup>PGMICRO, Graduate Program on Microelectronics UFRGS, Federal University of Rio Grande do Sul Porto Alegre, Brazil {fcwerle, grcamaratta, juan, bampi}@inf.ufrgs.br

### ABSTRACT

The increase of statistical variations in nanometer CMOS technologies poses a major challenge for digital and analog circuit design. This paper presents the design of an address decoder used in a variability test chip. The design is done in 65nm bulk CMOS technology and the final area is about $88\mu m \times 1.7\mu m$ using 4 metal levels.

# **1. INTRODUCTION**

Variation in CMOS devices has long been a concern in the design, manufacture and operation of integrated circuits (ICs). With the continued scaling of MOS technology, process variability has become a major issue for the performance and yield of ICs. Process variations can generally be divided into two groups. 1) Interdie variations [1] are characterized by differences between identical devices placed on different dies (die-to-die), wafers (wafer-to-wafer) and/or lots (lot-to-lot). These errors, usually predictable and systematic in nature, are related to process issues such as temperature gradients and/or photolithography errors. 2) Intradie variations [2], on the other hand, are characterized by differences between identical, closely placed devices on the same die. These errors are unpredictable and are caused by local random uncertainties in the fabrication process. Because they are related to the stochastic behavior of matter [3], they are considered one of the limits of MOS technology [4] and certainly a limitation to the continuation of Moore's law [5].

In this paper we introduce the design of a test chip for variability characterization in 65nm bulk CMOS technology. The scope of this work is the address decoder used to select each component within the die, so that only a small number of PADs is needed for component access. The region under test comprises a MOSFET matrix of identically designed transistors, which are activated one at a time using the row and column decoder signals. Note that the decoder circuit is separated from the transistor bias circuit, in which voltage drops and leakage currents must be avoided.

In section 2 we introduce the measurement challenges imposed by variability. In section 3 we discuss and compare the different types of address decoders used for component selection in related works. In section 4 we explain the circuit chosen for this test chip. Sections 5 and 6 show the layout and the simulation results, respectively. Finally, conclusions are drawn in section 7.

#### 2. MEASUREMENT CHALLENGES

One of the most serious challenges in characterizing process variations for sub-100-nm technologies is obtaining statistical data with reliable precision within a reasonable time. To obtain meaningful variation data, special care must be taken with the test structure and the measurement setup. Because of its statistically random nature, the variation effect must be characterized by measuring a large number of individual devices. A first class of approaches to obtain statistical data from a given process is to use transistor arrays [6, 7]. These allow high measurement accuracy and are effective in generating data for transistor models [8, 9]. However, transistor arrays unavoidably occupy a large area and have a low measurement throughput. An alternative to this first class of measurements is to convert an analog signal into a more robustly measurable quantity, since doing so relaxes the requirements on the test equipment and environment. Frequency measurements using on-chip circuitry can help with signal-to-noise and bandwidth problems and provide a minimally invasive probing strategy. Ring oscillator-based approaches [10] are efficient in test time and are good general-purpose indicators of digital performance. However, they typically cannot predict the variation of individual devices, because the oscillation frequency tends to indicate the mean device strength. Nevertheless, the attractiveness of transistor arrays as test structures has led to recent efforts [11] to create optimized structures with fast measurement, sufficient replication and good generality. These rely on multiplexed transistor arrays with high-density access to multiple devices by means of address decoders and/or shift-register circuits.

## **3. RELATED WORKS**

Address decoders are fundamental building blocks for systems that use buses. They are present in all integrated circuit families and processes and in all standard FPGA and ASIC libraries. For microprocessor register files and memory addressing, efficient decoders must be used because speed is a critical issue. Decoding structures tend to demand a large design effort because the fan-out of the address bits to all decoders is large, and the fan-out of the decoder's output to the transistors in the memory word is usually large too [12]. In our design, we are concerned with the decoder output as it relates to the selection of a single component from a 64x64 MOSFET matrix, as depicted in figure 1:



Figure 1 - The Variability Test Chip
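To make the selection scheme concrete, the following minimal sketch maps a device number in the 64x64 matrix to the row and column addresses applied to the decoders. It assumes one 6-bit decoder for the rows and one for the columns (12 address bits in total) and a row-major device numbering; the function names and the numbering convention are illustrative assumptions, not part of the test chip specification.

```python
# Hypothetical addressing helper for a 64x64 device matrix.
# Assumption: devices are numbered row by row (row-major), 0..4095.
ROWS = 64
COLS = 64
ADDRESS_BITS = 6  # 2**6 = 64 lines per decoder


def device_to_address(index):
    """Split a device number into (row address, column address)."""
    if not 0 <= index < ROWS * COLS:
        raise ValueError("device index out of range")
    return divmod(index, COLS)


def address_bits(value, width=ADDRESS_BITS):
    """Return the address bits, least significant bit (A0) first."""
    return [(value >> bit) & 1 for bit in range(width)]


row, col = device_to_address(130)   # 130 = 2 * 64 + 2
row_pins = address_bits(row)        # levels applied to the row decoder inputs
col_pins = address_bits(col)        # levels applied to the column decoder inputs
```

Under these assumptions, 12 address inputs are enough to activate any one of the 4096 transistors.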

Many considerations affect decoder design, such as speed, power consumption and layout area. In our design, layout area is the main concern. Layout considerations are important if the designers want to match the layout pitch of another structure, such as memory cells or a MOSFET matrix. Overall decoder size and power consumption are important [12]; a design that minimizes logical effort may require too much power or too many transistors to be practical. In our case, we are not concerned about power consumption. Finally, many decoder structures use a pre-charge architecture to reduce logical effort. Thus, we analyze several types of decoders, looking for the one that uses the smallest number of transistors to achieve the minimum area. We must also consider that each decoder type has a particular layout style, which could lead to complex cell placement and routing and to a much more irregular layout area, considering minimum-size transistors for the target technology. The specification goal of our design is a 6x64 address decoder. Table 1 summarizes different types of decoders of the same size found in the literature.

Table 1 - Comparison of 6X64 address decoders

| Decoder type      | # of transistors |
|-------------------|------------------|
| one stage         | 780              |
| transmission gate | 264              |
| pre-charge        | 524              |
| Flip-Flops        | 1024             |
| two stage         | 524              |

We start with the first entry in table 1, the one stage decoder. This simplest decoder is a collection of NAND gates with 6 inputs. Including the inverters, the estimated number of transistors is **780** [13]. This structure is useful for up to 5-6 inputs, or more if speed is not critical. The NAND transistors are usually made minimum size to reduce the load on the buffered address lines [13-15]. Given the number of transistors, the layout of this type is large.

The second decoder type in table 1 uses transmission gates in a tree-based style [15]. Its advantage is that the number of transistors is drastically reduced, to 264. Nevertheless, the tree-based style can create high output impedance and the layout can be irregular. Another disadvantage of this decoder type is that the delay increases quadratically with the number of sections, which is prohibitive for large decoders [15]. Using pre-charge logic [15] in the first decoder type of table 1 could solve the problem of high output impedance, but the number of transistors remains the same.

Another solution is to use flip-flops as shift-register circuits for the selection of lines or columns [16]. With shift registers, each line and/or column is selected by the bit stored in its flip-flop: a high logic level selects the target line and/or column, and a low logic level keeps it deactivated. Thus, by sending a serial vector to the shift register, we are able to select any device in the matrix. This would be a good solution. It could, for example, select various combinations of elements to be measured in parallel, and its main advantage is that it uses only three pads (one for the serial input vector, another for the clock, and a third to read back the vector sent, as a check). However, the area would not be reduced (1024 transistors). Moreover, this solution requires precise timing of the clock signal and the serial vector. If the rise and/or fall time of the clock is very slow, the shift register could suffer a race condition and the address would be wrong.

So, considering that we need the smallest possible area together with reliable addressing of the right component, the fifth decoder type in table 1, the two stage decoder [13], is our design choice. It is described in the next section.
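Some of the transistor counts in table 1 can be cross-checked with simple arithmetic. The sketch below is a rough estimate that assumes static CMOS gates with two transistors per input and one two-transistor inverter per address bit. Only the gate types (NAND6 for the one stage decoder, NOR4 and NAND3 for the two stage decoder) come from the text; the binary-tree breakdown of the transmission gate decoder and the 16-transistor flip-flop are assumptions that happen to reproduce the table values. The pre-charge entry is left out because its composition is not described.

```python
# Rough transistor-count estimate for a 6-to-64 decoder (assumptions in comments).
ADDRESS_BITS = 6
OUTPUTS = 2 ** ADDRESS_BITS  # 64 select lines


def static_gate(inputs):
    """Static CMOS NAND/NOR gate: one NMOS and one PMOS transistor per input."""
    return 2 * inputs


# Assumption: one two-transistor inverter per address bit for the complements.
input_inverters = ADDRESS_BITS * 2

# Assumption: binary tree with 2, 4, ..., 64 transmission gates per level.
tree_gates = sum(2 ** level for level in range(1, ADDRESS_BITS + 1))

counts = {
    # 64 six-input NAND gates plus the input inverters
    "one stage": OUTPUTS * static_gate(6) + input_inverters,
    # two transistors per transmission gate plus the input inverters
    "transmission gate": 2 * tree_gates + input_inverters,
    # assumption: one 16-transistor flip-flop per select line
    "Flip-Flops": OUTPUTS * 16,
    # 16 NOR4 pre-decoder gates, 64 NAND3 gates, plus the input inverters
    "two stage": 16 * static_gate(4) + OUTPUTS * static_gate(3) + input_inverters,
}

for name, count in counts.items():
    print(f"{name}: {count}")
```

Under these assumptions the estimates are 780, 264, 1024 and 524 transistors, matching the corresponding rows of table 1.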

### 4. THE ADDRESS DECODER

Our choice is based on area reduction and on efficient use of PADs, leading to the two stage decoder without pre-charge. This circuit has a reduced area, an easy layout and a reliable address output, and it reduces the number of PADs in a ratio of about one (1) to ten (10). Compared with the other circuits, its number of transistors results in a satisfactory and acceptable area. In addition, it is a static and highly reliable circuit, and with only 6 PADs we can select 64 lines and/or columns. Although this circuit does not have the smallest number of transistors, it has a simple and regular layout, which facilitates the design, and its logic signals are strong (unlike the transmission gate decoder, in which the non-active pins are in high impedance). The main idea of this decoder [13] is to build a large decoder from a cascade of smaller gates. In our solution, we implemented a two stage decoder (with pre-decoding) to obtain a 6X64 decoder. It can be formed with one 4X16 decoder (pre-decoding), which uses **NOR4 gates**, and sixteen 2X4 decoders, which use **NAND3 gates**. The estimated number of transistors is **524**. Our circuit is a NOR-NAND-NOR decoder, as depicted in the following figure:



Figure 2 - Logic diagram of the decoder

As seen in the figure above, the circuit consists of 16 blocks, each with one standard four-input NOR, which performs a pre-selection from the input signals A2 to A5, and three-input NAND gates, which combine the NOR4 output with the other two signals (A0 and A1) coming from the PADs. The 16 blocks are linked by a single bus (figure 3). This strategy allows easy replication of each block, changing only the order of its connections to the main bus. The complete schematic of our circuit is shown in figure 3:



Figure 3 - Address Decoder Schematic
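The logic of the two stage structure can be checked with a small behavioral model. The sketch below is not the authors' netlist: it models only the NOR4 pre-decoding and the NAND3 stage, and it assumes that each NOR4 receives the true or complemented address lines according to its block number and that the select lines are active low at the NAND3 outputs. The function names and the output ordering are illustrative assumptions.

```python
def nor(*bits):
    """NOR gate: output is 1 only when every input is 0."""
    return int(not any(bits))


def nand(*bits):
    """NAND gate: output is 0 only when every input is 1."""
    return int(not all(bits))


def decode_6x64(address):
    """Return 64 select lines (assumed active low) for a 6-bit address A5..A0."""
    a = [(address >> bit) & 1 for bit in range(6)]  # a[0] = A0 ... a[5] = A5
    outputs = []
    for block in range(16):
        # Pre-decoding: each NOR4 input is A2..A5, true or complemented
        # (modelled with XOR), so it is 0 exactly when the bit matches the block.
        pre = nor(*[a[2 + i] ^ ((block >> i) & 1) for i in range(4)])
        for line in range(4):
            # Second stage: NAND3 of the block select with A1 and A0,
            # each taken true or complemented according to the line number.
            a1 = a[1] if (line >> 1) & 1 else 1 - a[1]
            a0 = a[0] if line & 1 else 1 - a[0]
            outputs.append(nand(pre, a1, a0))
    return outputs


selected = decode_6x64(37).index(0)  # position of the single low output
```

For every one of the 64 addresses the model drives exactly one line low, at the position equal to the address, which is the behavior expected from a static decoder with no high-impedance outputs.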

# **5. LAYOUT**

Using the Cadence Virtuoso tools, we developed the schematic and the layout of the circuit. This flow was chosen for the convenience of the layout, which was designed with a standard-cell approach. The result was a more compact layout, as we can see in Figure 4.

The 16 groups were linked to the bus using only four of the metal levels available in the IBM 65nm CMOS technology. As the circuit does not operate at high frequency, the bus was placed over the circuit, further reducing the area used by the decoder in exchange for a small parasitic capacitance that is not harmful.





Figure 5 shows the assembly of the decoder, where the five groups are placed side by side and the bus is placed on top, dropping down to the lower metal levels to make the connections to their cells. The layout of our circuit, with each group highlighted, is shown in figure 5:



Figure 5 – Sketch of the Address Decoder Layout

## 6. SIMULATION

The parasitic capacitances and resistances were then extracted from the layout using Calibre from Mentor Graphics, and the simulation was performed with Cadence Spectre. The simulation of our extracted circuit is shown in figure 6:



As planned, the decoder worked well, responding with a short delay (always less than 0.5ns) for signals with rise and fall times of 1ns. These results are adequate for our purpose, since the circuit is not intended to operate at high frequency.

### 7. CONCLUSION

The use of these devices is essential to obtain statistical parameters of mismatch or of other kinds. From the design specification, it is possible to choose a decoder circuit that meets all constraints. After a detailed analysis of existing solutions, the characteristics of the available technology and the needs of this design, we chose a simple two stage decoder. With this circuit of **524** transistors we can reliably select a large number of devices at a very small cost in area and PADs.

The choice of a decoder circuit for a variability test chip must be planned very carefully and with high priority. This choice sets the number of PADs needed, the remaining area, the maximum operating frequency and the characteristics of the selection signals. Choosing the right decoder from the beginning can avoid situations such as reworking the layout, poor use of PADs or even the need to restart the design.

### 8. REFERENCES

[1] Ytterdal, T.; Cheng, Y.; and Fjeldly T. A.; *Device Modeling for Analog and RF CMOS Circuit Design*. New York: Wiley, 2003.

[2] S. Springer et al., "Modeling of Variation in Submicrometer CMOS ULSI Technologies," IEEE Transactions on Electron Devices, vol. 53, Issue 9, pp. 2168–78, September 2006.

[3] Asenov, A.; Brown, A.R.; Davies, J.H.; Kaya, S.; Slavcheva, G., "Simulation of intrinsic parameter fluctuations in decananometer and nanometer-scale MOSFETs," IEEE Transactions on Electron Devices, vol. 50, no. 9, pp. 1837-1852, Sept. 2003.

[4] Wilson R.; "The dirty little secret: Engineers at design forum vexed by rise in process variations at the die level," EE Times, p. 1, Mar. 25, 2002. Web: http://www.eetimes.com/issue/fp/OEG20020324S0002.

[5] Kuhn, K. et al. "Managing Process Variation in Intel's 45nm CMOS Technology". Intel Technology Journal, [S.I.], 2008.

[6] Wang, V.; Shepard, K.L., "On-chip transistor characterization arrays for variability analysis," Electronics Letters, vol.43, no.15, pp.806-807, July 19 2007.

[7] Agarwal, K.; Liu, F.; McDowell, C.; Nassif, S.; Nowka, K.; Palmer, M.; Acharyya, D.; Plusquellic, J., "A Test Structure for Characterizing Local Device Mismatches," 2006 Symposium on VLSI Circuits, Digest of Technical Papers, pp. 67-68, 2006.

[8] Galup-Montoro, C.; Schneider, M.C.; Klimach, H.; Arnaud, A., "A compact model of MOSFET mismatch for circuit design," IEEE Journal of Solid-State Circuits, vol. 40, no. 8, pp. 1649-1657, Aug. 2005.

[9] Chung-Hsun Lin; Dunga, M.V.; Darsen Lu; Niknejad, A.M.; Chenming Hu, "Statistical Compact Modeling of Variations in Nano MOSFETs," 2008 International Symposium on VLSI Technology, Systems and Applications (VLSI-TSA), pp. 165-166, 21-23 April 2008.

[10] Bhushan, M.; Ketchen, M.B.; Polonsky, S.; Gattiker, A., "Ring oscillator based technique for measuring variability statistics," 2006 IEEE International Conference on Microelectronic Test Structures (ICMTS), pp. 87-92, 6-9 March 2006.

[11] Agarwal, K.; Hayes, J.; Nassif, S., "Fast Characterization of Threshold Voltage Fluctuation in MOS Devices," IEEE Transactions on Semiconductor Manufacturing, vol. 21, no. 4, pp. 526-533, Nov. 2008.

[12] Sutherland, I., B. Sproull, and D. Harris, *Logical Effort: Designing Fast CMOS Circuits*, Morgan Kaufmann, San Francisco, USA, 1999.

[13] Weste, N.H.E., and D. Harris, *CMOS VLSI Design: A Circuits and Systems Perspective*, 3rd ed., Pearson/Addison-Wesley, Boston, USA, 2005.

[14] Goel, Ashish Kumar; Agarwal, Manish; "Decoder scheme for making large size decoder", STMicroelectronics, United States, 2004. US Patent 6794906 http://www.freepatentsonline.com/6794906.html

[15] Rabaey, J.M. et al., *Digital Integrated Circuits: A Design Perspective*, Prentice-Hall, Englewood Cliffs, USA, 2002.

[16] Klimach H., "Modelo do descasamento (Mismatch) entre transistores MOS" Ph.D. Thesis in Engenharia Elétrica, Universidade Federal de Santa Catarina(UFSC), 2008.