# EVALUATION OF STANDARD CELL LIBRARIES WITH DIFFERENT TEMPLATES AND GATE DESIGN APPROACHES

Diogo C. da Silva<sup>1</sup>, Paulo F. Butzen<sup>1,2</sup>, André I. Reis<sup>1</sup>, Renato P. Ribas<sup>1</sup>

<sup>1</sup>UFRGS - Instituto de Informática – Nangate-UFRGS Research Lab, Porto Alegre, Brazil <sup>2</sup>PGMICRO – UFRGS, Porto Alegre, Brazil

## ABSTRACT

This paper examines issues related to building a cell library and analyzes the differences between libraries with several cell heights and gate design approaches. The libraries are composed of CMOS cells that implement all 1 to 7-input functions. Six libraries with different cell heights were created using an automatic layout generation tool. Design techniques such as folding and output buffers were explored to achieve better designs. The results show the area analysis for each library height and confirm the effectiveness of design techniques for achieving high-quality designs.

## **1. INTRODUCTION**

Standard cell design is the most widely used circuit design methodology in ASIC projects. It basically consists of two steps: *logic synthesis*, where the circuit description is mapped to logic gates, and *physical synthesis*, where these logic gates are placed and connected (routed). The logic gates used in the logic synthesis step are described in a previously designed library. The quality of the library is directly related to the quality of the ASIC [1].

Several works explore the generation, composition, and optimization of libraries. A solution for standard cell design automation from a behavioral description is presented in [2]. In [1], Scott et al. examine the issues associated with building a cell library and propose a set of cells that should be included in a library in order to improve the speed of a circuit. Library composition is also explored in [3-8], where there is no consensus about the ideal library. In [8], Fischer et al. claim that it is desirable to have specific standard cell libraries for low-power, high-performance and low-area circuits. Library optimization is explored in [8-10]. In these works, gate sizing, drive strength and P/N ratio are the optimization parameters.

Previous works have not explored the influence of cell height on area, performance and power results of standard cell libraries. It is an important parameter because all gates in a standard cell library have the same height and the area is directly related to it. The performance and power characteristics of each gate can be influenced when design strategies such as folding and output buffers are applied in cell generation. The influence of adding complex gates to the library is also explored.

This work compares area, performance and power characteristics of standard cell libraries composed of different sets of gates, designed for several heights. The design strategies used are also evaluated.

The paper is organized as follows. Design concerns related to the internal layout of standard cells are described in Section 2. The experimental method and tools used to create the libraries are described in Section 3. The results of the experiments and future works are presented in Section 4 and the conclusions in Section 5.

## 2. CELL LAYOUT DESIGN CONCERNS

This section discusses general aspects that should be considered in the creation of a standard cell library. The first aspect to be considered in a library is a fixed height for all gates. This characteristic allows sharing the same supply and ground locations when the gates are placed side by side. It also implies fixed N and P transistor regions, which either directly influence the P/N transistor ratio or are defined based on it. The P/N ratio is defined as the ratio of the PMOS width to the NMOS width of the transistors in an inverter. This ratio is also used to design all other gates.

Another aspect that should be defined is the sizing applied to stacks of transistors, as found in NAND gates. The general rule is to multiply the transistor width by the number of series transistors. A constant can be added to achieve a better design. For example, in libraries where the focus is minimizing area, this constant is usually a number smaller than one.

The library can also have several versions of the same cell with different transistor sizes to provide different current capabilities. This is defined as drive strength. The most common libraries have three drive strengths for most gates: X1, X2, and X4, where X2 and X4 have transistors two and four times wider than X1, respectively.

Some aspects related to cell design strategies are also applied in library generation. The most used design techniques are referred to in this work as *folding* and *output buffer*. These two techniques are usually used when some gates cannot be designed at a specific cell height due to excessively wide transistors. In a folded design, wide transistors are "split" multiple times, being replaced by two or more parallel devices of smaller width. When the gate is composed of several transistors and folding would have to be applied to most of them, compromising the gate's area, an alternative is building the gate in two stages, i.e. the cell's logic followed by an output buffer. The logic stage is responsible for implementing the logic function and the output buffer for providing the desired current capability or drive strength. Figure 1 illustrates three different versions, X1 (a), X2 (b), and X4 (c), of a NAND3 gate. The NAND3\_X2 layout illustrates folding in its NMOS transistors. The NAND3\_X4 layout depicts the output buffer technique. In all layouts it is possible to observe other concepts described previously, such as the fixed cell height and P/N ratio.







Figure 1 – Three versions of a NAND3 gate. (a) X1, (b) X2 with folding in NMOS transistors, and (c) X4 with output buffer.
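The folding idea described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `fold` and the limit `MAX_NMOS_NM` are hypothetical, the limit is a placeholder rather than a value taken from any template in this work, and the layout tool's actual folding rules are not described in the paper. The 360 nm starting width is the NMOS width of a 3-transistor stack listed in Table 1.

```python
def fold(width_nm: int, max_width_nm: int) -> list[int]:
    """Split one transistor into parallel fingers that each fit the region.

    Returns the finger widths; their sum always equals `width_nm`.
    """
    fingers = -(-width_nm // max_width_nm)  # integer ceiling division
    base, extra = divmod(width_nm, fingers)
    # Spread the remainder so the fingers differ by at most 1 nm.
    return [base + 1 if i < extra else base for i in range(fingers)]


# Hypothetical maximum NMOS width allowed by the cell height (placeholder).
MAX_NMOS_NM = 500

# X2 and X4 use transistors two and four times wider than X1.
for drive in (1, 2, 4):
    print(f"NAND3_X{drive} NMOS fingers:", fold(360 * drive, MAX_NMOS_NM))
```

With this placeholder limit the X1 device fits unfolded, while the wider X2 and X4 devices are split into parallel fingers. A taller template raises the limit and therefore reduces the number of fingers.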

## **3. EXPERIMENTAL METHOD**

This section describes the experimental method used to design the libraries with several heights. To generate the cell layouts and characterize the library, two commercial tools were used, Nangate Library Creator [11] and Nangate Library Characterizer [11], respectively, for a predictive 45nm technology process [12].

The commercial tools employed define the library height as a multiple of a base row height. There are different ways to define the row height. In this work, a row is the minimum distance between the centers of two parallel metal1 wires. Like the height, the width of the final layout is a multiple of a base column width. As with the row, a column can be defined in several ways. In this work, a column is the distance between the centers of the source and drain diffusion contacts of a transistor.
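Because both dimensions snap to this grid, the area of a cell is the product of its row count, its column count, and the two pitches. The sketch below illustrates that relation. The class name `Template` and both pitch values are hypothetical placeholders; they are not taken from the technology process or from the tools used in this work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    """Layout grid of a library template (pitch values are placeholders)."""
    rows: int             # cell height, in rows
    row_pitch_um: float   # hypothetical wire pitch, not a real process value
    col_pitch_um: float   # hypothetical contact-to-contact pitch

    def height_um(self) -> float:
        return self.rows * self.row_pitch_um

    def cell_area_um2(self, columns: int) -> float:
        """Area of a cell that occupies `columns` columns in this template."""
        return self.height_um() * columns * self.col_pitch_um


short = Template(rows=8, row_pitch_um=0.2, col_pitch_um=0.2)
tall = Template(rows=9, row_pitch_um=0.2, col_pitch_um=0.2)

# A shorter template only saves area if the gate does not need extra columns.
print(short.cell_area_um2(columns=5))  # 8 rows x 5 columns = 40 grid units
print(tall.cell_area_um2(columns=4))   # 9 rows x 4 columns = 36 grid units
```

The example shows why a lower cell height is not automatically better: a gate that needs one extra column in the shorter template can end up larger than the same gate in the taller one.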

The chosen sizing strategy has a P/N ratio of 1.5, i.e., a PMOS transistor is 1.5 times wider than an NMOS transistor under the same topology conditions. The minimum NMOS transistor width is 0.18 µm and the minimum PMOS transistor width is 0.27 µm. The strategy used in transistor stack structures is presented in Table 1.

Table 1 – Stack Strategy

| Transistors in stack | NMOS width | PMOS width |
|----------------------|------------|------------|
| 1                    | 0.18 µm    | 0.27 µm    |
| 2                    | 0.27 µm    | 0.41 µm    |
| 3                    | 0.36 µm    | 0.54 µm    |
| 4                    | 0.45 µm    | 0.68 µm    |
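The values in Table 1 are consistent with a simple rule: each additional series transistor adds half of the minimum NMOS width, and the PMOS width is 1.5 times the NMOS width. The sketch below reproduces the table under that assumption. The rule is inferred from the listed values rather than stated by the tools, and the function name `stack_widths_nm` is hypothetical. Widths are kept in integer nanometres to avoid floating-point rounding; Table 1 shows them rounded to two decimals in µm.

```python
MIN_NMOS_NM = 180   # minimum NMOS width (0.18 µm)
STEP_NM = 90        # assumed increment per extra series transistor (inferred)


def stack_widths_nm(stack: int) -> tuple[int, int]:
    """Return (NMOS, PMOS) widths in nm for a stack of `stack` transistors."""
    if stack < 1:
        raise ValueError("stack must be at least 1")
    nmos = MIN_NMOS_NM + STEP_NM * (stack - 1)
    pmos = nmos * 3 // 2   # P/N ratio of 1.5; exact for these widths
    return nmos, pmos


for n in range(1, 5):
    nmos, pmos = stack_widths_nm(n)
    print(f"stack {n}: NMOS {nmos / 1000:.3f} µm, PMOS {pmos / 1000:.3f} µm")
```

Under this rule the width grows more slowly than the number of series transistors, which matches the area-oriented sizing discussed in Section 2.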

The libraries comprise the 1- to 4-input functions of Genlib 44-6, each with three drive strengths: X1, X2, and X4. Each library contains 51 combinational gates. The characterization is performed for a capacitance range of 0.5fF to 10fF and an input slope range of 5ns to 500ns.

## 4. EXPERIMENTAL RESULTS AND FUTURE WORKS

The analysis of the results is divided into three topics. The first one evaluates area results for different libraries in different height templates. The second topic presents the design strategies (folding and output buffer) used in each library. The last one discusses future works.

### 4.1. Area Results

This sub-section explores the area results of several libraries with different cell heights. Table 2 presents the sum of the areas of all the gates in each library. The libraries are divided by the number of inputs. The number of gates in each library is also presented. The area results for combining these libraries up to n-inputs are illustrated in Figure 2.

In the data presented, the best results are concentrated in the 9-row library, which has the smallest total area for the 1-, 3-, and 4-input sets; only the 2-input set is smaller in the 8-row library. This does not imply an ideal cell height for any library; the best result is produced by a combination of factors. Cell height is one of the most important factors influencing the total library area. However, it should be defined taking into account the sizing strategy and the design strategies used in gate designs. In the next sub-section, the analysis of the design strategies confirms the better results of the 9-row library.

| # Rows | 1 input (3 gates) | 2 inputs (6 gates) | 3 inputs (12 gates) | 4 inputs (30 gates) |
|--------|-------------------|--------------------|---------------------|---------------------|
| 8      | 1.92              | 6.81               | 20.64               | 57.24               |
| 9      | 1.68              | 7.18               | 18.67               | 48.60               |
| 10     | 1.86              | 7.98               | 20.22               | 51.87               |
| 11     | 2.05              | 8.78               | 22.24               | 54.42               |
| 12     | 2.23              | 7.02               | 23.94               | 58.41               |
| 13     | 2.42              | 7.61               | 22.82               | 61.55               |

Table 2 – Sum of all gate areas for n-input libraries designed in different template heights.
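The "up to n-inputs" totals plotted in Figure 2 can be recomputed from Table 2 by accumulating each row from left to right. The sketch below does this in Python. The normalization baseline of Figure 2 is not stated in the text, so dividing by the smallest total in each group is an assumption made here for illustration, and the function names are hypothetical.

```python
from itertools import accumulate

# Table 2: total area per template height (rows) for 1-, 2-, 3-, 4-input gates.
AREA = {
    8:  [1.92, 6.81, 20.64, 57.24],
    9:  [1.68, 7.18, 18.67, 48.60],
    10: [1.86, 7.98, 20.22, 51.87],
    11: [2.05, 8.78, 22.24, 54.42],
    12: [2.23, 7.02, 23.94, 58.41],
    13: [2.42, 7.61, 22.82, 61.55],
}


def cumulative(area: dict[int, list[float]]) -> dict[int, list[float]]:
    """Area of the library containing all gates with up to n inputs."""
    return {rows: list(accumulate(vals)) for rows, vals in area.items()}


def best_rows(cum: dict[int, list[float]], n_inputs: int) -> int:
    """Template height with the smallest library of up to `n_inputs` inputs."""
    return min(cum, key=lambda rows: cum[rows][n_inputs - 1])


def normalized(cum: dict[int, list[float]], n_inputs: int) -> dict[int, float]:
    """Areas divided by the smallest one in the group (assumed baseline)."""
    best = min(vals[n_inputs - 1] for vals in cum.values())
    return {rows: vals[n_inputs - 1] / best for rows, vals in cum.items()}


cum = cumulative(AREA)
for n in range(1, 5):
    print(f"up to {n} inputs: best height = {best_rows(cum, n)} rows")
print({rows: round(v, 3) for rows, v in normalized(cum, 4).items()})
```

For the complete library of up to 4 inputs, the 9-row template gives the smallest total, in line with the discussion above.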



Figure 2 – Normalized area of libraries up to n-inputs for different template heights.

### 4.2. Design Strategy Results

The automatic layout generation tool used in this work provides two design strategies for achieving better area results in the final gate design. The concepts of folding and output buffers were already explored in Section 2. Considering the area constraint, the best design is the one that does not require folding or output buffers. The folding strategy adds some area penalty (in terms of columns, a concept presented in Section 3), and output buffers are the worst solution, used only when folding becomes unacceptable because it introduces too many parallel devices. Based on the previous results, the library with the most gates using the output buffer strategy tends to present the worst area results (in terms of columns), followed by folding and then by neither strategy.

Figure 3 shows the percentage of gates using each strategy for the library of cells with up to 4 inputs implemented in different template heights. It is possible to conclude that reducing the template height from 9 rows to 8 rows does not effectively reduce area, since there is a significant increase in the number of gates that require the output buffer strategy. Reducing the height from 11 to 9 rows, in contrast, is advantageous, since the proportion of design strategies used in the library's gates remains constant. In the case of migration from 13 or 12 rows to smaller heights, the impact is small, since the mix of design strategies changes less than the height does.



Figure 3 – Design strategies for the library of all gates with up to 4 inputs for different template heights.
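The strategy ranking described in this sub-section, and the per-library percentages reported in Figure 3, can be sketched as follows. This is a hedged illustration only: the threshold `max_fingers`, the function names, and the example cells are hypothetical, and the decision rules of the layout generation tool are not described in the paper.

```python
from collections import Counter


def choose_strategy(fingers_per_transistor: list[int], max_fingers: int = 2) -> str:
    """Pick the least costly strategy for one gate.

    `fingers_per_transistor` holds, for each transistor, how many parallel
    devices it would need at the given cell height (1 means no folding).
    """
    worst = max(fingers_per_transistor)
    if worst <= 1:
        return "none"            # fits as drawn: best area
    if worst <= max_fingers:
        return "folding"         # some column penalty
    return "output buffer"       # too many parallel devices: two-stage gate


def strategy_share(library: dict[str, list[int]]) -> dict[str, float]:
    """Percentage of gates in `library` using each strategy."""
    counts = Counter(choose_strategy(f) for f in library.values())
    return {name: 100.0 * n / len(library) for name, n in counts.items()}


# Hypothetical finger counts for four cells at one template height.
example = {
    "INV_X1": [1, 1],
    "NAND2_X2": [1, 1, 2, 2],
    "NAND3_X4": [2, 2, 2, 4, 4, 4],
    "NOR2_X1": [1, 1, 1, 1],
}
print(strategy_share(example))
```

Running the same tally for each template height would give the kind of breakdown that Figure 3 summarizes.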

### 4.3. Future Works

Two points are currently being explored in the context of this paper. The first one is generating the libraries for cells with more inputs. The goal is to create libraries with functions of up to 7 inputs. The second point is performing power and performance analysis of these libraries and the different design strategies.

## **5. CONCLUSIONS**

This paper presented an area-focused analysis of the influence of cell height on a standard cell library composed of 51 combinational gates with up to 4 inputs. The analysis also explored the folding and output buffer design strategies. The generation of cells with more than 4 inputs and the analysis of power and performance are currently being performed.

## 6. ACKNOWLEDGEMENTS

This work has been developed in cooperation with Nangate Inc., including financial support.

## 7. REFERENCES

[1] K. Scott and K. Keutzer, "Improving Cell Libraries for Synthesis," *CICC*, IEEE, pp. 128-131, 1994.

[2] A. Bahuman, K. Rasheed, and B. Bishop, "An Evolutionary Approach for VLSI Standard Cell Design," *CEC*, IEEE, pp. 431-436, 2002.

[3] B. D. Guan and C. Sechen, "Large Standard Cell Libraries and Their Impact on Layout Area and Circuit Performance," *ICCD*, IEEE, pp. 378-383, 1996.

[4] B. D. Guan and C. Sechen, "ASIC Automatic Layout Generation Using Large Standard Cell Libraries," *ICASIC*, pp. 5-8, 1996.

[5] J. Masgonty et al., "Low-Power Low-Voltage Standard Cell Libraries with a Limited Number of Cells," *PATMOS*, pp. 940-448, 2001.

[6] M. Vujkovic and C. Sechen, "Optimized Power-Delay Curve Generation for Standard Cell ICs," *ICCAD*, ACM, pp. 387-394, 2002.

[7] A. Ricci, I. D. Munari, and P. Ciampolini, "An Evolutionary Approach for Standard-Cell Library Reduction," *GLSVLSI*, ACM, pp. 305-310, 2007.

[8] C. Fischer et al., "Optimization of Standard Cell Libraries for Low Power, High Speed, or Minimal Area Designs," *CICC*, IEEE, pp. 493-496, 1996.

[9] S. Lin, M. Marek-Sadowska, and E. S. Kuh, "Delay and Area Optimization in Standard-Cell Design," *DAC*, ACM, pp. 349-352, 1990.

[10] D. S. Kung and R. Puri, "Optimal P/N Width Ratio Selection for Standard Cell Libraries," *ICCAD*, IEEE, pp. 178-184, 1999.

[11] Nangate Inc., Available at: www.nangate.com, 2009.

[12] FreePDK Predictive technology node, Available at: http://www.eda.ncsu.edu/wiki/FreePDK, 2009.