

# Performance, Power, and Area of Standard Cells in Sub 3 nm Node Using Buried Power Rail

Jun-Sik Yoon, Member, IEEE, Jinsu Jeong, Seunghwan Lee, Junjong Lee, Graduate Student Member, IEEE, Sanguk Lee, Graduate Student Member, IEEE, Rock-Hyun Baek, Member, IEEE, and Sung Kyu Lim, Senior Member, IEEE

**Abstract—We analyzed the performance, power, and area of 3 nm node fin and nanosheet (NS) field-effect transistors (FETs) implementing buried power rail (BPR) after full calibration to 5 nm node hardware. Fin-shaped FETs (FinFETs) have smaller RC delay than NS FETs (NSFETs) under the same footprint and two-fin configuration. A larger number of NS channels boosts drive currents but also increases gate capacitances as a tradeoff. Compared with 7 nm, 3 nm standard cells achieve 75% cell area scaling on average. Cells using BPR decrease delay, transition time, internal power, and pin capacitances under the same area. Larger cells such as D-flip flop (DFF) and XOR decrease those further because the parasitic capacitances of metal layers between signal and power/ground decrease more. NS-based cells using BPR can improve delay and transition time by increasing the number of NS channels, but increase internal power and pin capacitance. Overall, fin-based cells using BPR have smaller energy delay product by 12% compared with those without BPR and by 10% compared with NS-based cells using BPR.**

**Index Terms—3 nm, buried power rail (BPR), fin, nanosheet (NS), performance-power-area (PPA), standard cell.**

#### I. INTRODUCTION

SILICON fin-shaped field-effect transistors (FinFETs) have been scaled down from 10 to 5 nm node with continuous scaling of contacted poly pitch (CPP) and cell height

Manuscript received November 25, 2021; revised December 22, 2021; accepted December 23, 2021. Date of publication January 17, 2022; date of current version February 24, 2022. This work was supported in part by the Ministry of Trade, Industry and Energy under Grant 10080617, in part by the Korea Semiconductor Research Consortium Support Program for the Development of the Future Semiconductor Device, in part by the Pohang University of Science and Technology (POSTECH)- Samsung Electronics Industry-Academic Cooperative Research Center, in part by the National Research Foundation of Korea grant funded by the Ministry of Science, ICT under Grant 2020R1A4A4079777 and Grant 2020M3F3A2A02082436, and in part by BK21 FOUR and IC Design Education Center. The review of this article was arranged by Editor R. Wang. (Corresponding author: Rock-Hyun Baek.)

Jun-Sik Yoon is with the Department of Electrical Engineering, Pohang University of Science and Technology (POSTECH), Pohang 37673, South Korea, and also with the School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332 USA (e-mail: junsikyoon@postech.ac.kr).

Jinsu Jeong, Seunghwan Lee, Junjong Lee, Sanguk Lee, and Rock-Hyun Baek are with the Department of Electrical Engineering, Pohang University of Science and Technology (POSTECH), Pohang 37673, South Korea (e-mail: rh.baek@postech.ac.kr).

Sung Kyu Lim is with the School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332 USA.

Color versions of one or more figures in this article are available at https://doi.org/10.1109/TED.2021.3138865.

Digital Object Identifier 10.1109/TED.2021.3138865

(Fig. 1) [1]–[5]. Fin shape has changed from tapered to rectangular fin through full-fledged extreme ultraviolet (EUV) and SiGe channel [6]. Self-aligned contact and contact-over-active-gate (COAG) reduce the number of metal tracks [1], [2]. Single diffusion break (SDB) reduces the number of dummy gates to increase the standard cell density [1], [2], [7]. If CPP and cell height are scaled down at a constant rate from 10 nm node, 3 nm node is expected to have a CPP of 42 nm and a cell height of 120 nm. Gate-all-around (GAA) nanosheet field-effect transistors (NSFETs) reduce the short channel effects and have larger current drivability compared with FinFETs [8]. Also, nanosheet (NS) width ($W_{NS}$) is easily tuned to a desired value, which enables performance and power optimization for different applications. However, we should also consider the middle-of-line (MOL) layers because the parasitic *RC*s at MOL level increase greatly as technology node advances [9].

Buried power rail (BPR) has been proposed to place the power ($V_{DD}$) and ground ($V_{SS}$) metal lines below the devices [10]. In particular, back-side BPR decreases the IR drop and the back-end-of-line (BEOL) routing congestion by placing the power delivery network below the substrate [11]. Static random access memory (SRAM) implementing BPR as bitline for signal routing decreases both access time and dynamic power over conventional SRAM [12]. But to the best of our knowledge, there are no quantitative analyses of BPR-implemented standard cells in state-of-the-art technology nodes.

This study is based on full calibration to 5 nm hardware, thus estimating the performances of device and cell in 3 nm node accurately. In addition, we designed 24 standard cells in 3 nm node implementing fin or NS structure and/or BPR, and investigated them in terms of performance, power, and area (PPA) using commercial electronic design automation (EDA) tools. Therefore, this work provides a device design guideline from the cell layout perspective.

#### II. SIMULATION METHOD

Both FinFETs and NSFETs were simulated using Sentaurus TCAD [13]. All the equations used in this work were the same as in [14] and [15]. Doping profile, mobility, and carrier velocity were calibrated to 5-nm node FinFETs [6] as shown in Fig. 2(a). Geometry parameters of 5-nm node FinFETs are also included in the inset. First, subthreshold swing (SS) and drain-induced barrier lowering (DIBL) were fitted by changing source–drain (S–D) doping concentration and annealing time. Then, low-field mobility and its related parameters at high gate

0018-9383 © 2022 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.



Fig. 1. Standard cell area (CPP $\times$ cell height) of three major industries as technology node is scaled down from 10 to 7 nm nodes [1]–[5]. Based on the scaling trend, this work estimates the possible cell area in 3 nm node and addresses the implementation of BPR and/or GAA.



Fig. 2. (a) Calibration results to 5 nm node FinFETs [6] and (b)  $I_{ON} - I_{OFF}$ plot for 7 nm [18], 5, and 3 nm of NFETs (filled) and PFETs (empty).

electric field were calibrated by fitting the drain currents ($I_{ds}$) in the linear region. Finally, saturation velocity was modified to fit the $I_{ds}$ in the saturation region. Compared with the previous calibrations done in 10 nm node [14]–[17], this work calibrates to state-of-the-art 5-nm node FinFETs, so it should give more reliable predictions for 3 nm and beyond nodes.

In this work, it was assumed that 3 nm FinFETs follow the same performance gain as from 7 to 5 nm by improving the ON-currents ($I_{ON}$) by 15% with respect to 5 nm FinFETs [6]. The $I_{ON}$ was improved through proper process advancements including $W_{fin}$ scaling, a more abrupt S/D doping profile for similar DIBL and SS at the scaled gate length ($L_g$), and contact resistivity reduction. Fig. 2(b) shows the $I_{ON}$ versus OFF-current ($I_{OFF}$) plot for 7 nm [18], 5, and 3 nm for three different applications. From 7 to 5 nm, the $I_{ON}$ was improved greatly by 33% and 40% for N-type field-effect transistors (NFETs) and P-type field-effect transistors (PFETs), respectively, by taller and rectangular fin, SiGe high mobility channel, and other process advancements. In this work, we chose only the standard performance application (0.25 nA per fin) for device and cell-level analyses.

Table I presents the geometry parameters of FinFETs and NSFETs in 7 and 3 nm nodes. Geometrical parameters of 7 nm node are from ASAP7 [18]. The 3 nm node has the cell height of 120 nm, equal to a total of six fins or five metal tracks. Fin height ($H_{fin}$) is fixed to 55 nm, same as in 5 nm node. Fin width ($W_{fin}$) is chosen to be 5 nm for better controllability than 5 nm

TABLE I GEOMETRICAL PARAMETERS OF FinFETS AND NSFETS IN 7 AND 3 nm NODES

| Parameter        | 7 nm FinFET [18] | 3 nm FinFET | 3 nm NSFET |
|------------------|------------------|-------------|------------|
| CPP (nm)         | 54               | 42          | 42         |
| $L_{g}$ (nm)     | 21               | 12          | 12         |
| MP (nm)          | 36               | 24          | 24         |
| Cell height (nm) | 270              | 120         | 120        |
| # Track          | 7.5              | 5           | 5          |
| FP (nm)          | 27               | 20          |            |
| $W_{fin}$ (nm)   | 7                | 5           |            |
| $H_{fin}$ (nm)   | 32               | 55          |            |
| $N_{fin}$        | 1, 2, 3          | 1, 2        |            |
| $W_{NS}$ (nm)    |                  |             | 10, 25     |
| $T_{NS}$ (nm)    |                  |             | 5          |
| $T_{sp}$ (nm)    |                  |             | 10         |
| $N_{NS}$         |                  |             | 3, 4, 5    |

TABLE II GEOMETRY PARAMETERS AND RESISTANCES OF METAL LAYERS AND VIAS



node, but not 4 nm due to the loss of carrier mobility [19]. NS widths ($W_{NS}$) are 25 nm (W25) and 10 nm (W10) to match the footprint of two- and one-fin FinFETs. It was reported that NSFETs having larger $W_{NS}$ improve *RC* delay [17], but here we designed the devices under the same active area as FinFETs for fair comparison. NS thickness ($T_{NS}$) and spacing ($T_{sp}$) are 5 and 10 nm, respectively [8]. Table II indicates geometry parameters and resistances of metal layers and vias. Resistance of metal layers (M1, M2) is 347 $\Omega/\mu$m [20], and that of via (V0, V1) is 63.5 $\Omega$ [21]. Resistance of MOL metals is 523 $\Omega/\mu$m from ASAP7 [18]. Metal BPR (MBPR) and via BPR (VBPR) have the resistance of 65 $\Omega/\mu$m and 56 $\Omega$, respectively [10]. MBPR has the width of 25 nm and the aspect ratio of 2. VBPR has the width $\times$ length of 20 $\times$ 12 nm$^2$ and connects between MBPR and source-side MOL metals (M0). Operating voltage ($V_{dd}$) is fixed at 0.70 V.
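The pitch and resistance figures quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original flow: the helper names are hypothetical, the rail length of one CPP is chosen only for illustration, and the aspect ratio is assumed to mean height divided by width.

```python
# Values from Table I and the text (3 nm node); names are illustrative.
FP_NM = 20             # fin pitch
MP_NM = 24             # metal pitch
CPP_NM = 42            # contacted poly pitch
CELL_HEIGHT_NM = 120

R_M1_OHM_PER_UM = 347.0    # M1/M2 line resistance [20]
R_MBPR_OHM_PER_UM = 65.0   # MBPR line resistance [10]
MBPR_WIDTH_NM = 25
MBPR_ASPECT_RATIO = 2      # assumed to be height / width


def tracks(cell_height_nm, pitch_nm):
    """Number of pitches that fit in the cell height."""
    return cell_height_nm / pitch_nm


def rail_resistance_ohm(r_ohm_per_um, length_nm):
    """Resistance of a straight rail segment of the given length."""
    return r_ohm_per_um * length_nm / 1000.0


fin_tracks = tracks(CELL_HEIGHT_NM, FP_NM)      # six fins
metal_tracks = tracks(CELL_HEIGHT_NM, MP_NM)    # five metal tracks
mbpr_height_nm = MBPR_WIDTH_NM * MBPR_ASPECT_RATIO

# Rail resistance over one CPP (length chosen only as an example).
r_m1 = rail_resistance_ohm(R_M1_OHM_PER_UM, CPP_NM)
r_mbpr = rail_resistance_ohm(R_MBPR_OHM_PER_UM, CPP_NM)

print(f"fins per cell height: {fin_tracks:.0f}, metal tracks: {metal_tracks:.0f}")
print(f"MBPR height (assumed AR = h/w): {mbpr_height_nm} nm")
print(f"rail resistance per CPP: M1 {r_m1:.2f} ohm, MBPR {r_mbpr:.2f} ohm")
```

Under these assumptions the MBPR rail is roughly five times less resistive than an M1 rail of the same length, which is the resistance advantage the cell-level results rely on.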

Fig. 3 shows the geometry of FinFETs and NSFETs. All the process steps for FinFETs and NSFETs are the same as in [14]–[17]. Several geometrical parameters of the devices are specified. S/D doping concentration, annealing temperature, and time are $4 \times 10^{20}$ cm$^{-3}$, 1050 °C, and 0.5 s, respectively. Doping concentration for punchthrough-stopper (PTS) region is $5 \times 10^{18}$ cm$^{-3}$ to prevent subfin leakage. Both devices have the rectangular S/D epi, to be explained in Section III. For NSFETs, bottom oxide was used to completely remove the bottom transistor for dc/ac performance advancements [22], [23]. We also considered SiGe intermixing to Si NS channels causing threshold voltage changes [24].

Electrical characteristics of the standard cells were prepared (Fig. 4). We used Synopsys EDA tools, except for Cadence Liberate, which generated the library (LIB) file for fair comparison with ASAP7. HSPICE fits the transfer, output, and capacitance characteristics of the calibrated TCAD devices by using the Berkeley short-channel IGFET model (BSIM) common multi-gate (CMG).

Fig. 3. Half schematic diagrams of FinFETs and NSFETs in 3 nm node. Materials and three terminals [source (S), gate (G), and drain (D)] are specified.

Fig. 4. Schematic flow to generate LIB file including the electrical characteristics of the standard cells.

StarRC generates the nxtgrd file, containing the parasitic *RC* of metal interconnects, from the interconnect technology format (ITF). After the standard cell layouts are drawn using Custom Compiler, IC Validator performs the layout versus schematic (LVS) check. Then, StarRC does layout parasitic extraction (LPE) using the LVS output and nxtgrd. Finally, Cadence Liberate uses SPICE parameters and parasitic *RC* of the standard cells to generate the LIB file containing all the electrical characteristics of standard cells such as delay, transition time ($t_{tran}$), internal power ($P_{int}$), input pin capacitance ($C_{pin}$), and leakage power ($P_{leak}$). To utilize BPR, we first defined the resistances of MBPR and VBPR in ITF. After MBPR and VBPR are drawn in the cell layout, *RC* components of MBPR and VBPR are extracted in the LPE step.
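The ordering of the characterization flow described above can be summarized as a small dependency check. This is only a descriptive sketch: it does not invoke any of the tools, the artifact labels are illustrative strings, and treating the cell schematic as a primary input to LVS is an assumption.

```python
# Primary inputs that exist before the flow starts (labels are illustrative).
PRIMARY_INPUTS = {"calibrated TCAD characteristics", "ITF", "cell schematic"}

# Each stage: (tool and purpose, required inputs, produced outputs),
# listed in the order given in the text.
FLOW = [
    ("HSPICE (BSIM-CMG fitting)",
     {"calibrated TCAD characteristics"}, {"SPICE parameters"}),
    ("StarRC (nxtgrd generation)", {"ITF"}, {"nxtgrd"}),
    ("Custom Compiler (layout drawing)", set(), {"cell layout"}),
    ("IC Validator (LVS)", {"cell layout", "cell schematic"}, {"LVS output"}),
    ("StarRC (LPE)", {"LVS output", "nxtgrd"}, {"parasitic RC"}),
    ("Cadence Liberate (LIB generation)",
     {"SPICE parameters", "parasitic RC"}, {"LIB file"}),
]


def check_flow(flow, primary_inputs):
    """Walk the stages in order and verify every input is already available."""
    available = set(primary_inputs)
    for stage, inputs, outputs in flow:
        missing = inputs - available
        if missing:
            raise ValueError(f"{stage} is missing inputs: {sorted(missing)}")
        available |= outputs
    return available


artifacts = check_flow(FLOW, PRIMARY_INPUTS)
print("LIB file produced:", "LIB file" in artifacts)
```

Because MBPR and VBPR enter through the ITF and the layout, they affect only the nxtgrd and LPE stages in this picture; the device model fitting is unchanged.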

To fit two-fin NFETs/PFETs within the cell height of 120 nm, all the standard cells need one dummy fin between the devices to isolate S/D epi [Fig. 5(a)]. For the epi isolation, the diamond S/D epi should not grow laterally beyond the fin pitch (FP) minus $W_{fin}/2$, which is 17.5 nm in 3 nm node. But this is challenging when diamond epi and wrap-around contact (WAC) are used due to S/D epi merging [Fig. 5(b)]. Diamond epi without WAC can avoid this, but its dc performance is degraded by increasing the contact resistance. So, we used the S/D patterning (SDP) scheme forming rectangular S/D epi [15]. This scheme can maintain the dc performance and isolate the S/D epi concurrently.

Fig. 5. (a) Schematic showing the concerns of S/D epi isolation in the standard cell and (b) dc characteristics of the FinFETs having different S/D epi schemes.

#### III. RESULTS AND DISCUSSION

##### A. Device-Level Characterization

Fig. 6 summarizes the $I_{ON}$, gate capacitances ($C_{gg}$), and *RC* delay ($C_{gg}V_{dd}/I_{ON}$) of FinFETs and NSFETs in 3 nm node. $I_{OFF}$ is fixed to 0.5 nA for two-fin FinFETs and W25 NSFETs, and to 0.25 nA for one-fin FinFETs and W10 NSFETs. ASAP7 also has the same $I_{OFF}$ of 0.25 nA/fin [18]. Effective widths ($W_{\text{eff}}$) of FinFETs and NSFETs are calculated as $N_{fin} \cdot (W_{fin} + 2H_{fin})$ and $N_{NS} \cdot (2W_{NS} + 2T_{NS})$, respectively. While the $C_{gg}$ increases at a constant rate as a function of $N_{NS}$, the increasing rate of $I_{ON}$ per $N_{NS}$ decreases. This effect is explained by the $I_{ON}$ normalized by $W_{\text{eff}}$ ($I_{ON}/W_{\text{eff}}$) in Table III. NSFETs have smaller $I_{ON}/W_{\text{eff}}$ as the $N_{NS}$ increases because the longer carrier path for bottom-most NS channel



Fig. 6. ON-currents ($I_{ON}$), gate capacitances ($C_{gg}$), and RC delay of FinFETs (dotted lines) and NSFETs (symbols) having different $W_{NS}$ and $N_{NS}$. Effective widths ($W_{\text{eff}}$) of FinFETs and NSFETs are also specified in brackets.

TABLE III $I_{ON}$ NORMALIZED BY $W_{\text{eff}}$ FOR FinFETS AND NSFETS

| Device  | $N_{NS}$ | W10 (or 1 fin), N-type (mA/$\mu$m) | W10 (or 1 fin), P-type (mA/$\mu$m) | W25 (or 2 fins), N-type (mA/$\mu$m) | W25 (or 2 fins), P-type (mA/$\mu$m) |
|---------|----------|------|------|------|------|
| FinFETs |          | 0.62 | 0.64 | 0.67 | 0.68 |
| NSFETs  | 3        | 0.63 | 0.63 | 0.68 | 0.69 |
| NSFETs  | 4        | 0.62 | 0.60 | 0.66 | 0.66 |
| NSFETs  | 5        | 0.59 | 0.58 | 0.62 | 0.62 |

induces the larger parasitic resistance [17]. FinFETs and NSFETs have similar $I_{ON}/W_{\text{eff}}$ at the $N_{NS}$ of 3, but the NSFETs have smaller $I_{ON}$ under the same active area due to smaller $W_{\text{eff}}$ than FinFETs. The NSFETs with the $N_{NS}$ of 4 and 5 have larger $I_{ON}$ than the FinFETs due to their larger $W_{\text{eff}}$, but degrade the $C_{gg}$ arising from more channels, overlap, and outer-fringing capacitances [14]. Therefore, all the NSFETs have larger *RC* delay than the FinFETs. Previous work [14] showed that NSFETs have smaller *RC* delay than FinFETs. But the FP in this work is 20 nm, shorter than 28 nm in [14], thus increasing the $W_{\text{eff}}$ per footprint for FinFETs compared with NSFETs in the two-fin configuration. Comparing one-fin FinFETs and W10 NSFETs, FinFETs clearly outperform NSFETs in the one-fin configuration as the FinFETs have smaller $C_{gg}$ at the same $I_{ON}$.
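The effective-width comparison behind this argument follows directly from the two $W_{\text{eff}}$ formulas and the Table I geometry. A minimal sketch (function and variable names are illustrative, not from the original work):

```python
# 3 nm node geometry from Table I, in nm.
W_FIN, H_FIN = 5, 55
T_NS = 5


def weff_finfet(n_fin, w_fin=W_FIN, h_fin=H_FIN):
    """W_eff = N_fin * (W_fin + 2 * H_fin): fin top plus two sidewalls."""
    return n_fin * (w_fin + 2 * h_fin)


def weff_nsfet(n_ns, w_ns, t_ns=T_NS):
    """W_eff = N_NS * (2 * W_NS + 2 * T_NS): full perimeter of each sheet."""
    return n_ns * (2 * w_ns + 2 * t_ns)


finfet_weff = {n_fin: weff_finfet(n_fin) for n_fin in (1, 2)}
nsfet_weff = {(w_ns, n_ns): weff_nsfet(n_ns, w_ns)
              for w_ns in (10, 25) for n_ns in (3, 4, 5)}

for n_fin, w_ns in ((1, 10), (2, 25)):
    print(f"{n_fin}-fin FinFET: W_eff = {finfet_weff[n_fin]} nm")
    for n_ns in (3, 4, 5):
        print(f"  W{w_ns} NSFET, N_NS = {n_ns}: "
              f"W_eff = {nsfet_weff[(w_ns, n_ns)]} nm")
```

With these dimensions a two-fin FinFET has 230 nm of effective width, so a W25 NSFET needs four sheets (240 nm) to exceed it, consistent with the $I_{ON}$ ordering described above.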

##### B. Cell-Level Analysis

Fig. 7 shows the INV $\times$ 1 layouts of 3 nm node FinFETs without (w/o) and with (w/) BPR. All the standard cells in 3 nm node have the same cell height of 120 nm irrespective



Fig. 7. INV $\times$ 1 layouts of 3 nm node FinFETs without (left) and with (right) BPR.

of BPR. All the layers except MBPR and VBPR are obtained from ASAP7 [18]. Both COAG and SDB are adopted for the standard cell design. NSFETs use the same metal layers as FinFETs and thus are not shown here. INV$\times$1 without BPR has 12-nm-wide M1 layers for power/ground ($V_{DD}$/$V_{SS}$), whereas that with BPR has 25-nm-wide MBPR layers enabling longer input (*A*) and output (*Y*) M1 lengths for better routability in circuit design. The cells without BPR have 11.5 nm proximity of fins to $V_{DD}$/$V_{SS}$ lines, whereas the cells with BPR have 5 nm proximity. $V_{DD}$/$V_{SS}$ line resistances without and with BPR are the same as those of M1 and MBPR, respectively.

Table IV summarizes the electrical characteristics of two standard cells (INV$\times$1 and DFFH$\times$1) of FinFETs and NSFETs at different input slews and load capacitances ($C_{load}$). The 7 nm standard cells are also included for comparison. Energy delay product (EDP) is calculated as the product of power and delay squared.
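As a quick consistency check on this definition, the EDP entries of Table IV can be recomputed from the tabulated delay and internal power. The sketch below assumes that the power term is $P_{int}$ in fJ and the delay is in ps, which reproduces the tabulated EDP values to within rounding; the dictionary layout and names are illustrative.

```python
def edp(p_int_fj, delay_ps):
    """EDP as power times delay squared, in fJ*ps^2 (equal to 1e-39 J*s^2)."""
    return p_int_fj * delay_ps ** 2


# DFFHx1, fast case (input slew = 10 ps, C_load = 1.44 fF), from Table IV:
# variant -> (P_int in fJ, delay in ps, tabulated EDP)
DFFH_FAST = {
    "7 nm FinFET":          (0.878, 32.89, 949.69),
    "3 nm FinFET w/o BPR":  (0.617, 22.61, 315.40),
    "3 nm FinFET w/ BPR":   (0.593, 21.61, 277.01),
    "3 nm NS4 w/ BPR":      (0.649, 21.82, 308.74),
    "3 nm NS5 w/ BPR":      (0.737, 21.09, 327.73),
}

recomputed = {name: edp(p_int, delay)
              for name, (p_int, delay, _) in DFFH_FAST.items()}

for name, (_, _, tabulated) in DFFH_FAST.items():
    print(f"{name}: recomputed {recomputed[name]:.2f}, tabulated {tabulated:.2f}")
```

The small residual differences are expected, since the tabulated delay and $P_{int}$ are themselves rounded.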

First, comparing 7 and 3 nm nodes, the cell area is scaled down significantly for both INV×1 (−77%) and DFFH×1 (−68%). As the 3 nm node uses two fins instead of the three fins used in 7 nm node, both $C_{\text{pin}}$ and $P_{\text{leak}}$ decrease. Taller fin for 3 nm node increases the $I_{ON}$ per fin greatly. Overall, all the electrical characteristics (delay, $t_{tran}$, $P_{int}$) are improved for 3 nm node.

Second, comparing fin-based cells, BPR decreases the $C_{\text{pin}}$ by 1% for INV$\times$1 and 2% for DFFH$\times$1 due to the reduced parasitic capacitance ($C_{\text{para}}$) between $V_{DD}$/$V_{SS}$ and signal. In particular, BPR improves the delay, $t_{\text{tran}}$, and $P_{\text{int}}$ of DFFH$\times$1 more than those of INV$\times$1 because the $C_{\text{para}}$ decreases more for larger cells. The smaller resistance of MBPR compared with M1 (Table II) reduces those metrics further.

Third, fin-based cells have smaller $C_{\text{pin}}$ by 4% and 19% than NS-based cells with the $N_{\text{NS}}$ of 4 and 5, respectively, arising from smaller $C_{gg}$ (in Fig. 6). Unlike the *RC* delay results in Fig. 6, NS-based cells have shorter delay and $t_{tran}$

TABLE IV ELECTRICAL CHARACTERISTICS OF INV×1 AND DFFH×1 IN 7 AND 3 nm NODES AT DIFFERENT INPUT SLEW AND LOAD CAPACITANCES

| Metric | 7 nm FinFET [18] | 3 nm FinFET, w/o BPR | 3 nm FinFET, w/ BPR | 3 nm NS4, w/ BPR | 3 nm NS5, w/ BPR |
|--------|------------------|----------------------|---------------------|------------------|------------------|
| **INV×1** | | | | | |
| Area ($\mu$m$^2$) | 0.044 | 0.010 (−77%) | 0.010 (−77%) | 0.010 (−77%) | 0.010 (−77%) |
| $C_{pin}$ (fF) | 0.437 | 0.315 | 0.312 | 0.328 | 0.385 |
| $P_{leak}$ (pW) | 503 | 278 | 278 | 271 | 274 |
| *Fast case (input slew = 10 ps, $C_{load}$ = 1.44 fF)* | | | | | |
| Delay (ps) | 7.78 | 7.40 | 7.40 | 7.29 | 6.83 |
| $t_{tran}$ (ps) | 11.16 | 8.15 | 8.14 | 8.39 | 7.50 |
| $P_{int}$ (fJ) | 0.052 | 0.037 | 0.036 | 0.039 | 0.043 |
| EDP | 3.16 | 2.05 | 2.00 | 2.08 | 2.02 |
| *Slow case (input slew = 40 ps, $C_{load}$ = 5.76 fF)* | | | | | |
| Delay (ps) | 28.43 | 27.91 | 27.93 | 27.32 | 25.52 |
| $t_{tran}$ (ps) | 42.95 | 31.80 | 31.82 | 32.60 | 29.14 |
| $P_{int}$ (fJ) | 0.075 | 0.043 | 0.042 | 0.046 | 0.050 |
| EDP | 60.50 | 33.71 | 32.95 | 34.18 | 32.87 |
| **DFFH×1** | | | | | |
| Area ($\mu$m$^2$) | 0.379 | 0.121 (−68%) | 0.121 (−68%) | 0.121 (−68%) | 0.121 (−68%) |
| $C_{pin}$ (fF) | 0.431 | 0.293 | 0.288 | 0.300 | 0.352 |
| $P_{leak}$ (pW) | 2744 | 1491 | 1491 | 1497 | 1494 |
| *Fast case (input slew = 10 ps, $C_{load}$ = 1.44 fF)* | | | | | |
| Delay (ps) | 32.89 | 22.61 | 21.61 | 21.82 | 21.09 |
| $t_{tran}$ (ps) | 17.17 | 12.00 | 11.10 | 11.98 | 10.88 |
| $P_{int}$ (fJ) | 0.878 | 0.617 | 0.593 | 0.649 | 0.737 |
| EDP | 949.69 | 315.40 | 277.01 | 308.74 | 327.73 |
| *Slow case (input slew = 40 ps, $C_{load}$ = 5.76 fF)* | | | | | |
| Delay (ps) | 54.58 | 43.30 | 41.29 | 40.47 | 38.59 |
| $t_{tran}$ (ps) | 45.27 | 34.05 | 31.29 | 32.63 | 28.80 |
| $P_{int}$ (fJ) | 0.896 | 0.631 | 0.611 | 0.672 | 0.761 |
| EDP | 2668.75 | 1184.02 | 1041.75 | 1099.61 | 1133.07 |

EDP unit: $10^{-39}$ J·s$^2$.

for more $N_{\text{NS}}$. This is because the $C_{load}$ is much greater than the $C_{gg}$, and thus the $I_{ON}$ dominantly affects the cell speed in this case. Increasing the $N_{\text{NS}}$ clearly decreases delay and $t_{\text{tran}}$, by 6% and 11% for the slow case, respectively. But the $I_{ON}$ increase causes the $P_{int}$ increase due to the increase of short-circuit currents in operation.

Finally, fin- and NS-based INV$\times$1 have similar EDPs, whereas fin-based DFFH$\times$1 has smaller EDP by 10% for the fast case and by 5% for the slow case than NS-based ones. DFFH$\times$1 uses more field-effect transistors (FETs) than INV$\times$1, so the $P_{int}$ difference between FinFETs and NSFETs increases. In addition, DFFH$\times$1 has a one-fin configuration, and one-fin FinFETs outperform W10 NSFETs. Therefore, fin-based DFFH$\times$1 w/ BPR has the smallest EDP in 3 nm node.
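Several of the percentages quoted in this subsection can be reproduced from the Table IV entries. The sketch below uses a hypothetical helper that expresses the saving relative to the larger (reference) value; taking the NS4 column as the NS-based reference for the EDP comparison is an assumption, chosen because it matches the quoted 10% and 5%.

```python
def reduction_pct(reference, value):
    """Percentage by which `value` is smaller than `reference`."""
    return 100.0 * (reference - value) / reference


# Cell area, 7 nm -> 3 nm (Table IV).
area_inv = reduction_pct(0.044, 0.010)
area_dff = reduction_pct(0.379, 0.121)

# C_pin of fin-based cells, w/o BPR -> w/ BPR.
cpin_inv = reduction_pct(0.315, 0.312)
cpin_dff = reduction_pct(0.293, 0.288)

# DFFHx1 EDP: fin-based w/ BPR against fin-based w/o BPR and against NS4 w/ BPR.
edp_bpr_fast = reduction_pct(315.40, 277.01)
edp_ns4_fast = reduction_pct(308.74, 277.01)
edp_ns4_slow = reduction_pct(1099.61, 1041.75)

print(f"area scaling: INVx1 {area_inv:.0f}%, DFFHx1 {area_dff:.0f}%")
print(f"C_pin saving by BPR: INVx1 {cpin_inv:.0f}%, DFFHx1 {cpin_dff:.0f}%")
print(f"DFFHx1 EDP saving by BPR (fast case): {edp_bpr_fast:.0f}%")
print(f"DFFHx1 EDP saving vs NS4: fast {edp_ns4_fast:.0f}%, slow {edp_ns4_slow:.0f}%")
```

The same helper applied to the fin-based DFFH$\times$1 cells without and with BPR gives the roughly 12% EDP saving quoted in the abstract.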

There are three possible reasons why NS-based cells have larger EDP than fin-based cells.

First, NSFETs lose the benefits of drive currents as the FP is scaled down to 20 nm in 3 nm node. Under the same active region for two-fin configuration, NSFETs need an $N_{\rm NS}$ of at least 4 to reach similar $I_{\rm ON}$ as FinFETs, but a larger $N_{\rm NS}$ increases the $C_{gg}$ as a tradeoff. For one-fin configuration, FinFETs are much better than NSFETs because FinFETs have smaller facing area between S/D and gate and thus smaller $C_{gg}$. If a configuration of three fins or more is adopted, NSFETs would increase the drive currents over FinFETs.

Second, there is minimal benefit of short channel controllability for GAA over fin channel. DIBL and SS are similar



Fig. 8. (a) Delay, (b)  $P_{\text{int}}$ , and area of the standard cells using FinFETs without and with BPR.

between FinFETs and NSFETs as long as the PTS controls the subfin leakage effectively [14]. So, the cell characteristics are determined mostly by $I_{ON}$ and $C_{gg}$ as shown in Fig. 6. Previous work for analog/RF application shows that NSFETs have greater intrinsic gain ($G_m R_o$) than FinFETs due to larger transconductance ($G_m$) by larger $W_{\text{eff}}$ and large output resistance ($R_o$) by better gate electrostatics [25]. But in terms of the standard cells for digital application, fin-based cells show better results than NS-based ones.

Third, this work only applies W10 and W25 NSFETs for one- and two-fin, respectively. $W_{NS}$ can be modulated continuously, whereas $N_{fin}$ is discrete and $W_{fin}$ and $H_{fin}$ are fixed. This design flexibility for NSFETs could improve the cell performance, which is beyond the scope of this article.

Fig. 8 summarizes delay, $P_{int}$, and area of all the standard cells using FinFETs without and with BPR for the medium speed case. All the standard cells except INV$\times$1, NAND2$\times$1, and NOR2$\times$1 improve the delay by BPR. The three standard cells with the greatest delay and $P_{int}$ savings by BPR are DFFH$\times$1, XNOR3$\times$1, and XOR3$\times$1 because their cell areas are the largest. The larger the cell, the more BPR decreases the $C_{\text{para}}$ between $V_{DD}$/$V_{SS}$ and signal. In addition, the smaller resistance of MBPR compared with M1 further decreases the delay and $P_{int}$, with more benefit for larger cells.

#### IV. CONCLUSION

FinFETs and NSFETs implementing BPR are analyzed thoroughly at the device and cell levels using fully calibrated TCAD. SDP scheme is adopted for rectangular S/D epi to prevent S/D epi merging. At the device level, NSFETs have larger *RC* delay than FinFETs. In particular, the *RC* delay of W10 NSFETs is much larger than that of one-fin FinFETs due to large $C_{\text{para}}$ induced by large facing area between S/D epi and gate. At the cell level, NS-based cells can decrease the delay and $t_{\text{tran}}$ by increasing the $N_{\text{NS}}$, but increase the $P_{\text{int}}$ over fin-based cells as a tradeoff at the same input slew and $C_{load}$. Overall, fin-based cells have smaller EDP than NS-based cells, especially for large cells where more FETs are arranged. As the BPR is implemented, all the standard cells improve $C_{\text{pin}}$, delay, $t_{\text{tran}}$, and $P_{\text{int}}$, and more so for larger cells. Therefore, under two-fin configuration, FinFETs using BPR scheme are promising for 3 nm node.

#### **REFERENCES**

- [1] C. Auth *et al.*, "A 10 nm high performance and low-power CMOS technology featuring 3rd generation FinFET transistors, self-aligned quad patterning, contact over active gate and cobalt local interconnects," in *IEDM Tech. Dig.*, Dec. 2017, pp. 673–676.
- [2] X. Wang *et al.*, "Design-technology co-optimization of standard cell libraries on Intel 10 nm process," in *IEDM Tech. Dig.*, Dec. 2018, pp. 636–639.
- [3] J. Yuan *et al.*, "High performance mobile SoC productization with second-generation 10-nm FinFET technology and extension to 8-nm scaling," in *Proc. IEEE Symp. VLSI Technol.*, Jun. 2018, pp. 219–220.
- [4] J. Deng *et al.*, "5G and AI integrated high performance mobile SoC process-design co-development and production with 7 nm EUV FinFET technology," in *Proc. IEEE Symp. VLSI Technol.*, Jun. 2020, pp. 1–2.
- [5] S. Narasimha *et al.*, "A 7 nm CMOS Technology platform for mobile and high performance compute application," in *IEDM Tech. Dig.*, Dec. 2017, pp. 689–692.
- [6] G. Yeap *et al.*, "5 nm CMOS production technology platform featuring full-fledged EUV, and high mobility channel FinFETs with densest  $0.021 \mu m^2$  SRAM cells for mobile SoC and high performance computing applications," in *IEDM Tech. Dig.*, Dec. 2019, pp. 879–882.
- [7] W. C. Jeong *et al.*, "True 7 nm platform technology featuring smallest FinFET and smallest SRAM cell by EUV, special constructs and 3rd generation single diffusion break," in *Proc. IEEE Symp. VLSI Technol.*, Jun. 2018, pp. 59–60.
- [8] N. Loubet *et al.*, "Stacked nanosheet gate-all-around transistor to enable scaling beyond FinFET," in *Proc. Symp. VLSI Technol.*, Jun. 2017, pp. 230–231.
- [9] G. Bonilla, "The continuation of an interconnect-centric technology era—The MOL and BEOL challenge," in *Proc. VLSI Tech. Short Course*, Jun. 2018, pp. 1–55.
- [10] A. Gupta *et al.*, "Buried power rail scaling and metal assessment for the 3 nm node and beyond," in *IEDM Tech. Dig.*, Dec. 2020, pp. 413–416.
- [11] D. Prasad *et al.*, "Buried power rails and back-side power grids: Arm<sup>®</sup> CPU power delivery network design beyond 5 nm," in *IEDM Tech. Dig.*, Dec. 2019, pp. 446–449.
- [12] R. Mathur *et al.*, "Buried bitline for sub-5 nm SRAM design," in *IEDM Tech. Dig.*, Dec. 2020, pp. 409–412.
- [13] *Version O-2018.06*, Synopsys, Mountain View, CA, USA, 2018.
- [14] J.-S. Yoon, J. Jeong, S. Lee, and R.-H. Baek, "Systematic DC/AC performance benchmarking of sub-7-nm node FinFETs and nanosheet FETs," *IEEE J. Electron Devices Soc.*, vol. 6, pp. 942–947, 2018.
- [15] J.-S. Yoon *et al.*, "Source/drain patterning FinFETs as solution for physical area scaling toward 5-nm node," *IEEE Access*, vol. 7, pp. 172290–172295, 2019.
- [16] J.-S. Yoon, J. Jeong, S. Lee, and R.-H. Baek, "Multi-$V_{th}$ strategies of 7-nm node nanosheet FETs with limited nanosheet spacing," *IEEE J. Electron Devices Soc.*, vol. 6, pp. 861–865, 2018.
- [17] J.-S. Yoon, J. Jeong, S. Lee, and R.-H. Baek, "Optimization of nanosheet number and width of multi-stacked nanosheet FETs for sub-7-nm node system on chip applications," *Jpn. J. Appl. Phys.*, vol. 58, no. SB, Mar. 2019, Art. no. SBBA12.
- [18] L. T. Clark *et al.*, "ASAP7: A 7-nm FinFET predictive process design kit," *Microelectron. J.*, vol. 53, pp. 105–115, Jul. 2016.
- [19] X. He *et al.*, "Impact of aggressive fin width scaling on FinFET device characteristics," in *IEDM Tech. Dig.*, Dec. 2017, pp. 493–496.
- [20] T. Nogami *et al.*, "Comparison of key fine-line BEOL metallization schemes for beyond 7 nm node," in *Proc. Symp. VLSI Technol.*, Jun. 2017, pp. 148–149.
- [21] V. Moroz *et al.*, "Can we ever get to a 100 nm tall library? Power rail design for 1 nm technology node," in *Proc. Symp. VLSI Technol.*, Jun. 2020, pp. 1–2.
- [22] J. Zhang *et al.*, "Full bottom dielectric isolation to enable stacked nanosheet transistor for low power and high performance applications," in *IEDM Tech. Dig.*, Dec. 2019, pp. 250–253.
- [23] J.-S. Yoon, J. Jeong, S. Lee, and R.-H. Baek, "Punch-through-stopper free nanosheet FETs with crescent inner-spacer and isolated source/drain," *IEEE Access*, vol. 7, pp. 38593–38596, 2019.
- [24] J. Jeong, J.-S. Yoon, S. Lee, and R.-H. Baek, "Threshold voltage variations induced by  $Si_{1-x}Ge_x$  and  $Si_{1-x}C_x$  of sub 5-nm node silicon nanosheet field-effect transistors," *J. Nanosci. Nanotechnol.*, vol. 20, no. 8, pp. 4684–4689, Aug. 2020.
- [25] J.-S. Yoon and R.-H. Baek, "Device design guideline of 5-nm-node FinFETs and nanosheet FETs for analog/RF applications," *IEEE Access*, vol. 8, pp. 189395–189403, 2020.