# Efficient Look-up Table-based FPGA Implementation of the Simplified Bi-dimensional Memory Polynomial Model

A. A. H. Machoski, GICS (Grupo de Circuitos Integrados e Sistemas), Universidade Federal do Paraná (UFPR), adrcsv@gmail.com

O. A. P. Riba, GICS (Grupo de Circuitos Integrados e Sistemas), Universidade Federal do Paraná (UFPR), otavio.riba@gmail.com

L. Schuartz, GICS (Grupo de Circuitos Integrados e Sistemas), Universidade Federal do Paraná (UFPR), luisschuartz@ufpr.br

E. G. Lima, GICS (Grupo de Circuitos Integrados e Sistemas), Universidade Federal do Paraná (UFPR), elima@eletrica.ufpr.br

### ABSTRACT

This work addresses the digital baseband predistortion of dual-band radio frequency (RF) power amplifiers (PAs) for wireless communication systems. Specifically, an implementation with reduced computational complexity is proposed for the simplified bi-dimensional memory polynomial model (2D-MPM). Here, the simplified 2D-MPM is constructed using only one-dimensional (1D) look-up tables (LUTs), which dramatically reduces its complexity and, as a consequence, yields a model suitable for synthesis in low-cost digital hardware, such as field programmable gate arrays (FPGAs). Nevertheless, the high accuracy of the simplified 2D-MPM is preserved in the proposed FPGA implementation because interactions between the two bands are still allowed. A case study illustrates the high accuracy and low computational complexity of the proposed LUT-based FPGA implementation of the simplified 2D-MPM when it is applied, using fixed-point arithmetic, to the modeling of the forward and inverse transfer characteristics of an RF PA.

### Keywords

Digital baseband predistortion, field programmable gate array, fixed-point arithmetic, memory polynomial, modeling, power amplifier, radio frequency, wireless communication systems.

### 1. INTRODUCTION

In modern wireless communication systems, the radio frequency (RF) power amplifiers (PAs) in the transmitter chain are allowed to operate under strong gain compression to improve efficiency [1]. Indeed, improving power efficiency is essential for increasing battery autonomy in handsets, as well as for reducing the costs of acquiring and maintaining heat dissipation equipment in base stations. However, in this scenario, the RF PA exhibits unacceptable levels of nonlinear distortion, which causes spectral regrowth and, as a consequence, interference between users allocated to neighboring channels. To compensate for the nonlinear distortions introduced by the RF PA, an excellent alternative is a linearization technique called digital baseband predistortion (DPD) [2]. In particular, the DPD is placed before the RF PA and designed to have a transfer characteristic equal to the inverse of the RF PA transfer characteristic. In this case, any band-limited signal that passes through the cascade connection of the DPD and the RF PA keeps its original bandwidth. A fundamental step in the design of a DPD scheme is the availability of a low-complexity nonlinear dynamic model able to represent the RF PA forward and inverse transfer characteristics with high accuracy [3].

Polynomial approximations that incorporate memory effects are widely employed for the modeling of dynamic nonlinear systems [3]. They have the advantage of being linear in their parameters, which significantly simplifies model identification. However, their number of parameters increases rapidly with the polynomial order truncation (P) and the memory length (M). In fact, when the number of parameters is extremely high, the model identification can be very poor due to the ill-conditioning of the regression matrix of the least squares algorithm. Moreover, the number of multiplications and additions required to process a single sample through the model also grows quickly with the number of parameters. The situation is made worse because DPD models deal with complex-valued signals and, hence, require multiplications and additions of complex-valued numbers.

The processing of each sample through the DPD model must comply with very stringent real-time requirements. In fact, in current standards (for instance, WCDMA and OFDMA signals for 3G and 4G standards, respectively), the DPD model must process the signal at a sampling frequency of about 50 MHz, which means that each sample must be processed in less than 20 ns. When performing additions and multiplications of two binary numbers in field programmable gate arrays (FPGAs), the time delay of the logic circuit blocks must be considered. In particular, it was reported in [4] that, in fixed-point arithmetic with 16-bit binary numbers, the FPGA Xilinx Virtex5 LX50T takes about 16 ns to perform a multiplication. Therefore, in order to comply with the real-time requirements, each addition and multiplication operation must have its own dedicated hardware.
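The timing budget behind this argument can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function names are illustrative, and it assumes that operations sharing one hardware block would have to run back-to-back within a single sample period.

```python
def sample_period_ns(sampling_frequency_hz: float) -> float:
    """Time available to process one sample, in nanoseconds."""
    return 1e9 / sampling_frequency_hz


def sequential_ops_per_sample(period_ns: float, op_delay_ns: float) -> int:
    """Number of operations of the given delay that fit back-to-back in one period."""
    return int(period_ns // op_delay_ns)


# Figures quoted in the text: 50 MHz sampling, about 16 ns per 16-bit multiplication [4].
period = sample_period_ns(50e6)
shared_multiplications = sequential_ops_per_sample(period, 16.0)
print(period, shared_multiplications)
```

With these figures, a single shared multiplier could complete only one multiplication per sample period, which is why time-sharing a multiplier among several products is ruled out.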

For the case of single-band RF PAs, the memory polynomial model (MPM) provides a good trade-off between modeling error and number of parameters [5]. The MPM can be seen as a one-dimensional (1D) model because its output is given by the sum of 1D functions. In fact, there are (M+1) different 1D functions in an MPM. Moreover, the independent variable of each 1D function is the amplitude of the input signal applied at one specific time sample. Processing a sample through the MPM requires a number of additions and multiplications that increases with M and P. In [6], the 1D characteristic of the MPM was exploited and the polynomials were replaced by look-up tables (LUTs). The resulting LUT-based FPGA implementation is much more parsimonious in its usage of logic circuit blocks than a direct FPGA implementation of the polynomial approximation based only on multiplications and additions. Observe that, in the LUT-based FPGA implementation, the circuit complexity is independent of the polynomial order.
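For reference, the single-band MPM discussed above is commonly written as follows. The notation ($y$, $u$, $h_{p,m}$, $g_m$) is chosen here to match the dual-band equations of Section 2 and is an assumption, not taken from [5] or [6]:

$$y(n) = \sum_{m=0}^{M} \sum_{p=0}^{P} h_{p,m}\, u(n-m)\, |u(n-m)|^{p} = \sum_{m=0}^{M} u(n-m)\, g_m\big[|u(n-m)|\big], \qquad g_m(a) = \sum_{p=0}^{P} h_{p,m}\, a^{p}$$

Written this way, each of the $(M+1)$ functions $g_m(\cdot)$ depends on a single real amplitude, which is what allows every polynomial to be replaced by a 1D LUT regardless of $P$.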

For the case of dual-band RF PAs, a bi-dimensional (2D) MPM, having two inputs and two outputs, was proposed in [7]. In the 2D-MPM, each output is formulated as a function of the two inputs in order to take into account interactions between the two bands. Since the MPM is based on a polynomial expansion with memory, doubling the number of inputs and outputs greatly increases the number of parameters of the 2D-MPM. Furthermore, as the terminology indicates, the 2D-MPM is a 2D model because its output is given by the sum of 2D functions.

Recently, the so-called simplified 2D-MPM was proposed in [8]. Both the 2D-MPM of [7] and that of [8] are linear in their parameters. Hence, their parameters can be identified using the least squares algorithm. For the same M and P, the simplified 2D-MPM of [8] has fewer parameters than the complete 2D-MPM of [7]. As a consequence, the regression matrix for the identification of the simplified 2D-MPM of [8] is less prone to ill-conditioning than the one for the identification of the complete 2D-MPM. Moreover, it was reported in [8] that the simplified 2D-MPM can accurately predict the RF PA forward and inverse transfer characteristics. Nevertheless, the FPGA implementation of the simplified 2D-MPM was not addressed in [8].

In this work, the simplified 2D-MPM is first rearranged. In particular, it is shown that the simplified 2D-MPM outputs can be calculated as a sum of terms that share a common structure: each term is the product of a complex-valued input and a 1D function. Then, a LUT-based FPGA implementation is proposed for the simplified 2D-MPM that requires only 1D LUTs and, hence, uses far fewer logic circuit blocks than a direct FPGA implementation based only on multiplications and additions. Therefore, this work identifies an additional advantage of the simplified 2D-MPM of [8] in comparison with the complete 2D-MPM of [7]. In fact, a LUT-based FPGA implementation of the complete 2D-MPM of [7] necessarily requires 2D LUTs, which dramatically increases the usage of logic circuit blocks.

This work is organized as follows. Section 2 derives the LUT-based architecture for the simplified 2D-MPM. Section 3 investigates the accuracy of the LUT-based simplified 2D-MPM for the modeling of the RF PA forward and inverse transfer characteristics using floating-point arithmetic. Section 4 reports the accuracy and the consumption of logic circuit blocks when the LUT-based simplified 2D-MPM is implemented in the FPGA Xilinx Virtex5 LX50T using fixed-point arithmetic. Section 5 presents the conclusions of this work.

### 2. LUT-BASED SIMPLIFIED 2D-MPM

In the simplified 2D-MPM proposed in [8], the complex-valued output samples at time instant *n*, designated by $x_1(n)$ and $x_2(n)$ for the first and second bands, respectively, are calculated by:

$$\begin{aligned} x_1(n) &= \sum_{p=0}^{P} \sum_{m=0}^{M} h_{p,m}^{(1)} u_1(n-m) \big| u_1(n-m) \big|^p \\ &+ \sum_{p=1}^{P} \sum_{m=0}^{M} h_{p,m}^{(2)} u_1(n-m) \big| u_2(n-m) \big|^p \end{aligned} \tag{1}$$

and

$$\begin{aligned} x_2(n) &= \sum_{p=0}^{P} \sum_{m=0}^{M} h_{p,m}^{(3)} u_2(n-m) \big| u_2(n-m) \big|^p \\ &+ \sum_{p=1}^{P} \sum_{m=0}^{M} h_{p,m}^{(4)} u_2(n-m) \big| u_1(n-m) \big|^p \end{aligned} \tag{2}$$

where $h_{p,m}^{(1)}$, $h_{p,m}^{(2)}$, $h_{p,m}^{(3)}$ and $h_{p,m}^{(4)}$ are the simplified 2D-MPM coefficients and $u_1(n)$ and $u_2(n)$ are the complex-valued input samples at time instant *n* for the first and second bands, respectively. *M* is the memory length and *P* is the polynomial order truncation. Observe that the simplified 2D-MPM is linear in its parameters.
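As a point of reference for the LUT-based form derived next, (1) can be evaluated directly with multiplications and additions. The NumPy sketch below is illustrative only: the function name, the array layout of the coefficients and the choice of treating samples before n = 0 as zero are assumptions, not details taken from [8].

```python
import numpy as np


def simplified_2d_mpm_band1(u1, u2, h1, h2):
    """Direct evaluation of Eq. (1).

    h1[p, m] holds h^(1)_{p,m} and h2[p, m] holds h^(2)_{p,m}; both have
    shape (P + 1, M + 1). Row p = 0 of h2 is ignored because the cross-band
    sum in Eq. (1) starts at p = 1.
    """
    u1 = np.asarray(u1, dtype=complex)
    u2 = np.asarray(u2, dtype=complex)
    P, M = h1.shape[0] - 1, h1.shape[1] - 1
    x1 = np.zeros(len(u1), dtype=complex)
    for m in range(M + 1):
        # Delayed inputs u(n - m); samples before n = 0 are assumed to be zero.
        d1 = np.concatenate([np.zeros(m, dtype=complex), u1[:len(u1) - m]])
        d2 = np.concatenate([np.zeros(m, dtype=complex), u2[:len(u2) - m]])
        a1, a2 = np.abs(d1), np.abs(d2)
        for p in range(P + 1):        # own-band terms, p = 0..P
            x1 += h1[p, m] * d1 * a1**p
        for p in range(1, P + 1):     # cross-band terms, p = 1..P
            x1 += h2[p, m] * d1 * a2**p
    return x1
```

Equation (2) follows from the same function by swapping the two inputs and passing the coefficients $h_{p,m}^{(3)}$ and $h_{p,m}^{(4)}$. Counting the loop iterations shows that each band uses (2P+1)(M+1) coefficients.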

As pointed out in Section 1, (1) and (2) can be rearranged to:

$$x_{1}(n) = \sum_{m=0}^{M} u_{1}(n-m) f_{m1} \Big[ |u_{1}(n-m)|^{2} \Big] + \sum_{m=0}^{M} u_{1}(n-m) f_{m2} \Big[ |u_{2}(n-m)|^{2} \Big] \tag{3}$$

and

$$x_{2}(n) = \sum_{m=0}^{M} u_{2}(n-m) f_{m3} \Big[ |u_{2}(n-m)|^{2} \Big] + \sum_{m=0}^{M} u_{2}(n-m) f_{m4} \Big[ |u_{1}(n-m)|^{2} \Big] \tag{4}$$

where $f_{m1}(\cdot)$, $f_{m2}(\cdot)$, $f_{m3}(\cdot)$ and $f_{m4}(\cdot)$ are complex-valued functions of a single real-valued independent variable. Observe that, in (3) and (4), the simplified 2D-MPM outputs are calculated as sums of products between a complex-valued input and a 1D polynomial function. To reduce the computational complexity, the 1D polynomial functions can be implemented using 1D look-up tables (LUTs). Moreover, the number of entries stored in each LUT can be made equal to a positive integer power of 2, so that the LUT positions can be addressed directly by binary numbers. Additionally, to increase the modeling accuracy, an interpolation strategy must also be included. Here, linear interpolation between two consecutive LUT positions is applied due to its simplicity.
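Comparing (1) with (3) term by term gives $f_{m1}(r) = \sum_{p=0}^{P} h_{p,m}^{(1)} r^{p/2}$ and $f_{m2}(r) = \sum_{p=1}^{P} h_{p,m}^{(2)} r^{p/2}$, with $r$ standing for the squared amplitude; $f_{m3}$ and $f_{m4}$ follow in the same way from (2). A minimal Python sketch of one such function stored in a LUT and read with linear interpolation is given below. The uniform grid over $[0, r_{max}]$, the clipping of the argument and all names are assumptions made for illustration; the text does not specify how the LUT grid is laid out.

```python
import numpy as np


def build_lut(coeffs, address_bits, r_max=1.0):
    """Tabulate f(r) = sum_p coeffs[p] * r**(p/2) at 2**address_bits points.

    The points are spread uniformly over [0, r_max] (an assumed layout).
    For a cross-band function, coeffs[0] is simply set to zero.
    """
    r = np.linspace(0.0, r_max, 2**address_bits)
    half_powers = np.arange(len(coeffs)) / 2.0
    c = np.asarray(coeffs, dtype=complex)
    return (c[None, :] * r[:, None] ** half_powers[None, :]).sum(axis=1)


def lut_interp(table, r, r_max=1.0):
    """Read the LUT at r = |u|^2 using linear interpolation between two
    consecutive positions; r is clipped to [0, r_max]."""
    n = len(table)
    pos = np.clip(np.asarray(r, dtype=float), 0.0, r_max) / r_max * (n - 1)
    idx = np.minimum(pos.astype(int), n - 2)   # lower of the two positions
    frac = pos - idx                           # fractional distance to the next one
    return table[idx] + frac * (table[idx + 1] - table[idx])
```

One term of (3) is then obtained as `u1_delayed * lut_interp(table, np.abs(u1_delayed)**2)`, where `u1_delayed` is a hypothetical array holding $u_1(n-m)$. The polynomial order only affects how the table is filled, not the work done per sample.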

### 3. COMPUTER SIMULATIONS USING FLOATING-POINT ARITHMETIC

Using double-precision floating-point arithmetic, the LUT-based simplified 2D-MPM described in Section 2 is now applied to the modeling of the forward and inverse transfer characteristics of a dual-band RF PA.

The RF PA under test is represented by a cascade connection between a linear finite impulse response (FIR) filter and a static third-order polynomial. The RF PA is excited by a dual-band signal given by: a 3GPP WCDMA signal having a bandwidth of 8.84 MHz and centered at 900 MHz; and an LTE OFDMA signal having a bandwidth of 10 MHz and centered at 2.5 GHz. A set of 5000 baseband input-output samples for each band is collected at a sampling frequency of 61.44 MHz. The set of input-output data is divided into two independent subsets of 2500 samples: one for the modeling extraction and one for the modeling validation.

The memory length was fixed at M = 1. The simplified 2D-MPM coefficients of (1) and (2) were identified using the least squares algorithm [8] and the input-output data from the extraction subset. Then, the 1D polynomials were mapped into 1D LUTs. The number of bits for addressing each LUT was varied from 3 to 7, and linear interpolation was applied. Table 1 reports the normalized mean square error (NMSE), as defined in [9], between the desired and estimated outputs for the input-output data from the validation subset. Observe that, for the modeling of the forward characteristic, 3 address bits are enough to provide very accurate results. For the modeling of the inverse characteristic, 6 address bits are sufficient to provide highly accurate results.

Table 1. NMSE in dB using floating-point arithmetic

| Number of LUT address bits | Forward modeling, Band 1 | Forward modeling, Band 2 | Inverse modeling, Band 1 | Inverse modeling, Band 2 |
|----------------------------|--------------------------|--------------------------|--------------------------|--------------------------|
| 3                          | -69.9                    | -68.8                    | -37.4                    | -40.5                    |
| 4                          | -69.8                    | -70.1                    | -40.5                    | -43.0                    |
| 5                          | -69.2                    | -70.1                    | -45.1                    | -46.2                    |
| 6                          | -69.2                    | -70.1                    | -48.6                    | -48.2                    |
| 7                          | -69.3                    | -70.2                    | -49.6                    | -48.8                    |
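The identification and validation steps described above can be sketched in NumPy as follows. This is a hedged illustration, not the authors' code: the helper names are hypothetical, samples before n = 0 are assumed to be zero, and the NMSE uses the usual ratio of error energy to desired-signal energy, which is assumed here to match the definition in [9].

```python
import numpy as np


def delayed(u, m):
    """Return u(n - m), taking samples before n = 0 as zero (an assumption)."""
    u = np.asarray(u, dtype=complex)
    return np.concatenate([np.zeros(m, dtype=complex), u[:len(u) - m]])


def regression_matrix(u_own, u_other, P, M):
    """One column per coefficient of Eq. (1), or of Eq. (2) with the bands swapped."""
    cols = []
    for m in range(M + 1):
        d_own = delayed(u_own, m)
        a_own = np.abs(d_own)
        a_other = np.abs(delayed(u_other, m))
        cols += [d_own * a_own**p for p in range(P + 1)]        # own-band terms
        cols += [d_own * a_other**p for p in range(1, P + 1)]   # cross-band terms
    return np.column_stack(cols)


def identify(u_own, u_other, x, P, M):
    """Least squares estimate of the coefficients from extraction data."""
    phi = regression_matrix(u_own, u_other, P, M)
    h, *_ = np.linalg.lstsq(phi, np.asarray(x, dtype=complex), rcond=None)
    return h


def nmse_db(desired, estimated):
    """NMSE in dB between desired and estimated outputs."""
    desired = np.asarray(desired)
    estimated = np.asarray(estimated)
    error_energy = np.sum(np.abs(desired - estimated) ** 2)
    return 10.0 * np.log10(error_energy / np.sum(np.abs(desired) ** 2))
```

For the forward model, `u_own` and `u_other` are the PA inputs and `x` is the PA output of the same band. For the inverse model, the roles of PA input and output are exchanged.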

### 4. COMPUTER SIMULATIONS USING FIXED-POINT ARITHMETIC

In this section, the FPGA implementation of the LUT-based simplified 2D-MPM in fixed-point arithmetic is addressed. For that purpose, the floating-point design reported in Section 3 is converted to fixed-point. Table 2 shows the deterioration in modeling accuracy as a function of the number of precision bits used in the fixed-point arithmetic. The numbers of bits for addressing the LUTs were fixed at 3 for the forward modeling and at 6 for the inverse modeling. According to Table 2, 14 precision bits provide a good compromise between accuracy and computational complexity for the modeling of both the forward and inverse characteristics.

Table 2. NMSE in dB using fixed-point arithmetic

| Number of precision bits | Forward modeling, Band 1 | Forward modeling, Band 2 | Inverse modeling, Band 1 | Inverse modeling, Band 2 |
|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|
| 12                       | -45.6                    | -44.8                    | -44.3                    | -43.9                    |
| 13                       | -51.4                    | -50.8                    | -47.2                    | -46.6                    |
| 14                       | -57.2                    | -56.9                    | -48.2                    | -47.7                    |
| 15                       | -62.6                    | -61.9                    | -48.5                    | -48.1                    |
| 16                       | -66.7                    | -65.6                    | -48.6                    | -48.2                    |
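The effect of the word length can be explored in floating-point simulation by rounding every stored value and intermediate result to a fixed-point grid. The sketch below assumes a signed two's-complement format with one sign bit, all remaining bits fractional, and saturation at the range limits. The paper does not state the exact number format, so this layout and the function names are assumptions.

```python
import numpy as np


def quantize(x, bits):
    """Round real values to a signed fixed-point grid with `bits` total bits
    covering [-1, 1), saturating values that fall outside that range."""
    scale = 2 ** (bits - 1)
    q = np.clip(np.round(np.asarray(x, dtype=float) * scale), -scale, scale - 1)
    return q / scale


def quantize_complex(z, bits):
    """Quantize real and imaginary parts independently."""
    z = np.asarray(z, dtype=complex)
    return quantize(z.real, bits) + 1j * quantize(z.imag, bits)
```

Under this format, the rounding error of a single value is at most $2^{-\text{bits}}$, so each additional bit lowers the quantization noise by about 6 dB. This is roughly in line with the steps of about 4 to 6 dB between consecutive rows of the forward modeling columns in Table 2.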

The fixed-point LUT-based simplified 2D-MPM using 14 precision bits was translated to a hardware description language (HDL). The HDL code is written in such a way that only three basic circuits are allowed: LUTs having one real-valued input and two real-valued outputs, and adders and multipliers having two real-valued inputs and one real-valued output. Moreover, each operation is realized by dedicated digital hardware. Considering that each linear interpolation requires 3 additions, 1 multiplication and 2 LUT readings, a total of 56 multipliers, 80 adders and 16 LUTs are necessary.
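The paper does not itemize these totals, but one breakdown that reproduces them for M = 1 is sketched below. The split is a reconstruction and rests on several assumptions: each complex-valued function is interpolated separately for its real and imaginary parts, each LUT delivers the two consecutive entries needed by one interpolation, each squared amplitude $|u(n-m)|^2$ is computed with 2 multiplications and 1 addition, and each complex product uses 4 real multiplications and 2 real additions.

```python
def resource_count(M=1):
    """Count real multipliers, adders and LUTs for Eqs. (3) and (4)
    under the assumptions stated in the accompanying text."""
    taps = M + 1
    complex_functions = 4 * taps              # f_m1 .. f_m4 for every m
    real_interpolations = 2 * complex_functions   # real and imaginary parts
    squared_amplitudes = 2 * taps             # |u1(n-m)|^2 and |u2(n-m)|^2
    complex_products = complex_functions      # u(n-m) times f[...]
    complex_output_sums = 2 * (2 * taps - 1)  # summing the terms of x1 and x2

    multipliers = (real_interpolations * 1
                   + complex_products * 4
                   + squared_amplitudes * 2)
    adders = (real_interpolations * 3
              + complex_products * 2
              + squared_amplitudes * 1
              + complex_output_sums * 2)
    luts = real_interpolations
    return multipliers, adders, luts


print(resource_count(1))
```

For M = 1 this gives 16 + 32 + 8 = 56 multipliers, 48 + 16 + 4 + 12 = 80 adders and 16 LUTs, matching the totals quoted above. None of the counts depends on the polynomial order P.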

The HDL code was then simulated in the ISE Design Suite from Xilinx, with the FPGA Xilinx Virtex5 LX50T as the target device. Table 3 reports the usage of FPGA digital logic circuits. Specifically, in Case I all 56 multipliers are constructed by the HDL library command "\*", while in Case II 48 multipliers are constructed by the HDL library command "\*" and 8 multipliers are constructed using the Booth architecture [10].

Table 3. Usage of the FPGA Xilinx Virtex5 LX50T

| Case | Forward modeling, Slice LUTs | Forward modeling, DSP48E | Inverse modeling, Slice LUTs | Inverse modeling, DSP48E |
|------|------------------------------|--------------------------|------------------------------|--------------------------|
| I    | 2088                         | 48                       | 1098                         | 48                       |
| II   | 5394                         | 42                       | 4933                         | 42                       |

Figures 1, 2, 3 and 4 compare the transfer characteristics estimated by the fixed-point LUT-based simplified 2D-MPM with the RF PA measured transfer characteristics. Specifically, these figures show the instantaneous amplitude of the output signal as a function of the instantaneous amplitude of the input signal. In all four figures, the estimated behavior agrees very closely with the desired behavior.



Figure 1. Forward transfer characteristics at the first band.



Figure 2. Forward transfer characteristics at the second band.



Figure 3. Inverse transfer characteristics at the first band.



Figure 4. Inverse transfer characteristics at the second band.

### 5. CONCLUSIONS

Linearization of dual-band RF PAs can significantly improve their power efficiency. However, the linearization circuit requires additional hardware that also consumes energy. Therefore, it is very important to keep the power consumption of the additional hardware low, since the overall power efficiency of the wireless transmitter increases only if the energy consumed by the linearization block is small in comparison with the energy saved in the PA circuit.

This work has proposed an efficient FPGA implementation of the simplified 2D-MPM, a model that is applicable to the linearization of dual-band RF PAs using the digital baseband predistortion scheme.

### 6. ACKNOWLEDGMENTS

The authors would like to acknowledge the financial support provided by Pró-reitoria de Assuntos Estudantis da Universidade Federal do Paraná (PRAE-UFPR) and Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) under Programa de Iniciação Científica e em Desenvolvimento Tecnológico e Inovação, UFPR Edital 2014-2015.

### 7. REFERENCES

- [1] S. Cripps, *RF Power Amplifiers for Wireless Communications*. Norwood, MA: Artech House, 2006.
- [2] P. B. Kenington, *High Linearity RF Amplifier Design*. Norwood, MA: Artech House, 2000.
- [3] J. C. Pedro and S. A. Maas, "A comparative overview of microwave and wireless power-amplifier behavioral modeling approaches," *IEEE Trans. Microw. Theory Tech.*, vol. 53, no. 4, pp. 1150–1163, Apr. 2005.
- [4] F. I. Yasuda, R. A. S. Cavalheiro, F. J. L. Luiz, A. A. Mariano, O. C. Gouveia Filho, and E. G. Lima, "FPGA Implementation of a Fixed-Point Digital Baseband Pre-Distorter for the Linearization of Wireless Transmitters," in *XIII Microelectronics Student Forum*, Curitiba, Sep. 2013, pp. 1–4.
- [5] J. Kim and K. Konstantinou, "Digital predistortion of wideband signals based on power amplifier model with memory," *Electron. Lett.*, vol. 37, no. 23, pp. 1417–1418, Nov. 2001.
- [6] A. Kwan, F. M. Ghannouchi, O. Hammi, M. Helaoui, and M. R. Smith, "Look-up table-based digital predistorter implementation for field programmable gate arrays using long-term evolution signals with 60 MHz bandwidth," *IET Sci. Meas. Technol.*, vol. 6, no. 3, pp. 181–188, May 2012.
- [7] S. A. Bassam, M. Helaoui, and F. M. Ghannouchi, "2-D digital predistortion (2-D-DPD) architecture for concurrent dual-band transmitters," *IEEE Trans. Microw. Theory Tech.*, vol. 59, no. 10, pp. 2547–2554, Oct. 2011.
- [8] O. A. P. Riba and E. G. Lima, "A Simplified Bi-dimensional Memory Polynomial Model for the Predistortion of Dualband RF PAs," in *30th South Symposium on Microelectronics*, Santa Maria, May 2015, pp. 1–4.
- [9] M. S. Muha, C. J. Clark, A. Moulthrop, and C. P. Silva, "Validation of power amplifier nonlinear block models," in *IEEE MTT-S Int. Microwave Symp. Dig.*, Anaheim, Jun. 1999, pp. 759–762.
- [10] V. A. Pedroni, *Digital Electronics and Design with VHDL*. Morgan Kaufmann: Elsevier, 2008.