# FPGA Implementation of a Fix-Point Digital Baseband Pre-Distorter for the Linearization of Wireless Transmitters

F. I. Yasuda<sup>1</sup>, R. A. S. Cavalheiro<sup>1</sup>, F. J. L. Luiz<sup>1</sup>, A. A. Mariano<sup>1</sup>, O. C. Gouveia Filho<sup>1</sup>, E. G. Lima<sup>1</sup>

<sup>1</sup> Grupo de Circuitos Integrados e Sistemas (GICS) – Departamento de Engenharia Elétrica

Universidade Federal do Paraná, Curitiba, Brasil

Abstract — Digital baseband predistortion (DPD) is a cost-effective solution to improve the energy efficiency of power amplifiers (PAs) in wireless communication systems. This paper addresses the design of a DPD in fix-point arithmetic, suitable for real-time implementation in low-cost hardware such as a Field Programmable Gate Array (FPGA). The design starts in floating-point double-precision arithmetic for the parameter identification of a DPD having a memory polynomial topology. Then, the non-trivial conversion to fix-point arithmetic is discussed. The presented fix-point design allows the DPD to be applied in real time by processing the polynomials as Look-Up Tables (LUTs) and by using a sequential logic circuit that exploits parallel processing as much as possible. For a Xilinx Virtex-5 FPGA and 16-bit numbers, it was verified that a DPD frequency as high as 60 MHz can be used and that the DPD power consumption is approximately 0.5 W. The accuracy of the designed fix-point DPD is validated by computer simulations performed on a PA behavioral model excited by a 3GPP WCDMA signal. It is verified that, if the PA is operated with the same average output power with and without DPD, the inclusion of the DPD improves the ACPR metric at the PA output by 25 dB.

*Index Terms* — FPGA, linearization, memory polynomial, power amplifiers, VHDL language.

## I. Introduction

Wireless communication systems must provide high data rates within the limited frequency bands available. This is achieved by modulating both the amplitude and the phase of a carrier. As a consequence, wireless communication standards impose limits on the linearity of the transmitted signal [1], [2]. The most important element in a wireless transmitter is the power amplifier (PA). Regardless of the class of operation, PAs exhibit an intrinsic trade-off between linearity and efficiency. With the rapid growth of wireless technologies in the market, reducing the power lost in the transmitter has become relevant [1]. Furthermore, battery consumption in user equipment and environmental concerns demand more efficient transmitters [3], [4].

To improve efficiency without compromising linearity, the alternative is to linearize the PA. Among the available linearization schemes, digital baseband predistortion (DPD) is a cost-effective solution. It consists of purposely distorting the signal prior to its amplification by the PA, in such a way that the signal at the PA output is a linear replica of the input signal. DPD is the subject of many recent papers [1], [2], [4], [5], whose common objective is to further raise the efficiency of RF PAs by also taking memory effects into account [1].

This work presents a fix-point design of a DPD that is suitable for implementation in a Field Programmable Gate Array (FPGA). The implemented DPD compensates for the nonlinear effects of the PA, which allows its efficiency to be raised. Thus, linearity and high efficiency become possible at the same time.

This work is organized as follows. Section II describes the modeling and identification of a DPD in floating-point double-precision arithmetic. Section III addresses the design of a DPD in fix-point arithmetic for real-time processing in an FPGA. In Section IV, the accuracy of the designed fix-point DPD is validated based on numerical simulations performed on a behavioral model of a class AB PA. Finally, Section V presents the conclusions of this work.

## II. DPD design in floating-point double-precision arithmetic

The purpose of a DPD system is to make the output signal of a cascade connection of a pre-distorter (PD) followed by a power amplifier (PA) a linear version of the input signal applied to the cascade [6], as shown in Fig. 1.



Fig. 1. Cascade connection of a pre-distorter followed by a PA.

The first step in the design of a DPD system is the selection of the pre-distorter topology. In this work, the memory polynomial (MP) architecture [7] was chosen because it provides a good trade-off between accuracy and computational complexity. The MP model is a particular instance of a Volterra system, in which the complex-valued low-pass equivalent baseband signals at the PD input, u(n), and output, x(n), are related by:

$$x(n) = \sum_{p=1}^{P} \sum_{m=0}^{M} h_{p}(m) u(n-m) |u(n-m)|^{p-1}$$
(1)

where $h_p(.)$ are complex-valued coefficients, *P* is the polynomial order truncation and *M* is the memory length. As recommended in [8], (1) includes both odd and even powers of the input amplitude. To extract the parameters $h_p(.)$, the indirect learning architecture was used [9]. In this algorithm, as shown in Fig. 2, the parameters of a post-distorter (PoD) are identified, i.e. an inverse system that is also connected in cascade with the PA, but placed after it. Because the MP model is linear in its parameters, the identification can be performed by least squares in MATLAB [10]. Since the PoD and the DPD have the same Volterra-based topology, it can be shown that their parameters are theoretically the same [9]. Thus, the parameters of the DPD are just copies of the extracted PoD parameters.
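As an illustration of how (1) is evaluated, the following minimal Python sketch computes the MP output sample by sample. The function name `memory_polynomial`, the coefficient layout `h[p-1][m]`, the zero initial conditions and the demo coefficients are assumptions made for this example only; they are not values identified for any PA.

```python
def memory_polynomial(u, h, M):
    """Evaluate (1): x(n) = sum_{p=1..P} sum_{m=0..M} h_p(m) u(n-m) |u(n-m)|^(p-1).

    u : list of complex input samples u(0), u(1), ...
    h : h[p-1][m] holds the complex coefficient h_p(m), so len(h) == P
    M : memory length; samples before n = 0 are taken as zero (assumption)
    """
    P = len(h)
    x = []
    for n in range(len(u)):
        acc = 0j
        for m in range(M + 1):
            if n - m < 0:
                continue  # zero initial conditions
            s = u[n - m]
            a = abs(s)
            for p in range(1, P + 1):
                acc += h[p - 1][m] * s * a ** (p - 1)
        x.append(acc)
    return x


# Illustrative coefficients only (P = 3, M = 1).
h_demo = [[1.0 + 0.0j, 0.05 + 0.0j],
          [0.0 + 0.0j, 0.0 + 0.0j],
          [-0.1 + 0.02j, 0.0 + 0.0j]]
u_demo = [0.5 + 0.0j, 0.0 + 0.5j, -0.3 + 0.4j]
x_demo = memory_polynomial(u_demo, h_demo, M=1)
```

Because the output is linear in the coefficients `h`, the same triple loop can also be used to fill the regression matrix of a least-squares identification.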



Fig. 2. Block diagram of the indirect learning.

Observe that (1) can be rearranged to:

$$x(n) = \sum_{m=0}^{M} u(n-m) f_{m}(|u(n-m)|^{2})$$
(2)

where $f_m(.)$ are complex-valued functions of a single real-valued independent variable. Comparing (1) and (2), $f_m(r) = \sum_{p=1}^{P} h_p(m)\, r^{(p-1)/2}$ with $r = |u(n-m)|^2$. In the fix-point design presented in the next section, the formulation in (2) is exploited and each one-dimensional (1D) function $f_m(.)$ is implemented by 2 Look-Up Tables (LUTs): one for its real part and the other for its imaginary part.
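The equivalence between (1) and (2) can be checked numerically. The sketch below is only illustrative: the helper names `f_m`, `mp_direct` and `mp_rearranged`, the coefficient layout `h[p-1][m]` and the zero initial conditions are assumptions of this example.

```python
def f_m(h, m, r):
    """f_m(r) = sum_{p=1..P} h_p(m) * r**((p-1)/2), where r = |u|^2 >= 0."""
    return sum(h[p - 1][m] * r ** ((p - 1) / 2.0) for p in range(1, len(h) + 1))


def mp_direct(u, h, M):
    """Equation (1), with samples before n = 0 taken as zero (assumption)."""
    P = len(h)
    return [sum(h[p - 1][m] * u[n - m] * abs(u[n - m]) ** (p - 1)
                for m in range(min(M, n) + 1) for p in range(1, P + 1))
            for n in range(len(u))]


def mp_rearranged(u, h, M):
    """Equation (2): each tap multiplies u(n-m) by f_m(|u(n-m)|^2)."""
    out = []
    for n in range(len(u)):
        acc = 0j
        for m in range(min(M, n) + 1):
            s = u[n - m]
            r = s.real * s.real + s.imag * s.imag  # squared magnitude, no square root
            acc += s * f_m(h, m, r)
        out.append(acc)
    return out
```

The rearranged form needs only the squared magnitude, which is why the hardware can index the LUTs with the sum of two squares and avoid a square root.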

## III. DPD design in fix-point arithmetic

To make data handling possible in fix-point arithmetic, the floating-point double-precision data must first be normalized, for instance to guarantee that no value exceeds unity in magnitude. Then, the normalized floating-point data must be converted to fix-point data. This can be performed in MATLAB [10]. In the fix-point arithmetic used here, negative numbers are represented in two's complement. The block diagram of the DPD suitable for fix-point arithmetic is shown in Fig. 3. Observe that it implements (2) with the memory length M set to 1.
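The normalization and conversion steps can be sketched as follows. This is a minimal Python illustration, not the MATLAB procedure used in this work: the 16-bit word with 15 fractional bits, the saturation at the range limits, the `headroom` factor and all function names are assumptions of the example.

```python
FRAC_BITS = 15
SCALE = 1 << FRAC_BITS            # 2**15 = 32768
Q_MIN, Q_MAX = -SCALE, SCALE - 1  # range of a 16-bit two's complement word


def normalize(values, headroom=0.999):
    """Scale real values so that the largest magnitude stays below unity."""
    peak = max(abs(v) for v in values)
    if peak == 0:
        return list(values)
    return [v * headroom / peak for v in values]


def to_fixed(value):
    """Quantize a float in [-1, 1) to a signed integer, saturating at the limits."""
    q = int(round(value * SCALE))
    return max(Q_MIN, min(Q_MAX, q))


def to_twos_complement(q, bits=16):
    """Bit pattern (as an unsigned integer) of a signed value."""
    return q & ((1 << bits) - 1)


def from_twos_complement(word, bits=16):
    """Signed value of a two's complement bit pattern."""
    return word - (1 << bits) if word & (1 << (bits - 1)) else word


def to_float(q):
    """Convert a signed fix-point integer back to floating point."""
    return q / SCALE
```

With this word format the quantization step is 2^-15, so a value inside the range that is converted to fix-point and back differs from the original by at most half a step.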



Fig. 3. Block diagram of the fix-point DPD.

As shown in Fig. 3, the real and imaginary parts of the signal to be pre-distorted, at the current time sample (n) and at the past time sample (n-1), are each squared. For each time sample, the two squares are summed and the result indexes a LUT: one LUT for the current time sample and another LUT for the past time sample. Each LUT provides 2 output values for a single input: a real component and an imaginary component. The real component at each LUT output is multiplied by the real component of the unmodified DPD input signal at the same time sample. Likewise, the imaginary component at each LUT output is multiplied by the imaginary component of the unmodified DPD input signal at the same time sample. Next, for each time sample, the second product is subtracted from the first and, finally, the real component at the DPD output is the sum of these differences over the two time samples. The imaginary component at the DPD output is obtained by similar multiplication and sum operations, as illustrated in Fig. 3.
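The data path described above can be mimicked with integer arithmetic. The following Python sketch is a behavioral illustration only, not the VHDL of this work. The Q15 scaling, the LUT depth `ADDR_BITS`, the truncating right shifts, the assumption that LUT entries lie in [-1, 1), and the names `build_lut` and `dpd_sample` are all hypothetical choices made for the example.

```python
FRAC = 15      # fractional bits of the assumed Q15 format
ADDR_BITS = 8  # hypothetical LUT depth: 2**8 = 256 entries


def build_lut(f):
    """Sample a complex function f(r), r in [0, 1), into (real, imag) Q15 pairs.

    The values of f are assumed to lie in [-1, 1) so that they fit in 16 bits.
    """
    n = 1 << ADDR_BITS
    lut = []
    for k in range(n):
        v = f(k / n)
        lut.append((int(round(v.real * (1 << FRAC))),
                    int(round(v.imag * (1 << FRAC)))))
    return lut


def dpd_sample(i0, q0, i1, q1, lut0, lut1):
    """One DPD output for M = 1; (i0, q0) is sample n and (i1, q1) is sample n-1."""
    out_re = 0
    out_im = 0
    for i, q, lut in ((i0, q0, lut0), (i1, q1, lut1)):
        r = (i * i + q * q) >> FRAC                 # squared magnitude in Q15
        idx = min(r >> (FRAC - ADDR_BITS), len(lut) - 1)
        fr, fi = lut[idx]                           # real and imaginary LUT outputs
        out_re += (fr * i - fi * q) >> FRAC         # real part of u * f
        out_im += (fr * q + fi * i) >> FRAC         # imaginary part of u * f
    return out_re, out_im


# Constant gain of 0.5 on the current sample, no contribution from the past one.
lut_now = build_lut(lambda r: 0.5 + 0.0j)
lut_past = build_lut(lambda r: 0.0 + 0.0j)
y = dpd_sample(16384, 0, 0, 0, lut_now, lut_past)   # input 0.5 + 0j in Q15
```

Each pass of the loop contains the multiply, add, LUT, multiply and subtract steps of one branch of Fig. 3, and the accumulation across the two passes is the final sum.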

The block diagram is then described in VHDL. Observe that the VHDL code must be capable of adding two signals, multiplying two signals, and processing a signal through a LUT having 1 input and 2 outputs. In principle, the code for these operations can be tuned to make them faster and more efficient. The adders and multipliers were taken from the standard VHDL libraries, which leave no room for further optimization. Some alternative implementations were tested, but none of them performed better than the standard library code. The LUTs, in contrast, were custom designed and optimized as far as possible.

To evaluate the accuracy of the designed DPD, the real and imaginary components at the DPD output can be converted from fix-point back to floating-point data. This can be performed in MATLAB [10].

### A. Real-time DPD processing

Since the DPD must operate in real time at a frequency as high as 100 MHz, the total time for processing a sample through the DPD cannot exceed 10 ns. Furthermore, observe that, according to Fig. 3, processing a sample through the DPD requires a minimum sequence of 6 operations, where each operation depends on the outcome of the previous one. These operations are restricted to addition, multiplication, or processing through a LUT.

Computer simulations were performed in the ISE Design Suite software [11] for the Xilinx Virtex-5 LX50T FPGA. Specifically, post-place-and-route simulations were run to estimate the time delay of each of the three basic operations. The number of bits was fixed to 16, i.e. 15 precision bits plus 1 sign bit. Simulation results show that the delay of the addition operation is about 9 ns, that of the multiplication operation is about 16 ns, and that of the processing through the LUT is about 10 ns. With these delays, a real-time implementation of the DPD based solely on combinational logic is not feasible, because the delays of the six cascaded operations accumulate to a total several times larger than the 10 ns budget.

It then becomes necessary to include sequential logic. If this is done, the maximum operating frequency of the DPD is limited only by the largest individual delay, which in this case is the multiplication delay (about 16 ns) and, therefore, the DPD frequency can be as high as 60 MHz. A consequence of including sequential logic in the design is that the DPD output signal is delayed, with respect to the DPD input signal, by more than one clock period. Based on Fig. 3, a real-time DPD implementation that uses sequential logic has a fixed delay between the DPD input and the DPD output of at least 6 clock periods.
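The timing argument can be reproduced with a few lines of arithmetic. In the Python sketch below, the delays are the values reported above, whereas the order of the six operations in `CRITICAL_PATH` is an assumption based on one reading of the description of Fig. 3, and the function names are illustrative.

```python
# Post-place-and-route delays reported in the text, in nanoseconds.
DELAY_NS = {"add": 9.0, "mul": 16.0, "lut": 10.0}

# Assumed chain of dependent operations through Fig. 3:
# square -> sum of squares -> LUT -> multiply -> subtract -> final sum.
CRITICAL_PATH = ["mul", "add", "lut", "mul", "add", "add"]


def combinational_fmax_mhz(path, delays):
    """Maximum frequency when the whole chain must settle within one period."""
    total_ns = sum(delays[op] for op in path)
    return 1e3 / total_ns


def pipelined_fmax_mhz(path, delays):
    """Maximum frequency with a register after every operation."""
    return 1e3 / max(delays[op] for op in path)


def pipeline_latency_ns(path, delays):
    """Input-to-output latency of the pipelined chain at its maximum frequency."""
    return len(path) * max(delays[op] for op in path)


f_comb = combinational_fmax_mhz(CRITICAL_PATH, DELAY_NS)
f_pipe = pipelined_fmax_mhz(CRITICAL_PATH, DELAY_NS)
latency = pipeline_latency_ns(CRITICAL_PATH, DELAY_NS)
```

Under these assumptions the combinational chain adds up to 69 ns (about 14.5 MHz), whereas one register per operation gives a clock period of 16 ns, i.e. 62.5 MHz, in line with the figure of roughly 60 MHz quoted above, at a latency of 6 clock periods.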

### B. DPD power consumption

To estimate the DPD power consumption, computer simulations were again performed in the ISE Design Suite software [11] for the Xilinx Virtex-5 LX50T FPGA. Once more, the number of bits was fixed to 16, i.e. 15 precision bits plus 1 sign bit. The design uses approximately 4,401 slice LUTs and the same number of LUT flip-flops. This is low compared with the total number of slice LUTs available in the device: it corresponds to just 15% of the total. The power consumption of the sequential circuit is also modest, with an estimated value of 0.560 W.

## IV. Validation

Computer simulations were performed to validate the DPD design. The fix-point MP DPD scheme shown in Fig. 3 was simulated in the ISE Design Suite from Xilinx. The device under test (DUT) is a PA behavioral model that represents experimental data measured on a GaN-based PA operating in class AB with a center frequency of 900 MHz. The MP topology of (1) was also chosen for the PA behavioral model.

The excitation signal is a 3GPP WCDMA signal. The linearity (or, conversely, the distortion) of the PA under test is measured by the Adjacent Channel Power Ratio (ACPR) metric. ACPR is the ratio between the powers in the adjacent and main signal channels. In this section, a bandwidth of 3.84 MHz was used for both channels, with a 5 MHz separation between the center frequencies of the adjacent and main channels.
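As an illustration of this definition, the sketch below computes the lower and upper ACPR from a sampled power spectral density. It is a minimal Python example under stated assumptions: the PSD is given on a uniform baseband frequency grid centered on the main channel, the band power is approximated by summing the PSD bins inside each channel, and the names `band_power` and `acpr_db` are hypothetical.

```python
import math


def band_power(freqs, psd, center, bandwidth):
    """Sum the PSD bins whose frequency lies inside the given channel."""
    lo = center - bandwidth / 2.0
    hi = center + bandwidth / 2.0
    return sum(p for f, p in zip(freqs, psd) if lo <= f <= hi)


def acpr_db(freqs, psd, bandwidth=3.84e6, offset=5e6):
    """Return (lower, upper) ACPR in dB: adjacent-channel power over main-channel power."""
    main = band_power(freqs, psd, 0.0, bandwidth)
    lower = band_power(freqs, psd, -offset, bandwidth)
    upper = band_power(freqs, psd, +offset, bandwidth)
    return 10.0 * math.log10(lower / main), 10.0 * math.log10(upper / main)


# Synthetic PSD on a 10 kHz grid: flat main channel, adjacent channels 30 dB below.
freqs_demo = [k * 1e4 for k in range(-800, 801)]
psd_demo = [1.0 if abs(f) <= 2.5e6 else 1e-3 for f in freqs_demo]
acpr_lower, acpr_upper = acpr_db(freqs_demo, psd_demo)
```

With this sign convention a more linear PA yields a more negative ACPR, so an improvement appears as a reduction of the value in dB.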

To assess the accuracy of the designed fix-point DPD, the ACPR metric at the PA output is computed with and without pre-distortion. For a fair comparison, the PA average output power is the same in both scenarios. Fig. 4 shows the power spectral densities (PSD) at the PA output. Note that, with the inclusion of the designed DPD, the ACPR at the PA output was reduced by xx dB, which validates the presented linearizer design.



Fig. 4. Power Spectral Density.

Finally, Fig. 5 shows the amplitude of the PA output signal as a function of the amplitude of the excitation signal (the AM-AM plot) for the unlinearized and linearized PA. Observe that the unlinearized PA exhibits memory effects, visible as the scattering of the AM-AM plot. The DPD was able to compensate for most of the nonlinear and memory effects.



Fig. 5. AM-AM plot of unlinearized and linearized PA.

## V. Conclusions

This paper presented a fix-point design of a digital pre-distorter for linearizing wireless transmitters in order to improve their efficiency. The DPD has a memory polynomial topology. The most important aspects of its practical implementation in low-cost hardware such as an FPGA (real-time operation and power consumption) were investigated in order to assess the applicability of the designed DPD. The accuracy of the designed DPD was validated by computer simulations on a PA behavioral model, which showed an improvement of 25 dB in the ACPR metric at the PA output.

## Acknowledgement

The authors would like to acknowledge the financial support provided by UFPR/TN under the Program UFPR Iniciação Científica 2012-2013.

## References

- [1] P. L. Gilabert, A. Cesari, G. Montoro, E. Bertran, and J. Dilhac, "Multi-lookup table FPGA implementation of an adaptive digital predistorter for linearizing RF power amplifiers with memory effects," *IEEE Trans. Microw. Theory Techn.*, vol. 56, no. 2, 2008.
- [2] L. Guan and A. Zhu, "Low-cost FPGA implementation of Volterra series-based digital predistorter for RF power amplifiers," *IEEE Trans. Microw. Theory Techn.*, vol. 58, no. 4, 2010.
- [3] D. R. L. González, "Automatic design-space exploration of integrated multi-standard wireless radio receivers," Licentiate thesis, KTH Information and Communication Technology, Stockholm, 2006.
- [4] X. Cui, "Efficient radio frequency power amplifiers for wireless communications," Doctoral thesis (Electrical Engineering), The Ohio State University, Ohio, 2007.
- [5] A. Kwan, F. M. Ghannouchi, O. Hammi, M. Helaoui, and M. R. Smith, "Look-up table-based digital predistorter implementation for field programmable gate arrays using long-term evolution signals with 60 MHz bandwidth," *IET Sci. Meas. Technol.*, 2012.
- [6] P. B. Kenington, *High Linearity RF Amplifier Design*. Norwood, MA: Artech House, 2000.
- [7] J. Kim and K. Konstantinou, "Digital predistortion of wideband signals based on power amplifier model with memory," *Electron. Lett.*, vol. 37, no. 23, pp. 1417–1418, Nov. 2001.
- [8] E. Lima, T. Cunha, H. Teixeira, M. Pirola, and J. Pedro, "Base-band derived Volterra series for power amplifier modeling," in *IEEE MTT-S Int. Microwave Symp. Dig.*, 2009, pp. 1361–1364.
- [9] C. Eun and E. J. Powers, "A new Volterra predistorter based on the indirect learning architecture," *IEEE Trans. Signal Process.*, vol. 45, no. 1, pp. 223–227, Jan. 1997.
- [10] MATLAB, The MathWorks Inc., Natick, MA. http://www.mathworks.com/help/matlab/
- [11] ISE Design Suite 14.5, Xilinx, 2013.