# Digital Predistortion for RF Communications: From Mathematical Equations to Solution Implementation

DPD stands for digital predistortion, a term familiar to many radio frequency (RF) engineers, signal processing enthusiasts, and embedded software developers. DPD is ubiquitous in cellular communication systems, where it enables power amplifiers (PAs) to deliver maximum power to the antenna efficiently. As 5G increases the number of antennas in base stations and the spectrum becomes more congested, DPD is becoming a key technology for building cost-effective, specification-compliant cellular systems.

Many of us see DPD from our own angle, whether from a purely mathematical perspective or from the more constrained perspective of a microprocessor implementation. You may be an engineer tasked with evaluating DPD performance in RF base station products, or you may be an algorithm developer wondering how mathematical modeling techniques are implemented in real-world systems. This article is designed to broaden your knowledge and help you understand the subject from all angles.

## What is DPD? Why use DPD?

When a base station radio outputs an RF signal (see Figure 1), the signal needs to be amplified before it is transmitted through the antenna. We use an RF PA for this. Ideally, a PA takes an input signal and outputs a higher power signal proportional to its input. Ideally, it also does this as efficiently as possible, converting most of the DC power supplied to the amplifier into signal output power.

Figure 1. Simplified RF block diagram with and without DPD technology

But this is not an ideal world. PAs are built from transistors, which are active devices and inherently nonlinear. As shown in Figure 2, if we operate the PA in its “linear” region (linear here is relative, hence the quotes), the output power is roughly proportional to the input power. The disadvantage is that the PA is then very inefficient, and most of the supplied power is lost as heat. We would rather operate the PA at the point where it starts to compress. There, if the input signal is increased by a set amount (say 3 dB), the PA output does not increase by the same amount (perhaps by only 1 dB). Clearly, the amplifier distorts the signal badly at this point.
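To make the compression behavior concrete, here is a minimal sketch using a memoryless third-order model, y = g1·x − g3·x³ on the signal amplitude. The model and the gain values `g1` and `g3` are assumptions chosen for illustration, not measurements of any real PA. The sketch compares how much the output power rises for a 3 dB input step in the linear region and near compression.

```python
import math

def pa_output(amplitude, g1=10.0, g3=30.0):
    """Toy memoryless PA: linear gain g1 minus a cubic compression term.
    g1 and g3 are hypothetical values, not taken from a real device."""
    return g1 * amplitude - g3 * amplitude ** 3

def db(power_ratio):
    """Convert a power ratio to decibels."""
    return 10.0 * math.log10(power_ratio)

def output_step_db(amplitude, input_step_db=3.0):
    """Output power change (dB) when the input power rises by input_step_db."""
    scale = 10.0 ** (input_step_db / 20.0)  # amplitude ratio for the power step
    before = pa_output(amplitude)
    after = pa_output(amplitude * scale)
    return db((after / before) ** 2)

small_signal_step = output_step_db(0.01)  # deep in the "linear" region
compressed_step = output_step_db(0.20)    # close to the 1 dB compression point

print(f"3 dB input step, small signal:     {small_signal_step:.2f} dB at the output")
print(f"3 dB input step, near compression: {compressed_step:.2f} dB at the output")
```

With these made-up gains the small-signal case tracks the input almost dB for dB, while the compressed case gains noticeably less than 3 dB. A real PA would need a measured model, usually with memory.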

Figure 2. Graph of PA input power versus output power (shows projection of sample input/output signals)

This distortion occurs at known locations in the frequency domain, which depend on the input signal. Figure 3 shows these locations and the relationship between the fundamental frequencies and the distortion products. In RF systems, we only need to compensate for the distortion near the fundamental signal, which consists of odd-order intermodulation products. System filtering handles the out-of-band products (harmonics and even-order intermodulation products). Figure 4 shows the output of an RF PA operating near its compression point. The intermodulation products (especially the third order) are clearly visible, like a “skirt” around the wanted signal.
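The locations of these products follow directly from the two tone frequencies. The sketch below enumerates every product |m·f1 + n·f2| with |m| + |n| equal to the order, then keeps only those close to the fundamentals. The tone frequencies (1840 MHz and 1860 MHz) and the ±100 MHz window are assumed values for illustration.

```python
def two_tone_products(f1, f2, max_order=5):
    """Return {order: sorted positive frequencies m*f1 + n*f2 with |m| + |n| = order}."""
    products = {}
    for order in range(2, max_order + 1):
        freqs = set()
        for m in range(-order, order + 1):
            n_abs = order - abs(m)
            for n in {n_abs, -n_abs}:  # a set avoids counting n = 0 twice
                f = m * f1 + n * f2
                if f > 0:
                    freqs.add(f)
        products[order] = sorted(freqs)
    return products

def near_band(products, center, span):
    """Keep only the products within +/- span of the band center."""
    return {order: [f for f in freqs if abs(f - center) <= span]
            for order, freqs in products.items()}

# Hypothetical two-tone test, frequencies in MHz.
products = two_tone_products(1840, 1860)
near = near_band(products, center=1850, span=100)

for order, freqs in near.items():
    print(f"order {order}: {freqs if freqs else 'nothing near the band'}")
```

Only the odd orders leave products next to the carriers (2f1 − f2 and 2f2 − f1 for the third order, 3f1 − 2f2 and 3f2 − 2f1 for the fifth), which is why DPD concentrates on them while filtering removes the rest.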

Figure 3. Location of Two-Tone Input Intermodulation and Harmonic Distortion

Figure 4. 2 × 20 MHz carriers through the SKY66391-12 RF PA, center frequency = 1850 MHz

DPD characterizes this distortion by observing the PA output and comparing it with the desired output signal, and then changes the input signal so that the PA output is close to ideal. This works effectively only under fairly specific conditions: the amplifier and input signal must be set up so that the amplifier is somewhat compressed but not fully saturated.

## The math behind PA distortion modeling

The Volterra series is an important mathematical basis for DPD and is used to model nonlinear systems with memory. Memory simply means that the current output of the system depends on both current and past inputs. Volterra series are very general (and therefore powerful) and are used in many fields outside electrical engineering. For PA DPD, the Volterra series can be pruned to a more compact form, making it easier to implement and more stable in real-time digital systems. The generalized memory polynomial (GMP) is one such pruned form.

Figure 5 shows how the relationship between the PA’s input x and output y can be modeled using GMP. The three summation blocks of this equation are very similar to each other. Let’s start with the first one, circled in red in Figure 5. The $|x(\cdot)|^k$ term is the envelope of the input signal raised to the power k, where k is the polynomial order. The index l introduces memory into the system. If $L_a = \{0, 1, 2\}$, then the model allows the output $y_{GMP}(n)$ to depend on the current input x(n) and the past inputs x(n – 1) and x(n – 2). Figure 6 analyzes the effect of polynomial order k on a sample vector. Vector x is a single 20 MHz carrier, represented at complex baseband. The memory terms are removed to simplify the GMP modeling equation. The distortion shown by the $x|x|^k$ plots is very similar to the actual distortion in Figure 4.
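For reference, the GMP is commonly written in the literature in the form below. This is offered as a reference form rather than a copy of Figure 5. The coefficient names $b_{klm}$ and $c_{klm}$ and the index sets $L_b$ and $L_c$ are assumed notation for the second and third summation blocks.

```latex
\begin{aligned}
y_{\mathrm{GMP}}(n) ={} & \sum_{k \in K_a} \sum_{l \in L_a} a_{kl}\, x(n-l)\, \lvert x(n-l) \rvert^{k} \\
 & + \sum_{k \in K_b} \sum_{l \in L_b} \sum_{m \in M_b} b_{klm}\, x(n-l)\, \lvert x(n-l-m) \rvert^{k} \\
 & + \sum_{k \in K_c} \sum_{l \in L_c} \sum_{m \in M_c} c_{klm}\, x(n-l)\, \lvert x(n-l+m) \rvert^{k}
\end{aligned}
```

In the first line the envelope and the complex sample share the same delay l. In the second and third lines the envelope is shifted by a further m samples, backward or forward in time respectively.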

Each polynomial order (k) and memory delay (l) has an associated complex-valued weight ($a_{kl}$). After choosing the complexity of the model (that is, which values of k and l to include), these weights must be solved for from actual observations of the PA output for a known input signal. Figure 7 converts the simplified equation into matrix form. Summation notation represents the model concisely, but for implementing DPD on digital data buffers, matrix notation is simpler and more representative.
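As a rough sketch of the matrix form for the simplified (memoryless) model, each row of X can hold one input sample multiplied by successive powers of its envelope, so that the model output is y = Xw. The orders, weights, and samples below are invented for illustration and do not come from the article’s figures.

```python
def basis_row(x_n, orders):
    """One row of X: the sample times each chosen power of its envelope, x|x|^k."""
    return [x_n * abs(x_n) ** k for k in orders]

def build_basis_matrix(x, orders):
    """Stack one basis row per sample in the data buffer."""
    return [basis_row(x_n, orders) for x_n in x]

def apply_model(X, w):
    """Matrix-vector product y = X w, one output sample per row."""
    return [sum(col * weight for col, weight in zip(row, w)) for row in X]

# Hypothetical model: envelope powers k = 0, 2, 4 with made-up complex weights.
orders = [0, 2, 4]
w = [1.0 + 0.0j, -0.1 + 0.05j, 0.01 - 0.02j]
x = [0.1 + 0.2j, 0.5 - 0.5j, -0.9 + 0.3j]

X = build_basis_matrix(x, orders)
y = apply_model(X, w)

for x_n, y_n in zip(x, y):
    print(f"x = {x_n:.3f} -> y = {y_n:.3f}")
```

The matrix has one row per buffered sample and one column per weight, which is why it becomes tall as soon as the buffer is longer than the number of model terms.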

Now let’s look at the second and third lines of the equation in Figure 5, which were ignored above for simplicity. Note that if m is set to 0, these two lines become exactly the same as the first line. These lines allow delays (both positive and negative) to be added between the envelope term and the complex baseband signal. They are called the lagging and leading cross terms, and they can significantly improve the modeling accuracy of DPD by providing additional degrees of freedom for modeling the behavior of the amplifier. Note that $M_b$, $M_c$, $K_b$, and $K_c$ must not contain 0; otherwise, terms from the first line would be duplicated.
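The three kinds of terms can be sketched in a few lines. The function below evaluates a GMP-style model from three dictionaries of weights. The dictionary layout, the zero padding outside the buffer, and the sample values are assumptions made for this example. The m = 0 case is included only to check the remark above; a real model would exclude it.

```python
def sample(x, i):
    """x(i), treating samples outside the buffer as zero (an assumed edge policy)."""
    return x[i] if 0 <= i < len(x) else 0j

def gmp_output(x, a, b, c):
    """Evaluate a GMP-style model.

    a: {(k, l): weight}     aligned terms        x(n-l) |x(n-l)|^k
    b: {(k, l, m): weight}  lagging cross terms  x(n-l) |x(n-l-m)|^k
    c: {(k, l, m): weight}  leading cross terms  x(n-l) |x(n-l+m)|^k
    """
    y = []
    for n in range(len(x)):
        acc = 0j
        for (k, l), w in a.items():
            acc += w * sample(x, n - l) * abs(sample(x, n - l)) ** k
        for (k, l, m), w in b.items():
            acc += w * sample(x, n - l) * abs(sample(x, n - l - m)) ** k
        for (k, l, m), w in c.items():
            acc += w * sample(x, n - l) * abs(sample(x, n - l + m)) ** k
        y.append(acc)
    return y

# Hypothetical data and weight.
x_demo = [0.3 + 0.1j, -0.2 + 0.4j, 0.5 - 0.5j, 0.1 + 0.0j]
w_demo = 0.1 - 0.05j

y_aligned = gmp_output(x_demo, {(2, 1): w_demo}, {}, {})
y_lag_m0 = gmp_output(x_demo, {}, {(2, 1, 0): w_demo}, {})
y_lead_m0 = gmp_output(x_demo, {}, {}, {(2, 1, 0): w_demo})
y_lag_m1 = gmp_output(x_demo, {}, {(2, 0, 1): 1.0 + 0.0j}, {})

print("m = 0 cross terms match the aligned term:",
      y_aligned == y_lag_m0 == y_lead_m0)
```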

Figure 5. GMP for PA distortion modeling

Figure 6. Effect of polynomial order (k) on the spectrum of signal x

Figure 7. Converting the simplified equation to a matrix operation for a data buffer (closer to a digital implementation)

So, how do we determine the order of the model, the number of memory terms, and which cross terms should be added? At this point, a certain amount of “black magic” is required. Our knowledge of the physics of distortion can help. The type of amplifier, materials of manufacture, and the bandwidth of the signal passing through the amplifier all affect the modeling terms and can help engineers familiar with the field determine which model should be used. However, beyond that, there is a certain amount of trial and error involved.

Now that we have the model architecture, the last part of the problem to tackle from a mathematical perspective is how to solve for the weight coefficients. In practice, people tend to solve for the inverse of the above model. It turns out that the same coefficients serve both purposes: they can post-distort the captured PA output vector to remove the nonlinearity, and they can predistort the transmit signal sent through the PA so that the PA output is as linear as possible. Figure 8 shows how the weight coefficients are estimated and how the predistortion is applied.

Figure 8. Modeling and predistortion indirect implementation block diagram

In the inverse model, the roles of the matrices in Figure 7 are interchanged to give $\hat{x} = Yw$. The matrix Y has the same structure as X in the earlier examples, as shown in Figure 9. In this example, a memory term is included and the polynomial order is reduced. To solve for w, we would need the inverse of Y. Y is not square (it is a tall matrix), so the system is solved using the pseudo-inverse (see Equation 1). This solves for w in the least squares sense, that is, it minimizes the squared difference between x and Yw, which is exactly what we want!

$$\hat{w} = \left( Y^{H} Y \right)^{-1} Y^{H} x \qquad (1)$$

Because the system operates in a real environment with changing signals, we can refine this further. Here, the coefficients are updated from their previous values and are therefore constrained by them. μ is a constant between 0 and 1 that controls how much the weights change at each iteration. If μ = 1 and $w_0 = 0$, the update equation reduces to the basic least squares solution in a single step. If μ is set to a value less than 1, the coefficients take more iterations to converge.
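The update equation itself did not survive in this text. One form consistent with the description above is the damped least squares update below. It is an assumption, with i as the iteration index and $\hat{w}_0$ as the starting value.

```latex
\hat{w}_{i} = \hat{w}_{i-1} + \mu \left( Y^{H} Y \right)^{-1} Y^{H} \left( x - Y \hat{w}_{i-1} \right)
```

Setting $\mu = 1$ and $\hat{w}_0 = 0$ gives $\hat{w}_1 = (Y^{H} Y)^{-1} Y^{H} x$, the plain least squares solution. For a fixed capture and $\mu < 1$, each iteration shrinks the distance to that solution by a factor of $(1 - \mu)$, which trades convergence speed for robustness against a single bad capture.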

Note that the modeling and estimation techniques described here are not the only way to perform DPD. Other techniques, such as modeling based on dynamic deviation reduction (DDR), can be used instead or in addition.

## How can this technology be implemented in a microprocessor?

DPD is typically implemented in digital baseband, usually in a microprocessor or FPGA. ADI’s RadioVerse transceiver products, such as the ADRV902x family, have a built-in microprocessor core with a structure that makes DPD easy to implement.

Figure 9. Inverse model equation in matrix form, with some memory terms included

Figure 10. Example of a third-order predistortion calculation with one memory tap and one third-order cross term

Implementing DPD in embedded software involves two parts. One is the DPD actuator, which predistorts the transmitted data in real time. The other is the DPD adaptation engine, which updates the DPD coefficients based on the observed PA output.

The key to implementing DPD, and many other signal processing concepts, in real time on a microprocessor or similar device is the use of look-up tables (LUTs). LUTs replace costly runtime computations with simple array indexing operations. Let’s look at how the DPD actuator applies predistortion to the transmitted data samples. The notation follows Figure 8, where u(n) is the new data sample to be transmitted and x(n) is its predistorted version. Figure 10 shows the computations required to obtain a predistorted sample for a given scenario. This is a fairly restricted example: the highest polynomial order is third, with only one memory tap and one cross term. Even in this case, producing one such data sample requires many multiplications, exponentiations, and additions.

Here, LUTs can reduce the real-time computational burden. The equation shown in Figure 10 can be rewritten in the form shown in Figure 11, where the data input to each LUT becomes more apparent. Each LUT holds the value of the highlighted term in the equation, precomputed for a range of possible values of |u(n)|. The resolution depends on the LUT size that the available hardware supports. The magnitude of the current input sample is quantized to the resolution of the LUT and used as an index to access the correct LUT element for that input.
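The LUT idea can be sketched as follows. This is not the equation of Figure 10 or Figure 11, which are not reproduced here, but an assumed third-order example with the same ingredients: one table applied to the current sample, one applied to the previous sample (the memory tap), and one indexed by the previous magnitude but applied to the current sample (the cross term). The coefficients, the 256-entry table size, and the nearest-entry quantization are all assumptions.

```python
def poly(coeffs, r):
    """Evaluate sum_k coeffs[k] * r**k for a real magnitude r."""
    return sum(a * r ** k for k, a in coeffs.items())

def build_lut(coeffs, size, max_mag):
    """Precompute the polynomial at `size` evenly spaced magnitudes in [0, max_mag]."""
    step = max_mag / (size - 1)
    return [poly(coeffs, i * step) for i in range(size)]

def lut_index(mag, size, max_mag):
    """Quantize a magnitude to the nearest LUT entry, clamping at the last entry."""
    return min(size - 1, round(mag / max_mag * (size - 1)))

def predistort_lut(u, luts, size, max_mag):
    """x(n) = u(n)*LUT1[|u(n)|] + u(n-1)*LUT2[|u(n-1)|] + u(n)*LUT3[|u(n-1)|]."""
    lut1, lut2, lut3 = luts
    x, prev = [], 0j
    for cur in u:
        i_cur = lut_index(abs(cur), size, max_mag)
        i_prev = lut_index(abs(prev), size, max_mag)
        x.append(cur * lut1[i_cur] + prev * lut2[i_prev] + cur * lut3[i_prev])
        prev = cur
    return x

def predistort_direct(u, coeff_sets):
    """Same computation with the polynomials evaluated at run time, for comparison."""
    c1, c2, c3 = coeff_sets
    x, prev = [], 0j
    for cur in u:
        x.append(cur * poly(c1, abs(cur)) + prev * poly(c2, abs(prev))
                 + cur * poly(c3, abs(prev)))
        prev = cur
    return x

# Hypothetical coefficients, keyed by envelope power k.
coeff_sets = (
    {0: 1.0 + 0.0j, 2: -0.10 + 0.02j},  # LUT1: current sample
    {0: 0.05 + 0.0j, 2: 0.0 - 0.01j},   # LUT2: memory tap
    {2: 0.02 + 0.0j},                   # LUT3: cross term
)
SIZE, MAX_MAG = 256, 1.0
luts = tuple(build_lut(c, SIZE, MAX_MAG) for c in coeff_sets)
u = [0.1 + 0.1j, 0.5 - 0.3j, -0.7 + 0.6j, 0.2 + 0.0j]

x_lut = predistort_lut(u, luts, SIZE, MAX_MAG)
x_direct = predistort_direct(u, coeff_sets)
worst = max(abs(p - q) for p, q in zip(x_lut, x_direct))
print(f"largest LUT quantization error: {worst:.2e}")
```

At run time each output sample costs three table reads, three complex multiplications, and two additions, whatever the polynomial order. The price is the quantization error reported above, which shrinks as the table grows.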

Figure 11. Regrouping the terms of the equation to show the structure of the LUTs

Figure 12 shows how the LUTs are integrated into the complete predistortion actuator implementation for our example case. Note that this is only one possible implementation. It can be changed while keeping the same output. For example, the delay element $z^{-1}$ can be moved to the right of LUT2.

Figure 12. Block diagram of a possible DPD implementation using LUTs

The adaptation engine is responsible for solving for the coefficients used to calculate the LUT values in the actuator. This involves solving for the w vector described in Equations 1 and 2. The pseudo-inverse operation $(Y^{H} Y)^{-1} Y^{H}$ consumes a lot of computing resources. Equation 1 can be rewritten as

$$Y^{H} Y w = Y^{H} x \qquad (3)$$

If $C_{YY} = Y^{H} Y$ and $C_{Yx} = Y^{H} x$, Equation 3 becomes

$$C_{YY} w = C_{Yx} \qquad (4)$$

$C_{YY}$ is a square matrix, which can be factored by Cholesky decomposition into the product of an upper triangular matrix L and its conjugate transpose ($C_{YY} = L^{H} L$). We can then solve for w by introducing a dummy variable z and first solving

$$L^{H} z = C_{Yx} \qquad (5)$$

Then, substituting this dummy variable back, solve

$$L w = z \qquad (6)$$

Because L and $L^{H}$ are upper and lower triangular matrices, respectively, Equations 5 and 6 can be solved for w with very little computation. Every time the adaptation engine runs and produces a new value for w, the actuator LUTs must be updated to reflect it. The adaptation engine can run at set regular intervals, or at irregular intervals triggered by the observed PA output or by changes the operator is aware of in the signal to be transmitted.
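The two triangular solves can be sketched in plain Python. The code forms the normal equations, computes a lower triangular factor G with $C_{YY} = G G^{H}$ (so G plays the role of $L^{H}$ in the text), and then runs forward and back substitution. The test data and the two-column Y are invented for illustration. An embedded implementation would more likely use fixed-size arrays and an optimized math library.

```python
def conj_transpose(A):
    """Conjugate (Hermitian) transpose of a matrix stored as a list of rows."""
    return [[A[r][c].conjugate() for r in range(len(A))] for c in range(len(A[0]))]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def cholesky_lower(C):
    """Return lower triangular G with C = G G^H, for Hermitian positive definite C."""
    n = len(C)
    G = [[0j] * n for _ in range(n)]
    for j in range(n):
        s = C[j][j].real - sum(abs(G[j][k]) ** 2 for k in range(j))
        if s <= 0.0:
            raise ValueError("matrix is not positive definite")
        G[j][j] = complex(s ** 0.5)
        for i in range(j + 1, n):
            t = C[i][j] - sum(G[i][k] * G[j][k].conjugate() for k in range(j))
            G[i][j] = t / G[j][j]
    return G

def solve_normal_equations(Y, x):
    """Solve (Y^H Y) w = Y^H x with one Cholesky factorization and two triangular solves."""
    YH = conj_transpose(Y)
    C_YY = matmul(YH, Y)
    C_Yx = [sum(YH[i][k] * x[k] for k in range(len(x))) for i in range(len(YH))]
    G = cholesky_lower(C_YY)  # G = L^H
    n = len(G)
    z = [0j] * n              # forward substitution: L^H z = C_Yx
    for i in range(n):
        z[i] = (C_Yx[i] - sum(G[i][k] * z[k] for k in range(i))) / G[i][i]
    w = [0j] * n              # back substitution: L w = z, with L = G^H
    for i in reversed(range(n)):
        tail = sum(G[k][i].conjugate() * w[k] for k in range(i + 1, n))
        w[i] = (z[i] - tail) / G[i][i]
    return w

# Hypothetical capture: Y built from PA output samples, x from known weights.
y_samples = [0.2 + 0.1j, -0.4 + 0.3j, 0.6 - 0.5j, -0.8 - 0.2j, 0.9 + 0.4j, 0.1 - 0.7j]
Y = [[s, s * abs(s) ** 2] for s in y_samples]
w_true = [1.1 - 0.2j, 0.3 + 0.4j]
x = [row[0] * w_true[0] + row[1] * w_true[1] for row in Y]

w_est = solve_normal_equations(Y, x)
print("recovered weights:", [f"{v:.4f}" for v in w_est])
```

Because x was generated from `w_true` without noise, the solver should return those weights to within rounding error. With a noisy capture it returns the least squares fit instead.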

Implementing DPD in an embedded system requires many checks to ensure the stability of the system. Most importantly, the transmit data buffer and the capture data buffer must be time-aligned, so that the mathematical relationship established between them is correct and stays correct over time. If this alignment is lost, the coefficients returned by the adaptation engine will not predistort the signal correctly, which can make the system unstable. The output of the predistortion actuator should also be checked to ensure that the signal does not saturate the DAC.
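Two of these checks can be sketched briefly. The article does not say how alignment is verified. The example below uses a cross-correlation peak search, which is a common approach but an assumption here, together with a simple full-scale test on the I and Q components. The buffer contents, the 3-sample delay, and the full-scale value of 1.0 are invented.

```python
import cmath
import random

def estimate_delay(tx, capture, max_lag):
    """Lag (in samples) at which the cross-correlation magnitude of capture vs tx peaks."""
    best_lag, best_metric = 0, -1.0
    for lag in range(max_lag + 1):
        acc = 0j
        for n in range(len(tx)):
            if n + lag < len(capture):
                acc += capture[n + lag] * tx[n].conjugate()
        if abs(acc) > best_metric:
            best_lag, best_metric = lag, abs(acc)
    return best_lag

def exceeds_full_scale(samples, full_scale=1.0):
    """True if any I or Q component would clip a DAC with the given full-scale value."""
    return any(abs(s.real) > full_scale or abs(s.imag) > full_scale for s in samples)

# Hypothetical buffers: the capture is the transmit data delayed by 3 samples,
# scaled and phase-rotated to mimic an unknown loop gain.
rng = random.Random(0)
tx = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(64)]
loop_gain = 0.8 * cmath.exp(1j * 0.6)
capture = [0j] * 3 + [loop_gain * s for s in tx]

delay = estimate_delay(tx, capture, max_lag=8)
print("estimated capture delay:", delay, "samples")
print("transmit buffer clips DAC:", exceeds_full_scale(tx))
```

A production system would typically also need sub-sample alignment and would repeat the check periodically, since the text stresses that the relationship must stay correct over time.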

## Conclusion

This article has examined DPD and its implementation in hardware from the perspective of the underlying mathematics, in the hope of demystifying the subject. The discussion here is just the tip of the iceberg and may encourage the reader to look further into how signal processing techniques are applied in communication systems. ADI’s RadioVerse transceiver products can integrate algorithms such as DPD, providing customers with highly integrated RF hardware and configurable software tools.

Author: Yoyokuo