100 kHz Large-Signal Bandwidth GaN-Based 10 kVA Class-D Power Amplifier with 4.8 MHz Switching Frequency

P. Niklaus,
J. W. Kolar,
D. Bortis

Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Pascal S. Niklaus, Student Member, IEEE, Johann W. Kolar, Fellow, IEEE, and Dominik Bortis, Senior Member, IEEE

Abstract—Power Amplifiers (PAs) are widely used, for example, to emulate the behavior of the power grid or electric machines under critical operating conditions, to measure the impedance of the power grid, or to generate specific impedance profiles in Power-Hardware-in-the-Loop (P-HIL) tests. To accurately emulate dynamic effects and to characterize power electronic systems featuring Wide-Bandgap (WBG) power semiconductors, PAs with very high output voltage quality and ever higher bandwidth (BW) at full output power are required, motivating the development of Ultra-High Bandwidth Power Amplifiers (UHBW-PAs). While linear UHBW-PAs achieve very high signal fidelity and BW, they suffer from very poor efficiency, which demands a large cooling effort and results in uneconomical operation, particularly at high power levels and/or during long-term tests. Therefore, this paper investigates possibilities for a switch-mode realization of UHBW-PAs with significantly higher efficiency and power density compared to existing solutions. Two key concepts, namely series- and parallel-interleaving of multiple switching and/or converter cells, allow the effective switching frequency relevant to output filtering to be increased without increasing the individual device switching frequency that determines the per-device switching losses. This paper comprehensively analyzes the advantages and disadvantages of a combination of series- and parallel-interleaving in terms of losses, volume and complexity scaling. Finally, a UHBW-PA with 10 kVA output power (single-phase), a nominal rms output voltage of 230 V, a full-power BW of 100 kHz, very high output voltage quality (3rd and 5th harmonic < 2.5 V and < 1.2 V, respectively), an efficiency > 95 %, a power density of 25 kW/dm³ (410 W/in³), and a switching frequency of 4.8 MHz is presented. A hardware demonstrator is built, and extensive measurements verify the system performance and confirm the calculations from the initial analyses with loss models.

Index Terms—Power amplifiers, Inverters, Power electronics, DC-AC power conversion, Power system testing, wide-bandgap semiconductors, Power-hardware-in-the-loop

I. INTRODUCTION

TESTING and characterization of power electronic systems is of great importance for their stable and reliable operation in the field. In industrial practice, Power-Hardware-in-the-Loop (P-HIL) test environments are mostly used for this purpose, since they offer a cost-effective way to test various operating modes, difficult to achieve with fully implemented hardware setups [1]. Here, the behavior of a system model is emulated by means of a Real Time Simulator (RTS) and a Power Amplifier (PA), where the latter is connected to a power electronic System Under Test (SUT) [2], [3]. Examples are the emulation of the power grid for testing grid-tied inverter stages (e.g. photovoltaic inverters) or rectifiers with regard to grid compatibility (voltage imbalances, transient phenomena and/or frequency deviations) [4], the emulation of a certain virtual grid impedance [5], [6], or any dc or ac load, e.g., an electrical machine [7], for analyzing drive systems and/or their control loops. A further application scenario is the measurement of the grid impedance [8], which has a significant impact on the stability of grid-tied converters [9]. As shown, e.g., in [2], the Bandwidth (BW) of such power amplifiers is the ultimate limiting factor for the achievable accuracy when emulating dynamic effects like voltage and/or load transients. Furthermore, maximum output voltage quality, i.e., minimum distortion and minimum noise, must be ensured while simultaneously offering the ability to source and/or sink multiple kVAs of output power (bidirectional power flow, arbitrary load phase angle φ). This clearly motivates the use of Ultra-High Bandwidth Power Amplifiers (UHBW-PAs) as an interface to the SUT. Fig. 1 shows a very simplified block diagram of such a three-phase UHBW-PA that is composed of a (typically isolated) three-phase grid interfacing rectifier

Fig. 1. System overview of the 100 kHz large-signal bandwidth power amplifier. Figure taken from [10].

TABLE I. Main system specifications for a single phase of the investigated power amplifier.

<table>
<thead>
<tr>
<th>Parameter</th>
<th>Symbol</th>
</tr>
</thead>
<tbody>
<tr>
<td>Peak Output Voltage per Phase</td>
<td>V_{out, pk}</td>
</tr>
<tr>
<td>Output Frequency</td>
<td>f_{out}</td>
</tr>
<tr>
<td>Output Power per Phase</td>
<td>S_{out}</td>
</tr>
<tr>
<td>DC Link Voltage</td>
<td>V_{dc}</td>
</tr>
<tr>
<td>Effective Switching Frequency</td>
<td>f_{sw, eff}</td>
</tr>
<tr>
<td>System Efficiency (Nom. Op. Pt.)</td>
<td>η</td>
</tr>
</tbody>
</table>

© 2022 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.
stage to provide a (galvanically isolated) dc link voltage $V_{\text{dc}}$ for the subsequent ultra-high BW inverter stage (dc/ac stage, highlighted in blue). The design of one single-phase module of the latter is the focus of this article. A full three-phase system can then easily be assembled with three such modules. Note that within this article the terms UHBW-PA and PA (used interchangeably) refer to a single-phase module of the highlighted dc/ac stage from Fig. 1 with the specifications listed in Table I. In the nominal operating point, the full 10 kVA output power (per phase) is delivered for a nominal rms output voltage of 230 V at a fundamental frequency of 100 kHz into an ohmic load ($S_{\text{out}} = P_{\text{out}} = 10$ kW). Note that with “large-signal BW” we refer to the frequency at which the amplifier can provide its full-scale output power (at nominal output voltage and current) according to Table I, as defined in [11].

Traditionally, such PAs are implemented as linear power amplifiers, especially those with very high BW requirements [12], [13]. While they offer a very high output voltage quality, a major disadvantage of such solutions is their low efficiency (in particular for non-resistive loads and for power sinking), which is especially concerning for high output powers and/or long-term tests. Besides the inefficient operation, the high losses demand a large cooling effort and consequently lead to a low power density. For high output powers, switch-mode amplifiers (Class-D amplifiers) are therefore clearly preferred and are also increasingly found in commercial implementations [14], [15]. However, as explained later, to reach the desired output BW in a switch-mode PA, very high switching frequencies in the multi-MHz range are required.

In [11], different implementation options for such switch-mode PAs are presented and their advantages and disadvantages are compared with those of linear and hybrid PAs. Furthermore, a comprehensive review of currently available and developed PAs in industry and academia is given, and a switch-mode realization composed of multiple cascaded (series-connected) full-bridges that achieves the same full-power BW of 100 kHz and the same single-phase output power of 10 kW is presented. The cascade/series-connection of complete converter cells (full-bridges), denoted Cascaded H-Bridge (CHB) converter, enables an increased effective switching frequency and reduced switch-node voltage steps, which is advantageous for output filtering. There are other PA designs realized as CHB converters that achieve remarkable performance, such as [16], a Magnetic Resonance Imaging (MRI) high-power gradient PA with 7 kHz large-signal BW, an output capability of 1000 V/500 A, and extremely low noise and high precision resulting from a sophisticated control method, or [17], a 41-level pulsed-sine generator with an output fundamental frequency of up to 1 MHz at an output voltage of up to 2 kV peak-to-peak used for cancer treatment in biomedical applications (output power and output signal quality unspecified). Similarly, [18] demonstrates a CHB-based converter with a 1 MHz, 200 V amplitude sinusoidal output with a maximum power of 1.35 kW. CHB realizations offer very promising dynamic performance, especially under aggressive load conditions, and full modularity (limited output voltage/power might still be available in case of failure of one CHB cell). The downside common to all of them, however, is the need for a galvanically isolated dc supply for each converter cell and the resulting relatively large coupling capacitances towards Protective Earth (PE), which cause significant Common Mode (CM) currents. These CM currents must be adequately filtered in order not to impair the converter operation.

A further interesting concept is the hybrid analog/digital PA, which combines a high-power, high-efficiency switch-mode PA (main amplifier) with a relatively low-power (yet high-voltage or high-current capable) linear PA (correction amplifier) to achieve maximum output BW and/or maximum signal fidelity/precision at a still moderate system efficiency. Such PAs can be seen as a compromise between purely analog and fully digital implementations. There are three possible configurations, namely series connection, parallel connection and envelope tracking, which are comprehensively reviewed in [19]. In [20], [21], a CHB-based main amplifier is connected in series to a linear correction amplifier in such a way that the closed-loop dynamics are determined solely by the latter. Thereby, a BW of 105 kHz of the linear amplifier is achieved, which, e.g., allows full-scale test signals of up to 60 kHz to be generated with the hybrid PA. An example of a parallel hybrid PA is shown in [22], where the linear amplifier defines the output voltage and the switch-mode amplifier, coupled via a single filter inductor, delivers the bulk load current. In that sense, the linear amplifier can be seen as an active filter for the switch-mode main amplifier. Similarly, the linear amplifier could be replaced by a low-power, fast-switching (and therefore high-BW) digital converter, as shown, e.g., in [23], [24]. Envelope tracking is a technique where a switch-mode converter forms a varying supply voltage for the linear PA based on the desired output voltage envelope, in order to minimize the voltage drop across, and hence the conduction losses in, the linear power transistors [25]. A particularly interesting application field is Radio-Frequency (RF) power amplifiers, where an RF signal with a varying amplitude envelope has to be amplified. 
Realizations of such an envelope tracking PA based on a three-level buck converter are presented in [26], [27] where experimental results verify accurate tracking of a 10 kHz rectified sine wave with an amplitude of 4 V. A boost-type solution for higher operating voltages (up to 130 V rms) and power levels (up to 1.5 kW) and a tracking BW of 1 kHz is shown in [28].

In contrast to CHB converter cells and/or hybrid analog/digital and digital/digital approaches, which typically are more complex in realization, this article discusses the realization of an entirely switch-mode PA with multiple Switching Cells (SCs) connected in parallel and series, which, in addition to reducing the switch-node voltage steps, also allows an increase in the effective switching frequency [29]. Within this article we use the term Switching Cell for one half-bridge with corresponding dc link capacitor, unlike CHB Converter Cells, which are composed of one full-bridge with dedicated dc link capacitor. The aim is to explore the performance limitations by pushing the (effective) switching frequency to very high values, while still keeping circuit complexity at a reasonable level. In [10], the optimal number of parallel- and series-connected SCs for the specifications given in Table I has
already been discussed. In this article, this analysis is extended and verified using a hardware demonstrator. Section II derives the circuit topology, focusing on semiconductor loss scaling depending on the number of series- and parallel-interleaved SCs and ultimately a suitable topology is selected. Section III then shows a detailed design procedure for the High-Frequency (HF) output filter inductors before Section IV presents the realized hardware demonstrator of the UHBW-PA including a liquid cooling system and highlights important design aspects. The performance of the hardware demonstrator is experimentally verified in Section V. Finally, Section VI concludes this article.

II. CONVERTER TOPOLOGY

A. Switch-Mode Power Amplifier

The BW of switch-mode PAs is limited by the necessary output filter (typically a k-stage second-order LC filter with corner frequency \( f_c \) and an attenuation of \(-k \cdot 40\text{ dB/dec}\)), which attenuates the HF spectral content in the switched output voltage, such that the local average (averaged over one switching period \( T_{sw} \)) remains [22]. This local average is tracking a programmed reference voltage/waveform \( v_{ref} \) given by the desired application scenario (e.g., coming from the RTS in P-HIL test environments). Within this article we restrict the discussion to single-stage LC output filters. Assuming a naturally sampled (continuous) Pulse Width Modulation (PWM) with a triangular carrier and a purely sinusoidal \( v_{ref} \) with frequency \( f_{out} \), the switched voltage \( v_{sw} \) contains spectral components at multiples of the switching frequency \( n \cdot f_{sw} \) and the respective sidebands \( n \cdot f_{sw} \pm k \cdot f_{out} \) with \( n \in \mathbb{N} \) and \( k \in \{2, 4, \ldots\} \) if \( n \) is odd or \( k \in \{1, 3, \ldots\} \) if \( n \) is even [30]. To attenuate sidebands at frequencies below \( f_{sw} \), the filter corner frequency \( f_c \) must be substantially lower than \( f_{sw} \). To account for effects such as finite filter slopes (e.g., \(-40\text{ dB/dec}\)) and component tolerances, in practice, \( f_c < f_{sw} / 10 \) is typically chosen [31]. At the same time, \( f_c \) must be higher than the maximum anticipated output frequency \( f_{out,max} \) to avoid exciting the filter resonance \((f_c \geq k_1 \cdot f_{out,max})\). Thereby, to reach the desired BW of 100 kHz, switching frequencies in the range of several MHz are required \((f_{sw} \geq 10 \cdot k_1 \cdot f_{out,max})\).
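The two corner-frequency constraints above can be combined numerically. The following Python sketch is an illustration only: the function names are hypothetical, the default ratio of 4 between \( f_c \) and \( f_{out,max} \) is the empirical value quoted later for the filter design, and the factor of 10 is the rule of thumb \( f_c < f_{sw}/10 \). It also lists the sidebands \( n \cdot f_{sw} \pm k \cdot f_{out} \) according to the stated parity rule, here for an assumed carrier at 4.8 MHz.

```python
def min_switching_frequency(f_out_max, k_ratio=4.0, margin=10.0):
    """Lower bound on f_sw from f_c >= k_ratio * f_out_max and f_c < f_sw / margin."""
    f_c_min = k_ratio * f_out_max
    return margin * f_c_min


def sideband_frequencies(f_sw, f_out, n_max=2, k_max=4):
    """Sidebands n*f_sw +/- k*f_out of naturally sampled PWM.

    Parity rule from the text: even k for odd n, odd k for even n.
    """
    freqs = []
    for n in range(1, n_max + 1):
        first_k = 2 if n % 2 == 1 else 1
        for k in range(first_k, k_max + 1, 2):
            freqs.append(n * f_sw - k * f_out)
            freqs.append(n * f_sw + k * f_out)
    return sorted(freqs)


if __name__ == "__main__":
    f_out_max = 100e3  # maximum output frequency in Hz
    print(f"f_sw >= {min_switching_frequency(f_out_max) / 1e6:.1f} MHz")
    for f in sideband_frequencies(4.8e6, f_out_max):
        print(f"sideband at {f / 1e6:.2f} MHz")
```

With these assumptions the bound evaluates to 4 MHz, which is in line with the statement that switching frequencies of several MHz are needed for a 100 kHz BW.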

The authors of [11] derived the minimum required \( f_{sw} \) to reach a certain output voltage quality (quantified by means of the output voltage peak-to-peak ripple \( \Delta v_{out,pp} \)) for a certain maximum capacitive current and maximum inductive voltage drop in the respective filter elements. For a practical case of \( \Delta v_{out,pp} = 2\% \cdot \bar{v}_{out} \) (peak full-scale output voltage \( \bar{v}_{out} \)), 30\% capacitive current and 15\% inductive voltage drop a standard 2-level (2L) Voltage Source Inverter (VSI) requires \( f_{sw} = 5\text{ MHz} \). Even with the ever more widespread availability of Wide-Bandgap (WBG) power semiconductors with significant reduction of switching losses compared to their traditional Silicon (Si) counterparts, \( f_{sw} \) in normal 2L operation is still limited by the (hard) switching losses, particularly if a certain efficiency (95\% in the case at hand) is targeted. Therefore, alternative circuit topologies to the 2L-VSI are required, which allow to increase the effective switching frequency \( f_{sw,\text{eff}} \) of the switched voltage seen by the output filter without the penalty of high switching losses per device (switching losses distributed between multiple devices), i.e., the individual devices switch at a much lower frequency \( f_{sw} \).

Based on results of a previous study on this topic in [10], we briefly repeat the basic idea of series- and parallel-interleaving to ultimately find the most suitable combination of both approaches, i.e., a series-parallel-interleaved multi-level converter topology, which best fulfills the design goals (cf. Table I).

1) Series Interleaving:

Multi-level converters are used to increase the number of voltage levels of \( v_{sw} \) that is applied to the output filter inductor by phase-shifting the operation of \((M - 1)\) series-connected SCs or full converter cells. Each cell is operated with \( f_{sw} \), generating \( M \) distinct voltage levels at the switch-node as depicted in Fig. 2 (a.i)-(a.iii). The HF harmonics of \( v_{sw} \) are shifted to higher frequencies, i.e., to \( f_{sw,\text{eff}} = (M - 1) \cdot f_{sw} \). Furthermore, the blocking voltage stress of the switches is reduced to \( V_{dc} / (M - 1) \), which on the one hand renders certain semiconductor technologies usable in application scenarios where they normally cannot be used (e.g., 600 V Gallium Nitride (GaN) devices with an 800 V dc link) and on the other hand offers the possibility to use lower voltage devices with a potentially better Figure-of-Merit (FOM) [32]. From the various multi-level topologies described in literature, such as Modular Multi-Level Converters (MMCs) [33] (which need active control of the individual submodule capacitor voltages [34]), CHB [35] and Neutral-Point Clamped (NPC) converters [36], [37], the Flying Capacitor Converter (FCC) initially proposed in [38] and depicted in Fig. 2 (a.i) has the fundamental advantage that it can generate a high number of voltage levels with reasonable semiconductor effort, lower circuit complexity compared to other multi-level approaches and is capable of generating a dc output without auxiliary circuits for active voltage balancing of the stacked dc link capacitors. It has to be mentioned, however, that under certain operating and load conditions (e.g., low inductor current ripple and dc output voltage component), active balancing of the Flying Capacitor (FC) voltages is required and can be implemented with relatively simple control algorithms, such as the one presented in [26]. 
As mentioned in the introduction, there are several application scenarios where CHB converters are very well suited but due to circuit complexity and overall system efficiency considerations (an isolated dc link voltage needs to be provided to each individual CHB cell), the FCC is identified as a most promising converter candidate for the given application. The operation of the \( M \)-level FCC has been widely discussed in literature [39]-[41] and a further explanation is omitted here.

2) Parallel Interleaving:

High current ratings demand an increased semiconductor area, which can either be realized with large devices or by paralleling \( N \) small devices. In the latter case, phase-shifted operation of \( N \) parallel-interleaved 2L SCs as shown in Fig. 2 (b.i), hereinafter called branches, can favorably be used (commonly also referred to as multi-phase operation in literature) [42].
Not only does this lead to a (partial) cancellation of the current ripple between the individual branches, and hence a reduction of the ripple in the summed output current $i_{\text{sum}}$ seen by the filter capacitance $C$, but at the same time, an increased effective switching frequency $f_{\text{sw,eff}} = N \cdot f_{\text{sw}}$ is obtained at the output capacitor as illustrated in Fig. 2 (b.ii)-(b.iii) [43]. There, $f_{\text{sw}}$ denotes the switching frequency of each individual 2L SC and $N$ the number of interleaved branches. The effective (or virtual) switch-node voltage $v_{\text{sw,eff}}$ results from the inductive voltage divider and shows a multi-level nature that is equal for both, one single series-interleaved $M$-level bridge leg and $N = M - 1$ times parallel-interleaved two-level half-bridges (cf. Fig. 2 (a.ii) and (b.ii) for $M = 5$ and $N = 4$, respectively). In both cases the voltage spectrum contains no components between $f_{\text{out}}$ and $f_{\text{sw,eff}}$ (assuming natural sampling PWM), and therefore, the filtering effort can be drastically reduced while keeping a moderate switching frequency $f_{\text{sw}}$ of each half-bridge, which is beneficial in terms of switching losses per device. Note that coupled inductors can be used to symmetrize the individual branch currents $i_{1,i}$ and to further reduce the current ripple, while achieving the same or even better transient response [44], [45]. Uncoupled filter inductors, however, are offering the desired flexibility to operate UHBW-PAs either as single-phase high current sources or three-phase lower current sources, which extends the possible application scenarios as mentioned earlier.
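The multi-level nature of the effective switch-node voltage can be reproduced with a very small idealized model. The sketch below is an assumption-laden illustration (ideal switches, equal uncoupled inductors, a constant duty cycle over one switching period, hypothetical function names): it counts how many of the \( N \) phase-shifted branches are switched to the positive rail at each instant, which is proportional to \( v_{sw,eff} \) measured from the negative rail.

```python
def triangle(phase):
    """Triangular carrier with values in [0, 1]; phase is in units of T_sw."""
    phase %= 1.0
    return 2.0 * phase if phase < 0.5 else 2.0 * (1.0 - phase)


def branches_on(duty, n_branches, samples=1200):
    """Number of branches whose high-side switch is on, sampled over one T_sw.

    Branch i uses a carrier shifted by i / n_branches of the switching period.
    """
    sequence = []
    for j in range(samples):
        t = j / samples
        on = sum(
            1 for i in range(n_branches)
            if duty > triangle(t + i / n_branches)
        )
        sequence.append(on)
    return sequence


def pulses_per_period(sequence):
    """Count rising edges of the periodic level sequence."""
    shifted = sequence[1:] + sequence[:1]
    return sum(1 for a, b in zip(sequence, shifted) if b > a)


if __name__ == "__main__":
    N, V_DC, DUTY = 4, 800.0, 0.6  # illustrative values
    seq = branches_on(DUTY, N)
    # Effective switch-node voltage (referred to the negative rail)
    levels = sorted({V_DC * n / N for n in seq})
    print("v_sw,eff levels within one period:", levels)
    print("pulses per T_sw:", pulses_per_period(seq))
```

For \( N = 4 \) and a duty cycle of 0.6, the model toggles between two adjacent levels four times per switching period, i.e., at \( N \cdot f_{sw} \).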

3) Series-Parallel Interleaving:
Since series-interleaving distributes the voltage stress among several devices and parallel interleaved operation distributes the current stress among the parallel branches, a combination of the two approaches gives additional degrees of freedom in terms of loss and stress distribution. For a parallel-interleaved multi-level converter, the effective switching frequency of the current ripple seen by the filter capacitor $C$ is given as

$$f_{\text{sw,eff}} = N \cdot (M - 1) \cdot f_{\text{sw}} = n_{\text{SC}} \cdot f_{\text{sw}},$$ (1)

where $n_{\text{SC}} = N \cdot (M - 1)$ denotes the total number of utilized SCs. At the same time, the effective (or virtual) switch-node voltage is composed of $n_{\text{SC}} + 1$ distinct voltage levels, i.e., the voltage steps applied to the output filter are $\Delta v_{\text{sw,eff}} = V_{\text{dc}}/n_{\text{SC}}$. Considering a required $f_{\text{sw,eff}}$ in the range of 5 MHz, a combination of both approaches allows for a low individual device switching frequency $f_{\text{sw}}$ and offers both, voltage and current stress sharing.
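As a numerical illustration of (1), the sketch below evaluates the per-device switching frequency, the number of voltage levels, the voltage step and the device blocking voltage \( V_{dc}/(M-1) \) for a given combination of \( M \) and \( N \). The helper name is hypothetical. The values \( V_{dc} = 800 \) V and \( f_{sw,eff} = 4.8 \) MHz are taken from elsewhere in this article, and \( M = N = 3 \) is used only as one example of obtaining \( n_{SC} = 6 \).

```python
def interleaving_summary(M, N, f_sw_eff, V_dc):
    """Evaluate (1) and the related voltage quantities for N branches of M levels."""
    n_sc = N * (M - 1)                # total number of switching cells
    return {
        "n_SC": n_sc,
        "f_sw": f_sw_eff / n_sc,      # per-device switching frequency
        "levels": n_sc + 1,           # levels of the effective switch-node voltage
        "dv_step": V_dc / n_sc,       # voltage step applied to the output filter
        "v_block": V_dc / (M - 1),    # blocking voltage stress per switch
    }


if __name__ == "__main__":
    result = interleaving_summary(M=3, N=3, f_sw_eff=4.8e6, V_dc=800.0)
    for name, value in result.items():
        print(f"{name}: {value}")
```

For this example, each device switches at 800 kHz and blocks 400 V, while the filter sees seven levels with steps of about 133 V.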

B. Output Filter Design
To design the single-stage $LC$ output filter, four different constraints are derived based on the general HF equivalent circuit of a switching stage depicted in Fig. 3 (a.i), where $v_{\text{sw,eff}}$ denotes the unfiltered $(n_{\text{SC}} + 1)$-level switched voltage with switching frequency $f_{\text{sw,eff}}$ that is generated with the series- and parallel-interleaved converter (with arbitrary $M$ and $N$).

Fig. 3. (a.i) Equivalent circuit of \( N \) parallel-interleaved \( M \)-level series-interleaved branches indicating the effective switching-node voltage \( v_{sw,eff} \), the effective filter inductance \( L_{filt} \) and the filter capacitance \( C_{filt} \) for a single-stage output filter. (a.ii) Fundamental frequency phasor diagram showing the impact of a load phase angle \( \varphi \) (here: ohmic-inductive load) and the reactive current and voltage in the filter elements. (b) Representation of four design criteria that define the valid range for the filter components in the filter design space according to [46], displayed for different numbers of switching cells \( n_{SC} = N \cdot (M - 1) \). The valid Design Space (DS) is shown for \( n_{SC} = 6 \) (realizable, e.g., with \( M = N = 3 \)), with the preferred solution highlighted with \( \star \). Figure based on [10].

In the resultant filter Design Space (DS) proposed in [46], the criteria are graphically visualized on an \( L_{filt} \) vs. \( C_{filt} \) plane (cf. Fig. 3 (b) for different \( n_{SC} \) and highlighted for \( n_{SC} = 6 \)). Note that for the output filter operation it is irrelevant by which combination of \( M \) and \( N \) a certain \( n_{SC} \) is obtained.

The primary goal of a UHBW-PA is to achieve maximum BW. This criterion is formulated with a minimum ratio \( k_l \) between the filter corner frequency \( f_c \) and the maximum output frequency \( f_{out,max} \), i.e.,

\[
L_{filt} \cdot C_{filt} \leq \frac{1}{(2\pi)^2 \cdot k_l^2 \cdot f_{out,max}^2},
\]

(2)

which corresponds to a minimum required \( f_c \) to prevent peaking of \( v_{out} \) at the maximum output frequency. (2) embodies a hyperbola in the DS and is illustrated with the pink line in Fig. 3 (b) for \( f_{out,max} = 100 \text{kHz} \) and \( k_l = 4 \) (empirical value).
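The BW criterion can be checked for any candidate filter with a few lines of Python. This is a minimal sketch with hypothetical function names. The ratio of 4 and \( f_{out,max} = 100 \) kHz are the values given above, whereas the component values in the usage part are merely example inputs assumed for illustration.

```python
import math


def max_lc_product(f_out_max, k_ratio):
    """Upper bound on L_filt * C_filt from the BW criterion."""
    return 1.0 / ((2.0 * math.pi) ** 2 * k_ratio ** 2 * f_out_max ** 2)


def corner_frequency(l_filt, c_filt):
    """Corner frequency of a single-stage LC filter."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_filt * c_filt))


def meets_bandwidth(l_filt, c_filt, f_out_max=100e3, k_ratio=4.0):
    """True if the candidate filter lies below the BW hyperbola."""
    return l_filt * c_filt <= max_lc_product(f_out_max, k_ratio)


if __name__ == "__main__":
    l_example, c_example = 1.26e-6, 90e-9  # example values, assumed for illustration
    print(f"LC limit: {max_lc_product(100e3, 4.0):.3e} s^2")
    print(f"f_c of example filter: {corner_frequency(l_example, c_example) / 1e3:.0f} kHz")
    print("BW criterion met:", meets_bandwidth(l_example, c_example))
```

The limit is equivalent to requiring \( f_c \geq 400 \) kHz, so a filter with a corner frequency around 470 kHz passes, while one at 300 kHz would not.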

To maximize the output voltage quality (besides the BW, the main goal in PAs), a certain maximum peak-to-peak output voltage ripple \( \Delta v_{out,pp} \) has to be defined. Assuming a fixed \( f_{sw,eff} \), the relative output voltage ripple \( \Delta v_{out,pp}/V_{dc} \) is found as

\[
\Delta v_{out,pp}/V_{dc} = \frac{1}{32 \cdot n_{SC} \cdot L_{filt} \cdot C_{filt} \cdot f_{sw,eff}^2},
\]

(3)

and corresponds to a hyperbola in the filter DS (green curves in Fig. 3 (b) for different \( n_{SC} \) and for \( \Delta v_{out,pp}/V_{dc} = 1\,\% \)).

There, \( V_{dc} \) is the full dc link voltage, even though in practice, a split dc link with \( 2 \times V_{dc}/2 \) is used. With \( f_c = 1/(2\pi \cdot \sqrt{L_{filt}C_{filt}}) \), (3) can be rearranged to

\[
f_c = \sqrt{\frac{8 \cdot n_{SC} \cdot f_{sw,eff}^2 \cdot \Delta v_{out,pp}}{\pi^2 \cdot V_{dc}}},
\]

(4)

which shows that with increasing \( M \) and/or \( N \) (that is, with increasing \( n_{SC} \)) the same output voltage quality (e.g., \( \Delta v_{out,pp}/V_{dc} = 1\,\% \)) is achieved with a higher filter corner frequency. In fact, (4) gives the maximum allowed \( f_c \) that still achieves the desired output voltage quality. This clearly motivates the series- and parallel-interleaved circuit topology: for a given effective switching frequency \( f_{sw,eff} = n_{SC} \cdot f_{sw} \), a realization with series- and parallel-interleaving \( (n_{SC} > 1) \) raises the maximum possible \( f_c \) and therefore reduces the filtering effort. In any case, a higher \( f_c \) allows smaller filter components to be used and thereby helps to minimize the system volume.
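The relation between the ripple expression and the maximum corner frequency can be verified numerically. The following sketch uses hypothetical function names and evaluates the two formulas above for \( f_{sw,eff} = 4.8 \) MHz and a relative ripple of 1 %. The comparison of \( n_{SC} = 1 \) and \( n_{SC} = 6 \) is chosen here only to illustrate the \( \sqrt{n_{SC}} \) scaling.

```python
import math


def relative_ripple(n_sc, lc_product, f_sw_eff):
    """Relative peak-to-peak output voltage ripple dv_out,pp / V_dc."""
    return 1.0 / (32.0 * n_sc * lc_product * f_sw_eff ** 2)


def max_corner_frequency(n_sc, f_sw_eff, ripple_rel):
    """Largest f_c that still meets the relative ripple target."""
    return math.sqrt(8.0 * n_sc * f_sw_eff ** 2 * ripple_rel) / math.pi


if __name__ == "__main__":
    f_sw_eff, ripple = 4.8e6, 0.01
    for n_sc in (1, 6):
        f_c = max_corner_frequency(n_sc, f_sw_eff, ripple)
        # Round trip: the LC product at this corner frequency must
        # reproduce the ripple target.
        lc = 1.0 / (2.0 * math.pi * f_c) ** 2
        print(f"n_SC={n_sc}: f_c,max = {f_c / 1e3:.0f} kHz, "
              f"ripple check = {relative_ripple(n_sc, lc, f_sw_eff):.4f}")
```

Under these assumptions, the maximum corner frequency rises from roughly 430 kHz for a single 2L cell to roughly 1.06 MHz for six switching cells.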

Additional design constraints result from the maximum voltage drop \( v_L \) and maximum current \( i_C \) in the filter elements at the nominal operating point with \( V_{out} = 230 \text{V rms} \), \( P_{out} = 10 \text{kW} \) (ohmic load) and the maximum output frequency \( f_{out,max} = 100 \text{kHz} \), i.e., as already calculated in [11],

\[
2\pi \cdot f_{out,max} \cdot L_{filt} \cdot i_{out} \leq k_v \cdot V_{out}
\]

(5)

\[
2\pi \cdot f_{out,max} \cdot C_{filt} \cdot V_{out} \leq k_i \cdot i_{out}
\]

(6)

with the nominal rms output current \( i_{out} = P_{out}/V_{out} \) (for an ohmic load). \( k_v \) and \( k_i \) are the maximum allowed fractions of \( V_{out} \) and \( i_{out} \) to appear across \( L_{filt} \) and flow through \( C_{filt} \), respectively. (5) and (6) correspond to the vertical blue and horizontal red line in the DS in Fig. 3 (b) and result in

\[
L_{filt} \leq 1.26 \,\mu\text{H} \quad \text{and} \quad C_{filt} \leq 90 \,\text{nF}
\]

for \( f_{out} \leq 1.6 \text{MHz} \) (cf. dashed gray outline in Fig. 3 (b)). Note that only the output voltage quality criterion depends on \( n_{SC} \). In the interest of maximum output voltage quality, the minimum possible filter corner frequency \( f_{c,min} = 1/(2\pi \sqrt{L_{filt, max}C_{filt, max}}) = 472 \text{kHz} \) (marked with \( \star \) in Fig. 3 (b)) with \( L_{filt, max} = 1.26 \mu\text{H} \) and \( C_{filt, max} = 90 \text{nF} \) is selected. If the smallest possible filter volume is favored, the design with \( C_{filt} = C_{filt, max} \) and the minimum required \( L_{filt} \) should be selected from the DS (marked with \( \blacklozenge \) in Fig. 3 (b) for \( n_{SC} = 6 \)). Generally, a low filter inductance and a high filter capacitance facilitate a low converter output impedance, which prevents load dependence of the output voltage. Note that depending on the load phase angle \( \varphi \), the inductive voltage drop across \( L_{filt} \) and the capacitive current through \( C_{filt} \) lead to an increase or a decrease of \( v_{sw,eff} \) and \( i_{sum} \) as illustrated with the fundamental frequency phasor diagram in Fig. 3 (a.ii). In the nominal operating point with an ohmic load, the chosen \( k_v \) and \( k_i \) increase the required \( v_{sw,eff} \) by \( \approx 1 \% \) and \( i_{sum} \) by \( \approx 4.5 \% \).
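The component limits follow directly from (5) and (6). The sketch below assumes \( k_v = 15\,\% \) and \( k_i = 30\,\% \), i.e., the inductive voltage drop and capacitive current fractions quoted in Section II-A, together with the nominal operating point (230 V rms, 10 kW, 100 kHz). The function names are hypothetical.

```python
import math


def filter_limits(v_out, p_out, f_out_max, k_v, k_i):
    """Maximum L_filt and C_filt according to (5) and (6) for an ohmic load."""
    i_out = p_out / v_out                  # nominal rms output current
    omega = 2.0 * math.pi * f_out_max
    l_max = k_v * v_out / (omega * i_out)  # from (5)
    c_max = k_i * i_out / (omega * v_out)  # from (6)
    return l_max, c_max


def corner_frequency(l_filt, c_filt):
    """Corner frequency of a single-stage LC filter."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_filt * c_filt))


if __name__ == "__main__":
    l_max, c_max = filter_limits(230.0, 10e3, 100e3, k_v=0.15, k_i=0.30)
    print(f"L_filt,max = {l_max * 1e6:.2f} uH")
    print(f"C_filt,max = {c_max * 1e9:.1f} nF")
    print(f"f_c,min    = {corner_frequency(l_max, c_max) / 1e3:.0f} kHz")
```

With these inputs the limits come out at about 1.26 µH and 90 nF, and the resulting corner frequency of about 471 kHz agrees with the stated 472 kHz to within rounding of the component values.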

From Fig. 3 (b) follows that a valid DS only results for \( n_{SC} \geq 3 \), which again proves that for a given \( f_{sw,eff} \) a higher \( n_{SC} \) is beneficial in terms of filtering, i.e., that for \( f_{sw,eff} = 4.8 \text{MHz} \), a single 2L (\( n_{SC} = 1 \)) or three-level
(\( n_{SC} = 2 \)) SC could not achieve sufficient output voltage quality. As we show later, for the given specifications and the available power semiconductors, only designs with \( M \geq 3 \) and \( N \geq 2 \) (\( n_{SC} \geq 4 \)) are feasible.

C. Quantitative Performance Evaluation

To ultimately determine the best-suited combination of \( N \) parallel-interleaved \( M \)-level series-interleaved branches, a comprehensive simulation model is used to estimate the occurring converter losses. Due to the arbitrary output voltage and current waveform, generally, Hard Switching (HSW) losses occur. Given the relatively high expected device switching frequency \( f_{sw} \), only GaN power semiconductors can be reasonably utilized in this application, since both Silicon-Carbide (SiC) and Si devices have too high specific switching losses (due to the output charge \( Q_{\text{oss}} \) and/or the reverse recovery charge \( Q_{\text{rr}} \)). Generally, GaN devices with a small die area and therefore low \( Q_{\text{oss}} \) are preferred as they have lower Zero-Current Switching (ZCS) and HSW losses. Furthermore, the device must be available in a package that allows adequate heat dissipation, since despite the series- and parallel-interleaved operation, substantial losses per device are expected. To avoid heat dissipation through the Printed Circuit Board (PCB), e.g., by means of thermal vias [48] or advanced PCB technologies such as copper inlays or PCB integrated power devices [49], only top-side cooled devices are considered. Gallium Nitride High-Electron-Mobility Transistors (GaN HEMTs) are available either as High-Voltage (HV) devices with blocking voltage capabilities of 600 – 650 V or as Low-Voltage (LV) devices with blocking voltages of 100 – 200 V. The latter are only applicable for \( M \geq 7 \) voltage levels and therefore, 600 V devices are better suited for the analysis under the given specifications. 70 m\( \Omega \) 600 V GaN Gate Injection Transistors (GITs) (a special realization of GaN HEMTs [50]) turned out to be best suited among the devices currently available on the market in terms of switching performance and particularly regarding heat dissipation capabilities. 
As will be shown later in more detail, the overall losses are prominently dominated by the HSW losses. This again motivates that in general a low die area and therefore, a low output charge \( Q_{\text{oss}} \) of the semiconductor is preferred. Moreover, top-side cooled devices allow to decouple the electrical layout of the power commutation loop from the thermal design. Thereby, no thermal vias and/or copper inlays are required, which on the one hand limit the possibilities to design a power commutation loop with as low an inductance as possible and on the other hand would also limit the heat dissipation capabilities because the heat has to flow through the PCB. From the available top-side cooled 600 V/650 V devices, many are intended for high current applications, i.e., feature a low \( R_{\text{ds(on)}} \), thus a relatively large \( Q_{\text{oss}} \), and are therefore not well suited for this application. The selected 70 m\( \Omega \) devices have a \( Q_{\text{oss}} \) of only 41 nC (at 400 V) and at the same time offer a large metallic cooling pad (area of \( \approx 0.8 \text{cm}^2 \)). A similar device would be [51] with \( R_{\text{ds(on)}} = 67 \text{m}\Omega \) but it has a higher \( Q_{\text{oss}} \) of 47 nC (14 % more than [47]) and a smaller cooling pad of only \( \approx 0.16 \text{cm}^2 \) (80 % less than [47]).

With estimated losses of \( \approx 35 \text{W} \) per semiconductor (cf. Fig. 4), the power loss density is the limiting factor for the thermal design and therefore, a large cooling surface is preferred. There are other devices with a larger cooling surface, e.g., [52] with \( \approx 0.47 \text{cm}^2 \) (still 40 % less compared to [47]), but they have a \( Q_{\text{oss}} \) of 134 nC (\( 3\times \) more compared to [47]), which would significantly increase the already high switching losses (cf. loss breakdown in Fig. 11), and are thus not favored. The selected 70 m\( \Omega \) devices are therefore a reasonable trade-off between the considered aspects. In addition, these particular devices, by means of a so-called hybrid drain, prevent the phenomenon of increased dynamic on-state resistance (dynamic \( R_{\text{ds(on)}} \)) after application of a large drain-source voltage (current collapse) reported to occur in GaN switches [53]. An additional p-GaN region electrically connected to the drain injects holes during the off-state, which completely release the trapped electrons and thus eliminate the effect of the dynamic on-state resistance [54], [55]. This is a significant advantage for operation at high switching frequencies. Therefore, these devices are used for the loss evaluation [47]. It has to be mentioned, however, that there is still a significant temperature dependence of \( R_{\text{ds(on)}} \), which is considered in the loss model by utilizing the worst-case \( R_{\text{ds(on)}} \), i.e., the value at a junction temperature of 125°C.

Fig. 4 shows the result of a detailed loss analysis for different possible converter realizations with \( N \) parallel-interleaved \( M \)-level series-interleaved branches. Fig. 4 (a) shows the conduction losses, Fig. 4 (b) the switching losses and Fig. 4 (c) the resultant semiconductor efficiency for different combinations of \( M \) and \( N \) in the nominal operating point. In all cases, the output filter with \( L_{\text{filt, max}} \) and \( C_{\text{filt, max}} \) is considered. The reverse conduction losses during the dead time are included in the total conduction losses of Fig. 4 (a) assuming a fixed dead time of \( t_d = 24 \text{ns} \) (a more detailed explanation regarding the selection follows just below). Note that at least \( M_{\text{min}} = 3 \) voltage levels are required in each bridge-leg for \( V_{\text{dc}} = 800 \text{V} \) when using 600 V switches. Similarly, at least \( N_{\text{min}} = 2 \) parallel-interleaved branches are required to not exceed the maximum current rating of the utilized switches (if paralleling of multiple devices is not considered). Moreover, the efficiency target can only be achieved with \( N > 2 \).

1) Conduction Losses:

From Fig. 4 (a), it can be deduced that for any given \( N \), the conduction losses increase linearly with increasing \( M \), since at any time \((M - 1)\) series-connected transistors conduct the branch current. Similarly, for a given \( M \), they scale with \( 1/N^\alpha \) whereas ideally, \( \alpha = 1 \), meaning that the current perfectly distributes among the \( N \) branches. In practice, \( 0 < \alpha < 1 \) because only the fundamental component of the total inductor current (\( i_{\text{sum}} \)) splits equally among the \( N \) branches but the current ripple does not scale with \( 1/N \) for a fixed \( f_{\text{sw, eff}} \) and a fixed \( L_{\text{filt}} \). It does, however, reduce with increasing \( M \), because the voltage difference (and hence the voltage-time area) applied to the inductor reduces.

2) Switching Losses:

The switching losses in Fig. 4 (b) include calorimetrically measured HSW and Soft Switching (SSW) (Zero Voltage Switching (ZVS)) losses [10]. Depending on the switched current, full ZVS may not be possible within the given dead time \( t_d = 24 \text{ ns} \) in all cases (a minimum \( t_d \) is favorable to minimize Low-Frequency (LF) harmonics in \( V_{\text{out}} \), as will be seen in Section V-B), so Partial-Hard Switching (PHSW) occurs, where a certain residual charge on the output capacitor is shorted inside the transistor. PHSW losses are modeled based on the approach presented in [56] and are included in the calculation. Note that generally the selection of the dead time is subject to an optimization process to minimize the overall losses. A large dead time enables full ZVS for lower switched currents but at the same time has the disadvantage of increasing the reverse conduction (3rd quadrant) losses due to the increased voltage drop between source and drain. This is in particular a concern with GaN semiconductors, which do not have a physical body diode but are inherently symmetrical devices, such that during reverse conduction the voltage between source and drain equals the threshold voltage plus the absolute value of the negative gate-to-source voltage applied to keep the transistor safely in the off-state [57]. Ideally, the dead time is adapted based on the switched current in order to always achieve full ZVS without the disadvantage of keeping the complementary device in the off-state for an unnecessarily long time with increased reverse conduction losses [56]. A low fixed dead time, on the other hand, has the advantage of giving the lowest output voltage distortion (cf. Section V-B) but achieves full ZVS only for higher switched currents (\( \approx 4 \text{ A} \) in the given case), i.e., PHSW is more likely to occur. In the interest of maximum output voltage quality, a value as low as possible (but fixed) is selected for the dead time \( t_d = 24 \text{ ns} \) in the following analysis.

The HSW losses per device can be modeled as

\[
P_{\text{HSW}} = f_{\text{sw}} \left( Q_{\text{oss}} V_{\text{sw}} + \frac{1}{2} \frac{V_{\text{sw}}^2}{\frac{dV}{dt}} I_{\text{sw}} + \frac{1}{2} \frac{I_{\text{sw}}^2}{\frac{dI}{dt}} V_{\text{sw}} \right) \tag{7}
\]

with the transistor output charge \( Q_{\text{oss}} \) (which is in fact voltage dependent, i.e., \( Q_{\text{oss}}(V_{\text{sw}}) \)), the switched voltage \( V_{\text{sw}} \), the switched current \( I_{\text{sw}} \) and the voltage and current transition slopes \( \frac{dV}{dt} \) and \( \frac{dI}{dt} \). Only the effect of charging and discharging of the output capacitor \( C_{\text{oss}} \) as well as the \( V - I \) overlap during the turn-on transition (turn-off transition assumed lossless) is considered in (7) [58]. According to [10], the term in (7) that goes quadratically with \( I_{\text{sw}} \) can be neglected unless very high currents are switched \((\frac{dI}{dt} \rightarrow \infty)\). In the considered device, \( Q_{\text{oss}}(V_{\text{sw}}) \) scales approximately linearly with \( V_{\text{sw}} \) and therefore, for a given \( I_{\text{sw}} \) the HSW losses are expected to scale quadratically with \( V_{\text{sw}} \) under the simplified assumption of a constant \( \frac{dV}{dt} \) independent of the switched voltage and current. For a given \( V_{\text{sw}} \), however, a linear relation between the HSW losses and \( I_{\text{sw}} \) is expected on top of a certain loss offset (ZCS losses, \( Q_{\text{oss}} V_{\text{sw}} \) term). In a series- and parallel-interleaved converter, \( V_{\text{sw}} \) reduces with increasing \( M \) and similarly, the fundamental component of \( I_{\text{sw}} \) reduces linearly with \( N \). For a fixed \( f_{\text{sw,eff}} \) and a fixed \( L_{\text{filt}} \) the current ripple, however, does not scale with \( N \). Therefore, with increasing \( N \) (and fixed \( M \)), i.e., decreasing fundamental component of \( I_{\text{sw}} \) but constant current ripple, the HSW losses are expected to scale with \( 1/N^\beta \) \((\beta > 1)\). 
For low switched currents, that is for high \( N \), the ZCS and/or PHSW and/or SSW losses dominate and the simplified scaling with \( 1/N^\beta \) is not valid anymore as seen in Fig. 4 (b).
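As a rough numerical illustration of (7), the following Python sketch evaluates the three loss terms for one device. The function name `hsw_loss` and the slope value `DV_DT` are assumptions made for this example only. The output charge of 41 nC at 400 V and the device switching frequency of 800 kHz are the values quoted elsewhere in this paper, and a constant \( Q_{\text{oss}} \) at the switched voltage is assumed.

```python
def hsw_loss(f_sw, q_oss, v_sw, i_sw, dv_dt, di_dt=float("inf")):
    """Per-device hard-switching loss in watts, following the structure of (7)."""
    e_zcs = q_oss * v_sw                  # output-charge term (loss offset at zero current)
    e_dv = 0.5 * v_sw**2 / dv_dt * i_sw   # V-I overlap during the voltage transition
    e_di = 0.5 * i_sw**2 / di_dt * v_sw   # current-transition term, zero for dI/dt -> infinity
    return f_sw * (e_zcs + e_dv + e_di)


F_SW = 800e3    # device switching frequency in Hz (from the paper)
Q_OSS = 41e-9   # output charge in C at 400 V (from the paper)
V_SW = 400.0    # switched voltage in V for M = 3 and V_dc = 800 V
DV_DT = 100e9   # voltage slope in V/s, hypothetical value for illustration

p_zcs = hsw_loss(F_SW, Q_OSS, V_SW, 0.0, DV_DT)    # loss offset at zero current
p_10a = hsw_loss(F_SW, Q_OSS, V_SW, 10.0, DV_DT)   # offset plus linear current term

print(f"ZCS offset: {p_zcs:.2f} W, at 10 A: {p_10a:.2f} W")
```

With these inputs, the \( Q_{\text{oss}} V_{\text{sw}} \) offset alone amounts to roughly 13 W per device, which illustrates why a low output charge is emphasized in the device selection.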

With increasing \( M \) (and fixed \( N \)) and therefore decreasing \( V_{\text{sw}} \), a loss scaling with \( 1/(M-1)^\gamma \) and \( \gamma = 2 \) would be forecasted using (7). Yet, Fig. 4 (b) rather indicates a linear relation, i.e., \( \gamma = 1 \), which has two reasons:

i) Calorimetric HSW loss measurements confirm the quadratic dependence on \( V_{\text{sw}} \) only for low currents \((I_{\text{sw}} < 8 \text{ A})\), but for higher \( I_{\text{sw}} \) the losses scale linearly with \( V_{\text{sw}} \).

ii) For a given \( N \), the hard switched current \( I_{\text{sw}} \) increases with increasing \( M \) due to the accompanying lower current ripple. This accordingly leads to larger HSW losses and counteracts the expected quadratic decrease.

3) Semiconductor Efficiency:

Finally, Fig. 4 (c) shows the resultant expected semiconductor efficiency \( \eta_{\text{semi}} \) considering the total forward and reverse conduction, HSW, PHSW and SSW losses, i.e.,

\[
\eta_{\text{semi}} = \frac{P_{\text{out}}}{P_{\text{out}} + P_{\text{cond,tot}} + P_{\text{sw,tot}}} \tag{8}
\]

in the nominal operating point \((P_{\text{out}} = 10 \text{ kW})\). As expected from the loss scaling, \( \eta_{\text{semi}} \) improves with increasing \( N \), whereas for a given $N$ it generally decreases with increasing $M$ as the conduction losses start to dominate. With very high $M$ and $N$, efficiencies up to 98% are theoretically possible, but come at the expense of a very complex design. Fortunately, the efficiency target is already reached with lower $M$ and $N$. It has to be kept in mind that the presented calculation only considers one specific switch. For $M \geq 7$, LV 200 V GaN HEMTs could be used; however, as shown in [32], the area-specific on-state resistance $r_{ds,on}$ (in $\Omega \cdot \text{m}^2$ or $\Omega \cdot \text{mm}^2$) of GaN devices roughly scales linearly with the required Blocking Voltage from Drain to Source (BVDS). Therefore, the penalty of utilizing a high BVDS GaN device at a lower voltage is not as high as for Si devices, whose area-specific on-state resistance scales roughly with BVDS$^{2 \ldots 2.5}$.
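The efficiency definition in (8) can be checked against the loss figures quoted in this paper with a short Python sketch. The helper names and the split of the total semiconductor losses into conduction and switching parts are hypothetical; only the total of about 420 W (12 devices at roughly 35 W each) and the 10 kW output power are taken from the text.

```python
def semiconductor_efficiency(p_out, p_cond_tot, p_sw_tot):
    """Semiconductor efficiency according to (8)."""
    return p_out / (p_out + p_cond_tot + p_sw_tot)


def allowed_semiconductor_loss(p_out, eta_target):
    """Total semiconductor loss in watts that still meets a target efficiency."""
    return p_out * (1.0 / eta_target - 1.0)


P_OUT = 10e3        # nominal output power in W (from the paper)
N_DEVICES = 12      # transistors in the 3L3 converter (from the paper)
P_COND_TOT = 60.0   # hypothetical conduction share of the ~420 W total
P_SW_TOT = 360.0    # hypothetical switching share of the ~420 W total

eta = semiconductor_efficiency(P_OUT, P_COND_TOT, P_SW_TOT)
budget = allowed_semiconductor_loss(P_OUT, 0.95)

print(f"eta_semi = {100 * eta:.1f} %")
print(f"loss budget at 95 %: {budget:.0f} W total, {budget / N_DEVICES:.1f} W per device")
```

Under these assumptions the result is close to 96 %, and the 95 % target corresponds to a budget of roughly 526 W for all semiconductors combined.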

D. Design Selection

To choose a suitable design, not only the losses but also the volume, particularly of the FCs and the branch inductors, as well as the overall design complexity (e.g., placement of Gate Drivers (GDs) for each of the $2 \cdot n_{SC}$ switches, layout of the commutation loop in each SC, routing of the signals, etc.) must be considered.

1) Flying Capacitor Volume:
The FC volume is approximated using a polynomial fit of the capacitance density vs. rated voltage from commercially available ceramic capacitors. As shown, e.g., in [40], the required capacitance is found as a function of the peak switched current and the desired absolute voltage ripple $\Delta V_{FC}$ (5 V peak-to-peak in our case) as well as the frequency $f_{LPP} = f_{sw,eff}/N$ at which the FCs are charged and discharged (equal to the current ripple frequency in each branch). The peak switched current reduces slightly with increasing $M$ for a given $N$ thanks to the smaller current ripple and therefore, the required capacitance per FC reduces. By adding an additional voltage level with series-interleaving ($M = M + 1$), the FC volume automatically increases, since on top of the already present FCs of the $M$-level series-interleaved bridge leg now an additional FC is placed. While the voltage rating of each additional capacitor is smaller, the overall FC volume still scales linearly with $M$. However, with increasing $N$, for a given $M$, the volume scales even quadratically for three reasons:

i) Every branch needs the same number of FCs at the same voltage levels (volume scaling linear with $N$),

ii) Due to the fixed $f_{sw,eff}$, $f_{LPP}$ reduces with $1/N$. Thus the required capacitance and therefore the required FC volume increases linearly with $N$.

iii) The peak switched current, however, does not linearly reduce with increasing $N$, which would lead to a scaling of the required capacitance and FC volume with $1/N$, but due to the constant current ripple shows a less pronounced dependence on $N$, as explained earlier.

2) Branch Inductor Volume:
The volume of each individual branch inductor $L_{br}$ scales roughly with $k_{VL} = L_{br} \cdot I_{L, pk} \cdot I_{L, rms}$, where $k_{VL}$ is an indicator for the stored energy in $L_{br}$. With a fixed value for $L_{filt}$, the branch inductance scales linearly with $N$ and is independent of $M$. The branch inductor rms current $I_{L, rms}$ does not significantly reduce with increasing $M$ (i.e., lower current ripple), but reduces approximately with $1/N$, since it is mainly defined by the fundamental component, which scales with $1/N$. The branch inductor peak current $I_{L, pk}$ slightly reduces with increasing $M$ due to the smaller current ripple and except for high $N$ ($N > 6$) reduces with roughly $1/N$ due to the smaller fundamental component. For $N > 6$, the ripple current dominates and therefore, $I_{L, pk}$ does not reduce any further. All in all, the total volume of all $N$ required branch inductors slightly reduces with increasing $M$ (more pronounced for high $N$) and increases with increasing $N$ (less pronounced for high $M$).

3) Three-Level Triple-Interleaved (3L3) Converter:

From the $\eta_{semi} = 95\%$ contour in Fig. 4 (c), it can be seen that almost all designs with $N > 2$ (exception: $M = 9$, $N = 3$) fulfill the efficiency criterion. It should be noted, however, that additional losses in the converter, e.g., in the inductors, cause the system efficiency $\eta$ to be smaller than $\eta_{semi}$. Therefore, a design with $\eta_{semi} > 95\%$ has to be chosen. After a comprehensive efficiency, volume, and design complexity trade-off, the solution with $M = N = 3$, i.e., a Three-Level Triple-Interleaved (3L3) converter topology, is finally selected (highlighted in Fig. 4). It is expected to achieve 96% semiconductor efficiency, allows for a very power-dense system realization and has manageable design complexity. The topology is depicted in Fig. 5. With $n_{SC} = 6$ switching cells, each transistor has a device switching frequency of $f_{sw} = 800$ kHz to finally achieve $f_{sw,eff} = 4.8$ MHz at the summation node. The triple-interleaved design advantageously offers the flexibility to reconfigure the converter for a three-phase output (with 1/3 power rating in each phase compared to the total single-phase output power) by placing three individual filter capacitors with a capacitance of $C_{filt}/3$ after each branch inductor. Therefore, the converter can be used, for example, as a high-BW drive system with sine filter. Note that a three-level FCC has also been identified in [27] as best-suited for a tracking power supply of a linear RF PA.
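The frequency multiplication obtained by combining series- and parallel-interleaving can be sketched in a few lines of Python. The function name `interleaving_figures` is hypothetical, and the relation $n_{SC} = N \cdot (M - 1)$ is the usual flying capacitor cell count, assumed here because it reproduces the six switching cells quoted above.

```python
def interleaving_figures(m_levels, n_branches, f_sw):
    """Cell count and effective frequencies of N interleaved M-level FC branches."""
    cells_per_branch = m_levels - 1            # series-interleaved cells in one branch
    n_sc = n_branches * cells_per_branch       # total number of switching cells
    return {
        "n_sc": n_sc,
        "n_switches": 2 * n_sc,                # two transistors per switching cell
        "f_branch": cells_per_branch * f_sw,   # ripple frequency in each branch inductor
        "f_sw_eff": n_sc * f_sw,               # effective frequency at the summation node
    }


fig_3l3 = interleaving_figures(m_levels=3, n_branches=3, f_sw=800e3)
print(fig_3l3)
```

For the 3L3 design this gives 12 transistors, a branch ripple frequency of 1.6 MHz and an effective switching frequency of 4.8 MHz, in line with the values stated in the text.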

III. BRANCH INDUCTOR DESIGN

A. Core Material and Geometry

A vital part of the system are the branch inductors $L_{br1...3}$ that give the effective filter inductance $L_{filt} = 1.26$ $\mu$H (cf. Fig. 3). With $N = 3$ branches, each of the three inductors has an inductance of 3.8 μH and sees a triangular current ripple with a frequency $f_{\text{ripple}} = f_{\text{sw,eff}}/N = 1.6$ MHz. Since both the fundamental and the ripple components of the inductor current are relatively HF, a suitable core material with good HF properties, such as those offered by Manganese-Zinc (MnZn) and Nickel-Zinc (NiZn) ferrites, is required. It should further have a sufficient magnetic permeability $\mu_r$ at frequencies in the MHz range without generating excessive core losses. A comprehensive analysis of various available materials has shown that the 3F46 ferrite from Ferroxcube designed for a maximum operating frequency of $1 - 3$ MHz is the most suitable.

Fig. 5. Topology of the selected three-level triple-interleaved flying capacitor converter (3L3) realized with 70 m$\Omega$ 600 V GaN power switches switching at 800 kHz, resulting in an effective switching frequency $f_{sw,eff}$ of 4.8 MHz.

A pot core geometry that covers the whole winding minimizes the external magnetic stray field around the inductor (magnetic shielding) provided the air gap is only in the center pillar. This is a very important benefit, since it allows to encapsulate the inductor in a metallic housing to facilitate the heat dissipation without generating excessive ac losses due to the proximity effect, as it would be the case for other core shapes, e.g., E-cores, where the winding heads are not covered by the magnetic material. The required core cross-sectional area $A_{Fe}$ is selected based on the maximum allowed peak magnetic flux density $B$ (single-sided amplitude), which necessarily needs to be below the saturation flux density $B_{sat,3F46} \approx 430$ mT @ 100°C, but typically is selected considerably smaller to limit core losses. An empirically determined value $B = 250$ mT < $B_{sat,3F46}$ is used in this case. In a simplified assumption, $B$ can be thought of as being composed of two components:

i) a current-impressed fundamental component $B_0$ and  
ii) a voltage-impressed ripple component $B_{HF}$.

Their respective single-sided amplitudes $B_0$ and $B_{HF}$ (peak values) are expressed as

$$
B_0 = \frac{L_{br} \cdot i_{L,0}}{N_t \cdot A_{Fe}} \tag{9}
$$

$$
B_{HF} = \frac{V_{dc}}{8 \cdot (M - 1) \cdot N_t \cdot f_{\text{ripple}} \cdot A_{Fe}} \tag{10}
$$

with the branch inductance $L_{br}$, the peak fundamental branch inductor current $i_{L,0}$ (equal distribution of the sum of the output current $i_{out}$ and the filter capacitor current $i_C$ between the $N$ branches; $i_{L,0} = 21$ A in the nominal operating point for $N = 3$) and the number of turns $N_t$. The worst-case current ripple occurring for duty-cycles $D = 0.5 / (M - 1)$ (and odd multiples thereof) [29] is assumed in (10). The remaining two degrees of freedom are $N_t$ and $A_{Fe}$, which must be chosen to limit the peak total flux density $B \in \left[ \max \left\{ B_0, B_{HF} \right\}, B_0 + B_{HF} \right]$ accordingly. The worst-case $B = B_0 + B_{HF}$ arises when the maximum current ripple occurs at the same time as the peak fundamental current. Due to the arbitrary load phase angle $\varphi$, no a priori statement regarding the location of the worst-case current ripple with respect to the peak fundamental current can be made, hence the worst-case $B = B_0 + B_{HF}$ has to be considered.

Using the well-known area product $A_{Fe} A_w$ (winding window area $A_w$) [59], a P26/16 core with $A_{Fe} = 87$ mm² is chosen from all cores of the selected material and shape (core dimensions indicated in Fig. 6 (a.i)-(a.ii)). With (9) and (10), the maximum allowed flux density $B_{max} = 250$ mT and the now known $A_{Fe}$, the minimum required number of turns is found as

$$
N_{t,\text{min}} = \left\lceil \frac{1}{B_{max} A_{Fe}} \left( L_{br} \, i_{L,0} + \frac{V_{dc}}{8(M - 1)f_{\text{ripple}}} \right) \right\rceil = 6. \tag{11}
$$

This finally results in $B_0 = 155$ mT and $B_{HF} = 60$ mT and therefore a worst-case $B = 215$ mT.
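The sizing steps of (9) to (11) can be reproduced with the following Python sketch. It assumes that the inductance entering (9) and (11) is the branch inductance of 3.8 μH and that the flux density limit is 250 mT; the function names are hypothetical, and all numerical inputs are the values quoted in this section.

```python
import math


def min_turns(l_br, i_l0, v_dc, m_levels, f_ripple, a_fe, b_max):
    """Minimum number of turns according to (11), rounded up to an integer."""
    flux_linkage = l_br * i_l0 + v_dc / (8 * (m_levels - 1) * f_ripple)
    return math.ceil(flux_linkage / (b_max * a_fe))


def flux_components(l_br, i_l0, v_dc, m_levels, f_ripple, a_fe, n_t):
    """Fundamental and ripple flux density amplitudes according to (9) and (10)."""
    b_0 = l_br * i_l0 / (n_t * a_fe)
    b_hf = v_dc / (8 * (m_levels - 1) * n_t * f_ripple * a_fe)
    return b_0, b_hf


L_BR = 3.8e-6      # branch inductance in H (assumed to be the inductance in (9))
I_L0 = 21.0        # peak fundamental branch current in A
V_DC = 800.0       # dc link voltage in V
M_LEVELS = 3       # voltage levels per branch
F_RIPPLE = 1.6e6   # branch current ripple frequency in Hz
A_FE = 87e-6       # core cross section of the P26/16 core in m^2
B_MAX = 0.25       # allowed peak flux density in T

n_t = min_turns(L_BR, I_L0, V_DC, M_LEVELS, F_RIPPLE, A_FE, B_MAX)
b_0, b_hf = flux_components(L_BR, I_L0, V_DC, M_LEVELS, F_RIPPLE, A_FE, n_t)

print(f"N_t = {n_t}, B_0 = {1e3 * b_0:.0f} mT, B_HF = {1e3 * b_hf:.0f} mT")
```

With the rounded inputs used here, the sketch yields six turns and flux densities within a few millitesla of the values given in the text; the small deviation in $B_0$ is attributed to rounding of the inductance and current values.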

B. Winding Arrangement

TABLE II. Calculated (italic) and measured (upright) ac resistance at $f = 100\,\text{kHz}$ and $f = 1.6\,\text{MHz}$ with resulting ac winding losses. In all cases, the losses are calculated with the simulated inductor current spectrum at the nominal operating point.

<table>
<thead>
<tr>
<th>$R_{ac}$</th>
<th>$R_{ac,100k}$</th>
<th>$R_{ac,1.6M}$</th>
<th>$P_{100k}$</th>
<th>$P_{HF}$</th>
<th>$P_{tot}$</th>
</tr>
</thead>
<tbody>
<tr>
<td>m$\Omega$</td>
<td>m$\Omega$</td>
<td>m$\Omega$</td>
<td>W</td>
<td>W</td>
<td>W</td>
</tr>
<tr>
<td>Arr. 1</td>
<td>3.40</td>
<td>9.14</td>
<td>1.47</td>
<td>2.10</td>
<td>29.21</td>
</tr>
<tr>
<td>Arr. 2</td>
<td>6.20</td>
<td>4.53</td>
<td>0.33</td>
<td>1.04</td>
<td>6.63</td>
</tr>
<tr>
<td>$L_{br1}$</td>
<td>3.15</td>
<td>4.41</td>
<td>0.28</td>
<td>1.02</td>
<td>6.59</td>
</tr>
<tr>
<td>$L_{br2}$</td>
<td>3.46</td>
<td>5.84</td>
<td>0.30</td>
<td>1.28</td>
<td>6.84</td>
</tr>
<tr>
<td>$L_{br3}$</td>
<td>3.25</td>
<td>4.83</td>
<td>0.33</td>
<td>1.18</td>
<td>7.59</td>
</tr>
</tbody>
</table>

It was already shown in [10] that HF litz wire results in minimum losses for the given inductor current profile. In particular, the losses at fundamental frequency $f_{\text{out}}$ are substantially lower compared to, e.g., solid round or flat wire. The above-mentioned encapsulation of the inductor in the metallic housing has the disadvantage that it is more difficult to extract heat from the winding. While heat extraction is improved by potting the winding in the core with a thermally conductive compound (Bergquist TGF 3500LVO), still a relatively conservative maximum LF rms current density $S_{\text{rms}} = 7\,\text{A/mm}^2$ is selected to prevent substantial self-heating. With $I_{L,\text{rms}} = 17\,\text{A}$ (obtained from circuit simulation at the nominal operating point), a copper cross section $A_{Cu} \approx 2.5\,\text{mm}^2$ is required. A HF litz wire with $625 \times 71\,\mu\text{m}$ strands is selected (twisted with $5 \times 5 \times 25$ bundles/strands).
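The litz wire selection can be cross-checked with a short Python sketch. It interprets the current density limit as 7 A/mm², which is an assumption consistent with the quoted copper cross section of about 2.5 mm²; the function names are hypothetical, and the strand count and diameter are those given in the text.

```python
import math


def required_copper_area_mm2(i_rms, s_rms_max):
    """Copper cross section in mm^2 needed to stay below a current density limit."""
    return i_rms / s_rms_max


def litz_copper_area_mm2(n_strands, d_strand_um):
    """Total copper cross section in mm^2 of a litz wire with round strands."""
    r_mm = d_strand_um * 1e-3 / 2.0   # strand radius in mm
    return n_strands * math.pi * r_mm**2


I_L_RMS = 17.0    # branch inductor rms current in A
S_RMS_MAX = 7.0   # current density limit in A/mm^2 (assumed unit)
N_STRANDS = 5 * 5 * 25
D_STRAND_UM = 71.0

a_req = required_copper_area_mm2(I_L_RMS, S_RMS_MAX)
a_litz = litz_copper_area_mm2(N_STRANDS, D_STRAND_UM)

print(f"required: {a_req:.2f} mm^2, litz wire: {a_litz:.2f} mm^2")
print(f"resulting current density: {I_L_RMS / a_litz:.2f} A/mm^2")
```

Under these assumptions the selected wire provides slightly more copper than required, so the resulting rms current density stays just below the chosen limit.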

C. AC Winding Losses

The air gap is calculated to obtain the required inductance ($\delta = 1.6\,\text{mm}$ in this case) and as mentioned earlier, is realized only in the center pillar of the pot core to avoid external stray fields. Inside the winding window, however, there is a strong magnetic field near the gap. Consequently, this region must not be filled with turns. Fig. 6 motivates this by comparing two winding arrangements in the P26/16 core that both lead to the desired $L_{br}$. In Fig. 6 (a.i), the $N_t = 6$ turns are placed uniformly distributed starting from the bottom (Arrangement 1), whereas in Fig. 6 (a.ii) the area in the vicinity of the air gap is kept empty (Arrangement 2). In both cases, the magnetic field $H$ in the winding window is obtained with Finite Element Method (FEM) simulations (also indicated in Fig. 6 (a.i)-(a.ii)). Thereby, a homogeneous current density in each turn is assumed (ideal HF litz wire). The ac resistance $R_{ac}(f)$ versus frequency with contributions from the skin effect and proximity effect (internal and external) is then analytically calculated as proposed in [60] using the mean $H_{\text{rms}}^2$ in each of the turns. Of particular interest are the values $R_{ac,100k}$ at $f_{\text{out,max}} = 100\,\text{kHz}$ and $R_{ac,1.6M}$ at $f_{\text{iLpp}} = 1.6\,\text{MHz}$. As visualized in Fig. 6 (b), compared to arrangement 1, arrangement 2 reduces $R_{ac,100k}$ by a factor of 2 and $R_{ac,1.6M}$ by almost a factor of 4.5 for the same dc resistance and is therefore clearly preferred.

Finally, Fig. 6 (c) compares the measured $R_{ac}$ of the three constructed prototypes according to winding arrangement 2 with the analytically calculated value and reveals very close matching. The measurements are obtained with a precision impedance analyzer (Agilent Technologies 4294A). Moreover, Fig. 6 (c) includes a picture of one inductor winding (Arrangement 2) where the empty space on the inner side is visible.

The resulting ac winding losses are estimated with the simulated inductor current spectrum for the nominal operating point and with $R_{ac}(f)$ (calculated and/or measured). Table II summarizes the results for the calculated $R_{ac}$ (italic) of arrangements 1 and 2 and the measured $R_{ac}$ (upright) of the three inductors and the resulting ac winding losses. The total losses $P_{\text{tot}}$ are split into a fundamental component $P_{100k}$ and a HF component $P_{HF}$. The latter contains everything except the fundamental component, i.e., all frequency components above $100\,\text{kHz}$. There are slight variations between the three inductors due to tolerances in the manufacturing, but in general the measured $R_{ac}$ conforms very well with the model. It can be seen that the ac winding losses are mainly determined by the HF components of the current ripple. The calculated $R_{ac}$ in Fig. 6 (c) is further decomposed into contributions from the skin-, internal and external proximity effect. The latter clearly dominates $R_{ac}$ for frequencies above several hundred kilohertz and could be reduced by using a litz wire with a smaller strand diameter, e.g., $40\,\mu\text{m}$ instead of $71\,\mu\text{m}$, and more strands to achieve a similar copper cross section area. This would lower $P_{HF}$ (approximately by a factor of $3 - 4$ predicted with calculations, i.e., by around $16 - 18\,\text{W}$ for all three inductors combined), however, the filling factor is also reduced and the placement of the winding according to arrangement 2 is potentially not possible anymore. In addition, the combined ac winding losses of all three branch inductors are below $25\,\text{W}$, which is a very insignificant part of the total converter losses, as will be shown later.

D. Core Losses and Properties of the Realized Inductors

The realized inductor prototypes are potted in metallic cooling enclosures with the same compound used to pot the winding in the core. The final prototypes have a Self-Resonance Frequency (SRF) of approximately $23\,\text{MHz}$, which is significantly above $f_{\text{iLpp}}$. The saturation current is determined with measurements and exceeds $60\,\text{A}$. The worst-case core losses in the final inductors are roughly estimated with electrical loss measurements with sinusoidal excitation in a series-resonant configuration as proposed in [61]. Due to the highly non-uniform flux distribution in pot cores, the P26/16 core is also used for the loss measurements in order to have conditions as similar as possible to those in the final operation. Pure ac excitation at $100\,\text{kHz}$ with a flux density amplitude of $155\,\text{mT}$ (corresponding to $B_0$) results in losses of $\approx 1.1\,\text{W}$. Similarly, a HF excitation with a sinusoidal flux density profile with $f_{\text{iLpp}} = 1.6\,\text{MHz}$ and $B_{HF,\text{avg}} = 49\,\text{mT}$ (amplitude of the first harmonic in the triangular flux density profile with $B_{HF} = 60\,\text{mT}$, cf. (10)) on top of a dc bias $B_{dc} = [0, 77, 100, 125, 155]\,\text{mT}$ results in losses of $[1.53, 2.45, 3.01, 4.17, 4.34]\,\text{W}$. Therefore, the worst-case weighted core losses over one fundamental period are $\approx 4.2\,\text{W}$ (sum of the $100\,\text{kHz}$ loss component and the weighted HF ripple components for different dc bias flux densities occurring over one fundamental period). Here, the maximum HF flux density ripple is always assumed. In practice, depending on the duty-cycle, the HF flux density ripple and therefore the associated core losses are smaller, so core losses of $\approx 4.2\,\text{W}$ per inductor are a worst-case approximation. Due to the large surface and very good thermal connection of the core to the cooling housing, the core losses can be dissipated very well in any case. 
Furthermore, similar to the ac winding losses, compared to the semiconductor losses, the core losses do not significantly influence the system efficiency.

IV. HARDWARE DEMONSTRATOR

This section describes the development of a 3L3 converter hardware prototype, which is used to demonstrate the full func-

for almost 40% of the total volume. The capacitor share is composed of the dc link capacitors ($\approx 14 \, \mu F$ effective capacitance between DC+ and DC- at $V_{\text{dc}} = 800 \, \text{V}$, split as $2 \times 28 \, \mu F$ at 400 V at the split dc link), the FCs ($\approx 6 \, \mu F$ effective capacitance at $V_{\text{FC}} = 400 \, \text{V}$) and the output filter capacitance ($3 \times 33 \, \text{nF}$ C0G).

A. Power Stage Layout Considerations

To minimize the $V - I$ overlap losses, the switching transitions must be as fast as possible (high $dv/dt$, cf. (7)), demanding careful layout of the GDs and the power commutation loop. Fig. 9 (a) shows the bottom view of the power stage PCB with the top-side cooled switches. The layout of the three branches is fully modular and could easily be extended for any $N$. Fig. 9 (b) highlights, as an example, the coplanar commutation loop layout for the outer switching cell of the third branch ($T_{3\text{HL}}$ and $T_{3\text{L}}$). The layout is identical for all six switching cells. The copper plane on the inner layer 6 (distance of 150 $\mu m$ to the bottom layer where the switches are mounted, cf. Fig. 9 (d)) is on “DC-” potential and provides a coplanar return path for the commutation current (indicated with the blue and cyan arrow for current flow on the bottom and the inner layer 6, respectively). Measurements reveal a contribution of the PCB layout to the power loop inductance of $<2.2 \, \text{nH}$ for the outer switching cells and of $<1.5 \, \text{nH}$ for the inner switching cells (copper plates that cover the entire semiconductor footprint are used instead of the transistors to account solely for the PCB contribution). In the latter case, the inductance is lower because the source of $T_{4\text{HL}}$ and the drain of $T_{4\text{L}}$ are directly connected at the respective switch-nodes “SWi” ($i \in \{1, 2, 3\}$). Compared to the device internal inductance of 3.5 $\text{nH}$ between drain and source, the layout does not significantly contribute to the overall power loop inductance.

Fig. 9 (c) shows the GD layout of $T_{3\text{L}}$ (identical layout for all 12 switches). In contrast to Metal-Oxide-Semiconductor Field-Effect Transistors (MOSFETs) with an isolated gate, the utilized GaN GITs have a diode between gate and source, which after turn-on requires a small steady-state gate current $I_{g,ss}$ to flow [57]. Therefore, a GD circuit featuring an ac-coupled high current turn-on path and in parallel a dc-coupled path for the small $I_{g,ss}$ based on the one proposed in [62] is utilized. In this case, the GD has a bipolar supply to ensure a constant negative bias $V_{GD-} = -5 \, \text{V}$ in the off-state to prevent parasitic turn-on. Similar to the power loop, the gate loop is realized coplanar to minimize the loop inductance (return path on inner layer 1). Measurements reveal a turn-on and turn-off gate loop inductance of $L_{g,\text{on}} \approx 3 \, \text{nH}$ and $L_{g,\text{off}} \approx 4 \, \text{nH}$. Since the GD Integrated Circuit (IC) sink pin (for the turn-off) is located further away from the GaN transistor, it makes sense that $L_{g,\text{off}} > L_{g,\text{on}}$. Compared to the measured device internal gate to source loop inductance of $L_{G,\text{int}} \approx 7 \, \text{nH}$, the external contribution is minor. Strict care was taken to prevent overlapping copper of jumping potentials (e.g. the source nodes) and logic nets (e.g. PWM, enable or measurement signals coming from/to the FPGA), which would introduce significant CM current over the parasitic



B. Cooling System

An anticipated semiconductor efficiency of $\eta_{\text{semi}} \approx 96\%$ at a power level of $P_{\text{out}} = 10\ \text{kW}$ (nominal ohmic load) results in power losses of roughly $P_{\text{loss,device}} = 35\ \text{W}$ in each switching transistor of the 3L3 design (total semiconductor losses $\approx 420\ \text{W}$). The resultant semiconductor power loss density of $\approx 42\ \text{W/cm}^2$ demands liquid cooling to efficiently dissipate the heat without excessive temperature rise [63]. Along with reliability concerns, a too high junction temperature $\vartheta_j$ of the GaN transistor substantially increases the conduction losses because of the very pronounced temperature dependence of the on-state resistance $R_{\text{ds(on)}}$ [53]. A custom-milled aluminum liquid cooling coldplate to cool the 12 power transistors and the three branch inductors is designed. Fig. 10 shows a semi-transparent bottom view of the coldplate and the power PCB. The path for the coolant (H₂O in this case) from the inlet (cold side, blue) to the outlet (hot side, red) through the single meandering channel as well as the positions of the components to be cooled are highlighted. A detailed side view of the assembly is shown at the bottom of Fig. 10. The rectangular water channel is 15 mm wide to match the width of the power transistor’s heat pad. Due to the coldplate’s low profile of only 3 mm (2 mm channel height), the 1.5 mm thick cover plate is glued to the coldplate to ensure a fully sealed assembly. An electrically insulating thermal gap pad (indicated with pink color) attaches the transistors to the coldplate.

With the resulting volume flow of roughly $\dot{V} = 2\ \text{l/min}$ and the measured water inlet and outlet temperatures $\vartheta_{\text{H₂O,in}}$ and $\vartheta_{\text{H₂O,out}}$, the losses can be estimated from the basic heat transfer equation with

$$P_{\text{loss tot}} \approx \dot{Q} = \rho_{\text{H₂O}} \cdot \dot{V} \cdot c_{\text{H₂O}} \cdot (\vartheta_{\text{H₂O,out}} - \vartheta_{\text{H₂O,in}}) \quad (12)$$

where $c_{\text{H₂O}} = 4181\ \text{J/(kg·K)}$ is the specific heat capacity and $\rho_{\text{H₂O}} = 997\ \text{kg/m}^3$ the density of water at a temperature of 25°C, respectively. Note that (12) neglects heat dissipation from the surface of the coldplate, e.g., through natural convection or radiation. Therefore, the actual total losses are slightly higher than the estimated ones. With (12), the calculated total losses of $\approx 420\ \text{W}$ would only lead to a temperature difference of 3 K, which is difficult to measure accurately. Nevertheless, the simple estimation is a good sanity check for the precision electrical loss measurements presented later (cf. Section V-A). Note that the coolant (H₂O) heats up while flowing from the inlet towards the outlet such that, e.g., underneath $T_{\text{L3}}$ and $T_{\text{L3}}$ the temperature is already higher than the inlet temperature $\vartheta_{\text{H₂O,in}}$. This could in principle lead to unequal semiconductor temperatures and therefore, to unbalanced losses (e.g., due to the strongly temperature-dependent on-state resistance in GaN devices). However, thanks to the relatively high volume flow and short channel length, with the given cooling solution the overall temperature difference between inlet and outlet is only roughly 3 K (at maximum losses of $\approx 420\ \text{W}$) and therefore, there is

The simplified setup of the liquid cooling solution for the 3L3 converter with indicated dimensions of the overall system ($123 \times 123\ \text{mm}$), the thickness of the coldplate ($4.5\ \text{mm}$) and the dimensions of the liquid cooling channel ($15 \times 2\ \text{mm}$). From the measured coolant inlet and outlet temperature and the volume flow of approximately $2\ \text{l/min}$, the total losses can be estimated.

The capacitance due to the high $dv/dt$ and potentially lead to distortions and/or malfunction of the digital circuit.


Fig. 9. (a) Power stage of the 3L3 with highlighted individual branches and corresponding components. (b) Detailed view of the coplanar power commutation loop (drain to source) in the outer FC cell of branch 3. (c) Detailed view of the coplanar Gate (G) to Kelvin Source (KS; equal to $GND_{GDS}$ potential) commutation loop of one switch ($T_{4H1}$) (turn-on path highlighted; equivalent for the turn-off path). (d) Layer stack of the 2.4 mm thick power PCB. Each copper layer has a thickness of $70\ \mu\text{m}$ (2 oz.).

Fig. 10. Simplified setup of the liquid cooling solution for the 3L3 converter with indicated dimensions of the overall system ($123 \times 123\ \text{mm}$), the thickness of the coldplate ($4.5\ \text{mm}$) and the dimensions of the liquid cooling channel ($15 \times 2\ \text{mm}$). From the measured coolant inlet and outlet temperature and the volume flow of approximately $2\ \text{l/min}$, the total losses can be estimated.

The path for the coolant (H₂O in this case) from the inlet (cold side, blue) to the output (hot side, red) through the single meandering channel as well as the positions of the components to be cooled are highlighted. A detailed side view of the assembly is shown at the bottom of Fig. 10. The rectangular water channel is 15 mm wide to match the width of the power transistor’s heat pad. Due to the coldplate’s low profile of only 3 mm (2 mm channel height), the 1.5 mm thick cover plate is glued to the coldplate to ensure a fully sealed assembly. An electrically insulating thermal gap pad (indicated with pink color) attaches the transistors to the coldplate.

With the resulting volume flow of roughly $\dot{V} = 2\ \text{l/min}$ and the measured water inlet and outlet temperatures $\vartheta_{\text{H₂O,in}}$ and $\vartheta_{\text{H₂O,out}}$, the losses can be estimated from the basic heat transfer equation with

$$P_{\text{loss tot}} \approx \dot{Q} = \rho_{\text{H₂O}} \cdot \dot{V} \cdot c_{\text{H₂O}} \cdot (\vartheta_{\text{H₂O,out}} - \vartheta_{\text{H₂O,in}}) \quad (12)$$

where $c_{\text{H₂O}} = 4181\ \text{J/(kg·K)}$ is the specific heat capacity and $\rho_{\text{H₂O}} = 997\ \text{kg/m}^3$ the density of water at a temperature of 25°C. Note that (12) neglects heat dissipation from the surface of the coldplate, e.g., through natural convection or radiation. Therefore, the actual total losses are slightly higher than the estimated ones. With (12), the calculated total losses of $\approx 420\ \text{W}$ lead to a temperature difference of only 3 K, which is difficult to measure accurately. Nevertheless, the simple estimation is a good sanity check for the precision electrical loss measurements presented later (cf. Section V-A). Note that the coolant (H₂O) heats up while flowing from the inlet towards the outlet such that, e.g., underneath $T_{\text{L3}}$ and $T_{\text{L3}}$ the temperature is already higher than the inlet temperature $\vartheta_{\text{H₂O,in}}$. This could in principle lead to unequal semiconductor temperatures and therefore to unbalanced losses (e.g., due to the strongly temperature-dependent on-state resistance in GaN devices). However, thanks to the relatively high volume flow and short channel length, the overall temperature difference between inlet and outlet is only roughly 3 K with the given cooling solution (at maximum losses of $\approx 420\ \text{W}$). Therefore, there is no significant uneven temperature distribution that could have an influence on the loss sharing and the symmetric current handling, as verified in Fig. 15. There are alternative cooling channel geometries that ensure a more symmetrical temperature distribution, i.e., multiple parallel channels, which is of particular importance for large heat sinks. There, it must be ensured that the volume flow is equal in all parallel channels, as otherwise the temperature distribution can be significantly non-uniform [63]. The geometry in Fig. 10 is mainly selected because it inherently ensures equal volume flow throughout the channel, allows for a straightforward channel design and still gives a small overall temperature difference thanks to the relatively small dimensions and the high volume flow.
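The calorimetric estimate of (12) can be reproduced with a few lines of Python. The following is a minimal sketch that uses only the values quoted in the text (12 switches with roughly 35 W each, 2 l/min, water properties at 25°C); the function and variable names are illustrative and not taken from the original work.

```python
# Water properties at 25 degC as quoted in the text.
RHO_H2O = 997.0   # density in kg/m^3
C_H2O = 4181.0    # specific heat capacity in J/(kg*K)


def coolant_heat_flow(flow_l_per_min, delta_t_k):
    """Heat flow in W carried away by the coolant according to (12)."""
    flow_m3_per_s = flow_l_per_min * 1e-3 / 60.0
    return RHO_H2O * flow_m3_per_s * C_H2O * delta_t_k


def coolant_temperature_rise(p_loss_w, flow_l_per_min):
    """Inlet-to-outlet temperature rise in K, i.e., (12) solved for the difference."""
    flow_m3_per_s = flow_l_per_min * 1e-3 / 60.0
    return p_loss_w / (RHO_H2O * flow_m3_per_s * C_H2O)


# Semiconductor loss budget from the text: 12 switches, roughly 35 W each.
n_switches = 12
p_loss_device = 35.0
p_loss_total = n_switches * p_loss_device

delta_t = coolant_temperature_rise(p_loss_total, 2.0)
print(f"total losses: {p_loss_total:.0f} W, temperature rise: {delta_t:.2f} K")
```

Under these assumptions the temperature rise evaluates to roughly 3 K, in line with the text. Evaluating `coolant_heat_flow(2.0, 0.1)` shows that a temperature measurement error of only 0.1 K already corresponds to roughly 14 W, which illustrates why the calorimetric value is used only as a sanity check.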

**C. Measurement Circuits and Control**

An FPGA is responsible for the PWM pattern generation, the processing of measurement signals, the supervision of voltages and currents and for the communication with a host computer. In this article, the 3L3 is operated without closed-loop control, but in a next step, suitable control schemes such as the ones proposed in [64] will be implemented. This has to be done directly in hardware on the FPGA.

### TABLE III. Available measurements on the 3L3 converter with their sampling rate \( f_s \), resolution \( B \) and Bandwidth (BW).

<table>
<thead>
<tr>
<th>Signal</th>
<th>\( f_s \)</th>
<th>\( B \)</th>
<th>BW</th>
</tr>
</thead>
<tbody>
<tr>
<td>Output Voltage \( V_{\text{out}} \)</td>
<td>125 MSPS</td>
<td>14 Bit</td>
<td>\( \approx 30\, \text{MHz} \)</td>
</tr>
<tr>
<td>Load Voltage \( V_{\text{load}} \)</td>
<td>125 MSPS</td>
<td>14 Bit</td>
<td>\( \approx 30\, \text{MHz} \)</td>
</tr>
<tr>
<td>Output Current \( i_{\text{out}} \)</td>
<td>125 MSPS</td>
<td>14 Bit</td>
<td>\( &gt; 30\, \text{MHz} \)</td>
</tr>
<tr>
<td>Summed Ind. Current \( i_{\text{sum}} \)</td>
<td>125 MSPS</td>
<td>14 Bit</td>
<td>\( &gt; 30\, \text{MHz} \)</td>
</tr>
<tr>
<td>Branch Currents \( i_{\text{b},1,2,3} \)</td>
<td>3.125 MSPS</td>
<td>14 Bit</td>
<td>\( \approx 1\, \text{MHz} \)</td>
</tr>
<tr>
<td>FC Voltages \( v_{\text{FC1,2,3}} \)</td>
<td>3.125 MSPS</td>
<td>14 Bit</td>
<td>\( \approx 100\, \text{kHz} \)</td>
</tr>
<tr>
<td>DC Link Voltages \( v_{\text{dcp},\text{dcm}} \)</td>
<td>3.125 MSPS</td>
<td>14 Bit</td>
<td>\( \approx 800\, \text{kHz} \)</td>
</tr>
</tbody>
</table>

V. PERFORMANCE VERIFICATION

This section presents experimental measurement results to verify the performance of the 3L3 hardware prototype. As mentioned above, the converter is operated in open-loop configuration.

### A. Efficiency and Losses

The system efficiency \( \eta \) is electrically measured for sinusoidal output voltages with a constant rms value \( V_{\text{out}} = 230\, \text{V} \) at different output power levels \( P_{\text{out}} \in \{3.3\, \text{kW}, 5\, \text{kW}, 7.3\, \text{kW}, 8\, \text{kW}, 9\, \text{kW}, 10\, \text{kW}\} \) (purely ohmic load) and for different output frequencies \( f_{\text{out}} \in \{10\, \text{kHz}, 20\, \text{kHz}, 50\, \text{kHz}, 100\, \text{kHz}\} \). In all cases, the system is supplied with a split dc link composed of two 400 V dc voltage sources and a fixed dead time \( t_d = 24\, \text{ns} \) is set. The load resistor \( R_L \) is realized with multiple parallel low-inductive planar resistors (Ohmite TAP800). To minimize the parasitic inductance \( L_L \) of the connection between converter output and load, multiple separate load resistor assemblies (their number depending on \( P_{\text{out}} \)) are connected directly to the converter output. At maximum \( f_{\text{out}} \) and \( P_{\text{out}} \), the influence of \( L_L \) is the strongest, since at this operating point the inductive voltage drop becomes maximum and the load resistance minimum at the same time. This worst case corresponds to the nominal operating point, where \( L_L \approx 1.4\, \mu\text{H} \), and results in a phase-shift of roughly 10° between \( v_{\text{out}} \) and \( i_{\text{out}} \) (\( \cos(\varphi) \approx 0.985 \)).

The efficiency measurements are performed with a precision power analyzer (Yokogawa WT3000) and are depicted in Fig. 11 (a). The measured efficiency (dots) is compared with calculations (continuous lines) resulting from the loss model introduced in Section II-C, additionally including the branch inductor ac winding losses according to Section III-C. The target efficiency of 95% is achieved over a very wide range of \( P_{\text{out}} \) and \( f_{\text{out}} \). In the nominal operating point (marked with \( \star \)) the efficiency is 95.8%. The partial load efficiency is worse for higher \( f_{\text{out}} \) because of the capacitive current \( i_C \) in \( C_{\text{filt}} \), which depends solely on \( f_{\text{out}} \) and \( v_{\text{out}} \) and not on \( P_{\text{out}} \) (for a fixed output voltage). This current generates switching and conduction losses in the power stage. Furthermore, Fig. 11 (b)-(d) show detailed calculated loss breakdowns for \( f_{\text{out}} = 10\, \text{kHz} \), \( f_{\text{out}} = 50\, \text{kHz} \) and \( f_{\text{out}} = 100\, \text{kHz} \). In general, the measurements agree very well with the calculations and the absolute difference between the measured and calculated losses is < 25 W. Because the measurements give a better efficiency at low \( P_{\text{out}} \) than the calculations but a lower efficiency at high \( P_{\text{out}} \), it can be concluded that the model underestimates ohmic losses and slightly overestimates constant losses (independent of \( P_{\text{out}} \)) and/or losses scaling linearly with \( P_{\text{out}} \). More specifically, since the effect is more pronounced for high \( f_{\text{out}} \), the real system has more ac resistance in the power path than modeled. Potential sources for the ac resistance are, e.g., the primary conductor of the current transformers (two in total), PCB tracks carrying the load current (in particular the return connection from the summing node to the filter capacitor located next to the dc link) and contact resistance of soldering joints. Fig. 11 (d) clearly shows the loss “offset” for \( f_{\text{out}} = 100 \text{ kHz} \) compared to \( f_{\text{out}} = 10 \text{ kHz} \), even at low \( P_{\text{out}} \), due to the higher switching losses caused by the aforementioned capacitive current in \( C_{\text{filt}} \). The semiconductor loss distribution in the nominal operating point (marked with ★) is practically identical to the one in Fig. 4 for the \( M = N = 3 \) design.
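Two quantities used in this subsection, the load phase shift caused by the parasitic load inductance and the capacitive current in the filter capacitor, follow from elementary ac circuit relations. The sketch below evaluates both. It assumes a series R-L load whose resistance is taken as V²/P, and it takes the filter capacitance of 99 nF from the caption of Fig. 16; all names are illustrative.

```python
import math

V_OUT = 230.0     # rms output voltage in V (from the text)
P_NOM = 10e3      # nominal output power in W (from the text)
F_NOM = 100e3     # nominal output frequency in Hz (from the text)
L_LOAD = 1.4e-6   # parasitic load inductance L_L in H (from the text)
C_FILT = 99e-9    # filter capacitance in F (taken from the caption of Fig. 16)


def load_angle_deg(v_rms, p_out, f_out, l_load):
    """Phase angle of a series R-L load in degrees.

    Assumption: R = V^2 / P, which neglects the small reduction of the
    active power caused by the inductance itself.
    """
    r_load = v_rms ** 2 / p_out
    x_load = 2.0 * math.pi * f_out * l_load
    return math.degrees(math.atan2(x_load, r_load))


def capacitor_current_rms(v_rms, f_out, c_filt):
    """RMS current in the filter capacitor for a sinusoidal output voltage."""
    return 2.0 * math.pi * f_out * c_filt * v_rms


phi = load_angle_deg(V_OUT, P_NOM, F_NOM, L_LOAD)
power_factor = math.cos(math.radians(phi))
i_c_100k = capacitor_current_rms(V_OUT, 100e3, C_FILT)
i_c_10k = capacitor_current_rms(V_OUT, 10e3, C_FILT)
i_load_3k3 = 3.3e3 / V_OUT

print(f"load angle: {phi:.1f} deg, cos(phi): {power_factor:.3f}")
print(f"i_C at 100 kHz: {i_c_100k:.1f} A, at 10 kHz: {i_c_10k:.2f} A")
print(f"load current at 3.3 kW: {i_load_3k3:.1f} A")
```

With these inputs the load angle comes out slightly below 10°, consistent with the quoted power factor. The capacitor current at 100 kHz is roughly 14 A rms, which is about as large as the load current at 3.3 kW, whereas it is ten times smaller at 10 kHz. This illustrates why the partial load efficiency degrades with increasing output frequency.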

Furthermore, Fig. 11 (b) and (c) reveal that for \( f_{\text{out}} = 10 \text{ kHz} \) and \( 50 \text{ kHz} \) a better partial load efficiency (lower losses) could be achieved by implementing an adaptive dead time, which enables full ZVS also for lower switched currents. Since this would require an increase of \( t_d \) from its minimum value of 24 ns, which in turn would distort the output voltage, further measures would need to be implemented to compensate for this effect (cf. Section V-B).

Fig. 12 depicts the efficiency and a breakdown of the calculated losses (obtained with the extended loss model also used in Fig. 11) for operation with \( f_{\text{out}} = 100 \text{ kHz} \) and full-scale output current, i.e., an rms output current \( I_{\text{out}} = P_{\text{out,max}}/V_{\text{out,nom}} = 43.5 \text{ A} \) into a purely ohmic load, for different rms output voltages \( V_{\text{out}} \) between 40 V and 230 V (nominal operating point). This corresponds to the worst-case operating mode, since the output power decreases linearly with output voltage and the output current is always maximum. The efficiency drops from 96% in the nominal operating point (\( V_{\text{out}} = 230 \text{ V} \) and \( P_{\text{out}} = 10 \text{ kW} \)) to 80% for \( V_{\text{out}} = 40 \text{ V} \) and \( P_{\text{out}} = 1.74 \text{ kW} \). This is expected, since the total losses do not reduce with lower output voltage (they are more or less constant and show a slight bathtub shape), whereas the output power does (linear decrease with decreasing \( V_{\text{out}} \)). The bathtub shape of the losses can be explained with the current ripple, which depends on the modulation depth (and therefore on \( V_{\text{out}} \)). Reducing the output voltage from the nominal value of 230 V initially increases the current ripple (duty-cycle more often close to 0.25 and 0.75, which gives maximum current ripple in a three-level FCC [26]) and therefore, leads to reduced HSW losses (lower current switched hard) and more PHSW and SSW transitions. With further output voltage reduction (below approximately 170 V rms), the ripple is reduced because the duty-cycle is always close to 0.5, which in a three-level FCC gives minimum current ripple [26]. Opposite to the case of large ripple, this leads to more HSW losses (higher current switched hard) and less PHSW and SSW transitions. 
The conduction losses, however, slightly decrease thanks to the lower ripple, but this decrease has negligible impact on the total losses. The reduced current ripple also explains the lower inductor conduction losses for low output voltages, which according to Table II are dominated by the HF losses (current ripple) at \( f_{\text{ILpp}} \) due to the substantially higher ac resistance at \( f_{\text{ILpp}} \) compared to \( f_{\text{out}} \) (approximately a factor of 50). Despite the efficiency degradation with reduced output voltage in this worst-case operation mode, the efficiency is still substantially higher than that of linear PAs (best-case efficiency of 78.5% for a full-scale output into a purely ohmic load [11]).
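The numbers quoted for the full-current operation in Fig. 12 can be cross-checked with simple arithmetic. The following sketch assumes that the efficiency is defined as output power divided by the sum of output power and losses, and it uses the rounded efficiencies from the text (96% and 80%), so the resulting losses are only approximate. The comparison value π/4 is the well-known ideal class-B limit; that the 78.5% quoted from [11] is derived in this way is an assumption.

```python
import math

V_NOM = 230.0   # nominal rms output voltage in V (from the text)
P_MAX = 10e3    # maximum output power in W (from the text)


def losses_from_efficiency(p_out, eta):
    """Losses in W, assuming eta = P_out / (P_out + P_loss)."""
    return p_out * (1.0 / eta - 1.0)


i_out = P_MAX / V_NOM                # full-scale rms output current
p_out_40v = 40.0 * i_out             # output power at 40 V rms and full current

loss_nominal = losses_from_efficiency(P_MAX, 0.96)
loss_40v = losses_from_efficiency(p_out_40v, 0.80)

eta_linear_best = math.pi / 4.0      # ideal class-B limit (assumed origin of 78.5%)

print(f"I_out = {i_out:.1f} A, P_out at 40 V = {p_out_40v / 1e3:.2f} kW")
print(f"losses: {loss_nominal:.0f} W (230 V), {loss_40v:.0f} W (40 V)")
print(f"ideal class-B efficiency: {100 * eta_linear_best:.1f} %")
```

Both operating points yield losses in the range of 420 W to 440 W, which supports the statement that the total losses are more or less constant while the output power scales linearly with the output voltage.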
Phase-Shifted PWM (PSPWM), where one reference signal \( v_{\text{ref}} \) is compared with several phase-shifted carrier signals (one per SC), each with a frequency \( f_{\text{sw}} \) (device switching frequency), is one possible modulation strategy for series- and/or parallel-interleaved converters [67]. This strategy is used in the presented 3L3 converter. Digital implementation of the PWM offers maximum flexibility, but has the disadvantage of finite horizontal (time) and vertical (amplitude) resolution. It was shown in [64] that a regularly sampled PWM (also called synchronous PWM), where the compare values/duty-cycles are updated at the top and bottom of a triangular carrier, introduces a time delay \( T_d \) of \( 3/(2f_{\text{sw, eff}}) = 312.5 \) ns (with \( f_{\text{sw, eff}} = 4.8 \) MHz), which can substantially limit the maximum achievable controller BW. Therefore, a quasi-continuous operation of the PWM unit is favorable, i.e., the duty-cycles are updated in every FPGA clock period, resulting in an update rate of \( f_{\text{PWM}} = f_{\text{clk}} = 125 \) MHz, where 125 MHz is the maximum achievable FPGA clock frequency. This implementation is comparable to a naturally sampled PWM [30] and is used in the 3L3 to virtually eliminate the PWM delay [68]. The vertical resolution is limited by the ratio of the FPGA clock frequency to the frequency \( f_{\text{sw}} \) of the symmetric (triangular) carrier signal. The number of possible duty-cycles equals \( \lfloor f_{\text{clk}}/(2 \cdot f_{\text{sw}}) \rfloor = 78 \). The 3L3 can therefore set the local average of the switch-node voltage \( v_{\text{sw}} \) (mean over one switching period) with a resolution of \( \Delta v_{\text{sw}} = V_{\text{dc}}/78 \approx 10.25 \) V in each branch. Effectively, this corresponds to a voltage resolution \( \Delta(v_{\text{sw, eff}}) = \Delta v_{\text{sw}}/N \approx 3.42 \) V (local average of the effective switch-node voltage \( v_{\text{sw, eff}} \)) thanks to the parallel-interleaved topology. A further reduction of \( \Delta(v_{\text{sw, eff}}) \) would be possible with a High-Resolution PWM (HRPWM), which is currently being investigated as part of ongoing research on this topic.
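The delay and resolution figures of the digital PWM can be reproduced as follows. This is a minimal sketch in which the device switching frequency of 800 kHz is an assumption, inferred from the effective switching frequency of 4.8 MHz divided by six (two series-interleaved cells times three parallel branches); it is consistent with the 78 duty-cycle steps stated in the text. The dc-link voltage of 800 V follows from the two 400 V sources mentioned in Section V-A.

```python
F_CLK = 125e6      # FPGA clock frequency in Hz (from the text)
F_SW_EFF = 4.8e6   # effective switching frequency in Hz (from the text)
F_SW = 800e3       # device switching frequency in Hz (assumption: F_SW_EFF / 6)
V_DC = 800.0       # total dc-link voltage in V (two 400 V sources)
N_PARALLEL = 3     # number of parallel-interleaved branches

# Delay of a regularly sampled PWM with updates at carrier top and bottom.
t_delay = 3.0 / (2.0 * F_SW_EFF)

# Number of distinct duty-cycles for a symmetric (triangular) carrier.
n_duty = int(F_CLK // (2.0 * F_SW))

# Resolution of the local average switch-node voltage, per branch and effective.
dv_sw = V_DC / n_duty
dv_sw_eff = dv_sw / N_PARALLEL

print(f"T_d = {t_delay * 1e9:.1f} ns, duty-cycle steps = {n_duty}")
print(f"dv_sw = {dv_sw:.2f} V, dv_sw_eff = {dv_sw_eff:.2f} V")
```

The computed values agree with the text to within rounding, which also shows that the vertical resolution is set by the clock-to-carrier frequency ratio alone and does not depend on the PWM update rate.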

Fig. 13 (a) shows measured output voltages at \( f_{\text{out}} = 100 \) kHz, \( V_{\text{out}} = 230 \) V rms and \( P_{\text{out}} = 3.3 \) kW (ohmic load) for different dead times \( t_d \) (interlocking delays). Note that the different waveforms are offset by multiples of 20 V to better visualize the distortions. The resulting output voltage spectra are depicted in Fig. 13 (b) and indicate the presence of odd harmonics of the output voltage. With a higher \( t_d \), the amplitudes of the odd harmonics increase, which is also visible in Fig. 13 (a) as a distorted waveform around the minima and maxima of the sine. Compared to \( t_d = 48 \) ns, for \( t_d = 16 \) ns, the 3rd and 5th harmonic are substantially reduced from 7.5 V to 1 V and from 4.8 V to 0.8 V, respectively. During the dead time, depending on the direction of the switched current, either a resonant ZVS transition starts, which changes the switch-node voltage directly after the turn-off signal from the FPGA (SSW), or the switch to be turned off goes into reverse conduction and the switch-node voltage changes only after the signal to turn on the complementary switch, i.e., after \( t_d \) (HSW). This is a fundamental cause of non-linearities, in particular the source of odd harmonics [69], because depending on the direction and magnitude of the switched current, a voltage-time area is applied to the inductor that differs from the one calculated in the controller [70]. Nevertheless, Fig. 13 indicates a very high output voltage quality for low \( t_d \). A further improvement, however, could be desired, e.g., to reduce the error voltage applied to the output voltage controller. A common strategy to mitigate the dead time non-linearities is to delay the switching pulses in the FPGA depending on the current direction and magnitude to effectively apply the correct voltage-time area to the inductor. This acts as a feed-forward term to correct the dead time induced non-linearities. Alternatively, the actual pulses at the switch-node can be measured and compared with the ideal PWM pulses (without the dead time), which does not need output current sign and/or magnitude measurements. Noise-shaping is then applied to shift the dead time induced non-linearities to higher frequencies [71]. While the accurate measurement of the fast pulses at the switch-node can be difficult because of HF noise coming from the extremely fast \( \text{d}v/\text{d}t \) in GaN transistors, the noise-shaping approach further requires that \( f_{\text{sw}} \) (and not \( f_{\text{sw, eff}} \), because the dead time is present for every SC, i.e., occurs in every switching period \( 1/f_{\text{sw}} \)) is significantly greater than \( f_{\text{out}} \) (typically, \( f_{\text{sw}}/f_{\text{out}} > 10 \) [71]), which is not given in our application. In the current implementation, no dead time correction is used and therefore, a minimum \( t_d \) should be selected in the interest of maximum output voltage quality. Because efficiency measurements showed that \( t_d = 24 \) ns has the lowest losses, this value is finally selected. The amplitudes of the 3rd and 5th harmonic are 2.5 V and 1.2 V, respectively, in this case, which is still lower than \( \Delta(v_{\text{sw, eff}}) \).
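To put the quoted harmonic amplitudes into perspective, they can be normalized to the fundamental. The sketch below assumes that the values read from Fig. 13 (b) are peak amplitudes and therefore relates them to the peak value of the 230 V rms fundamental; the dictionary layout and the helper name are illustrative.

```python
import math

V_OUT_RMS = 230.0
V_FUND_PEAK = math.sqrt(2.0) * V_OUT_RMS   # peak of the fundamental in V
DELTA_V_SW_EFF = 3.42                      # PWM voltage resolution in V (from the text)

# Dead time in ns -> (3rd harmonic, 5th harmonic) amplitude in V, from the text.
HARMONICS_V = {48: (7.5, 4.8), 24: (2.5, 1.2), 16: (1.0, 0.8)}


def relative_level(amplitude_v):
    """Return the harmonic level as (percent, dB) relative to the fundamental."""
    ratio = amplitude_v / V_FUND_PEAK
    return 100.0 * ratio, 20.0 * math.log10(ratio)


for t_d_ns, (h3, h5) in sorted(HARMONICS_V.items()):
    pct3, db3 = relative_level(h3)
    pct5, db5 = relative_level(h5)
    print(f"t_d = {t_d_ns} ns: 3rd {pct3:.2f} % ({db3:.1f} dB), "
          f"5th {pct5:.2f} % ({db5:.1f} dB)")
```

Under this assumption, the 3rd harmonic for the selected dead time of 24 ns stays below 1% of the fundamental (roughly -42 dB) and below the PWM voltage resolution of 3.42 V.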
C. Exemplary Waveforms at Nominal Operating Point

Fig. 14 shows the measured output voltage $v_{\text{out}}$ (green; 100 V/div) and output current $i_{\text{out}}$ (red; 15 A/div) in the nominal operating point ($f_{\text{out}} = 100$ kHz, $P_{\text{out}} = 10$ kW (ohmic load), $V_{\text{out}} = 230$ V rms). The output is a clean sine wave without visible harmonic distortion. In Fig. 15, the measured individual branch inductor currents $i_{\text{L1}}$ (blue; 6 A/div), $i_{\text{L2}}$ (red; 6 A/div) and $i_{\text{L3}}$ (green; 6 A/div) and the calculated summed current $i_{\text{sum}}$ (orange; 18 A/div) are depicted for the same operating point. It can be seen that the three currents are symmetrically distributed and correctly phase-shifted by 120° between the branches. The ripple on the resulting summed current $i_{\text{sum}}$ is therefore significantly reduced (note the different vertical scale in Fig. 15). At the moment, no active (closed-loop) current balancing is employed, i.e., the current sharing is established purely by the application of the same reference signal to the PWM modulator in each branch (nominally identical voltage-time areas applied to each $L_{\text{br}}$) and by the parasitic resistances ($R_{\text{ds,on}}$ of the transistors and the winding resistance of $L_{\text{br}}$). The FC voltages are naturally balanced with an error $< \pm 2\%$ of the nominal voltage thanks to the inherent losses in the power stage. Active balancing would be possible with the currently implemented modulation to further improve the steady-state FC balancing but is not used at this stage.
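The ripple reduction obtained by interleaving can be illustrated with an idealized model. The sketch below is not a simulation of the 3L3 converter: it assumes three identical, zero-mean triangular ripple currents with 50% duty-cycle that are shifted by one third of the ripple period against each other, and it compares the peak-to-peak ripple of their sum with that of a single branch.

```python
def triangle(t):
    """Zero-mean triangular ripple with unit peak-to-peak value and period 1."""
    return 2.0 * abs((t % 1.0) - 0.5) - 0.5


def peak_to_peak(samples):
    """Peak-to-peak value of a sampled waveform."""
    return max(samples) - min(samples)


N_BRANCHES = 3
N_SAMPLES = 6000   # multiple of 6 so that all waveform breakpoints are sampled

times = [k / N_SAMPLES for k in range(N_SAMPLES)]

# One ripple waveform per branch, shifted by 1/3 of the period (120 degrees).
branches = [[triangle(t - b / N_BRANCHES) for t in times]
            for b in range(N_BRANCHES)]

# Summed current ripple, sample by sample.
i_sum = [sum(values) for values in zip(*branches)]

ratio = peak_to_peak(i_sum) / peak_to_peak(branches[0])
print(f"summed ripple relative to branch ripple: {ratio:.3f}")
```

For this idealized case the summed ripple is one third of the single-branch ripple and occurs at three times the ripple frequency. The actual ripple cancellation in the converter depends on the duty-cycle and therefore varies over the output period.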

Due to the compact realization and the mounting of the top-side cooled switches on the coldplate, the temperature distribution of the switches could not be measured during operation, e.g., with a thermal camera. The switch case temperatures as well as the branch inductor winding temperatures are measured with Negative Temperature Coefficient (NTC) temperature sensors. For an H$_2$O inlet temperature of $\vartheta_{\text{H}_2\text{O, in}} = 30^\circ$C the semiconductor case temperatures are all below 60°C. With a datasheet-specified thermal resistance from junction to case of $R_{\text{th,j-c}} = 1$ K/W and with approximately 35 W losses per switch, this leads to a junction temperature $\vartheta_j \approx 95^\circ$C, which is well below the maximum allowed value. The inductor winding temperatures are all below 40°C.

VI. Conclusion

The growing prevalence of power electronic converter systems featuring Wide-Bandgap (WBG) semiconductors with ever higher switching frequencies motivates the need for Ultra-High Bandwidth Power Amplifiers (UHBW-PAs) for characterization and testing purposes. In this article, circuit topologies for a switch-mode (Class-D) realization of such power amplifiers are analyzed. A switch-mode realization is generally desired because it features a significantly higher efficiency compared to traditional analog realizations. However, the achievable Bandwidth (BW) in switch-mode amplifiers is ultimately limited by the maximum possible switching frequency. This frequency has to be chosen sufficiently higher than the corner frequency of the output filter. Therefore, series- and parallel-interleaving concepts are investigated to allow an effective switching frequency $f_{\text{sw, eff}}$ (relevant for filtering) in the MHz range despite a moderate switching frequency $f_{\text{sw}}$ of the individual switches. Series-interleaving distributes the blocking voltage stress and parallel-interleaving the current stress between multiple devices. Therefore, the combination of both approaches, i.e., the parallel-interleaved operation of several series-interleaved (multi-level) bridge-legs, is comprehensively analyzed in terms of loss, volume and design complexity scaling, and eventually a Three-Level Triple-Interleaved (3L3) Flying Capacitor Converter (FCC) is selected. A highly compact hardware demonstrator (power density of 25 kW/dm$^3$ / 410 W/in$^3$) of a single-phase module is realized for performance verification. Important design aspects are described in detail. Experimental results verify the operation with 10 kVA output power (per phase) at 230 V rms output voltage with an output frequency of 100 kHz (nominal operating point), where a system efficiency of 95.8% and excellent output voltage quality (3rd and 5th harmonic 2.5 V and 1.2 V, respectively) are achieved. A three-phase output can be realized with three individual single-phase modules of the presented converter or, alternatively, the three parallel-interleaved branches of a 3L3 converter can be reconfigured for a three-phase output.

The presented versatile UHBW-PA module serves as a solid basis for testing and characterization of future power converters with WBG semiconductors. Further work on this topic includes the implementation of highly dynamic closed-loop controllers for different application scenarios, e.g., general amplification, virtual impedance emulation and machine emulation. Important characterization parameters are then dynamic specifications such as the responses to a reference voltage step and to a load step (disturbance). Thereby, a High-Resolution PWM (HRPWM) and a dead time correction are also advantageously implemented to additionally improve the output voltage quality and/or to achieve high-precision reference tracking.

**Appendix A**

**Theoretical Performance Limitation**

This appendix describes the output power limitations/derating for different load types, i.e., load phase angles. This derivation has already been presented in [10] and is included here for completeness. Given the sinusoidal nature of the output voltage and current, theoretical output power limitations for the proposed converter can directly be derived using the phasor representation of output voltage and current (cf. phasor diagram in Fig. 3 (a,ii)). For the nominal operating point at $f_{\text{out}} = 100$ kHz and $V_{\text{out}} = 230$ V rms, the theoretically possible apparent output power $S_{\text{Lim}}$ for each load phase angle $\varphi$ (phase-shift between output voltage and current waveforms) is depicted in Fig. 16 in green. It is limited by two constraints:

i) The maximum possible voltage amplitude that can be generated at the switch node, basically given by $V_{\text{dc}}/2$, here set to $0.95 \cdot V_{\text{dc}}/2$ to keep a certain margin (corresponding to a modulation depth $m_{\text{M}} = 0.95$), i.e., $\left| V_{\text{sw,eff}} \right| \leq m_{\text{M}} \cdot V_{\text{dc}}/2$ according to the phasor diagram in Fig. 3 (a(ii)).

ii) The maximum possible switch rms current, which is limited by the semiconductor losses and thus highly depends on the thermal design.

**Fig. 15.** Measured branch inductor currents $i_{L1}$ (blue), $i_{L2}$ (red) and $i_{L3}$ (green) on a scale of 6 A/div indicating balanced current sharing between the three branches. Additionally, $i_{\text{sum}}$ (orange, 18 A/div) indicates the reduced current ripple thanks to the parallel interleaving.

**Fig. 16.** Theoretical limitation $S_{\text{Lim}}$ (green) of the apparent output power versus the load phase angle $\varphi$ (phase-shift between the output voltage and current waveforms) for the given selection of the filter elements ($L_{\text{filt}} = 1.26\,\mu\mathrm{H}$, $C_{\text{filt}} = 99\,\mathrm{nF}$) for a 100 kHz sinusoidal output signal with 230 V rms, relative to a nominal output power of 10 kVA (blue) and ohmic load ($\varphi = 0^\circ$). The limitation is derived from the switch rms current for $M = 3$ and $N = 3$ under the assumption of a worst-case peak-to-peak current ripple of 100% (with respect to the fundamental component). The required derating of around 27.5% of the output power for capacitive loads is highlighted in red; Figure based on [10].

The rms current derived from the conduction losses occurring for $M = 3$ and $N = 3$ with ohmic load at the nominal operating point is taken as a reference (cf. Fig. 4 (a)), i.e., $I_{\text{sw,rms}} \approx 11.6\,\mathrm{A}$. The inductor current relates the switch current to the output current, and its rms value is calculated under the assumption of a triangular peak-to-peak current ripple of $100\,\%$ (with respect to the amplitude of the fundamental component) as a worst-case approximation. Since the current limit is derived from the ohmic load case, the $S_{\text{Lim}}$ circle intersects the blue circle representing the nominal power $S_{\text{Nom}} = 10\,\mathrm{kVA}$ at $\varphi = 0^\circ$ and $\varphi = 180^\circ$. For capacitive-resistive loads ($0^\circ < \varphi < 180^\circ$), the current limitation (constraint ii) in the above list) leads to an output power derating of at most about $27.5\,\%$, which is much lower compared to state-of-the-art linear power amplifiers (derating by a factor of three) [13]. For all other load cases, there is a sufficient margin, especially for inductive loads (> 30 % margin), because they partially compensate the reactive power of the output filter. If higher currents were allowed, the maximum possible converter voltage (constraint i) in the above list) would limit the maximum output power for inductive loads ($\varphi = -90^\circ$), since in this case the output voltage $V_{\text{out}}$ and the inductor voltage $V_{L}$ are in phase (cf. Fig. 3 (a(ii))) and add up arithmetically to $V_{\text{sw,eff}}$. This cannot be seen in Fig. 16 due to the strong current limit and because the inductance is chosen based on limiting the maximum inductor voltage to $k_v = 15\,\%$ of the peak output voltage for an ohmic load scenario.
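The current-limited part of this derivation can be reproduced numerically with the following minimal sketch. It only models constraint ii) and rests on several assumptions that are not taken from the text: the filter capacitor is treated as ideal, the total inductor current is assumed to split equally between the $N = 3$ branches, each switch is assumed to carry $1/\sqrt{2}$ of the branch inductor rms current, and all function names are hypothetical.

```python
import math

V_OUT = 230.0     # rms output voltage (V), nominal operating point
F_OUT = 100e3     # output frequency (Hz)
C_FILT = 99e-9    # filter capacitance (F), from the Fig. 16 caption
S_NOM = 10e3      # nominal apparent output power (VA)
N_BRANCH = 3      # parallel-interleaved branches (N = 3)
RIPPLE_PP = 1.0   # peak-to-peak ripple relative to the fundamental amplitude


def capacitor_current():
    """Rms current drawn by the filter capacitor at V_OUT and F_OUT (A)."""
    return 2.0 * math.pi * F_OUT * C_FILT * V_OUT


def inductor_current_limit():
    """Total fundamental inductor rms current at nominal ohmic load (A)."""
    return math.hypot(S_NOM / V_OUT, capacitor_current())


def switch_rms(i_l_total):
    """Per-switch rms current (A) under the assumptions stated above."""
    i_branch = i_l_total / N_BRANCH
    # rms of a triangular ripple: peak-to-peak value divided by 2*sqrt(3)
    ripple_rms = RIPPLE_PP * math.sqrt(2.0) * i_branch / (2.0 * math.sqrt(3.0))
    return math.hypot(i_branch, ripple_rms) / math.sqrt(2.0)


def s_lim(phi_deg):
    """Apparent power (VA) at which |I_L| reaches the ohmic-load limit.

    Solves I_out^2 + 2*I_out*I_C*sin(phi) + I_C^2 = I_max^2 for I_out >= 0,
    with phi > 0 meaning that the load current leads the output voltage.
    """
    phi = math.radians(phi_deg)
    i_c = capacitor_current()
    i_max = inductor_current_limit()
    i_out = -i_c * math.sin(phi) + math.sqrt(i_max**2 - (i_c * math.cos(phi))**2)
    return V_OUT * i_out


if __name__ == "__main__":
    print(f"reference switch rms current: {switch_rms(inductor_current_limit()):.2f} A")
    for phi_deg in (-90, 0, 90, 180):
        print(f"phi = {phi_deg:4d} deg: S_lim = {s_lim(phi_deg) / 1e3:.2f} kVA")
```

With these assumptions, the sketch lands close to the 11.6 A reference current, returns the nominal 10 kVA for ohmic load, and gives a derating of roughly 27.5 % for a purely capacitive load as well as a margin above 30 % for a purely inductive load. The voltage constraint i) is not modeled, since it requires the DC-link voltage $V_{\text{dc}}$.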

REFERENCES


Johann W. Kolar (F’10) received his M.Sc. and Ph.D. degree (summa cum laude) from the University of Technology Vienna, Austria, in 1997 and 1999, respectively. Since 1984, he has been working as an independent researcher and international consultant in close collaboration with the Vienna University of Technology, in the fields of power electronics, industrial electronics and high performance drive systems. He was appointed Assoc. Professor and Head of the Power Electronic Systems Laboratory at the Swiss Federal Institute of Technology (ETH) Zurich on Feb. 1, 2001, and was promoted to the rank of Full Prof. in 2004. Dr. Kolar has proposed numerous novel converter concepts incl. the Vienna Rectifier, the Sparse Matrix Converter and the Swiss Rectifier, has spearheaded the development of x-million rpm motors, and has pioneered fully automated multi-objective power electronics design procedures. He has graduated 80+ Ph.D. students, has published 1000+ journal and conference papers and 4 book chapters, and has filed 200+ patents. He has presented 30+ educational seminars at leading international conferences and has served as IEEE PELS Distinguished Lecturer from 2012 – 2016. He has received 40+ IEEE Transactions and Conference Prize Paper Awards, the 2014 IEEE Power Electronics Society R. David Middlebrook Achievement Award, the 2016 IEEE PEMC Council Award, the 2016 IEEE William E. Newell Power Electronics Award, the 2021 EPE Outstanding Achievement Award and 2 ETH Zurich Golden Owl Awards for excellence in teaching. He is a Fellow of the IEEE and was elected to the U.S. National Academy of Engineering as an international member in 2021. The focus of his current research is on ultra-compact/efficient WBG converter systems, ANN-based design procedures, Solid-State Transformers, ultra-high speed drives, bearingless motors, and life cycle analysis of power electronics converter systems.

Dominik Bortis (SM’21) received the M.Sc. and Ph.D. degrees in electrical engineering from the Swiss Federal Institute of Technology (ETH) Zurich, Switzerland, in 2005 and 2008, respectively. In May 2005, he joined the Power Electronic Systems Laboratory (PES), ETH Zurich, as a Ph.D. student. From 2008 to 2011, he was a Postdoctoral Fellow and from 2011 to 2016 a Research Associate with PES. Since January 2016, Dr. Bortis has been heading the research group Advanced Mechatronic Systems at PES, which concentrates on ultra-high speed motors, bearingless drives, linear-rotary actuator and machine concepts with integrated power electronics. Targeted applications include, e.g., highly dynamic positioning systems, medical systems, and future mobility concepts. Dr. Bortis has published 90+ scientific papers in international journals and conference proceedings. He has filed 30+ patents and has received 10 IEEE Conference Prize Paper Awards and 2 First Prize Transaction Paper Awards.