Computationally Efficient Integration of Complex Thermal Multi-Chip Power Module Models into Circuit Simulators

Uwe DROFENIK*, Didier COTTET**, Andreas MÜSING*, Jean-Marc MEYER*** and Johann W. KOLAR*

* Power Electronic Systems Laboratory, ETH Zurich, ETH-Zentrum / ETL H13, CH-8092 Zurich, Switzerland
Phone: +41-1-632-4267, Fax: +41-1-632-1212, E-mail: drofenik@lem.ee.ethz.ch
** ABB Switzerland Ltd, Corporate Research, CH-5405 Baden-Dättwil, Switzerland
*** ABB Switzerland Ltd, CH-5300 Turgi, Switzerland

Abstract – For analyzing reliability or short-term overload conditions of power electronic systems, it is necessary to know transient temperatures of the power semiconductors. Directly coupling thermal and circuit simulators increases the simulation time by orders of magnitude, therefore making such an approach impractical. A well-known solution to this problem is to extract thermal equivalent circuits from 3D-field simulations and to insert them directly into the circuit simulator. In this paper we discuss the poor scaling performance of this state-of-the-art approach. There is an enormous increase in simulation time if there are more than just a few chips thermally modeled. We propose a general procedure at the circuit simulator solver level to increase the calculation speed of such a coupled simulation significantly.

Index Terms – electric-thermal simulation, increasing simulation speed, thermal coupling

I. INTRODUCTION

Thermal conduction inside a solid structure is described by the heat conduction equation

\[ c_p(T) \cdot \rho \frac{\partial T}{\partial t} = \nabla \cdot (\lambda(T) \cdot \nabla T) + w(x,t) \]  

(1)

with thermal capacitance \( c_p [\text{Ws/(Kkg)}] \), material density \( \rho [\text{kg/m}^3] \), temperature dependent thermal conductivity \( \lambda [\text{W/Km}] \), thermal power density \( w [\text{W/m}^3] \), and temperature \( T [\text{K}] \). Solving the equation for the 3D-structure of a power module employing the Finite Element Method (FEM) is of very high computational effort. Therefore, solving (1) inside the time-loop of a circuit simulation at every time step (coupling the calculated temperature with the temperature dependent semiconductor properties of the circuit simulator and, vice-versa, coupling the calculated semiconductor power loss of the circuit simulator with the heat sources of the 3D-FEM solver) will result in impractically long simulation times. A well-known solution to this problem is to extract a thermal model from the 3D structure of the power module, and to integrate this significantly simplified model into the circuit simulation.

To create a simplified thermal model describing the dependency between the chip junction temperature and the thermal losses, two different approaches are possible. One method is based on the Finite Difference Method (FDM) where the 3D-geometry is divided into many small volume elements, with the heat conduction equation (1) linearized within each volume element [1], [2]. Mathematically describing the boundary conditions of each element results in a huge set of coupled linear equations, that can be written in matrix-form. If the number of the elements remains small, e.g. just three or four elements per chip, the resulting model for a multi-chip module will be quite compact, but in most cases not very accurate. With such a model it is generally difficult to model systematically and accurately the effects of thermal coupling between individual chips inside a power module. The structure of such an equivalent circuit is known as “Cauer”-type.

The alternative modeling approach is based on an impedance matrix [3]. Here, equation (1) is assumed to be a linear differential equation (\( c_p, \rho, \) and \( \lambda \) not dependent on temperature) over the whole volume which is in good approximation true for most power electronic applications. Applying superposition, all thermal contributions are modeled by thermal impedance circuits that show approximately the same signal-behavior as the 3D-structure, but do not have physical meaning. This modeling procedure can be easily automatized. The resulting thermal equivalent circuit structures could theoretically be of “Cauer”-type, but are typically presented in form of “Foster”-type (Fig.1), because of the simpler modeling procedure.
Both modeling approaches have been described in literature, and also the integration into a circuit simulation is well known (e.g. [4], [5], [6], [7], [8], [9], [10], [11]). Integrating thermal models of multi-chip modules becomes a problem when more than just a few chips should be considered, because the calculation effort of the simulation is proportional to the third order of the model size (as shown in the following), and the contribution of the thermal model might be much larger than the contribution of the power circuit. Therefore, integrating a multi-chip module into a circuit simulation might result in extremely long simulation times which might become impractical. After shortly introducing the concept of thermal equivalent circuits in section II, the scaling problem characterizing the relationship between model size and computational effort will be discussed in detail in section III. In section IV, a simple but very effective method, to be applied at the solver-level of the circuit simulator, will be introduced that can improve the computational efficiency of such models by many orders of magnitude. Finally, in section V as an example the simulation of a 36-chip power module is discussed.

II. BUILDING A THERMAL MODEL OF A MULTI-CHIP MODULE FOR A CIRCUIT SIMULATION

A. Assuming No Thermal Coupling between Chips

If a constant power loss $P_V=\text{const}$ of the semiconductor chip is “switched on” (step-function) at $t=0$, the junction temperature will rise as shown in Fig.1(b) (logarithmic time-scale). This thermal step response can be numerically calculated from a 3D-model employing FEM or can be measured ([1], [4], [7]). Normalizing the rise of the junction temperature by using unground power $P_V$ gives the transient thermal impedance. Setting the parameters ($R_{th,i}$ and $C_{th,i}$) accordingly, an equivalent circuit as shown in Fig.1(a), with a signal-controlled current source representing thermal loss and input voltage representing junction temperature, will give approximately the same step-response. Therefore, the equivalent circuit approximately describes the thermal behavior of the junction of the according 3D-structure, but is significantly simplified and can be easily built into a circuit simulator employing resistor, capacitor, signal-controlled current source and voltage measurement. How to build the thermal equivalent circuit from the thermal step response and integrate it into the circuit simulator is described in detail in the literature ([8],[12]).

If thermal behavior of, e.g., four different chips should be analyzed, four thermal equivalent circuits (like the one in Fig.1) can be set up directly in a circuit simulator. Semiconductor losses must be calculated, transformed into signals and employed to control the current sources ([1], [6], [10], [13]). The measured input voltages represent the junction temperatures. Here, the assumption is that the three chips don’t influence each other thermally. Choosing third order models (three resistors and three capacitors) provides in many typical applications a good compromise between model accuracy and model size [1]. One has to keep in mind that setting up extremely accurate thermal models of power semiconductors does typically not make much sense, because the interface between power module case and heat sink surface (typically thermal grease) is not well defined but contributes significantly to the thermal resistance.
B. Considering Thermal Coupling between Chips Inside a Power Module

Inside a power module, where chips are very closely located to each other, every chip will heat up its neighbor chips. This kind of thermal coupling can be calculated or measured by heating up one chip and measuring the thermal step responses occurring for all other chips. Finding thermal equivalent circuits for these step responses, and coupling the effects accordingly, results in an impedance matrix as given in equation (2) in case of four different chips with full thermal coupling.

\[
\begin{bmatrix}
    T_{\text{junc}(1)}(t) \\
    T_{\text{junc}(2)}(t) \\
    T_{\text{junc}(3)}(t) \\
    T_{\text{junc}(4)}(t)
\end{bmatrix} =
\begin{bmatrix}
    z_{11} & z_{12} & z_{13} & z_{14} \\
    z_{21} & z_{22} & z_{23} & z_{24} \\
    z_{31} & z_{32} & z_{33} & z_{34} \\
    z_{41} & z_{42} & z_{43} & z_{44}
\end{bmatrix}
\begin{bmatrix}
    p_{V,(1)}(t) \\
    p_{V,(2)}(t) \\
    p_{V,(3)}(t) \\
    p_{V,(4)}(t)
\end{bmatrix} + T_{\text{ambient}} \quad (2)
\]

Equation (2) can be modeled in the circuit simulator as shown in Fig.2. The current sources are controlled by the power losses \( p_{V,(i)}(t) \) of the four individual chips as calculated in the circuit simulator in form of signals. The measured input voltages \( T_{\text{junc}(i)}(t) \) are added in form of signals and give the four junction temperatures \( T_{\text{junc}(i)}(t) \). One can see that even in case of just four chips, the thermal model in the circuit simulator becomes quite large and complex.

In Fig.2 the main diagonal \((ii)\) represents the thermal effect without mutual coupling. All other equivalent circuits \((ij)\) in the matrix represent the coupling between two different chips, respectively. The dominant effect is described by the main-diagonal circuits \((ii)\). Therefore, the coupling circuits \((ij)\) could be simplified as shown in Fig.3, especially if the two coupled chips are located far from each other, resulting in a reduced thermal model. In many cases it is important to consider weak coupling because many small coupling effects might add up to a thermal contribution that cannot be neglected.

\[\begin{align*}
T_{\text{junc}(1)}(t) &= p_{V,(1)}(t) + C_{\text{A12}} R_{\text{A12}} T_{\text{junc}(2)}(t) \\
T_{\text{junc}(2)}(t) &= p_{V,(2)}(t) + C_{\text{A23}} R_{\text{A23}} T_{\text{junc}(3)}(t) \\
T_{\text{junc}(3)}(t) &= p_{V,(3)}(t) + C_{\text{A34}} R_{\text{A34}} T_{\text{junc}(4)}(t)
\end{align*}\]

III. SCALING ISSUES, MODEL COMPLEXITY AND SIMULATION EFFORT

A circuit simulator typically works as explained in the following. Each component of the power circuit is described by an equation (e.g. differential equation for inductor and capacitor, non-linear characteristics for switching elements, ….). Linearizing the equations within one time step of the circuit simulation, results in a system of linear equations that can be written and solved in matrix form (e.g. equation (3) as mathematical formulation of the circuit of Fig.4(a) with modeling just one of the four semiconductor chips thermally).

\[\begin{bmatrix}
\varphi_{\text{NC1}} \\
\varphi_{\text{NC2}} \\
\varphi_{\text{NC3}} \\
\varphi_{\text{NT1}} \\
\varphi_{\text{NT2}} \\
\varphi_{\text{NT3}}
\end{bmatrix} \cdot \begin{bmatrix}
A
\end{bmatrix} = b \quad (3)
\]

A systematic procedure of setting up the simulation matrix is provided by the Node Voltage Method (e.g., p.213 in [15]), where, generally speaking, each node in the power circuit makes one contribution to the order of the simulation matrix \( A \). During the simulation, the time proceeds via small time steps \( \Delta t \) from \( t_{\text{START}} \) to \( t_{\text{END}} \), and equation (3) is repeatedly solved after each time step. After solving (3), the signal blocks are processed employing the just calculated voltages and currents of the power circuit, and providing control- and gate-signals for the power circuit after the next time step.

Fig.4 shows the screenshot from a simulator under development at ETH Zurich working according to this principle. The four nodes in the power circuit \((\text{NC1}, \text{NC2}, \text{NC3}, \text{NC4})\) result in a simulation matrix \( A \) of order 3 because one node can always be set to an arbitrary potential (here, we set the ground potential \( \text{NC4} \) to zero). If thermal circuits are modelled employing power circuit components, the additional nodes will increase the order of the simulation matrix accordingly. If for the power circuit topology in Fig.4(a) only one of the four semiconductor chips is modelled thermally, the additional nodes \( \text{NT1}, \text{NT2}, \text{NT3} \) (\( \text{NT4} \) is equal to \( \text{NC4} \)) will increase the order of the simulation matrix \( A \) from 3 to 6 as described by equation (3).

Solving a matrix equation like (3) is a standard procedure with different kind of optimized algorithms available and described in the literature. Generally, there are two groups of algorithms for solving matrix equations: Direct matrix solver and indirect matrix solvers. As long as the order of \( A \) is comparably small (< 10³), direct solvers can be employed.
Computational effort, number and order of matrices for different simulation scenarios in case of the Conventional Implementation, where one large matrix is built from all components in the power circuit and from all thermal circuits. As example, results are given for the power circuit topology of Fig.4 with all four semiconductor chips (two switches and two diodes) thermally modeled.

<table>
<thead>
<tr>
<th>Conventional Implementation</th>
<th>Computational effort / Number of matrices x Matrix order</th>
<th>Example: Fig.4 with all 4 chips thermally modeled ( n_{NC} = 3, n_{CCHIP} = 4, n_{NT} = 3 )</th>
</tr>
</thead>
<tbody>
<tr>
<td>Circuit simulation only</td>
<td>( CE \propto n_{NC}^3 \times 1 \times M(n_{NC}) )</td>
<td>( CE \propto 27 \times 1 \times M(3) )</td>
</tr>
<tr>
<td>Include thermal simulation</td>
<td>( CE \propto (n_{NC} + n_{CCHIP} + n_{NT})^3 \times 1 \times M(n_{NC} + n_{CCHIP} + n_{NT}) )</td>
<td>( CE \propto 3.4 \times 10^3 \times 1 \times M(15) )</td>
</tr>
<tr>
<td>for thermally non-coupled</td>
<td></td>
<td></td>
</tr>
<tr>
<td>chips</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Include thermal simulation</td>
<td>( CE \propto (n_{NC} + n_{CCHIP} + n_{NT} + (n_{CCHIP}^2 - n_{CCHIP})^3 \times 1 \times M(n_{NC} + n_{CCHIP} + n_{NT} + (n_{CCHIP}^2 - n_{CCHIP})) )</td>
<td>( CE \propto 19.7 \times 10^3 \times 1 \times M(27) )</td>
</tr>
<tr>
<td>for chips with</td>
<td></td>
<td></td>
</tr>
<tr>
<td>weak resistive coupling</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Include thermal simulation</td>
<td>( CE \propto (n_{NC} + n_{CCHIP} + n_{NT})^3 \times 1 \times M(n_{NC} + n_{CCHIP} + n_{NT}) )</td>
<td>( CE \propto 132.7 \times 10^3 \times 1 \times M(51) )</td>
</tr>
<tr>
<td>for full dynamic coupling</td>
<td></td>
<td></td>
</tr>
<tr>
<td>between all chips</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

\( n_{NC} \) ... total node-number in power circuit (minus ground node)  
\( n_{NT} \) ... total node-number of all thermal circuits (minus ground node)  
\( n_{CCHIP} \) ... number of chips to be modeled thermally  
\( CE \) ... computational effort (proportional to simulation time)

For larger matrices that are typical for 3D-field problems solved by FEM, the solution has to be found in an iterative way because direct solvers would not work due to their need of huge memory resources. Iterative solvers like Gauss-Seidel ([16]), Successive Over-Relaxation (SOR) ([16], [17]), or Multigrid Methods (p.871 in [17], p.68 in [18]) need to fulfill certain stability criterions and/or have to be adapted to certain partial differential equations that describe the underlying problem. All these iterative methods are not guaranteed to reliably work for solving matrix equations based on switched systems as typically found in power electronic circuit simulation.

For power electronic circuit simulation the number of nodes remains typically small (e.g. just four nodes \( NC1, NC2, NC3, NC4 \) for the power circuit topology in Fig.4(a)), and a simulation matrix equation (3) can be easily solved by direct methods. A widely used standard method is the LU Decomposition [17]. Here, the number of executions of the inner loop in the solver algorithm is \( 1/3 N^3 \) with \( N \) to be the order of \( \Delta \) (p.48 in [17]). One execution is defined as one multiplication plus one addition. The computation time is, therefore, proportional to the third order of simulation matrix \( \Delta \) and the needed memory space of LU Decomposition is proportional to \( N^2 \). These relationships provide the base for the detailed description of computational effort for thermal modeling of multi-chip power modules in circuit simulators given in Table I. In this paper, we define the computational effort \( CE \) to be proportional to the number of executions.

For a straight-forward implementation (“Conventional Implementation”) with building all thermal equivalent circuits directly in the circuit simulator together with the power circuit, and setting up one single simulation matrix \( \Delta \) (as done internally in a circuit simulator), different scenarios are listed in Tab.1. If only the power circuit has to be simulated, the total number of nodes in Fig.4(a) is \( n_{NC} = 3 \). The order of the resulting simulation matrix \( \Delta \) is 3, and the computational effort is proportional 27. If all four semiconductors in Fig.4(a) would be modeled as thermally coupled chips, the number of nodes of the power circuit would still be \( n_{NC} = 3 \), but the thermal models would contribute 48 additional nodes. If the circuit simulator algorithm combines all these nodes, the order of \( \Delta \) will be 51 and the computational effort will be proportional to 1.327 \times 10^5. This much larger simulation matrix \( \Delta \) can still be handled easily by LU Decomposition, but the simulation speed will be reduced by a factor 4900. Performing simplified “weak” thermal coupling (see Fig.3) will still give a reduction of the simulation speed by a factor 730 compared to the “pure” circuit simulation.

IV. MATRIX SPLITTING IN THE CIRCUIT SIMULATOR

The huge increase of computational effort, characterized by an according slowdown of the numerical simulation, is based on the fact that for direct matrix solvers the computational effort is proportional to \( N^3 \) (with \( N \) as order of simulation matrix \( \Delta \)). The simulation matrix \( \Delta \) is set up by the circuit simulator internally and cannot be influenced by the user of a commercial circuit simulator. Therefore, we developed a simulation platform at the Power Electronic Systems Laboratory (PES), ETH Zurich, to gain unlimited access to the solver source code, which enables us to implement Matrix Splitting as proposed in this paper.

The main idea of solving the scaling problem described in section III is to avoid setting up a single large matrix incorporating all nodes (power circuit nodes plus thermal equivalent circuit nodes), but to split the matrix equations into as many sub-matrices as possible and solve those sub-matrices sequentially during one time step of the circuit simulator.
Computational effort, number and order of matrices for different simulation scenarios in case of Matrix Splitting, where one matrix is built from the components in the power circuit, and small independent matrices are built for each thermal circuit.

<table>
<thead>
<tr>
<th>Matrix Splitting</th>
<th>Computational effort / Number of matrices x Matrix order</th>
<th>Example: Fig.4 with all 4 chips thermally modeled ( n_{MC} = 3, n_{CHIP} = 4, n_{NT} = 3 )</th>
<th>Reduction of computational effort compared to the Conventional Impl.</th>
</tr>
</thead>
<tbody>
<tr>
<td>Circuit simulation only</td>
<td>( CE_{SPLIT} \propto n_{MC} \times n_{CHIP} \times M(n_{ICH}) )</td>
<td>( CE_{SPLIT} \propto 27 \times M(3) )</td>
<td>no effect</td>
</tr>
<tr>
<td>Include thermal simulation for thermally non-coupled chips</td>
<td>( CE_{SPLIT} \propto n_{MC} + n_{CHIP} \times n_{NT} \times M(n_{ICH}) )</td>
<td>( CE_{SPLIT} \propto 135 \times M(3) @ 4 \times M(3) )</td>
<td>factor 25</td>
</tr>
<tr>
<td>Include thermal simulation for chips with weak resistive coupling</td>
<td>( CE_{SPLIT} \propto n_{MC} + n_{CHIP} \times n_{NT} \times (n_{CHIP} - n_{ICHI}) \times M(1) )</td>
<td>( CE_{SPLIT} \propto 147 \times M(3) @ 4 \times M(3) @ 12 \times M(1) )</td>
<td>factor 134</td>
</tr>
<tr>
<td>Include thermal simulation for full dynamic coupling between all chips</td>
<td>( CE_{SPLIT} \propto n_{MC} \times n_{CHIP} \times n_{NT} \times M(n_{ICH}) )</td>
<td>( CE_{SPLIT} \propto 459 \times M(3) @ 16 \times M(3) )</td>
<td>factor 289</td>
</tr>
</tbody>
</table>

The power circuit can be described as one single matrix being independent from all thermal equivalent circuits because there is no direct exchange of flow properties or connection of potentials. The electric-thermal coupling is completely done via signals by processing the signal blocks during each time-step right after the matrix equation (3) has been solved (see Fig 2). The computational effort of executing the signals is proportional to the number of signal blocks and, therefore, not considered to be a bottleneck.

Now the complex thermal model, as shown in Fig 2, has to be split into sub-matrices. As one can see, all equivalent circuits exchange information among each other exclusively via signals (signal-control of sources, temperature measurement and addition of partial junction temperatures). There are no shared potentials (excepting the ground node), and no direct exchange of flow properties. Therefore, each thermal equivalent circuit (totally 16 in Fig 2) can be described by a low-order matrix. These matrices can be solved sequentially during each time-step of the circuit simulation, and all coupling is performed by processing the signal blocks afterwards. The resulting performance of consequent Matrix Splitting is given in Table II. The results of the numerical calculations are identical to the Conventional Implementation. There is no loss of accuracy.

Matrix Splitting, as defined above, only shows an effect in the presence of thermal models. Considering the example of Fig 4, and building a fully dynamically coupled thermal chip model according to the procedure shown in Fig 2, will result in 16 thermal circuits with 3 nodes each, that are all mutually coupled and coupled with the power circuit via the signal blocks. Therefore, besides the power circuit contributing one simulation matrix \( A_{NC} \) of order 3, there will be additional 16 small matrices \( A_{NTJ} \) of order 3 each, which have all to be solved sequentially after each time step of the simulation.

The resulting computational effort is proportional to 459, which slows down the simulation speed by a factor 17 compared to the “pure” circuit simulation, but provides an increase of the simulation speed by a factor 289 compared to the Conventional Implementation.

V. EXAMPLE: POWER-MODULE WITH 36 INTERNAL CHIPS

A. 3300V/1200A HiPak IGBT Module with 36 Internal Chips

In the following it will be shown what kind of scaling problem could be faced if a multi-chip module has to be thermally modeled in a circuit simulator. As an example, a 3300V/1200A IGBT power module [14] will be discussed with a total of 36 internal chips (Fig 5). The general impedance matrix of the according thermal model is given in (4). The order of the matrix is equal to the chip number. A total of 36x36 = 1296 equivalent circuits is needed, and they all have to be coupled via signals (see Fig 2). Even if the coupling circuits (ij) outside the main diagonal of the matrix are simplified (Fig 3) or even omitted, the size of the model is huge.

![Fig.5. 3300V/1200A ABB HiPak IGBT Module [14] employing 36 internal chips. Here, 24 chips connected in parallel form one switch, and the remaining 12 chips connected in parallel form the anti-parallel diode. One such module could provide the lower switch plus its anti-parallel diode in Fig.4, and a second module would provide the upper switch plus diode.](image)
It is problematic to set up a model of such a size in a circuit simulator by hand, not only because it is a very time-consuming task but also because of reliability concerns since modeling errors are difficult to detect. There is need of automatically setting up such thermal networks inside the circuit simulator, but this is not a principle limitation of the whole procedure. The real problem of the approach is the increase in computational effort due to the huge model size as discussed theoretically in sections III and IV.

\[
\begin{pmatrix}
T_{junc,\text{CH}(1)}(t) \\
T_{junc,\text{CH}(2)}(t) \\
\vdots \\
T_{junc,\text{CH}(36)}(t)
\end{pmatrix}
= \begin{pmatrix}
\tau_{1,1} & \tau_{1,2} & \cdots & \tau_{1,35} & \tau_{1,36} \\
\tau_{2,3} & \tau_{2,2} & \cdots & \tau_{2,35} & \tau_{2,36} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
\tau_{36,1} & \tau_{36,2} & \cdots & \tau_{36,35} & \tau_{36,36}
\end{pmatrix}
\begin{pmatrix}
P_{\text{CE}(1)}(t) \\
P_{\text{CE}(2)}(t) \\
\vdots \\
P_{\text{CE}(36)}(t)
\end{pmatrix}
+ T_{\text{ambient}}
\]  

(4)

In Table III a comparison between the Conventional Implementation and Matrix Splitting is shown, where the power circuit’s lower transistor and the anti-parallel diode in Fig.4 are realized employing the 3300V/1200A multi-chip power module (Fig.5). In case of this 36-chip module, 24 internal chips form the lower switch, and 12 chips compose the lower anti-parallel diode. Accordingly, if a very detailed model is demanded, one could replace the lower switch in Fig.4(a) by 24 parallel switches and the lower anti-parallel diode by 12 parallel diodes. This would increase the number of elements in the power circuit significantly, but would not change the number of power circuit nodes \( n_{\text{NC}} \) and the order of simulation matrix \( A \). In case of applying LU Decomposition, the computational effort is basically defined by the node number, and, therefore, would not be increased.

Table III shows the computational effort and gives the number of matrices to be solved after each time step of the circuit simulation plus the order of the matrices. In case of the Conventional Implementation the order of the simulation matrix is even in case of simplified weak thermal coupling with 1371 so large that the direct method LU Decomposition might run into memory problems on today’s (2007) PCs. On the other hand, in case of Matrix Splitting the order of the small thermal sub-matrices is never larger than 3. Accordingly, the computational effort is reduced by a huge factor of approximately 10^8 if Matrix Splitting replaces the Conventional Implementation. Compared to the “pure” circuit simulation the simulation speed is reduced by a factor 84, but the Conventional Implementation would slow down the thermal simulation with weak coupling by a factor of approximately 10^5.

B. Software Implementation

A virtual design platform for power electronic systems is under development at PES/ETH Zurich, where Matrix Splitting as described in this paper has been implemented at the solver level of the circuit simulator. One feature of the developed simulator in Fig.6 is that complex thermally coupled models as shown in Fig.2 are built automatically from 3D-CAD models of the power module with minimum user-input necessary, saving the user a lot of time and guaranteeing reliability of the model. The transient power losses, consisting of temperature-dependent conduction and switching losses, are calculated as described in [1] and [6]. Junction temperatures of all 36 chips can be measured directly at the according ports.

For the transient simulation of the power circuit topology shown in Fig.6, the thermal properties of the chips were defined to consider not only the power module, but also the water-cooled heat sink (chip-junction to water-inlet). The according values of thermal resistances and capacitances of the 24 internal IGBT-chips were assumed to be \( R_{\text{th},\text{IGBT}} = [0.341, 0.156, 0.128] [\text{K/W}] \) and \( C_{\text{th},\text{IGBT}} = [48.7, 7.2, 0.7] [\text{Ws/K}] \) (third-order model), the 12 internal diodes were modeled with \( R_{\text{th},\text{D}} = [0.194, 0.147, 0.142] [\text{K/W}] \) and \( C_{\text{th},\text{D}} = [78.5, 5.4, 0.55] [\text{Ws/K}] \) (third-order model), and the coupling between the single chips was modeled with \( R_{\text{th, coupling}} = [0.018] [\text{K/W}] \) and

<table>
<thead>
<tr>
<th>Example: 3300V/1200A IGBT Module</th>
<th>Conventional Implementation</th>
<th>Matrix Splitting</th>
<th>Reduction of computational effort by Matrix Splitting</th>
</tr>
</thead>
<tbody>
<tr>
<td>( n_{\text{NC}} = 3, n_{\text{CHIP}} = 36, n_{\text{NT}} = 3 )</td>
<td>( \text{CE} \propto 27 \times M(3) )</td>
<td>( \text{CE}^{\text{SPLIT}} \propto 27 \times M(3) )</td>
<td>no effect</td>
</tr>
<tr>
<td>Include thermal simulation for thermally non-coupled chips</td>
<td>( \text{CE} \propto 1.37 \times 10^8 \times M(111) )</td>
<td>( \text{CE}^{\text{SPLIT}} \propto 999 \times M(3) \times M(36) )</td>
<td>factor 1.4 \times 10^4</td>
</tr>
<tr>
<td>Include thermal simulation for chips with weak resistive coupling</td>
<td>( \text{CE} \propto 2.58 \times 10^9 \times M(1371) )</td>
<td>( \text{CE}^{\text{SPLIT}} \propto 2259 \times M(3) \times M(36) \times M(1260) )</td>
<td>factor 1.1 \times 10^9</td>
</tr>
<tr>
<td>Include thermal simulation for full dynamic coupling between all chips</td>
<td>( \text{CE} \propto 58.9 \times 10^9 \times M(3891) )</td>
<td>( \text{CE}^{\text{SPLIT}} \propto 35019 \times M(3) \times M(1296) )</td>
<td>factor 1.7 \times 10^9</td>
</tr>
</tbody>
</table>

Comparison of the computational effort between the Conventional Implementation and Matrix Splitting for the 3300V/1200A IGBT Module with 36 internal chips employed as the lower bridge leg of the power circuit of Fig.4.
The thermal properties of all chips were assumed to be equal. However, this assumption is only approximate due to geometric asymmetries. Since the scope of this paper is to discuss the scaling problems of state-of-the-art implementations of electrical-thermal coupling, detailed simulation results with emphasis on accuracy and experimental verifications must be omitted here, and will be given in future research papers.

The loss properties of the power semiconductors were taken from the datasheet of the power module [14]. For the single IGBT-chip we get $U_{CEO,125} = 1.75V$ and $r_{ON,125} = 1.7m\Omega$ for calculation of the conduction losses and $k_{ON,125} = 0.10mWs/A$; $k_{OFF,125} = 0.11mWs/A$ @3000V for calculation of the switching-losses. For the conduction losses of the single diode-chip we get $U_{FE,125} = 1.5V$ and $r_{ON,125} = 0.73m\Omega$. Generally, the loss properties could be easily defined in dependency of the chip’s junction temperature, which has not been performed in this example, but will be discussed in detail in future research.

Voltage source, load resistance and inductor of the power circuit are defined as $U_{DC} = 3000V$, $R_{OUT} = 1.5\Omega$, $L = 200\mu H$, switching frequency and duty-cycle are defined as $f_{PWM} = 10kHz$ and $d = 0.5$. The constant numerical step-width of the simulation is $\Delta t_{Simulation} = 1\mu s$. The first 500ms of the electric-thermal numerical simulation are shown in Fig.7.

Junction-temperature of one IGBT-chip and one diode-chip during the first two seconds after activating the system are shown in Fig.8. Only the IGBT-chips are carrying current in this example, so that the diode-chips are heated solely due to thermal coupling. Because the thermal model in this example includes also the water-cooled heat sink, the corresponding time-constant of thermal coupling is in the range of two minutes, which results in a very slow rise of the diode-chip temperature. To fully simulate two real minutes in order to reach steady-state employing a constant step-width of $\Delta t_{Simulation} = 1\mu s$ would require about 70 hours simulation time (Pentium 4, 3.6GHz, 1GB RAM) even in this extremely fast Matrix-Splitting implementation.

Fig.6. Simulator employing Matrix Splitting developed at PES/ETH Zurich. The $36\times36 = 1296$ thermal equivalent circuits, mutually coupled with each other, are ‘hidden’ behind the symbol of the power module, where all 36 internal semiconductor chips are available for inflow of thermal power and measuring junction temperature. The thermal losses are available via thermal blocks named ‘Pv_chip’.

Fig.7. Simulation of the first 500µs of the topology in Fig.6. (Top) Load current. (Middle) Transient power losses of one of the 24 IGBT-chips consisting of on- and off-switching losses shown as needles, and conduction losses proportional to current during on-state. (Bottom) Step-wise rise of junction-temperature of one IGBT-chip, and junction-temperature of one diode-chip (close to zero during the first 500µs).

Fig.8. Simulation of the first two seconds of the topology in Fig.6 requiring a simulation time of about 70 minutes on a Pentium 4 (3.6GHz, 1GB RAM). The top curve gives the junction-temperature of a single IGBT-chip, and the lower curve gives the junction-temperature of a diode-chip which is heated by the IGBTs in its neighborhood.
Besides solving the matrix equations there is also significant computational effort during each time step in setting up the matrix equations by linearization, and performing a multitude of other tasks to make the simulator work. Therefore, comparing real simulation times of Matrix Splitting and Conventional Implementation will give improvement factors that might differ from the theoretical factors as listed in Tab. III. The factors calculated in Tab. III basically give a good assessment of the general scaling issues and theoretically possible reduction of simulation time.

VI. CONCLUSION

A state-of-the-art circuit simulator implementation (Conventional Implementation) of thermal models for a multi-chip power module comprising more than just a few chips not only slows down the simulation speed to unacceptable levels, but can also prevent the application of direct matrix solution algorithms such as LU Decomposition which are essential to ensure the stability of numerical simulations of switched power electronic systems. Matrix Splitting, built into the circuit simulator solver algorithm as proposed, is a method that can help to reliably simulate complex and large thermal models coupled with power circuits by orders of magnitude faster than the Conventional Implementation method.

REFERENCES


