Solvers

See more details about the Solvers in the API

Algorithms for solving Models are implemented in Solvers. Since this typically is where most of the computational time is spent, low-level optimization often is required. The purpose of the Solver is to contain these numerical details and present an intuitive interface to the application developer. The interface allows for the algorithm to be configured on-demand, but aims to minimize the amount of method-specific knowledge that is necessary to use the Solver.

Internally, the Solver converts the Model to the format that is best suited for the algorithm. Likewise, the output of the algorithm is stored in a method-specific format. This data can be requested from the Solver, but the Solver is not responsible for providing the output on a method independent format. Instead, each Solver can be wrapped in a corresponding PropertyExtractor, through which Properties can be extracted using a notation that is Solver independent. In this chapter, we therefore only discuss how to set up, configure, and run a Solver, leaving the extraction of Properties to the next chapter.

TBTK currently contains four production-ready Solvers: the Diagonalizer, BlockDiagonalizer, ArnoldiIterator, and ChebyshevExpander. All of these are single-particle Solvers, which is the reason that this manual at the moment only describes how to set up and solve single-particle problems.

The two first Solvers are based on diagonalization and can calculate all eigenvalues and eigenvectors. Their strength is that they provide solutions with complete information about the system, allowing for arbitrary Properties to be calculated. However, diagonalization scales poorly with system size and is therefore not feasible for very large systems. If the Model is block-diagonal, the BlockDiagonalizer can handle much larger systems than the Diagonalizer.

The ArnoldiIterator also calculates eigenvalues and eigenvectors but can be used to calculate only a few of them. Arnoldi iteration is a so-called Krylov method that iteratively builds a larger and larger subspace of the total Hilbert space. It then performs diagonalization on this restricted subspace. If only a few eigenvalues or eigenvectors are needed, they can be calculated quickly, even if the Model is very large.

The ChebyshevExpander can calculate the Green's function \(G_{\mathbf{i}\mathbf{j}}(E)\) for any given pair of Indices \(\mathbf{i}\) and \(\mathbf{j}\). It can be used to calculate Properties for very large systems when they are needed for a limited number of Indices. However, the method is iterative, and infinite precision along the energy axis requires an infinite number of steps.

All Solvers inherit from the base class TBTK::Solver::Solver. This is a very simple class that only implements two functions for setting and getting the Model to solve.

Solver::Solver solver;

solver.setModel(model);

const Model &referenceToModel = solver.getModel();

It is important that the Model that is passed to the Solver remains in memory for the full lifetime of the Solver. Therefore, do not call *solver.setModel()* with a temporary Model object.

Solvers are also Communicators, which means that they can output diagnostic information that is possible to mute. To turn on the output, call

solver.setVerbose(true);

The Diagonalizer sets up a dense matrix representing the Hamiltonian and diagonalizes it to obtain the eigenvalues and eigenvectors. Because of its simplicity, it is a good choice for Models with a Hilbert space size up to a few thousands. Setting it up and running the diagonalization is straight forward.

Solver::Diagonalizer solver;

solver.setModel(model);

solver.run();

The Solver is now ready to be wrapped in its PropertyExtractor to allow for Properties to be extracted.

The required amount of space is particularly easy to estimate for the Diagonalizer. Internally the Hamiltonian is stored as an upper triangular matrix of type std::complex<double> (16 bytes per entry). The space required to store a Hamiltonian with basis size N is therefore roughly \(16(N^2/2) = 8N^2\) bytes. The eigenvectors and eigenvalues require another \(16N^2\) and \(8N\) bytes, respectively. This adds up to about \(24N^2\) bytes, which runs into the GB range around a basis size of 7000.

The time it takes to diagonalize a matrix cannot be estimated with the same precision since it depends on clock speed etc. However, as of 2018 it runs into the hour range for basis sizes of a few thousands. Knowing the execution time for a specific size and computer, the time required for a different size can be accurately predicted using that the time scales as \(O(N^3)\).

In a self-consistent procedure, some input parameter is unknown but can be calculated as an output. It is, therefore, possible to make an initial guess, calculate the parameter, and feed the result back in as input. If this procedure is repeated until (and if) the input and output converge (within a given tolerance), the solution is said to be self-consistent.

The Diagonalizer can execute such a self-consistency loop. However, what parameter that is to be solved for is problem-specific, and so is the measure of tolerance. Therefore, after the Hamiltonian has been diagonalized, the Diagonalizer can give control back to the application developer through a self-consistency callback. The callback is required to calculate the relevant parameter, check for convergence, and return true or false depending on whether or not convergence has been reached. If the callback returns true, the Diagonalizer finishes, otherwise, the Hamiltonian is rediagonalized, and the procedure is repeated.

A self-consistency callback is implemented as follows.

class SelfConsistencyCallback :

public Solver::Diagonalizer::SelfConsistencyCallback

{

public:

bool selfConsistencyCallback(Solver::Diagonalizer &solver){

//Calculate and update the parameter(s).

//...

//Determine whether the solution has converged or not.

//...

//Return true if the solution has converged, otherwise return

//false.

if(hasConverged)

return true;

else

return false;

}

};

We note that for the self-consistency procedure to work, those HoppingAmplitudes that depend on the parameter must themselves be dependent on a HoppingAmplitude callback. It is the application developer's responsibility to make sure that the parameter that is calculated in the self-consistency callback is passed on to the HoppingAmplitude callback.

With the callback defined, the self-consistent calculation can be set up.

Solver::Diagonalizer solver;

solver.setModel(model);

SelfConsistencyCallback selfConsistencyCallback;

solver.setSelfConsistencyCallback(selfConsistencyCallback);

solver.setMaxIterations(100);

solver.run();

The call to *solver.setMaxIterations(100)* caps the number of iterations at 100. If the calculation has not converged by then it finishes anyway. For a complete example of a self-consistent calculation, the reader is referred to the SelfConsistentSuperconductivity template in the Templates folder.

The BlockDiagonalizer mimics the Diagonalizer, but can take advantage of a Hamiltonians block structure.

Solver::BlockDiagonalizer solver;

solver.setModel(model);

solver.run();

The implementation of a compatible self-consistency callback for the BlockDiagonalizer also closely parallels that of the Diagonalizer.

class SelfConsistencyCallback :

public Solver::BlockDiagonalizer::SelfConsistencyCallback

{

public:

bool selfConsistencyCallback(Solver::BlockDiagonalizer &solver){

//Calculate and update the parameter(s).

//...

//Determine whether the solution has converged or not.

//...

//Return true if the solution has converged, otherwise return

//false.

if(hasConverged)

return true;

else

return false;

}

};

The ArnoldiIterator allows for a few eigenvalues and eigenvectors to be calculated quickly, even if the system is very large. Starting with the Hamiltonian \(H\) and a random vector \(v_0\), it iteratively calculates \(v_n\) for higher \(n\). It does so by first calculating \(\tilde{v}_n = Hv_{n-1}\) and then orthonormalizing it against \(v_{n-1}, ..., v_{0}\) to obtain \(v_n\). The product \(Hv_{n-1}\) amplifies the components of \(v_{n-1}\) that are eigenvectors with extremal eigenvalues. Therefore, the subspace spanned by \(\{v_{n}\}\) quickly approximates the part of the Hilbert space that contains these extremal eigenvalues. Finally, performing diagonalization on this much smaller subspace, the extremal eigenvalues and eigenvectors are calculated quickly. The eigenvalues and eigenvectors obtained in this way are called Ritz-values and Ritz-vectors.

The ArnoldiIterator can be set up as follows.

Solver::ArnoldiIterator solver;

solver.setModel(model);

solver.setNumLanczosVectors(200);

solver.setMaxIterations(500);

solver.setNumEigenVectors(100);

solver.setCalculateEigenVectors(true);

solver.run();

Here, *solver.setNumLanczosVectors()* sets the size of the subspace that is to be generated, while *solver.setNumEigenVectors()* determines the number of eigenvalues to calculate. The number of eigenvalues should at most be as large as the number of Lanczos vectors. However, since the iteration starts with a random vector, choosing a smaller number of eigenvalues results in the less accurate subspace to be ignored. Finally, the ArnoldiIterator uses a modified version of the procedure described above called implicitly restarted Arnoldi iteration (see the ARPACK documentation). This leads to more iterations being executed than the number of Lanczos vectors finally generated. The line *solver.setMaxIterations()* puts a cap on the number of iterations.

The ArnoldiIterator can also be used to calculate eigenvalues and eigenvectors around a given value \(\lambda\). This is done by replacing \(H\) by \((H - \lambda)^{-1}\), which is referred to as shift-and-invert. To perform shift-and-invert, we need to set the "central value" \(\lambda\) and instruct the ArnoldiIterator to work in this mode.

solver.setCentralValue(1);

solver.setMode(Solver::ArnoldiIterator::Mode::ShiftAndInvert);

The ChebyshevExpander calculates the Green's function on the form

Here \(F(x)\) is one of the functions \(\cos(x)\), \(\sin(x)\), \(e^{ix}\), and \(e^{-ix}\). Further, the "scale factor" \(s\) must be chosen large enough that all of the Hamiltonian's eigenvalues fall inside of the range \(|E/s| < 1\). This is called a Chebyshev expansion of the Green's function, and for more details, the reader is referred to Phys. Rev. Lett. **78**, 275 (2006), Phys. Rev. Lett. **105**, 1 (2010), and urn:nbn:se:uu:diva-305212.

The ChebyshevExpander calculates the Green's function in two steps. In the first step, the expansion coefficients

\begin{eqnarray*} b_{\mathbf{i}\mathbf{j}}^{(m)} &=& \langle j_{\mathbf{i}}^{(0)}|j_{\mathbf{j}}^{(m)}\rangle, \end{eqnarray*}

are calculated using the recursive expressions

\begin{eqnarray*} |j_{\mathbf{j}}^{(1)}\rangle &=& H|j_{\mathbf{j}}^{(0)}\rangle,\\ |j_{\mathbf{j}}^{(m)}\rangle &=& 2H|j_{\mathbf{j}}^{(m-1)}\rangle - |j_{\mathbf{j}}^{(m-2)}\rangle. \end{eqnarray*}

Here \(|j_{\mathbf{i}}^{(0)}\rangle\) and \(|j_{\mathbf{j}}^{(0)}\rangle\) are vectors with zeros everywhere, except for a one in the position associated with \(\mathbf{i}\) and \(\mathbf{j}\), respectively. Using sparse matrix-vector multiplication, these expansion coefficients can be calculated efficiently. However, an exact solution requires an infinite number of coefficients, and a cutoff must, therefore, be introduced.

In the second step, the ChebyshevExpander generates the Green's function using the calculated coefficients and the expression for the Green's function above.

A ChebyshevExpander with \(s = 10\) and the number of expansion coefficients capped at 1000 can be set up as follows.

Solver::ChebyshevExpander solver;

solver.setModel(model);

solver.setScaleFactor(10);

solver.setNumCoefficients(1000);

The expansion coefficients can be efficiently calculated on a GPU.

solver.setCalculateCoefficientsOnGPU(true);

Also, the Green's functions can be generated on a GPU (but this is less important and runs into memory limitations relatively quickly).

solver.setGenerateGreensFunctionsOnGPU(true);

When the Green's function is generated for more than one pair of Indices, the execution time can be significantly decreased by using a precalculated lookup table.

solver.setUseLookupTable(true);

The lookup table is required to generate the Green's function on a GPU, and the only downside of generating it is that it can consume large amounts of memory.