Abstract
There exist algorithms, also called fast algorithms, which exploit the special structure of Toeplitz matrices so that, e.g., allow to solve a linear system of equations in O(n2) flops. However, some implementations of classical algorithms that do not use this structure (O(n3) flops) highly reduce the time to solution when several cores are available. That is why it is necessary to work on fast algorithms so that they do not lose track of the benefits of new hardware/software. In this work, we propose a new approach to the Generalized Schur Algorithm, a very known al- gorithm for the solution of Toeplitz systems, to work on a BlockToeplitz matrix. Our algorithm is based on matrixmatrix multiplications, thus allowing to exploit an efficient implementation of this operation if it exists. Our algorithm also makes use of the thread level parallelism featured by multicores to decrease execution time.