
Matrix multiplication

In mathematics, matrix multiplication is the operation of multiplying a matrix with either a scalar or another matrix. This article gives an overview of the various ways to perform matrix multiplication.

Ordinary matrix product

This is the most often used and most important way to multiply matrices. It is defined between two matrices only if the number of columns of the first matrix is the same as the number of rows of the second matrix.

Formally, for

\mathbf{A} \in \mathbb{R}^{m \times n},\quad \mathbf{B} \in \mathbb{R}^{n \times p}

then

\mathbf{AB} \in \mathbb{R}^{m \times p}

where the elements of AB are given by

(\mathbf{AB})_{i,j} = \sum_{r=1}^n A_{i,r} B_{r,j}

for each pair i and j with 1 ≤ i ≤ m and 1 ≤ j ≤ p. The algebraic system of "matrix units" summarises the abstract properties of this kind of multiplication.

Calculating directly from the definition

To illustrate, consider calculating the (1,2) element and the (3,3) element of AB when A is a 4×2 matrix and B is a 2×3 matrix. Elements from the corresponding row of A and column of B are paired off; each pair is multiplied and the products are added. The location of the resulting number in AB corresponds to the row of A and the column of B that were used.

(\mathbf{AB})_{1,2} = \sum_{r=1}^2 a_{1,r} b_{r,2} = a_{1,1} b_{1,2} + a_{1,2} b_{2,2}
(\mathbf{AB})_{3,3} = \sum_{r=1}^2 a_{3,r} b_{r,3} = a_{3,1} b_{1,3} + a_{3,2} b_{2,3}

For example:

\begin{bmatrix}
   1 & 0 & 2 \\
  -1 & 3 & 1
\end{bmatrix} \cdot \begin{bmatrix}
   3 & 1 \\
   2 & 1 \\
   1 & 0
\end{bmatrix} = \begin{bmatrix}
   1 \times 3 + 0 \times 2 + 2 \times 1 & 1 \times 1 + 0 \times 1 + 2 \times 0 \\
  -1 \times 3 + 3 \times 2 + 1 \times 1 & -1 \times 1 + 3 \times 1 + 1 \times 0
\end{bmatrix} = \begin{bmatrix}
   5 & 1 \\
   4 & 2
\end{bmatrix}
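The definition can be checked directly with a short script. The following is a minimal sketch in plain Python (no libraries assumed; the helper name matmul is ours, not a standard function), reproducing the worked example above:

    def matmul(A, B):
        """Multiply matrices given as lists of rows, straight from the definition."""
        n_rows, n_inner, n_cols = len(A), len(B), len(B[0])
        assert len(A[0]) == n_inner, "columns of A must equal rows of B"
        # (AB)[i][j] = sum over r of A[i][r] * B[r][j]
        return [[sum(A[i][r] * B[r][j] for r in range(n_inner))
                 for j in range(n_cols)]
                for i in range(n_rows)]

    A = [[1, 0, 2], [-1, 3, 1]]
    B = [[3, 1], [2, 1], [1, 0]]
    print(matmul(A, B))  # [[5, 1], [4, 2]]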

The coefficients-vectors method

This matrix product can also be viewed from a slightly different angle: each row of the result is obtained by adding together the rows of B after each has been multiplied by a coefficient taken from the corresponding row of A. If A and B are matrices given by:

\mathbf{A} =
\begin{bmatrix}
  a_{1,1} & a_{1,2} & \dots \\
  a_{2,1} & a_{2,2} & \dots \\
  \vdots  & \vdots  & \ddots
\end{bmatrix}

and

\mathbf{B} =
\begin{bmatrix}
  b_{1,1} & b_{1,2} & \dots \\
  b_{2,1} & b_{2,2} & \dots \\
  \vdots  & \vdots  & \ddots
\end{bmatrix} = \begin{bmatrix}
  B_1 \\
  B_2 \\
  \vdots
\end{bmatrix}

then

\mathbf{AB} = \begin{bmatrix}
  a_{1,1} B_1 + a_{1,2} B_2 + \cdots \\
  a_{2,1} B_1 + a_{2,2} B_2 + \cdots \\
  \vdots
\end{bmatrix}

The example revisited:

\begin{bmatrix}
   1 & 0 & 2 \\
  -1 & 3 & 1
\end{bmatrix} \cdot \begin{bmatrix}
   3 & 1 \\
   2 & 1 \\
   1 & 0
\end{bmatrix} = \begin{bmatrix}
   1 \begin{bmatrix} 3 & 1 \end{bmatrix} + 0 \begin{bmatrix} 2 & 1 \end{bmatrix} + 2 \begin{bmatrix} 1 & 0 \end{bmatrix} \\
  -1 \begin{bmatrix} 3 & 1 \end{bmatrix} + 3 \begin{bmatrix} 2 & 1 \end{bmatrix} + 1 \begin{bmatrix} 1 & 0 \end{bmatrix}
\end{bmatrix} = \begin{bmatrix}
   \begin{bmatrix} 3 & 1 \end{bmatrix} + \begin{bmatrix} 0 & 0 \end{bmatrix} + \begin{bmatrix} 2 & 0 \end{bmatrix} \\
   \begin{bmatrix} -3 & -1 \end{bmatrix} + \begin{bmatrix} 6 & 3 \end{bmatrix} + \begin{bmatrix} 1 & 0 \end{bmatrix}
\end{bmatrix}

= \begin{bmatrix}
   5 & 1 \\
   4 & 2
\end{bmatrix}

The rows in the matrix on the left can be thought of as the list of coefficients and the matrix on the right as the list of vectors. In the example, the first row is [1 0 2], and thus we take 1 times the first vector, 0 times the second vector, and 2 times the third vector.
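This row-by-row view translates directly into code. The sketch below is plain Python (the helper name matmul_rowcomb is ours) and builds each row of the product as a linear combination of the rows of B:

    def matmul_rowcomb(A, B):
        """Each row of AB is a linear combination of the rows of B,
        with the matching row of A supplying the coefficients."""
        result = []
        for coeffs in A:                       # one row of A at a time
            row = [0] * len(B[0])
            for c, b_row in zip(coeffs, B):    # add c * (row of B) into the running total
                row = [r + c * b for r, b in zip(row, b_row)]
            result.append(row)
        return result

    A = [[1, 0, 2], [-1, 3, 1]]
    B = [[3, 1], [2, 1], [1, 0]]
    print(matmul_rowcomb(A, B))  # [[5, 1], [4, 2]]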

The equation can be simplified further by using outer products:

\mathbf{A} =
\begin{bmatrix}
  A_1 & A_2 & \dots
\end{bmatrix} \implies \mathbf{AB} = \sum_i A_i B_i

The terms of this sum are matrices of the same shape, each describing the effect of one column of A and one row of B on the result. The columns of A can be seen as a coordinate system of the transform, i.e. given a vector x we have \mathbf{A}x = A_1 x_1 + A_2 x_2 + \cdots where x_i are coordinates along the A_i "axes". The terms A_i B_i are like A_i x_i, except that B_i contains the ith coordinate for each column vector of B, each of which is transformed independently in parallel.

The example revisited:

\begin{bmatrix}
   1 & 0 & 2 \\
  -1 & 3 & 1
\end{bmatrix} \cdot \begin{bmatrix}
   3 & 1 \\
   2 & 1 \\
   1 & 0
\end{bmatrix} = \begin{bmatrix} 1 \\ -1 \end{bmatrix} \begin{bmatrix} 3 & 1 \end{bmatrix} + \begin{bmatrix} 0 \\ 3 \end{bmatrix} \begin{bmatrix} 2 & 1 \end{bmatrix} + \begin{bmatrix} 2 \\ 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \end{bmatrix}

= \begin{bmatrix} 1 \cdot 3 & 1 \cdot 1 \\ -1 \cdot 3 & -1 \cdot 1 \end{bmatrix} + \begin{bmatrix} 0 \cdot 2 & 0 \cdot 1 \\ 3 \cdot 2 & 3 \cdot 1 \end{bmatrix} + \begin{bmatrix} 2 \cdot 1 & 2 \cdot 0 \\ 1 \cdot 1 & 1 \cdot 0 \end{bmatrix} = \begin{bmatrix} 5 & 1 \\ 4 & 2 \end{bmatrix}

The vectors \begin{bmatrix}3 & 2 & 1\end{bmatrix}^\top and \begin{bmatrix}1 & 1 & 0\end{bmatrix}^\top have been transformed to \begin{bmatrix}5 & 4\end{bmatrix}^\top and \begin{bmatrix}1 & 2\end{bmatrix}^\top in parallel. One could also transform them one by one with the same steps:

\begin{bmatrix}
   1 & 0 & 2 \\
  -1 & 3 & 1
\end{bmatrix} \cdot \begin{bmatrix}
   3 \\
   2 \\
   1
\end{bmatrix} = \begin{bmatrix} 1 \\ -1 \end{bmatrix} 3 + \begin{bmatrix} 0 \\ 3 \end{bmatrix} 2 + \begin{bmatrix} 2 \\ 1 \end{bmatrix} 1 = \begin{bmatrix} 1 \cdot 3 \\ -1 \cdot 3 \end{bmatrix} + \begin{bmatrix} 0 \cdot 2 \\ 3 \cdot 2 \end{bmatrix} + \begin{bmatrix} 2 \cdot 1 \\ 1 \cdot 1 \end{bmatrix} = \begin{bmatrix} 5 \\ 4 \end{bmatrix}
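The outer-product decomposition \mathbf{AB} = \sum_i A_i B_i can also be checked numerically. The sketch below assumes NumPy is available and uses np.outer to form each rank-one term from a column of A and the matching row of B:

    import numpy as np

    A = np.array([[1, 0, 2],
                  [-1, 3, 1]])
    B = np.array([[3, 1],
                  [2, 1],
                  [1, 0]])

    # Sum of rank-one terms: (i-th column of A) times (i-th row of B).
    terms = [np.outer(A[:, i], B[i, :]) for i in range(A.shape[1])]
    print(sum(terms))                      # [[5 1]
                                           #  [4 2]]
    print(np.allclose(sum(terms), A @ B))  # True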

Vector-lists method

The ordinary matrix product can be thought of as a dot product of a column-list of vectors and a row-list of vectors. If A and B are matrices given by:

\mathbf{A} =
\begin{bmatrix}
  a_{1,1} & a_{1,2} & a_{1,3} & \dots \\
  a_{2,1} & a_{2,2} & a_{2,3} & \dots \\
  a_{3,1} & a_{3,2} & a_{3,3} & \dots \\
  \vdots  & \vdots  & \vdots  & \ddots
\end{bmatrix} = \begin{bmatrix}
  A_1 \\
  A_2 \\
  A_3 \\
  \vdots
\end{bmatrix}

and

\mathbf{B} =
\begin{bmatrix}
  b_{1,1} & b_{1,2} & b_{1,3} & \dots \\
  b_{2,1} & b_{2,2} & b_{2,3} & \dots \\
  b_{3,1} & b_{3,2} & b_{3,3} & \dots \\
  \vdots  & \vdots  & \vdots  & \ddots
\end{bmatrix} = \begin{bmatrix} B_1 & B_2 & B_3 & \dots \end{bmatrix}

where

A_1 is the row vector of all elements of the form a_{1,x}, A_2 is the row vector of all elements of the form a_{2,x}, etc.,
and B_1 is the column vector of all elements of the form b_{x,1}, B_2 is the column vector of all elements of the form b_{x,2}, etc.,

then

\mathbf{AB} =
\begin{bmatrix}
  A_1 \\
  A_2 \\
  A_3 \\
  \vdots
\end{bmatrix} \begin{bmatrix} B_1 & B_2 & B_3 & \dots \end{bmatrix} = \begin{bmatrix}
  (A_1 \cdot B_1) & (A_1 \cdot B_2) & (A_1 \cdot B_3) & \dots \\
  (A_2 \cdot B_1) & (A_2 \cdot B_2) & (A_2 \cdot B_3) & \dots \\
  (A_3 \cdot B_1) & (A_3 \cdot B_2) & (A_3 \cdot B_3) & \dots \\
  \vdots & \vdots & \vdots & \ddots
\end{bmatrix}.
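This viewpoint corresponds to the following plain-Python sketch (the helper names dot and matmul_dots are ours), which fills each entry with the dot product of a row of A and a column of B:

    def dot(u, v):
        """Dot product of two equal-length vectors."""
        return sum(x * y for x, y in zip(u, v))

    def matmul_dots(A, B):
        """Entry (i, j) of AB is the dot product of row i of A and column j of B."""
        cols_of_B = list(zip(*B))              # transpose B to get its columns
        return [[dot(row, col) for col in cols_of_B] for row in A]

    A = [[1, 0, 2], [-1, 3, 1]]
    B = [[3, 1], [2, 1], [1, 0]]
    print(matmul_dots(A, B))  # [[5, 1], [4, 2]]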

Properties

  • Matrix multiplication is not commutative in general, although particular pairs of matrices may commute:

\mathbf{AB} \ne \mathbf{BA}

  • If A and B are square matrices of the same dimension, then the determinant of their product is the same regardless of the order of the factors:

\det(\mathbf{AB}) = \det(\mathbf{BA})

  • If both matrices are diagonal square matrices of the same dimension, their product is commutative.
  • If A is the matrix of a linear transformation L and B is the matrix of a linear transformation P, then AB is the matrix of the transformation P followed by L. Although the transformations appear to be applied in reverse order, this is the natural order: in function composition the right-most function is applied first, so the transformation corresponding to B acts first.
  • Matrix multiplication is associative:

\mathbf{A}(\mathbf{BC}) = (\mathbf{AB})\mathbf{C}

  • Matrix multiplication is distributive:

\mathbf{A}(\mathbf{B} + \mathbf{C}) = \mathbf{AB} + \mathbf{AC}
(\mathbf{A} + \mathbf{B})\mathbf{C} = \mathbf{AC} + \mathbf{BC}.

  • If the matrices are defined over a field (for example, the real or complex numbers), then matrix multiplication is compatible with scalar multiplication:

c(\mathbf{AB}) = (c\mathbf{A})\mathbf{B}
(\mathbf{A}c)\mathbf{B} = \mathbf{A}(c\mathbf{B})
(\mathbf{AB})c = \mathbf{A}(\mathbf{B}c)

where c is a scalar.
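Several of the properties above can be checked numerically. The following is a small sketch assuming NumPy; with random integer matrices the products are exact, so the checks are not affected by rounding:

    import numpy as np

    rng = np.random.default_rng(0)
    A, B, C = (rng.integers(-3, 4, size=(3, 3)) for _ in range(3))

    print(np.array_equal(A @ B, B @ A))                            # usually False: not commutative
    print(np.isclose(np.linalg.det(A @ B), np.linalg.det(B @ A)))  # True
    print(np.array_equal(A @ (B @ C), (A @ B) @ C))                # True: associative
    print(np.array_equal(A @ (B + C), A @ B + A @ C))              # True: distributive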

Algorithms

The complexity of matrix multiplication, if carried out naively, is O(n^3), but more efficient algorithms do exist. Strassen's algorithm, devised by Volker Strassen in 1969 and often referred to as "fast matrix multiplication", is based on a clever way of multiplying two 2 × 2 matrices which requires only 7 multiplications (instead of the usual 8), at the expense of several additional addition and subtraction operations. Applying this trick recursively gives an algorithm with a multiplicative cost of O(n^{\log_2 7}) \approx O(n^{2.807}). Strassen's algorithm is awkward to implement compared to the naive algorithm, and it is less numerically stable. Nevertheless, it is beginning to appear in libraries such as BLAS, where it is computationally interesting for matrices of dimension n > 100 (Press 2007, p. 108).
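To make the recursion concrete, here is a minimal sketch of Strassen's scheme in Python. It assumes NumPy is available and that the inputs are square with a power-of-two size; a practical implementation would also switch to the naive product below a cutoff size instead of recursing down to 1 × 1 blocks.

    import numpy as np

    def strassen(A, B):
        """Strassen multiplication for square matrices whose size is a power of two."""
        n = A.shape[0]
        if n == 1:
            return A * B
        h = n // 2
        A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
        B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]

        # Seven block products instead of eight.
        M1 = strassen(A11 + A22, B11 + B22)
        M2 = strassen(A21 + A22, B11)
        M3 = strassen(A11, B12 - B22)
        M4 = strassen(A22, B21 - B11)
        M5 = strassen(A11 + A12, B22)
        M6 = strassen(A21 - A11, B11 + B12)
        M7 = strassen(A12 - A22, B21 + B22)

        C11 = M1 + M4 - M5 + M7
        C12 = M3 + M5
        C21 = M2 + M4
        C22 = M1 - M2 + M3 + M6
        return np.block([[C11, C12], [C21, C22]])

    A = np.random.rand(8, 8)
    B = np.random.rand(8, 8)
    print(np.allclose(strassen(A, B), A @ B))  # True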

The algorithm with the lowest known exponent, presented by Don Coppersmith and Shmuel Winograd in 1990, has an asymptotic complexity of O(n^{2.376}). It is similar to Strassen's algorithm: a clever way is devised for multiplying two k × k matrices with fewer than k³ multiplications, and this technique is applied recursively. It improves on the constant factor in Strassen's algorithm, reducing it to 4.537. However, the constant factor hidden in the O(n^{2.376}) bound is so large that the Coppersmith–Winograd algorithm is only worthwhile for matrices that are too big to handle on present-day computers (Knuth, 1998).

Since any algorithm for multiplying two n × n matrices has to process all 2n² entries, there is a trivial asymptotic lower bound of Ω(n²) operations. Raz (2002) proves a lower bound of Ω(m² log m) for bounded-coefficient arithmetic circuits over the real or complex numbers.

Cohn et al. (2003, 2005) put methods such as the Strassen and Coppersmith–Winograd algorithms in an entirely different, group-theoretic context. They show that if families of wreath products of Abelian groups with symmetric groups satisfying certain conditions exist, then matrix multiplication algorithms with essentially quadratic complexity exist. Most researchers believe that this is indeed the case (Robinson, 2005).

Because of the nature of matrix operations and the layout of matrices in memory, it is typically possible to gain substantial additional performance through parallelisation and vectorisation. An algorithm with a lower asymptotic time complexity on paper may therefore not be the faster choice on real machines.

Relationship to linear transformations

Matrices offer a concise way of representing linear transformations between vector spaces, and (ordinary) matrix multiplication corresponds to the composition of linear transformations. This will be illustrated here by means of an example using three vector spaces of specific dimensions, but the correspondence applies equally to any other choice of dimensions.

Let X, Y, and Z be three vector spaces, with dimensions 4, 2, and 3, respectively, all over the same field, for example the real numbers. The coordinates of a point in X will be denoted as x_i, for i = 1 to 4, and analogously for the other two spaces.

Two linear transformations are given: one from Y to X, which can be expressed by the system of linear equations

\begin{align}
x_1 & = a_{1,1}y_1 + a_{1,2}y_2 \\
x_2 & = a_{2,1}y_1 + a_{2,2}y_2 \\
x_3 & = a_{3,1}y_1 + a_{3,2}y_2 \\
x_4 & = a_{4,1}y_1 + a_{4,2}y_2
\end{align}

and one from Z to Y, expressed by the system

\begin{align}
y_1 & = b_{1,1}z_1 + b_{1,2}z_2 + b_{1,3}z_3 \\
y_2 & = b_{2,1}z_1 + b_{2,2}z_2 + b_{2,3}z_3
\end{align}

These two transformations can be composed to obtain a transformation from Z to X. By substituting, in the first system, the right-hand sides of the equations of the second system for their corresponding left-hand sides, the x_i can be expressed in terms of the z_k:

\begin{align}
x_1 & = a_{1,1}(b_{1,1}z_1 + b_{1,2}z_2 + b_{1,3}z_3) + a_{1,2}(b_{2,1}z_1 + b_{2,2}z_2 + b_{2,3}z_3) \\
    & = (a_{1,1}b_{1,1} + a_{1,2}b_{2,1})z_1 + (a_{1,1}b_{1,2} + a_{1,2}b_{2,2})z_2 + (a_{1,1}b_{1,3} + a_{1,2}b_{2,3})z_3 \\
x_2 & = a_{2,1}(b_{1,1}z_1 + b_{1,2}z_2 + b_{1,3}z_3) + a_{2,2}(b_{2,1}z_1 + b_{2,2}z_2 + b_{2,3}z_3) \\
    & = (a_{2,1}b_{1,1} + a_{2,2}b_{2,1})z_1 + (a_{2,1}b_{1,2} + a_{2,2}b_{2,2})z_2 + (a_{2,1}b_{1,3} + a_{2,2}b_{2,3})z_3 \\
x_3 & = a_{3,1}(b_{1,1}z_1 + b_{1,2}z_2 + b_{1,3}z_3) + a_{3,2}(b_{2,1}z_1 + b_{2,2}z_2 + b_{2,3}z_3) \\
    & = (a_{3,1}b_{1,1} + a_{3,2}b_{2,1})z_1 + (a_{3,1}b_{1,2} + a_{3,2}b_{2,2})z_2 + (a_{3,1}b_{1,3} + a_{3,2}b_{2,3})z_3 \\
x_4 & = a_{4,1}(b_{1,1}z_1 + b_{1,2}z_2 + b_{1,3}z_3) + a_{4,2}(b_{2,1}z_1 + b_{2,2}z_2 + b_{2,3}z_3) \\
    & = (a_{4,1}b_{1,1} + a_{4,2}b_{2,1})z_1 + (a_{4,1}b_{1,2} + a_{4,2}b_{2,2})z_2 + (a_{4,1}b_{1,3} + a_{4,2}b_{2,3})z_3
\end{align}

These three systems can be written equivalently in matrix–vector notation – thereby reducing each system to a single equation – as follows:
\begin{bmatrix}
   x_1 \\
   x_2 \\
   x_3 \\
   x_4
\end{bmatrix} = \begin{bmatrix}
   a_{1,1} & a_{1,2} \\
   a_{2,1} & a_{2,2} \\
   a_{3,1} & a_{3,2} \\
   a_{4,1} & a_{4,2}
\end{bmatrix} \cdot \begin{bmatrix}
   y_1 \\
   y_2
\end{bmatrix}

\begin{bmatrix}
   y_1 \\
   y_2
\end{bmatrix} = \begin{bmatrix}
   b_{1,1} & b_{1,2} & b_{1,3} \\
   b_{2,1} & b_{2,2} & b_{2,3}
\end{bmatrix} \cdot \begin{bmatrix}
   z_1 \\
   z_2 \\
   z_3
\end{bmatrix}

\begin{bmatrix}
   x_1 \\
   x_2 \\
   x_3 \\
   x_4
\end{bmatrix} = \begin{bmatrix}
   a_{1,1}b_{1,1} + a_{1,2}b_{2,1} & a_{1,1}b_{1,2} + a_{1,2}b_{2,2} & a_{1,1}b_{1,3} + a_{1,2}b_{2,3} \\
   a_{2,1}b_{1,1} + a_{2,2}b_{2,1} & a_{2,1}b_{1,2} + a_{2,2}b_{2,2} & a_{2,1}b_{1,3} + a_{2,2}b_{2,3} \\
   a_{3,1}b_{1,1} + a_{3,2}b_{2,1} & a_{3,1}b_{1,2} + a_{3,2}b_{2,2} & a_{3,1}b_{1,3} + a_{3,2}b_{2,3} \\
   a_{4,1}b_{1,1} + a_{4,2}b_{2,1} & a_{4,1}b_{1,2} + a_{4,2}b_{2,2} & a_{4,1}b_{1,3} + a_{4,2}b_{2,3}
\end{bmatrix} \cdot \begin{bmatrix}
   z_1 \\
   z_2 \\
   z_3
\end{bmatrix}

Representing these three equations symbolically and more concisely as
\begin{align}
\mathbf{x} & = \mathbf{Ay} \\
\mathbf{y} & = \mathbf{Bz} \\
\mathbf{x} & = \mathbf{Cz}
\end{align}

inspection of the entries of matrix C reveals that \mathbf{C} = \mathbf{AB}.

This can be used to formulate a more abstract definition of matrix multiplication, given the special case of matrix–vector multiplication: the product AB of matrices A and B is the matrix C such that Cz = A(Bz) for all vectors z of the appropriate shape.
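This characterisation is easy to verify numerically with the dimensions used above (4, 2, 3). A minimal sketch assuming NumPy:

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((4, 2))   # linear map Y -> X
    B = rng.standard_normal((2, 3))   # linear map Z -> Y
    z = rng.standard_normal(3)

    C = A @ B                          # composed map Z -> X
    print(np.allclose(C @ z, A @ (B @ z)))  # True: (AB)z = A(Bz)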

Scalar multiplication

The scalar multiplication of a matrix A = (a_{ij}) and a scalar r gives a product rA of the same size as A. The entries of rA are given by

(r\mathbf{A})_{ij} = r \cdot a_{ij}.

For example, if

\mathbf{A} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}

then

7\mathbf{A} = \begin{bmatrix} 7 \cdot 1 & 7 \cdot 2 \\ 7 \cdot 3 & 7 \cdot 4 \end{bmatrix} = \begin{bmatrix} 7 & 14 \\ 21 & 28 \end{bmatrix}.

If we are concerned with matrices over a ring, then the above multiplication is sometimes called the left multiplication while the right multiplication is defined to be

(\mathbf{A}r)_{ij} = a_{ij} \cdot r.

When the underlying ring is commutative, for example, the real or complex number field, the two multiplications are the same. However, if the ring is not commutative, such as the quaternions, they may be different. For example

i\begin{bmatrix}
   i & 0 \\
   0 & j
\end{bmatrix} = \begin{bmatrix}
   -1 & 0 \\
    0 & k
\end{bmatrix} \ne \begin{bmatrix}
   -1 & 0 \\
    0 & -k
\end{bmatrix} = \begin{bmatrix}
   i & 0 \\
   0 & j
\end{bmatrix}i.

Hadamard product

For two matrices of the same dimensions, we have the Hadamard product, also known as the entrywise product and the Schur product. It can be generalized to hold not only for matrices but also for operators. The Hadamard product of two m-by-n matrices A and B, denoted A • B, is an m-by-n matrix given by (A • B)_{ij} = a_{ij} b_{ij}. For instance

\begin{bmatrix}
   1 & 2 \\
   3 & 1
\end{bmatrix} \bullet \begin{bmatrix}
   0 & 3 \\
   2 & 1
\end{bmatrix} = \begin{bmatrix}
   1 \cdot 0 & 2 \cdot 3 \\
   3 \cdot 2 & 1 \cdot 1
\end{bmatrix}
= \begin{bmatrix}
   0 & 6 \\
   6 & 1
\end{bmatrix}.

Note that the Hadamard product is a submatrix of the Kronecker product (see below).

The Hadamard product is commutative.

The Hadamard product is studied by matrix theorists, and it appears in lossy compression algorithms such as JPEG, but it is virtually untouched by linear algebraists. It is discussed in (Horn & Johnson, 1994, Ch. 5).
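In NumPy, for instance, the * operator on arrays is already entrywise, so the Hadamard product needs no special routine; a small sketch reproducing the example above:

    import numpy as np

    A = np.array([[1, 2],
                  [3, 1]])
    B = np.array([[0, 3],
                  [2, 1]])

    print(A * B)   # entrywise (Hadamard) product:
                   # [[0 6]
                   #  [6 1]]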

Kronecker product

Main article: Kronecker product.

For any two arbitrary matrices A and B, we have the direct product or Kronecker product A ⊗ B defined as

\begin{bmatrix}
   a_{11}B & a_{12}B & \cdots & a_{1n}B \\
   \vdots  & \vdots  & \ddots & \vdots  \\
   a_{m1}B & a_{m2}B & \cdots & a_{mn}B
\end{bmatrix}.

Note that if A is m-by-n and B is p-by-r then A ⊗ B is an mp-by-nr matrix. Again this multiplication is not commutative.

For example

\begin{bmatrix}
   1 & 2 \\
   3 & 1
\end{bmatrix} \otimes \begin{bmatrix}
   0 & 3 \\
   2 & 1
\end{bmatrix} = \begin{bmatrix}
   1 \cdot 0 & 1 \cdot 3 & 2 \cdot 0 & 2 \cdot 3 \\
   1 \cdot 2 & 1 \cdot 1 & 2 \cdot 2 & 2 \cdot 1 \\
   3 \cdot 0 & 3 \cdot 3 & 1 \cdot 0 & 1 \cdot 3 \\
   3 \cdot 2 & 3 \cdot 1 & 1 \cdot 2 & 1 \cdot 1
\end{bmatrix}
= \begin{bmatrix}
   0 & 3 & 0 & 6 \\
   2 & 1 & 4 & 2 \\
   0 & 9 & 0 & 3 \\
   6 & 3 & 2 & 1
\end{bmatrix}.

If A and B represent linear transformations V1 → W1 and V2 → W2, respectively, then A ⊗ B represents the tensor product of the two maps, V1 ⊗ V2 → W1 ⊗ W2.
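NumPy provides this product directly as np.kron; a small sketch reproducing the example above:

    import numpy as np

    A = np.array([[1, 2],
                  [3, 1]])
    B = np.array([[0, 3],
                  [2, 1]])

    print(np.kron(A, B))
    # [[0 3 0 6]
    #  [2 1 4 2]
    #  [0 9 0 3]
    #  [6 3 2 1]]
    print(np.kron(A, B).shape)  # (4, 4): an (mp)-by-(nr) matrix with m = n = p = r = 2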

Common properties

All three notions of matrix multiplication are associative:

\mathbf{A}(\mathbf{BC}) = (\mathbf{AB})\mathbf{C}

and distributive:

\mathbf{A}(\mathbf{B} + \mathbf{C}) = \mathbf{AB} + \mathbf{AC}

and

(\mathbf{A} + \mathbf{B})\mathbf{C} = \mathbf{AC} + \mathbf{BC}.

and compatible with scalar multiplication:

c(\mathbf{AB}) = (c\mathbf{A})\mathbf{B}
(\mathbf{A}c)\mathbf{B} = \mathbf{A}(c\mathbf{B})
(\mathbf{AB})c = \mathbf{A}(\mathbf{B}c)

Note that these three pairs of expressions are all equal to each other only when multiplication and addition on the scalars are commutative, i.e. when the scalars form a commutative ring. See Scalar multiplication above for a counter-example with the quaternions as scalars.

Frobenius inner product

The Frobenius inner product, sometimes denoted A : B, is the component-wise inner product of two matrices regarded as vectors. In other words, it is the sum of the entries of their Hadamard product; that is,
\mathbf{A} : \mathbf{B} = \sum_i \sum_j A_{ij} B_{ij} = \operatorname{trace}(\mathbf{A}^T \mathbf{B}) = \operatorname{trace}(\mathbf{A} \mathbf{B}^T).
This inner product induces the Frobenius norm.
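A minimal numerical check of these identities, assuming NumPy:

    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.standard_normal((3, 4))
    B = rng.standard_normal((3, 4))

    frob = np.sum(A * B)                          # sum of the Hadamard product
    print(np.isclose(frob, np.trace(A.T @ B)))    # True
    print(np.isclose(frob, np.trace(A @ B.T)))    # True
    print(np.isclose(np.sqrt(np.sum(A * A)),
                     np.linalg.norm(A, 'fro')))   # True: the induced Frobenius norm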


References

  • Cohn, Henry; Kleinberg, Robert; Szegedy, Balazs; Umans, Chris. Group-theoretic Algorithms for Matrix Multiplication. Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science, 23–25 October 2005, Pittsburgh, PA, IEEE Computer Society, pp. 379–388.
  • Cohn, Henry; Umans, Chris. A Group-theoretic Approach to Fast Matrix Multiplication. Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science, 11–14 October 2003, Cambridge, MA, IEEE Computer Society, pp. 438–449.
  • Coppersmith, D.; Winograd, S. Matrix multiplication via arithmetic progressions. J. Symbolic Comput. 9, pp. 251–280, 1990.
  • Horn, Roger; Johnson, Charles. Topics in Matrix Analysis. Cambridge University Press, 1994.
  • Knuth, D. E. The Art of Computer Programming, Volume 2: Seminumerical Algorithms. Third edition, 1998, p. 501.
  • Raz, R. On the complexity of matrix product. In Proceedings of the Thirty-Fourth Annual ACM Symposium on Theory of Computing. ACM Press, 2002.
  • Robinson, Sara. Toward an Optimal Algorithm for Matrix Multiplication. SIAM News 38(9), November 2005.
  • Strassen, Volker. Gaussian Elimination is not Optimal. Numer. Math. 13, pp. 354–356, 1969.
