A Gauche perspective on row reduced echelon form and its uniqueness
arXiv preprint [math.GM]
ERIC L. GRINBERG
Abstract. Using a left-to-right "sweeping" algorithm, we define the Gauche basis for the column space of a matrix M. Interpreting the row reduced echelon form (RREF) of M by Gauche means gives a direct proof of its uniqueness. A corollary shows that the (right) null space of M determines its row space and unmasks a sanitized version of the assertion "if two systems are solution-equivalent they are row-equivalent". We conclude with pedagogical reflections.

1. Introduction.
The Row Reduced Echelon Form of a matrix M, RREF(M), is a useful tool when working with linear systems [2, 9]; its uniqueness is an important property. A survey of papers and textbooks yields a variety of uniqueness proofs. Some are simpler [11] and shorter than others. Generally, proofs begin with two candidates for RREF(M) and conclude that these are equal. It is deemed desirable to have a direct proof, one that simply identifies every atom and molecule of RREF(M) in terms of properties of M and standard conventions. We use the Gauche basis of the column space of M to give such a proof, taking the opportunity to view RREF from a shifted perspective. This context makes it convenient to observe that the (right) null space of M determines its row space, and yields a near-converse of the familiar assertion "if two systems are row-equivalent then they are solution-equivalent". In conclusion we offer some reflections on teaching.

2. Conventions and Notations.
We will work mostly in the vector space F^p, consisting of p × 1 column vectors with entries in a field F, and sometimes denote these as transposed row vectors, e.g., (x_1 . . . x_p)^t. We'll adhere to the ordering conventions of left to right and up to down. Thus the first column of a matrix is the leftmost, and the first entry of a column is its top entry. Recall the notation for the "canonical" or "standard" basis of F^p: {~e_j}, where ~e_j stands for the p × 1 vector (0 . . . 0 1 0 . . . 0)^t with zeros throughout, except for a 1 in the j-th entry. Recall also that the span of a set S of vectors in F^p is the collection of all linear combinations of these vectors. Thus the span of the singleton set {~v} consists of the set of all scalar multiples of ~v, i.e., a line in F^p, unless ~v = ~0, in which case the span of {~v} is {~0}. We also have the degenerate case where S is the empty set; by convention, the span of the empty set is {~0}.

3. The Remembrance of Row Reduced Echelon Form (RREF).
Key words and phrases. Row Reduced Echelon Form, Uniqueness, Gauche basis.

Given a matrix M, viewed as the coefficient portion of a linear system M~x = ~b, we can apply row operations to M, or to the augmented matrix (M | ~b), and corresponding equation operations on the system M~x = ~b, to yield a simpler system that is solution-equivalent to the original. These operations include scaling a row by a non-zero scalar, interchanging two rows, and subtracting a scalar multiple of one row from another row. This last operation is the most commonly used, and is sometimes called a workhorse row operation.

Starting with a matrix M and applying carefully chosen row operations, one can obtain a matrix E with, arguably, the "best possible" form among all matrices row equivalent to M. This is the Row Reduced Echelon Form of M, or RREF(M), or just RREF. We use the definite article the because this form turns out to be unique, as we'll see.

A matrix E is in RREF if it satisfies the following conditions.
• Pivots: Sweeping each row of E from the left, the first nonzero scalar encountered, if any, is a 1. We call this entry, along with its column, a pivot.
• Pivot Column Insecurity: In a pivot column, the scalar 1 encountered in the row sweep is the only non-zero entry in its column.
• Downright Conventional: If a pivot scalar 1 is to the right of another, it is also lower down.
• Bottom Zeros: Rows consisting entirely of zeros, if they appear, are at the bottom of the matrix.
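The four conditions above can be checked mechanically. Here is a minimal sketch in Python; the function name `is_rref` and the sample matrices are our own illustration, not part of the paper.

```python
def is_rref(E):
    """Check the four RREF conditions on a matrix given as a list of rows."""
    pivot_cols = []
    last_pivot_col = -1
    seen_zero_row = False
    for row in E:
        nonzero = [j for j, x in enumerate(row) if x != 0]
        if not nonzero:
            seen_zero_row = True            # Bottom Zeros: only zero rows may follow
            continue
        if seen_zero_row:
            return False                    # a nonzero row sits below a zero row
        j = nonzero[0]
        if row[j] != 1:
            return False                    # Pivots: first nonzero entry must be 1
        if j <= last_pivot_col:
            return False                    # Downright: later pivots lie strictly to the right
        last_pivot_col = j
        pivot_cols.append(j)
    # Pivot Column Insecurity: a pivot is the only nonzero entry in its column
    return all(sum(1 for row in E if row[j] != 0) == 1 for j in pivot_cols)

print(is_rref([[1, 0, 3, -2, 0],
               [0, 1, 1, -3, 0],
               [0, 0, 0, 0, 1]]))          # True
print(is_rref([[1, 2], [0, 1]]))           # False: second pivot column has two nonzero entries
```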
The label Pivot Column Insecurity requires explanation. We think of the pivot scalar 1's as insecure: they don't want competition from other nonzero entries along their column. Sorry, pivots: row insecurity cannot be accommodated.

4. A Gauche Basis for a Matrix with a Fifth Column.
For the purpose of introduction and illustration we'll begin with a specific matrix [3, Example SAE]:

T ≡
[  2   1   7  −7   2 ]
[ −3   4  −5  −6   3 ]
[  1   1   4  −5   2 ].

We will "sweep" the columns of T from left to right, and designate each column as a keeper or as subordinate. These are meant to be value-neutral terms, not value judgements, and we hope that no vectors will take offense. For each column we ask:

Can we present this column as a linear combination of keeper columns to its left?

We will call this the Left-Leaning Question, or LLQ for short. Columns for which the answer is no will be designated as keepers and the rest as subordinates. When focusing on the first column of T, we recall the convention that a linear combination of the empty set is, in the context of a vector space V, the zero vector of V. Thus the LLQ for the first column of T is tantamount to asking: Is this vector non-zero?
For T the answer is yes. Therefore, we adorn column one with the adjective keeper. In the aim of responsible accounting, we "journal" our action with the vector ~J_1 ≡ ~e_1. (Recall that in our context ~e_1 is the 3 × 1 vector (1, 0, 0)^t.) We move on to the second column of T and pose the LLQ, which, in the current context, asks:

Is this column a scalar multiple of column one?

The answer is no, so column two is a keeper, and we journal it with ~J_2 ≡ ~e_2. The LLQ for the third column asks if this column is a linear combination of the first two (keeper) columns of T; by inspection, column three is presentable as a linear combination of columns one and two, with scalings 3 and 1. Hence column three is subordinate and we journal our action with the vector ~J_3 ≡ 3~e_1 + 1~e_2, which encodes the manifestation of this vector as a linear combination of keeper columns to its left:

(7, −5, 4)^t = 3 · (2, −3, 1)^t + 1 · (1, 4, 1)^t ;   ~J_3 ≡ (3, 1, 0)^t.

Similarly, the fourth column of T is subordinate, and journaled with ~J_4 ≡ (−2) ~e_1 + (−3) ~e_2. The fifth and final column vector of T is not presentable as a linear combo of previous keepers. The reader is invited to prove this or, alternatively, perform a half-turn on the solution box below.

(Take a times the first column of T and add it to b times the second column, and look at the top and bottom entries. To produce the fifth column of T, we need 2a + b = 2 and also a + b = 2. This implies that a = 0, and then we run into trouble with the middle entries of our vectors.)

We declare the fifth column a keeper (at our peril), and journal it with ~J_5 ≡ ~e_3. Now form a 3 × 5 matrix J whose columns are the journal vectors, J ≡ ( ~J_1 ~J_2 · · · ~J_5 ), or

J ≡
[ 1  0  3  −2  0 ]
[ 0  1  1  −3  0 ]
[ 0  0  0   0  1 ].    (1)

This turns out to be the RREF of T, perhaps surprisingly. For an independent verification, using Gauss-Jordan elimination on the same matrix T, see Example SAE in [3]. Notice that our procedure does not show that (1) is row-equivalent to T, whereas the Gauss-Jordan algorithm, e.g., as in [3], does. It's not difficult to show directly, in this context, that the Gauche procedure yields a matrix that is row-equivalent to the original. In case anyone insists, we will prove this later on; the approach is entirely Gauss-Jordan-esque.

5. Beyond the Fifth Column: a General Gauche Algorithm.
Here we detail the procedure for generating the Gauche basis for an arbitrary matrix and use it to produce the corresponding RREF. For student readers, we suggest following the ideas of John H. Hubbard and Bill Thurston in How To Read Mathematics [6]: jump to the illustrative concrete example above whenever a point in the general procedure below appears nebulous.

Let M be a p × q matrix (over a fixed field F). We outline a general algorithm that transforms M into row reduced echelon form without invoking row reduction. This manifests, among other things, the uniqueness of the row reduced echelon form. Sweeping the columns of M from left to right, we will adorn some of the columns with the title of keeper. Initially, the set of keepers is empty. Going from left to right, we take a column of M and ask the LLQ. For the first column of M this is tantamount to asking:

Is this column nonzero?

If so, we declare it a keeper and journal our action with the vector ~J_1 ≡ ~e_1 ∈ F^p. If the first column is zero, we do not adorn it with the title of keeper; we call it subordinate and we journal our action with the vector ~J_1 ≡ ~0 ∈ F^p. Inductively, we take the n-th column of M and ask the LLQ. If this column is not in the span of the current keeper set, we adorn this column with the keeper designation and journal our action with the vector ~J_n ≡ ~e_{ℓ+1}, where ℓ + 1 is the number of keepers adorned up to this step, current column included. If the current column is presentable as a linear combination of (already designated) keepers, say α_1 ~k_1 + . . . + α_ℓ ~k_ℓ, where the already designated keeper columns are {~k_i}, i = 1, . . . , ℓ, then we call the current column subordinate and journal our action with the vector ~J_n ≡ α_1 ~e_1 + . . . + α_ℓ ~e_ℓ, recalling that we are focusing on column n and we have ℓ keeper columns already designated. The careful (or fussy, or both) reader may object that the current column may be expressible as a linear combination of keepers in more than one way. However, induction readily shows that at each stage the keeper set is linearly independent.
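The sweep just described can be rendered as a short program. The following Python sketch is our own illustration (the names `gauche_rref` and `solve_in_span` are not from the paper); it answers each LLQ by Gaussian elimination against the keepers found so far, using exact rational arithmetic.

```python
from fractions import Fraction

def solve_in_span(basis, v):
    """Coefficients expressing v in the span of the linearly independent
    columns in `basis`, or None if v is not in their span."""
    if not basis:
        return [] if all(x == 0 for x in v) else None
    p, k = len(v), len(basis)
    A = [[Fraction(basis[j][i]) for j in range(k)] + [Fraction(v[i])]
         for i in range(p)]                       # augmented p x (k+1) system
    row = 0
    for col in range(k):                          # independence guarantees a pivot per column
        piv = next(r for r in range(row, p) if A[r][col] != 0)
        A[row], A[piv] = A[piv], A[row]
        A[row] = [x / A[row][col] for x in A[row]]
        for r in range(p):
            if r != row and A[r][col] != 0:
                A[r] = [a - A[r][col] * b for a, b in zip(A[r], A[row])]
        row += 1
    if any(A[r][k] != 0 for r in range(row, p)):
        return None                               # inconsistent: v is outside the span
    return [A[r][k] for r in range(k)]

def gauche_rref(M):
    """Sweep the columns of M left to right, journaling each answer to the
    LLQ; the journal vectors, assembled as columns, form the matrix E."""
    p = len(M)
    keepers, journal = [], []
    for c in zip(*M):                             # columns of M, left to right
        coeffs = solve_in_span(keepers, list(c))
        if coeffs is None:                        # LLQ answer "no": a keeper
            j = [Fraction(0)] * p
            j[len(keepers)] = Fraction(1)         # journal e_{ell+1}
            keepers.append(list(c))
        else:                                     # subordinate: journal the coefficients
            j = coeffs + [Fraction(0)] * (p - len(coeffs))
        journal.append(j)
    return [list(row) for row in zip(*journal)]   # journal vectors become columns
```

On the 3 × 5 example of the previous section this sweep reproduces the matrix (1) without performing a single row operation on M.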
At the end of this procedure we obtain a matrix E of the same size as M. We will call the algorithm above, transforming M into E, the Gauche procedure and the resulting basis for the column space of M the Gauche basis.

Lemma 1. The matrix E is in row reduced echelon form.

Proof. In this discussion we will sometimes tacitly identify columns of E with corresponding columns of M. We take row i of E and "sweep" it from the left. We encounter a first non-zero entry in only one circumstance: where we meet a pivot of E, i.e., a journaled vector ~J corresponding to a keeper column of M. (A nonzero entry in a subordinate column is always assigned only after a pivot 1 entry has already been assigned earlier in the same row.) In the Gauche algorithm, whenever we introduce a new journal vector ~J corresponding to a pivot, the scalar 1 appears in a lower slot than those of any prior keepers, and prior subordinate columns are linear combinations of prior keepers, so their entries are zero at this row altitude level as well.

What about the downright condition? If a pivot 1 is to the right of another, it is also lower down, as it gets adorned with the keeper designation at a later stage and is journaled as ~e_k with a larger value of k. When the pivot journaling stops, no further nonzero entries are journaled in rows lower than the row of the 1 entry in the last pivot column. Hence, in particular, all pure-zero rows are at the bottom of E. Thus we have verified that E is in RREF. □

6. Zeroing in on the null space.
The decisional steps of the Gauche procedure can be interpreted in terms of the null space of M. This, in turn, can give us Gauche-free information about the null space. In working with columns of M (and of E) we used {~e_i}, i = 1, . . . , p, the standard basis of F^p. Now null(M) is a subset of F^q and we'd like to work with the standard basis of this space as well. To avoid (read: reduce) confusion, we'll use the notation ~f_i for the q × 1 vector with a 1 in slot i and zeros elsewhere, so that {~f_i}, i = 1, . . . , q, is the standard basis of F^q.

With this notation we observe that the first column of M is M ~f_1, so asking if the first column of M is zero is tantamount to asking if the vector ~f_1 belongs to the null space of M, i.e., if M ~f_1 = ~0. Table 1 gives further illustration of this interplay.
7. RREF is unique.
Lemma 2. Let M be a p × q matrix over a field F and let E be a matrix in RREF which is row-equivalent to M. Let S ⊆ {1, . . . , q} be the index set corresponding to the pivot vectors among the columns of E. Then
• The columns of M corresponding to the index set S form the Gauche basis for the column space of M.
• Each non-pivot column ~c of E is a linear combination of the pivot columns to its left. This combination manifests the presentation of the corresponding column of M as a linear combination of Gauche basis vectors to its left. The "top" entries of ~c encode this (unique) linear combination, and the rest of the entries of ~c are "padded" zeros.

Note that if E has no pivots at all then M = E = 0, which is consistent with the vacuous interpretation of the statement of the lemma. For the matrix J in (1), which is an instance of E, the set S is {1, 2, 5}.

Proof. It is well known that if M and E are row equivalent then the associated homogeneous linear systems M~x = ~0 and E~x = ~0 have the same solution set, i.e., M and E have the same (right) null space. At the risk of slightly abusing language we state a heuristic principle:

Every linear property of the columns of M is also enjoyed by the columns of E, and conversely.

Table 1. (Column Property) ↔ (Null Space Property) Dictionary

Property of Columns | Property of Null Space
The first column of M is nonzero. | The vector ~f_1 is not in Null(M).
The k-th column of M is in the span of columns j_1, . . . , j_ℓ of M. | There exist α_1, . . . , α_ℓ so that α_1 ~f_{j_1} + . . . + α_ℓ ~f_{j_ℓ} − ~f_k ∈ Null(M).
Columns j_1, . . . , j_ℓ of M form a linearly independent set. | For scalars α_1, . . . , α_ℓ, the vector α_1 ~f_{j_1} + . . . + α_ℓ ~f_{j_ℓ} is in Null(M) ⇔ the scalars α_1, . . . , α_ℓ all vanish.

This assertion requires some reflection and interpretation. It is inspired, in part, by a deep principle in the analysis of meromorphic functions [12]. (See [8] for a heuristic principle in the context of linear algebra.) Table 1 provides illustrations of this heuristic for M, and we can do the same for E. (Although we captioned the table as a dictionary, we have taken liberties with the language inside; we hope that this is forgivable.)

Iterating the idea, we can express in this way the statement

Columns j_1, . . . , j_ℓ form the Gauche basis of the column space of ( · ).

and others like it. Indeed, in this way, all assertions in the statement of the lemma may be translated into assertions about solutions of the respective null spaces. Hence these are shared values [7] for E and M. □

Theorem 1.
Let M be a matrix. Then there is one and only one matrix E in RREF which is row equivalent to M.

Proof. The lemma above describes every entry of E in terms of left-down conventions and properties of M, without reference to any process for row reducing M to yield E, e.g., Gauss-Jordan elimination. This proves uniqueness. For existence, one can invoke the Gauss-Jordan algorithm, or prove directly (and, admittedly, with Gauss-Jordan-esque ideas) that E is row equivalent to M, as is done independently, below in Proposition 1. □

Corollary 1.
The null space of a matrix M determines the RREF and the row space of M. Hence if two matrices of the same size have the same null space, they are row equivalent.

Proof. The matrix M has a unique RREF and its Gauche construction uses only the null space of M. □

The relation between the null space and the row space of a matrix is mentioned repeatedly in [5], sometimes concretely in examples, sometimes in generality, but in passing. Clearly, this is a well-known fact, though not a sufficiently promulgated one.

8. The solution determines the problem.
In the television game Jeopardy! contestants are given answers and asked to guess the questions from whence they came. In calculus we introduce anti-derivatives as "differentiation Jeopardy". The following linear-algebraic Jeopardy variant may be considered:

If two linear systems have the same solution set then they are row equivalent.

Literally, as stated, this assertion is manifestly false. (We set it in small type; please do not invoke it out of context.) For suppose we have two inconsistent linear systems. They both have the empty set of solutions, hence the same set of solutions. But the two systems may not have the same number of equations. They may even involve different variables. Clearly, we need to focus on consistent linear systems of the same size. We will also tacitly assume that they involve the same unknowns.
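Behind the corollary that follows lies the decomposition of a consistent system's solution set as one particular solution added to the homogeneous solution space. A small numeric illustration (our own; the coefficient columns are borrowed from the fifth-column example, with a right-hand side chosen to make the system consistent):

```python
def matvec(M, x):
    """Matrix-vector product over the integers."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

M = [[2, 1, 7, -7],
     [-3, 4, -5, -6],
     [1, 1, 4, -5]]
b = [3, 1, 2]              # chosen as (col 1) + (col 2), so the system is consistent

x0 = [1, 1, 0, 0]          # a particular solution: M x0 = b
h = [3, 1, -1, 0]          # a homogeneous solution: M h = 0

print(matvec(M, x0))                              # [3, 1, 2]
print(matvec(M, h))                               # [0, 0, 0]
print(matvec(M, [a + c for a, c in zip(x0, h)]))  # [3, 1, 2]  (still a solution)
```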
Corollary 2. If two consistent linear systems of the same size are solution-equivalent then they are row equivalent.

Proof. First assume that the systems are homogeneous. Then the hypothesis says that the corresponding matrices have the same null space. Hence, by the previous corollary, they have the same RREF and are thereby row-equivalent. In the general case, simply note that the solution set of a (possibly) inhomogeneous linear system consists of one particular solution added to the solution space of the associated homogeneous system. □

9. An Existential Question.
The Gauche procedure takes a matrix M and associates with it a matrix E which is in RREF. But how do we know that there exists a sequence of row operations taking M to E, i.e., why is E row equivalent to M? We can invoke the Gauss-Jordan elimination algorithm, which yields a matrix in RREF that is row-equivalent to M, and then cite uniqueness considerations to conclude that our Gauche E must be that matrix. But this is unsatisfying: we should be able to show directly that the Gauche-produced matrix E is row-equivalent to M and, if one insists, we can.

Proposition 1.
For a matrix M, the Gauche-produced RREF matrix E ≡ E(M) is row equivalent to M.

Proof. We can take M and row reduce it to yield the Gauche-produced matrix E, following the stages sketched schematically below: the rows are permuted and scaled so that the first non-zero column begins with a 1, workhorse operations then clear the rest of that column, and the process repeats with later pivot columns:

[ 0 · · · 0 ∗ ∗ · · · ∗ ]      [ 0 · · · 0 1 ∗ · · · ∗ ]      [ 0 · · · 0 1 ∗ · · · ∗ ]
[ 0 · · · 0 ∗ ∗ · · · ∗ ]  →  [ 0 · · · 0 ∗ ∗ · · · ∗ ]  →  [ 0 · · · 0 0 ∗ · · · ∗ ]  →  · · ·
[ 0 · · · 0 ∗ ∗ · · · ∗ ]      [ 0 · · · 0 ∗ ∗ · · · ∗ ]      [ 0 · · · 0 0 ∗ · · · ∗ ]

If M is the zero matrix then E = M and we are done. Otherwise, E has a first pivot column, which corresponds to the first non-zero column of M, say column j_1. Taking M and permuting rows we obtain a matrix whose first non-zero column is number j_1, and which has a non-zero entry in the first slot; after scaling the first row we can assume that this entry is 1. Subtracting scalar multiples of the first row from each of the other rows, i.e., employing workhorse row operations, we obtain a matrix whose first non-zero column is the j_1-st, with entries equal to those of ~e_1. If E has no other pivot columns, all later columns are scalar multiples of the j_1-st, and we are done. If E has a second pivot column, say in slot j_2, then this column must have a non-zero entry below the first pivot. Permuting rows other than the first and then applying workhorse-type operations and rescaling the top non-zero entry in this column we obtain ~e_2 in the j_2-nd slot while retaining ~e_1 in the first slot. Continuing this way we produce row operations that place appropriate canonical vectors of the form ~e_ℓ in each of the pivot slots. Each of the non-pivot columns is a linear combination of the pivot columns to its left, and requires no additional "processing" by row operations. Thus we have manifested E ≡ E(M) as the result of a sequence of row operations applied to M. □

10. Reflections on Teaching.
The method of elimination via row reduction may be introduced at the very start of a course on linear algebra. Taking the Gauche approach to echelon form, we are led naturally, directly and concretely to the notions of linear combination, span, and linear independence. Definition and application are threaded: no need for a separate introduction with rationale for use. This brings to mind a parallel in a Math Proof course. Every such course covers Euclid's proof of the infinitude of primes, and rightly so. But we can also add H. Furstenberg's "topological" proof [4, 10]. Furstenberg's proof leads directly to the basic set operations of intersection, union and complement. Here too, definition and application are threaded and allied; motivation is built in.

We conclude with a question: Is there a book proof (see [1]) of the uniqueness of RREF? Is the fact worthy of inclusion in The Book?

11. Acknowledgement
The Gauche idea emerged from a conversation with Professor Gilbert Strang in the fall of 2019 at MIT's Endicott House. The author is grateful to Professor Strang for the conversation and for his inspiring writings through the years. He is also grateful to MIT for the invitation to the Endicott House event, and he would happily repeat the experience.
References

[1] Aigner, M., Ziegler, G. (2018). Proofs from The Book. Springer.
[2] Amer. Math. Monthly.
[3] Beezer, R. A. A First Course in Linear Algebra, online open-source edition 3.50. linear.ups.edu.
[4] Furstenberg, H. (1955). On the infinitude of primes. Amer. Math. Monthly 62: p. 353.
[5] Hoffman, K., Kunze, R. (1971). Linear Algebra, 2nd ed. Englewood Cliffs, NJ: Prentice-Hall.
[6] Hubbard, J. (2002). Reading Mathematics, in Hubbard, J. H. and Burke Hubbard, B., Vector Calculus, Linear Algebra, and Differential Forms: A Unified Approach, 2nd ed. Englewood Cliffs, NJ: Prentice Hall; chapter 0.1: Reading Mathematics. http://pi.math.cornell.edu/~hubbard/readingmath.pdf
[7] Pang, X. and Zalcman, L. (2000). Normal families and shared values. Bull. London Math. Soc.
[8] J. Funct. Anal.
[9] Notices Am. Math. Soc.
[10] Amer. Math. Monthly.
[11] Math. Mag.
[12] Amer. Math. Monthly.
Department of Mathematics, UMass Boston, Boston, MA 02125, USA
E-mail address: