This note appeared in Mathematics Magazine , vol. 69, no.
4, October 1996, 261–266.
Introduction Something happened to me recently I
would wager has happened to many who read
this note. Teaching a new topic, you cannot understand one of the proofs . Your
first attempt to
fill the gap fails. You look through your books for an answer. Next, you ask
colleagues, go to the
library, maybe even use the interlibrary loan. All in vain. Then it strikes you
that, in fact, you
cannot answer an even more basic and seemingly more interesting question. You
peruse the books
again. They seem to have answers to thousands of strange questions, but not to
yours (the most
natural one→). At the same time you cannot believe that your question could have
been overlooked
by generations of mathematicians. Days pass; the agony continues.
Then one day, some way or other, you find the answer. In my case the answer was
in a book I
already owned. It followed from a theorem I had known for a long time, but I had
never thought of
this particular application. I must admit, indeed, that this theorem appeared in
almost every book
I had checked, but never with a pointer to this particular application, even as
an exercise. Were the
authors unaware of the application? Or did it seem to obvious to mention? In any
case, here is the
story.
In my graduate combinatorics course, a proof of the existence of a de sign was
based on the
following question: Given a system of linear equations Ax = b, where
is an m×n matrix
with integer entries, and b is an m × 1 column vector with integer components,
does the system
have an integer solution, i.e. an n × 1 solution vector x with integer
components? The suggested
method ([8], Th. 15.6.5) makes use of “a well–known theorem of van der Waerden”:
THEOREM (Van der Waerden). An integer solution of the system exists if and only
if, for every
row vector v with rational components such that vA has integer components, vb is
an integer.
I had never seen this theorem, and I was surprised that such a criterion could
be useful (which it
was!). In trying to prove the theorem, I realized that I did not know any good
method for resolving
a more basic question:
How can one tell whether a system of linear diophantine equations has a
solution ? If solutions exist how can one find any or all of them? (*)
I could not find this question in any of at least 30 modern texts on abstract
algebra or number
theory. The place I found it at last was the classical text of van der Waerden
[14, Exercise 12.3].
Not for the first time this book contained an answer that I could not find in
more recent sources —
why hadn’t I started with it? (Interestingly, the book contains very few
exercises, but this one was
there.)
The theory behind the solution is closely related to the
famous structure theorem for finitely
generated abelian groups, or, more generally, for finitely generated modules
over principal ideal
domains. Various proofs can be found in many books on abstract algebra, e.g.,
see [7]. We present
a matrix version of the theorem. Let Z denote the ring of integers,
,
the ring
of all integer m × n matrices,
the set of all square k × k matrices with integer entries and
determinant 1 or −1 (unimodular matrices). By
we denote
the diagonal matrix that has an integer di in the (i, i) entry, i = 1, . . . ,m,
and zeros elsewhere . Then
we have:
THEOREM 1. Let
. There exist
and
such that

where di > 0, i = 1, . . . , s, and
.
A proof can be found, e.g., in [14] or [7]. The idea is to use elementary
operations of rows and
columns of A. Matrices L and R correspond to compositions of these operations.
Though matrices
L and R in Theorem 1 may vary, the matrix D is uniquely defined by A and it is
called the Smith
normal form of A.
Let us note immediately that Theorem 1 can be used to answer question (*). Given
Ax = b,
rewrite it as Dy = c with Ry = x, LAR = D and c = Lb. But the solution to the
diagonal system
Dy = c is easy. More details and a numerical example are given in the
Applications section of this
paper.
The question of finding an efficient algorithm for computing the Smith normal
form of an integer
matrix is not trivial. The direct application of elementary operations of rows
and columns does not
lead to a polynomial-time algorithm: the integers get too large. For more
details, see [11] and [3].
Some history Theorem 1 has an interesting history. Question (*) seems not to
have been asked,
in full generality, until the mid–19th century. Its particular cases appeared in
1849 -1850 in some
number–theoretical studies of Hermite [9, p.164, p.265]. In 1858 Heger [10]
formulated conditions for
the solvability of Ax = b in the case where A has full rank (i.e., m) over Z. In
1861, the problem was
solved in full generality by H.J.S. Smith [12]. Theorem 1 appeared in a form
close to the one above
in an 1868 treatise by Frobenius [5] who generalized Heger’s theorem [5,
p.171-173] and emphasized
the unimodularity of the transformations [5, p.194-196].
By then many important results on abelian groups had been discovered. Introduced
by Gauss,
the concept of an abelian group was developed both in number–theoretical studies
of Gauss, Schering,
Kronecker, and Dirichlet, and in the studies of elliptic functions and abelian
integrals of Gauss,
Abel, and Jacobi. Not until 1879 did Frobenius and Stickelberger [6] discover
and use explicitly
the connection between the theory of finitely generated abelian groups and
Smith’s theorem. In
the same year, Frobenius showed that Smith’s theory (extended to matrices over
polynomial rings )
could be used to classify square matrices over fields, up to similarity. (For
further history, see [4] and
the Historical Notes in [1].) The story reminds us, in particular, that many
basic notions and facts
of linear algebra (including module theory) were developed within the context of
number theory .
Applications Our first application is related to question
(*). It also contains a proof of the
aforementioned theorem of van der Waerden. Let Q denote the field of rational
numbers.
PROPOSITION 2. Let A, L, R, D be as in Theorem 1, ∈ Zn and c = Lb. Then the
following
four statements are equivalent:
(1) The system of linear equations Ax = b has an integer solution
(2) The system of linear equations Dy = c has an integer solution
(3) For every rational vector u such that uA is an integer vector, the number ub
is an integer
(4) For every rational vector v such that vD is an integer vector, the number vc
is an integer.
Proof.
: Indeed, Ax = b
(L-1DR-1)x = b
D(R-1x) = c
Dy =
c, where y = R-1x. Since R ∈ SLm(Z), then R-1 ∈ SLm(Z). Therefore x ∈
Zn
y =
R-1x ∈
Zn.
: Indeed, vD ∈ Zn
v(LAR) ∈
Zn
(vL)AR ∈
Zn
(vL)A ∈ ZnR-1 =
Zn
uA ∈ Zn, where u = vL.
Since L ∈ SLn(Z), then u ∈ Qm
v ∈ Qm, and, by
(3),
ub ∈ Z. But ub ∈ Z
(vL)(L-1c) ∈
Z
vc ∈ Z. Therefore (3) implies (4).
Reversing the
order of the argument, we get uA ∈Zn
vD ∈
Zn and vc∈ Z
ub ∈
Z. Therefore
(4)
implies (3).
: Dy = c implies v(Dy) = vc for every v ∈ Qm, hence (vD)y = vc. If vD ∈
Zn,
then vc ∈Z. Thus (2) implies (4). In order to prove that (4) implies (2), first
we observe that
. For suppose
, j > s. Consider
where
appears in the j–th position. Since vD = 0 ∈
Zn, then by (4) vc = 1/2 ∈
Z, and we
arrive at a contradiction. Thus
for j > s. Next, for i = 1, . . . , s, we
consider vectors
. Since
n, then by (4),
and hence
. Let
, where
. Then y∈Zn, and Dy = c.
With notations as in Proposition 2, one can reduce the solution of the system Ax
= b to a
solution of Dy = c by performing elementary transformations (over Z) of rows and
columns of
matrix A augmented by vector b . Matrices L and R can be constructed by
multiplying matrices
corresponding to these transformations. System Dy = c has a solution if and only
if 
, and
for i = 1, . . . , s. A general solution of Dy = c
can be given in the
form
, where
,
and
are free integer
parameters. Then the general solution of Ax = b is just Ry. Clearly, we may
assume that each
equation is reduced by the greatest common divisor of the coefficients of the
variables .