Lüroth’s theorem (Lüroth 1876 for \(k = \mathbb{C}\), Steinitz 1910 in general). If \(k \subseteq K\) are fields such that \(k \subseteq K \subseteq k(x)\), where \(x\) is an indeterminate over \(k\), then \(K = k(g)\) for some rational function \(g\) of \(x\) over \(k\).
I am going to present a “constructive” proof of Lüroth’s theorem due to Netto (1895) that I learned from Schinzel’s Selected Topics on Polynomials (and give some applications to criteria for proper polynomial parametrizations). The proof uses the following result which I am not going to prove here:
Proposition (with the set up of Lüroth’s theorem). \(K\) is finitely generated over \(k\), i.e. there are finitely many rational functions \(g_1, \ldots, g_s \in k(x)\) such that \(K = k(g_1, \ldots, g_s)\).
The proof is constructive in the following sense: given \(g_1, \ldots, g_s\) as in the proposition, it gives an algorithm to determine \(g\) such that \(K = k(g)\). We use the following notation in the proof: given a rational function \(h \in k(x)\), if \(h = h_1/h_2\) with polynomials \(h_1, h_2 \in k[x]\) with \(\gcd(h_1, h_2) = 1\), then we define \(\deg_\max(h) := \max\{\deg(h_1), \deg(h_2)\}\).
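To make the notation concrete, here is a minimal sketch of \(\deg_\max\) for \(k = \mathbb{Q}\), using only Python's standard library. The helper names (`trim`, `poly_divmod`, `poly_gcd`, `deg_max`) and the convention of storing a polynomial as a coefficient list with `p[i]` the coefficient of \(x^i\) are assumptions made for this illustration.

```python
from fractions import Fraction

def trim(p):
    """Copy p as Fractions and drop trailing zeros; p[i] is the coefficient of x**i."""
    p = [Fraction(c) for c in p]
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_divmod(a, b):
    """Quotient and remainder of a by b in Q[x]; b must be nonzero."""
    a, b = trim(a), trim(b)
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 0)
    while len(a) >= len(b):
        shift = len(a) - len(b)
        c = a[-1] / b[-1]
        q[shift] = c
        for i, bc in enumerate(b):
            a[i + shift] -= c * bc
        a = trim(a)
    return q, a

def poly_gcd(a, b):
    """Monic gcd in Q[x] via the Euclidean algorithm (a, b not both zero)."""
    a, b = trim(a), trim(b)
    while b:
        a, b = b, poly_divmod(a, b)[1]
    return [c / a[-1] for c in a]

def deg_max(num, den):
    """deg_max of num/den: cancel the gcd, then take the larger degree."""
    g = poly_gcd(num, den)
    h1 = trim(poly_divmod(num, g)[0])
    h2 = trim(poly_divmod(den, g)[0])
    return max(len(h1) - 1, len(h2) - 1)

# (x^2 - 1)/(x^2 + x) reduces to (x - 1)/x, so deg_max is 1, not 2.
example = deg_max([-1, 0, 1], [0, 1, 1])
```

The point of the example is that \(\deg_\max\) is defined only after cancelling common factors, which is why the sketch divides by the gcd first.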
Proof of Lüroth’s theorem
It suffices to consider the case that \(K \neq k\). Pick \(g_1, \ldots, g_s\) as in the proposition. Write \(g_i = F_i/G_i\), where
- \(\gcd(F_i, G_i) = 1\) (Property 1).
Without loss of generality (i.e. discarding \(g_i \in k\) or replacing \(g_i\) by \(1/(g_i + a_i)\) for appropriate \(a_i \in k\) if necessary) we can also ensure that
- \(\deg(F_i) > 0\) and \(\deg(F_i) > \deg(G_i)\) (Property 2).
Consider the polynomials \[H_i := F_i(t) - g_iG_i(t) \in K[t] \subset k(x)[t], \quad i = 1, \ldots, s,\] where \(t\) is a new indeterminate. Let \(H\) be the greatest common divisor of \(H_1, \ldots, H_s\) in \(k(x)[t]\), normalized to be monic in \(t\). Since the Euclidean algorithm for computing \(\gcd\) respects the field of definition, it follows that:
- \(H\) is also the greatest common divisor of \(H_1, \ldots, H_s\) in \(K[t]\), which means, if \(H = \sum_j h_j t^j\), then each \(h_j \in K\) (Property 3).
Let \(H^* \in k[x,t]\) be the polynomial obtained by “clearing the denominator” of \(H\); in other words, \(H = H^*/h(x)\) for some polynomial \(h \in k[x]\) and \(H^*\) is primitive as a polynomial in \(t\) (i.e. the greatest common divisor in \(k[x]\) of the coefficients in \(H^*\) of powers of \(t\) is 1). By Gauss’s lemma, \(H^*\) divides \(H^*_i := F_i(t)G_i(x) - F_i(x)G_i(t)\) in \(k[x,t]\), i.e. there is \(Q_i \in k[x,t]\) such that \(H^*_i = H^* Q_i\).
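The polynomial \(F_i(t)G_i(x) - F_i(x)G_i(t)\) can be written out mechanically. The sketch below, a hypothetical illustration with integer coefficients and made-up helper names `h_star` and `mul`, stores a polynomial in \(k[x,t]\) as a dictionary mapping `(i, j)` to the coefficient of \(t^i x^j\), and checks the factorization \((t - x)(tx - 1)\) for the sample function \(g = (x^2 + 1)/x\).

```python
def h_star(F, G):
    """Coefficients {(i, j): c} of F(t)G(x) - F(x)G(t), with c the coefficient of t**i * x**j.

    F and G are coefficient lists, F[i] being the coefficient of the i-th power.
    """
    coeffs = {}
    for i, f in enumerate(F):
        for j, g in enumerate(G):
            coeffs[(i, j)] = coeffs.get((i, j), 0) + f * g  # term of F(t)G(x)
            coeffs[(j, i)] = coeffs.get((j, i), 0) - f * g  # term of F(x)G(t)
    return {key: c for key, c in coeffs.items() if c != 0}

def mul(A, B):
    """Product of two polynomials in t and x, in the same dictionary format."""
    out = {}
    for (i, j), a in A.items():
        for (k, l), b in B.items():
            out[(i + k, j + l)] = out.get((i + k, j + l), 0) + a * b
    return {key: c for key, c in out.items() if c != 0}

# Sample data (an assumption for illustration): g = (x^2 + 1)/x, so F = x^2 + 1, G = x.
F, G = [1, 0, 1], [0, 1]
H_star_i = h_star(F, G)

t_minus_x = {(1, 0): 1, (0, 1): -1}
tx_minus_1 = {(1, 1): 1, (0, 0): -1}
factored = mul(t_minus_x, tx_minus_1)

deg_t = max(i for i, _ in H_star_i)
deg_x = max(j for _, j in H_star_i)
```

In this example both degrees equal \(2 = \deg_\max(g)\), and the factor \(t - x\) is visible explicitly.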
Claim 1. If \(\deg_t(H^*) < \deg_t(H^*_i)\), then \(\deg_x(Q_i) > 0\).
Proof of Claim 1. Assume \(\deg_t(H^*) < \deg_t(H^*_i)\). Then \(\deg_t(Q_i) \geq 1\). If in addition \(\deg_x(Q_i) = 0\), then we can write \(Q_i(t)\) for \(Q_i\). Let \(F_i(t) \equiv \tilde F_i(t) \mod Q_i(t)\) and \(G_i(t) \equiv \tilde G_i(t) \mod Q_i(t)\) with \(\deg(\tilde F_i) < \deg(Q_i)\) and \(\deg(\tilde G_i) < \deg(Q_i)\). Then \(\tilde F_i(t)G_i(x) - F_i(x) \tilde G_i(t) \equiv 0 \mod Q_i(t)\). Comparing degrees in \(t\), we have \(\tilde F_i(t)G_i(x) = F_i(x) \tilde G_i(t)\). It is straightforward to check that this contradicts Properties 1 and 2 above, which completes the proof of Claim 1.
Let \(m := \min\{\deg_\max(g_i): i = 1, \ldots, s\}\), and pick \(i\) such that \(\deg_\max(g_i) = m\). Property 2 above implies that \(\deg_t(H^*_i) = \deg_x(H^*_i) = m\). If \(\deg_t(H^*) < m\), then Claim 1 implies that \(\deg_x(H^*) < \deg_x(H^*_i) = m\). If the \(h_j\) are as in Property 3 above, it follows that \(\deg_\max(h_j) < m\) for each \(j\). Since \(H^* \not\in k[t]\) (e.g. since \(t-x\) divides each \(H_i\)), there must be at least one \(h_j \not \in k\). Since adding that \(h_j\) to the list of the \(g_i\) decreases the value of \(m\), it follows that the following algorithm must stop:
Algorithm
- Step 1: Pick \(g_i := F_i/G_i\), \(i = 1, \ldots, s\), satisfying properties 1 and 2 above.
- Step 2: Compute the monic (with respect to \(t\)) \(\gcd\) of \(F_i(t) - g_i G_i(t)\), \(i = 1, \ldots, s\), in \(k(x)[t]\); call it \(H\).
- Step 3: Write \(H = \sum_j h_j(x) t^j\). Then each \(h_j \in k(g_1, \ldots, g_s)\). If \(\deg_t(H) < \min\{\deg_\max(g_i): i = 1, \ldots, s\}\), then adjoin all (or, at least one) of the \(h_j\) such that \(h_j \not\in k\) to the list of the \(g_i\) (possibly after an appropriate transformation to ensure Property 2), and repeat.
After the last step of the algorithm, \(H\) must be one of the \(H_i\), in other words, there is \(\nu\) such that \[\gcd(F_i(t) - g_i G_i(t): i = 1, \ldots, s) = F_{\nu}(t) - g_{\nu}G_{\nu}(t).\]
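To see the algorithm at work on a small example (chosen here purely for illustration), take \(K = k(x^4, x^6)\), so \(g_1 = x^4\) and \(g_2 = x^6\) with \(G_1 = G_2 = 1\); Properties 1 and 2 hold. The Euclidean algorithm in \(k(x)[t]\) gives \[(t^6 - x^6) - t^2(t^4 - x^4) = x^4(t^2 - x^2), \qquad t^4 - x^4 = (t^2 - x^2)(t^2 + x^2),\] so \(H = t^2 - x^2\). Since \(\deg_t(H) = 2 < 4 = \min\{\deg_\max(g_1), \deg_\max(g_2)\}\), Step 3 adjoins a non-constant coefficient of \(H\). The coefficient is \(h_0 = -x^2\); replacing it by its negative does not change the field it generates, so we may take \(g_3 := x^2\). On the next pass \(H_3 = t^2 - x^2\), and the \(\gcd\) is again \(H = t^2 - x^2 = H_3\), now with \(\deg_t(H) = 2 = \min_i \deg_\max(g_i)\). The algorithm therefore stops with \(\nu = 3\), giving \(K = k(x^2)\), in agreement with \(x^2 = g_2/g_1\).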
Claim 2. \(K = k(g_{\nu})\).
Proof of Claim 2 (and last step of the proof of Lüroth’s theorem). For a given \(i\), polynomial division in \(k(g_\nu)[t]\) gives \(P, Q \in k(g_\nu)[t]\) such that \[F_i(t) = (F_{\nu}(t) - g_{\nu}G_{\nu}(t))P + Q,\] where \(\deg_t(Q) < \deg_t(F_{\nu}(t) - g_{\nu}G_{\nu}(t))\). If \(Q = 0\), then \(F_i(t) = (F_{\nu}(t) - g_{\nu}G_{\nu}(t))P\), and clearing out the denominator (with respect to \(k[g_\nu]\)) of \(P\) gives an identity of the form \(F_i(t)p(g_\nu) = (F_{\nu}(t) - g_{\nu}G_{\nu}(t))P^* \in k[g_\nu, t]\) which is impossible, since \(F_{\nu}(t) - g_{\nu}G_{\nu}(t)\) does not factor in \(k[g_\nu, t]\). Therefore \(Q \neq 0\). Similarly, \[G_i(t) = (F_{\nu}(t) - g_{\nu}G_{\nu}(t))R + S,\] where \(R, S \in k(g_\nu)[t]\), \(S \neq 0\), and \(\deg_t(S) < \deg_t(F_{\nu}(t) - g_{\nu}G_{\nu}(t))\). It follows that \[F_i(t) - g_iG_i(t) = (F_{\nu}(t) - g_{\nu}G_{\nu}(t))(P - g_iR) + Q - g_iS.\] Since \(F_{\nu}(t) - g_{\nu}G_{\nu}(t)\) divides \(F_{i}(t) - g_{i}G_{i}(t)\) in \(k(x)[t]\) and since \(\deg_t(Q - g_iS) < \deg_t(F_{\nu}(t) - g_{\nu}G_{\nu}(t))\), it follows that \(Q = g_iS\). Taking the leading coefficients (with respect to \(t\)) \(q_0, s_0 \in k(g_\nu)\) of \(Q\) and \(S\) gives that \(g_i = q_0/s_0 \in k(g_\nu)\), as required to complete the proof.
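As a small illustration of this division argument (an example chosen here, with sample values that are assumptions), let \(g_\nu = x^2\), so that \(F_\nu(t) - g_\nu G_\nu(t) = t^2 - g_\nu\), and let \(g_i = x^4\), so that \(F_i(t) = t^4\) and \(G_i(t) = 1\). Division in \(k(g_\nu)[t]\) gives \[t^4 = (t^2 - g_\nu)(t^2 + g_\nu) + g_\nu^2, \qquad 1 = (t^2 - g_\nu) \cdot 0 + 1,\] so \(Q = g_\nu^2\) and \(S = 1\). The identity \(Q = g_iS\) then reads \(g_i = g_\nu^2\), which is indeed \(x^4 = (x^2)^2\).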
Applications
The following question seems to be interesting (geometrically, it asks when a given polynomial parametrization of a rational affine plane curve is proper).
Question 1. Let \(k\) be a field and \(x\) be an indeterminate over \(k\) and \(g_1, g_2 \in k[x] \). When is \(k(g_1, g_2) = k(x)\)?
We now give a sufficient condition for the equality in Question 1. Note that the proof is elementary: it does not use Lüroth’s theorem, only follows the steps of the above proof in a special case.
Corollary 1. In the setup of Question 1, let \(d_i := \deg(g_i)\), \(i = 1, 2\). If the \(\gcd\) of \(x^{d_1} - 1, x^{d_2} - 1\) in \(k[x]\) is \(x - 1\), then \(k(g_1, g_2) = k(x)\). In particular, if \(d_1, d_2\) are relatively prime and the characteristic of \(k\) is either zero or greater than both \(d_1, d_2\), then \(k(g_1, g_2) = k(x)\).
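The hypothesis of Corollary 1 is easy to test by machine. The following minimal sketch does so for \(k = \mathbb{Q}\) only, using Python's standard library; the function names (`trim`, `poly_mod`, `poly_gcd`, `gcd_of_powers`) are hypothetical, and the sketch says nothing about positive characteristic.

```python
from fractions import Fraction

def trim(p):
    """Copy p as Fractions and drop trailing zeros; p[i] is the coefficient of x**i."""
    p = [Fraction(c) for c in p]
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_mod(a, b):
    """Remainder of a divided by b in Q[x]; b must be nonzero."""
    a, b = trim(a), trim(b)
    while len(a) >= len(b):
        shift = len(a) - len(b)
        c = a[-1] / b[-1]
        for i, bc in enumerate(b):
            a[i + shift] -= c * bc
        a = trim(a)
    return a

def poly_gcd(a, b):
    """Monic gcd in Q[x] via the Euclidean algorithm (a, b not both zero)."""
    a, b = trim(a), trim(b)
    while b:
        a, b = b, poly_mod(a, b)
    return [c / a[-1] for c in a]

def gcd_of_powers(d1, d2):
    """Monic gcd of x**d1 - 1 and x**d2 - 1 over Q, as a coefficient list."""
    p1 = [-1] + [0] * (d1 - 1) + [1]
    p2 = [-1] + [0] * (d2 - 1) + [1]
    return poly_gcd(p1, p2)

# Degrees 2 and 3: the gcd is x - 1, so the hypothesis of Corollary 1 holds over Q.
coprime_case = gcd_of_powers(2, 3)
# Degrees 4 and 6: the gcd is x^2 - 1, so the hypothesis fails.
non_coprime_case = gcd_of_powers(4, 6)
```

A coefficient list equal to `[-1, 1]` encodes \(x - 1\), which is the case the corollary requires.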
Remark. Corollary 1 is true without the restriction on characteristics, i.e. the following holds: “if \(d_1, d_2\) are relatively prime, then \(k(g_1, g_2) = k(x)\).” François Brunault (in a comment to one of my questions on MathOverflow) provided the following simple one-line proof: \([k(x): k(g_1, g_2)]\) divides both \([k(x): k(g_i)] = d_i\), and therefore must be \(1\).
My original proof of Corollary 1. Following the algorithm from the above proof of Lüroth’s theorem, let \(H_i := g_i(t) – g_i(x)\), \(i = 1, 2\), and \(H \in k(x)[t]\) be the monic (with respect to \(t\)) greatest common divisor of \(H_1, H_2\).
Claim 1.1. \(H = t – x\).
Proof. It is clear that \(t-x\) divides \(H\) in \(k(x)[t]\), so that \(H(x,t) = (t-x)h_1(x,t)/h_2(x)\) for some \(h_1(x,t) \in k[x,t]\) and \(h_2(x) \in k[x]\). It follows that there is \(Q_i(x,t) \in k[x,t]\) and \(P_i(x) \in k[x]\) such that \(H_i(x,t)P_i(x)h_2(x) = (t-x)h_1(x,t)Q_i(x,t)\). Since \(h_2(x)\) and \((t-x)h_1(x,t)\) have no common factor, it follows that \(h_2(x)\) divides \(Q_i(x,t)\), and after cancelling \(h_2(x)\) from both sides, one can write \[H_i(x,t)P_i(x) = (t-x)h_1(x,t)Q'_i(x,t),\ i = 1, 2.\] Taking the leading form of both sides with respect to the usual degree on \(k[x,t]\), we have that \[(t^{d_i} - x^{d_i})x^{p_i} = a_i(t-x)\mathrm{ld}(h_1)\mathrm{ld}(Q'_i)\] where \(a_i \in k \setminus \{0\}\) and \(\mathrm{ld}(\cdot)\) is the leading form with respect to the usual degree on \(k[x,t]\). Since \(\gcd(x^{d_1} - 1, x^{d_2} - 1) = x - 1\), it follows that \(\mathrm{ld}(h_1)\) does not have any factor common with \(t^{d_i} - x^{d_i}\), and consequently, \(t^{d_i} - x^{d_i}\) divides \((t-x)\mathrm{ld}(Q'_i)\). In particular, \(\deg_t(Q'_i) = d_i - 1\). But then \(\deg_t(h_1) = 0\). Since \(H = (t-x)h_1(x)/h_2(x)\) is monic in \(t\), it follows that \(H = t - x\), which proves Claim 1.1.
Since both \(H_i\) are elements of \(k(g_1, g_2)[t]\), and since the Euclidean algorithm to compute \(\gcd\) of polynomials (in a single variable over a field) preserves the field of definition, it follows that \(H \in k(g_1, g_2)[t]\) as well (this is precisely the observation of Property 3 from the above proof of Lüroth’s theorem). Consequently \(x \in k(g_1, g_2)\), as required to prove Corollary 1.
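For a concrete instance of this argument (an illustrative example, not part of the original proof), take \(g_1 = x^2 + x\) and \(g_2 = x^3\), so \(d_1 = 2\) and \(d_2 = 3\). Dividing \(H_2 = t^3 - g_2\) by \(H_1 = t^2 + t - g_1\) in \(k(g_1, g_2)[t]\) gives \[t^3 - g_2 = (t^2 + t - g_1)(t - 1) + (g_1 + 1)t - (g_1 + g_2).\] The remainder has degree \(1\) in \(t\) and vanishes at \(t = x\), so it equals \((g_1 + 1)(t - x)\), and hence \[H = t - \frac{g_1 + g_2}{g_1 + 1}, \qquad x = \frac{g_1 + g_2}{g_1 + 1} = \frac{x^3 + x^2 + x}{x^2 + x + 1}.\] The Euclidean algorithm thus produces an explicit expression for \(x\) as a rational function of \(g_1\) and \(g_2\).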
References
- Andrzej Schinzel, Selected Topics on Polynomials, The University of Michigan Press, 1982.