5. Bruhat order
I was planning to include this as part of an earlier post, but I figured it was important enough to warrant a post of its own. Bruhat order is a partial ordering on a Weyl group that arises from Schubert calculus. The main result required later is Theorem 5.3, but since this is its own post I have added some additional material that we may explore in further detail later. We have two main goals here: one is to develop the theory sufficiently to state and prove the main result of my dissertation, and the other is to present some of my work that is unpublished and probably will only be published here. It's going to take a few more posts to get to my dissertation, after which we will see where things go. There is an unpublished result of mine based on these root systems that has already been cited by Sara Billey in a talk and that I hope to get to here.
Definition 1: Let (R,B) be a root system. If u,v\in W(R), define u\to v if there exists y\in I(v) such that us_y=v; that is, u^{-1}v is a reflection and \ell(u) < \ell(v). Then define a partial ordering \leq, called Bruhat order, to be the reflexive, transitive closure of \to.
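Before moving on, here is a quick computational rendering of Definition 1 in type A, where (as we saw in post 2) the Weyl group is a symmetric group and the reflections are exactly the transpositions. The encoding and helper names below are my own, not anything standard:

```python
# A minimal sketch of u -> v in S_n, with permutations as 0-indexed tuples.
from itertools import permutations

def length(w):
    """Coxeter length = number of inversions (Corollary 2.5)."""
    return sum(1 for i in range(len(w)) for j in range(i + 1, len(w)) if w[i] > w[j])

def compose(u, v):
    return tuple(u[v[i]] for i in range(len(u)))          # (uv)(i) = u(v(i))

def inverse(u):
    out = [0] * len(u)
    for i, ui in enumerate(u):
        out[ui] = i
    return tuple(out)

def arrow(u, v):
    """u -> v: u^{-1} v is a reflection (a transposition) and l(u) < l(v)."""
    t = compose(inverse(u), v)
    moved = [i for i in range(len(t)) if t[i] != i]
    return len(moved) == 2 and length(u) < length(v)

e = (0, 1, 2)
print([v for v in permutations(range(3)) if arrow(e, v)])  # the three transpositions
```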
This definition of Bruhat order is not always the easiest to work with. The main theorem in this post gives another characterization of Bruhat order in terms of subwords.
Definition 2: Given a word (s_{x_1},\ldots,s_{x_n}), we say that a sequence (e_1,\ldots,e_n) is a subword of this word if e_i\in\{1,s_{x_i}\} for all i. A subword has an associated word obtained by deleting all of the 1's; we say the subword is reduced if the corresponding word is reduced, and we say that the subword is for the element e_1e_2\cdots e_n. We can iterate this process; if we have two subwords \mathbf{u}'=(e_1,\ldots,e_n) and \mathbf{u}''=(f_1,\ldots,f_n) of a word \mathbf{w}, we say that \mathbf{u}' is a subword of \mathbf{u}'' if e_i\in \{1,f_i\} for all i.
We will need the following lemma (or rather the corollary afterwards).
Lemma 5.1 (The deletion property): Let (R,B) be a root system, let w\in W(R), and let (s_{x_1},\ldots,s_{x_n}) be a word for w. If w is not reduced, then there exist indices i < j such that
(s_{x_1},\ldots,\widehat{s_{x_i}},\ldots,\widehat{s_{x_j}},\ldots,s_{x_n})
is a word for w.
Proof: Let j be the minimal index such that (s_{x_1},\ldots,s_{x_j}) is not a reduced word. Let v=s_{x_1}\cdots s_{x_{j-1}}. Then \ell(vs_{x_j}) < \ell(v): by Theorem 2.3 either \ell(vs_{x_j}) < \ell(v) or \ell(vs_{x_j}) > \ell(v)=j-1, and the latter would force \ell(vs_{x_j})=j, making the word reduced. By Theorem 2.3 there thus exists an index i such that
(s_{x_1},\ldots,\widehat{s_{x_i}},\ldots,s_{x_{j-1}})
is a word for vs_{x_j}. Deleting the letters at indices i and j from the original word therefore gives a word for (vs_{x_j})s_{x_{j+1}}\cdots s_{x_n}=w, so these are the i and j we seek in the statement of the result. \square
The proof of the following is easy and we leave it to the reader.
Corollary 5.2: Let (R,B) be a root system and let w\in W(R). Given any word (s_{x_1},\ldots,s_{x_n}) for w there exists a reduced subword for w. \square
Theorem 5.3: Let (R,B) be a root system and let u,v\in W(R). Then u\leq v if and only if for any word (s_{x_1},\ldots,s_{x_n}) for v there exists a reduced subword for u.
Proof: We prove that if u\leq v then there exists such a reduced subword by induction on \ell(v)-\ell(u), the result being true by the corollary when \ell(u)=\ell(v), in which case u=v since each step of \to strictly increases length. Otherwise by definition there exists a u' with u\leq u'\to v. By Theorem 2.3 we may delete one element from any word for v to obtain a word for u', or equivalently set the element at some single index to 1 to obtain a subword for u'. By the induction hypothesis, this subword for u' in turn contains a reduced subword for u. Thus there is a reduced subword for u of the word for v, and one implication follows.
Suppose now that there exists such a subword for u in any word for v. Let \mathbf{v} be a reduced word for v containing a reduced subword \mathbf{u} for u; we prove by induction on \ell(v) that u\leq v. When \ell(v)=0 we must have that \ell(u)=0 so the result holds. Let n=\ell(v) and let s_{x_n} be the last letter of \mathbf{v}, that is, the element \mathbf{v}_n. Let \widehat{\mathbf{v}} be the subword of \mathbf{v} such that \widehat{\mathbf{v}}_n=1 and \widehat{\mathbf{v}}_i=\mathbf{v}_i for i < n; note that this is a subword for vs_{x_n}. Similarly, let \widehat{\mathbf{u}} be the subword of \mathbf{u} such that \widehat{\mathbf{u}}_n=1 and \widehat{\mathbf{u}}_i=\mathbf{u}_i for i < n. Suppose first \mathbf{u}_n\neq s_{x_n}. Then \mathbf{u}=\widehat{\mathbf{u}} is a subword of \widehat{\mathbf{v}}, so u\leq vs_{x_n}\to v by the induction hypothesis and the result follows.
If \mathbf{u}_n=s_{x_n}, then since \widehat{\mathbf{u}} is a subword for us_{x_n} and is a subword of \widehat{\mathbf{v}} we have by the induction hypothesis that us_{x_n}\leq vs_{x_n}. If us_{x_n}=vs_{x_n} then u=v and we are done, so assume otherwise; then there exists some u' such that us_{x_n}\leq u'\to vs_{x_n}, and there is a reduced subword \mathbf{u}' of \widehat{\mathbf{v}} for u'. Assume without loss of generality that \widehat{\mathbf{u}} is a subword of \mathbf{u}' (this is possible by the implication already proven, applied to the word for u' associated to \mathbf{u}'). We split into two cases, depending on whether x_n\in I(u').
If x_n\notin I(u'), then \ell(u's_{x_n}) > \ell(u'), hence if \mathbf{u}'' is the subword of \mathbf{v} for u's_{x_n} such that \mathbf{u}''_i=\mathbf{u}'_i for i < n and \mathbf{u}''_n=s_{x_n}, then since \mathbf{u} is a subword of \mathbf{u}'' and u's_{x_n}\to v (since if u's_y=vs_{x_n} then (u's_{x_n})s_{x_n\ast y}=v) we have by the induction hypothesis that u\leq u's_{x_n}\leq v and we are done.
If x_n\in I(u'), then we may assume that \mathbf{u}'_n=s_{x_n}. This is because if this were not the case, then by Theorem 2.3 we could delete an element of \mathbf{u}', setting say \mathbf{u}'_i=1, to obtain a subword for u's_{x_n}, and then set \mathbf{u}'_n=s_{x_n} to obtain another subword for u'. By the implication already proven, \mathbf{u}' contains a reduced subword for us_{x_n}, and its last element is not s_{x_n} (because no reduced word for us_{x_n} ends in s_{x_n}); setting the last element of that subword to s_{x_n} gives a reduced subword \widetilde{\mathbf{u}} of \mathbf{u}' for u, so that u\leq u'\to vs_{x_n}\to v by the induction hypothesis and we are done. The result follows by induction.
\square
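The proof is intricate, so a brute-force sanity check may be reassuring. The sketch below (my own hypothetical encoding: permutations of \{0,1,2\} as tuples) verifies Theorem 5.3 over all of S_3 by comparing the subword criterion with the closure-of-\to definition:

```python
# Brute-force check of Theorem 5.3 in S_3; all names here are my own.
from itertools import combinations, permutations, product

N = 3
E = tuple(range(N))
GENS = [(1, 0, 2), (0, 2, 1)]                  # the simple reflections of S_3

def length(w):
    return sum(1 for i in range(N) for j in range(i + 1, N) if w[i] > w[j])

def mul(u, v):
    return tuple(u[v[i]] for i in range(N))    # (uv)(i) = u(v(i))

def transpositions():
    out = []
    for i in range(N):
        for j in range(i + 1, N):
            t = list(E); t[i], t[j] = t[j], t[i]
            out.append(tuple(t))
    return out

def reduced_words(w):
    """All reduced words for w, as lists of simple reflections."""
    if length(w) == 0:
        return [[]]
    return [rw + [s] for s in GENS if length(mul(w, s)) < length(w)
            for rw in reduced_words(mul(w, s))]

def has_reduced_subword(word, u):
    """A subword of size l(u) multiplying to u is automatically reduced."""
    for keep in combinations(range(len(word)), length(u)):
        p = E
        for i in keep:
            p = mul(p, word[i])
        if p == u:
            return True
    return False

def bruhat_leq(u, v):
    """Bruhat order as the reflexive, transitive closure of -> (Definition 1)."""
    if u == v:
        return True
    return any(length(mul(v, t)) < length(v) and bruhat_leq(u, mul(v, t))
               for t in transpositions())

for u, v in product(permutations(range(N)), repeat=2):
    assert bruhat_leq(u, v) == all(has_reduced_subword(wd, u) for wd in reduced_words(v))
print("Theorem 5.3 checked on S_3")
```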
Now for the additional material, where we relate Bruhat order to subsystems. The following theorem, which is an easy consequence of the theory built so far in this blog, is highly nontrivial in the setting of Coxeter systems. It was the main result of (Dyer, Matthew. On the "Bruhat graph" of a Coxeter system, Compositio Mathematica 78.2 (1991): 185-191). To start we will need a lemma, followed by a definition, and then we can move on to the theorem. Recall that for a root system (R,B) and a subsystem U, we defined in Theorem-Definition 4.5 a function \phi_U:W(R)\to W(U) that sends an element w\in W(R) to the element of W(U) whose inversion set in U is the intersection I(w)\cap U.
Lemma 5.4: Let (R,B) be a root system, let U\subseteq R be a subsystem, and let w\in W(R). Then w\in W(U) if and only if
\phi_U(w)=w
Proof: By definition, the codomain of \phi_U is W(U), hence if \phi_U(w)=w then w\in W(U). Conversely, suppose w\in W(U) and \phi_U(w)=w'\in W(U). Then I_U(w)=I_U(w'), and since w,w'\in W(U) it follows by Proposition 3.5 that w=w'. The result follows. \square
Definition 3: Let (R,B) be a root system and let U be a subsystem. If u,v\in W(R) are such that \ell(u) < \ell(v) and there exists a root y\in U such that us_y=v, then we say that u\to_U v. Define \leq_U to be the reflexive, transitive closure of \to_U. Note that Bruhat order on W(U) is the restriction of the relation \leq_U.
The Bruhat graph of a Weyl group W(R) is the directed graph on W(R) with the edges u\to v, where we may consider the edge u\to v as being labeled with the reflection u^{-1}v. The theorem below says that the Bruhat graph of any coset wW(U)\subseteq W(R) is isomorphic as a labeled graph to the Bruhat graph of W(U) and is an induced subgraph of the Bruhat graph of W(R) (so there are no extra edges that do not come from reflections in W(U)).
Theorem 5.5: Let (R,B) be a root system, let U be a subsystem, and let w\in W(R). If u,v are both contained in the coset wW(U), then u\to_R v if and only if u\to_U v.
Proof: If u\to_U v, then clearly u\to_R v. Conversely, if u\to_R v, then there exists a reflection t\in W(R) such that ut=v. Since u,v are in the same left coset of W(U) we must have that t\in W(U). We need to prove then that if t\in W(R) is a reflection and t\in W(U), then there is a root y\in U such that t=s_y. The contrapositive is that if t=s_y for some y\in R-U, then t\notin W(U). To see this, let w'\in W(R) be an element such that w'(y)\in B. Then s_y\notin W(U) if and only if s_{w'(y)}\notin W(w'(U)), since s_{w'(y)}=w's_yw'^{-1} and W(w'(U))=w'W(U)w'^{-1} (note that w'(U) is again a subsystem because w'(y\ast z)=w'(y)\ast w'(z)). For all z\in w'(U)\cap R^+(B) we have that w'(y)\ast z\in R^+(B), hence \phi_{w'(U)}(s_{w'(y)})=1\neq s_{w'(y)}, hence s_{w'(y)}\notin W(w'(U)) by Lemma 5.4 and we are done. \square
This implies that if u,v\in wW(U) and \phi_U(u)\leq_U \phi_U(v), then u\leq v. However, this does not in general imply that if u,v\in wW(U) and u\leq v then \phi_U(u)\leq_U \phi_U(v). As an example, consider the root system R_{A}^3, defined in post 2. Let x=t_2-t_3 and let y=t_1-t_4. Then U=\{\pm x,\pm y\} is a subsystem of R_{A}^3 since x\ast y=y and y\ast x=x, and s_x\leq s_y. However it is not the case that s_x\leq_U s_y, because \ell_U(s_x)=\ell_U(s_y)=1 while s_x\neq s_y.
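Concretely, s_x is the transposition (2\ 3) and s_y is the transposition (1\ 4) in S_4, and s_x\leq s_y can be checked with Theorem 5.3 using the reduced word s_1s_2s_3s_2s_1 for s_y. A minimal sketch (the encoding is my own):

```python
def mul(u, v):                     # (uv)(i) = u(v(i)), permutations of {0,1,2,3}
    return tuple(u[v[i]] for i in range(4))

s = {1: (1, 0, 2, 3), 2: (0, 2, 1, 3), 3: (0, 1, 3, 2)}   # simple reflections
word = [s[1], s[2], s[3], s[2], s[1]]                      # a reduced word for s_y

w = (0, 1, 2, 3)
for letter in word:
    w = mul(w, letter)
assert w == (3, 1, 2, 0)           # the transposition (1 4), of length 5

# s_x has length 1, so a reduced subword for s_x is a single letter equal to it:
print(any(letter == s[2] for letter in word))              # True, so s_x <= s_y
```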
4. Subsystems of root systems
Here we define subsystems of root systems and prove the necessary facts about them. Unlike for groups, rings, etc., it's pretty tough to prove that a subset closed under the operation is itself a root system. An equivalent statement is given as an exercise in the text Combinatorics of Coxeter groups (though so are some unsolved problems), and the equivalent statement was the main result of Matthew Dyer's dissertation.
Definition 1: Let (R,B) be a root system. A subset U\subseteq R is called a subsystem if it is closed under the operation \ast in R. Define B_R(U)=\{x\in R^+(B)\cap U|x\ast y\in R^+(B)\mbox{ for all }y\in R^+(B)\cap U\mbox{ with }x\neq y\} We will be showing below that B_R(U) forms a basis for U, though it's not even clear that B_R(U) is nonempty. Define also for x\in R and U\subseteq R x\ast U=\{x\ast y|y\in U\} Intuitively this is a "translation" of a subsystem by a root. We've been using this notation already for inversion sets. Note that y\mapsto x\ast y is a bijection between U and x\ast U, and since (x\ast y)\ast (x\ast z)=x\ast (y\ast z) we have that x\ast U is a subsystem of R. You may wish to verify that if x\in U then x\ast U=U.
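Here is a minimal sketch of B_R(U) in type A, with the root t_i-t_j encoded as the pair (i,j); the function names are my own:

```python
def star(x, y):
    """(t_a - t_b) * (t_c - t_d) = t_{s_ab(c)} - t_{s_ab(d)}."""
    a, b = x
    swap = lambda i: b if i == a else a if i == b else i
    return (swap(y[0]), swap(y[1]))

def basis_of_subsystem(U):
    """B_R(U): positive roots of U that keep every other positive root of U positive."""
    pos = [y for y in U if y[0] < y[1]]
    return [x for x in pos if all(star(x, y)[0] < star(x, y)[1]
                                  for y in pos if y != x)]

# The A_2 subsystem {±(t_1-t_2), ±(t_2-t_3), ±(t_1-t_3)} inside R_A^n:
U = [(1, 2), (2, 1), (2, 3), (3, 2), (1, 3), (3, 1)]
print(basis_of_subsystem(U))       # [(1, 2), (2, 3)]: t_1-t_3 is not simple in U
```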
Lemma 4.1: If (R,B) is a root system, U\subseteq R is a subsystem, and x\in B-U then B_R(x\ast U)=x\ast B_R(U) and R^+(B_R(x\ast U))=x\ast R^+(B_R(U))
Proof: First we prove that B_R(x\ast U)=x\ast B_R(U). Note that x\notin x\ast U. Suppose y\in B_R(x\ast U). Then for all z\in R^+(B)\cap (x\ast U) such that y\neq z we have that y\ast z\in R^+(B)\cap (x\ast U). Since x\in B and x\notin x\ast U, x\neq y,z and hence x\ast (y\ast z)=(x\ast y)\ast (x\ast z)\in R^+(B)\cap U; as z ranges over R^+(B)\cap (x\ast U)-\{y\}, the root x\ast z ranges over all of R^+(B)\cap U-\{x\ast y\}. Thus x\ast y\in B_R(U), hence y\in x\ast B_R(U). Thus B_R(x\ast U)\subseteq x\ast B_R(U).
Suppose conversely that y\in x\ast B_R(U). Then for all z\in R^+(B)\cap U with z\neq x\ast y we have that (x\ast y)\ast z \in R^+(B)\cap U, hence x\ast ((x\ast y)\ast z)=y\ast (x\ast z)\in R^+(B)\cap (x\ast U). Thus y\in B_R(x\ast U), and the result follows.
Now note that R^+(B_R(x\ast U)) is the smallest set satisfying B_R(x\ast U)=x\ast B_R(U)\subseteq R^+(B_R(x\ast U)) and if y\in R^+(B_R(x\ast U)) and x'\in x\ast B_R(U) are such that x'\neq y, then x'\ast y\in R^+(B_R(x\ast U)). Since x\ast (y\ast z)=(x\ast y)\ast (x\ast z), we see that the bijection y\mapsto x\ast y preserves these properties, thus x\ast R^+(B_R(x\ast U))=R^+(B_R(U)). The result follows. \square
The following theorem is essentially the main result of Dyer's dissertation (Reflection subgroups of Coxeter systems, Journal of Algebra 135 (1), 1990, 57-73).
Theorem 4.2: Let (R,B) be a root system and let U\subseteq R be a subsystem. Then (U,B_R(U)) is a root system and R^+(B_R(U))=R^+(B)\cap U
Proof: The theorem follows if we prove that R^+(B_R(U))=R^+(B)\cap U: granting this, for y\in U exactly one of y and -y lies in R^+(B)\cap U by Axiom 3 for (R,B), so Axiom 3 holds for (U,B_R(U)). It is an easy consequence of the definition of B_R(U) that R^+(B_R(U))\subseteq R^+(B)\cap U. We prove that the reverse inclusion holds. Suppose y\in R^+(B)\cap U. Let y=w(x) for w\in W(R) and x\in B with \ell(w) minimal. We prove by induction on \ell(w), simultaneously for all subsystems U, that y\in R^+(B_R(U)). If \ell(w)=0, then y\in B, hence y\in B_R(U), so y\in R^+(B_R(U)). Otherwise let x'\in B be such that \ell(s_{x'}w) < \ell(w). If x'\in U, then since x'\ast y=s_{x'}w(x) and \ell(s_{x'}w) < \ell(w) we have by the induction hypothesis that x'\ast y\in R^+(B_R(U)), hence the same is true of y=x'\ast(x'\ast y). If x'\notin U, then x'\ast y\in x'\ast U and by the induction hypothesis x'\ast y\in R^+(B_R(x'\ast U))=R^+(x'\ast B_R(U))=x'\ast R^+(B_R(U)) (the equalities by Lemma 4.1), so y=x'\ast (x'\ast y)\in R^+(B_R(U)). The result follows by induction. \square
We need to know now that W(U) can be naturally treated as a subgroup of W(R). For x\in U define s_x^U:U\to U to be the corresponding reflection in U, and s_x^R to be the corresponding reflection in R. Define W_R(U) to be the subgroup of W(R) generated by all s_x^R.
Theorem 4.3: Let (R,B) be a root system and let U\subseteq R be a subsystem. Then the group homomorphism W_R(U)\to W(U) determined by s_x^R\mapsto s_x^U is an isomorphism.
Proof: The reflections s_x^R for x\in B_R(U) generate W_R(U) by Proposition 2.2 applied to the root system (U,B_R(U)) of Theorem 4.2. Let w\in W_R(U) be a nonidentity element; we claim that the image of w is not the identity. Let \ell_U(w) be the minimal length of a word for w in terms of elements of S(B_R(U)). Let y\in R^+(B)\cap U. We claim that \ell_U(ws_y) < \ell_U(w) if and only if y\in I(w). The proof of this is similar to an earlier proof (Theorem 2.3), so we omit it. By exactly the same line of reasoning as the proof of Proposition 2.4, it follows that \ell_U(w)=|I(w)\cap U|. Thus if the image of w is the identity, then w fixes every element of U, so that I(w)\cap U=\emptyset, hence \ell_U(w)=0 and w is the identity. This proves injectivity of the homomorphism. Surjectivity is clear. \square
We therefore do not distinguish reflections in a subsystem from the corresponding reflections in the larger root system, and we identify W(U) with the subgroup W_R(U) of W(R).
Definition 2: If (R,B) is a root system, U\subseteq R is a subsystem, and w\in W(R), define I_U(w) = I(w)\cap U so that I(w)=I_R(w). We use this notation to isolate the part of an inversion set lying in the subsystem U; when w\in W(U), I_U(w) is exactly the inversion set of w computed in the root system (U,B_R(U)). Note that we allow any w\in W(R) here, as the lemma below is applied to elements outside W(U).
Lemma 4.4: Let (R,B) be a root system, U\subseteq R a subsystem, x\in B-U, and w\in W(R). Then I_U(w)=x\ast I_{x\ast U}(ws_x)
Proof: If y\in I_U(w), then w(y)\notin R^+(B) and y\in U. Then x\ast y\in (x\ast U)\cap R^+(B), and ws_x(x\ast y)=w(y)\notin R^+(B). Thus I_U(w)\subseteq x\ast I_{x\ast U}(ws_x). Conversely, if y\in I_{x\ast U}(ws_x), then y\in x\ast U, so that x\ast y\in U, and ws_x(y)=w(x\ast y)\notin R^+(B). Since x\notin x\ast U we have that x\ast y\in I_U(w) and the result follows. \square
Now we define a function \phi_U:W(R)\to W(U) that allows us to "project" elements of W(R) onto W(U).
Theorem-Definition 4.5: If (R,B) is a root system and U\subseteq R is a subsystem, then there is a unique function \phi_U:W(R)\to W(U) such that for each w\in W(R) we have that I(w)\cap U=I_U(\phi_U(w))
Proof: We prove existence by induction on \ell(w), the result being clear if \ell(w)=0. Let x\in I(w)\cap B. If x\in U, then since I(w)=\{x\}\cup (x\ast I(ws_x)) we have that \begin{align} I(w)\cap U &= (\{x\}\cup (x\ast I(ws_x)))\cap U\\ &=\{x\}\cup (x\ast I_U(\phi_U(ws_x)))\\ &=I_U(\phi_U(ws_x)s_x) \end{align} (using Lemma 3.2 in the root system (U,B_R(U)), noting x\in B\cap U\subseteq B_R(U)), so \phi_U(w)=\phi_U(ws_x)s_x If x\notin U, set v=\phi_{x\ast U}(ws_x)\in W(x\ast U), so that s_xvs_x\in W(U). Since v permutes x\ast U and x\notin x\ast U, the sets I(v) and I(s_xv) can differ only in the positive root among \pm v^{-1}(x), which does not lie in x\ast U; hence I_{x\ast U}(s_xv)=I_{x\ast U}(v). We note that \begin{align} I(w)\cap U&=(\{x\}\cup (x\ast I(ws_x)))\cap U\\ &=x\ast (I(ws_x)\cap (x\ast U))\\ &=x\ast I_{x\ast U}(v)\\ &=x\ast I_{x\ast U}((s_xvs_x)s_x)\\ &=I_U(s_xvs_x)\\ \end{align} (the second equality uses x\notin U, the third is the induction hypothesis, the fourth was just observed, and the fifth is Lemma 4.4 applied to s_xvs_x), hence \phi_U(w)=s_x\phi_{x\ast U}(ws_x)s_x and the result follows by induction. Uniqueness follows by uniqueness of inversion sets (Proposition 3.5). \square
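Since \phi_U is characterized by inversion sets, it can be modeled by brute force: enumerate W(U) and match inversion sets inside U. The following sketch does this in W(R_A^3)=S_4 for the subsystem U=\{\pm(t_2-t_3),\pm(t_1-t_4)\}; the encodings and names are mine:

```python
# Sketch of phi_U by exhaustive matching inside S_4 (hypothetical helper names).
N = 4

def inversion_set(w):
    """Positive roots t_i - t_j (1-indexed, i < j) with w(i) > w(j)."""
    return {(i + 1, j + 1) for i in range(N) for j in range(i + 1, N) if w[i] > w[j]}

def transposition(i, j):                       # 1-indexed
    t = list(range(N)); t[i - 1], t[j - 1] = t[j - 1], t[i - 1]
    return tuple(t)

def generate(gens):
    """Close the generators into a subgroup by a tiny breadth-first search."""
    group, frontier = {tuple(range(N))}, [tuple(range(N))]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = tuple(g[s[i]] for i in range(N))
            if h not in group:
                group.add(h); frontier.append(h)
    return group

U_pos = {(2, 3), (1, 4)}                       # positive roots of the subsystem U
W_U = generate([transposition(i, j) for (i, j) in U_pos])

def phi_U(w):
    target = inversion_set(w) & U_pos          # I(w) ∩ U (positive part)
    matches = [v for v in W_U if inversion_set(v) & U_pos == target]
    assert len(matches) == 1                   # existence and uniqueness (4.5)
    return matches[0]

print(phi_U((2, 3, 1, 0)))                     # (3, 2, 1, 0), i.e. (2 3)(1 4)
```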
The following theorem is quite deep and is related to the theory of permutation patterns.
Theorem-Definition 4.6: Let (R,B) be a root system, let w\in W(R), and let U\subseteq R be a subsystem. Then there exist unique elements w^U\in W(R) and w_U\in W(U) such that I(w^U)\cap U=\emptyset and w=w^Uw_U In particular, w^U is the unique element of minimal length in the coset wW(U).
Proof: I claim that w_U=\phi_U(w) satisfies the condition, with w^U=ww_U^{-1}. We prove by induction on \ell_U(\phi_U(w)) that I(w\phi_U(w)^{-1})\cap U=\emptyset. Set v=\phi_U(w). If \ell_U(v)=0, then w^U=w and w_U=1=\phi_U(w) will work because I(w^U)\cap U=\emptyset by definition. Otherwise let x\in B_R(U)\cap I_U(v). Then \ell(ws_x) < \ell(w) because x\in I(w), and by the induction hypothesis since \phi_U(ws_x)=vs_x we have that I((ws_x)^U)\cap U=I((ws_x)(vs_x)^{-1})\cap U=\emptyset. Since (ws_x)(vs_x)^{-1}=wv^{-1}, the result follows by induction.
To prove uniqueness of w^U, and hence of w_U, suppose u\in wW(U) is an element of minimal length. Then I(u)\cap U=\emptyset because \ell(us_y) > \ell(u) for all y\in R^+(B)\cap U. Any element in wW(U) is equal to uv for some v\in W(U), and if y\in I_U(v) then -v(y)\in R^+(B)\cap U, hence -uv(y)=u(-v(y))\in R^+(B) because I(u)\cap U=\emptyset, hence y\in I(uv). This proves uniqueness both of the minimal-length element and of the element u with I(u)\cap U=\emptyset, because every other element of the coset has a root of R^+(B)\cap U in its inversion set. \square
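For a concrete instance of Theorem-Definition 4.6, take R=R_A^3, U=\{\pm(t_2-t_3),\pm(t_1-t_4)\}, and w\in S_4 the permutation with one-line notation 3421. Then I(w)\cap U=\{t_2-t_3,t_1-t_4\}, so w_U=\phi_U(w) is the product of the transpositions (2\ 3) and (1\ 4), with one-line notation 4321, and w^U=ww_U^{-1} has one-line notation 1243. One checks that I(1243)=\{t_3-t_4\} is disjoint from U, and that 1243 is indeed the shortest element of the coset wW(U): the four coset elements 3421, 3241, 1423, 1243 have lengths 5, 4, 2, 1 respectively.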
3. Roots, words and inversion sets
In this post we gather some technical results relating roots, words and inversion sets. This is all standard Coxeter group theory.
Definition 1: Let (R,B) be a root system and let \mathbf{w}=(s_{x_1},\ldots,s_{x_n}) be a word. For each index i\leq n we assign a root r_i(\mathbf{w})=s_{x_n}s_{x_{n-1}}\cdots s_{x_{i+1}}(x_i)
The proof of Lemma 3.1 below is actually quite easy and we leave it to the reader.
Lemma 3.1: Let (R,B) be a root system and let \mathbf{u}=(s_{x_1},\ldots,s_{x_n}) be a word. Let x\in B and let \mathbf{v}=(s_{x_1},\ldots,s_{x_n},s_x) Then for each 1\leq i\leq n we have r_i(\mathbf{v})=x\ast r_i(\mathbf{u}) \square
The next lemma allows us to build inversion sets root by root.
Lemma 3.2: Let (R,B) be a root system, w\in W(R), and x\in B. If x\notin I(w), then I(ws_x)=\{x\}\cup (x\ast I(w)) If x\in I(w), then I(ws_x)=x\ast (I(w)-\{x\})
Proof: Suppose x\notin I(w). Then since ws_x(x)=w(-x)=-w(x)\in -R^+(B) we have that x\in I(ws_x). If y\in I(ws_x) with y\neq x, then ws_x(y)=w(x\ast y)\notin R^+(B) Since y\neq x, x\ast y\in R^+(B). Thus x\ast y\in I(w). Conversely, if y\in I(w), then w(y)=ws_x(x\ast y)\notin R^+(B), hence x\ast y\in I(ws_x). Thus y\mapsto x\ast y is a bijection of I(w) with I(ws_x)-\{x\}, and we are done. The second part is left to the reader. \square
Theorem 3.3 tells us why the roots we assign to these indices are important.
Theorem 3.3: Let (R,B) be a root system and let \mathbf{w}=(s_{x_1},\ldots,s_{x_n}) be a reduced word for the element w=s_{x_1}\cdots s_{x_n} Then I(w)=\{r_i(\mathbf{w})|1\leq i\leq n\}
Proof: We prove this by induction on \ell(w). The base case is \ell(w)=0, in which case a reduced word for w is empty and hence the result is true. For the induction step, we assume the result is true for w with an arbitrary reduced word \mathbf{w}=(s_{x_1},\ldots,s_{x_n}), let x\in B-I(w), and prove the result for the element ws_x and the word \mathbf{w}'=(s_{x_1},\ldots,s_{x_n},s_x) We define unambiguously I_{ws_x}=\{r_i(\mathbf{w}')|1\leq i\leq n+1\} By Lemma 3.1 above and the induction hypothesis, we have that I_{ws_x}=\{x\}\cup (x\ast I(w)) Hence by Lemma 3.2, I_{ws_x}=I(ws_x) and the result follows. \square
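Definition 1 and Theorem 3.3 are easy to experiment with. The sketch below (my own encoding: simple reflections indexed 1 to 3 in S_4, roots as 1-indexed pairs) computes the roots r_i(\mathbf{w}) of a reduced word and compares them with the inversion set:

```python
# Check of Theorem 3.3 in S_4: the r_i of a reduced word list out I(w).
N = 4
s = {1: (1, 0, 2, 3), 2: (0, 2, 1, 3), 3: (0, 1, 3, 2)}   # simple reflections

def mul(u, v):
    return tuple(u[v[i]] for i in range(N))

def act(w, root):                       # w(t_i - t_j) = t_{w(i)} - t_{w(j)}
    i, j = root
    return (w[i - 1] + 1, w[j - 1] + 1)

def roots_of_word(word):
    """r_i(w) = s_{x_n} ... s_{x_{i+1}}(x_i), with x_i the simple root t_i - t_{i+1}."""
    out = []
    for i in range(len(word)):
        root = (word[i], word[i] + 1)
        for k in range(i + 1, len(word)):   # apply s_{x_{i+1}}, then ..., then s_{x_n}
            root = act(s[word[k]], root)
        out.append(root)
    return out

def inversion_set(w):
    return {(i + 1, j + 1) for i in range(N) for j in range(i + 1, N) if w[i] > w[j]}

word = [1, 2, 3]                        # a reduced word for the 4-cycle 2341
w = (0, 1, 2, 3)
for k in word:
    w = mul(w, s[k])
print(set(roots_of_word(word)) == inversion_set(w))        # True
```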
Definition 2: Given an element w\in W(R), denote by R(w) the set of reduced words for w. Given a reduced word \mathbf{w}=(s_{x_1},\ldots,s_{x_n}) for the element w, we define a bijection b_{\mathbf{w}}:I(w)\to [\ell(w)] by declaring that b_{\mathbf{w}}(r_j(\mathbf{w}))=j. Then define B(w)=\{b_{\mathbf{w}}|\mathbf{w}\in R(w)\} These can be seen as linear orderings on the inversion set. These correspond bijectively to reduced words as we prove now.
Theorem 3.4: Let (R,B) be a root system and let w\in W(R). Then the map R(w)\to B(w) given by \mathbf{w}\mapsto b_{\mathbf{w}} is a bijection.
Proof: Clearly the map is surjective. We prove it is injective by induction on \ell(w). This is clear if \ell(w)=0. Otherwise, suppose \mathbf{u}\neq \mathbf{v} are reduced words for w. If \mathbf{u}_{\ell(w)}\neq \mathbf{v}_{\ell(w)}, then there exist distinct x,x'\in B\cap I(w) such that b_{\mathbf{u}}(x)=b_{\mathbf{v}}(x')=\ell(w), hence b_{\mathbf{u}}\neq b_{\mathbf{v}}. If instead \mathbf{u}_{\ell(w)}=\mathbf{v}_{\ell(w)}, then there exists x\in B\cap I(w) such that b_{\mathbf{u}}(x)=b_{\mathbf{v}}(x)=\ell(w). Define \mathbf{u}'=(\mathbf{u}_i)_{i < \ell(w)} and \mathbf{v}'=(\mathbf{v}_i)_{i < \ell(w)} Then \mathbf{u}'\neq \mathbf{v'} and if y\neq x then b_{\mathbf{u}}(y)=b_{\mathbf{u}'}(x\ast y) and b_{\mathbf{v}}(y)=b_{\mathbf{v}'}(x\ast y) By the induction hypothesis, b_{\mathbf{u}'}(x\ast y)\neq b_{\mathbf{v}'}(x\ast y) for some y\in I(w), hence b_{\mathbf{u}}\neq b_{\mathbf{v}} and the result follows by induction. \square
Uniqueness of inversion sets is a very useful fact that we now prove.
Proposition 3.5: Let (R,B) be a root system and let u,v\in W(R). If I(u)=I(v), then u=v.
Proof: We prove this by induction on \ell(u)=\ell(v). If \ell(u)=\ell(v)=0 this is clear. Otherwise let x\in I(u)\cap B. Then I(us_x)=I(vs_x)=x\ast (I(u)-\{x\}) By the induction hypothesis, us_x=vs_x, hence u=(us_x)s_x=(vs_x)s_x=v by induction. \square
2. Algebraic root systems
This construction, which is my own, is an alternative axiomatization of the root systems from Lie theory. This is not completely unparalleled; there is a similar construction by Dyer (in the paper On rigidity of abstract root systems of Coxeter systems), though it is a bit more complicated. What follows is a combination of standard results about Coxeter groups and standard results about root systems, presented in one abstract framework. For more traditional reading try Combinatorics of Coxeter groups by Björner and Brenti or Reflection Groups and Coxeter Groups by Humphreys.
A pre-root system is defined as a set R with a binary operation \ast:R\times R\to R satisfying the following axioms:
Axiom 1: x\ast(x\ast y)=y
Axiom 2: x\ast(y\ast z)=(x\ast y)\ast(x\ast z)
Note that this operation is virtually never associative. One might suspect that this leads to pathologies, but with the third axiom below quite the opposite is true.
For convenience in notation we make the following definitions.
Definition 1: Let R be a pre-root system. If x\in R, define a function s_x:R\to R by s_x(y)=x\ast y for all y\in R. s_x is the reflection corresponding to x. Let W(R) be the group generated by the reflections in R; W(R) is called the Weyl group of R. Define also -x=x\ast x=s_x(x). Then we have the following identities (proof left to the reader):
- s_x^2=1, the identity element of W(R), and in particular -(-x)=x.
- s_x(-y)=-s_x(y)
- s_{-x}(y)=s_x(y)
- s_x(y\ast z)=s_x(y)\ast s_x(z)=s_{s_x(y)}(s_x(z))
Definition 2: Let B\subseteq R be a subset of a pre-root system R and define R^+(B) to be the smallest subset of R satisfying
- B\subseteq R^+(B).
- If y\in R^+(B), x\in B, and x\neq y, then x\ast y\in R^+(B).
Now we can state the third axiom. A pre-root system R is called a root system if the following additional axiom holds:
Axiom 3: There exists a subset B\subseteq R such that for all x\in R we have x\in R^+(B) if and only if -x\notin R^+(B).
A subset B\subseteq R satisfying Axiom 3 is called a basis of R, the elements of B are called simple roots, and R^+(B) is called the set of positive roots. The subset -R^+(B)=\{y\in R|-y\in R^+(B)\} is called the set of negative roots. The reflections s_x for x\in B are called the simple reflections and the set of these will be denoted by S(B). These definitions are dependent on the choice of basis, so most of the time we will be fixing a basis of R and considering pairs (R,B) where R is a root system and B is a basis. It will therefore be convenient to refer to the pair (R,B) as a root system.
The inevitable question is, "Why should I care about this?" Let's construct an example. Let V be the real vector space spanned by the indeterminates t_1,t_2,\ldots,t_{n+1} and let R_{A}^n be the set of all differences t_i-t_j with i\neq j. For positive integers a,b with a\neq b define a permutation s_{a,b}, the transposition exchanging a and b, by s_{a,b}(i)=\left\{\begin{array}{ll}b&\mbox{ if }i=a\\a&\mbox{ if }i=b\\i&\mbox{ otherwise}\end{array}\right. Then we define a product \ast:R_A^n\times R_A^n\to R_A^n by (t_a-t_b)\ast (t_c-t_d)=t_{s_{a,b}(c)}-t_{s_{a,b}(d)} I leave it to you to prove that R_A^n satisfies Axioms 1 and 2. To convince you that our notation is consistent, note that (t_a-t_b)\ast (t_a-t_b)=t_b-t_a=-(t_a-t_b) where the right-hand side of the equation can be interpreted either in a root-system-theoretic sense or in the vector space sense; the two are equivalent.
For Axiom 3 we need to choose a basis. Define B_A^n=\{t_i-t_{i+1}|1\leq i\leq n\}
Theorem 2.1: (R_A^n,B_A^n) is a root system.
Proof: I claim that R^+(B_A^n)=\{t_i-t_j|i < j\} To see that R^+(B_A^n) must contain all differences t_i-t_j with i < j we use induction on j-i, the result holding by definition for j-i=1. Suppose then that t_a-t_b for all b-a < k are contained in R^+(B_A^n). Then (t_{b}-t_{b+1})\ast (t_a-t_b)=t_a-t_{b+1}, so the result follows by induction. To see that R^+(B_A^n) is exactly the set of these differences, note that if a < b and t_c-t_{c+1}\neq t_a-t_b, then (t_c-t_{c+1})\ast (t_a-t_b)=t_{s_{c,c+1}(a)}-t_{s_{c,c+1}(b)} satisfies s_{c,c+1}(a) < s_{c,c+1}(b); since R^+(B_A^n) is the smallest subset with the property that for all y\in R^+(B_A^n) and x\in B_A^n with x\neq y we have that x\ast y\in R^+(B_A^n), the claim follows. Finally, for i\neq j exactly one of t_i-t_j and t_j-t_i=-(t_i-t_j) satisfies i < j, so Axiom 3 holds for B_A^n. \square
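If you would rather let a computer do the verification I left to you, here is a brute-force check of Axioms 1-3 for R_A^2, with the root t_i-t_j encoded as the pair (i,j); the encoding is my own:

```python
# Brute-force verification of Axioms 1-3 for R_A^2 (hypothetical encoding).
n = 2
R = [(i, j) for i in range(1, n + 2) for j in range(1, n + 2) if i != j]

def star(x, y):
    a, b = x
    swap = lambda i: b if i == a else a if i == b else i
    return (swap(y[0]), swap(y[1]))

assert all(star(x, star(x, y)) == y for x in R for y in R)                 # Axiom 1
assert all(star(x, star(y, z)) == star(star(x, y), star(x, z))
           for x in R for y in R for z in R)                               # Axiom 2

# R^+(B) as the closure from Definition 2, starting from B = B_A^n:
B = [(i, i + 1) for i in range(1, n + 1)]
pos, changed = set(B), True
while changed:
    changed = False
    for x in B:
        for y in list(pos):
            if x != y and star(x, y) not in pos:
                pos.add(star(x, y)); changed = True

assert pos == {r for r in R if r[0] < r[1]}               # Theorem 2.1's description
assert all((x in pos) != ((x[1], x[0]) in pos) for x in R)                 # Axiom 3
print("Axioms 1-3 hold for R_A^2")
```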
Now, ask yourself the following question: "What is the Weyl group of R_A^n (that is to say W(R_A^n))?" It is the symmetric group S_{n+1}! To see this, extend the action of t_a-t_b to the elements t_i, that is (t_a-t_b)\ast t_i=t_{s_{a,b}(i)}. Where an element of W(R_A^n) sends any root is completely determined by where it sends all t_i, so an element of W(R_A^n) is uniquely determined by the bijection it induces of \{t_1,\ldots,t_{n+1}\} with itself. This gives us a permutation representation of W(R_A^n). This representation contains all of the transpositions, so it must be the entirety of S_{n+1}. The theory below will allow us to prove in short order highly nontrivial results about elements of S_{n+1}.
Pre-root systems that do not satisfy Axiom 3 are of little interest to us. However, they are far from useless. Structures that satisfy only Axiom 2 are called quandles, and structures satisfying both Axioms 1 and 2 are called involutory quandles. Both have applications in knot theory.
Now we turn to the development of the theory of root systems.
Proposition 2.2: Let (R,B) be a root system. Then
- For all w\in W(R) and x\in R we have ws_xw^{-1}=s_{w(x)}.
- For all y\in R there exist w\in W(R) and x\in B such that y=w(x).
- The set of simple reflections S(B) is a generating set of W(R).
Proof of 1: This follows from the identity s_x(y\ast z)=s_x(y)\ast s_x(z)=s_{s_x(y)}(s_x(z)) by induction on a word for w, so we leave it to the reader.
Proof of 2: First assume y\in R^+(B). Set B_0=B, and for i>0 define B_i=\{x\ast y|x\in B,x\neq y,y\in B_{i-1}\} We claim that B_i\subseteq R^+(B) for all i. We prove this by induction, the base case B_0\subseteq R^+(B) being clear by definition. Suppose y\in B_i, i\geq 1. Then there exists x\in B such that x\ast y\in B_{i-1} (note we are using the fact that x\ast (x\ast y)=y). Since by the induction hypothesis x\ast y\in R^+(B), and x\in B, by definition of R^+(B) we must have that x\ast (x\ast y)=y\in R^+(B), so the result follows by induction.
It follows that \bigcup_{i=0}^{\infty}{B_i}\subseteq R^+(B) We also have that if y\in\bigcup_{i=0}^{\infty}{B_i}, x\in B, and x\neq y then x\ast y\in \bigcup_{i=0}^{\infty}{B_i}. By definition, R^+(B) is the smallest subset satisfying this property, so the opposite inclusion R^+(B)\subseteq\bigcup_{i=0}^{\infty}{B_i} holds. For each y\in B_i we have that there exists x\in B and w\in W(R) such that w(x)=y; namely, if we set y_i=y\in B_i and let x_i\in B be such that x_i\ast y_i\in B_{i-1}, then y=s_{x_i}s_{x_{i-1}}\cdots s_{x_1}(x_0) where x_0\in B satisfies x_1\ast x_0=y_1, so we may take w=s_{x_i}s_{x_{i-1}}\cdots s_{x_1} and x=x_0, hence the result follows for positive roots. For negative roots y, note that -y=s_y(y) is positive by Axiom 3, so there exists w'\in W(R) and x\in B such that w'(x)=s_y(y); thus s_yw'(x)=y, so we may take w=s_yw'.
Proof of 3: It suffices to show that each reflection s_y for y\in R^+(B) can be written as a product of simple reflections since s_{-y}=s_y. We know from the proof of part 2 that there exist simple roots x_1,\ldots,x_i\in B and x\in B such that s_{x_1}\cdots s_{x_i}(x)=y. Set s_{x_1}\cdots s_{x_i}=w. Then ws_xw^{-1}=s_{w(x)}=s_y by part 1, so the result follows. \square
Definition 3: Let (R,B) be a root system and let w\in W(R). A sequence of simple reflections (s_{x_1},\ldots,s_{x_n}) is called a word for w if w=s_{x_1}\cdots s_{x_n}. We know by the previous proposition that since S(B) generates W(R), every element of W(R) has at least one word. If (s_{x_1},\ldots,s_{x_n}) is a word for w such that the length n is as small as possible, then the word is called a reduced word. Define \ell(w) to be the length of a reduced word for w. Define also the inversion set I(w) of w by I(w)=\{y\in R^+(B)|w(y)\notin R^+(B)\} A positive root y\in R^+(B) such that w(y)\notin R^+(B) will correspondingly be called an inversion.
Theorem 2.3: Let (R,B) be a root system and let w\in W(R). Let y\in R^+(B). Then \ell(ws_y) < \ell(w) if and only if y\in I(w). If y\in I(w) and (s_{x_1},\ldots,s_{x_n}) is a (possibly unreduced) word for w, then there is an index i such that (s_{x_1},\ldots,\widehat{s_{x_i}},\ldots,s_{x_n}) is a word for ws_y.
Proof: Suppose y\in I(w) and let (s_{x_1},\ldots,s_{x_n}) be a word for w. Let i be the maximal index such that s_{x_i}\cdots s_{x_n}(y)\notin R^+(B). We have that x_i\ast s_{x_{i+1}}\cdots s_{x_n}(y)\notin R^+(B), hence s_{x_{i+1}}\cdots s_{x_n}(y)=x_i because x_i\in B. Thus y=s_{x_n}\cdots s_{x_{i+1}}(x_i), hence s_y=s_{x_n}\cdots s_{x_{i+1}}s_{x_i}s_{x_{i+1}}\cdots s_{x_{n}}. Thus (s_{x_1},\ldots,\widehat{s_{x_i}},\ldots,s_{x_n}) is a word for ws_y, where the caret indicates omission. If we assume that the word is reduced, then since ws_y has a word that is shorter than a reduced word for w we must have that \ell(ws_y) < \ell(w).
Now note that if y\in R^+(B)-I(w), then ws_y(y)=w(-y)=-w(y)\notin R^+(B), hence y\in I(ws_y). Thus \ell(w)=\ell((ws_y)s_y) < \ell(ws_y), so the result follows. \square
The previous theorem is more important than it looks. It implies that (W(R),S(B)) is a Coxeter system and W(R) is a Coxeter group. Coxeter groups have extremely nice properties and there is a huge amount of literature on them. We're assuming no prior knowledge in this blog, so we will be proving the required results about Coxeter group theory as they come up.
Proposition 2.4: Let (R,B) be a root system and let w\in W(R).
- \ell(w) = |I(w)|; that is, the length of a reduced word for w is the number of inversions of w.
- If x\in I(w)\cap B, then \ell(ws_x)=\ell(w)-1.
As a final note, I promised you nontrivial results about S_{n+1}, so here is one that comes for free. Note that the set of simple reflections in S_{n+1} is the set of all s_{i,i+1}, that is to say the adjacent transpositions.
Corollary 2.5: If w\in S_{n+1}, then the minimal number of adjacent transpositions required to express w (meaning \ell(w)) is exactly the number of pairs i < j such that w(i) > w(j).
Proof: We know from the previous proposition that \ell(w) is equal to the number of inversions of w. What is an inversion of an element of S_{n+1}? It is a root t_i-t_j with i < j such that t_{w(i)}-t_{w(j)} is a negative root, meaning w(i) > w(j). This is exactly what the statement of the corollary claims. \square
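This, too, is easy to confirm by machine for small n: a breadth-first search over S_4 on the adjacent transpositions computes \ell(w), which we then compare with the inversion count. (A sketch with my own helper names:)

```python
# Check of Corollary 2.5 in S_4 via BFS on adjacent transpositions.
from itertools import permutations

n = 4
e = tuple(range(n))
dist, frontier = {e: 0}, [e]
while frontier:
    nxt = []
    for g in frontier:
        for i in range(n - 1):                 # right-multiply by s_{i,i+1}
            h = list(g); h[i], h[i + 1] = h[i + 1], h[i]; h = tuple(h)
            if h not in dist:
                dist[h] = dist[g] + 1
                nxt.append(h)
    frontier = nxt

def inversions(w):
    return sum(1 for i in range(n) for j in range(i + 1, n) if w[i] > w[j])

assert all(dist[w] == inversions(w) for w in permutations(range(n)))
print("Corollary 2.5 holds for S_4")
```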
1. Introduction
In this blog we will be exploring Schubert calculus without any algebraic geometry or algebraic topology. Our method of investigation is just plain algebra. With just some undergraduate level abstract algebra under your belt, you can go from zero to the very core of the deep combinatorics of Schubert calculus in minutes. Allow me to demonstrate.
For this introduction we will be focusing only on type A, and the main object of study is the symmetric group S_n. S_n for our purposes will be the group of all bijective functions from the set [n]=\{1,2,\ldots,n\} to itself. There are natural injective homomorphisms S_n\hookrightarrow S_{n+1} for all n obtained by considering [n] as a subset of [n+1]; the image of a function in this homomorphism S_n\hookrightarrow S_{n+1} will fix n+1 and do to the rest of the elements what it is doing already. It will be convenient to consider all of these groups to be sitting inside S_\infty, which is the group of bijective functions f:\mathbb N\to\mathbb{N} such that f(i)=i for all but finitely many i. Essentially, S_\infty=\bigcup_{n=1}^{\infty}{S_n}
Let F be a field of characteristic zero. We turn to the polynomial ring F[x_1,x_2,\ldots] in countably many indeterminates, which we will for convenience refer to as R. S_\infty acts on R by F-automorphisms (that is, automorphisms that are F-linear). Specifically, for f\in S_\infty the action of f on x_i is f\cdot x_i=x_{f(i)} There is only one way to extend this to an F-automorphism of R. Essentially f's action is the substitution x_i\mapsto x_{f(i)} in the usual sense. This action, while natural, is not commonly used in the literature because while it works perfectly well as a left action on the polynomials themselves, meaning f\cdot (g\cdot p)=(f\circ g)\cdot p, it does not come from a left action on the variables. If you move the action inside the parentheses of the polynomial p(x_1,\ldots), viewed as a function, the order of multiplication in S_\infty is reversed. Despite this, this will be the most convenient action for our purposes.
Now we will need to refer to specific elements of S_\infty, namely the adjacent transpositions s_i defined by s_i(j)=\left\{\begin{array}{cc}i+1&\mbox{ if }j=i\\i&\mbox{ if }j=i+1\\j&\mbox{ otherwise}\end{array}\right. The elements s_i generate S_\infty as a Coxeter group, which is very important and will be discussed later, but we will not dwell on it now. Now we are interested in defining the divided difference operators \partial^i for i\in\mathbb{N}. \partial^i takes elements of R to elements of R and is F-linear, and is defined abstractly as follows: \partial^i=\frac1{x_i-x_{i+1}}(1-s_i) If this definition is too abstract for you, that's no problem. It is sufficient to know that \partial^i acts as follows: \partial^i(p)=\frac{p-s_i\cdot p}{x_i-x_{i+1}} That is, we take the polynomial, switch two of the variables, subtract from the original, and divide by the difference of the two variables we swapped. It is not obvious that this always yields a polynomial, and you may want to convince yourself of that before moving on.
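If you would like to experiment, \partial^i is a few lines in sympy; the helper name and test polynomial below are my own choices:

```python
import sympy

x = sympy.symbols('x1:5')                     # x1, x2, x3, x4

def divided_difference(i, p):
    """d^i p = (p - s_i.p)/(x_i - x_{i+1}), with s_i swapping x_i and x_{i+1}."""
    swapped = p.subs({x[i - 1]: x[i], x[i]: x[i - 1]}, simultaneous=True)
    return sympy.cancel((p - swapped) / (x[i - 1] - x[i]))

p = x[0]**2 * x[1]
print(divided_difference(1, p))               # x1*x2: the result is a polynomial
```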
Now we define the operators \partial^w for all w\in S_\infty built from the \partial^i. Namely, if w can be expressed as a specific product of adjacent transpositions w=s_{i_1}\cdots s_{i_k} with k as small as possible (so the expression is as short as possible) then we define \partial^w=\partial^{i_1}\cdots\partial^{i_k} If we take such a product where the expression is not as short as possible, we instead get 0.
It is not difficult to work out a product rule for \partial^i. You may want to try it yourself, but in any case the formula is \partial^i(pq)=\partial^i(p)q+p\partial^i(q)+(x_{i+1}-x_{i})\partial^i(p)\partial^i(q) We can iterate this to obtain a product rule for \partial^w for arbitrary permutations w. We end up with a formula of the form \partial^w(pq)=\sum_{u,v\in S_\infty}{c_{u,v}^w\partial^u(p)\partial^v(q)} where the coefficients c_{u,v}^w are polynomials depending on u, v, and w.
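The product rule is likewise easy to test symbolically. This sketch (helper names mine, as in the previous snippet) verifies it for one pair of polynomials; the identity itself holds for all p and q:

```python
import sympy

x = sympy.symbols('x1:4')

def dd(i, p):                                 # the divided difference again
    swapped = p.subs({x[i - 1]: x[i], x[i]: x[i - 1]}, simultaneous=True)
    return sympy.cancel((p - swapped) / (x[i - 1] - x[i]))

p, q = x[0]**2, x[0]*x[1] + x[1]**2
lhs = dd(1, p * q)
rhs = dd(1, p)*q + p*dd(1, q) + (x[1] - x[0])*dd(1, p)*dd(1, q)
print(sympy.simplify(lhs - rhs) == 0)         # True
```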
You may be surprised to know that by now I have already made good on my promise of bringing you from zero to Schubert calculus in minutes. The coefficients c_{u,v}^w, the equivariant Littlewood-Richardson coefficients, are one of the central objects of study in Schubert calculus, and finding a formula of a certain type for them is a hopelessly difficult unsolved problem. While irrelevant to our geometry-free investigation, it is interesting to note that the polynomials c_{u,v}^w are the structure constants in the equivariant cohomology rings of complete flag varieties over the complex numbers (rings we will construct combinatorially later), and whenever c_{u,v}^w is an integer it counts the number of points in transverse triple intersections of Schubert varieties. It is known via geometric proofs (see Graham, Duke Math. J. Volume 109, Number 3 (2001), 599-614) that c_{u,v}^w is a polynomial in the differences x_{i+1}-x_i with nonnegative integer coefficients. This has not yet been paralleled combinatorially except in special cases, but we're working on it.
In the next post we will build our general framework for equivariant Schubert calculus.