A pre-root system is defined as a set \(R\) with a binary operation \(\ast:R\times R\to R\) satisfying the following axioms:
Axiom 1: \(x\ast(x\ast y)=y\)
Axiom 2: \(x\ast(y\ast z)=(x\ast y)\ast(x\ast z)\)
Note that this operation is virtually never associative. One might suspect that this leads to pathologies, but with the third axiom below quite the opposite is true.
For convenience in notation we make the following definitions.
Definition 1: Let \(R\) be a pre-root system. If \(x\in R\), define a function \(s_x:R\to R\) by \(s_x(y)=x\ast y\) for all \(y\in R\). \(s_x\) is the reflection corresponding to \(x\). Let \(W(R)\) be the group generated by the reflections in \(R\); \(W(R)\) is called the Weyl group of \(R\). Define also \(-x=x\ast x=s_x(x)\). Then we have the following identities (proof left to the reader):
- \(s_x^2=1\), the identity element of \(W(R)\), and in particular \(-(-x)=x\).
- \(s_x(-y)=-s_x(y)\)
- \(s_{-x}(y)=s_x(y)\)
- \(s_x(y\ast z)=s_x(y)\ast s_x(z)=s_{s_x(y)}(s_x(z))\)
Definition 2: Let \(B\subseteq R\) be a subset of a pre-root system \(R\) and define \(R^+(B)\) to be the smallest subset of \(R\) satisfying
- \(B\subseteq R^+(B)\).
- If \(y\in R^+(B)\), \(x\in B\), and \(x\neq y\), then \(x\ast y\in R^+(B)\).
Now we can state the third axiom. A pre-root system \(R\) is called a root system if the following additional axiom holds:
Axiom 3: There exists a subset \(B\subseteq R\) such that for all \(x\in R\) we have \(x\in R^+(B)\) if and only if \(-x\notin R^+(B)\).
A subset \(B\subseteq R\) satisfying Axiom 3 is called a basis of \(R\), the elements of \(B\) are called simple roots, and \(R^+(B)\) is called the set of positive roots. The subset \(-R^+(B)=\{y\in R|-y\in R^+(B)\}\) is called the set of negative roots. The reflections \(s_x\) for \(x\in B\) are called the simple reflections and the set of these will be denoted by \(S(B)\). These definitions are dependent on the choice of basis, so most of the time we will be fixing a basis of \(R\) and considering pairs \((R,B)\) where \(R\) is a root system and \(B\) is a basis. It will therefore be convenient to refer to the pair \((R,B)\) as a root system.
The inevitable question is, "Why should I care about this?" Let's construct an example. Let $V$ be the real vector space spanned by the indeterminates $t_1,t_2,\ldots,t_{n+1}$ and let $R_{A}^n$ be the set of all differences $t_i-t_j$ with $i\neq j$. For positive integers $a,b$ with $a\neq b$ define a permutation $s_{a,b}$, the transposition exchanging $a$ and $b$, by $$s_{a,b}(i)=\left\{\begin{array}{ll}b&\mbox{ if }i=a\\a&\mbox{ if }i=b\\i&\mbox{ otherwise}\end{array}\right.$$ Then we define a product $\ast:R_A^n\times R_A^n\to R_A^n$ by $$(t_a-t_b)\ast (t_c-t_d)=t_{s_{a,b}(c)}-t_{s_{a,b}(d)}$$ I leave it to you to prove that $R_A^n$ satisfies Axioms 1 and 2. To convince you that our notation is consistent, note that $$(t_a-t_b)\ast (t_a-t_b)=t_b-t_a=-(t_a-t_b)$$ where the right-hand side of the equation can be interpreted either in a root-system-theoretic sense or in the vector space sense; the two are equivalent.
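If you would rather let a computer do the checking before you write the proof, here is a minimal brute-force sketch. The encoding is my own assumption and not part of the definitions: a root $t_a-t_b$ is stored as the pair `(a, b)`, and the function names (`roots`, `star`, `neg`, and so on) are made up for this illustration. It tests Axioms 1 and 2 and the identities from Definition 1 on every tuple of roots for a small $n$, which is evidence rather than a proof.

```python
from itertools import permutations, product

def roots(n):
    """All roots t_a - t_b of R_A^n, encoded as pairs (a, b) with a != b."""
    return list(permutations(range(1, n + 2), 2))

def transpose(a, b, i):
    """The transposition s_{a,b} applied to the index i."""
    if i == a:
        return b
    if i == b:
        return a
    return i

def star(x, y):
    """(t_a - t_b) * (t_c - t_d) = t_{s_{a,b}(c)} - t_{s_{a,b}(d)}."""
    a, b = x
    c, d = y
    return (transpose(a, b, c), transpose(a, b, d))

def neg(x):
    """-x, defined as x * x."""
    return star(x, x)

def check_axioms(n):
    """Return (Axiom 1 holds, Axiom 2 holds) by exhaustive search."""
    R = roots(n)
    axiom1 = all(star(x, star(x, y)) == y for x, y in product(R, repeat=2))
    axiom2 = all(
        star(x, star(y, z)) == star(star(x, y), star(x, z))
        for x, y, z in product(R, repeat=3)
    )
    return axiom1, axiom2

def check_identities(n):
    """Check -x = t_b - t_a, s_x(-y) = -s_x(y) and s_{-x} = s_x."""
    R = roots(n)
    return all(
        neg(x) == (x[1], x[0])
        and star(x, neg(y)) == neg(star(x, y))
        and star(neg(x), y) == star(x, y)
        for x, y in product(R, repeat=2)
    )

print(check_axioms(3), check_identities(3))  # (True, True) True
```

For $n=3$ there are only 12 roots, so the triple loop for Axiom 2 runs over 1728 cases and finishes instantly.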
For Axiom 3 we need to choose a basis. Define $$B_A^n=\{t_i-t_{i+1}|1\leq i\leq n\}$$
Theorem 2.1: $(R_A^n,B_A^n)$ is a root system.
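Proof: I claim that $$R^+(B_A^n)=\{t_i-t_j|i < j\}$$ To see that $R^+(B_A^n)$ must contain all differences $t_i-t_j$ with $i < j$ we use induction on $j-i$, the result holding by definition for $j-i=1$. Suppose then that $R^+(B_A^n)$ contains every $t_a-t_b$ with $0 < b-a < k$, and consider a difference $t_a-t_{b+1}$ with $(b+1)-a=k\geq 2$. Then $a < b$, so $t_a-t_b\in R^+(B_A^n)$ is different from the simple root $t_b-t_{b+1}$, and $(t_{b}-t_{b+1})\ast (t_a-t_b)=t_a-t_{b+1}$, so the result follows by induction. To see that $R^+(B_A^n)$ must be exactly equal to the set of these differences, note that if $a < b$ and $t_c-t_{c+1}\neq t_a-t_b$ (that is, $c\neq a$ or $c+1\neq b$), then $(t_c-t_{c+1})\ast (t_a-t_b)=t_{s_{c,c+1}(a)}-t_{s_{c,c+1}(b)}$ satisfies $s_{c,c+1}(a) < s_{c,c+1}(b)$, because an adjacent transposition can reverse the order of $a < b$ only when $a=c$ and $b=c+1$. So the set of differences $t_i-t_j$ with $i < j$ contains $B_A^n$ and has the property that for all $y$ in it and all $x\in B_A^n$ with $x\neq y$ we have that $x\ast y$ is in it; since $R^+(B_A^n)$ is the smallest subset with this property, the claim follows. Axiom 3 now holds because for each root $t_i-t_j$ exactly one of $t_i-t_j$ and $-(t_i-t_j)=t_j-t_i$ has its smaller index first. $\square$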
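The claim in the proof is easy to test experimentally. The following is a rough sketch under my own conventions (roots $t_a-t_b$ encoded as pairs `(a, b)`, and hypothetical helper names `positive_closure` and `is_basis`): it computes $R^+(B)$ straight from Definition 2 as a closure, then tests the condition in Axiom 3 root by root.

```python
from itertools import permutations

def star(x, y):
    """(t_a - t_b) * (t_c - t_d): apply the transposition s_{a,b} to c and d."""
    a, b = x
    swap = {a: b, b: a}
    return tuple(swap.get(i, i) for i in y)

def positive_closure(B):
    """Smallest set containing B and closed under y -> x * y for x in B, x != y."""
    pos = set(B)
    frontier = list(B)
    while frontier:
        y = frontier.pop()
        for x in B:
            if x != y:
                z = star(x, y)
                if z not in pos:
                    pos.add(z)
                    frontier.append(z)
    return pos

def is_basis(R, B):
    """Axiom 3: for every root x, exactly one of x and -x = x * x lies in R^+(B)."""
    pos = positive_closure(B)
    return all((x in pos) != (star(x, x) in pos) for x in R)

n = 3
R = list(permutations(range(1, n + 2), 2))        # all t_a - t_b with a != b
B = [(i, i + 1) for i in range(1, n + 1)]         # the simple roots t_i - t_{i+1}

print(sorted(positive_closure(B)))
# [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
print(is_basis(R, B))                  # True
print(is_basis(R, [(1, 2), (2, 1)]))   # False: both a root and its negative are "positive"
```

The last line shows that Axiom 3 really is a condition on $B$: a badly chosen subset produces a set $R^+(B)$ that contains a root together with its negative.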
Now, ask yourself the following question: "What is the Weyl group of $R_A^n$ (that is to say $W(R_A^n)$)?" It is the symmetric group $S_{n+1}$! To see this, extend the action of $t_a-t_b$ to the elements $t_i$, that is $(t_a-t_b)\ast t_i=t_{s_{a,b}(i)}$. Where an element of $W(R_A^n)$ sends any root is completely determined by where it sends all $t_i$, so an element of $W(R_A^n)$ is uniquely determined by the bijection it induces of $\{t_1,\ldots,t_{n+1}\}$ with itself. This gives us a permutation representation of $W(R_A^n)$. This representation contains all of the transpositions, so it must be the entirety of $S_{n+1}$. The theory below will allow us to prove in short order highly nontrivial results about elements of $S_{n+1}$.
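For small $n$ you can confirm the count directly. In this sketch (again my own encoding, with roots as pairs `(a, b)` and an illustrative function name `weyl_group`), each element of $W(R_A^n)$ is stored as the tuple of images of all the roots, the group is generated from the reflections by closure, and its size is compared with $(n+1)!$. It only checks the order of the group, not the isomorphism itself.

```python
from itertools import permutations
from math import factorial

def star(x, y):
    """(t_a - t_b) * (t_c - t_d): apply the transposition s_{a,b} to c and d."""
    a, b = x
    swap = {a: b, b: a}
    return tuple(swap.get(i, i) for i in y)

def weyl_group(n):
    """Generate W(R_A^n) as a set of maps R -> R, each stored as a tuple of images."""
    R = list(permutations(range(1, n + 2), 2))
    index = {y: k for k, y in enumerate(R)}

    def compose(f, g):
        """The map 'f after g': f[k] is the image of R[k] under f."""
        return tuple(f[index[g[k]]] for k in range(len(R)))

    reflections = {tuple(star(x, y) for y in R) for x in R}
    identity = tuple(R)
    group = {identity}
    frontier = [identity]
    while frontier:
        w = frontier.pop()
        for s in reflections:
            v = compose(s, w)
            if v not in group:
                group.add(v)
                frontier.append(v)
    return group

for n in (1, 2, 3):
    print(n, len(weyl_group(n)), factorial(n + 1))
# 1 2 2
# 2 6 6
# 3 24 24
```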
Pre-root systems that do not satisfy Axiom 3 are of little interest to us. However, they are far from useless. Structures that satisfy only Axiom 2 are called shelves, and structures satisfying both Axioms 1 and 2 are called involutory racks. Racks, together with the closely related quandles (racks in which \(x\ast x=x\) for all \(x\)), have applications in knot theory.
Now we turn to the development of the theory of root systems.
Proposition 2.2: Let \((R,B)\) be a root system. Then
- For all \(w\in W(R)\) and \(x\in R\) we have \(ws_xw^{-1}=s_{w(x)}\).
- For all \(y\in R\) there exist \(w\in W(R)\) and \(x\in B\) such that \(y=w(x)\).
- The set of simple reflections \(S(B)\) is a generating set of \(W(R)\).
Proof of 2: We first treat the positive roots. Set \(B_0=B\), and for \(i>0\) define $$B_i=\{x\ast z|x\in B,z\in B_{i-1},x\neq z\}$$ We claim that \(B_i\subseteq R^+(B)\) for all \(i\). We prove this by induction, the base case \(B_0\subseteq R^+(B)\) being clear by definition. Suppose \(y\in B_i\), \(i>0\). Then there exists \(x\in B\) such that \(x\ast y\in B_{i-1}\) and \(x\neq x\ast y\) (note we are using the fact that \(x\ast (x\ast y)=y\)). Since by the induction hypothesis \(x\ast y\in R^+(B)\), and \(x\in B\), by definition of \(R^+(B)\) we must have that \(x\ast (x\ast y)=y\in R^+(B)\), so the result follows by induction.
It follows that $$\bigcup_{i=0}^{\infty}{B_i}\subseteq R^+(B)$$ We also have that if \(y\in\bigcup_{i=0}^{\infty}{B_i}\), \(x\in B\), and \(x\neq y\) then \(x\ast y\in \bigcup_{i=0}^{\infty}{B_i}\). By definition, \(R^+(B)\) is the smallest subset containing \(B\) and satisfying this property, so the opposite inclusion \(R^+(B)\subseteq\bigcup_{i=0}^{\infty}{B_i}\) holds. For each \(y\in B_i\) there exist \(x\in B\) and \(w\in W(R)\) such that \(w(x)=y\); namely, set \(y_i=y\) and, for \(j=i,i-1,\ldots,1\), choose \(x_j\in B\) such that \(y_{j-1}=x_j\ast y_j\in B_{j-1}\). Then \(x_0=y_0\in B\), and since \(y_j=x_j\ast y_{j-1}\) by Axiom 1 we get $$y=s_{x_i}s_{x_{i-1}}\cdots s_{x_1}(x_0)$$ so we may take \(w=s_{x_i}s_{x_{i-1}}\cdots s_{x_1}\) and $x=x_0$, hence the result follows for positive roots. For negative roots \(y\), note that \(-y=s_y(y)\) is positive by Axiom 3, so there exist \(w'\in W(R)\) and \(x\in B\) such that \(w'(x)=s_y(y)\); thus \(s_yw'(x)=y\), so we may take \(w=s_yw'\).
Proof of 3: It suffices to show that each reflection \(s_y\) for \(y\in R^+(B)\) can be written as a product of simple reflections since \(s_{-y}=s_y\). We know from the proof of part 2 that there exist simple roots \(x_1,\ldots,x_i\in B\) and \(x\in B\) such that \(s_{x_1}\cdots s_{x_i}(x)=y\). Set \(s_{x_1}\cdots s_{x_i}=w\). Then \(ws_xw^{-1}=s_{w(x)}=s_y\) by part 1, so the result follows. \(\square\)
Definition 3: Let \((R,B)\) be a root system and let \(w\in W(R)\). A sequence of simple reflections \((s_{x_1},\ldots,s_{x_n})\) is called a word for \(w\) if \(w=s_{x_1}\cdots s_{x_n}\). We know by the previous proposition that since $S(B)$ generates $W(R)$, every element of $W(R)$ has at least one word. If \((s_{x_1},\ldots,s_{x_n})\) is a word for \(w\) such that the length \(n\) is as small as possible, then the word is called a reduced word. Define \(\ell(w)\) to be the length of a reduced word for \(w\). Define also the inversion set \(I(w)\) of \(w\) by $$I(w)=\{y\in R^+(B)|w(y)\notin R^+(B)\}$$ A positive root $y\in R^+(B)$ such that $w(y)\notin R^+(B)$ will correspondingly be called an inversion.
Theorem 2.3: Let \((R,B)\) be a root system and let \(w\in W(R)\). Let $y\in R^+(B)$. Then \(\ell(ws_y) < \ell(w)\) if and only if \(y\in I(w)\). If $y\in I(w)$ and $(s_{x_1},\ldots,s_{x_n})$ is a (possibly unreduced) word for $w$, then there is an index $i$ such that $$(s_{x_1},\ldots,\widehat{s_{x_i}},\ldots,s_{x_n})$$ is a word for $ws_y$.
Proof: Suppose $y\in I(w)$ and let $(s_{x_1},\ldots,s_{x_n})$ be a word for $w$. Let $i$ be the maximal index such that $s_{x_i}\cdots s_{x_n}(y)\notin R^+(B)$; such an index exists because $s_{x_1}\cdots s_{x_n}(y)=w(y)\notin R^+(B)$. By maximality, $z=s_{x_{i+1}}\cdots s_{x_n}(y)$ is a positive root (for $i=n$ this is $y$ itself), while $x_i\ast z\notin R^+(B)$. Since $x_i\in B$, the definition of $R^+(B)$ forces $z=x_i$. Thus $y=s_{x_n}\cdots s_{x_{i+1}}(x_i)$, hence by Proposition 2.2 $s_y=s_{x_n}\cdots s_{x_{i+1}}s_{x_i}s_{x_{i+1}}\cdots s_{x_{n}}$. Multiplying on the left by $w=s_{x_1}\cdots s_{x_n}$, the factors $s_{x_{i+1}}\cdots s_{x_n}s_{x_n}\cdots s_{x_{i+1}}$ cancel and then $s_{x_i}s_{x_i}=1$. Thus $(s_{x_1},\ldots,\widehat{s_{x_i}},\ldots,s_{x_n})$ is a word for $ws_y$, where the caret indicates omission. If we assume that the word is reduced, then since $ws_y$ has a word that is shorter than a reduced word for $w$ we must have that $\ell(ws_y) < \ell(w)$.
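Now note that if $y\in R^+(B)-I(w)$, then $w(y)\in R^+(B)$, so $ws_y(y)=w(-y)=-w(y)\notin R^+(B)$ by Axiom 3, hence $y\in I(ws_y)$. Applying the first part of the proof to $ws_y$ in place of $w$ gives $\ell(w)=\ell((ws_y)s_y) < \ell(ws_y)$, so $\ell(ws_y) < \ell(w)$ fails in this case and the result follows. $\square$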
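To see the theorem in action on $(R_A^n,B_A^n)$, here is a small experiment. It rests on assumptions that are mine rather than part of the theorem: I use the identification of $W(R_A^n)$ with $S_{n+1}$ described earlier, store $w$ as the tuple $(w(1),\ldots,w(n+1))$, and compute $\ell$ by breadth-first search over words in the adjacent transpositions. The names `word_lengths` and `check_theorem` are just for this sketch.

```python
from collections import deque

def swap_positions(w, a, b):
    """Right-multiply w by the transposition of a and b (0-based positions)."""
    v = list(w)
    v[a], v[b] = v[b], v[a]
    return tuple(v)

def word_lengths(n):
    """Breadth-first search from the identity: maps each w to the length of a reduced word."""
    identity = tuple(range(1, n + 2))
    dist = {identity: 0}
    queue = deque([identity])
    while queue:
        w = queue.popleft()
        for i in range(n):                      # simple reflection s_{i+1,i+2}
            v = swap_positions(w, i, i + 1)
            if v not in dist:
                dist[v] = dist[w] + 1
                queue.append(v)
    return dist

def check_theorem(n):
    """For all w and all positive roots y: l(w s_y) < l(w) iff y is an inversion of w."""
    ell = word_lengths(n)
    for w in ell:
        for a in range(n + 1):
            for b in range(a + 1, n + 1):       # positive root t_{a+1} - t_{b+1}
                shorter = ell[swap_positions(w, a, b)] < ell[w]
                inverted = w[a] > w[b]          # w(y) is a negative root
                if shorter != inverted:
                    return False
    return True

print(check_theorem(3))  # True
```

Here $w(t_a-t_b)=t_{w(a)}-t_{w(b)}$, so a positive root $t_a-t_b$ with $a < b$ is an inversion exactly when $w(a) > w(b)$, which is what the `inverted` flag records.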
The previous theorem is more important than it looks. It implies that $(W(R),S(B))$ is a Coxeter system and $W(R)$ is a Coxeter group. Coxeter groups have extremely nice properties and there is a huge amount of literature on them. We're assuming no prior knowledge in this blog, so we will be proving the required results about Coxeter group theory as they come up.
Proposition 2.4: Let $(R,B)$ be a root system and let $w\in W(R)$.
- $\ell(w) = |I(w)|$; that is, the length of a reduced word for $w$ is the number of inversions of $w$.
- If $x\in I(w)\cap B$, then $\ell(ws_x)=\ell(w)-1$.
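Both parts of the proposition can be spot-checked the same way. This is a sketch under the same assumptions as before, none of which come from the proposition itself: $W(R_A^n)$ is identified with $S_{n+1}$, $w$ is stored as the tuple $(w(1),\ldots,w(n+1))$, and $\ell$ is computed by breadth-first search. The helper names are hypothetical.

```python
from collections import deque

def word_lengths(n):
    """Map each w in S_{n+1} to the length of a reduced word in adjacent transpositions."""
    identity = tuple(range(1, n + 2))
    dist = {identity: 0}
    queue = deque([identity])
    while queue:
        w = queue.popleft()
        for i in range(n):
            v = list(w)
            v[i], v[i + 1] = v[i + 1], v[i]     # right-multiply by s_{i+1,i+2}
            v = tuple(v)
            if v not in dist:
                dist[v] = dist[w] + 1
                queue.append(v)
    return dist

def inversion_set(w):
    """I(w): positive roots t_a - t_b (a < b) that w sends to negative roots."""
    m = len(w)
    return {(a + 1, b + 1) for a in range(m) for b in range(a + 1, m) if w[a] > w[b]}

def check_proposition(n):
    ell = word_lengths(n)
    part1 = all(ell[w] == len(inversion_set(w)) for w in ell)
    part2 = True
    for w in ell:
        for i in range(n):
            if w[i] > w[i + 1]:                 # simple root t_{i+1} - t_{i+2} lies in I(w)
                v = list(w)
                v[i], v[i + 1] = v[i + 1], v[i]
                part2 = part2 and ell[tuple(v)] == ell[w] - 1
    return part1, part2

print(check_proposition(3))  # (True, True)
```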
As a final note, I promised you nontrivial results about $S_{n+1}$, so here is one that comes for free. Note that the set of simple reflections in $S_{n+1}$ is the set of all $s_{i,i+1}$, that is to say the adjacent transpositions.
Corollary 2.5: If $w\in S_{n+1}$, then the minimal number of adjacent transpositions required to express $w$ (meaning $\ell(w)$) is exactly the number of pairs $i < j$ such that $w(i) > w(j)$.
Proof: We know from the previous proposition that $\ell(w)$ is equal to the number of inversions of $w$. What is an inversion of an element of $S_{n+1}$? It is a root $t_i-t_j$ with $i < j$ such that $t_{w(i)}-t_{w(j)}$ is a negative root, meaning $w(i) > w(j)$. This is exactly what the statement of the corollary claims. $\square$
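The corollary has a familiar algorithmic face: bubble sort. Each swap of an adjacent out-of-order pair removes exactly one inversion, so sorting $w$ produces a word in the adjacent transpositions whose length is the number of inversions. The sketch below uses my own conventions ($w$ as the tuple $(w(1),\ldots,w(n+1))$, a word as a list of indices $i$ standing for $s_{i,i+1}$, and illustrative function names). It only exhibits a word of that length; that no shorter word exists is the content of the corollary.

```python
from itertools import permutations

def inversions(w):
    """Number of pairs i < j with w(i) > w(j)."""
    return sum(1 for i in range(len(w)) for j in range(i + 1, len(w)) if w[i] > w[j])

def bubble_word(w):
    """Bubble-sort w and return a word for w: a list of i meaning s_{i,i+1} (1-based)."""
    v = list(w)
    swaps = []
    changed = True
    while changed:
        changed = False
        for i in range(len(v) - 1):
            if v[i] > v[i + 1]:
                v[i], v[i + 1] = v[i + 1], v[i]
                swaps.append(i + 1)
                changed = True
    # Sorting gives w * s_{i_1} * ... * s_{i_k} = identity, so w = s_{i_k} * ... * s_{i_1}.
    return swaps[::-1]

def multiply_out(word, size):
    """Multiply the adjacent transpositions in `word`, left to right, starting from the identity."""
    v = list(range(1, size + 1))
    for i in word:
        v[i - 1], v[i] = v[i], v[i - 1]
    return tuple(v)

w = (2, 3, 1)
print(bubble_word(w), inversions(w))      # [1, 2] 2
print(multiply_out(bubble_word(w), 3))    # (2, 3, 1)

ok = all(
    len(bubble_word(p)) == inversions(p) and multiply_out(bubble_word(p), 4) == p
    for p in permutations(range(1, 5))
)
print(ok)  # True
```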
