## Sinai Theorem on Local Ergodicity

Anosov’s ergodicity theorem concerns uniformly hyperbolic diffeomorphisms $f: M \to M$; its proof rests on four properties of the stable and unstable foliations: (a) they cover the whole space, (b) they are absolutely continuous, (c) their leaves are of uniform size, and (d) they are uniformly transverse to each other.

Sinai gave a systematic method to prove ergodicity for hyperbolic systems $F: M \to M$ with singularities. Under some mild conditions, Katok and Strelcyn proved that the two foliations cover a full-measure subset of the space and are still absolutely continuous. However, their leaves can be arbitrarily short, and the angle between them can be arbitrarily small. Assume the singularity sets of the iterates $F^{n}$ are regular. The Sinai Theorem states that local ergodicity holds if the stable and unstable cones are relatively small while the separation between the two cones is not; the short stable leaves and short unstable leaves can then be used to obtain a local ergodic theorem. Assume moreover that there is a continuous invariant cone field $\mathcal{C}$ on $M$. Then, by a result of Liverani and Wojtkowski, a sufficient condition for both small cones and non-small separation of cones is that $\sigma_{\mathcal{C}}(D_xF^n) > 3$.

## Some comparison series

The first one appeared in a paper of R. Mañé.

Let $a_n \in (0, 1), n\ge 1$ be a sequence of positive numbers such that $\sum n\cdot a_n$ is convergent. Then $\sum a_n \cdot |\log a_n|$ is also convergent.

Proof. The difference between the two series is that we replace $n$ by $|\log a_n|$. Naturally, we split the indices into two cases:

1). the mild ones: $n\in G$ if $|\log a_n| \le n$. That is, $a_n \ge e^{-n}$.
Then it is clear that $\displaystyle \sum_{G} a_n \cdot |\log a_n| \le \sum_{G} n\cdot a_n$, which is finite.

2). the not-so-mild ones: $n\notin G$. It follows that $a_n \le e^{-n}$. Note that
$x^{1/2}\cdot |\log x| \to 0$ as $x \to 0^+$; in fact, $x^{1/2}\cdot |\log x| \le 2/e < 1$ for all $x\in(0,1)$ (the maximum is attained at $x=e^{-2}$).
It follows that $\displaystyle \sum_{n\notin G} a_n \cdot |\log a_n| = \sum_{n\notin G} a_n^{1/2}\cdot \big(a_n^{1/2}|\log a_n|\big) \le \sum_{n\notin G} a_n^{1/2} \le \sum_{n\notin G} e^{-n/2}$, which is also finite. QED
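As a quick numerical sanity check (not part of the proof; the sample sequence $a_n = 1/n^3$ is an ad hoc choice for which $\sum n\cdot a_n=\sum 1/n^2$ converges):

```python
import math

# Sanity check of the lemma with the sample sequence a_n = 1/n^3,
# so that sum n*a_n = sum 1/n^2 converges.
N = 100_000
s1 = sum(n * (1 / n**3) for n in range(1, N + 1))                        # sum n*a_n
s2 = sum((1 / n**3) * abs(math.log(1 / n**3)) for n in range(1, N + 1))  # sum a_n*|log a_n|

print(s1)  # close to pi^2/6, so the first series converges
print(s2)  # the partial sums stabilize: the second series converges too
```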

The second one appeared in a paper of C. Liverani and M. Wojtkowski.

Let $a_n \in (0, 1), n\ge 1$ be a sequence of positive numbers and $S_n= a_1 + \cdots + a_n$. Then $\sum a_n$ is divergent if and only if $\displaystyle \sum \frac{a_n}{S_n}$ is divergent.

One can also state the convergence version.

Proof. One direction is immediate: since $S_n \ge S_1$, we have $\displaystyle \frac{a_n}{S_n} \le \frac{a_n}{S_1}$, so convergence of $\sum a_n$ implies convergence of $\displaystyle \sum \frac{a_n}{S_n}$.
For the other direction, assume $\sum a_n$ is divergent, so that $S_n \to \infty$.
Let $k, l\ge 1$ be two indices. Note that $S_{k+l} \ge S_{k+j}$ and $\displaystyle \frac{a_{k+j}}{S_{k+j}} \ge \frac{a_{k+j}}{S_{k+l}}$ for all $1\le j \le l$.
Therefore, $\displaystyle \sum_{1\le j \le l}\frac{a_{k+j}}{S_{k+j}} \ge \sum_{1\le j \le l}\frac{a_{k+j}}{S_{k+l}}= \frac{S_{k+l}- S_k}{S_{k+l}} \to 1$ as $l \to \infty$.
So the tails of $\displaystyle \sum \frac{a_n}{S_n}$ do not tend to zero, the Cauchy criterion fails, and the series is divergent. QED.
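A numerical illustration (a sketch; the choice $a_n=1/n$, for which $\sum a_n$ diverges, is ad hoc): the partial sums of $\sum a_n/S_n$ keep growing, although very slowly.

```python
# Illustrate: a_n = 1/n diverges, and so does sum a_n/S_n (very slowly).
S = 0.0           # running partial sum S_n
T = 0.0           # running partial sum of a_n/S_n
checkpoints = {}
for n in range(1, 10**6 + 1):
    a = 1.0 / n
    S += a
    T += a / S
    if n in (10**3, 10**6):
        checkpoints[n] = T

print(checkpoints)  # T keeps growing, roughly like log(S_n)
```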

## Birkhoff Ergodic Theorem

Let $(X,\mu, T)$ be an ergodic measure-preserving system, $f\in L^1(\mu)$, and $S_nf=f+f\circ T+\cdots+f\circ T^{n-1}$ the Birkhoff sum. Then the Birkhoff Ergodic Theorem states that for $\mu$-a.e. $x\in X$, the time average $\frac{1}{n}S_nf(x)$ converges to the space average $\mu(f):=\int f(x) d\mu(x)$. In the case that $\mu(f)>0$, we see that $S_nf(x) \approx n\cdot \mu(f)$ as $n\to \infty$ for $\mu$-a.e. $x\in X$.
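A quick numerical illustration (a sketch; the irrational rotation $Tx=x+\alpha \bmod 1$, the observable $f(x)=x$, and the starting point are ad hoc choices for the demo): the time average approaches the space average $\int_0^1 x\,dx = 1/2$.

```python
import math

# Birkhoff average for the irrational rotation T(x) = x + alpha (mod 1),
# which is ergodic for Lebesgue measure; observable f(x) = x.
alpha = math.sqrt(2) - 1
x = 0.123            # an arbitrary starting point
n = 100_000
total = 0.0
for _ in range(n):
    total += x       # accumulate the Birkhoff sum S_n f(x)
    x = (x + alpha) % 1.0
avg = total / n

print(avg)  # close to the space average 1/2
```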

In [link] there is an interesting observation:

Theorem. If $S_nf(x) \to +\infty$ for $\mu$-a.e. $x\in X$, then $\mu(f)>0$.

Proof. Let $\epsilon >0$, $A_{\epsilon}=\{x\in X: S_nf(x)\ge \epsilon \text{ for each } n\ge 1\}$, and $\displaystyle B=\bigcup_{k\ge 0}\bigcup_{\epsilon >0}T^{-k}A_{\epsilon}$.

Note that the complement $\displaystyle X\backslash B= \bigcap_{k\ge 0}\bigcap_{\epsilon >0}T^{-k}(X\backslash A_{\epsilon})$.

So if $x\notin B$, then for each $k\ge 0$, for each $\epsilon >0$,
there exists $n_{k,\epsilon}\ge 1$ such that $S_{n_{k,\epsilon}}f(T^kx) \le \epsilon$.

Pick a sequence $\epsilon_p =e^{-p}$, $k_0=0$, $n_p=n_{k_{p-1},\epsilon_p}$, $k_p=k_{p-1}+ n_p$ for each $p\ge 1$. Then

$k_0=0$, $n_1=n_{0,e^{-1}} \ge 1$, $k_1=n_1 \ge 1$: $S_{n_1}f(x) \le e^{-1}$;

$n_2=n_{k_1,e^{-2}} \ge 1$, $k_2=k_1+ n_2 \ge 2$: $S_{n_2}f(T^{k_1}x) \le e^{-2}$;

$n_3=n_{k_2,e^{-3}} \ge 1$, $k_3=k_2+ n_3 \ge 3$: $S_{n_3}f(T^{k_2}x) \le e^{-3}$;

in general, $n_p=n_{k_{p-1},e^{-p}}\ge 1$, $k_p=k_{p-1}+ n_p \ge p$: $S_{n_p}f(T^{k_{p-1}}x) \le e^{-p}$.

Add them together: $S_{k_p}f(x)= S_{n_1}f(x)+S_{n_2}f(T^{k_1}x)+\cdots+S_{n_p}f(T^{k_{p-1}}x)\le e^{-1}+ \cdots + e^{-p} \le \frac{1}{e-1}$, while $k_p \ge p \to \infty$. So $S_nf(x)$ stays bounded along the subsequence $k_p$, and in particular does not tend to $+\infty$.

Applying the assumption of the theorem, we see that $\mu(X\backslash B)=0$. Since $B=\bigcup_{k\ge 0}\bigcup_{m\ge 1}T^{-k}A_{1/m}$ is a countable union (the sets $A_{\epsilon}$ increase as $\epsilon$ decreases), it follows that $\mu(A_{\epsilon}) >0$ for some $\epsilon>0$.

Let $n_k(x)$ be the $k$-th return time of a typical point $x \in A_{\epsilon}$ to the set $A_{\epsilon}$. Splitting the orbit at the return times and using $T^{n_j}x\in A_{\epsilon}$, we get $S_{n_k}f(x)\ge k\epsilon$. Moreover, the Birkhoff theorem applied to the indicator of $A_{\epsilon}$ gives $\frac{k}{n_k}\to \mu(A_{\epsilon})$. It follows that

$\displaystyle \mu(f)=\lim_{k \to \infty}\frac{1}{n_k}S_{n_k}f(x) \ge \lim_{k\to\infty}\frac{k}{n_k}\cdot \epsilon = \mu(A_{\epsilon})\cdot \epsilon >0$.

This completes the proof.

## Some random variables

Consider a stochastic process $X_n$, $n\ge 0$, where $X_0 =0$, and $X_{n+1} = \begin{cases}1+ X_n, & p=1/2, \\ - X_{n}, & p=1/2. \end{cases}$

Then the conditional expectation $E(X_{n+1}| X_n) =(1+X_n)/2 + (-X_n)/2 = 1/2$. It follows that $E(X_{n+1}) = E(E(X_{n+1}| X_n))=1/2$.

Now we consider another stochastic process: let $A=\begin{bmatrix} 2 & 0 \\ 0 & 1/2 \end{bmatrix}$, and $B=\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$. The process $R_n$ is given by $R_0= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$, and $R_{n+1} = \begin{cases}A\cdot R_n, & p=1/2, \\ B\cdot R_{n}, & p=1/2. \end{cases}$

We will use the max norm $\|(a_{ij})\|=\max |a_{ij}|$. Consider two successive appearances of the matrix $B$, say the first at time $n=a$ and the second at time $n=a+1+b$:

$R_a =\begin{bmatrix} 2^a & 0 \\ 0 & 2^{-a} \end{bmatrix}$,
$R_{a+1} =\begin{bmatrix} 0 & 2^{-a} \\ 2^{a} & 0 \end{bmatrix}$,
$R_{a+1+b} =\begin{bmatrix} 0 & 2^{b-a} \\ 2^{a-b} & 0 \end{bmatrix}$,
$R_{a+1+b+1} =\begin{bmatrix} 2^{a-b} & 0 \\ 0 & 2^{b-a} \end{bmatrix}$.

So we break the process into pairs of blocks of lengths $(a_n, b_n)$, $n\ge 1$, with $a_n \ge 0$, $b_n \ge 0$ (each block ending with an appearance of $B$). Then the norms of the process follow the pattern $\|R_{t_n}\|=2^{|s_n|}$, where $t_n=\sum_{k=1}^n(a_k + b_k+2)$ and $s_n=\sum_{k=1}^n(a_k - b_k)$. One would guess that $\frac{s_n}{t_n} \to 0$ with probability one, and hence that $\|R_n\|$ grows subexponentially with probability one.
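A simulation supporting this guess (a sketch: since left-multiplication by $A$ multiplies row 1 by $2$ and row 2 by $1/2$, while left-multiplication by $B$ swaps the rows, it suffices to track the base-2 exponents of the single nonzero entry in each row of $R_n$):

```python
import random

# Simulate R_{n+1} = A R_n or B R_n with probability 1/2 each.
# R_n always has exactly one nonzero entry in each row, a power of 2,
# so we track the base-2 exponents (e1, e2) of the row entries.
random.seed(0)
e1, e2 = 0, 0          # R_0 = identity
n = 100_000
for _ in range(n):
    if random.random() < 0.5:
        e1, e2 = e1 + 1, e2 - 1   # apply A = diag(2, 1/2)
    else:
        e1, e2 = e2, e1           # apply B (row swap)
rate = max(e1, e2) / n            # (1/n) * log2 ||R_n||  in the max norm

print(rate)  # close to 0: subexponential growth
```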

## Some special actions

Consider the conjugate action $\rho$ of $GL(2,R)$ on $M(2,R)$: $\rho_A(M) = A MA^{-1}$.

1. This action $\rho$ factors through an action of $PGL(2,R)$.

2. There exists a 3-dimensional invariant subspace $E=\{M\in M(2,R): tr(M)=0\}$.

3. The determinant $\det M$ is an invariant quadratic form on $E$, and the signature of this form is $(-, - ,+)$.

Let $Q=x_1^2 + x_2^2 - x_3^2$ be a quadratic form on $R^3$, whose isometry group is $O(2,1)=\{A\in M(3,R): A^TgA=g\}$, where $g=\mbox{diag}\{1, 1, -1\}$.

This induces an injection $PGL(2,R) \hookrightarrow O(2,1)$, and an identification between $PSL(2,R)$ and the identity component of $O(2,1)$.

The action of $O(2,1)$ on $R^3$ descends to the projective plane $P^2$. The cone $C=Q^{-1}(0)$ is invariant, and separates $P^2$ into two domains: one of them is homeomorphic to a disk, while the other is a Mobius band. This induces an action of $PSL(2,R)$ on the disk.
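A small numerical check of items 2 and 3 (a sketch; the matrices $A$, $M$ and the basis of $E$ are ad hoc choices): conjugation preserves the trace and the determinant, and evaluating $\det$ on a diagonalizing basis of $E$ exhibits the signature $(-,-,+)$.

```python
# Minimal 2x2 helpers.
def mul(X, Y):
    return [[X[0][0]*Y[0][0] + X[0][1]*Y[1][0], X[0][0]*Y[0][1] + X[0][1]*Y[1][1]],
            [X[1][0]*Y[0][0] + X[1][1]*Y[1][0], X[1][0]*Y[0][1] + X[1][1]*Y[1][1]]]

def det(X): return X[0][0]*X[1][1] - X[0][1]*X[1][0]
def tr(X):  return X[0][0] + X[1][1]

def inv(X):
    d = det(X)
    return [[X[1][1]/d, -X[0][1]/d], [-X[1][0]/d, X[0][0]/d]]

# rho_A(M) = A M A^{-1} preserves tr(M) (so E is invariant) and det(M).
A = [[2.0, 1.0], [1.0, 1.0]]       # an invertible matrix
M = [[3.0, -2.0], [5.0, -3.0]]     # trace 0, det 1
N = mul(mul(A, M), inv(A))
print(tr(N), det(N))               # trace ≈ 0, det ≈ 1

# det as a quadratic form on E: values on a basis that diagonalizes it.
e1 = [[1.0, 0.0], [0.0, -1.0]]
e2 = [[0.0, 1.0], [1.0, 0.0]]
e3 = [[0.0, 1.0], [-1.0, 0.0]]
print(det(e1), det(e2), det(e3))   # -1, -1, 1: signature (-,-,+)
```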

## Equilibrium states

Let $S=\{1,\dots, l\}$ be the space of symbols, $A=(a_{ij})$ be an $l\times l$ matrix with $a_{ij}\in\{0,1\}$, and $\Sigma_A$ be the set of sequences $x=(x_n)$ that are $A$-admissible. Consider the dynamical system $(\Sigma_A, \sigma)$, where $\sigma$ is the shift map. We assume this system is mixing.

Let $f:\Sigma_A \to \mathbb{R}$ be a Hölder potential, which induces a transfer operator $L_f$ on the space of continuous functions: $\phi \mapsto L_f\phi(x):=\sum_{\sigma y =x} e^{f(y)}\phi(y)$.

Let $\lambda$ be the spectral radius of $L_f$. Then $\lambda$ is also an eigenvalue of $L_f$, which is called the principal eigenvalue. Moreover, there exists a positive eigenfunction $h$ such that $L_f h =\lambda h$. Replacing $f$ by $f-\log\lambda$, we will assume $\lambda =1$.

Consider the dual action $L_f^{\ast}$ on the space of functionals (or signed measures). There is a positive eigenmeasure $\nu$ such that $L_f^{\ast} \nu =\nu$.

We normalize the pair $(h,\nu)$ such that $\int h d\nu =1$. Then the measure $\mu:= h \nu$ is a $\sigma$-invariant probability measure. It is called the equilibrium state of $(\Sigma_A, \sigma, f)$.

Two continuous functions $f, g:\Sigma_A \to \mathbb{R}$ are called cohomologous if there exists a continuous function $\phi:\Sigma_A \to \mathbb{R}$ such that
$f(x)-g(x) =\phi(\sigma x) -\phi(x)$.

Let $f, g:\Sigma_A \to \mathbb{R}$ be cohomologous. Then the two operators $L_f$ and $L_g$ are different, but $\lambda(f) =\lambda(g)=1$.
Their eigenfunctions and eigenmeasures are different, but the associated equilibrium states are the same.

To find a natural representative in the class $[f]$ of functions that are cohomologous to $f$, we set $g(x)=f(x)+ \log h(x) -\log h(\sigma x)$. Then we have

1). $\displaystyle L_g1(x)=\sum_{\sigma y =x} e^{g(y)}\cdot 1= \sum_{\sigma y =x} e^{f(y)}h(y)/h(x)=\frac{L_fh(x)}{h(x)}=1$. So the constant function $1$ is an eigenfunction of $L_g$.

2). $\displaystyle \int \phi \,dL_g^{\ast} \mu=\int L_g\phi \,d\mu =\int L_f(\phi h)\,d\nu =\int \phi\cdot h \,dL_f^{\ast}\nu =\int \phi h \,d\nu =\int \phi \,d\mu$.
So $\mu$ is the eigenmeasure of $L_g$: $L_g^{\ast}\mu=\mu$.

From this point of view, we might pick $g(x)=f(x)+ \log h(x) -\log h(\sigma x)$ as the representative of $[f]$.
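Here is a small numerical sketch of this picture in the simplest case: the full shift on two symbols with a locally constant potential $f(x)=\log p_{x_0}$ (the weights $p=(0.3, 0.7)$ are an ad hoc choice). Then $L_f$ restricts to a $2\times 2$ matrix on functions of $x_0$, the principal eigenvalue is $\lambda=p_1+p_2=1$, and the equilibrium state is the Bernoulli measure with marginals $p$.

```python
# Transfer operator on the full 2-shift with locally constant potential
# f(x) = log p_{x_0}.  On functions of x_0, (L_f phi)(x) is given by the
# 2x2 matrix M[j][i] = a_{ij} e^{f(i)}; here a_{ij} = 1 (full shift).
p = [0.3, 0.7]                        # e^{f(i)}: hypothetical weights with sum 1
M = [[p[0], p[1]], [p[0], p[1]]]

def apply(mat, v):
    return [sum(mat[j][i] * v[i] for i in range(2)) for j in range(2)]

def transpose(mat):
    return [[mat[i][j] for i in range(2)] for j in range(2)]

# Power iteration for the principal (Ruelle-Perron-Frobenius) eigendata.
h = [1.0, 1.0]                        # -> eigenfunction:  L_f h    = lam h
nu = [0.5, 0.5]                       # -> eigenmeasure:   L_f^* nu = lam nu
for _ in range(100):
    h = apply(M, h)
    lam = max(h)                      # principal eigenvalue estimate
    h = [x / lam for x in h]
    nu = apply(transpose(M), nu)
    s = sum(nu)
    nu = [x / s for x in nu]          # keep nu a probability vector

scale = sum(hi * ni for hi, ni in zip(h, nu))
h = [x / scale for x in h]            # normalize so that int h dnu = 1
mu = [hi * ni for hi, ni in zip(h, nu)]   # equilibrium state mu = h nu

print(lam, mu)   # lam ≈ 1, mu ≈ [0.3, 0.7]: the Bernoulli(0.3, 0.7) measure
```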

## Notes. Some basic terms

1. Let $R$ be a commutative ring, $S$ be a multiplicatively closed subset in the sense that $1\in S$ and $a,b\in S \Rightarrow ab \in S$. Then we define the localization $S^{-1}R$ as the quotient $(S\times R)/\sim$, where the pair $(r,a)$ represents the fraction $a/r$, and $(r,a)\sim (s,b)$ if $(br-as)t=0$ for some $t\in S$.

Let $f\in R$. We can form the m.c. subset $S=\{f^n: n\ge 0\}$, and denote the corresponding localization by $R_f=S^{-1}R$.

Let $p\triangleleft R$ be a prime ideal of $R$. Then $S=R\backslash p$ is m.c. We denote the corresponding localization by $R_p=S^{-1}R$; it is a local ring.

Let $\text{Spec}R$ be the set of all prime ideals of $R$. For each ideal $I\triangleleft R$, let $V_I=\{p\in \text{Spec}R: p\supset I\}$. The Zariski topology on $\text{Spec}R$ is defined by declaring the closed subsets to be exactly $\{V_I: I\triangleleft R\}$.

A basis for the Zariski topology on $\text{Spec}R$ can be constructed as follows. For each $f\in R$, let $D_f\subset \text{Spec}R$ be the set of prime ideals not containing $f$. Then each $D_f= \text{Spec}R\backslash V_{(f)}$ is open.

The points corresponding to maximal ideals $m \triangleleft R$ are closed points in the sense that the singleton $\{m\}=V_m$.

In the case $R=C[x_1, \dots, x_n]$, each maximal ideal $m=\langle x_1-a_1,\dots, x_n-a_n \rangle$ corresponds to a point $(a_1,\dots, a_n)\in C^n$ (by the Nullstellensatz). So one can interpret this as $C^n \subset X= \text{Spec} R$. A non-maximal prime ideal $p$ (a non-closed point) corresponds to an irreducible affine variety $P$, which is a closed subset of $C^n$. Then $p$ is called the generic point of the variety $P$.

2. Let $(M,\omega)$ be a symplectic manifold, $G$ be a Lie group acting on $M$ via symplectic diffeomorphisms. Let $\mathfrak{g}$ be the Lie algebra of $G$. Each $\xi \in \mathfrak{g}$ induces a vector field $\rho(\xi):x\in M \mapsto \frac{d}{dt}\Big|_{t=0}\Big(\exp(t\xi)\cdot x\Big)$. Note that $\rho(g\xi g^{-1})=g_\ast \rho(\xi)$, and $\rho([\xi,\eta])=[\rho(\xi),\rho(\eta)]$.

Consider the 1-form induced by the contraction $\iota_{\rho(\xi)}\omega$. This 1-form is closed: by Cartan's formula and $d\omega=0$, $d\iota_{\rho(\xi)}\omega=L_{\rho(\xi)}\omega=0$, since $G$ preserves the form $\omega$.

Then the action is called weakly Hamiltonian if, for every $\xi\in \mathfrak{g}$, the one-form $\iota_{\rho(\xi)} \omega$ is exact: $\iota_{\rho(\xi)} \omega=dH_\xi$ for some smooth function $H_{\xi}$ on $M$. Although each $H_\xi$ is determined only up to an additive constant $C_\xi$, these constants can be chosen so that the map $\xi\mapsto H_\xi$ becomes linear.

The action is called Hamiltonian, if the map $\mathfrak{g} \to C^\infty(M)$, $\xi\mapsto H_\xi$ is a Lie algebra homomorphism with respect to Poisson structure. Then $\rho(\xi)=X_{H_\xi}$ and $H_{g^{-1}\xi g}(x)=H_\xi(gx)$.

A moment map for a Hamiltonian $G$-action on $(M,\omega)$ is a map $\mu: M\to \mathfrak{g}^\ast$ such that $H_\xi(x)=\mu(x)\cdot \xi$ for all $\xi\in \mathfrak{g}$. In other words, for each fixed $x\in M$, the map $\xi \mapsto H_\xi(x)$ is a linear functional on $\mathfrak{g}$, which is denoted by $\mu(x)$. Also note that $\mu(gx)\cdot \xi=H_\xi(gx)=H_{g^{-1}\xi g}(x)=\mu(x)\cdot(g^{-1}\xi g)$. So $\mu(gx)=g\mu(x)g^{-1}$.
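A standard example (with signs depending on the chosen conventions): the rotation action of $S^1$ on $(\mathbb{R}^2,\ \omega = dx\wedge dy)$, with $\xi$ the generator of $\mathfrak{g}\cong\mathbb{R}$.

```latex
\rho(\xi) = x\,\partial_y - y\,\partial_x,
\qquad
\iota_{\rho(\xi)}\omega = -(x\,dx + y\,dy)
  = d\Bigl(-\tfrac{x^2+y^2}{2}\Bigr),
\qquad
H_\xi(x,y) = -\tfrac{x^2+y^2}{2}.
```

So the moment map is $\mu(x,y)=-\tfrac{x^2+y^2}{2}$ under the identification $\mathfrak{g}^\ast\cong\mathbb{R}$; it is constant on the orbits (circles), as equivariance predicts for an abelian group.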

## There is no positively expansive homeomorphism

Let $f$ be a homeomorphism on a compact metric space $(X,d)$. Then $f$ is said to be $\mathbb{Z}$-expansive, if there exists $\delta>0$ such that for any two points $x,y\in X$, if $d(f^nx,f^ny)<\delta$ for all $n\in\mathbb{Z}$, then $x=y$. The constant $\delta$ is called the expansive constant of $f$.

Similarly one can define $\mathbb{N}$-expansiveness (a notion that makes sense even if $f$ is not invertible). An interesting phenomenon observed by Schwartzman is the following.

Theorem. A homeomorphism $f$ cannot be $\mathbb{N}$-expansive (unless $X$ is finite).

This result was reported in Gottschalk–Hedlund’s book Topological Dynamics (1955), and a proof was given in King’s paper A map with topological minimal self-joinings in the sense of del Junco (1990). Below we copied the proof from King’s paper.

Proof. Suppose on the contrary that there is a homeomorphism $f$ of an infinite compact metric space $(X,d)$ that is $\mathbb{N}$-expansive. Let $\delta>0$ be an $\mathbb{N}$-expansiveness constant of $f$, and $d_n(x,y)=\max\{d(f^k x, f^k y): 1\le k\le n\}$.

It follows from the $\mathbb{N}$-expansiveness and a compactness argument that $N:=\sup\{n\ge 1: d_n(x,y)\le\delta \text{ for some pair with } d(x,y)\ge\delta\}$ is a finite number. By uniform continuity of $f$, pick $\epsilon\in(0,\delta)$ such that $d_N(x,y)<\delta$ whenever $d(x,y)<\epsilon$.

Claim. If $d(x,y)<\epsilon$, then $d(f^{-n} x, f^{-n}y)<\delta$ for any $n\ge 1$.

Proof of Claim. Suppose not, and let $n\ge 1$ be the smallest integer with $d(f^{-n}x, f^{-n}y)\ge\delta$. Since $f^{k}=f^{k+n}\circ f^{-n}$, the pair $(f^{-n}x, f^{-n}y)$ is $\delta$-separated while $d_{n+N}(f^{-n}x, f^{-n}y)\le\delta$: we have prolonged the $N$-string, contradicting the definition of $N$.

Recall that a pair $(x,y)$ is said to be $\epsilon$-proximal if $d(f^{n_i}x, f^{n_i}y)<\epsilon$ for some sequence $n_i\to\infty$. The upshot of the above claim is that any $\epsilon$-proximal pair is $\delta$-indistinguishable: applying the claim at the times $n_i$ and letting $n_i\to\infty$, we get $d(f^{n}x, f^{n}y)<\delta$ for all $n$.

Cover $X$ by open sets of diameter $< \epsilon$, and pick a finite subcover, say $\{B_i:1\le i\le I\}$. Let $E=\{x_j:1\le j\le I+1\}$ be a set of $I+1$ distinct points (possible since $X$ is infinite). Then for each $n\ge 0$, by the pigeonhole principle, two points of $f^n E$ share a set $B_{i(n)}$, say $f^nx_{a(n)}$ and $f^nx_{b(n)}$. Pick a subsequence $n_i$ along which $a(n_i)\equiv a$ and $b(n_i)\equiv b$. Clearly $x_a\neq x_b$, and $d(f^{n_i}x_a,f^{n_i}x_b)<\epsilon$. Hence the pair $(x_a,x_b)$ is $\epsilon$-proximal, and by the claim $\delta$-indistinguishable. This contradicts the $\mathbb{N}$-expansiveness of $f$. QED.

## Area under holomorphic maps

Let $f$ be a map from $(x,y)\in \mathbb{R}^2$ to $(a,b)\in \mathbb{R}^2$. With the area form $dA=dx\wedge dy$, pulling back gives the Jacobian: $da\wedge db= J(x,y)\,dx\wedge dy$, where $J(x,y)=a_xb_y- a_yb_x$.

Now consider the complex setting, where $\displaystyle dA=\frac{i}{2} dz\wedge d\bar z$. Let $f$ be a holomorphic map from $z\in \mathbb{C}$ to $w=f(z)\in \mathbb{C}$. Then $\displaystyle \frac{i}{2} dw\wedge d\bar w= \frac{i}{2}f'(z)\overline{f'(z)}\, dz\wedge d\bar z$. So this time the Jacobian becomes $J(z)=f'(z)\overline{f'(z)}=|f'(z)|^2$.

Suppose $\displaystyle f(z)=\sum_{n\ge 0} a_n z^n$ is a holomorphic map on the unit disk $D$. Then
$\displaystyle J(z)=\sum_{n,m\ge 1}nm\, a_n \bar a_m z^{n-1} \bar z^{m-1}$, and the area of $f(D)$ (counted with multiplicity if $f$ is not injective) is $\displaystyle \int_D J(z)\, dA$.

Using polar coordinates, we have $dA= rdr\, d\theta$, $\displaystyle z^{n-1} \bar z^{m-1}=r^{n+m-2}e^{i\theta(n-m)}$,
and $\displaystyle \int_D r^{n+m-2}e^{i\theta(n-m)} rdr\, d\theta= 0$ if $n\neq m$, and $=\frac{\pi}{n}$ if $n=m\ge 1$.

So $\displaystyle |f(D)|=\sum_{n\ge 1} n^2 |a_n|^2\cdot \frac{\pi}{n}=\pi \sum_{n\ge 1} n |a_n|^2$.
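A numerical check of the formula (a sketch; the choice $f(z)=z+z^2/4$ is ad hoc, giving $\pi\sum n|a_n|^2 = \pi(1 + 2\cdot\tfrac1{16}) = \tfrac{9\pi}{8}$): midpoint quadrature of $\int_D|f'|^2\,dA$ in polar coordinates.

```python
import math

# Numerically integrate J(z) = |f'(z)|^2 over the unit disk in polar
# coordinates, and compare with pi * sum n |a_n|^2 for f(z) = z + z^2/4.
def image_area(fprime, nr=400, nt=400):
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) / nr                      # midpoint in radius
        for j in range(nt):
            t = 2 * math.pi * (j + 0.5) / nt    # midpoint in angle
            z = complex(r * math.cos(t), r * math.sin(t))
            total += abs(fprime(z)) ** 2 * r    # dA = r dr dtheta
    return total * (1.0 / nr) * (2 * math.pi / nt)

area = image_area(lambda z: 1 + z / 2)          # f(z) = z + z^2/4, f'(z) = 1 + z/2
print(area, 9 * math.pi / 8)                    # the two values agree closely
```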

## An interesting lemma about the Birkhoff sum

A few days ago I attended a lecture given by Amie Wilkinson. She presented a proof of Furstenberg’s theorem on the Lyapunov exponents of random products of matrices in $SL(2,\mathbb{R})$.

Let $\lambda$ be a probability measure on $SL(2,\mathbb{R})$, and $\mu=\lambda^{\mathbb{N}}$ the product measure on $\Omega=SL(2,\mathbb{R})^{\mathbb{N}}$. Let $\sigma$ be the shift map on $\Omega$, and $A:\omega\in\Omega\mapsto \omega_0\in SL(2,\mathbb{R})$ the projection. We consider the induced skew product $(\sigma,A)$ on $\Omega\times \mathbb{R}^2$, with matrix cocycle $A_n(\omega)=\omega_{n-1}\cdots\omega_0$. The (largest) Lyapunov exponent of $(\sigma,A)$ is defined to be the value $\chi$ such that $\displaystyle \lim_{n\to\infty}\frac{1}{n}\log\|A_n(\omega)\|=\chi$ for $\mu$-a.e. $\omega\in \Omega$.

To apply the ergodic theory, we first assume the integrability condition $\int\log\|A\| d\lambda < \infty$. Then $\chi(\lambda)$ is well defined, by the subadditive ergodic theorem. There are cases when $\chi(\lambda)=0$:

(1) the generated group $\langle\text{supp}\lambda\rangle$ is compact;

(2) there exists a finite set $\mathcal{L}=\{L_1,\dots, L_k\}$ of lines in $\mathbb{R}^2$ that is invariant under every $A\in \langle\text{supp}\lambda\rangle$.

Furstenberg proved that the above list covers all cases with zero exponent:
$\chi(\lambda) > 0$ for every other $\lambda$.
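A quick simulation (a sketch; the two matrices and the uniform sampling are ad hoc choices satisfying neither degenerate condition): estimating $\chi$ by averaging the logarithmic growth of a vector along a random product gives a clearly positive exponent.

```python
import math
import random

# Estimate the top Lyapunov exponent of random products of two SL(2,R)
# matrices (neither generating a compact group nor preserving a finite
# set of lines), chosen with probability 1/2 each.
A = [[2.0, 0.0], [0.0, 0.5]]
B = [[1.0, 1.0], [1.0, 2.0]]     # det = 1

def apply(M, v):
    return [M[0][0]*v[0] + M[0][1]*v[1], M[1][0]*v[0] + M[1][1]*v[1]]

random.seed(0)
v = [1.0, 0.0]
chi = 0.0
n = 100_000
for _ in range(n):
    v = apply(A if random.random() < 0.5 else B, v)
    norm = math.hypot(v[0], v[1])
    chi += math.log(norm)
    v = [v[0] / norm, v[1] / norm]   # renormalize to avoid overflow
chi /= n

print(chi)  # strictly positive, as Furstenberg's theorem predicts
```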