## Martingale and its application to dynamical systems

In the last week of May I attended two lectures given by Professor Matthew Nicol.

Let $(\Omega,\mu)$ be a probability space with a $\sigma$-algebra $\mathcal{B}$, and let $\mathcal{F}\subset \mathcal{B}$ be a sub-$\sigma$-algebra.

Example. $f(x)=2x (\text{mod } 1)$ on $\mathbb{T}$, and $\mathcal{B}$ be the Borel $\sigma$-algebra. Let $\mathcal{F}=f^{-1}\mathcal{B}$. Note that $(0.2,0.3)\notin\mathcal{F}$: every set in $\mathcal{F}$ is a preimage $f^{-1}A$, hence invariant under $x\mapsto x+1/2$, and $(0.2,0.3)$ is not.

Let $Y$ be a $\mathcal{B}$-measurable r.v. with $Y\in L^1(\mu)$. The conditional expectation $E(Y|\mathcal{F})$ is the unique (mod $\mu$-null sets) $\mathcal{F}$-measurable r.v. $Z$, i.e. $Z^{-1}(a,b)\in \mathcal{F}$ for all $(a,b)$, satisfying $\int_A Z\, d\mu=\int _A Y\, d\mu$ for all $A\in \mathcal{F}$.

Note that $E(Y|\mathcal{F})=Y$ if and only if $Y$ is $\mathcal{F}$-measurable; and $E(Y|\mathcal{F})=E(Y)$ if $Y$ is independent of $\mathcal{F}$.

Let $(X_n)_{n\ge 0}$ be a stationary ergodic process with stationary initial distribution $\mu$. A basic problem is to find sufficient conditions on $(X_n)_{n\ge 0}$ and on functions $\phi\in L^2_0(\mu)$ such that $\displaystyle S_n(\phi)=\sum_{k=1}^n \phi(X_k)$ satisfies the central limit theorem (CLT) $\displaystyle \frac{1}{\sqrt{n}}S_n(\phi) \to N(0,\sigma^2)$, where the limit variance is given by $\displaystyle \sigma^2(\phi)=\lim_{n\to\infty}\frac{1}{n}E(S^2_n(\phi))$.

Let $f$ be a conservative diffeomorphism on $(M,m)$. There are two operators: $\phi\mapsto U\phi=\phi\circ f$, and $\phi\mapsto P\phi$ defined via $\int P\phi\cdot \psi=\int \phi\cdot \psi\circ f$ for all test functions $\psi$.

Property. $PU(\phi)=\phi$ (vol-preserving) and $UP(\phi)=E(\phi|f^{-1}\mathcal{B})$.

Let $\mathcal{F}_n$ be an increasing sequence of $\sigma$-algebras. Then a sequence of r.v. $S_n$ is called a martingale w.r.t. $\mathcal{F}_n$, if $S_n$ is $\mathcal{F}_n$-measurable, $E(S_{n+1}|\mathcal{F}_n)=S_n$.

Let $\mathcal{F}_n$ be a decreasing sequence of $\sigma$-algebras. Then a sequence of r.v. $S_n$ is called a reverse martingale w.r.t. $\mathcal{F}_n$, if $S_n$ is $\mathcal{F}_n$-measurable, $E(S_{n}|\mathcal{F}_m)=S_m$ for any $n\le m$.

Theorem. Let $\{X_n:n\ge 1\}$ be a stationary ergodic sequence of (reverse) martingale differences w.r.t. $\{\mathcal{F}_n\}$. Suppose $E(X_n)=0$, and $\sigma^2=\text{Var}(X_i)>0$. Then $\displaystyle \frac{1}{\sigma\sqrt{n}}\sum_{i=1}^n X_i \to N(0,1)$ in distribution.
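As a quick numerical illustration of the theorem (my own toy example, not from the lectures): take i.i.d. signs $\epsilon_n$ and set $X_n=\epsilon_n(1+\tfrac12\epsilon_{n-1})$. Then $E(X_n|\mathcal{F}_{n-1})=0$, the sequence is stationary but not independent, and $\sigma^2=E[(1+\tfrac12\epsilon)^2]=1.25$.

```python
import math
import random

random.seed(1)

def normalized_sum(n):
    """One sample of S_n/(sigma*sqrt(n)) for the martingale differences
    X_k = eps_k * (1 + 0.5*eps_{k-1}) with i.i.d. signs eps_k."""
    sigma = math.sqrt(1.25)          # E[(1 + 0.5*eps)^2] = 1.25
    prev = random.choice((-1, 1))
    s = 0.0
    for _ in range(n):
        eps = random.choice((-1, 1))
        s += eps * (1 + 0.5 * prev)  # E[X_k | past] = 0: martingale difference
        prev = eps
    return s / (sigma * math.sqrt(n))

samples = [normalized_sum(500) for _ in range(2000)]
mean = sum(samples) / len(samples)
var = sum(x * x for x in samples) / len(samples)
print(mean, var)   # should be close to 0 and 1, as the CLT predicts
```

Martingale differences are uncorrelated, so $\text{Var}(S_n)=n\sigma^2$ exactly; the simulation only checks the normalization.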

Gordin: Suppose $(f,m)$ is ergodic. Consider the Birkhoff sum $\displaystyle \sum_{i=1}^n \phi\circ f^i$ for some $\phi$ with $\int \phi=0$. The time series $\phi\circ f^i$ can be approximated by martingale differences provided the correlations decay quickly enough.

Suppose there exists $p(n)$ with $\sum p(n) < \infty$, such that $\|P^n\phi\|\le C\cdot p(n)\|\phi\|$. Then define $\displaystyle g=\sum_{n\ge 1}P^n\phi$, and let $X=\phi+g-g\circ f$.

Property. Let $f:M\to M$ be such that $f^{-n}\mathcal{B}$ is decreasing. $\displaystyle S_n=\sum_{i=1}^n X\circ f^i$ is a reverse martingale with respect to $f^{-n}\mathcal{B}$.

Proof. Note that $PUg=g$ and $Pg=\sum_{n\ge 2}P^n\phi=g-P\phi$, so $PX=P\phi+Pg-PUg=P\phi+(g-P\phi)-g=0$. Then $E(X|f^{-1}\mathcal{B})=UP(X)=U0=0$.
Let $k < n$. It remains to show $E(X\circ f^k|f^{-n}\mathcal{B})=0$. To this end, we pick an element $A\in f^{-n}\mathcal{B}$ and write it as $A=f^{-k-1}C$ for some $C\in f^{k+1-n}\mathcal{B}$. Then $\displaystyle \int_A X\circ f^k dm=\int_{f^{-1}C}X dm =\int_{f^{-1}C} E(X|f^{-1}\mathcal{B}) dm=\int_{f^{-1}C}0 dm=0$. This completes the proof.

Three theorems of Gordin. Let $(\Omega,\mu,T)$ be an invertible $\mu$-preserving ergodic system, $X\in L^1(\mu)$ and $X_k(x)=X(T^kx)$ be a strictly stationary ergodic sequence.

(*) $\displaystyle \limsup_{n\to\infty}\frac{1}{\sqrt{n}}E|S_n| < \infty$

Theorem 1. Suppose there exists a filtration $\mathcal{F}_k$ with $\displaystyle \mathcal{F}_k\subset T^{-1}\mathcal{F}_k=\mathcal{F}_{k+1}$ such that $\displaystyle \sum_{k\ge 0} E|E(X_0|\mathcal{F}_{-k})|<\infty$ and $\displaystyle \sum_{k\ge 0} E|X_0-E(X_0|\mathcal{F}_{k})| < \infty$. Then (*) implies that $\displaystyle \lambda:=\lim_{n\to\infty}\frac{1}{\sqrt{n}}E|S_n|$ exists, and $\displaystyle \frac{1}{\sqrt{n}}S_n\to N(0,\lambda^2\pi/2)$ in distribution (degenerate if $\lambda=0$).

–Mixing condition. Let $\displaystyle \alpha(n):=\sup\{|P(A\cap B)-P(A)P(B)|:A\in\mathcal{F}^0_{-\infty}, B\in\mathcal{F}^{\infty}_n\}$.

Theorem 2. Suppose for some $p,q$ with $1/p+1/q=1$, $X\in L^p(\mu)$ and $\displaystyle \sum_{n\ge 1}\alpha(n)^{1/q} < \infty$. Then (*) implies the conclusion of Theorem 1.

–Uniform mixing condition. Let $\displaystyle \phi(n):=\sup\{|P(B|A)-P(B)|:A\in\mathcal{F}^0_{-\infty}, \mu(A) > 0, B\in\mathcal{F}^{\infty}_n\}$.

Theorem 3. Suppose $X\in L^1(\mu)$ and $\displaystyle \sum_{n\ge 1}\phi(n) < \infty$. Then (*) implies the conclusion of Theorem 1.

Cuny–Merlevède: not only the CLT, but also the ASIP holds under the above conditions.

Note that we started with an invariant measure $m$. The operators $U$ and $P$ can also be defined for non-conservative maps. To emphasize the difference, we use $\hat P$. Suppose $\hat P h=h$ for some $h\in L^1(m)$. Then $\mu=hm$ is an absolutely continuous invariant probability measure:

$\displaystyle \int \phi\circ f d\mu=\int \phi\circ f h dm=\int \phi\cdot \hat P h dm=\int \phi hdm=\int\phi d\mu$.

Then we can rewrite $\displaystyle P\phi=\frac{1}{h}\hat P(h\phi)$, in the sense that $\displaystyle \int P(\phi)\cdot \psi\, d\mu=\int \phi\cdot \psi\circ f\, d\mu =\int \phi h\cdot \psi\circ f\, dm$
$\displaystyle =\int\hat P(\phi h)\cdot \psi\, dm=\int \frac{1}{h}\hat P(\phi h)\cdot \psi\, d\mu$.

## Perron–Frobenius theorem

Today I attended a lecture given by Vaughn Climenhaga. He presented a proof of the following version of Perron–Frobenius theorem:

Let $\Delta\subset \mathbb{R}^d$ be the set of probability vectors, $P$ be a stochastic matrix with positive entries. Then
–there is a positive probability vector $\pi\in \Delta$ fixed by $P$
–the eigenspace $E_1=\mathbb{R}\pi$
–the spectrum $\Sigma(P)\subset B(0,r)\cup\{1\}$ for some $r<1$
–for all $v\in\Delta$, $P^n v\to \pi$ exponentially as $n\to \infty$.

Proof. (1) Let $v\in\Delta$. Then $\sum_i v_i=1$, and
$\sum_i (Pv)_i=\sum_i \sum_j p_{ij}v_j=\sum_j v_j=1$. So $Pv\in \Delta$. Moreover, $Pv$ is strictly positive, so $P(\Delta)\subset \text{Int}(\Delta)$. Therefore, by the Brouwer fixed point theorem, there exists some point $\pi\in\text{Int}(\Delta)$ fixed by $P$.

(2). Suppose on the contrary that there exists $v\notin \mathbb{R}\pi$ that is also fixed by $P$. Then $P$ fixes every vector in the plane $\Pi:=\mathbb{R}v\oplus\mathbb{R}\pi$, in particular the points on $\Pi\cap \partial \Delta$. This contradicts the fact $P(\Delta)\subset \text{Int}(\Delta)$ from (1).

(3). We use the norm $|v|=\sum|v_i|$. Note that $|Pv|=\sum_i |(Pv)_i|\le \sum_{ij}p_{ij}|v_j|=|v|$. So $\Sigma(P)\subset D(0,1)$. It suffices to show $(\Sigma(P)\backslash\{1\})\cap S^1=\emptyset$. If not, pick one, say $\lambda$, and $n\ge 1$ such that $\text{Re}(\lambda^n)<0$. Then $|\lambda^n-\epsilon|>1$ for any $\epsilon>0$.

Consider the matrix $A=P^n-\epsilon I$, which is (entrywise) positive if $\epsilon$ is small enough. Then $0\le A\le P^n$ entrywise, so $|Av|\le |v|$ and hence $\Sigma(A)\subset D(0,1)$. This contradicts the fact that $\lambda^n-\epsilon$ is an eigenvalue of $A$ with $|\lambda^n-\epsilon|>1$.

(4). Let $W\subset \mathbb{R}^d$ be the subspace of vectors with zero sum: $\sum_i v_i=0$, and consider the decomposition $\mathbb{R}^d=\mathbb{R}\pi\oplus W$. Note that $PW\subset W$, and by (2)–(3), $\Sigma(P|_{W})\subset D(0,r)$ for some $r<1$. For any $v\in\Delta$, we have $v=\pi+w$ for some $w\in W$. Then $|P^nv-\pi|=|P^n(v-\pi)|=|P^nw|\le Cr^n|w|$.

Only light calculations are used in his lecture. As pointed out by Vaughn, this approach does not give precise information about $r$.
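The statement is easy to watch numerically (my own illustration, with an arbitrary positive stochastic matrix): iterating $P$ on a vertex of $\Delta$ converges to the fixed vector $\pi$ exponentially fast.

```python
# Columns of P sum to 1, matching the convention (Pv)_i = sum_j p_{ij} v_j above.
P = [[0.5, 0.2, 0.3],
     [0.3, 0.6, 0.1],
     [0.2, 0.2, 0.6]]

def apply(P, v):
    return [sum(P[i][j] * v[j] for j in range(len(v))) for i in range(len(P))]

v = [1.0, 0.0, 0.0]           # a vertex of the simplex Delta
for _ in range(100):
    v = apply(P, v)            # stays in Delta: positive entries, sum 1

w = apply(P, v)
err = max(abs(w[i] - v[i]) for i in range(3))
print(v, err)                  # v is (numerically) the fixed vector pi
```

For this matrix the subdominant eigenvalues are well inside the unit disc, so `err` is at machine-precision level after 100 iterations.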

## Some notes

Let $M$ be a complete manifold, $\mathcal{K}_M$ be the set of compact/closed subsets of $M$. Let $X$ be a complete metric space.

A map $\phi: X\to \mathcal{K}_M$ is said to be upper-semicontinuous at $x$, if
for any open neighbourhood $U\supset \phi(x)$, there exists a neighbourhood $V\ni x$, such that $\phi(x')\subset U$ for all $x' \in V$.
or equivalently,
for any $x_n\to x$, and any sequence $y_n\in \phi(x_n)$, the limit set $\omega(y_n:n\ge 1)\subset \phi(x)$.
Viewed as a multivalued function, let $G(\phi)=\{(x,y)\in X\times M: y\in\phi(x)\}$ be the graph of $\phi$. Then $\phi$ is u.s.c. if and only if $G(\phi)$ is closed.

And $\phi$ is said to be lower-semicontinuous at $x$, if
for any open set $U$ intersecting $\phi(x)$, there exists neighbourhood $V\ni x$ such that $\phi(x')\cap U\neq\emptyset$ for all $x'\in V$.
or equivalently, for any $y\in \phi(x)$ and any sequence $x_n\to x$, there exist $y_n\in \phi(x_n)$ such that $y\in \omega(y_n:n\ge 1)$.

Let $\mathrm{Diff}^r(M)$ be the set of $C^r$ diffeomorphisms, and $H(f)$ be the closure of transverse homoclinic intersections of stable and unstable manifolds of some hyperbolic periodic points of $f$. Then $H$ is lower semicontinuous.

Given $f_n\to f$, note that it suffices to consider those points $x\in W^s(p,f)\pitchfork W^u(q,f)$. Let $p_n$ and $q_n$ be the continuations of $p$ and $q$ for $f_n$. Pick $\rho$ large enough such that $x\in W^s_\rho(p,f)\pitchfork W^u_\rho(q,f)$. Then for $f_n$ sufficiently close to $f$, $W^s_\rho(p_n,f_n)$ and $W^u_\rho(q_n,f_n)$ are $C^1$-close to $W^s_\rho(p,f)$ and $W^u_\rho(q,f)$. In particular, there is a point $x_n\in W^s_\rho(p_n,f_n)\pitchfork W^u_\rho(q_n,f_n)$ close to $x$.

## Admissible perturbations of the tangent map

Franks’ Lemma is a major tool in the study of differentiable dynamical systems. It says that along a simple orbit segment $E=\{x,fx,\cdots,f^nx\}$, a perturbation $A$ of $D_xf^n$ can be realized as the derivative of a perturbation $g$ of the map $f$ (which preserves the orbit segment). Moreover, such a perturbation is localized in a neighborhood of $E$, and it can be made arbitrarily $C^1$-close to $f$.

There have been various generalizations of Franks’ Lemma. Some constraints have been noticed when generalizing to geodesic flows and billiard dynamics, since one can’t perturb the dynamics directly, but have to make geometric deformations. See D. Visscher’s thesis for more details.

Let $Q$ be a strictly convex domain, and $x$ be a point of the orbit along a diameter of $Q$. Clearly $x$ is periodic of period 2. Let $r\le R$ be the radii of curvature of the boundary at $x$ and $fx$, respectively. Then
$D_xf^2=\frac{1}{rR}\begin{pmatrix}2d(d-r-R)+rR & 2d(d-R)\\ 2(d-r)(d-r-R) & 2d(d-r-R)+rR\end{pmatrix}$, where $d$ stands for the diameter of $Q$.
Note that the two entries on the diagonal are always the same. Therefore any linearization with different entries on the diagonal can’t be realized as the tangent map along a periodic billiard orbit of period 2. In other words, even though there are three parameters that one can change (the distance $d$ and the radii of curvature $r,R$ at both ends), the effects lie in the 2D subset $\{\begin{pmatrix}a & b \\ c & d\end{pmatrix}:ad-bc=1, a=d\}$ of the 3D group $\{\begin{pmatrix}a & b \\ c & d\end{pmatrix}:ad-bc=1\}$.
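One can at least sanity-check the displayed matrix: it should lie in $SL(2,\mathbb{R})$, with equal diagonal entries. A quick check in exact rational arithmetic (my own verification, not from the source):

```python
from fractions import Fraction as F

def D(d, r, R):
    """The claimed derivative D_x f^2 for the period-2 diameter orbit."""
    a = 2*d*(d - r - R) + r*R
    b = 2*d*(d - R)
    c = 2*(d - r)*(d - r - R)
    return [[F(a, r*R), F(b, r*R)], [F(c, r*R), F(a, r*R)]]

for (d, r, R) in [(2, 1, 1), (3, 1, 2), (F(7, 2), F(3, 4), 2)]:
    M = D(d, r, R)
    det = M[0][0]*M[1][1] - M[0][1]*M[1][0]
    assert det == 1            # det = 1: the billiard map is area-preserving
    assert M[0][0] == M[1][1]  # equal diagonal entries, as noted above
print("det = 1 and equal diagonal entries in all test cases")
```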

Visscher was able to prove that generically, for each periodic orbit of period at least 3, every small perturbation of the derivative along the orbit is actually realizable by deforming the boundary of the billiard table. For more details, see Visscher’s paper:

A Franks’ lemma for convex planar billiards.

## Regularity of center manifold

Let $X:\mathbb{R}^d\to \mathbb{R}^d$ be a $C^\infty$ vector field with $X(o)=0$. Then the origin $o$ is a fixed point of the generated flow on $\mathbb{R}^d$. Let $T_o\mathbb{R}^d=\mathbb{R}^s\oplus\mathbb{R}^c\oplus\mathbb{R}^u$ be the splitting into stable, center and unstable directions. Moreover, there are three invariant manifolds (at least locally) passing through $o$ and tangent to the corresponding subspaces at $o$.

Theorem (Pliss). For any $n\ge 1$, there exists a $C^n$ center manifold $C^n(o)=W^{c,n}(o)$.

Generally speaking, the size of the center manifold given above depends on the pre-fixed regularity requirement. Theoretically, there may not be a $C^\infty$ center manifold, since $C^n(o)$ could shrink to $o$ as $n\to\infty$. An explicit example was given by van Strien (here). He started with a family of vector fields $X_\mu(x,y)=(x^2-\mu^2, y+x^2-\mu^2)$. It is easy to see that $(\mu,0)$ is a fixed point, with eigenvalues $\lambda_1=2\mu<\lambda_2=1$. The center manifold can be represented (locally) as the graph of $y=f_\mu(x)$.

Lemma. For $n\ge 3$ and $\mu=\frac{1}{2n}$, $f_\mu$ is at most $C^{n-1}$ at $(\frac{1}{2n},0)$.

Proof. Suppose $f_\mu$ is $C^{k}$ at $(\frac{1}{2n},0)$, and let $\displaystyle f_\mu(x)=\sum_{i=1}^{k}a_i(x-\mu)^i+o(|x-\mu|^{k})$ be the finite Taylor expansion. The vector field direction $(x^2-\mu^2, y+x^2-\mu^2)$ always coincides with the tangent direction $(1,f'_\mu(x))$ along the graph $(x,f_\mu(x))$, which leads to

$(x^2-\mu^2)f_\mu'(x)=y+x^2-\mu^2=f_\mu(x)+x^2-\mu^2$.

Note that $x^2-\mu^2=(x-\mu)^2+2\mu(x-\mu)$. Then up to an error term $o(|x-\mu|^{k})$, the right-hand side in terms of $(x-\mu)$: $(a_1+2\mu)(x-\mu)+(a_2+1)(x-\mu)^2+\sum_{i=3}^{k}a_i(x-\mu)^i$; while the left-hand side in terms of $(x-\mu)$:

$(x-\mu)^2f_\mu'(x)+2\mu(x-\mu)f_\mu'(x)=\sum_{i=1}^{k}ia_i(x-\mu)^{i+1}+\sum_{i=1}^{k}2\mu i a_i(x-\mu)^i$

$=\sum_{i=2}^{k}(i-1)a_{i-1}(x-\mu)^{i}+\sum_{i=1}^{k}2\mu i a_i(x-\mu)^i$.

So for $i=1$: $2\mu a_1=a_1+2\mu$, $a_1=\frac{-2\mu}{1-2\mu}\sim 0$;

$i=2$: $a_2+1=a_1+4\mu a_2$, $a_2=\frac{a_1-1}{1-4\mu}\sim -1$;

$i=3,\cdots,k$: $a_i=(i-1)a_{i-1}+2i\mu a_i$, $(1-2i\mu)a_i=(i-1)a_{i-1}$.

Note that if $k\ge n$, we evaluate the last equation at $i=n$ to conclude that $a_{n-1}=0$. This will force $a_i=0$ for all $i=n-2,\cdots,2$, which contradicts the second estimate that $a_2\sim -1$. Q.E.D.
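The recursion can be carried out in exact arithmetic (my own check of the lemma, not in the source): at $\mu=\frac{1}{2n}$ the coefficient $a_{n-1}$ computed forward is visibly nonzero, while the $i=n$ equation has vanishing factor $1-2n\mu=0$, which is exactly the contradiction.

```python
from fractions import Fraction as F

def coeffs(n):
    """Taylor coefficients a_1, ..., a_{n-1} at mu = 1/(2n), via the recursion above."""
    mu = F(1, 2 * n)
    a = {1: -2 * mu / (1 - 2 * mu)}            # i = 1
    a[2] = (a[1] - 1) / (1 - 4 * mu)           # i = 2
    for i in range(3, n):                      # i = 3, ..., n-1
        a[i] = (i - 1) * a[i - 1] / (1 - 2 * i * mu)
    return mu, a

mu, a = coeffs(4)                              # n = 4, mu = 1/8
print(a)                                       # a_1 = -1/3, a_2 = -8/3, a_3 = -64/3
assert 1 - 2 * 4 * mu == 0                     # the i = n equation degenerates
assert a[3] != 0                               # so a_{n-1} = 0 is impossible
```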

Consider the 3D vector field $X(x,y,z)=(x^2-z^2, y+x^2-z^2,0)$. Note that the singular set $S$ consists of the two lines $x=\pm z$, $y=0$ (in particular it contains the origin $O=(0,0,0)$). Note that $D_OX=E_{22}$. Hence a center manifold $W^c(O)$ through $O$ is tangent to the plane $y=0$, and can be represented as $y=f(x,z)$. We claim that $f(x,x)=0$ (at least locally).

Proof of the claim. Suppose on the contrary that $c_n=f(x_n,x_n)\neq0$ for some $x_n\to 0$. Note that $p_n=(x_n,c_n,x_n)\in W^c(O)$, and $W^c(O)$ is flow-invariant. However, there is exactly one flow line passing through $p_n$: on the line $x=z=x_n$ the flow reduces to $\dot y=y$, so the orbit is $L_n=\{(x_n,c_ne^t,x_n):t\in\mathbb{R}\}$. Therefore $L_n\subset W^c(O)$, which contradicts the fact that $W^c(O)$ is tangent to the plane $y=0$ at $O$. This completes the proof of the claim.

The planes $z=\mu$ are also invariant under the flow. Let’s take the intersection $W_\mu=\{z=\mu\}\cap W^c(O)=\{(x,f(x,\mu),\mu)\}$. Then we check that $\{(x,f(x,\mu))\}$ is a (in fact the) center manifold of the restricted vector field in the plane $z=\mu$. We already checked that $f(\cdot,\mu)$ is not $C^\infty$ (take $\mu=\frac{1}{2n}$), hence neither is $W^c(O)$.

## The volume of uniform hyperbolic sets

This is a note on some well-known results, though the argument here may be new.

Proposition 1. Let $f\in\mathrm{Diff}^2_m(M)$. Then $m(\Lambda)=0$ for every closed, invariant hyperbolic set $\Lambda\neq M$.

See Theorem 15 of Bochi–Viana’s paper. Note that Proposition 1 also applies to Anosov case, in the sense that $m(\Lambda)>0$ implies that $\Lambda=M$ and $f$ is Anosov.

Proof. Suppose $m(\Lambda)>0$ for some hyperbolic set. Then the stable and unstable foliations/laminations are absolutely continuous. Hopf argument shows that $\Lambda$ is (essentially) saturated by stable and unstable manifolds. Being a closed subset, $\Lambda$ is in fact saturated by stable and unstable manifolds, and hence open. So $\Lambda=M$.

Proposition 2. There exists a residual subset $\mathcal{R}\subset \mathrm{Diff}_m^1(M)$, such that for every $f\in\mathcal{R}$, $m(\Lambda)=0$ for every closed, invariant hyperbolic set $\Lambda\neq M$.

Proof. Let $U\subset M$ be an open subset such that $\overline{U}\neq M$, and let $\Lambda_U(f)=\bigcap_{n\in\mathbb{Z}}f^n\overline{U}$, which is always a closed invariant set (maybe empty). Given $\epsilon>0$, let $\mathcal{D}(U,\epsilon)$ be the set of maps $f\in\mathrm{Diff}_m^1(M)$ such that either $\Lambda_U(f)$ is not a uniformly hyperbolic set, or it is hyperbolic but $m(\Lambda_U(f))<\epsilon$. It follows from Proposition 1 that $\mathcal{D}(U,\epsilon)$ is dense. We only need to show the openness. Pick an $f\in \mathcal{D}(U,\epsilon)$. Since $m(\Lambda_U(f))<\epsilon$, there exists $N\ge 1$ such that $m(\bigcap_{-N}^N f^n\overline{U})<\epsilon$. So there exists a neighborhood $\mathcal{U}\ni f$ such that $m(\bigcap_{-N}^N g^n\overline{U})<\epsilon$, and in particular $m(\Lambda_U(g))<\epsilon$, for every $g\in \mathcal{U}$. The genericity follows by taking the countable intersection of the open dense subsets $\mathcal{D}(U_n,1/k)$ over a countable basis $\{U_n\}$ of open sets and $k\ge 1$.

The dissipative version has been obtained in Alves–Pinheiro’s paper

Proposition 3. Let $f\in\mathrm{Diff}^2(M)$. Then $m(\Lambda)=0$ for every closed, transitive hyperbolic set $\Lambda\neq M$. In particular, $m(\Lambda)>0$ implies that $\Lambda=M$ and $f$ is Anosov.

See Theorem 4.11 in R. Bowen’s book when $\Lambda$ is a basic set.

## Doubling map on unit circle

1. Let $\tau:x\mapsto 2x$ be the doubling map on the unit torus. We also consider the uneven doubling $f_a(x)=x/a$ for $0\le x \le a$ and $f(x)=(x-a)/(1-a)$ for $a \le x \le 1$. It is easy to see that the Lebesgue measure $m$ is $f_a$-invariant, ergodic and the metric entropy $h(f_a,m)=\lambda(m)=\int \log f_a'(x) dm(x)=-a\log a-(1-a)\log (1-a)$. In particular, $h(f_a,m)\le h(f_{0.5},m)=\log 2 =h_{\text{top}}(f_a)$ and $h(f_a,m)\to 0$ when $a\to 0$.
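The entropy integral is easy to check by Monte Carlo (my own sketch): with $X$ uniform on $[0,1]$, $E[\log f_a'(X)]=a\log\frac{1}{a}+(1-a)\log\frac{1}{1-a}$, since $\log f_a'$ takes the value $\log(1/a)$ on $[0,a)$ and $\log(1/(1-a))$ on $[a,1]$.

```python
import math
import random

random.seed(0)
a = 0.3
exact = -a * math.log(a) - (1 - a) * math.log(1 - a)

# log f_a'(x) = log(1/a) on [0,a), and log(1/(1-a)) on [a,1]
n = 200_000
est = sum(math.log(1 / a) if random.random() < a else math.log(1 / (1 - a))
          for _ in range(n)) / n
print(exact, est)   # the two numbers should agree to about 1e-2
```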

2. The following is a theorem of Einsiedler–Fish (here).

Proposition. Let $\tau:x\mapsto 2x$ be the doubling map on the unit torus, $\mu$ be a $\tau$-invariant measure with zero entropy. Then for any $\epsilon>0$, $\beta>0$, there exist $\delta_0>0$ and a subset $E\subset \mathbb{T}$ with $\mu(E) > 1-\epsilon$, such that for all $x \in E$ and all $\delta<\delta_0$: $\mu(B(x,\delta))\ge \delta^\beta$.

A trivial observation is $\text{HD}(\mu)=0$, which also follows from general entropy-dimension formula.

Proof. Let $\beta$ and $\epsilon$ be fixed. Consider the generating partition $\xi=\{I_0, I_1\}$, and its refinements $\xi_n=\{I_\omega: \omega\in\{0,1\}^n\}$ (separated by $k\cdot 2^{-n}$)….

Furstenberg introduced the following notion in 1967.

Definition. A multiplicative semigroup $\Sigma\subset\mathbb{N}$ is lacunary, if $\Sigma\subset \{a^n: n\ge1\}$ for some integer $a$. Otherwise, $\Sigma$ is non-lacunary.

Example. Both $\{2^n: n\ge1\}$ and $\{3^n: n\ge1\}$ are lacunary semigroups. $\{2^m\cdot 3^n: m,n\ge1\}$ is a non-lacunary semigroup.

Theorem. Let $\Sigma\subset\mathbb{N}$ be a non-lacunary semigroup, enumerated increasingly as $s_1<s_2<\cdots$. Then $\frac{s_{i+1}}{s_i}\to 1$.

Example. $\Sigma=\{2^m\cdot 3^n: m,n\ge1\}$. It is equivalent to show $\{m\log 2+ n\log 3: m,n\ge1\}$ has smaller and smaller steps.
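One can watch this happen (an illustration I added, not part of the notes): enumerate $\{2^m\cdot 3^n\}$ up to $10^6$ and look at consecutive ratios, which creep down toward 1.

```python
LIMIT = 10**6
vals = sorted({2**m * 3**n
               for m in range(1, 21) for n in range(1, 14)
               if 2**m * 3**n <= LIMIT})
ratios = [b / a for a, b in zip(vals, vals[1:])]

print(vals[:6], ratios[:3])    # 6, 12, 18, 24, ... with ratios 2.0, 1.5, ...
print(max(ratios[-10:]))       # near 10^6 the consecutive ratios are close to 1
```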

Theorem. Let $\Sigma\subset\mathbb{N}$ be a non-lacunary semigroup, and $A\subset \mathbb{T}$ be a closed $\Sigma$-invariant subset. If $0$ is not isolated in $A$, then $A=\mathbb{T}$.

Furstenberg Theorem. Let $\Sigma\subset\mathbb{N}$ be a non-lacunary semigroup, and $\alpha\in \mathbb{T}\backslash \mathbb{Q}$. Then $\Sigma\alpha$ is dense in $\mathbb{T}$.

In the same paper, Furstenberg also made the following conjecture: a $\Sigma$-invariant ergodic measure is either supported on a finite orbit, or is the Lebesgue measure.

A countable group $G$ is said to be amenable if it admits a Følner sequence. For example, every countable abelian group is amenable. Note that for an amenable group action $G\ni g:X\to X$, there always exist invariant measures and decompositions into ergodic measures. More importantly, generic points can be defined by averaging along Følner sequences, and almost every point is a generic point for an ergodic measure. In a preprint, the author had an interesting idea: to prove the Furstenberg conjecture, it suffices to show that every irrational number is a generic point of the Lebesgue measure. Then any other non-atomic ergodic measure, if it exists, will starve to death since there is no generic point for it 🙂

## Some simple dynamical systems

Dynamical formulation of Prisoner’s dilemma
Originally, consider two players, each with a set of strategies, say $\mathcal{A}=\{a_{i}\}$ and $\mathcal{B}=\{b_{j}\}$. The pay-off $P_k=P_k(a_{i},b_{j})$ for player $k$ depends on the choices of both players.

Now consider two dynamical systems $(M_i,f_i)$. The set of strategies consists of the invariant probability measures, and the pay-off functions can be

$\phi_k(\mu_1,\mu_2)=\int \Phi_k(x,y)d\mu_1 d\mu_2$, where $\mu_i\in\mathcal{M}(f_i)$;

$\psi_k(\mu_1,\mu_2)=\int \Phi_k(x,y)d\mu_1 d\mu_2-h(f_k,\mu_k)$.

The first one is related to ergodic optimization. The second one does sound better, since one may want to avoid a complicated (as measured by its entropy) strategy that has the same $\phi$ pay-off.

Gambler’s Ruin Problem
A gambler starts with an initial fortune of $\$i$, and then either wins $\$1$ (with probability $p$) or loses $\$1$ (with probability $q=1-p$) on each successive gamble (independently of the past). Let $S_n$ denote the total fortune after the $n$-th gamble. Given $N>i$, the gambler stops either when $S_n=0$ (broke) or $S_n=N$ (wins), whichever happens first.

Let $\tau$ be the stopping time and $P_i(N)=P(S_\tau=N)$ be the probability that the gambler wins. It is easy to see that $P_0(N)=0$ and $P_N(N)=1$. We need to figure out $P_i(N)$ for all $i=1,\cdots,N-1$.

Let $S_0=i$, and $S_n=S_{n-1}+X_n$. There are two cases according to $X_1$:

$X_1=1$ (prob $p$): win eventually with prob $P_{i+1}(N)$;

$X_1=-1$ (prob $q$): win eventually with prob $P_{i-1}(N)$.

So $P_i(N)=p\cdot P_{i+1}(N)+q\cdot P_{i-1}(N)$, or equivalently,
$p\cdot (P_{i+1}(N)-P_i(N))=q\cdot (P_i(N)-P_{i-1}(N))$ (since $p+q=1$), $i=1,\cdots,N-1$.

Recall that $P_0(N)=0$ and $P_N(N)=1$. Therefore $P_{i+1}(N)-P_i(N)=\frac{q^i}{p^i}(P_1(N)-P_{0}(N))=\frac{q^i}{p^i}P_1(N)$, $i=0,\cdots,N-1$. Summing over $i\ge 1$, we get $1-P_1(N)=P_1(N)\cdot\sum_{1}^{N-1}\frac{q^i}{p^i}$, hence $P_1(N)=\frac{1}{\sum_{0}^{N-1}\frac{q^i}{p^i}}=\frac{1-q/p}{1-q^N/p^N}$ (if $p\neq .5$) and $P_1(N)=\frac{1}{N}$ (if $p= .5$). Generally $P_i(N)=P_1(N)\cdot\sum_{0}^{i-1}\frac{q^j}{p^j}=\frac{1-q^i/p^i}{1-q^N/p^N}$ (if $p\neq .5$) and $P_i(N)=\frac{i}{N}$ (if $p= .5$).

Observe that for fixed $i$, the limit $P_i(\infty)=1-q^i/p^i>0$ only when $p>.5$, and $P_i(\infty)=0$ whenever $p\le .5$.
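The closed form can be checked against the defining first-step recurrence in exact arithmetic (a quick self-check I added, with an arbitrary biased $p$):

```python
from fractions import Fraction

p = Fraction(3, 5)          # an arbitrary biased game, p != 1/2
q = 1 - p
N = 10
r = q / p

# closed form: P_i(N) = (1 - (q/p)^i) / (1 - (q/p)^N)
P = [(1 - r**i) / (1 - r**N) for i in range(N + 1)]

assert P[0] == 0 and P[N] == 1
for i in range(1, N):
    # the first-step recurrence P_i = p*P_{i+1} + q*P_{i-1}
    assert P[i] == p * P[i + 1] + q * P[i - 1]
print([float(x) for x in P])
```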

Finite Blaschke products
Let $f$ be an analytic function on the unit disc $\mathbb{D}=\{z\in\mathbb{C}: |z|<1 \}$ which extends continuously to $\overline{\mathbb{D}}$ with $f(S^1)\subset S^1$. Then $f$ is of the form

$\displaystyle f(z)=\zeta\cdot\prod_{i=1}^n\left({{z-a_i}\over {1-\bar{a_i}z}}\right)^{m_i}$,

where $\zeta\in S^1$, and $m_i$ is the multiplicity of the zero $a_i\in \mathbb{D}$ of $f$. Such $f$ is called a finite Blaschke product.

Proposition. Let $f$ be a finite Blaschke product. Then the restriction $f:S^1\to S^1$ is measure-preserving if and only if $f(0)=0$. That is, $a_i=0$ for some $i$.

Proof. Let $\phi$ be an analytic function on $\overline{\mathbb{D}}$. By the mean value property, $\int_{S^1}\phi\, d\theta=\phi(0)$ and $\int_{S^1}\phi\circ f\, d\theta=\phi(f(0))$. So $f$ preserves $d\theta$ if and only if $\phi(f(0))=\phi(0)$ for all such $\phi$, that is, $f(0)=0$.
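The mean value identity behind the proof is easy to see numerically (my own sketch): averaging a Blaschke product over the circle recovers its value at $0$, whether or not it has a zero there.

```python
import cmath

def blaschke(z, zeros):
    """Finite Blaschke product with the given zeros in the unit disc."""
    w = 1
    for a in zeros:
        w *= (z - a) / (1 - a.conjugate() * z)
    return w

N = 4096
grid = [cmath.exp(2j * cmath.pi * k / N) for k in range(N)]

for zeros in ([0, 0.5], [0.3, 0.5]):          # with / without a zero at the origin
    zeros = [complex(a) for a in zeros]
    avg = sum(blaschke(z, zeros) for z in grid) / N
    f0 = blaschke(0j, zeros)
    print(zeros, abs(avg - f0))                # the circle average equals f(0)
```

Only in the first case is $f(0)=0$, i.e. only then is the restriction to $S^1$ measure-preserving.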

Significance: there are a lot of measure-preserving covering maps on $S^1$.

Kalikow’s Random Walk in Random Scenery
Let $X=\{1,-1\}^{\mathbb{Z}}$, and $\sigma:X\to X$ be the shift $\sigma((x_n))=(x_{n+1})$. More generally, let $A$ be a finite alphabet, $p$ a probability vector on $A$, and $Y=A^{\mathbb{Z}}$, $\nu=p^{\times\mathbb{Z}}$. Consider the skew-product $T:X\times Y\to X\times Y$, $(x,y)\mapsto (\sigma x, \sigma^{x_0}y)$. It is clear that $T$ preserves any $\mu\times \nu$, where $\mu$ is $\sigma$-invariant.

Proposition. Let $\mu=(.5,.5)^{\times\mathbb{Z}}$. Then $h(T,\mu\times \nu)=h(\sigma,\mu)=\log 2$ for all $(A,p)$.

Proof. Note that $T^n(x,y)=(\sigma^n x, \sigma^{x_0+\cdots+x_{n-1}}y)$. The CLT tells us that $\mu(x:|x_0+\cdots+x_{n-1}|\ge\kappa\cdot \sqrt{n})< \delta(\kappa)$ for large $n$, where $\delta(\kappa)\to 0$ as $\kappa\to\infty$. So there are only $2^{n+\kappa \sqrt{n}}$ different $n$-strings (up to an error).

Significance: this gives a natural family of examples that are K, but not isomorphic to Bernoulli.

Creation of one sink. 1D case. Consider the family $f_t:x\mapsto x^2+2-t$, where $0\le t\le 2$. Let $t_\ast$ be the first parameter such that the graph is tangent to the diagonal at $x_\ast=f_{t_\ast}(x_\ast)$. Note that $x_\ast$ is parabolic. Then for $t\in(t_\ast,t_\ast+\epsilon)$, $f_t(x)=x$ has two solutions $x_1(t)<x_2(t)$, where $x_1(t)$ is a sink and $x_2(t)$ is a source.

2D case. Let $B=[-1,1]\times[-\epsilon,\epsilon]$ be a rectangle, $f$ be a diffeomorphism such that $f(B)$ is a horseshoe of shape ‘V’ lying above $B$. Moreover we assume $|\det Df|<1$. Let $f_t(x,y)=f(x,y)-(0,t)$ such that $f_1(B)\cap B$ is the regular horseshoe intersection. Clearly there exists a fixed point $p_1$ of $f_1$ in $B$. We assume $\lambda_1(1)<-1<\lambda_2(1)<0$. Then Robinson proved that for some $t$, $f_t$ admits a fixed point in $B$ which is a sink.

First note that for any $t$, any fixed point of $f_t$ (if it exists) is not on the boundary of $B$. Since $p_1$ is a nondegenerate fixed point of $f_1$, the fixed point continues to exist for some open interval of parameters (assume $(t_1,1]$ is maximal, and denote the fixed point by $p_t$). Clearly $t_1>0$. Note that $p_{t_1}$ is also fixed by $f_{t_1}$, since being a fixed point is a closed condition. If there is some moment with $\lambda_1(t)=\lambda_2(t)$ for the fixed point $p_t$ of $f_t$, then it is already a sink, since $|\lambda_1\lambda_2|=|\det Df|<1$. So in the following we consider the case $\lambda_1(t)\neq\lambda_2(t)$ for all $t\in[t_1,1]$. Then continuous dependence on the parameter implies that both are continuous functions of $t$. The fixed point $p_{t_1}$ must be degenerate, since the fixed point ceases to exist beyond $t_1$; that is, $\lambda_i(t_1)=1$ for some $i\in\{1,2\}$.

Case 1. $\lambda_1(t_1)=1$. Note that $\lambda_1(1)<-1$. So $\text{Re}\lambda_1(t_\ast)=0$ for some $t_\ast\in(t_1,1)$, which implies that $\lambda_1(t_\ast)=ai$ for some $a\neq 0$. In particular, $\lambda_2(t_\ast)=-ai$, and $a^2=|\det Df|<1$. So $p_{t_\ast}$ is a (complex) sink.

Case 2. $\lambda_2(t_1)=1$. Note that $\lambda_2(1)<0$. Similarly $\text{Re}\lambda_2(t_\ast)=0$ for some $t_\ast\in(t_1,1)$.

So in the orientation-preserving case there always exists a complex sink. In the orientation-reversing case ($\lambda_2(1)\in(0,1)$), we need to modify the argument for Case 2:

Case 2′. $\lambda_2(t_1)=1$. Note that $\lambda_2(1)\in(0,1)$. We pick $t_\ast$ close to $t_1$ such that $\lambda_2(t_\ast)\in(|\det Df|,1)$, which implies $|\lambda_1(t_\ast)|=|\det Df|/\lambda_2(t_\ast)<1$, too. So $p_{t_\ast}$ is also a sink.

## Playing pool with pi

This is a short note based on the paper

Playing pool with π (the number π from a billiard point of view) by G. Galperin in 2003.

Let’s start with two hard balls, denoted by $B_1$ and $B_2$, of masses $m<M$, placed on the positive real axis at positions $0<x<y$ respectively, and a rigid wall at the origin. Without loss of generality we assume $m=1$. Then push the ball $B_2$ towards $B_1$, and count the total number $N(M)$ of collisions (ball–ball and ball–wall) until $B_2$ escapes to $\infty$ faster than $B_1$.

Case $M=1$: first collision when $y(t)=x(t)$, after which $B_2$ rests and $B_1$ moves towards the wall; second collision when $x(t)=0$, after which $B_1$ gains the opposite velocity and moves back to $B_2$; third collision when the balls meet again, after which $B_1$ rests and $B_2$ moves towards $\infty$.

Total count: $N(1)=3$, which happens to be the integer part of $\pi$. Well, this must be a coincidence, one might wonder.

However, Galperin proved that, if we set $M=10^{2k}$, then $N(M)$ gives the integral part of $10^k\pi$. For example, $N(10^2)=31$; and  $N(10^4)=314$.
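Galperin’s count is easy to reproduce with a little event-driven simulation; only the velocities matter, and the events alternate between ball–ball collisions (standard 1D elastic-collision formulas) and wall bounces. A sketch (my own code, not Galperin’s):

```python
def collisions(M):
    """Count collisions for a wall, a unit-mass ball B1, and a ball B2 of mass M."""
    v1, v2 = 0.0, -1.0             # v1: small ball B1 at rest, v2: B2 pushed at the wall
    count = 0
    while True:
        if v2 < v1:                # B2 approaches B1: elastic ball-ball collision
            v1, v2 = (((1 - M) * v1 + 2 * M * v2) / (1 + M),
                      ((M - 1) * v2 + 2 * v1) / (1 + M))
            count += 1
        elif v1 < 0:               # B1 moves left: it bounces off the wall
            v1 = -v1
            count += 1
        else:                      # both move right and B2 is at least as fast: escape
            break
    return count

print(collisions(1), collisions(100), collisions(10**4))   # 3, 31, 314
```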

## Notes-09-14

4. Borel–Cantelli Lemma(s). Let $(X,\mathcal{X},\mu)$ be a probability space. Then

If $\sum_n \mu(A_n)<\infty$, then $\mu(x\in A_n \text{ infinitely often})=0$.

If $A_n$ are independent and $\sum_n \mu(A_n)=\infty$, then for $\mu$-a.e. $x$, $\frac{1}{\mu(A_1)+\cdots+\mu(A_n)}\cdot|\{1\le k\le n:x\in A_k\}|\to 1$.

The dynamical version often involves the orbits of points, instead of the static points. In particular, let $T$ be a measure-preserving map on $(X,\mathcal{X},\mu)$. Then

$\{A_n\}$ is said to be a Borel–Cantelli sequence with respect to $(T,\mu)$ if $\mu(T^n x\in A_n \text{ infinitely often})=1$;

$\{A_n\}$ is said to be a strong Borel–Cantelli sequence if $\frac{1}{\mu(A_1)+\cdots+\mu(A_n)}\cdot|\{1\le k\le n:T^k x\in A_k\}|\to 1$ for $\mu$-a.e. $x$.
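The independent case of the second lemma is easy to see numerically (an illustration I added, of the classical statement rather than the dynamical one): take independent events with $\mu(A_n)=n^{-1/2}$, so that $\sum_n\mu(A_n)=\infty$, and watch the hit ratio approach 1.

```python
import random

random.seed(7)
n = 20_000
hits = 0.0
mass = 0.0
for k in range(1, n + 1):
    pk = k ** -0.5                 # mu(A_k) = 1/sqrt(k): divergent series
    mass += pk
    if random.random() < pk:       # x in A_k, independently over k
        hits += 1
ratio = hits / mass
print(ratio)                        # should be close to 1
```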

3. Let $H(q,p,t)$ be a Hamiltonian function, $S(q,t)$ be the generating function in the sense that $\frac{\partial S}{\partial q_i}=p_i$. Then the Hamilton–Jacobi equation is a first-order, non-linear partial differential equation

$H + \frac{\partial S}{\partial t}=0$.

Note that the total derivative $\frac{dS}{dt}=\sum_i\frac{\partial S}{\partial q_i}\dot q_i+\frac{\partial S}{\partial t}=\sum_i p_i\dot q_i-H=L$. Therefore, $S=\int L$ is the classical action function (up to an undetermined constant).

2. Let $\gamma_s(t)$ be a family of geodesics on a Riemannian manifold $M$. Then $J(t)=\frac{\partial }{\partial s}\big|_{s=0} \gamma_s(t)$ defines a vector field along $\gamma(t)=\gamma_0(t)$, which is called a Jacobi field. $J(t)$ describes the behavior of the geodesics in an infinitesimal neighborhood of the given geodesic $\gamma$.

Alternatively, a vector field $J(t)$ along a geodesic $\gamma$ is said to be a Jacobi field if it satisfies the Jacobi equation:

$\frac{D^2}{dt^2}J(t)+R(J(t),\dot\gamma(t))\dot\gamma(t)=0,$

where $D$ denotes the covariant derivative with respect to the Levi-Civita connection, and $R$ the Riemann curvature tensor on $M$.
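For instance, on a manifold of constant sectional curvature $K$, writing $J(t)=j(t)E(t)$ for a parallel unit vector field $E(t)$ normal to $\dot\gamma$, the Jacobi equation reduces to the scalar ODE $j''+Kj=0$, with the familiar trichotomy:

```latex
j(t)=\begin{cases}
a\cos(\sqrt{K}\,t)+b\sin(\sqrt{K}\,t), & K>0 \quad (\text{conjugate points, as on the sphere}),\\[2pt]
a+bt, & K=0 \quad (\text{linear spreading, Euclidean case}),\\[2pt]
a\cosh(\sqrt{-K}\,t)+b\sinh(\sqrt{-K}\,t), & K<0 \quad (\text{exponential spreading, hyperbolic case}).
\end{cases}
```

The last case is the infinitesimal mechanism behind hyperbolicity of geodesic flows in negative curvature.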