Martingale and its application to dynamical systems

In the last week of May I attended two lectures given by Professor Matthew Nicol.

Let (\Omega,\mu) be a probability space with a \sigma-algebra \mathcal{B}. Let \mathcal{F}\prec \mathcal{B} be a sub-\sigma-algebra.

Example. Let f(x)=2x (\text{mod} 1) on \mathbb{T}, and let \mathcal{B} be the Borel \sigma-algebra. Let \mathcal{F}=f^{-1}\mathcal{B}. Note that (0.2,0.3)\notin\mathcal{F}, since every set in \mathcal{F} is invariant under x\mapsto x+1/2.

Let Y be a \mathcal{B}-measurable r.v. and Y\in L^1(\mu). The conditional expectation E(Y|\mathcal{F}) is the unique (up to \mu-null sets) \mathcal{F}-measurable r.v. Z, that is, Z^{-1}(a,b)\in \mathcal{F} for all (a,b), satisfying \int_A Z d\mu=\int _A Y d\mu for all A\in \mathcal{F}.

Note that E(Y|\mathcal{F})=Y if and only if Y is \mathcal{F}-measurable; and E(Y|\mathcal{F})=E(Y) if Y is independent of \mathcal{F}.

Let (X_n)_{n\ge 0} be a stationary ergodic process with stationary initial distribution \mu. A basic problem is to find sufficient conditions on (X_n)_{n\ge 0} and on functions \phi\in L^2_0(\mu) such that \displaystyle S_n(\phi)=\sum_{k=1}^n \phi(X_k) satisfies the central limit theorem (CLT) \displaystyle \frac{1}{\sqrt{n}}S_n(\phi) \to N(0,\sigma^2), where the limit variance is given by \displaystyle \sigma^2(\phi)=\lim_{n\to\infty}\frac{1}{n}E(S^2_n(\phi)).

Let f be a conservative diffeomorphism on (M,m). There are two operators: \phi\mapsto U\phi=\phi\circ f, and \phi\mapsto P\phi via \int P\phi\cdot \psi=\int \phi\cdot \psi\circ f for all test functions \psi.

Property. PU(\phi)=\phi (vol-preserving) and UP(\phi)=E(\phi|f^{-1}\mathcal{B}).
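The two identities can be checked numerically. The following is a minimal sketch, assuming the doubling map from the earlier example instead of a diffeomorphism (for a diffeomorphism f^{-1}\mathcal{B}=\mathcal{B}, so the conditional expectation is trivial). The function names and the sample observable are illustrative choices, not taken from the lectures.

```python
import math

def f(x):
    """Doubling map x -> 2x (mod 1); an assumed example, not a diffeomorphism."""
    return (2.0 * x) % 1.0

def U(phi):
    """Koopman operator: (U phi)(x) = phi(f(x))."""
    return lambda x: phi(f(x))

def P(phi):
    """Transfer operator w.r.t. Lebesgue measure:
    average of phi over the two preimages x/2 and (x+1)/2."""
    return lambda x: 0.5 * (phi(x / 2.0) + phi((x + 1.0) / 2.0))

def cond_exp(phi):
    """E(phi | f^{-1}B): average of phi over the fibre {x, x + 1/2}."""
    return lambda x: 0.5 * (phi(x) + phi((x + 0.5) % 1.0))

# An arbitrary continuous observable on the circle (illustrative).
phi = lambda x: math.sin(2 * math.pi * x) + math.cos(4 * math.pi * x) + x * (1 - x)

points = [k / 97.0 for k in range(97)]
err_PU = max(abs(P(U(phi))(x) - phi(x)) for x in points)
err_UP = max(abs(U(P(phi))(x) - cond_exp(phi)(x)) for x in points)
print("max |PU phi - phi|            =", err_PU)
print("max |UP phi - E(phi|f^{-1}B)| =", err_UP)
```

Both errors should be at the level of floating-point rounding, which matches PU=\text{Id} and UP=E(\cdot|f^{-1}\mathcal{B}) in this example.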

Let \mathcal{F}_n be an increasing sequence of \sigma-algebras. Then a sequence of r.v. S_n is called a martingale w.r.t. \mathcal{F}_n, if S_n is \mathcal{F}_n-measurable, E(S_{n+1}|\mathcal{F}_n)=S_n.

Let \mathcal{F}_n be a decreasing sequence of \sigma-algebras. Then a sequence of r.v. S_n is called a reverse martingale w.r.t. \mathcal{F}_n, if S_n is \mathcal{F}_n-measurable, E(S_{n}|\mathcal{F}_m)=S_m for any n\le m.

Theorem. Let \{X_n:n\ge 1\} be a stationary ergodic sequence of (reverse) martingale differences w.r.t. \{\mathcal{F}_n\}. Suppose E(X_n)=0, and \sigma^2=\text{Var}(X_i)>0. Then \displaystyle \frac{1}{\sigma\sqrt{n}}\sum_{i=1}^n X_i \to N(0,1) in distribution.

Gordin: Suppose (f,m) is ergodic. Consider the Birkhoff sum \displaystyle \sum_{i=1}^n \phi\circ f^i for some \phi with \int \phi=0. The time series \phi\circ f^i can be approximated by martingale differences provided the correlations decay quickly enough.

Suppose there exists p(n) with \sum  p(n) < \infty, such that \|P^n\phi\|\le C\cdot p(n)\|\phi\|. Then define \displaystyle g=\sum_{n\ge 1}P^n\phi, and let X=\phi+g-g\circ f.

Property. Let f:M\to M be such that f^{-n}\mathcal{B} is decreasing. \displaystyle S_n=\sum_{i=1}^n X\circ f^i is a reverse martingale with respect to f^{-n}\mathcal{B}.

Proof. Note that PX=P\phi+Pg-PUg=0. Then E(X|f^{-1}\mathcal{B})=UP(X)=U0=0.
Let k < n. It remains to show E(X\circ f^k|f^{-n}\mathcal{B})=0. To this end, we pick an element A\in f^{-n}\mathcal{B} and write it as A=f^{-k-1}C for some C\in f^{k+1-n}\mathcal{B}. Then \displaystyle \int_A X\circ f^k dm=\int_{f^{-1}C}X dm =\int_{f^{-1}C} E(X|f^{-1}\mathcal{B}) dm=\int_{f^{-1}C}0 dm=0. This completes the proof.
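The key identity PX=0 can be illustrated on a concrete case. This is a sketch under assumptions: the doubling map with Lebesgue measure and the observable \phi(x)=\cos(8\pi x) are chosen for illustration (they are not from the lectures), and the series for g is truncated because P^n\phi vanishes for n\ge 3 here.

```python
import math

def f(x):
    """Doubling map (assumed example)."""
    return (2.0 * x) % 1.0

def P(phi):
    """Transfer operator of the doubling map w.r.t. Lebesgue measure."""
    return lambda x: 0.5 * (phi(x / 2.0) + phi((x + 1.0) / 2.0))

def iterate_P(phi, n):
    """Return P^n phi."""
    for _ in range(n):
        phi = P(phi)
    return phi

# Mean-zero observable; P phi = cos(4 pi x), P^2 phi = cos(2 pi x), P^3 phi = 0.
phi = lambda x: math.cos(8 * math.pi * x)

# g = sum_{n>=1} P^n phi, truncated at 5 terms (the tail is zero in this example).
P_powers = [iterate_P(phi, n) for n in range(1, 6)]
g = lambda x: sum(term(x) for term in P_powers)

# Gordin's martingale difference X = phi + g - g o f.
X = lambda x: phi(x) + g(x) - g(f(x))

points = [k / 97.0 for k in range(97)]
max_PX = max(abs(P(X)(x)) for x in points)
max_diff = max(abs(X(x) - math.cos(2 * math.pi * x)) for x in points)
print("max |P X|                =", max_PX)
print("max |X - cos(2 pi x)|    =", max_diff)
```

In this example X reduces to \cos(2\pi x), and PX vanishes up to rounding, so X\circ f^i is a sequence of reverse martingale differences as in the Property.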

Three theorems of Gordin. Let (\Omega,\mu,T) be an invertible \mu-preserving ergodic system, X\in L^1(\mu) and X_k(x)=X(T^kx) be a strictly stationary ergodic sequence.

(*) \displaystyle \limsup_{n\to\infty}\frac{1}{\sqrt{n}}E|S_n| < \infty

Theorem 1. Suppose there exists \displaystyle \mathcal{F}_k\subset T^{-1}\mathcal{F}_k=\mathcal{F}_{k+1} such that \displaystyle \sum_{k\ge 0} E|E(X_0|\mathcal{F}_{-k})|<\infty, \displaystyle \sum_{k\ge 0} E|X_0-E(X_0|\mathcal{F}_{k})| < \infty. Then (*) implies \displaystyle \lambda:=\lim_{n\to\infty}\frac{1}{\sqrt{n}}E|S_n| exists, and \displaystyle \frac{1}{\sqrt{n}}S_n\to N(0,\lambda^2\pi/2) in distribution (degenerate if \lambda=0).

–Mixing condition. Let \displaystyle \alpha(n):=\sup\{P(A\cap B)-P(A)P(B):A\in\mathcal{F}^0_{-\infty}, B\in\mathcal{F}^{\infty}_n\}.

Theorem 2. Suppose for some 1/p+1/q=1, X\in L^p(\mu) and \displaystyle \sum_{n\ge 1}\alpha(n)^{1/q} < \infty. Then (*) implies the conclusion of Theorem 1.

–uniform mixing condition. Let \displaystyle \phi(n):=\sup\{P(B|A)-P(B):A\in\mathcal{F}^0_{-\infty}, \mu(A) > 0, B\in\mathcal{F}^{\infty}_n\}.

Theorem 3. Suppose X\in L^1(\mu) and \displaystyle \sum_{n\ge 1}\phi(n) < \infty. Then (*) implies the conclusion of Theorem 1.

Cuny–Merlevede: not only the CLT, but also the ASIP holds under the above conditions.

Note that we started with an invariant measure m. The operators U and P can also be defined for non-conservative maps. To emphasize the difference, we use \hat P. Suppose \hat P h=h for some h\in L^1(m). Then \mu=hm is an absolutely continuous invariant prob. measure:

\displaystyle \int \phi\circ f d\mu=\int \phi\circ f h dm=\int \phi\cdot \hat P h dm=\int \phi hdm=\int\phi d\mu.

Then we can rewrite \displaystyle P\phi=\frac{1}{h}\hat P(h\phi), in the sense that \displaystyle \int P(\phi)\cdot \psi d\mu=\int \phi\cdot \psi\circ f d\mu  =\int \phi h\cdot \psi\circ f dm
\displaystyle =\int\hat P(\phi h)\cdot \psi dm =\int \frac{1}{h}\hat P(\phi h)\cdot \psi d\mu.

Perron–Frobenius theorem

Today I attended a lecture given by Vaughn Climenhaga. He presented a proof of the following version of the Perron–Frobenius theorem:

Let \Delta\subset \mathbb{R}^d be the set of probability vectors, P be a stochastic matrix with positive entries. Then
–there is a positive probability \pi\in \Delta fixed by P
–the eigenspace E_1=\mathbb{R}\pi
–the spectrum \Sigma(P)\subset B(0,r)\cup\{1\} for some r<1
–for all v\in\Delta, P^n v\to \pi exponentially as n\to \infty.

Proof. (1) Let v\in\Delta. Then \sum_i v_i=1, and
\sum_i (Pv)_i=\sum_i \sum_j p_{ij}v_j=\sum_j v_j=1. So Pv\in \Delta. Moreover, Pv is positive and P(\Delta)\subset \text{Int}(\Delta). Therefore there exists some point \pi\in\text{Int}(\Delta) fixed by P (for instance by Brouwer's fixed point theorem, since \Delta is compact and convex).

(2). Suppose on the contrary that there exists v\notin \mathbb{R}\pi that is also fixed by P. Then P fixes every vector in the plane \Pi:=\mathbb{R}v\oplus\mathbb{R}\pi, in particular the points on \Pi\cap \partial \Delta. This contradicts (1).

(3). We use the norm |v|=\sum|v_i|. Note that |Pv|=\sum_i |(Pv)_i|\le \sum_{ij}p_{ij}|v_j|=|v|. So \Sigma(P)\subset D(0,1). It suffices to show (\Sigma(P)\backslash\{1\})\cap S^1=\emptyset. If not, pick one, say \lambda, and n\ge 1 such that \text{Re}(\lambda^n)<0. Then |\lambda^n-\epsilon|>1 for any \epsilon>0.

Consider the matrix A=P^n-\epsilon I, which is positive if \epsilon is small enough. Then we have |A|\le |P^n| and hence \Sigma(A)\subset D(0,1). This contradicts the fact that \lambda^n-\epsilon is an eigenvalue of A.

(4). Let W\subset \mathbb{R}^d be the subset of vectors with zero mean: \sum v_i=0, and consider the decomposition \mathbb{R}^d=\mathbb{R}\pi\oplus W. Note that PW\subset W and hence \Sigma(P|_{W})\subset D(0,r). For any v\in\Delta, we have v=\pi+w for some w\in W. Then |P^nv-\pi|=|P^n(v-\pi)|=|P^nw|\le Cr^n|w|.

Only light calculations are used in his lecture. As pointed out by Vaughn, this approach does not give precise information about the rate r.
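Item (4) of the theorem is easy to watch numerically. The sketch below is an illustration only: the 3\times 3 matrix is a hypothetical example with positive entries and columns summing to 1 (matching the convention \sum_i p_{ij}=1 used in the proof), and the helper names are my own.

```python
def apply(P, v):
    """Matrix-vector product (Pv)_i = sum_j p_ij v_j."""
    return [sum(P[i][j] * v[j] for j in range(len(v))) for i in range(len(P))]

def l1_distance(v, w):
    """The norm |v - w| = sum_i |v_i - w_i| used in the proof."""
    return sum(abs(a - b) for a, b in zip(v, w))

# Hypothetical positive matrix; every column sums to 1.
P = [[0.5, 0.2, 0.3],
     [0.3, 0.5, 0.3],
     [0.2, 0.3, 0.4]]

v0 = [1.0, 0.0, 0.0]          # a vertex of the simplex Delta

# Approximate the fixed probability vector pi by iterating long enough.
pi = v0
for _ in range(200):
    pi = apply(P, pi)

# Record |P^n v0 - pi| for n = 1, ..., 10.
errors = []
v = v0
for n in range(1, 11):
    v = apply(P, v)
    errors.append(l1_distance(v, pi))

print("pi     =", pi)
print("errors =", errors)
```

The printed errors shrink geometrically, in line with |P^nv-\pi|\le Cr^n|w|. As noted above, the proof itself does not tell us the precise value of r.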

Some notes

Let M be a complete manifold, \mathcal{K}_M be the set of compact/closed subsets of M. Let X be a complete metric space.

A map \phi: X\to \mathcal{K}_M is said to be upper-semicontinuous at x, if
for any open neighbourhood U\supset \phi(x), there exists a neighbourhood V\ni x, such that \phi(x')\subset U for all x' \in V.
or equivalently,
for any x_n\to x, and any sequence y_n\in \phi(x_n), the limit set \omega(y_n:n\ge 1)\subset \phi(x).
Viewed as a multivalued function, let G(\phi)=\{(x,y)\in X\times M: y\in\phi(x)\} be the graph of \phi. Then \phi is u.s.c. if and only if the graph G(\phi) is closed.

And \phi is said to be lower-semicontinuous at x, if
for any open set U intersecting \phi(x), there exists neighbourhood V\ni x such that \phi(x')\cap U\neq\emptyset for all x'\in V.
or equivalently, for any y\in \phi(x), and any sequence x_n\to x, there exists y_n\in \phi(x_n) such that y\in \omega(y_n:n\ge 1).

Let \mathrm{Diff}^r(M) be the set of C^r diffeomorphisms, and H(f) be the closure of transverse homoclinic intersections of stable and unstable manifolds of some hyperbolic periodic points of f. Then H is lower semicontinuous.

Given f_n\to f. Note that it suffices to consider those points x\in W^s(p,f)\pitchfork W^u(q,f). Let p_n and q_n be the continuations of p and q for f_n. Pick \rho large enough such that x\in W^s_\rho(p,f)\pitchfork W^u_\rho(q,f). Then for f_n sufficiently close to f, W^s_\rho(p_n,f_n) and W^u_\rho(q_n,f_n) are C^1 close to W^s_\rho(p,f) and W^u_\rho(q,f). In particular x_n\in W^s_\rho(p_n,f_n)\pitchfork W^u_\rho(q_n,f_n) is close to x.

Admissible perturbations of the tangent map

Franks’ Lemma is a major tool in the study of differentiable dynamical systems. It says that along a simple orbit segment E=\{x,fx,\cdots,f^nx\}, any small perturbation of A\sim D_xf^n can be realized via a perturbation of the map g\sim f (which preserves the orbit segment). Moreover, such a perturbation is localized in a neighborhood of E, and it can be made arbitrarily C^1-close to f.

There have been various generalizations of Franks’ Lemma. Some constraints have been noticed when generalizing to geodesic flows and billiard dynamics, since one can’t perturb the dynamics directly, but has to make geometric deformations. See D. Visscher’s thesis for more details.

Let Q be a strictly convex domain, x be the orbit along the/a diameter of Q. Clearly x is 2-periodic. Let r\le R be the radii of curvature at x, fx, respectively. Then
D_xf^2=\frac{1}{rR}\begin{pmatrix}2d(d-r-R)+rR & 2d(d-R)\\ 2(d-r)(d-r-R) & 2d(d-r-R)+rR\end{pmatrix}, where d stands for the diameter of Q.
Note that the two entries on the diagonal are always the same. Therefore any linearization with different entries on the diagonal can’t be realized as the tangent map along a periodic billiard orbit of period 2. In other words, even though there are three parameters that one can change: the distance d, the radii of curvature at both ends r,R, the effects lie in a 2D-subspace \{\begin{pmatrix}a & b \\ c & d\end{pmatrix}:ad-bc=1, a=d\} of the 3D \{\begin{pmatrix}a & b \\ c & d\end{pmatrix}:ad-bc=1\}.
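Both constraints (determinant 1 and equal diagonal entries) can be verified in exact arithmetic directly from the displayed formula. A small sketch, where the function name and the sample values of (d,r,R) are arbitrary illustrative choices:

```python
from fractions import Fraction

def diameter_orbit_matrix(d, r, R):
    """Matrix D_x f^2 from the formula above, in exact rational arithmetic."""
    d, r, R = Fraction(d), Fraction(r), Fraction(R)
    diag = 2 * d * (d - r - R) + r * R
    upper = 2 * d * (d - R)
    lower = 2 * (d - r) * (d - r - R)
    scale = r * R
    return [[diag / scale, upper / scale],
            [lower / scale, diag / scale]]

def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

# Arbitrary sample parameters (d, r, R) with r <= R.
samples = [(3, 1, 2), (5, 2, 2), (Fraction(7, 3), 1, 4)]
for d, r, R in samples:
    A = diameter_orbit_matrix(d, r, R)
    print((d, r, R), "det =", det(A), "diagonal equal:", A[0][0] == A[1][1])
```

The determinant comes out as exactly 1 for every sample, so the three parameters (d,r,R) only reach the 2D family with equal diagonal entries inside the 3D group of area-preserving matrices.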

Visscher was able to prove that generically, for each periodic orbit of period at least 3, every small perturbation of D_xF^3 is actually realizable by deforming the boundary of the billiard table. For more details, see Visscher’s paper:

A Franks’ lemma for convex planar billiards.


Regularity of center manifold

Let X:\mathbb{R}^d\to \mathbb{R}^d be a C^\infty vector field with X(o)=0. Then the origin o is a fixed point of the generated flow on \mathbb{R}^d. Let T_o\mathbb{R}^d=\mathbb{R}^s\oplus\mathbb{R}^c\oplus\mathbb{R}^u be the splitting into stable, center and unstable directions. Moreover, there are three invariant manifolds (at least locally) passing through o and tangent to the corresponding subspaces at o.

Theorem (Pliss). For any n\ge 1, there exists a C^n center manifold C^n(o)=W^{c,n}(o).

Generally speaking, the size of the center manifold given above depends on the pre-fixed regularity requirement. Theoretically, there may not be a C^\infty center manifold, since C^n(o) could shrink to o as n\to\infty. An explicit example was given by van Strien. He started with a family of vector fields X_\mu(x,y)=(x^2-\mu^2, y+x^2-\mu^2). It is easy to see that (\mu,0) is a fixed point, with eigenvalues \lambda_1=2\mu<\lambda_2=1 for small \mu>0. The center manifold can be represented (locally) as the graph of y=f_\mu(x).

Lemma. For n\ge 3, \mu=\frac{1}{2n}, f_\mu is at most C^{n-1} at (\frac{1}{2n},0).

Proof. Suppose f_\mu is C^{k} at (\frac{1}{2n},0), and let \displaystyle f_\mu(x)=\sum_{i=1}^{k}a_i(x-\mu)^i+o(|x-\mu|^{k}) be the finite Taylor expansion. The vector field direction (x^2-\mu^2, y+x^2-\mu^2) always coincides with the tangent direction (1,f'_\mu(x)) along the graph (x,f_\mu(x)), which leads to

(x^2-\mu^2)f_\mu'(x)=y+x^2-\mu^2=f_\mu(x)+x^2-\mu^2.

Note that x^2-\mu^2=(x-\mu)^2+2\mu(x-\mu). Then up to an error term o(|x-\mu|^{k}), the right-hand side in terms of (x-\mu): (a_1+2\mu)(x-\mu)+(a_2+1)(x-\mu)^2+\sum_{i=3}^{k}a_i(x-\mu)^i; while the left-hand side in terms of (x-\mu):

(x-\mu)^2f_\mu'(x)+2\mu(x-\mu)f_\mu'(x)=\sum_{i=1}^{k}ia_i(x-\mu)^{i+1}+\sum_{i=1}^{k}2\mu i a_i(x-\mu)^i

=\sum_{i=2}^{k}(i-1)a_{i-1}(x-\mu)^{i}+\sum_{i=1}^{k}2\mu i a_i(x-\mu)^i.

So for i=1: 2\mu a_1=a_1+2\mu, a_1=\frac{-2\mu}{1-2\mu}\sim 0;

i=2: a_2+1=a_1+4\mu a_2, a_2=\frac{a_1-1}{1-4\mu}\sim -1;

i=3,\cdots,k: a_i=(i-1)a_{i-1}+2i\mu a_i, (1-2i\mu)a_i=(i-1)a_{i-1}.

Note that if k\ge n, we evaluate the last equation at i=n to conclude that a_{n-1}=0. This will force a_i=0 for all i=n-2,\cdots,2, which contradicts the second estimate that a_2\sim -1. Q.E.D.
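The obstruction can also be seen by running the recurrence forward in exact arithmetic: a_1 and a_2 are determined, the relation (1-2i\mu)a_i=(i-1)a_{i-1} determines a_3,\dots,a_{n-1}, and at order i=n the left-hand side vanishes while the right-hand side does not. This is a sketch of the same argument read in the forward direction; the function name is my own.

```python
from fractions import Fraction

def taylor_obstruction(n):
    """For mu = 1/(2n), compute a_1, ..., a_{n-1} from the coefficient
    equations and return (a, factor, rhs), where the order-n equation
    reads  factor * a_n = rhs  with factor = 1 - 2*n*mu."""
    mu = Fraction(1, 2 * n)
    a = {1: -2 * mu / (1 - 2 * mu)}
    a[2] = (a[1] - 1) / (1 - 4 * mu)
    for i in range(3, n):
        a[i] = (i - 1) * a[i - 1] / (1 - 2 * i * mu)
    factor = 1 - 2 * n * mu           # vanishes exactly when mu = 1/(2n)
    rhs = (n - 1) * a[n - 1]
    return a, factor, rhs

for n in range(3, 9):
    a, factor, rhs = taylor_obstruction(n)
    print(n, "factor =", factor, "rhs =", rhs)
```

For each n\ge 3 the factor is 0 while the right-hand side is nonzero, so no coefficient a_n can satisfy the order-n equation. This is the same contradiction as in the proof.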

Consider the 3D vector field X(x,y,z)=(x^2-z^2, y+x^2-z^2,0). Note that the singular set S consists of two lines x=\pm z, y=0 (in particular it contains the origin O=(0,0,0)). Note that D_OX=E_{22}. Hence a center manifold W^c(O) through O is tangent to the plane y=0, and can be represented as y=f(x,z). We claim that f(x,x)=0 (at least locally).

Proof of the claim. Suppose on the contrary that c_n=f(x_n,x_n)\neq0 for some x_n\to 0. Note that p_n=(x_n,c_n,x_n)\in W^c(O), and W^c(O) is flow-invariant. However, there is exactly one flow line passing through p_n: the line L_n=\{(x_n,c_nt,x_n):t>0\}. Therefore L_n\subset W^c(O), which contradicts the fact that W^c(O) is tangent to plane y=0 at O. This completes the proof of the claim.

The planes z=\mu are also invariant under the flow. Let’s take the intersection W_\mu=\{z=\mu\}\cap W^c(O)=\{(x,f(x,\mu),\mu)\}. Then we check that \{(x,f(x,\mu))\} is a (in fact the) center manifold of the restricted vector field in the plane z=\mu. We already checked that f(x,\mu) is not C^\infty, so neither is W^c(O).

The volume of uniform hyperbolic sets

This is a note on some well-known results. The argument here may be new, and may be complete.

Proposition 1. Let f\in\mathrm{Diff}^2_m(M). Then m(\Lambda)=0 for every closed, invariant hyperbolic set \Lambda\neq M.

See Theorem 15 of Bochi–Viana’s paper. Note that Proposition 1 also applies to the Anosov case, in the sense that m(\Lambda)>0 implies that \Lambda=M and f is Anosov.

Proof. Suppose m(\Lambda)>0 for some hyperbolic set. Then the stable and unstable foliations/laminations are absolutely continuous. Hopf argument shows that \Lambda is (essentially) saturated by stable and unstable manifolds. Being a closed subset, \Lambda is in fact saturated by stable and unstable manifolds, and hence open. So \Lambda=M.

Proposition 2. There exists a residual subset \mathcal{R}\subset \mathrm{Diff}_m^1(M), such that for every f\in\mathcal{R}, m(\Lambda)=0 for every closed, invariant hyperbolic set \Lambda\neq M.

Proof. Let U\subset M be an open subset such that \overline{U}\neq M, and let \Lambda_U(f)=\bigcap_{n\in\mathbb{Z}}f^n\overline{U}, which is always a closed invariant set (maybe empty). Given \epsilon>0, let \mathcal{D}(U,\epsilon) be the set of maps f\in\mathrm{Diff}_m^1(M) such that either \Lambda_U(f) is not a uniformly hyperbolic set, or it is hyperbolic but m(\Lambda_U(f))<\epsilon. It follows from Proposition 1 that \mathcal{D}(U,\epsilon) is dense. We only need to show the openness. Pick an f\in \mathcal{D}(U,\epsilon). Since m(\Lambda_U(f))<\epsilon, there exists N\ge 1 such that m(\bigcap_{-N}^N f^n\overline{U})<\epsilon. So there exists a neighborhood \mathcal{U}\ni f such that m(\bigcap_{-N}^N g^n\overline{U})<\epsilon for every g\in\mathcal{U}. In particular, m(\Lambda_U(g))<\epsilon for every g\in \mathcal{U}. The genericity follows by taking the countable intersection of the open dense subsets \mathcal{D}(U_n,1/k).

The dissipative version has been obtained in Alves–Pinheiro’s paper.

Proposition 3. Let f\in\mathrm{Diff}^2(M). Then m(\Lambda)=0 for every closed, transitive hyperbolic set \Lambda\neq M. In particular, m(\Lambda)>0 implies that \Lambda=M and f is Anosov.

See Theorem 4.11 in R. Bowen’s book when \Lambda is a basic set.

Doubling map on unit circle

1. Let \tau:x\mapsto 2x be the doubling map on the unit torus. We also consider the uneven doubling f_a(x)=x/a for 0\le x \le a and f_a(x)=(x-a)/(1-a) for a \le x \le 1. It is easy to see that the Lebesgue measure m is f_a-invariant and ergodic, and that the metric entropy is h(f_a,m)=\lambda(m)=\int \log f_a'(x) dm(x)=-a\log a-(1-a)\log (1-a). In particular, h(f_a,m)\le h(f_{0.5},m)=\log 2 =h_{\text{top}}(f_a) and h(f_a,m)\to 0 when a\to 0.
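The entropy formula is simple enough to tabulate. A minimal sketch (the function name is my own) that evaluates \int \log f_a' dm branch by branch and compares it with \log 2:

```python
import math

def lyapunov_exponent(a):
    """Integral of log f_a' against Lebesgue measure:
    slope 1/a on [0, a] (mass a) and slope 1/(1-a) on [a, 1] (mass 1-a)."""
    return a * math.log(1.0 / a) + (1.0 - a) * math.log(1.0 / (1.0 - a))

for a in (0.5, 0.3, 0.1, 0.01, 0.001):
    print(f"a = {a:<6} h(f_a, m) = {lyapunov_exponent(a):.6f}   log 2 = {math.log(2):.6f}")
```

The maximum \log 2 is attained at a=0.5, and the values decrease to 0 as a\to 0, as stated.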

2. The following is a theorem of Einsiedler–Fish.

Proposition. Let \tau:x\mapsto 2x be the doubling map on the unit torus, \mu be a \tau-invariant measure with zero entropy. Then for any \epsilon>0, \beta>0, there exist \delta_0>0 and a subset E\subset \mathbb{T} with \mu(E) > 0, such that for all x \in E, and all \delta<\delta_0: \mu(B(x,\delta))\ge \delta^\beta.

A trivial observation is \text{HD}(\mu)=0, which also follows from general entropy-dimension formula.

Proof. Let \beta and \epsilon be fixed. Consider the generating partition \xi=\{I_0, I_1\}, and its refinements \xi_n=\{I_\omega: \omega\in\{0,1\}^n\} (separated by k\cdot 2^{-n})….

Furstenberg introduced the following notion in 1967.

Definition. A multiplicative semigroup \Sigma\subset\mathbb{N} is lacunary, if \Sigma\subset \{a^n: n\ge1\} for some integer a. Otherwise, \Sigma is non-lacunary.

Example. Both \{2^n: n\ge1\} and \{3^n: n\ge1\} are lacunary semigroups. \{2^m\cdot 3^n: m,n\ge1\} is a non-lacunary semigroup.

Theorem. Let \Sigma\subset\mathbb{N} be a non-lacunary semigroup, enumerated increasingly as s_1<s_2<\cdots. Then \frac{s_{i+1}}{s_i}\to 1.

Example. \Sigma=\{2^m\cdot 3^n: m,n\ge1\}. It is equivalent to show \{m\log 2+ n\log 3: m,n\ge1\} has smaller and smaller steps.
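The shrinking steps in this example can be observed by brute force. The sketch below enumerates \{2^m\cdot 3^n: m,n\ge1\} up to a bound and compares consecutive ratios. The bound 10^{12} and the cut-off 10^7 are arbitrary choices for illustration.

```python
def semigroup_elements(bound):
    """All numbers 2^m * 3^n with m, n >= 1 that are <= bound, sorted."""
    elems = []
    power_of_two = 2
    while power_of_two * 3 <= bound:
        value = power_of_two * 3
        while value <= bound:
            elems.append(value)
            value *= 3
        power_of_two *= 2
    return sorted(elems)

def max_ratio(elems, lower):
    """Largest ratio s_{i+1}/s_i among consecutive elements that are >= lower."""
    tail = [s for s in elems if s >= lower]
    return max(b / a for a, b in zip(tail, tail[1:]))

elems = semigroup_elements(10**12)
print("first elements:", elems[:8])
print("max ratio overall     :", max_ratio(elems, 1))
print("max ratio beyond 10^7 :", max_ratio(elems, 10**7))
```

The first ratio is 12/6=2, while beyond 10^7 all consecutive ratios are already close to 1. The convergence is slow, which is consistent with the theorem giving no rate.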

Theorem. Let \Sigma\subset\mathbb{N} be a non-lacunary semigroup, and A\subset \mathbb{T} be \Sigma-invariant. If 0 is not isolated in A, then A=\mathbb{T}.

Furstenberg Theorem. Let \Sigma\subset\mathbb{N} be a non-lacunary semigroup, and \alpha\in \mathbb{T}\backslash \mathbb{Q}. Then \Sigma\alpha is dense in \mathbb{T}.

In the same paper, Furstenberg also made the following conjecture: a \Sigma-invariant ergodic measure is either supported on a finite orbit, or is the Lebesgue measure.

A countable group G is said to be amenable, if it contains at least one Følner sequence. For example, any abelian countable group is amenable. Note that for an amenable group action G\ni g:X\to X, there always exist invariant measures and a decomposition into ergodic measures. More importantly, generic points can be defined by averaging along the Følner sequences, and almost every point is a generic point for an ergodic measure. In a preprint, the author had an interesting idea: to prove the Furstenberg conjecture, it suffices to show that every irrational number is a generic point of the Lebesgue measure. Then any other non-atomic ergodic measures, if they exist, will be starved to death since there is no generic point for them :)

Some simple dynamical systems

Dynamical formulation of Prisoner’s dilemma
Originally, consider two players, each of whom has a set of strategies, say \mathcal{A}=\{a_{i}\} and \mathcal{B}=\{b_{j}\}. The pay-off P_k=P_k(a_{i},b_{j}) for player k depends on the choices of both players.

Now consider two dynamical systems (M_i,f_i). The set of strategies consists of the invariant probability measures, and the pay-off functions can be

\phi_k(\mu_1,\mu_2)=\int \Phi_k(x,y)d\mu_1 d\mu_2, where \mu_i\in\mathcal{M}(f_i);

\psi_k(\mu_1,\mu_2)=\int \Phi_k(x,y)d\mu_1 d\mu_2-h(f_k,\mu_k).

The first one is related to ergodic optimization. The second one does sound better, since one may want to avoid a complicated (measured by its entropy) strategy that has the same \phi pay-off.

Gambler’s Ruin Problem
A gambler starts with an initial fortune of $i, and then either wins $1 (with probability p) or loses $1 (with probability q=1-p) on each successive gamble (independent of the past). Let S_n denote the total fortune after the n-th gamble. Given N>i, the gambler stops either when S_n=0 (broke), or S_n=N (win), whichever happens first.

Let \tau be the stopping time and P_i(N)=P(S_\tau=N) be the probability that the gambler wins. It is easy to see that P_0(N)=0 and P_N(N)=1. We need to figure out P_i(N) for all i=1,\cdots,N-1.

Let S_0=i, and S_n=S_{n-1}+X_n. There are two cases according to X_1:

X_1=1 (prob p): win eventually with prob P_{i+1}(N);

X_1=-1 (prob q): win eventually with prob P_{i-1}(N).

So P_i(N)=p\cdot P_{i+1}(N)+q\cdot P_{i-1}(N), or equivalently,
p\cdot (P_{i+1}(N)-P_i(N))=q\cdot (P_i(N)-P_{i-1}(N)) (since p+q=1), i=1,\cdots,N-1.

Recall that P_0(N)=0 and P_N(N)=1. Therefore P_{i+1}(N)-P_i(N)=\frac{q^i}{p^i}(P_1(N)-P_{0}(N))=\frac{q^i}{p^i}P_1(N), i=1,\cdots,N-1. Summing over i, we get 1-P_1(N)=P_1(N)\cdot\sum_{1}^{N-1}\frac{q^i}{p^i}, so P_1(N)=\frac{1}{\sum_{0}^{N-1}\frac{q^i}{p^i}}=\frac{1-q/p}{1-q^N/p^N} (if p\neq .5) and P_1(N)=\frac{1}{N} (if p= .5). Generally P_i(N)=P_1(N)\cdot\sum_{0}^{i-1}\frac{q^j}{p^j}=\frac{1-q^i/p^i}{1-q^N/p^N} (if p\neq .5) and P_i(N)=\frac{i}{N} (if p= .5).

Observe that for fixed i, the limit P_i(\infty)=1-q^i/p^i>0 only when p>.5, and P_i(\infty)=0 whenever p\le .5.
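The closed form can be checked against the one-step recurrence P_i(N)=p\cdot P_{i+1}(N)+q\cdot P_{i-1}(N) and the boundary values in exact arithmetic. A short sketch, with function names and sample parameters of my own choosing:

```python
from fractions import Fraction

def win_probability(i, N, p):
    """Closed-form P_i(N) derived above, in exact rational arithmetic."""
    p = Fraction(p)
    q = 1 - p
    if p == q:
        return Fraction(i, N)
    rho = q / p
    return (1 - rho**i) / (1 - rho**N)

def satisfies_recurrence(N, p):
    """Check P_i = p P_{i+1} + q P_{i-1} for i = 1..N-1 and the boundary values."""
    p = Fraction(p)
    q = 1 - p
    P = [win_probability(i, N, p) for i in range(N + 1)]
    interior = all(P[i] == p * P[i + 1] + q * P[i - 1] for i in range(1, N))
    return interior and P[0] == 0 and P[N] == 1

for p in (Fraction(1, 2), Fraction(3, 5), Fraction(2, 5)):
    print("p =", p, "recurrence holds:", satisfies_recurrence(10, p),
          " P_3(10) =", win_probability(3, 10, p))
```

For an unfavorable game such as p=2/5, the value P_3(N) decays quickly as N grows, which matches P_i(\infty)=0 for p\le .5.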

Finite Blaschke products
Let f be an analytic function on the unit disc \mathbb{D}=\{z\in\mathbb{C}: |z|<1 \} with a continuous extension to \overline{\mathbb{D}} with f(S^1)\subset S^1. Then f is of the form

\displaystyle f(z)=\zeta\cdot\prod_{i=1}^n\left({{z-a_i}\over {1-\bar{a_i}z}}\right)^{m_i},

where \zeta\in S^1, and m_i is the multiplicity of the zero a_i\in \mathbb{D} of f. Such f is called a finite Blaschke product.

Proposition. Let f be a finite Blaschke product. Then the restriction f:S^1\to S^1 is measure-preserving if and only if f(0)=0. That is, a_i=0 for some i.

Proof. Let \phi be an analytic function on \overline{\mathbb{D}}. Then, with respect to the normalized arc length d\theta on S^1, \int_{S^1}\phi d\theta=\phi(0) and \int_{S^1}\phi\circ f d\theta=\phi\circ f(0).
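The mean value identities in the proof are easy to test numerically. The sketch below uses equally spaced points on S^1 as a quadrature rule. The zeros, the test function \phi(z)=z^2+3z and all names are illustrative assumptions.

```python
import cmath
import math

def blaschke(zeros, zeta=1.0):
    """Finite Blaschke product with the given zeros in the open unit disc."""
    zeros = [complex(a) for a in zeros]
    def f(z):
        w = complex(zeta)
        for a in zeros:
            w *= (z - a) / (1 - a.conjugate() * z)
        return w
    return f

def circle_mean(h, n=4096):
    """Average of h over n equally spaced points of the unit circle."""
    return sum(h(cmath.exp(2j * math.pi * k / n)) for k in range(n)) / n

phi = lambda z: z**2 + 3 * z             # analytic test function with phi(0) = 0

f0 = blaschke([0, 0.5 + 0.2j])           # f0(0) = 0
f1 = blaschke([0.5 + 0.2j, -0.3j])       # f1(0) != 0

print("mean of phi o f0 :", circle_mean(lambda z: phi(f0(z))), " phi(0) =", phi(0))
print("mean of f1       :", circle_mean(f1), " f1(0) =", f1(0))
```

For f0 the average of \phi\circ f0 agrees with \phi(0), as measure preservation requires. For f1 the average of the coordinate function changes from 0 to f1(0)\neq 0, so f1 cannot preserve the measure.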

Significance: there are a lot of measure-preserving covering maps on S^1.

Kalikov’s Random Walk Random Scenery
Let X=\{1,-1\}^{\mathbb{Z}}, and \sigma:X\to X be the shift \sigma((x_n))=(x_{n+1}). More generally, let A be a finite alphabet and p be a probability vector on A, and Y=A^{\mathbb{Z}}, \nu=p^{\times\mathbb{Z}}. Consider the skew-product T:X\times Y\to X\times Y, (x,y)\mapsto (\sigma x, \sigma^{x_0}y). It is clear that T preserves any \mu\times \nu, where \mu is \sigma-invariant.

Proposition. Let \mu=(.5,.5)^{\times\mathbb{Z}}. Then h(T,\mu\times \nu)=h(\sigma,\mu)=\log 2 for all (A,p).

Proof. Note that T^n(x,y)=(\sigma^n x, \sigma^{x_0+\cdots+x_{n-1}}y). The CLT tells us that \mu(x:|x_0+\cdots+x_{n-1}|\ge\kappa\cdot \sqrt{n})< \delta(\kappa) as n\to\infty, where \delta(\kappa)\to 0 as \kappa\to\infty. There are only 2^{n+\kappa \sqrt{n}} different n-strings (up to an error).

Significance: this gives a natural family of examples that are K, but not isomorphic to Bernoulli.

Creation of one sink. 1D case. Consider the family f_t:x\mapsto x^2+2-t, where 0\le t\le 2. Let t_\ast be the first parameter such that the graph is tangent to the diagonal at x_\ast=f_{t_\ast}(x_\ast). Note that x_\ast is parabolic. Then for t\in(t_\ast,t_\ast+\epsilon), f_t(x)=x has two solutions x_1(t)<x_2(t), where x_1(t) is a sink, and x_2(t) is a source.
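For this family everything is explicit: f_t(x)=x reads x^2-x+(2-t)=0 with discriminant 4t-7, so t_\ast=7/4 and x_\ast=1/2. A small sketch (function names are my own) that lists the fixed points and their multipliers f_t'(x)=2x:

```python
import math

def fixed_points(t):
    """Real solutions of x^2 + 2 - t = x, in increasing order."""
    disc = 4.0 * t - 7.0
    if disc < 0:
        return []
    if disc == 0:
        return [0.5]
    s = math.sqrt(disc)
    return [(1.0 - s) / 2.0, (1.0 + s) / 2.0]

def multiplier(x):
    """Derivative of f_t at x (independent of t)."""
    return 2.0 * x

for t in (1.7, 1.75, 1.8, 2.0):
    pts = fixed_points(t)
    print("t =", t, [(round(x, 4), round(multiplier(x), 4)) for x in pts])
```

Before t_\ast=1.75 there is no fixed point, at t_\ast the single fixed point has multiplier 1, and afterwards x_1(t) has multiplier of modulus less than 1 (a sink) while x_2(t) has multiplier greater than 1 (a source).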

2D case. Let B=[-1,1]\times[-\epsilon,\epsilon] be a rectangle, f be a diffeomorphism such that f(B) is a horseshoe of shape ‘V’ lying above B. Moreover we assume |\det Df|<1. Let f_t(x,y)=f(x,y)-(0,t) such that f_1(B) crosses B in the regular horseshoe configuration. Clearly there exists a fixed point p_1 of f_1 in B. We assume \lambda_1(1)<-1<\lambda_2(1)<0. Then Robinson proved that, for some t, f_t admits a fixed point in B which is a sink.

First note that for any t, any fixed point of f_t (if it exists) is not on the boundary of B. Since p_1 is a nondegenerate fixed point of f_1, the fixed point continues to exist for some open interval (t_1,1) (assume it is maximal, and denote the fixed point by p_t). Clearly t_1>0. Note that p_{t_1} is also fixed by f_{t_1}, since being fixed is a closed property. If there is some moment with \lambda_1(t)=\lambda_2(t) for the fixed point p_t of f_t, then it is already a sink, since \det Df=\lambda_1\cdot\lambda_2<1. So in the following we consider the case \lambda_1\neq\lambda_2 for all p_t, t\in[t_1,1]. Then the continuous dependence on parameters implies that both are continuous functions of t. The fixed point p_{t_1} must be degenerate, since the fixed point ceases to exist beyond t_1, which means: \lambda_i(t_1)=1 for some i\in\{1,2\}.

Case 1. \lambda_1(t_1)=1. Note that \lambda_1(1)<-1. So \text{Re}\lambda_1(t_\ast)=0 for some t_\ast\in(t_1,1), which implies that \lambda_1(t_\ast)=ai for some a\neq 0. In particular, \lambda_2(t_\ast)=-ai, and a^2=|\det Df|<1. So p_{t_\ast} is a (complex) sink.

Case 2. \lambda_2(t_1)=1. Note that \lambda_2(1)<0. Similarly \text{Re}\lambda_2(t_\ast)=0 for some t_\ast\in(t_1,1).

So in the orientation-preserving case there always exists a complex sink. In the orientation-reversing case (\lambda_2(1)\in(0,1)), we need to modify the argument for case 2:

Case 2′. \lambda_2(t_1)=1. Note that \lambda_2(1)\in(0,1). So |\lambda_2(t_\ast)|<1 for some t_\ast\in(t_1,1). We pick t_\ast close to t_1 in the sense that |\lambda_2(t_\ast)|>|\det Df|, which implies |\lambda_1(t_\ast)|<1, too. So p_{t_\ast} is also a sink.

Playing pool with pi

This is a short note based on the paper 

Playing pool with π (the number π from a billiard point of view) by G. Galperin in 2003.

Let’s start with two hard balls, denoted by B_1 and B_2, of masses 0<m\le M on the positive real axis with positions 0<x< y, and a rigid wall at the origin. Without loss of generality we assume m=1. Then push the ball B_2 towards B_1, and count the total number N(M) of collisions (ball-ball and ball-wall) until B_2 escapes to \infty faster than B_1.

Case. M=1: first collision at y(t)=x, then B_2 rests, and B_1 moves towards the wall; second collision at x(t)=0, then B_1 gains the opposite velocity and moves back to B_2; third collision at x(t)=x, then B_1 rests, and B_2 moves towards \infty.

Total count N(1)=3, which happens to be the integer part of \pi. Well, this must be a coincidence, one might think.

However, Galperin proved that, if we set M=10^{2k}, then N(M) gives the integer part of 10^k\pi. For example, N(10^2)=31 and N(10^4)=314.
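The collision count can be reproduced with a few lines of code, since only the velocities matter for deciding which collision comes next. This is a sketch under the standard assumptions of perfectly elastic collisions and m=1. It uses exact rational arithmetic to avoid rounding issues, and the function name is my own.

```python
from fractions import Fraction

def count_collisions(M):
    """Number of ball-ball and ball-wall collisions for masses 1 (near the wall)
    and M (farther out), with the heavy ball initially moving towards the wall."""
    m, M = Fraction(1), Fraction(M)
    v_small, v_big = Fraction(0), Fraction(-1)
    count = 0
    while True:
        if v_big < v_small:
            # The heavy ball catches the light one: elastic ball-ball collision.
            v_small, v_big = (((m - M) * v_small + 2 * M * v_big) / (m + M),
                              ((M - m) * v_big + 2 * m * v_small) / (m + M))
            count += 1
        elif v_small < 0:
            # The light ball is heading to the wall: it bounces back.
            v_small = -v_small
            count += 1
        else:
            # Both balls move away from the wall and B_2 is not slower than B_1.
            return count

for M in (1, 100, 10**4):
    print("M =", M, " N(M) =", count_collisions(M))
```

The output reproduces N(1)=3, N(10^2)=31 and N(10^4)=314 from the text.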


Notes-09-14

4. Borel–Cantelli Lemma(s). Let (X,\mathcal{X},\mu) be a probability space. Then

If \sum_n \mu(A_n)<\infty, then \mu(x\in A_n \text{ infinitely often})=0.

If A_n are independent and \sum_n \mu(A_n)=\infty, then for \mu-a.e. x, \frac{1}{\mu(A_1)+\cdots+\mu(A_n)}\cdot|\{1\le k\le n:x\in A_k\}|\to 1.

The dynamical version often involves the orbits of points, instead of the static points. In particular, let T be a measure-preserving map on (X,\mathcal{X},\mu). Then

\{A_n\} is said to be a Borel–Cantelli sequence with respect to (T,\mu) if \mu(T^n x\in A_n \text{ infinitely often})=1;

\{A_n\} is said to be a strong Borel–Cantelli sequence if \frac{1}{\mu(A_1)+\cdots+\mu(A_n)}\cdot|\{1\le k\le n:T^k x\in A_k\}|\to 1 for \mu-a.e. x.

3. Let H(q,p,t) be a Hamiltonian function, S(q,t) be the generating function in the sense that \frac{\partial S}{\partial q_i}=p_i. Then the Hamilton–Jacobi equation is a first-order, non-linear partial differential equation

H + \frac{\partial S}{\partial t}=0.

Note that the total derivative \frac{dS}{dt}=\sum_i\frac{\partial S}{\partial q_i}\dot q_i+\frac{\partial S}{\partial t}=\sum_i p_i\dot q_i-H=L. Therefore, S=\int L is the classical action function (up to an undetermined constant).

2. Let \gamma_s(t) be a family of geodesics on a Riemannian manifold M. Then J(t)=\frac{\partial }{\partial s}|_{s=0} \gamma_s(t) defines a vector field along \gamma(t)=\gamma_0(t), which is called a Jacobi field. J(t) describes the behavior of the geodesics in an infinitesimal neighborhood of a given geodesic \gamma.

Alternatively, a vector field J(t) along a geodesic \gamma is said to be a Jacobi field, if it satisfies the Jacobi equation:

\frac{D^2}{dt^2}J(t)+R(J(t),\dot\gamma(t))\dot\gamma(t)=0,

where D denotes the covariant derivative with respect to the Levi-Civita connection, and R the Riemann curvature tensor on M.

