 Research
 Open access
Generalized kinetic theory of coarse-grained systems. I. Partial equilibrium and Markov approximations
Advances in Continuous and Discrete Models volume 2024, Article number: 19 (2024)
Abstract
The general kinetic theory of coarse-grained systems is presented in the abstract formalism of communication theory developed by Shannon and Weaver, Khinchin, and Kolmogorov. Martingale theory shows that, under reasonable, general hypotheses, coarse-grained systems can be approximated by generalized Markov systems. For mixing systems, the Kolmogorov entropy production can be defined for nonstationary processes as Kolmogorov defined it for stationary processes.
1 Introduction
The purpose of this article (and the one to follow) is to define a generalized kinetic theory of classical thermodynamic systems at a coarse-grained level (see Sect. 2 for definitions). The microscopic evolution of the system induces an evolution on the coarse-grained states, which is generally non-Markovian.
In the same context, it has been shown recently [1] that coarse-grained deterministic dynamical systems can be approximated by generalized Markov systems, which may explain why Markov processes are so popular in modeling actual phenomena. These conclusions were obtained by applying and extending some pioneering results of Kolmogorov [2–4]. The formalism used in our previous works was relatively intuitive, even if sometimes lengthy, but it was sufficient for our first aim. However, we had to adopt some hypotheses that could seem reasonable but were difficult to justify precisely.
In the present work, we adopt a more abstract and rigorous formalism and show that the previous results can be generalized to a much broader framework, as mentioned above. This formalism is the one used in communication theory by Shannon and Weaver [5] and by Khinchin [6] to define optimal coding, and by Kolmogorov [2, 3] (and also [4] for a pedagogical exposition) to define entropic invariants of dynamical systems. It was also introduced in [7] in the Markovian situation only.
The system evolution is specified by a stationary distribution on the path space \(X^{Z}\), where X is the finite set of coarse-grained states, and Z represents the discrete time (see Sect. 2). At the coarse-grained level, the stationary evolution is not Markovian, but the advantage is that the evolution takes place on the finite state space X and that we avoid all controversial discussions concerning ergodicity and time scales for reaching equilibrium [8, 9].
Sections 2 and 3 fix notations and definitions and give basic examples. Section 4 introduces nonstationary processes: the initial condition is not the stationary state on X, but the evolution is given by the stationary process. It corresponds to the notion of partial equilibrium of Landau-Lifschitz [10]. We define the entropy production of both processes and show in Sect. 5 that they are equal, assuming a mixing property. Although this result seems obvious, its proof is quite lengthy.
In Sect. 6, we address the main question of kinetic theory, namely why the evolution can be approximated by a Markovian evolution, as in the theory of Brownian motion, the Fokker-Planck equation, etc. Obviously, one also has to use a coarse-grained time scale. We define various Markovian evolutions and prove that they approximate the exact evolution on the coarse-grained time scale using the production of relative entropy.
We want to dedicate this article to the memory of Prof. Mark Kac, who introduced one of us to the problems of justification of the Markov processes in statistical mechanics.
2 Notations and definitions
In this article, X denotes a finite set. Elements of X are denoted as \(x\in X\). Z is the set of all integers (positive, zero, or negative).
2.1 The spaces \(X^{Z}\), X
\(X^{Z}\) is the space of sequences (\(x(n)\)), \(n\in Z\), \(x(n) \in X\).
We define the shift τ: \(X^{Z} \rightarrow X^{Z}\) by
Let \(I = \{ i_{1}, \ldots, i_{l} \}\) be a finite subset of Z ordered by \(i_{1} < i_{2} < \cdots < i_{l}\). We define for \(k \in Z\)
\(X^{I}\) denotes the set of maps \(x(I) : I \to X\).
In expanded notations, we write
with
The shift is also defined as \(\tau : X^{I} \to X^{I + 1}\) by
for \(x(I)\) given by (2.3).
If \(J \subset I \) is a subset of I, \(x(I) \vert _{J} \) is the restriction of the map \(x(I)\) to the subset J.
Finally, if \(m \leq n\), and \(I = \{ m, m + 1, \ldots, n \}\) is the interval of integers between m and n, we denote
2.2 Probabilities on \(X^{\boldsymbol{Z}}\)
A stochastic process on X is the data of a system of probabilities \(p_{I}\) on \(X^{I}\) for all finite subsets \(I \subset Z \), with the compatibility conditions:
Obviously, the \(p_{I}\) are known as soon as the \(p_{[m,n]}\) are known.
It is known that a system of probabilities \(p_{I}\) satisfying the compatibility conditions of Eq. (2.6) defines a probability p on \(X^{Z}\), the \(p_{I}\) being the marginal laws of p.
This result is the Kolmogorov extension theorem [11]. The probability p is defined on the measurable subsets of \(X^{Z}\). By definition, the subset appearing in p in Eq. (2.7) is measurable.
The stochastic process p is stationary if for any measurable set \(A \subset X^{Z}\)
or equivalently, for any \(x(I) \in X^{I}\), and any I
In particular, if p is stationary, it defines a unique probability distribution \(p_{0}\) on X by
which is independent of n.
Remark
(Conventions and definition)
(i) In order to simplify the notations, we shall skip the index I of \(p_{I}\) whenever it is clear that we refer to \(p_{I}\). For instance, we write \(p(x(I))\) instead of \(p_{I}(x(I))\).
(ii) When we use conditional probabilities, the condition is always in the past: for instance, if \(m < n\)
(iii) According to the usual definition [4], the stochastic process p is ergodic if any measurable set \(B \subset X^{Z}\) that is invariant under τ has probability \(p(B)= 0\) or 1.
2.3 Coarse-graining
Let A be a partition of X: the elements \(a\in A\) are subsets \(a\subset X\) such that
A probability q on X generates a probability \(q^{(A)}\) on A by
The partition A on X induces the partitions \(A^{Z}\) of \(X^{Z}\) and \(A^{I}\) of \(X^{I}\). The stochastic process p induces a stochastic process \(p^{(A)}\) defined by
where the notation \(x(I) \in a(I)\) means
\(p^{(A)}\) is a coarse-grained process of p. If p is stationary, \(p^{(A)}\) is stationary. If p is ergodic, \(p^{(A)}\) is ergodic.
Such coarse-grained processes are extensively used in physics and applied sciences when observations are too inaccurate to distinguish two different elements x belonging to the same subset a of A [1, 12].
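As a minimal illustration of this subsection (our own toy example; the states, grains, and probabilities are invented for the sketch), the induced probability \(q^{(A)}\) simply sums q over each grain:

```python
# Toy sketch of coarse-graining (illustrative data, not from the article):
# a partition A of X and the induced probability q^(A) on A.
X = ["x1", "x2", "x3", "x4"]
A = {"a": ["x1", "x2"], "b": ["x3", "x4"]}          # partition of X
q = {"x1": 0.1, "x2": 0.3, "x3": 0.4, "x4": 0.2}    # probability on X

def coarse_grain(q, A):
    """q^(A)(a) is the sum of q(x) over the x belonging to the grain a."""
    return {a: sum(q[x] for x in grain) for a, grain in A.items()}

qA = coarse_grain(q, A)
```

Here \(q^{(A)}(a) = 0.4\) and \(q^{(A)}(b) = 0.6\); the same summation over grains, applied to path probabilities, gives the coarse-grained process \(p^{(A)}\).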
3 Examples
We only cite a few well-known processes that are of interest to us.
(a) Bernoulli processes
Let μ be a probability on X. The Bernoulli process defined by μ is
It is stationary. It is ergodic if and only if \(\mu (x) > 0\) for any \(x\in X\).
(b) Markov processes
Let \(R = (R_{yx})\), \(y,x \in X \) be a stochastic matrix, so
Let μ be a stationary probability for R:
Then, we define a stochastic process by
where
This process is stationary. It is ergodic if and only if R is irreducible. The Bernoulli process (a) is a particular case when \(R_{yx} = \mu (x)\).
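A small numerical sketch of example (b) (our own; the matrix is invented, and we read \(R_{yx}\) as the probability of jumping to x from y, an indexing assumption chosen so that \(R_{yx} = \mu (x)\) recovers the Bernoulli process):

```python
import numpy as np

# Illustrative Markov process (not from the article). We read R[y, x] as
# p(next = x | current = y), so the stochastic-matrix condition is
# sum_x R[y, x] = 1 and stationarity reads sum_y mu(y) R[y, x] = mu(x).
R = np.array([[0.9, 0.1],
              [0.2, 0.8]])

# stationary probability mu: left eigenvector of R for eigenvalue 1
w, v = np.linalg.eig(R.T)
mu = np.real(v[:, np.argmax(np.real(w))])
mu = mu / mu.sum()

def path_prob(path, mu, R):
    """p(x(m), ..., x(n)) = mu(x(m)) * prod of transition factors."""
    p = mu[path[0]]
    for y, x in zip(path, path[1:]):
        p *= R[y, x]
    return p

# stationarity of the process: the one-time marginal at time 1 is again mu
marg1 = np.array([sum(path_prob([y, x], mu, R) for y in range(2))
                  for x in range(2)])
```

For this matrix \(\mu = (2/3, 1/3)\), and the marginal at any later time is again μ, as stationarity requires.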
(c) Dynamical systems
These systems are of special interest to physics (see [1] and Remark 2 below). Let \((M, \mathcal{M}, \mu )\) be a probability space so that M is a measurable space with a σ-algebra \(\mathcal{M} \) of measurable subsets and μ a probability defined on \(\mathcal{M}\).
Let f: \(M\rightarrow M\) be a measurable bijection, which is measure-preserving, namely
Let X be a finite partition of M in measurable subsets. We define a coarse-grained stochastic process on X by the formula
where \(x[m,n] \in X^{[m,n]}\) is given by Eq. (3.5).
Then \(p_{[m,n]} (x[m,n])\) is the measure of the subset of elements \(z\in M\) with
This process is stationary. It is ergodic if f is ergodic (i.e., if the only measurable subset \(\mathcal{B}\) of M invariant by f is of measure \(\mu (\mathcal{B}) = 0\) or 1).
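A numerical sketch of this construction (our own example; the map, partition, and sample size are invented): the baker's map on the unit square is a Lebesgue-measure-preserving bijection, and coarse-graining it by the two vertical halves yields a stationary process, in fact the fair-coin Bernoulli process, which we check on pair frequencies \(p(x(0), x(1)) \approx 1/4\):

```python
import random

# Coarse-grained dynamical system (illustrative): the baker's map
# (x, y) -> (2x mod 1, (y + floor(2x)) / 2) preserves Lebesgue measure.
# The partition X = {left half, right half} of the square gives a
# two-state coarse-grained process.
def baker(z):
    x, y = z
    if x < 0.5:
        return (2 * x, y / 2)
    return (2 * x - 1, (y + 1) / 2)

def cell(z):
    """Coarse-grained state: 0 on the left half, 1 on the right half."""
    return 0 if z[0] < 0.5 else 1

random.seed(0)
N = 100_000
counts = {(i, j): 0 for i in (0, 1) for j in (0, 1)}
for _ in range(N):
    z = (random.random(), random.random())   # mu-distributed initial point
    counts[(cell(z), cell(baker(z)))] += 1
freqs = {pair: c / N for pair, c in counts.items()}
```

All four pair frequencies come out close to 1/4, the Bernoulli(1/2, 1/2) prediction.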
Remark 1
This definition, due to Kolmogorov, was introduced to define new invariants of dynamical systems [4].
Remark 2
A particularly interesting example [4] is the case when M is a phase space, and f is a Hamiltonian map (i.e., the map given by the solution of the Hamilton equation at a given time) and μ is the volume on M, which is preserved by f because of the Liouville theorem.
4 Changing initial conditions: definition of the nonstationary process p̅. Production of entropy
4.1 Definition of a particular nonstationary process p̅
Let A be a partition of X. The elements of A are subsets \(a \subset X\) satisfying Eqs. (2.5)–(2.10).
Let p be a stationary process on X, and q be a probability on A. These two data determine a process on X, that is, a probability p̅ on \(X^{N}\), given by the formulas
Here
Definitions (4.1)–(4.2) show that the distributions \(\overline{p}_{[0,n]}\) and \(p_{[m,n]}\) satisfy the compatibility conditions and define a probability on \(X^{N}\), hence a stochastic process (indexed by the integers ≥ 0) on X. This stochastic process is nonstationary (indeed, being indexed by the integers ≥ 0, stationarity under the shift is not even meaningful).
The initial distribution is
and the distribution at time n is
where \((x[0, n  1], x_{n})\) denotes the path \(( x(0), 0; x(1), 1; \ldots; x(n  1), n  1; x_{n}, n )\).
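A sketch of this construction under our reading of Eqs. (4.1)–(4.2): p̅ reweights each stationary path by \(q(a)/p(a)\), where a is the grain containing \(x(0)\), so that the initial law is \(q(a)p(x)/p(a)\) inside each grain (partial equilibrium). This reading is an assumption, chosen to be consistent with the bound \(C = \max q(a)/p(a)\) of Appendix B; the chain, partition, and q below are invented:

```python
import numpy as np

# Illustrative nonstationary process pbar: stationary Markov chain p on
# X = {0, 1, 2}, partition A, and a nonstationary initial law q on A.
R = np.array([[0.9, 0.1, 0.0],
              [0.05, 0.8, 0.15],
              [0.0, 0.3, 0.7]])       # row-stochastic: R[y, x] = p(x | y)
w, v = np.linalg.eig(R.T)
p0 = np.real(v[:, np.argmax(np.real(w))])
p0 = p0 / p0.sum()                    # stationary one-time law p(x)

A = {"a": [0], "b": [1, 2]}           # partition of X
q = {"a": 0.7, "b": 0.3}              # initial probability on A

grain = {x: a for a, cell in A.items() for x in cell}
pA = {a: sum(p0[x] for x in cell) for a, cell in A.items()}

def pbar0(x):
    """Initial distribution of pbar: q(a) p(x) / p(a) for x in grain a."""
    a = grain[x]
    return q[a] * p0[x] / pA[a]

init = np.array([pbar0(x) for x in range(3)])
```

Each grain carries total mass q(a), distributed inside the grain proportionally to the stationary law p.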
Convention
As previously mentioned, we skip the indices I for \(\overline{p}_{I}\) when there is no possible confusion.
Lemma 4.1
The conditional probabilities of the process p̅ with the condition starting at time 0 are identical to the corresponding conditional probabilities of the process p, so, for \(0< k \leq n\)
The proof is obvious using (4.1).
4.2 Entropy and relative entropy
If Z is a finite set with \(\vert Z \vert \) elements, and if p, q are probabilities on Z, we define the entropy of p and the relative entropy of p and q by the usual formulas [7]
One has
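In the standard conventions (natural logarithms), which we assume match the relative entropy of Eq. (4.7), these definitions and the classical bounds \(0 \leq S(p) \leq \ln \vert Z \vert \) and \(S(p \vert q) \geq 0\) read as follows (a minimal sketch with invented distributions):

```python
import math

# Usual formulas (in nats): entropy S(p) = -sum p ln p and relative
# entropy S(p|q) = sum p ln(p/q), with the conventions 0 ln 0 = 0.
def entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def relative_entropy(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.25, 0.25]
q = [1 / 3, 1 / 3, 1 / 3]
```

The uniform distribution saturates the upper bound \(\ln \vert Z \vert \), and \(S(p \vert q) = 0\) exactly when \(p = q\).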
4.3 Path entropy
For the stationary process, the nonstationary process, and any positive integer n, we define the path entropy \(S_{n}\)
Lemma 4.2
(a) One has the following identities:
where
and the same identities with p̅ instead of p.
(b) One has
Proof
(a) is trivial. On the other hand, one has
which is (4.12). Similarly, we derive (4.13) using Lemma 4.1. □
Lemma 4.3
(a) For the stationary process p, one has the identity
(b) \(d_{k}S_{k}(p)\) is a decreasing sequence with a limit \(s(p)\).
Proof
Using the definition of \(d_{k} S \) (Eq. (4.12)), one has by stationarity of p
□
4.4 The case of a stationary probability p
In this case, we will use a theorem that was first presented in Ref. [1], using the concept of martingale (see, for instance, [13, 14], or [12] for a simplified definition).
Theorem 4.4
For \(x = x(0)\in X\), the sequence of random variables \(p ( x \vert x[-k,-1] )\) is a martingale with respect to the sequence \(\mathcal{F}_{k}\) of σ-algebras generated by \(x([-k,-1])\). Moreover, these random variables are positive and bounded by 1.
Using this theorem, Lemma 4.3(b), and the identity (4.10), one obtains the result of Kolmogorov-Shannon [4, 6]:
Theorem 4.5
For the stationary process p, one has
(a) \(d_{k}S_{k}(p)\) has a limit \(s(p)\) for \(k \rightarrow \infty \)
(b) One has
Definition
\(s(p)\) is the (asymptotic) production of entropy per unit time of the process p.
For completeness, the proofs of Theorems 4.4 and 4.5 are given in Appendix A.
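As a concrete check of this definition (our own example; the chain is invented, with \(R_{yx}\) read as the probability of x given y): for a stationary Markov chain, the increments \(d_{k}S_{k}(p)\) are already constant for \(k \geq 1\), so the limit \(s(p)\) of Theorem 4.5 is the familiar Markov entropy rate \(-\sum_{y} \mu (y) \sum_{x} R_{yx} \ln R_{yx}\):

```python
import itertools
import math

# Illustrative chain on {0, 1}; mu = (2/3, 1/3) is its stationary law.
R = [[0.9, 0.1],
     [0.2, 0.8]]                     # R[y][x] = p(next = x | current = y)
mu = [2 / 3, 1 / 3]

def path_prob(w):
    p = mu[w[0]]
    for y, x in zip(w, w[1:]):
        p *= R[y][x]
    return p

def S(n):
    """Path entropy S_n(p) of the marginal p_[0,n]."""
    return -sum(p * math.log(p)
                for w in itertools.product((0, 1), repeat=n + 1)
                if (p := path_prob(w)) > 0)

# Markov entropy rate -sum_y mu(y) sum_x R_yx ln R_yx
rate = -sum(mu[y] * R[y][x] * math.log(R[y][x])
            for y in range(2) for x in range(2))
d1, d2, d3 = S(1) - S(0), S(2) - S(1), S(3) - S(2)
```

All increments \(d_{k}S_{k}(p)\) coincide with the rate, so the "asymptotic" production of entropy is reached at the first step for a Markov chain.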
Remark 3
In the special case where p comes from a dynamical system (Eq. (3.7)), it is proved in [3] that \(d_{k}S_{k}(p)\) is a decreasing sequence, and there is no need to use martingale theory.
4.5 Nonstationary probability p̅
In the nonstationary case, the production of entropy at time k is \(d_{k}S_{k}(\overline{p} )\). The asymptotic entropy production of p̅ is well defined if \(d_{k}S_{k}(\overline{p} )\) tends to a limit when \(k \rightarrow \infty \). As shown later, further hypotheses are necessary for such a limit to exist. In the general case, we can only prove that
where \(C'\) (resp. \(C''\)) is the lower (resp. upper) bound of \(q(x)/p(x)\) (\(0\leq C' \leq 1 \leq C''\)).
Proof
These inequalities straightforwardly result from Eqs. (4.1)–(4.2) and Theorem 4.4. □
5 Production of entropy for a nonstationary distribution p̅
5.1 Mixing process
We say that the stationary process p is a mixing process if for \(0 < n < n + k\)
In expanded notations, this means that for \(n \rightarrow \infty \)
for any sequence \(x(0), 0; x_{n}, n; x_{n + 1}, n + 1; \ldots; x_{n + k}, n + k\) of \(k+1\) elements.
The mixing property implies ergodicity [4].
Theorem 5.1
If p is mixing, the nonstationary process p̅ defined in Sect. 4 has an asymptotic distribution in X, which is \(p(x)\).
Proof
The asymptotic distribution of p̅ (if it exists) is
But \(p(x(0), 0; x(n), n) \to p(x(0)) p(x)\) by the mixing property. As the sum over \(a\in A\) is finite, the limit in Eq. (5.3) exists, and it is \(p(x)\). □
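Theorem 5.1 can be checked numerically on a small example (our own; the chain is invented, with \(R_{yx}\) read as the probability of x given y): for an irreducible aperiodic Markov chain, \(R^{n}\) converges to the rank-one matrix with rows equal to the stationary law, which is exactly the factorization \(p(x(0), 0; x(n), n) \to p(x(0))\,p(x(n))\):

```python
import numpy as np

# Mixing of an irreducible aperiodic chain (illustrative example).
R = np.array([[0.9, 0.1],
              [0.2, 0.8]])
mu = np.array([2 / 3, 1 / 3])        # stationary: mu @ R = mu

n = 200
Rn = np.linalg.matrix_power(R, n)
joint = mu[:, None] * Rn             # joint[i, j] = p(x(0)=i, x(n)=j)
product = np.outer(mu, mu)           # p(x(0)=i) p(x(n)=j)
```

The second eigenvalue of R is 0.7, so the two-time joint law factorizes up to an error of order \(0.7^{n}\), negligible for n = 200.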
5.2 Production of entropy for p̅ when p is mixing. The main theorem
Theorem 5.2
Assume that p is a mixing process. Then

(a)
\(d_{n}S_{n}(\overline{p})\) has the limit \(s(p)\)

(b)
\(\lim_{n} \frac{1}{n}S_{n}(\overline{p}) = s(p)\)
Then p̅ has a welldefined production of entropy, which is the same as the entropy production of p.
The proof of this basic theorem is the consequence of successive partial results, which are postponed to Sect. 5.4 and completed in Appendix B.
5.3 Production of entropy for a mixing process p
Theorem 5.3
If p is a mixing process, one has
the sum being taken over \(x(0) \in X\) and over \(x[n, n + k - 1] \in X^{[n, n + k - 1]}\).
Proof
The mixing property (5.1) implies that the conditional probability \(p_{[0, n + k]} ( x(n + k) \vert x(0), x[n, n + k - 1] )\) has a limit:
where the limit is taken with a fixed k and fixed \(x(0), x(n),\ldots,x( n + k - 1), x(n + k)\). Then
and all these quantities are uniformly bounded by \(\max_{0 \leq \alpha \leq 1} \vert \alpha \ln \alpha \vert \). As X is finite, we can sum (5.6) on \(x(n+k)\) and obtain
while staying uniformly bounded. By the Lebesgue theorem of dominated convergence in \(L^{1}(X^{Z}, p)\), we have [15]
where \(\mathrm{E}_{p}\{\}\) is the mathematical expectation for the measure p. Now, the first term in Eq. (5.8) is
and the last member in Eq. (5.8) is, using (4.12),
which proves Eq. (5.4). □
Theorem 5.4
If p is a mixing process, for any probability q in X and for the associated process p̅ defined in Sect. 4, one has
where k is fixed and the sum is over \(x(0)\), \(x[n, n+k]\).
Proof
Using the definition of p̅ and Lemma 4.1, one has
where \(\mathbf{1}_{a}\) is the characteristic function of the subset a of X. By the mixing property Eq. (5.1) and Lebesgue theorem of dominated convergence, we have
where we have used (4.12). So, by Eq. (5.10), we have proved Eq. (5.9). □
5.4 Proof of Theorem 5.2
We first derive several successive lemmas.
Lemma 5.5
Let

\(q(x, y, z)\) be a probability distribution on three variables x, y, z taking discrete values,

\(q(x)\) and \(q(x, y)\) the corresponding marginal laws,

\(q(z \vert x)\) and \(q(z \vert x, y)\) the corresponding conditional laws of z.
Denote by \(S_{Z}\) the entropy of the probability distribution of z. Then, we have the identity
Proof
We apply the definitions to the first member to obtain identity (5.12). □
Lemma 5.6
One has the identity
where each summation is over the variables appearing in the concerned probabilities. For instance, the first sum on the left is over \(x(0), x(n),\ldots, x(n+k-1)\).
Proof of Lemma 5.6
We apply Lemma 5.5 to \(q = \overline{p}\) with the substitutions
so that \((x, y) \to x[0, n + k  1]\). □
Lemma 5.7
We have
This lemma implies that the first member of (5.13) tends to 0 when n and k tend to infinity, which may seem intuitive from the definition of mixing. However, a rigorous proof of Lemma 5.7 requires several further steps, as shown in Appendix B. It allows one to complete the proof of the basic Theorem 5.2.
End of the proof of Theorem 5.2
We start with the identity (5.13) of Lemma 5.6. The second term of the right member of this identity is just
We see that the limit when \(n \rightarrow \infty \) of the first term is, using Theorem 5.4, Eq. (5.9)
Now, in the identity (5.13), taking the limits when \(n \rightarrow \infty \) and then \(k \rightarrow \infty \), the first member tends to 0 by Lemma 5.7. Taking the same limits in Eq. (5.16), its first term tends to \(s(p)\). So, the second term of Eq. (5.13) \(d_{n + k}S_{n + k}(\overline{p})\), has a limit, which is \(s(p)\). Thus, we have proved that the nonstationary process p̅ has a production of entropy \(s(\overline{p}) = s(p)\). □
6 Markov approximations
6.1 The process \(p^{(T)}\) of memory T associated to p
In general, the process p has an infinite memory. Let T be a positive integer. We define a process \(p^{(T)}\) on X of memory T associated to the process p by the formulas
Distance between p and \(p^{(T)}\)
An asymmetric “distance” between p and \(p^{(T)}\) for n-step trajectories can be evaluated from the relative entropy of these two processes (see Sect. 6.4 below):
This quantity is related to the total variation distance between \(p_{[0, n]}\) and \(p_{ [0, n]}^{(T)}\), as shown in Sect. 6.4.
Theorem 6.1
For every \(\varepsilon > 0\), there exists a time \(T_{\varepsilon }\) such that for \(n \geq T\geq T_{\varepsilon}\), one has
So, the distance between \(p_{[0,n]}\) and \(p_{ [0, n]}^{(T)}\) tends to 0 when \(n \rightarrow \infty \).
Proof
Using the definition of the relative entropy, Eq. (4.7), one has
On the other hand, it follows from Eq. (4.9) that
and by definition (6.1), if \(n \geq T\)
By the stationarity of p, this is
From Eqs. (6.4), (6.6), and (6.3), we obtain
According to Theorem 4.4, \(d_{k}S_{k}(p)\) decreases when k increases, so each term of the sum in Eq. (6.7) is ≥ 0, and \(d_{k}S_{k}(p) \to s(p)\). Choose \(T_{\varepsilon}\) so that for \(k \geq T \geq T_{\varepsilon }\)
Then, for \(n \geq T \geq T_{\epsilon }\)
which completes the proof of Theorem 6.1. □
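To make Theorem 6.1 concrete, here is a small numerical sketch (entirely our construction: the hidden chain, the partition, and the horizon n are invented). A coarse-grained function of a Markov chain is generally non-Markovian; we compute the exact path law \(p_{[0,n]}\) by a forward sum over hidden states, build the memory-T process \(p^{(T)}\) from its conditional laws as in Eq. (6.1), and check that the relative entropy \(S (p_{[0,n]} \vert p^{(T)}_{[0,n]} )\) is nonnegative and decreases as T grows:

```python
import itertools
import math
import numpy as np

# Hidden chain on {0, 1, 2} (rows = from-state) observed through the
# partition {0} / {1, 2}; the observed 2-symbol process has long memory.
P = np.array([[0.7, 0.3, 0.0],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])
w_, v_ = np.linalg.eig(P.T)
nu = np.real(v_[:, np.argmax(np.real(w_))])
nu = nu / nu.sum()                      # stationary law of the hidden chain
cell = np.array([0, 1, 1])              # observed symbol of each hidden state

def p(word):
    """Exact probability of an observed word (forward algorithm)."""
    f = nu * (cell == word[0])
    for s in word[1:]:
        f = (f @ P) * (cell == s)
    return f.sum()

def pT(word, T):
    """Memory-T law: p(x[0,T]) * prod_k p(x(k) | x[k-T, k-1])."""
    out = p(word[:T + 1])
    for k in range(T + 1, len(word)):
        out *= p(word[k - T:k + 1]) / p(word[k - T:k])
    return out

n = 6
words = list(itertools.product((0, 1), repeat=n))
def rel_ent(T):
    return sum(pw * math.log(pw / pT(w, T))
               for w in words if (pw := p(w)) > 0)

s1, s2 = rel_ent(1), rel_ent(2)
```

In accordance with the telescoping computation behind Eq. (6.7), each relative entropy is a sum of nonnegative terms \(d_{T}S_{T}(p) - d_{k}S_{k}(p)\), so it shrinks as the memory T grows.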
6.2 Partial histories of length T
Definitions
A partial history of length T is an element of \(X^{T}\). The nth history of length T is
If M and N are positive integers (\(M < N\)), a sequence of partial histories is
We also denote by \(\tau ^{(T)}\) the translation of time T on histories of length T.
Theorem 6.2
(a) The process \(p^{(T)}\) induces a Markov process \(\tilde{p}^{(T)}\) on partial histories of length T by the formulas
The transition probabilities between histories of length T are
where, according to Eq. (6.10)
(b) The stationary probability of the Markov process \(\tilde{p}^{(T)}\) is \(\tilde{p}^{(T)}(x^{(T)}(0))\).
(c) The production of entropy of \(\tilde{p}^{(T)}\) is
where \(S ( R( \cdot \vert x^{(T)}(0)) )\) is the entropy of the probability distribution \(x^{(T)}(1) \to R ( x^{(T)}(1) \vert x^{(T)}(0) )\).
Proof
Using the definitions of Eqs. (6.13) and (6.1), we have
We now show that \(\tilde{p}^{(T)}(x^{(T)}(0))\) is the stationary probability. This will prove that (6.16) is the usual formula [16] for a Markov process (Eq. (3.4) in Sect. 3). One has
So, \(\tilde{p}^{(T)}(x^{(T)}(0))\) is indeed the stationary probability of the Markov process \(\tilde{p}^{(T)}\).
On the other hand, Eq. (6.15) is just the usual formula for the entropy production of a Markov process [16]. □
6.3 Comparison of p and \(\tilde{p}^{(T)}\)
The process p induces a stationary process on the partial histories of length T (denoted \(p_{T}\)) by
The process \(p_{T}\) is exactly the same as p, except that it is restricted to an integer number of periods of length T.
The entropy production of \(p_{T}\) is
So, one can rewrite Theorem 6.1 as follows.
Theorem 6.3
Denote by \(S_{T}( \cdot \vert \cdot )\) the relative entropy of two processes defined on histories of length T. Then, for any \(\epsilon > 0\), there exists \(T _{\epsilon}\), independent of N, such that for \(T \geq T_{\epsilon }\) one has
6.4 Distance between \(p_{T}\) and \(\tilde{p}^{(T)}\)
We can interpret this relation as follows. If p, q are probabilities on a finite space Z, the following Pinsker inequality [17] relates the relative entropy of p and q to the total variation distance between p and q:
This shows that \(S (p \vert q)\) represents an asymmetrical distance between p and q. Equation (6.20) implies that the absolute distance between the actual process \(p_{T}\) and the Markov process \(\tilde{p}^{(T)}\), divided by T, goes to 0 for long times T.
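In its standard form (natural logarithms), the Pinsker inequality reads \(\sum_{z} \vert p(z) - q(z) \vert \leq \sqrt{2 S(p \vert q)}\). A quick numerical sanity check on random distributions (our own sketch, with invented data):

```python
import math
import random

# Pinsker inequality check: total variation <= sqrt(2 * relative entropy).
def rel_ent(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def total_variation(p, q):
    return sum(abs(pi - qi) for pi, qi in zip(p, q))

random.seed(1)
def rand_dist(n):
    w = [random.random() for _ in range(n)]
    s = sum(w)
    return [x / s for x in w]

ok = all(total_variation(p, q) <= math.sqrt(2 * rel_ent(p, q)) + 1e-12
         for _ in range(1000)
         for p, q in [(rand_dist(5), rand_dist(5))])
```

The inequality holds for every random pair, illustrating why a small relative entropy forces a small absolute (total variation) distance.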
Theorem 6.4
One has for the production of entropy
Proof
We use the expression of the entropy production of a Markov process (Eq. (6.15))
However, by Theorem 4.5, \(d_{T}S_{T}(p) \to s(p)\) if \(T\rightarrow \infty \), which gives Theorem 6.4. □
6.5 Attenuation of the memory
We come back to the process \(p_{T}\) on histories of length T. We now prove
Theorem 6.5
For a fixed integer N≥1, one has
Proof
We use the definition of relative entropy and decompose the sum of Eq. (6.22) into two terms:
and
Both sums (6.23) and (6.24) contain T terms \(d_{k}S_{k}(p)\) which tend to \(s(p)\). This gives the result (6.22). □
As a consequence, if T is large enough, within a given accuracy ε, it is possible to neglect the distance between the process at time NT with complete history from time 0, and the process with history limited to the last period of length T, between times \((N-1)T\) and NT. In practice, one can neglect the memory after times larger than T.
7 Conclusion
It has been rigorously proved that coarse-graining dynamical systems induces new systems that partially approximate the original systems. This conclusion is often anticipated intuitively in modeling physical or applied phenomena, which most generally needs simplifying and approximating actual observations. Because of its importance, this question has been, for a long time, the matter of many studies (see, for instance, [18] and references therein), but it is difficult to obtain both a general approach and exact results on dynamical problems. Recently, it has been shown that innovative concepts introduced by Kolmogorov some sixty years ago can be combined with martingale theory to yield novel results in this domain. At first, this point of view was applied to classical Hamiltonian systems [1], and a major result was that under appropriate, realistic conditions, coarse-grained systems generate an approximate Markov system. Here, we have seen that the same reasoning applies to much more general, possibly stochastic, processes. Using a purely mathematical formalism, we obtained new, more general conclusions.
In particular, we have proved that the Kolmogorov entropy, introduced by Kolmogorov for ergodic stationary processes [4], also exists for a class of nonstationary processes defined for coarse-grained systems: these processes are obtained by imposing a nonstationary coarse-grained initial probability distribution, whereas the initial conditional distribution remains stationary in each grain. Such nonstationary coarse-grained distributions can be adopted for realistic mesoscopic systems that are initially constrained out of equilibrium, while local equilibrium is re-established almost instantaneously: these approximations are often valid in realistic examples [10, 19], which justifies studying this special class of nonstationary processes. Moreover, it has been proved that the asymptotic entropy production of these nonstationary processes is identical to the entropy production of the microscopic stationary process, provided the latter is mixing. This is our main result, which allows one to approximate a large class of coarse-grained dynamical processes by Markov processes.
Alternatively, within the framework of the previous general theory, a forthcoming article [20] will present further exact results concerning the comparison of different coarse-grainings of dynamical systems that are of interest for modeling Markov and non-Markov processes.
References
Gaveau, B., Moreau, M.: Chaos 30, 083104 (2020)
Kolmogorov, A.N.: Dokl. Akad. Nauk SSSR 119, 861 (1958)
Kolmogorov, A.N.: Dokl. Akad. Nauk SSSR 124, 754 (1959)
Arnold, V.I., Avez, A.: Ergodic Problems of Classical Mechanics. Benjamin, New York (1968)
Shannon, C.E., Weaver, W.: The Mathematical Theory of Communication. University of Illinois Press, Urbana (1949)
Khinchin, A.I.: Mathematical Foundations of Information Theory. Dover, New York (1957)
Gaveau, B., Schulman, L.S.: J. Math. Phys. 37, 3897 (1996)
Huang, K.: Statistical Mechanics. CRC Press, Boca Raton (1987)
Gaveau, B., Schulman, L.S.: Eur. Phys. J. 224, 891 (2015)
Landau, L., Lifschitz, E.: Statistical Physics, 3rd edn. Pergamon, Elmsford (1969)
Bass, R.F.: Stochastic Processes. Cambridge University Press, Cambridge (2011)
Moreau, M., Gaveau, B.: Stochastic theory of coarse-grained deterministic systems: martingales and Markov approximations. In: Carpentieri, B. (ed.) Advances in Dynamical Systems Theory, Models, Algorithms (2021). https://doi.org/10.5772/Intechopen.95903
Doob, J.: Stochastic Processes. Wiley, New York (1953)
Levy, P.: Théorie de l’addition des variables aléatoires. Gauthier-Villars, Paris (1937)
Bartle, R.G.: The Elements of Integration and Lebesgue Measure. WileyInterscience, New York (1995)
Dynkin, E.B.: Theory of Markov Processes. Pergamon, Elmsford (2015)
Cover, T., Thomas, J.: Elements of Information Theory. Wiley, New York (1991)
Achieser, N.I.: Theory of Approximation. Dover Books on Mathematics, Dover (2013)
Reif, F.: Fundamentals of Statistical and Thermal Physics. McGraw-Hill, New York (1965)
Gaveau, B., Moreau, M.: Generalized kinetic theory of coarse-grained systems, entropy and Markov approximations. II: comparison between various coarse-grainings. To be published
Acknowledgements
The authors thank their colleagues of the Laboratory of Theoretical Physics of Condensed Matter, Sorbonne Université, Paris, for their interest and many scientific discussions.
Funding
No funding.
Author information
Authors and Affiliations
Contributions
B.G. developed the mathematical aspects of this joint work. M.M. focused on the physical notion of coarsegraining and a more intuitive vision of various results presented here. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix A: Proofs of Theorems 4.4 and 4.5
Theorem 4.4
For \(x = x(0)\in X\), the sequence of random variables \(p ( x \vert x[-k,-1] )\) is a martingale with respect to the sequence \(\mathcal{F}_{k}\) of σ-algebras generated by \(x([-k,-1])\). Moreover, these random variables are positive and bounded by 1.
Proof
Consider the random variables
Because \(\mathcal{F}_{k - 1} \subset \mathcal{F}_{k}\), we have
where \(\mathrm{E}\{\}\) is the mathematical expectation for the measure p. So, \(\pi _{k}\) is a martingale on the σ-algebra \(\mathcal{F}_{N}\), and by the convergence theorem of martingales [13], it converges almost surely to a limit π when \(k\rightarrow \infty \). □
From the previous theorem, one can deduce the theorem of Kolmogorov-Shannon [4–6].
Theorem 4.5
For the stationary process p, one has

(a)
\(d_{k}S_{k}(p)\) has a limit \(s(p)\) for \(k \rightarrow \infty \)

(b)
One has
$$ \lim_{n \to \infty} \frac{1}{n} S_{n}(p) = s(p) $$(A.1)
Proof
(a) We use Eq. (2.13) and the stationarity of p so that
where
By Theorem 4.4, the sequence of random variables \(p ( x(0) \vert x[-k,-1] )\) is a martingale with respect to the sequence \(\mathcal{F}_{k}\) of σ-algebras generated by \(x([-k,-1])\). Moreover, these random variables are positive and bounded by 1. So, by the martingale convergence theorem [13], this sequence converges p-almost surely to a certain random variable \(p ( x(0) \vert x[-\infty ,-1] )\) (as well as in any Lebesgue space \(L^{r}(X^{Z},p)\) for \(1\leq r < +\infty \)). Furthermore, \(p ( x(0) \vert x[-k,-1] ) \ln p ( x(0) \vert x[-k,-1] )\) also converges p-almost surely, as does the finite sum \(S ( p( \cdot \vert x[-k,-1]) )\) over \(x(0)\in X\), while staying uniformly bounded. By the Lebesgue theorem of dominated convergence [15], the integral over p of these random variables converges, so \(d_{k} S_{k}(p)\) converges.
(b) \(\frac{1}{n} S_{n}(p)\) is, up to the term \(\frac{1}{n} S(p_{0})\), the arithmetic mean of the first n differences \(d_{k}S_{k}(p)\) (Eq. (4.10)). □
Appendix B: Proofs of Lemma 5.7 and Theorem 5.2
Using the definition of p̅ and Lemma 4.1 for the conditional entropy, we have
where \(C = \max_{a} q(a)/p(a)\).
To prove Lemma 5.7, we prove
Lemma 5.8
One has
Proof of Lemma 5.8
We split the sum in the first term of (5.16) into two terms
with
and
Lemma 5.8 is proved from the next two Lemmas. □
Lemma 5.9
We have
Proof of Lemma 5.9
By the stationarity of p, we have
The martingale \(p ( x(0) \vert x[-n-k,-1] )\) is uniformly bounded, converges p-almost surely, and is integrable, so that, by the convergence theorem of martingales [13],
□
Lemma 5.10
One has
Proof of Lemma 5.10
We have
Now, we have seen in Eq. (5.8) that this converges when \(n\rightarrow \infty \) to
So,
Eqs. (B.8) and (B.9) prove Lemma 5.10. □
This concludes the proof of Theorem 5.2.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Gaveau, B., Moreau, M. Generalized kinetic theory of coarse-grained systems. I. Partial equilibrium and Markov approximations. Adv Cont Discr Mod 2024, 19 (2024). https://doi.org/10.1186/s1366202403810x