What is meant by a complete description of a stochastic process? Well, mathematically, a stochastic process is a collection $\{X(t) \colon t \in T\}$ of random variables, one for each time instant $t$ in an index set $T$, where usually $T$ is the entire real line or the positive real line, and a complete description means that for every integer $n \geq 1$ and $n$ time instants $t_1, t_2, \ldots, t_n \in T$, we know the (joint) distributions of the $n$ random variables $X(t_1), X(t_2), \ldots, X(t_n)$. This is an enormous amount of information: we need to know the CDF of $X(t)$ for every time instant $t$, the (two-dimensional) joint CDF of $X(t_1)$ and $X(t_2)$ for all choices of time instants $t_1$ and $t_2$, the (three-dimensional) joint CDFs of $X(t_1)$, $X(t_2)$, and $X(t_3)$, and so on.
So, naturally, people looked for simpler descriptions and more restrictive models. One simplification occurs when the process is invariant to a shift of the time origin. What this means is that
- All the random variables in the process have identical CDFs: $F_{X(t_1)}(x) = F_{X(t_2)}(x)$ for all $t_1, t_2$.
- Any two random variables separated by a specified amount of time have the same joint CDF as any other pair of random variables separated by the same amount of time. For example, the random variables $X(t_1)$ and $X(t_1+\tau)$ are separated by $\tau$ seconds, as are the random variables $X(t_2)$ and $X(t_2+\tau)$, and thus $F_{X(t_1),X(t_1+\tau)}(x,y) = F_{X(t_2),X(t_2+\tau)}(x,y)$.
- Any three random variables $X(t_1)$, $X(t_1+\tau_1)$, $X(t_1+\tau_1+\tau_2)$ spaced $\tau_1$ and $\tau_2$ apart have the same joint CDF as $X(t_2)$, $X(t_2+\tau_1)$, $X(t_2+\tau_1+\tau_2)$, which are also spaced $\tau_1$ and $\tau_2$ apart,
- and so on for all the multidimensional CDFs. See, for example, Peter K.'s answer for details of the multidimensional case.
Effectively, the probabilistic descriptions of the random process do not depend on what we choose to call the origin of the time axis: shifting all the time instants $t_1, t_2, \ldots, t_n$ by some fixed amount $\tau$ to $t_1+\tau, t_2+\tau, \ldots, t_n+\tau$ gives the same probabilistic description of the random variables. This property is called strict-sense stationarity, and a random process that enjoys this property is called a strictly stationary random process or, more simply, a stationary random process.
Note that strict stationarity by itself does not require any particular form of CDF. For example, it does not say that all the variables are Gaussian.
The adjective strictly suggests that it is possible to define a looser form of stationarity. If the $N^{\text{th}}$-order joint CDF of $X(t_1), X(t_2), \ldots, X(t_N)$ is the same as the $N^{\text{th}}$-order joint CDF of $X(t_1+\tau), X(t_2+\tau), \ldots, X(t_N+\tau)$ for all choices of $t_1, t_2, \ldots, t_N$ and $\tau$, then the random process is said to be stationary to order $N$ and is referred to as an $N^{\text{th}}$-order stationary random process. Note that an $N^{\text{th}}$-order stationary random process is also stationary to order $n$ for every positive $n < N$. (This is because the $n^{\text{th}}$-order joint CDF is the limit of the $N^{\text{th}}$-order CDF as $N-n$ of the arguments approach $\infty$: a generalization of $F_X(x) = \lim_{y\to\infty} F_{X,Y}(x,y)$.) A strictly stationary random process then is a random process that is stationary to all orders $N$.
If a random process is stationary to (at least) order $1$, then all the $X(t)$'s have the same distribution and so, assuming the mean exists, $E[X(t)] = \mu$ is the same for all $t$. Similarly, $E[(X(t))^2]$ is the same for all $t$, and is referred to as the power of the process.
All physical processes have finite power, and so it is common to assume that $E[(X(t))^2] < \infty$, in which case, and especially in the older engineering literature, the process is called a second-order process. The choice of name is unfortunate because it invites confusion with second-order stationarity (cf. this answer of mine on stats.SE), and so here we will call a process for which $E[(X(t))^2]$ is finite for all $t$ (whether or not $E[(X(t))^2]$ is a constant) a finite-power process and avoid this confusion. But note again that a first-order stationary process need not be a finite-power process.
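To make that last point concrete: an i.i.d. sequence is stationary to all orders, yet if the common marginal is heavy-tailed (e.g. Cauchy), the power $E[(X(t))^2]$ is infinite. A minimal numerical sketch (the Gaussian and Cauchy marginals here are illustrative choices, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)

# Finite-power case: i.i.d. standard Gaussian samples, E[X^2] = 1,
# so the empirical second moment settles near 1 as the sample grows.
gauss = rng.standard_normal(100_000)
gauss_power = np.mean(gauss ** 2)
print("Gaussian empirical power:", gauss_power)

# Counterexample: i.i.d. standard Cauchy samples form a strictly
# stationary process, but E[X^2] is infinite, so the empirical second
# moment never settles down -- it is dominated by occasional huge samples.
cauchy = rng.standard_cauchy(100_000)
for n in (1_000, 10_000, 100_000):
    print(n, "samples:", np.mean(cauchy[:n] ** 2))
```

The Gaussian estimate converges; the Cauchy "estimate" keeps jumping no matter how long the record is.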
Consider a random process that is stationary to order $2$. Now, since the joint distribution of $X(t_1)$ and $X(t_1+\tau)$ is the same as the joint distribution of $X(t_2)$ and $X(t_2+\tau)$, $E[X(t_1)X(t_1+\tau)] = E[X(t_2)X(t_2+\tau)]$ and the value depends only on $\tau$. These expectations are finite for a finite-power process and their value is called the autocorrelation function of the process: $R_X(\tau) = E[X(t)X(t+\tau)]$ is a function of $\tau$, the time separation of the random variables $X(t)$ and $X(t+\tau)$, and does not depend on $t$ at all. Note also that
$$E[X(t)X(t+\tau)] = E[X(t+\tau)X(t)] = E[X(t+\tau)X((t+\tau)-\tau)] = R_X(-\tau),$$
and so the autocorrelation function is an even function of its argument.
A finite-power second-order stationary random process has the properties that

- Its mean $E[X(t)]$ is a constant
- Its autocorrelation function $R_X(\tau) = E[X(t)X(t+\tau)]$ is a function of $\tau$, the time separation of the random variables $X(t)$ and $X(t+\tau)$, and does not depend on $t$ at all.
The assumption of stationarity simplifies the description of a random process to some extent but, for engineers and statisticians interested in building models from experimental data, estimating all those CDFs is a nontrivial task, particularly when there is only a segment of one sample path (or realization) $x(t)$ on which measurements can be made. Two measurements that are relatively easy to make (because the engineer already has the necessary instruments on his workbench, or programs in MATLAB/Python/Octave/C++ in his software library) are the DC value $\frac{1}{T}\int_0^T x(t)\,\mathrm dt$ of $x(t)$ and the autocorrelation function $R_x(\tau) = \frac{1}{T}\int_0^T x(t)x(t+\tau)\,\mathrm dt$ (or its Fourier transform, the power spectrum of $x(t)$). Taking these measurements as estimates of the mean and the autocorrelation function of a finite-power process leads to a very useful model that we discuss next.
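These two time-average measurements are easy to sketch in code. Below, a hypothetical sample path (a random-phase cosine plus weak white noise — an illustrative assumption, not from the text) is sampled and the DC value and autocorrelation are estimated exactly as defined above:

```python
import numpy as np

rng = np.random.default_rng(1)
dt = 0.01
t = np.arange(0.0, 2000.0, dt)          # observation window of length T = 2000
# Hypothetical realization: random-phase cosine plus weak white noise.
x = np.cos(t + rng.uniform(0, 2 * np.pi)) + 0.1 * rng.standard_normal(t.size)

# DC value: (1/T) * integral_0^T x(t) dt, approximated by the sample mean.
dc = x.mean()

# R_x(tau) = (1/T) * integral_0^T x(t) x(t + tau) dt at lag tau = k * dt.
def autocorr(x, k):
    return np.mean(x[: x.size - k] * x[k:]) if k else np.mean(x * x)

print("DC value       :", dc)                # close to 0 for this path
print("R_x(0) (power) :", autocorr(x, 0))    # about 0.5 + 0.1**2
print("R_x(3.14)      :", autocorr(x, 314))  # about 0.5*cos(pi) = -0.5
```

Note that both numbers are computed from a single realization; whether such time averages match the ensemble mean and autocorrelation is exactly the modeling leap discussed next.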
A finite-power random process is called a wide-sense-stationary (WSS) process (also a weakly stationary random process, which fortunately has the same initialism WSS) if it has a constant mean and its autocorrelation function $R_X(t_1, t_2) = E[X(t_1)X(t_2)]$ depends only on the time difference $t_1 - t_2$ (or $t_2 - t_1$).
Note that the definition says nothing about the CDFs of the random
variables comprising the process; it is entirely a constraint on the
first-order and second-order moments of the random variables. Of course, a finite-power second-order stationary (or $N^{\text{th}}$-order stationary (for $N > 2$) or strictly stationary) random process is a WSS process, but the converse need not be true.
A WSS process need not be stationary to any order.
Consider, for example, the random process
$$\{X(t) \colon X(t) = \cos(t + \Theta), -\infty < t < \infty\}$$
where $\Theta$ takes on four equally likely values $0, \pi/2, \pi$, and $3\pi/2$. (Do not be scared: the four possible sample paths of this random process are just the four signal waveforms of a QPSK signal.)
Note that each $X(t)$ is a discrete random variable that, in general, takes on four equally likely values $\cos(t)$, $\cos(t+\pi/2) = -\sin(t)$, $\cos(t+\pi) = -\cos(t)$, and $\cos(t+3\pi/2) = \sin(t)$. It is easy to see that, in general, $X(t)$ and $X(s)$ have different distributions, and so the process is not even first-order stationary. On the other hand,
$$E[X(t)] = \tfrac14\cos(t) + \tfrac14(-\sin(t)) + \tfrac14(-\cos(t)) + \tfrac14\sin(t) = 0$$
for every $t$ while
$$E[X(t)X(s)] = \tfrac14\bigl[\cos(t)\cos(s) + (-\cos(t))(-\cos(s)) + \sin(t)\sin(s) + (-\sin(t))(-\sin(s))\bigr] = \tfrac12\bigl[\cos(t)\cos(s) + \sin(t)\sin(s)\bigr] = \tfrac12\cos(t-s).$$
In short, the process has zero mean and its autocorrelation function depends only on the time difference $t - s$, and so the process is wide-sense stationary. But it is not first-order stationary and so cannot be
stationary to higher orders either.
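The ensemble averages worked out above can be checked numerically by summing directly over the four equally likely phases; a short sketch (the particular time instants tested are arbitrary):

```python
import numpy as np

# X(t) = cos(t + Theta), with Theta uniform over {0, pi/2, pi, 3*pi/2}.
thetas = np.array([0.0, np.pi / 2, np.pi, 3 * np.pi / 2])

def mean_X(t):
    # E[X(t)]: average over the four equally likely phase values.
    return np.mean(np.cos(t + thetas))

def corr_X(t, s):
    # E[X(t)X(s)]: average of the product over the four phases.
    return np.mean(np.cos(t + thetas) * np.cos(s + thetas))

# Arbitrary time pairs: the mean is always 0 and the correlation
# always equals 0.5*cos(t - s), matching the derivation above.
for t, s in [(0.3, 1.0), (2.0, 2.7), (-1.5, 4.0)]:
    print(t, s, mean_X(t), corr_X(t, s), 0.5 * np.cos(t - s))
```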
Even for WSS processes that are second-order stationary (or strictly stationary) random processes, little can be
said about the specific forms of the distributions of the random variables. In short,
A WSS process is not necessarily stationary (to any order), and the mean and autocorrelation function of a WSS process are not enough to give a complete statistical description of the process.
Finally, suppose that a stochastic process is assumed to be a Gaussian process ("proving" this with any reasonable degree of confidence is not a trivial task).
This means that for each $t$, $X(t)$ is a Gaussian random variable and for all positive integers $n \geq 2$ and choices of $n$ time instants $t_1, t_2, \ldots, t_n$, the $n$ random variables $X(t_1), X(t_2), \ldots, X(t_n)$ are jointly Gaussian random variables. Now a joint Gaussian density function is completely determined by the means, variances, and covariances of the random variables, and in this case, knowing the mean function $\mu_X(t) = E[X(t)]$ (it need not be a constant as is required for wide-sense stationarity) and the autocorrelation function $R_X(t_1, t_2) = E[X(t_1)X(t_2)]$ for all $t_1, t_2$ (it need not depend only on $t_1 - t_2$ as is required for wide-sense stationarity) is sufficient to determine the statistics of the process completely.
If the Gaussian process is a WSS process, then
it is also a strictly stationary Gaussian process. Fortunately
for engineers and signal processors, many physical noise processes
can be well-modeled as WSS Gaussian processes (and therefore strictly
stationary processes), so that experimental observation of the
autocorrelation function readily provides all the joint distributions.
Furthermore, since Gaussian processes retain their Gaussian character as they pass through linear systems, and the output autocorrelation function is related to the input autocorrelation function as
$$R_Y = h * \tilde{h} * R_X$$
(where $h$ is the impulse response of the linear system and $\tilde{h}(t) = h(-t)$), so that the output statistics can also be easily determined, WSS processes in general and WSS Gaussian processes in particular are of great importance in engineering applications.
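The relation $R_Y = h * \tilde{h} * R_X$ can be checked numerically in discrete time. The sketch below assumes a white Gaussian input (so $R_X$ is a unit impulse, and the theory predicts $R_Y[k] = \sum_n h[n]\,h[n+k]$) and an arbitrary illustrative FIR filter:

```python
import numpy as np

rng = np.random.default_rng(2)

# Arbitrary illustrative FIR filter taps (not from the text).
h = np.array([1.0, 0.5, 0.25, -0.3])

# Long white-Gaussian-noise realization filtered through h.
x = rng.standard_normal(2_000_000)
y = np.convolve(x, h, mode="valid")

def autocorr(y, k):
    # Time-average estimate of R_y[k] from one realization.
    return np.mean(y[: y.size - k] * y[k:])

# Predicted R_y[k] = (h * h~)[k], since R_X is a unit impulse.
predicted = [float(np.dot(h[: h.size - k], h[k:])) for k in range(h.size)]
measured = [autocorr(y, k) for k in range(h.size)]
print("predicted:", predicted)
print("measured :", measured)
```

The measured lags agree with $h * \tilde{h}$ up to estimation noise, illustrating how the output second-order statistics follow directly from the filter and the input autocorrelation.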