Random Walk Model and Establishing False Cause and Effect Relationships

royce crowd

Historical Notes and Applications of Random Walk Models:

Random Walk models are useful in many areas of the physical and social sciences.  A random walk is a model in which an uncertain random element changes at specified intervals.  These changes accumulate and are added to the starting value, creating a seemingly random series of numbers whose applications are as varied as they are vast.  Here is a list of some of the applications where Random Walk models have been especially insightful:

  • Economics: Modeling stock share prices
  • Population Genetics:  Analyzing genetic drift
  • Mathematical Ecology: Modeling a foraging animal's position
  • World War II:  Used to model an escaping prisoner's movements
  • Brain Research: Modeling neurons firing in the human brain, which is helping researchers understand epilepsy
  • Gambling:  Probability models in lotteries and Las Vegas casinos use Random Walk models in slot machines and to catch cheaters
  • Internet Search Engines:  Google uses a Random Walk Model in its search engine to provide the most relevant search results

The list is long and full of interesting and dynamic problems, and scientists are continuously finding more applications for this fascinating and difficult area of applied mathematics. The Random Walk Model has also been used to show how two randomly generated series can appear to have a relationship when in fact they are completely random.  This problem of spurious regression is especially interesting to statisticians and economists, and one experiment in spurious regression is conducted at the end of this posting.  The random numbers which accumulate in the random walk model have a mean (average) of zero and, when drawn from a symmetric distribution, are just as likely to be positive as negative.  This might suggest that a Random Walk model is stationary, but strangely this is not the case.  The mathematical proof follows.

Mathematical Proof of the Non-Stationary Property of Random Walk Models:

Given a Random Walk model, we want to test the two conditions of a stationary time series: a constant mean and variance, and a covariance that does not depend on time. These can be defined mathematically as:

[Equation images: definitions of the stationarity conditions]
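
In standard notation (an assumption here, since the symbols are not fixed in the text), a series y_t is weakly stationary when the following hold for every time t and every lag k. The symbols μ, σ_y² and γ_k are illustrative names for the constant mean, the constant variance and the lag-k covariance:

```latex
\begin{align}
  E[y_t] &= \mu \\
  \operatorname{Var}(y_t) &= \sigma_y^2 < \infty \\
  \operatorname{Cov}(y_t,\, y_{t+k}) &= \gamma_k \quad \text{(a function of the lag } k \text{ only, not of } t\text{)}
\end{align}
```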

To simplify the proof, it helps to write the random walk model in summation notation. Using the principle of induction, we can prove that the random walk model is equivalent to the following summation:

[Equation images: induction steps leading to the summation form of the random walk]
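
A sketch of the induction argument, assuming the model is written as y_t = y_{t-1} + ε_t with a fixed starting value y_0 and independent random steps ε_t that have mean zero and variance σ² (this notation is an assumption, chosen to match the usual textbook form):

```latex
\begin{align}
  y_1 &= y_0 + \varepsilon_1 \\
  y_2 &= y_1 + \varepsilon_2 = y_0 + \varepsilon_1 + \varepsilon_2 \\
  \text{if } y_{t-1} &= y_0 + \sum_{i=1}^{t-1} \varepsilon_i
  \ \text{ then }\
  y_t = y_{t-1} + \varepsilon_t = y_0 + \sum_{i=1}^{t} \varepsilon_i
\end{align}
```

The first line is the base case and the last line is the inductive step, so the summation form holds for every t ≥ 1.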

Writing the model as a compact sum of strictly linear terms simplifies the investigation of the stationarity conditions.  The next step is to test the time-invariant covariance condition, which is necessary but not sufficient for a stationary time series model. If the covariance depends on time, then we can definitively say that the random walk model fails to be stationary.

Time Series Stationarity Test 1: Time-Invariant Covariance

[Equation images: covariance of the random walk at two points in time]
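
One way to carry out this test, assuming the summation form y_t = y_0 + Σ ε_i with independent steps of mean zero and variance σ² (assumed notation), and using E[y_t] = y_0, which follows from linearity of expectation. For a lag k ≥ 0:

```latex
\begin{align}
  \operatorname{Cov}(y_t,\, y_{t+k})
    &= E\big[(y_t - y_0)(y_{t+k} - y_0)\big] \\
    &= E\Big[\sum_{i=1}^{t} \varepsilon_i \sum_{j=1}^{t+k} \varepsilon_j\Big]
     = \sum_{i=1}^{t} \sum_{j=1}^{t+k} E[\varepsilon_i \varepsilon_j] \\
    &= t\,\sigma^2
\end{align}
```

The last step holds because E[ε_i ε_j] equals σ² when i = j and zero otherwise, and exactly t index pairs have i = j. The covariance depends on t itself and not only on the lag k, so the condition fails.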

Time Series Stationarity Test 2: Constant Mean and Variance

[Equation images: mean and variance of the random walk]
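
Under the same assumed notation (y_t = y_0 + Σ ε_i, with independent steps of mean zero and variance σ²), the mean and variance can be worked out as follows:

```latex
\begin{align}
  E[y_t] &= y_0 + \sum_{i=1}^{t} E[\varepsilon_i] = y_0 \\
  \operatorname{Var}(y_t) &= \sum_{i=1}^{t} \operatorname{Var}(\varepsilon_i) = t\,\sigma^2
\end{align}
```

The mean is constant, but the variance grows in proportion to t. The constant-variance condition therefore fails, which is enough on its own to show that the random walk is not stationary.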

Excel: Random Walk in Excel and FALSE Relationships Visualized

A simple random walk model can be created in Excel.  An initial value can be included or omitted, and the iteration is simply copied down a large range of cells.  The only formula needed is the random number generating formula, as in the picture below.  Excel also regenerates the random numbers in a spreadsheet when you press F9 (or F2 followed by Enter). This feature, together with a graph of two series, demonstrates how Random Walks can create spurious relationships.

[Screenshots: the random walk spreadsheet and its random number formula]
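
For readers without Excel, the same copy-down iteration can be sketched in Python. This is a minimal illustration, not the spreadsheet's exact formula: the function name `random_walk` and the step `random() - 0.5` (a uniform step with mean zero, similar in spirit to `RAND()-0.5`) are assumptions made for the example.

```python
import random


def random_walk(n_steps, start=0.0, seed=None):
    """Return n_steps + 1 values: the start, then the running sum of steps."""
    rng = random.Random(seed)
    values = [start]
    for _ in range(n_steps):
        # Assumed step: uniform on [-0.5, 0.5), so its mean is zero.
        step = rng.random() - 0.5
        # Each new value is the previous value plus a fresh random step,
        # the same rule as copying the formula down a column.
        values.append(values[-1] + step)
    return values


walk = random_walk(1000, start=0.0, seed=42)
print(len(walk), walk[0])
```

Changing the seed plays the role of recalculating the sheet: every seed gives a different path built from the same rule.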

Here one can see how two random time series can lead a researcher to conclude there is a correlation where none exists:

A FALSE strong positive correlation where none exists, because the graphs were generated via the Random Walk Model.

[Chart: two random walk series that appear strongly positively correlated]

A FALSE negative correlation where no correlation exists, because the series are randomly generated by the Random Walk Model.

[Chart: two random walk series that appear negatively correlated]
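
The same effect can be checked numerically rather than by eye. The sketch below generates many pairs of independent series and counts how often the sample correlation looks "strong". Everything in it is an illustrative assumption: Gaussian steps, 500 observations per series, 200 trials, a threshold of 0.5, and the helper names `pearson`, `cumulate` and `share_strong`.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def cumulate(steps):
    """Turn a list of steps into a random walk by running summation."""
    total, out = 0.0, []
    for s in steps:
        total += s
        out.append(total)
    return out


def share_strong(trials, n, walk, seed=0, threshold=0.5):
    """Share of independent pairs whose |correlation| exceeds the threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Two series drawn independently: any correlation is spurious.
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if walk:
            x, y = cumulate(x), cumulate(y)
        if abs(pearson(x, y)) > threshold:
            hits += 1
    return hits / trials


frac_walk = share_strong(200, 500, walk=True)
frac_noise = share_strong(200, 500, walk=False)
print("random walks:", frac_walk, "plain noise:", frac_noise)
```

Pairs of plain noise series should almost never cross the threshold, while pairs of random walks cross it often, even though both kinds of pairs are independent by construction.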