Computing the mean square displacement of a 2d random walk in Python

I'm simulating a 2-dimensional random walk, with direction 0 < θ < 2π and T=1000 steps. I already have a code which simulates a single walk, repeats it 12 times, and saves each run into sequentially named text files:

a=np.zeros((1000,2), dtype=np.float)
print a                                   # Prints array with zeros as entries

# Single random walk
def randwalk(x,y):              # Defines the randwalk function
    theta=2*math.pi*rd.rand() 
    x+=math.cos(theta);
    y+=math.sin(theta);
    return (x,y)                # Function returns new (x,y) coordinates

x, y = 0., 0.                   # Starting point is the origin
for i in range(1000):           # Walk contains 1000 steps
    x, y = randwalk(x,y)
    a[i,:] = x, y               # Replaces entries of a with (x,y) coordinates

# Repeating random walk 12 times
fn_base = "random_walk_%i.txt"      # Saves each run to sequentially named .txt
for j in range(12):
    rd.seed()                       # Uses different random seed for every run
    x, y = 0., 0.
    for i in range(1000):
        x, y = randwalk(x,y)
        a[i,:] = x, y
    fn = fn_base % j                # Allocates fn to the numbered file
    np.savetxt(fn, a)               # Saves run data to appropriate text file

Now I want to calculate the mean square displacement over all 12 walks. To do this, my initial thought was to import the data from each text file back into a numpy array, eg:

infile="random_walk_0.txt"
rw0dat=np.genfromtxt(infile)
print rw0dat

And then somehow manipulate the arrays to find the mean square displacement.

Is there a more efficient way to go about finding the MSD with what I have?


ANSWERS:


Here is a quick snipet to compute the mean square displacement (MSD). Where path is made of points equally spaced in time, as it seems to be the case for your randwalk. You can just place in the 12-walk for loop and compute it for each a[i,:]

#input path =[ [x1,y1], ... ,[xn,yn] ].

def compute_MSD(path):
   totalsize=len(path)
   msd=[]
   for i in range(totalsize-1):
       j=i+1
       msd.append(np.sum((path[0:-j]-path[j::])**2)/float(totalsize-j))

   msd=np.array(msd)
   return msd

First, you don't actually need to store the whole 1000-step walk, just the final position.

Also, there's no reason to store them out to textfiles and load them back, you can just use them in-memory—just put them in a list of arrays, or in an array of 1 more dimension. Even if you need to write them out, you can do that as well as keeping the final values, instead of in place of. (Also, if you're not actually using numpy for performance or simplicity in building the 2D array, you might want to consider building it iteratively, e.g., using the csv module, but that one's more of a judgment call.)

At any rate, given your 12 final positions, you just calculate the distance of each one from (0, 0), then square that, sum them all, and divide by 12. (Or, since the obvious way to compute the distance from (0, 0) is to just add the squares of the x and y positions and then squareroot the result, just skip the squareroot and square at the end.)

But if you want to store each whole walk into a file for some reason, then after you load them back in, walk[-1] gives you the final position as a 1D array of 2 values. So, you can either read those 12 final positions into a 12x2 array and vectorize the mean square distance, or just accumulate them in a list and do it manually.

While we're at it, the rd.seed() isn't necessary; the whole point of a PRNG is that you continue to get different numbers unless you explicitly reset the seed to its original value to repeat them.

Here's an example of dropping the two extra complexities and doing everything directly:

destinations = np.zeros((12, 2), dtype=np.float)
for j in range(12):
    x, y = 0., 0.
    for i in range(1000):
        x, y = randwalk(x, y)
    destinations[j] = x, y

square_distances = destinations[:,0] ** 2 + destinations[:,1] ** 2
mean_square_distance = np.mean(square_distances)


 MORE:


 ? how to avoid overlaps during midpoint displacement? (2D)
 ? Mean square displacement python
 ? Computing mean square displacement using python and FFT
 ? Mean Squared error in Python
 ? How to create a fenced field for the drunken walk
 ? Python: matplotlib: coloring lines in a random walk by jump size
 ? Vectorizing a time series ensemble generator in R
 ? Function used to derive continuous series of random numbers
 ? Using SubPlots With LineCollection
 ? Random Walk with MPI: Why are my messages getting lost?