使用python读取.dat文件(Read .dat file using python)

我使用MITgcm进行了模拟,包括内部波浪和水中的粒子移动。 对于每个时间步骤,这个输出结果如下所示:

-9999 0.0000000000000000000 #Time step (0.00000000000000 seconds) 1308.2021183321899 -14.999709364517091 # Particle 1 (X,Z) 1308.2020142528656 -24.999521595698688 # Particle 2 (X,Z) 1308.2018600072618 -34.999345597877536 # . 1308.2016593336587 -44.999185870669805 # . 1308.2014165588744 -54.999046508237896 # . 1308.2011370083103 -64.998931076248894 1308.2008269116873 -74.998842490305705 1308.2004933548124 -84.998782925797485 1308.2001441978532 -94.998753764086956 1308.1997879652938 -104.99875557384759 1308.1994336881464 -114.99878812280582 1308.1990906721119 -124.99885041328211 1308.1987681881285 -134.99894073461562 1308.1984750963150 -144.99905672694641 1308.1982194336249 -154.99919545294702 1308.1980080134056 -164.99935347476733 1308.1978461242272 -174.99952693694112 1308.1977378137256 -184.99971163492469 1308.2000000000000 -195.00000000000000 5232.8000000000002 -15.000038916290352 5232.8000000000002 -25.000064153684303 5232.8000000000002 -35.000089286157163 5232.8000000000002 -45.000114270293523 5232.8000000000002 -55.000139061712051 # Particle 57

其中-9999#号是时间步长(以秒为单位),左列为X位置,右列为Z位置(以米为单位); 每条线都是不同的粒子(除-9999之外)。 因此,对于每个时间步和每个粒子,我们将拥有大量的线条。

我想绘制粒子位置的时间演变。 我该怎么做? 如果这太难了,我会对所有粒子位置的不同时间步长的静态图表感到满意。

非常感谢。

编辑1:我试图做的是这个,但我之前没有显示它,因为它远没有正确:

from matplotlib import numpy import matplotlib.pyplot as plot plot.plot(*np.loadtxt('data.dat',unpack=True), linewidth=2.0)

或这个:

plot.plotfile('data.dat', delimiter=' ', cols=(0, 1), names=('col1', 'col2'), marker='o')

I've got a simulation involving internal waves and moving particles in the water, using the MITgcm. The output of this looks something like this for each time step:

-9999 0.0000000000000000000 #Time step (0.00000000000000 seconds) 1308.2021183321899 -14.999709364517091 # Particle 1 (X,Z) 1308.2020142528656 -24.999521595698688 # Particle 2 (X,Z) 1308.2018600072618 -34.999345597877536 # . 1308.2016593336587 -44.999185870669805 # . 1308.2014165588744 -54.999046508237896 # . 1308.2011370083103 -64.998931076248894 1308.2008269116873 -74.998842490305705 1308.2004933548124 -84.998782925797485 1308.2001441978532 -94.998753764086956 1308.1997879652938 -104.99875557384759 1308.1994336881464 -114.99878812280582 1308.1990906721119 -124.99885041328211 1308.1987681881285 -134.99894073461562 1308.1984750963150 -144.99905672694641 1308.1982194336249 -154.99919545294702 1308.1980080134056 -164.99935347476733 1308.1978461242272 -174.99952693694112 1308.1977378137256 -184.99971163492469 1308.2000000000000 -195.00000000000000 5232.8000000000002 -15.000038916290352 5232.8000000000002 -25.000064153684303 5232.8000000000002 -35.000089286157163 5232.8000000000002 -45.000114270293523 5232.8000000000002 -55.000139061712051 # Particle 57

Where -9999 #number is the time step (in seconds), left column is X position and right column is Z position (in meters); and every line is a different particle (except the -9999 one). So we'll have an enormous amount of lines with something like this for every time step and every particle.

I would like to plot the time-evolution of the position of my particles. How can I do it? If that's too hard, I would be happy with static plots of different time-steps with all particles position.

Thank you so much.

Edit1: What I tried to do is this, but I didn't show it before because it is far from proper:

from matplotlib import numpy import matplotlib.pyplot as plot plot.plot(*np.loadtxt('data.dat',unpack=True), linewidth=2.0)

or this:

plot.plotfile('data.dat', delimiter=' ', cols=(0, 1), names=('col1', 'col2'), marker='o')

最满意答案

我会使用numpy.loadtxt来读取输入,但只是因为后处理也需要numpy。 您可以将所有数据读取到内存中,然后查找分隔线,然后重新整理其余数据以适应您的粒子数量。 下面假定没有任何粒子达到完全 x=-9999 ,这应该是一个合理的(尽管不是万无一失的)假设。

import numpy as np filename = 'input.dat' indata = np.loadtxt(filename, usecols=(0,1)) # make sure the rest is ignored tlines_bool = indata[:,0]==-9999 Nparticles = np.diff(np.where(tlines_bool)[0][:2])[0] - 1 # TODO: error handling: diff(np.where(tlines_bool)) should be constant times = indata[tlines_bool,1] positions = indata[np.logical_not(tlines_bool),:].reshape(-1,Nparticles,2)

上面的代码在每个时间步长为每个粒子的2d位置产生Nt元素数组times和形状的数组position (Nt,Nparticles,2) 。 通过计算粒子的数量,我们可以让numpy确定第一个维度的大小(这就是reshape()的-1索引)。

对于绘图而言,你只需切入你的positions数组来提取你确切需要的东西。 对于2d x数据和2d y数据, matplotlib.pyplot.plot()将自动尝试将输入数组的列绘制为彼此的函数。 以下是您如何使用实际输入数据进行可视化的示例:

import matplotlib.pyplot as plt t_indices = slice(None,None,500) # every 500th time step particle_indices = slice(None) # every particle #particle_indices = slice(None,5) # first 5 particles #particle_indices = slice(-5,None) # last 5 particles plt.figure() _ = plt.plot(times[myslice],positions[myslice,particle_indices,0]) plt.xlabel('t') plt.ylabel('x') plt.figure() _ = plt.plot(times[myslice],positions[myslice,particle_indices,1]) plt.xlabel('t') plt.ylabel('z') plt.figure() _ = plt.plot(positions[myslice,particle_indices,0],positions[myslice,particle_indices,1]) plt.xlabel('x') plt.ylabel('z') plt.show()

x(t)图 z(t)图 z(x)图

每条线对应一个粒子。 前两个图分别显示x和z分量的时间演变,第三个图显示z(x)轨迹。 请注意,您的数据中有很多粒子根本不会移动:

>>> sum([~np.diff(positions[:,k,:],axis=0).any() for k in range(positions.shape[1])]) 15

(这计算每个粒子的两个坐标的时间差异,一个接一个地计算,并计算两个维度中每个差异为0的粒子数,即粒子不移动。)。 这解释了前两个地块中的所有水平线; 这些静止粒子根本不会出现在第三个图中(因为它们的轨迹是单点)。

我故意引入了一点花哨的索引,这可以更轻松地播放数据。 正如您所看到的,索引看起来像这样: times[myslice] , positions[myslice,particle_indices,0] ,其中两个切片都是根据......井, slice 。 你应该看一下文档,但是简短的故事是, arr[slice(from,to,stride)]等价于arr[from:to:stride] ,如果任何变量是None ,那么相应的索引是empty: arr[slice(-5,None)]等价于arr[-5:] ,即它将切片数组的最后5个元素。

因此,如果您使用减少的轨迹数来进行绘图(因为57很多),您可以考虑添加图例(只要matplotlib的默认颜色循环可以区分粒子,那么这种情况才有意义,否则您有设置手动颜色或更改轴的默认颜色周期。 为此,您必须保留从plot返回的句柄:

particle_indices = slice(None,5) # first 5 particles plt.figure() lines = plt.plot(positions[myslice,particle_indices,0],positions[myslice,particle_indices,1]) plt.xlabel('x') plt.ylabel('z') plt.legend(lines,['particle {}'.format(k) for k in range(len(t))]) plt.show()

与传说的例子

I would use numpy.loadtxt for reading input, but only because post-processing would also need numpy. You can read all your data to memory, then find the separator lines, then reshape the rest of your data to fit your number of particles. The following assumes that none of the particles ever reach exactly x=-9999, which should be a reasonable (although not foolproof) assumption.

import numpy as np filename = 'input.dat' indata = np.loadtxt(filename, usecols=(0,1)) # make sure the rest is ignored tlines_bool = indata[:,0]==-9999 Nparticles = np.diff(np.where(tlines_bool)[0][:2])[0] - 1 # TODO: error handling: diff(np.where(tlines_bool)) should be constant times = indata[tlines_bool,1] positions = indata[np.logical_not(tlines_bool),:].reshape(-1,Nparticles,2)

The above code produces an Nt-element array times and an array position of shape (Nt,Nparticles,2) for each particle's 2d position at each time step. By computing the number of particles, we can let numpy determine the size of the first dimension (this iswhat the -1 index in reshape() is for).

For plotting you just have to slice into your positions array to extract what you exactly need. In case of 2d x data and 2d y data, matplotlib.pyplot.plot() will automatically try to plot the columns of the input arrays as a function of each other. Here's an example of how you can visualize, using your actual input data:

import matplotlib.pyplot as plt t_indices = slice(None,None,500) # every 500th time step particle_indices = slice(None) # every particle #particle_indices = slice(None,5) # first 5 particles #particle_indices = slice(-5,None) # last 5 particles plt.figure() _ = plt.plot(times[myslice],positions[myslice,particle_indices,0]) plt.xlabel('t') plt.ylabel('x') plt.figure() _ = plt.plot(times[myslice],positions[myslice,particle_indices,1]) plt.xlabel('t') plt.ylabel('z') plt.figure() _ = plt.plot(positions[myslice,particle_indices,0],positions[myslice,particle_indices,1]) plt.xlabel('x') plt.ylabel('z') plt.show()

x(t) plot z(t) plot z(x) plot

Each line corresponds to a single particle. The first two plots show the time-evolution of the x and z components, respectively, and the third plot shows the z(x) trajectories. Note that there are a lot of particles in your data that don't move at all:

>>> sum([~np.diff(positions[:,k,:],axis=0).any() for k in range(positions.shape[1])]) 15

(This computes the time-oriented difference of both coordinates for each particle, one after the other, and counts the number of particles for which every difference in both dimensions is 0, i.e. the particle doesn't move.). This explains all those horizontal lines in the first two plots; these stationary particles don't show up at all in the third plot (since their trajectory is a single point).

I intentionally introduced a bit fancy indexing which makes it easier to play around with your data. As you can see, indexing looks like this: times[myslice], positions[myslice,particle_indices,0], where both slices are defined in terms of...well, a slice. You should look at the documentation, but the short story is that arr[slice(from,to,stride)] is equivalent to arr[from:to:stride], and if any of the variables is None, then the corresponding index is empty: arr[slice(-5,None)] is equivalent to arr[-5:], i.e. it will slice the final 5 elements of the array.

So, in case you use a reduced number of trajectories for plotting (since 57 is a lot), you might consider adding a legend (this only makes sense as long as the default color cycle of matplotlib lets you distinguish between particles, otherwise you have to either set manual colors or change the default color cycle of your axes). For this you will have to keep the handles that are returned from plot:

particle_indices = slice(None,5) # first 5 particles plt.figure() lines = plt.plot(positions[myslice,particle_indices,0],positions[myslice,particle_indices,1]) plt.xlabel('x') plt.ylabel('z') plt.legend(lines,['particle {}'.format(k) for k in range(len(t))]) plt.show()

example with legend

更多推荐