Pass regex to delimiter field in python's csv module or numpy's genfromtxt / loadtxt?

I have tabular data with some strange delimiters (i.e. groups of values separated by commas, with each group separated from the others by tabs):

A,345,567 56 67 test

Is there a clean and clever way of handling multiple delimiters in any of the following: csv module, numpy.genfromtxt, or numpy.loadtxt?

I have found methods such as this, but I'm hoping there is a better solution out there. Ideally I'd like to use genfromtxt with a regex as the delimiter.

Best answer

I'm afraid the answer is no for all three of the packages you asked about: none of them accepts a regex as the delimiter. However, you can simply call replace('\t', ',') (or the reverse) on the file contents before parsing. For example:

    import csv
    from io import StringIO  # Python 2: from StringIO import StringIO

    with open('./file') as fh:
        # Turn every tab into a comma so the data has a single delimiter.
        buffer = StringIO(fh.read().replace('\t', ','))

    reader = csv.reader(buffer)
    for row in reader:
        print(row)
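
If the file is too large to read in one go, or if more than two delimiters are involved, the same idea can be applied line by line with a regex. The following is only a sketch: the helper name normalize_delimiters, the pattern, and the sample rows are made up for illustration, and the sample assumes the groups are separated by real tab characters.

```python
import csv
import re

# Hypothetical pattern: treat a tab or a comma as a field separator.
DELIMITERS = re.compile(r"[\t,]")


def normalize_delimiters(lines, pattern=DELIMITERS, replacement=","):
    """Lazily yield each line with every matched delimiter replaced."""
    for line in lines:
        yield pattern.sub(replacement, line)


# Made-up sample data in the shape described in the question.
sample = [
    "A,345,567\t56\t67\ttest\n",
    "B,1,2\t3\t4\tother\n",
]

# csv.reader accepts any iterable of strings, so the generator can be
# passed in directly; an open file handle would work in place of `sample`.
rows = list(csv.reader(normalize_delimiters(sample)))
for row in rows:
    print(row)
```

As far as I know, numpy.genfromtxt also accepts a generator of strings in place of a file name, so the same generator could probably be passed to it together with delimiter=','. Note that this approach, like the plain replace, breaks if a field legitimately contains one of the delimiter characters.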
