Bash regexp .* matches too far [duplicate]

This question already has an answer here:

My regex is matching too much. How do I make it stop? 5 answers

I have a file input.txt with the following content:

foo [assembly: AssemblyVersion("1.2.3")] bar")] quux

To extract 1.2.3 from the input, I use the following script:

    #!/bin/bash
    regex='\[assembly: AssemblyVersion\("(.*)"\)\]'
    fileContent=$(cat input.txt)
    [[ "$fileContent" =~ $regex ]]
    echo "${BASH_REMATCH[1]}"

I would expect the output to be 1.2.3 but it is:

1.2.3")] bar

Why is that, and how can I fix it?

The regular expressions tester at https://regex101.com works as expected.

Best answer

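The .* is a greedy dot-matching subpattern: it matches as many characters as it can, and the dot accepts any character, including " and ) and newlines. The capture therefore runs on to the last ")] in the input instead of stopping at the first one.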

The usual way to limit the greediness is a negated character class: [^"] matches any character except ", which works as long as there are no quotes inside the quoted string:

    '\[assembly: AssemblyVersion\("([^"]*)"\)\]'
                                    ^^^^^

Or, if there can be no ( or ) inside the quoted string:

    '\[assembly: AssemblyVersion\("([^()]*)"\)\]'
                                    ^^^^^^
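
For reference, here is a minimal sketch of the question's script using the first pattern. To keep it self-contained, the sample line is assigned to fileContent directly instead of being read from input.txt, and the version variable and the if/else check are additions for illustration. Bash's =~ uses POSIX extended regular expressions, which do not define lazy quantifiers such as .*?, so the negated class is what bounds the capture here.

```shell
#!/bin/bash
# Sample line copied from the question; the original script reads it from input.txt.
fileContent='foo [assembly: AssemblyVersion("1.2.3")] bar")] quux'

# [^"]* cannot cross a double quote, so the capture ends at the first closing quote.
regex='\[assembly: AssemblyVersion\("([^"]*)"\)\]'

# Only read BASH_REMATCH when the match succeeded (illustrative addition).
if [[ "$fileContent" =~ $regex ]]; then
    version="${BASH_REMATCH[1]}"
    echo "$version"
else
    echo "no match" >&2
fi
```

With the sample line above this prints 1.2.3. Keeping $regex unquoted on the right-hand side of =~ matters: quoting it would make Bash treat the pattern as a literal string.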
