How to compare, match, and append multiple values in multiple CSV files?

I'm trying to figure out the best way to do this, and I'm not sure how to Import-Csv with 2 different files through the same pipeline and export a value found...

So let's start with CSV file 1: I only want the values for LoginNumber where Type = H and ContractorDomain is neither $null nor empty (ContractorDomain -ne $null -and ContractorDomain -ne ""). For example, this should only pull values 0031482 and 2167312 from below.

Note: I only added spaces and arrows to make it easier to read as columns here. The csv files have no spaces between the column values or arrows.

"LoginNumber","Type","ContractorDomain"
"0031482"    ,"H"   ,"P12345"  <<
"1251632"    ,"P"   ,"A52671"
"2167312"    ,"H"   ,"425126"  <<
"0598217"    ,"L"   ,""
"1405735"    ,"H"   ,""
"2058194"    ,"A"   ,"L21514"
 

When a LoginNumber value (based on the conditions explained above) is found, search for it in the UserIDNumber column of CSV file 2. Then grab the values of AccountStatus and SamAccountName from the matching row.

"SamAccountName","UserIDNumber","AccountDescriptionDetails","AccountStatus"
"jd12395"       ,"0052142"     ,"Company CEO"              ,"Enabled"
"jwet"          ,"2167312"     ,"Software Developer"       ,"Disabled"  <<
"1b3gas5"       ,"1385293"     ,"Project Manager"          ,"Disabled"
"632g1fsa"      ,"0031482"     ,"QA Tester"                ,"Enabled"   <<
"4126hs"        ,"0000418"     ,"Program Manager"          ,"Disabled"
"axv"           ,"1840237"     ,"Accountant Administrator" ,"Disabled"
 

For the 3rd CSV file we have the following:

"domainName","SameAccountName","DateExpired"
"TempDomain","jwet"           ,"20151230"    <<
"PermDomain","p21942"         ,""
"PermDomain","qz231034"       ,""
"TempDomain","632g1fsa"       ,"20151231"    <<
"TempDomain","ru20da2bb22"    ,"20160425"
 

Next, for the 3rd file, I want to add the column to plug in the Disabled and Enabled values (or User Match Not Found value):

"domainName","SameAccountName","DateExpired","UserStatus"
"TempDomain","jwet"           ,"20151230"   ,"Disabled"               <<
"PermDomain","p21942"         ,""           ,"User Match Not Found"
"PermDomain","qz231034"       ,""           ,"User Match Not Found"
"TempDomain","632g1fsa"       ,"20151231"   ,"Enabled"                <<
"TempDomain","ru20da2bb22"    ,"20160425"   ,"User Match Not Found"
 

I learned how to import-csv and create new columns with something like this...

Import-Csv $file | Select-Object -Property *, @{Name="UserStatus";Expression={ if ($true) {"fill value in here"} }} | Export-Csv $newFile -NoType

So I'm thinking something like this. I'm just not sure how to search/find/pass multiple CSV files values through the pipeline.

Note: some of these CSV files have like 15 columns before and after the columns we are searching for. Also, some of the columns values have a comma, so I can't really rely on the -Delimiter ,. Also, some of the column values do not have " (if you were to open the CSV in txt format).

Best answer

Columns containing commas shouldn't be an issue if the values are properly quoted (i.e. if the CSV is valid). Import-Csv will correctly import a record 42,"a,b",c as three values: "42", "a,b" and "c". Unquoted values are fine too, as long as they don't contain the delimiter. If your CSV isn't well-formed: fix that first.
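The quoting behaviour can be checked without touching any file. This is a minimal sketch using ConvertFrom-Csv on an in-memory string; the header names A, B and C are made up for illustration and are not part of the original data.

```powershell
# Hypothetical header names; the record itself is the one quoted in the answer.
$record = '42,"a,b",c' | ConvertFrom-Csv -Header 'A', 'B', 'C'

$record.A   # 42
$record.B   # a,b  (the comma inside the quotes stays part of the value)
$record.C   # c
```

Import-Csv uses the same parser, so a quoted field with an embedded comma should arrive as a single column value there as well.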

Fetch the login IDs from the first CSV file:

$logins = Import-Csv 'C:\path\to\file1.csv' | Where-Object { $_.Type -eq 'H' -and $_.ContractorDomain } | Select-Object -Expand LoginNumber

You can simplify the ContractorDomain property check to just $_.ContractorDomain, because PowerShell interprets both an empty string and $null as a boolean value $false in that context. The same would happen for other zero or empty values (0, 0.0, empty array, etc.), but that shouldn't be an issue in your scenario.
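One caveat with the truthiness shortcut: Import-Csv returns every field as a string, and any non-empty string counts as $true, including one that contains only whitespace. The sketch below compares the shortcut with a stricter check. The row with LoginNumber 9999999 is invented for this illustration, and [string]::IsNullOrWhiteSpace assumes .NET 4 or later (PowerShell 3+).

```powershell
# Sample rows; the third one (whitespace-only ContractorDomain) is hypothetical.
$rows = @'
"LoginNumber","Type","ContractorDomain"
"0031482","H","P12345"
"1405735","H",""
"9999999","H"," "
'@ | ConvertFrom-Csv

# Truthiness check as used in the answer: drops the empty string only.
$truthy = $rows |
    Where-Object { $_.Type -eq 'H' -and $_.ContractorDomain } |
    Select-Object -ExpandProperty LoginNumber

# Stricter variant: also drops values that consist of whitespace only.
$strict = $rows |
    Where-Object { $_.Type -eq 'H' -and -not [string]::IsNullOrWhiteSpace($_.ContractorDomain) } |
    Select-Object -ExpandProperty LoginNumber

$truthy   # 0031482, 9999999
$strict   # 0031482
```

If the real files never contain whitespace-only fields, the shorter check from the answer is enough.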

Next create a hashtable mapping account names to their respective status. Filter the imported second CSV by the list of IDs you created before, so the hashtable contains only relevant mappings.

$accountStatus = @{}
Import-Csv 'C:\path\to\file2.csv' |
    Where-Object { $logins -contains $_.UserIDNumber } |
    ForEach-Object { $accountStatus[$_.SamAccountName] = $_.AccountStatus }
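To see what the first two steps produce, here is a sketch that runs them against the sample rows from the question. It uses ConvertFrom-Csv on here-strings instead of files, which is an assumption made only to keep the example self-contained.

```powershell
# Sample data copied from the question (alignment spaces and arrows removed).
$file1 = @'
"LoginNumber","Type","ContractorDomain"
"0031482","H","P12345"
"1251632","P","A52671"
"2167312","H","425126"
"0598217","L",""
"1405735","H",""
"2058194","A","L21514"
'@ | ConvertFrom-Csv

$file2 = @'
"SamAccountName","UserIDNumber","AccountDescriptionDetails","AccountStatus"
"jd12395","0052142","Company CEO","Enabled"
"jwet","2167312","Software Developer","Disabled"
"1b3gas5","1385293","Project Manager","Disabled"
"632g1fsa","0031482","QA Tester","Enabled"
"4126hs","0000418","Program Manager","Disabled"
"axv","1840237","Accountant Administrator","Disabled"
'@ | ConvertFrom-Csv

# Step 1: login numbers of type H with a non-empty ContractorDomain.
$logins = $file1 |
    Where-Object { $_.Type -eq 'H' -and $_.ContractorDomain } |
    Select-Object -ExpandProperty LoginNumber

# Step 2: map SamAccountName -> AccountStatus for those logins only.
$accountStatus = @{}
$file2 |
    Where-Object { $logins -contains $_.UserIDNumber } |
    ForEach-Object { $accountStatus[$_.SamAccountName] = $_.AccountStatus }

$accountStatus   # jwet = Disabled, 632g1fsa = Enabled (hashtable order is not guaranteed)
```

Because the values stay strings, leading zeros such as in 0031482 are preserved and the -contains comparison matches them exactly.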

With that hashtable you can now add the UserStatus column to your third CSV:

(Import-Csv 'C:\path\to\file3.csv') |
    Select-Object -Property *, @{n='UserStatus';e={
        if ($accountStatus.ContainsKey($_.SameAccountName)) {
            $accountStatus[$_.SameAccountName]
        } else {
            'User Match Not Found'
        }
    }} |
    Export-Csv 'C:\path\to\file3.csv' -NoType

The parentheses around the Import-Csv statement ensure that the file is completely read and closed before Export-Csv starts writing to it. They're only required if you're writing the modified data back to the same file and can be omitted otherwise. The asterisk selects all imported columns, and the additional calculated property adds the new column you want to include.
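The read-then-overwrite pattern can be tried safely on a throwaway file. The sketch below assumes a temporary file name (file3-sample.csv, made up here) and a hard-coded lookup table standing in for the hashtable built in the previous step.

```powershell
# Temporary stand-in for file3.csv; the file name is an assumption for this demo.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'file3-sample.csv'
@'
"domainName","SameAccountName","DateExpired"
"TempDomain","jwet","20151230"
"PermDomain","p21942",""
"TempDomain","632g1fsa","20151231"
'@ | Set-Content -Path $path

# Lookup table as it would come out of the previous step.
$accountStatus = @{ 'jwet' = 'Disabled'; '632g1fsa' = 'Enabled' }

# The parentheses force the whole file to be read before Export-Csv reopens it.
(Import-Csv $path) |
    Select-Object -Property *, @{n='UserStatus'; e={
        if ($accountStatus.ContainsKey($_.SameAccountName)) { $accountStatus[$_.SameAccountName] }
        else { 'User Match Not Found' }
    }} |
    Export-Csv $path -NoTypeInformation

# Read the rewritten file back, then clean up the temporary file.
$result = Import-Csv $path
Remove-Item $path
$result | Format-Table -AutoSize
```

A hashtable created with the @{} literal compares string keys case-insensitively, so a SameAccountName of JWET would still find the jwet entry. Writing to a different output path first and comparing the two files is a cautious option before overwriting real data.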
