How to change stringified numbers in a data frame into pure numeric values in R

I have the following data.frame:

employee <- c('John Doe','Peter Gynn','Jolie Hope')
# Note that the salary below is in stringified format.
# In reality there are more such stringified numerical columns.
salary <- as.character(c(21000, 23400, 26800))
df <- data.frame(employee,salary)

The output is:

> str(df)
'data.frame': 3 obs. of 2 variables:
 $ employee: Factor w/ 3 levels "John Doe","Jolie Hope",..: 1 3 2
 $ salary  : Factor w/ 3 levels "21000","23400",..: 1 2 3

What I want to do is convert the values from strings into pure numbers directly in the df variable, while preserving the string names for employee. I tried this, but it doesn't work:

as.numeric(df)

At the end of the day, I'd like to perform arithmetic on these numeric values from df, such as df2 <- log2(df).

Best answer

OK, there are a couple of things going on here:

1. R has two different data types that look like strings: factor and character.
2. You can't modify most R objects in place; you have to change them by assignment.

The actual fix for your example is:

df$salary = as.numeric(as.character(df$salary))
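The question mentions that there are more stringified columns than just salary. A minimal sketch of applying the same fix to several columns at once is given below. The bonus column and the num_cols vector are hypothetical, added only for illustration. stringsAsFactors = TRUE is set explicitly so that the factor columns from the question are reproduced on R 4.0.0 and later, where the default is FALSE.

```r
# Rebuild the example. stringsAsFactors = TRUE reproduces the factor
# columns shown in the question (since R 4.0.0 the default is FALSE).
df <- data.frame(
  employee = c("John Doe", "Peter Gynn", "Jolie Hope"),
  salary   = as.character(c(21000, 23400, 26800)),
  bonus    = as.character(c(1500, 1200, 1800)),  # hypothetical extra column
  stringsAsFactors = TRUE
)

# Names of the columns holding stringified numbers (chosen by hand here).
num_cols <- c("salary", "bonus")

# Convert each column via character, then assign the result back.
df[num_cols] <- lapply(df[num_cols], function(x) as.numeric(as.character(x)))

str(df)

# Arithmetic now works on the numeric columns; employee is left untouched.
df2 <- df
df2[num_cols] <- log2(df[num_cols])
```

Selecting the numeric columns by name keeps employee out of the arithmetic, which is why log2 is applied to df[num_cols] rather than to the whole data frame.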

If you try to call as.numeric on df$salary without converting it to character first, you'd get a somewhat strange result:

> as.numeric(df$salary)
[1] 1 2 3

When R creates a factor, it turns the unique elements of the vector into levels and then stores each element as an integer code that points at its level. Those integer codes, not the original values, are what you see when you convert the factor directly to numeric.
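The relationship between levels and integer codes can be inspected directly. The short sketch below uses a standalone salary factor (not the df from the question) and also shows an alternative idiom, as.numeric(levels(f))[f], which converts the levels once and then indexes them by the codes.

```r
salary <- factor(c("21000", "23400", "26800"))

levels(salary)      # the unique values, stored as strings
as.integer(salary)  # the integer codes that index into the levels

# Converting directly returns the codes (1 2 3), not the salaries.
codes <- as.numeric(salary)

# Going through character recovers the original values.
values <- as.numeric(as.character(salary))

# Equivalent idiom: convert the levels once, then index by the codes.
values2 <- as.numeric(levels(salary))[salary]
```

Both values and values2 hold 21000, 23400 and 26800. The second form can be somewhat faster on long vectors with few levels, since only the levels are parsed.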
