Why is there a 10x speed difference between a Go map and a slice here?

I just solved problem 23 on Project Euler, but I noticed a big performance difference between map[int]bool and []bool.

I have a function that sums the proper divisors of a number:

    func divisorsSum(n int) int {
        sum := 1 // 1 divides every n
        for i := 2; i*i <= n; i++ {
            if n%i == 0 {
                sum += i
                if n/i != i { // add the paired divisor once
                    sum += n / i
                }
            }
        }
        return sum
    }

And then in main I do this:

    func main() {
        start := time.Now()
        defer func() {
            elapsed := time.Since(start)
            fmt.Printf("%s\n", elapsed)
        }()

        n := 28123
        abundant := []int{}
        for i := 12; i <= n; i++ {
            if divisorsSum(i) > i {
                abundant = append(abundant, i)
            }
        }

        // Mark every sum of two abundant numbers in a map.
        sums := map[int]bool{}
        for i := 0; i < len(abundant); i++ {
            for j := i; j < len(abundant); j++ {
                if abundant[i]+abundant[j] > n {
                    break
                }
                sums[abundant[i]+abundant[j]] = true
            }
        }

        sum := 0
        for i := 1; i <= 28123; i++ {
            if _, ok := sums[i]; !ok {
                sum += i
            }
        }
        fmt.Println(sum)
    }

This code takes 450ms on my computer. But if I change main to use a slice of bool instead of a map, like this:

    func main() {
        start := time.Now()
        defer func() {
            elapsed := time.Since(start)
            fmt.Printf("%s\n", elapsed)
        }()

        n := 28123
        abundant := []int{}
        for i := 12; i <= n; i++ {
            if divisorsSum(i) > i {
                abundant = append(abundant, i)
            }
        }

        // Mark every sum of two abundant numbers in a slice indexed by the sum.
        sums := make([]bool, n)
        for i := 0; i < len(abundant); i++ {
            for j := i; j < len(abundant); j++ {
                if abundant[i]+abundant[j] < n {
                    sums[abundant[i]+abundant[j]] = true
                } else {
                    break
                }
            }
        }

        sum := 0
        for i := 0; i < len(sums); i++ {
            if !sums[i] {
                sum += i
            }
        }
        fmt.Println(sum)
    }

Now it takes only 40ms, less than a tenth of the previous running time. I thought maps were supposed to have faster lookups. What explains the performance difference here?

Accepted answer

You can profile your code to confirm, but in general there are two main reasons:

You pre-allocate sums in the second example to its final size. It never has to grow, so there are no reallocations and no extra GC pressure. Try creating the map with the desired size in advance and see how much that improves things.

I don't know the internal implementation of Go's hash map, but in general, random access into an array or slice by integer index is extremely cheap, and a hash table adds overhead on top of that, especially if it hashes the integer keys (which it might do to get a better distribution).
