递归postgres SQL优化(recursive postgres SQL optimization)

我们在rails项目中使用了祖先gem 。 表中约有约800个类别:

db => SELECT id, ancestry FROM product_categories LIMIT 10; id | ancestry -----+------------- 399 | 3 298 | 8/292/294 12 | 3/401/255 573 | 349/572 707 | 7/23/89/147 201 | 166/191 729 | 5/727 84 | 7/23 128 | 7/41/105 405 | 339 (10 rows)

ancestry字段代表记录的“路径”。 我需要的是建立一个地图{category_id => [... all_subtree_ids ...]}

我通过使用这样的子查询解决了这个问题:

SELECT id, ( SELECT array_agg(id) FROM product_categories WHERE (ancestry LIKE CONCAT(p.id, '/%') OR ancestry = CONCAT(p.ancestry, '/', p.id, '') OR ancestry = (p.id) :: TEXT) ) categories FROM product_categories p ORDER BY id

结果

1 | {17,470,32,29,15,836,845,837} 2 | {37,233,231,205,107,109,57,108,28,58, ...}

但问题是此查询运行大约100毫秒,我想知道是否有一种方法来使用WITH recursive来优化它? 我是WITH的新手,所以我的查询挂起postgres :(

** ========= UPD ========= **接受AlexM答案最快,但如果有人感兴趣,这里是递归解决方案:

WITH RECURSIVE a AS (SELECT id, id as parent_id FROM product_categories UNION all SELECT pc.id, a.parent_id FROM product_categories pc, a WHERE regexp_replace(pc.ancestry, '^(\d{1,}/)*', '')::integer = a.id) SELECT parent_id, sort(array_agg(id)) as children FROM a WHERE id <> parent_id group by parent_id order by parent_id;

We are using ancestry gem in our rails project. There are about ~800 categories in the table:

db => SELECT id, ancestry FROM product_categories LIMIT 10; id | ancestry -----+------------- 399 | 3 298 | 8/292/294 12 | 3/401/255 573 | 349/572 707 | 7/23/89/147 201 | 166/191 729 | 5/727 84 | 7/23 128 | 7/41/105 405 | 339 (10 rows)

ancestry field represents "path" of record. What I need is to build a map { category_id => [... all_subtree_ids ... ]}

I solved this by using subqueries like this:

SELECT id, ( SELECT array_agg(id) FROM product_categories WHERE (ancestry LIKE CONCAT(p.id, '/%') OR ancestry = CONCAT(p.ancestry, '/', p.id, '') OR ancestry = (p.id) :: TEXT) ) categories FROM product_categories p ORDER BY id

which results in

1 | {17,470,32,29,15,836,845,837} 2 | {37,233,231,205,107,109,57,108,28,58, ...}

BUT the problem is this query runs about 100ms and I wonder if there's a way to optimize it using WITH recursive? I'm novice in WITH so my queries just hang the postgres :(

** ========= UPD ========= ** accepted AlexM answer as fastest, but if any one interested, here's recursive solution:

WITH RECURSIVE a AS (SELECT id, id as parent_id FROM product_categories UNION all SELECT pc.id, a.parent_id FROM product_categories pc, a WHERE regexp_replace(pc.ancestry, '^(\d{1,}/)*', '')::integer = a.id) SELECT parent_id, sort(array_agg(id)) as children FROM a WHERE id <> parent_id group by parent_id order by parent_id;

最满意答案

尝试这种方法,我认为它应该比嵌套查询快得多:

WITH product_categories_flat AS ( SELECT id, unnest(string_to_array(ancestry, '/')) as parent FROM product_categories ) SELECT parent as id, array_agg(id) as children FROM product_categories_flat GROUP BY parent

Try this approach, I think it should be much faster than nested queries:

WITH product_categories_flat AS ( SELECT id, unnest(string_to_array(ancestry, '/')) as parent FROM product_categories ) SELECT parent as id, array_agg(id) as children FROM product_categories_flat GROUP BY parent

更多推荐