Inner Join a Table to Itself

I have a table with two identifying columns; let's call them id and userid. id is unique for every record, while userid identifies a user and appears in many records.

What I need to do is get a record for a user by userid and then join that record to the first record we have for that user. The logic of the query is as follows:

    SELECT v1.id, MIN(v2.id) AS entryid, v1.userid
    FROM views v1
    INNER JOIN views v2
        ON v1.userid = v2.userid

I'm hoping I don't have to join the table to a subquery that handles the MIN() part of the query, as that seems to be quite slow.

Best Answer

I guess (it's not entirely clear) that you want to find, for every user, the row of the table that has the minimum id, so one row per user.

In that case, you can use a subquery (a derived table) and join it to the table:

    SELECT v.*
    FROM views AS v
      JOIN
        ( SELECT userid, MIN(id) AS entryid
          FROM views
          GROUP BY userid
        ) AS vm
        ON  vm.userid = v.userid
        AND vm.entryid = v.id ;
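To try the derived-table query on a few rows, here is a minimal sketch using Python's built-in sqlite3 module. The sample data is made up, the table holds only the two columns named in the question, and SQLite stands in for whatever database you actually use.

```python
import sqlite3

# Hypothetical sample data as (id, userid) pairs.
rows = [(1, 10), (2, 20), (3, 10), (4, 30), (5, 20), (6, 10)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE views (id INTEGER PRIMARY KEY, userid INTEGER NOT NULL)")
conn.executemany("INSERT INTO views (id, userid) VALUES (?, ?)", rows)

# Join the table to a derived table holding each user's smallest id.
first_rows = conn.execute(
    """
    SELECT v.id, v.userid
    FROM views AS v
      JOIN
        ( SELECT userid, MIN(id) AS entryid
          FROM views
          GROUP BY userid
        ) AS vm
        ON  vm.userid = v.userid
        AND vm.entryid = v.id
    ORDER BY v.userid
    """
).fetchall()

print(first_rows)  # [(1, 10), (2, 20), (4, 30)]
```

Each user comes back exactly once, paired with the lowest id recorded for that user.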

The above can also be written using a Common Table Expression (CTE), if you like them:

    ; WITH vm AS
      ( SELECT userid, MIN(id) AS entryid
        FROM views
        GROUP BY userid
      )
    SELECT v.*
    FROM views AS v
      JOIN vm
        ON  vm.userid = v.userid
        AND vm.entryid = v.id ;

Both would be quite efficient with an index on (userid, id).
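As a rough illustration of that index, the sketch below creates it in an in-memory SQLite database through Python's sqlite3 module and reads back its column order. The index name ix_views_userid_id is made up for this example, and the two-column table definition is an assumption; the CREATE INDEX statement itself is standard SQL and should carry over to other engines with little or no change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE views (id INTEGER PRIMARY KEY, userid INTEGER NOT NULL)")

# Composite index: userid first (the grouping/join key), then id (the MIN target).
conn.execute("CREATE INDEX ix_views_userid_id ON views (userid, id)")

# Inspect what was created: index names on the table, then the indexed columns in order.
index_names = [r[1] for r in conn.execute("PRAGMA index_list('views')")]
index_columns = [r[2] for r in conn.execute("PRAGMA index_info('ix_views_userid_id')")]

print(index_names)    # ['ix_views_userid_id']
print(index_columns)  # ['userid', 'id']
```

The column order matters: with userid leading, the smallest id for each user sits at the start of that user's range in the index.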

With SQL Server, you could also write this using the ROW_NUMBER() window function:

    ; WITH viewsRN AS
      ( SELECT *,
               ROW_NUMBER() OVER (PARTITION BY userid ORDER BY id) AS rn
        FROM views
      )
    SELECT *          --- skipping the "rn" column
    FROM viewsRN
    WHERE rn = 1 ;
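The ROW_NUMBER() approach can be tried the same way. This is a sketch under a few assumptions: it runs on SQLite through Python's sqlite3 module rather than on SQL Server, it needs a SQLite build with window-function support (3.25 or later), and the sample rows are made up. The outer SELECT names its columns explicitly, which is one way to leave rn out of the result.

```python
import sqlite3

# Hypothetical sample data as (id, userid) pairs.
rows = [(1, 10), (2, 20), (3, 10), (4, 30), (5, 20), (6, 10)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE views (id INTEGER PRIMARY KEY, userid INTEGER NOT NULL)")
conn.executemany("INSERT INTO views (id, userid) VALUES (?, ?)", rows)

# Number each user's rows by ascending id, then keep only row number 1.
first_rows_rn = conn.execute(
    """
    WITH viewsRN AS
      ( SELECT id, userid,
               ROW_NUMBER() OVER (PARTITION BY userid ORDER BY id) AS rn
        FROM views
      )
    SELECT id, userid      -- columns listed explicitly, so "rn" is skipped
    FROM viewsRN
    WHERE rn = 1
    ORDER BY userid
    """
).fetchall()

print(first_rows_rn)  # [(1, 10), (2, 20), (4, 30)]
```

On this data it returns the same rows as the MIN(id) join, one per user.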
