运用普通SQL或许更难解决的问题是这样的问题:
哪些电影与给定的电影X有相同的艺人?
像平常相同,咱们在这个比如中运用sakila数据库。用SQL(例如PostgreSQL,具体来说)解决这个问题的或许办法是什么?下面的查询给出了每部电影的艺人概略:
SELECT
film_id,
array_agg(actor_id ORDER BY actor_id) actors
FROM film_actor
GROUP BY film_id
它产生的成果是这样的:
|film_id|actors |
|-------|----------------------------------|
|1 |{1,10,20,30,40,53,108,162,188,198}|
|2 |{19,85,90,160} |
|3 |{2,19,24,64,123} |
|4 |{41,81,88,147,162} |
|5 |{51,59,103,181,200} |
|6 |{21,23,62,108,137,169,197} |
|... |... |
请注意,在SQL中,数组的行为就像列表相同,也便是说,它们保持着自己的排序,所以清晰地对数组进行排序对于能够彼此比较艺人是很重要的。现在,咱们想从上面找到所有同享相同艺人集的电影:
WITH t AS (
-- Previous query
SELECT
film_id,
array_agg(actor_id ORDER BY actor_id) actors
FROM film_actor
GROUP BY film_id
)
SELECT
array_agg(film_id ORDER BY film_id) AS films,
actors
FROM t
GROUP BY actors
ORDER BY count(*) DESC, films
成果便是现在:
|films |actors |
|--------|----------------------------------|
|{97,556}|{65} |
|{1} |{1,10,20,30,40,53,108,162,188,198}|
|{2} |{19,85,90,160} |
|{3} |{2,19,24,64,123} |
|{4} |{41,81,88,147,162} |
|{5} |{51,59,103,181,200} |
|{6} |{21,23,62,108,137,169,197} |
|... |... |
所以,咱们可以看到,只要2部电影同享相同的艺人集,这些电影是FILM_ID IN (97, 556)
。(Sakila数据库有点无聊,由于数据集是生成的)。
运用MULTISET比较
虽然上面的内容现已相当酷了,但在这篇文章中,我想展现一下jOOQ 3.15MULTISET
支持的一个鲜为人知的功用,即它们可以彼此比较的现实。
并且正如SQL标准MULTISET
的性质相同,排序是不相关的,所以咱们不需要为这种比较添加任何清晰的ORDER BY
子句。现实上,这并不是100%的不相关。你可以 为投影意图对MULTISET
,所以排序将由jOOQ维护。但是当你在谓词中运用它们时,jOOQ将覆盖你的ORDER BY
子句。
运用jOOQ,咱们可以写:
ctx.select(FILM.FILM_ID, FILM.TITLE)
.from(FILM)
.where(
multiset(
select(FILM_ACTOR.ACTOR_ID)
.from(FILM_ACTOR)
.where(FILM_ACTOR.FILM_ID.eq(FILM.FILM_ID))
).eq(multiset(
select(FILM_ACTOR.ACTOR_ID)
.from(FILM_ACTOR)
.where(FILM_ACTOR.FILM_ID.eq(97L))
))
)
.orderBy(FILM_ID)
.fetch();
这比前一种形式的查询效率低一些,由于它从两个子查询中访问FILM_ACTOR
表,虽然其中只要一个是相关的。运用默许的JSONB
仿真,可以生成以下查询:
SELECT film.film_id, film.title
FROM film
WHERE (
SELECT coalesce(
jsonb_agg(jsonb_build_array(v0) ORDER BY t.v0),
jsonb_build_array()
)
FROM (
SELECT film_actor.actor_id AS v0
FROM film_actor
WHERE film_actor.film_id = film.film_id
) AS t
) = (
SELECT coalesce(
jsonb_agg(jsonb_build_array(v0) ORDER BY t.v0),
jsonb_build_array()
)
FROM (
SELECT film_actor.actor_id AS v0
FROM film_actor
WHERE film_actor.film_id = 97
) AS t
)
ORDER BY film.film_id
我许诺,对于MULTISET
,不需要ORDER BY
子句,对于jOOQ代码来说,这仍然是实在的。然而,在幕后,jOOQ必须按JSON数组的内容排序,以确保两个MULTISET
的值是相同的,无论其次序如何。
其成果是与前面的成果显示的两个ID相同:
+-------+--------------+
|film_id|title |
+-------+--------------+
| 97|BRIDE INTRIGUE|
| 556|MALTESE HOPE |
+-------+--------------+
比较MULTISET_AGG,而不是
如果你喜欢运用连接和GROUP BY
来生成电影的艺人MULTISET
,你也可以用jOOQ来做。这一次,咱们运用的是
-
隐式连接来简化对
FILM.TITLE
的访问,从FILM_ACTOR
- 在
HAVING
子句中的一个MULTISET
谓词,运用MULTISET_AGG
下面是jOOQ的版别:
ctx.select(FILM_ACTOR.FILM_ID, FILM_ACTOR.film().TITLE)
.from(FILM_ACTOR)
.groupBy(FILM_ACTOR.FILM_ID, FILM_ACTOR.film().TITLE)
.having(multisetAgg(FILM_ACTOR.ACTOR_ID).eq(multiset(
select(FILM_ACTOR.ACTOR_ID)
.from(FILM_ACTOR)
.where(FILM_ACTOR.FILM_ID.eq(97L))
)))
.orderBy(FILM_ACTOR.FILM_ID)
.fetch();
后退,生成的SQL看起来像这样:
SELECT film_actor.film_id, alias_75379701.title
FROM film_actor
JOIN film AS alias_75379701
ON film_actor.film_id = alias_75379701.film_id
GROUP BY film_actor.film_id, alias_75379701.title
HAVING jsonb_agg(
jsonb_build_array(film_actor.actor_id) ORDER BY film_actor.actor_id
) = (
SELECT coalesce(
jsonb_agg(jsonb_build_array(v0) ORDER BY t.v0),
jsonb_build_array()
)
FROM (
SELECT film_actor.actor_id AS v0
FROM film_actor
WHERE film_actor.film_id = 97
) AS t
)
ORDER BY film_actor.film_id
请注意隐式连接是如何自动打开的,而HAVING
谓词再次运用了通常的JSONB
仿真,用于MULTISET
和MULTISET_AGG
。
代替办法
在上面的比如中,咱们比较了投射单列的MULTISET
表达式,换句话说,Result<Record1<Long>>
嵌套调集类型。jOOQ将始终确保你的查询类型查看和生成的SQL是正确的。
运用MULTISET
的一个代替办法是运用ARRAY_AGG
和ARRAY
(现在你必须再次清晰ORDER BY
)。用jOOQ:
ctx.select(FILM_ACTOR.FILM_ID, FILM_ACTOR.film().TITLE)
.from(FILM_ACTOR)
.groupBy(FILM_ACTOR.FILM_ID, FILM_ACTOR.film().TITLE)
.having(arrayAgg(FILM_ACTOR.ACTOR_ID)
.orderBy(FILM_ACTOR.ACTOR_ID).eq(array(
select(FILM_ACTOR.ACTOR_ID)
.from(FILM_ACTOR)
.where(FILM_ACTOR.FILM_ID.eq(97L))
.orderBy(FILM_ACTOR.ACTOR_ID)
)))
.orderBy(FILM_ACTOR.FILM_ID)
.fetch();
运用SQL:
SELECT film_actor.film_id, film.title
FROM film_actor
JOIN film
ON film_actor.film_id = film.film_id
GROUP BY film_actor.film_id, film.title
HAVING array_agg(film_actor.actor_id ORDER BY film_actor.actor_id) =
ARRAY (
SELECT film_actor.actor_id
FROM film_actor
WHERE film_actor.film_id = 97
ORDER BY film_actor.actor_id
)
ORDER BY film_actor.film_id