如何用 SQL 求出「都关注了某100人」和「都收藏了某100帖子」的所有用户?
『回复列表(165|显示机器人聊天)』
修改了下结果列表,增加了方法二耗时对比
DISTINCT
耗时都一样user
表,再将方法一换成你的方法后,耗时更像方法二了。。SQL 序号 |
结果 | 第一次耗时/s | 第二次耗时/s | 方法二第一次耗时/s | 方法二第二次耗时/s |
---|---|---|---|---|---|
1 | 61270 | 32.388 | 1.101 | 33.373 | 2.753 |
2 | 333 | 65.031 | 1.728 | 65.338 | 3.485 |
3 | 367 | 65.115 | 1.722 | 65.338 | 3.485 |
4 | 2 | 104.504 | 2.343 | 98.5 | 4.954 |
SQL
user
表CREATE TABLE u(uid INTEGER PRIMARY KEY) WITHOUT ROWID;
INSERT INTO u SELECT value FROM generate_series(1, 10000000);
SQL
测试SELECT count(*) FROM u
WHERE uid IN (SELECT uid FROM a WHERE num = 50)
AND uid IN (SELECT uid FROM b WHERE num = 100)
SQL
测试SELECT count(*) FROM u
WHERE uid IN (SELECT uid FROM a WHERE num = 10)
AND uid IN (SELECT uid FROM a WHERE num = 20)
AND uid IN (SELECT uid FROM b WHERE num = 60)
AND uid IN (SELECT uid FROM b WHERE num = 70)
DISTINCT
SELECT count(*) FROM u
WHERE uid IN (SELECT DISTINCT uid FROM a WHERE num = 30)
AND uid IN (SELECT DISTINCT uid FROM a WHERE num = 40)
AND uid IN (SELECT DISTINCT uid FROM b WHERE num = 80)
AND uid IN (SELECT DISTINCT uid FROM b WHERE num = 90)
SQL
测试SELECT count(*) FROM u
WHERE uid IN (SELECT uid FROM a WHERE num = 200)
AND uid IN (SELECT uid FROM a WHERE num = 210)
AND uid IN (SELECT uid FROM a WHERE num = 220)
AND uid IN (SELECT uid FROM b WHERE num = 230)
AND uid IN (SELECT uid FROM b WHERE num = 240)
AND uid IN (SELECT uid FROM b WHERE num = 250)
嗯,所以我们能不能说,拼接JOIN SELECT WHERE语句是最佳解决方案?
目前看来,SQLite
上成立
所以在“员工信息表”里不断排除范围可能比让其他表互相做JOIN更快。
我原本也是这个思路:根据num
取得uid
,然后不断缩小uid
数量
但仔细想了想,觉得目前这个帖子里,没有一个SQL
是按照这个思路执行的。
因为,所有的表结构,都是(num, uid)
。上一步积累的uid
,没法用到下一步的过滤
(我觉得B
表可以改成(uid, num)
,来利用上A
表查到的uid
结果列表)
(我再想想。。)
不这么干,而直接全部变成(uid, num)
,就等着扫全表,来一个个用户检查“是否有num
”吧
(SQLite
倒是有个“跳跃扫描”,可以在一定程度上不遵循“最左原则”。但这样也是 1kw 次 uid: 1~1kw, num = ?
次匹配,和扫全表比,谁快还真的说不定)