如何用 SQL 求出「都关注了某100人」和「都收藏了某100帖子」的所有用户?
用关系除法试试
CREATE TEMPORARY TABLE 要求聊天室表(room varchar(255));
INSERT INTO 要求聊天室表 VALUES('公共聊天室'), ('文件上传');
CREATE TEMPORARY TABLE 要求论坛表(forum_id int);
INSERT INTO 要求论坛表 SELECT id FROM hu60_bbs_forum_meta WHERE name IN ('超级灌水', '网页插件');
SELECT hu60_user.uid, hu60_user.name
FROM (
SELECT DISTINCT uid
FROM hu60_addin_chat_data AS ES1
WHERE NOT EXISTS (
SELECT *
FROM 要求聊天室表
WHERE NOT EXISTS (
SELECT *
FROM hu60_addin_chat_data AS ES2
WHERE ES2.uid = ES1.uid
AND ES2.room = 要求聊天室表.room ))
) AS 除以要求聊天室表
JOIN (
SELECT DISTINCT uid
FROM hu60_bbs_topic_meta AS ES1
WHERE NOT EXISTS (
SELECT *
FROM 要求论坛表
WHERE NOT EXISTS (
SELECT *
FROM hu60_bbs_topic_meta AS ES2
WHERE ES2.uid = ES1.uid
AND ES2.forum_id = 要求论坛表.forum_id ))
) AS 除以要求论坛表 USING(uid)
JOIN hu60_user USING(uid)
GROUP BY uid;
并不快
@老虎会游泳,用这个 php 随机生成两张约 2亿行的大表吧
分别输出至 STDOUT 和 STDERR,可以在 shell 中从这两处地方导入至两表
mt_srand(strtotime('2022-08-11 20:00:00'));
foreach ([STDOUT, STDERR] as $fp)
for ($uid = 1; $uid <= 10000000; ++$uid)
for ($i = mt_rand(15, 25); $i > 0; --$i)
fprintf($fp, "%d,%d\n", $uid, mt_rand(0, 255));
可按索引需要调换字段顺序(反正数字一致):
fprintf($fp, "%d,%d\n", mt_rand(0, 255), $uid);
@老虎会游泳,我换其他语言试试了,要不生成一次太久,想换下范围(如 0~255 -> 0~ 65535),重新生成一次估计都得半个钟
Python 花了三分多钟
from datetime import datetime
from random import seed, randint
seed(int(datetime.fromisoformat('2022-08-11 20:00:00').timestamp()))
for uid in range(1, 10000000 + 1):
for i in range(randint(15, 25)):
print(f'{uid},{randint(0, 255)}')
运行:
$ time python3 /mnt/c/Users/wuxun/PycharmProjects/pythonProject/main.py | wc -c
2291797342
real 3m42.934s
user 3m42.000s
sys 0m2.875s
@老虎会游泳,卧槽!!tcc
牛逼啊!直接运行 C
源码,速度都能这么快!
randint
范围有点问题,修复了
$ time tcc -run - <<EOF | wc -c
#include <time.h>
#include <stdio.h>
#include <stdlib.h>
static inline int randint(int min, int max) {
return rand() % (max - min + 1) + min;
}
int main() {
srand(mktime(&(struct tm){
.tm_year = 2022 - 1900,
.tm_mon = 8 - 1,
.tm_mday = 11,
.tm_hour = 20,
.tm_min = 0,
.tm_sec = 0,
}));
for (int uid = 1; uid <= 10000000; ++uid)
for (int i = randint(15, 25); i > 0; --i)
printf("%d,%d\n", uid, randint(0, 255));
}
EOF
结果:
2291774656
real 0m22.888s
user 0m22.547s
sys 0m4.516s
试试这样写
和上面的写法没有差别,可能最终的执行方式是一样的。