I have several very large tables (over 400,000 rows) that look like this:

+---------+--------+---------------+
| ID      | M1     | M1_Percentile |
+---------+--------+---------------+
| 3684514 | 3.2997 | NULL          |
| 3684515 | 3.0476 | NULL          |
| 3684516 | 2.6499 | NULL          |
| 3684517 | 0.3585 | NULL          |
| 3684518 | 1.6919 | NULL          |
| 3684519 | 2.8515 | NULL          |
| 3684520 | 4.0728 | NULL          |
| 3684521 | 4.0224 | NULL          |
| 3684522 | 5.8207 | NULL          |
| 3684523 | 6.8291 | NULL          |
+---------+--------+---------------+
...about 400,000 more rows

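I need to assign each row a value in the M1_Percentile column that represents "the percentage of rows whose M1 value is equal to or lower than the current row's M1 value".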

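In other words, for each row I need: (number of rows with M1 <= this row's M1) / (total number of rows) * 100.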

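I have implemented this successfully, but it is FAR, FAR too slow. If anyone could come up with a more efficient version of the following code, I would really appreciate it!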


UPDATE myTable AS X JOIN (
SELECT
  s1.ID, COUNT(s2.ID)/ (SELECT COUNT(*) FROM myTable) * 100 AS percentile
FROM
  myTable s1 JOIN myTable s2 on (s2.M1 <= s1.M1)
GROUP BY s1.ID
ORDER BY s1.ID) AS Z 
ON (X.ID = Z.ID) 
SET X.M1_Percentile = Z.percentile;

Here are the results (correct, but slow) of the query above when the table is limited to the 10 rows shown:

+---------+--------+---------------+
| ID      | M1     | M1_Percentile |
+---------+--------+---------------+
| 3684514 | 3.2997 |            60 |
| 3684515 | 3.0476 |            50 |
| 3684516 | 2.6499 |            30 |
| 3684517 | 0.3585 |            10 |
| 3684518 | 1.6919 |            20 |
| 3684519 | 2.8515 |            40 |
| 3684520 | 4.0728 |            80 |
| 3684521 | 4.0224 |            70 |
| 3684522 | 5.8207 |            90 |
| 3684523 | 6.8291 |           100 |
+---------+--------+---------------+

Producing the same results for all 400,000 rows takes far longer.
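To reproduce the 10-row example, a setup along the following lines could be used. This is only a sketch: the column types are assumptions, since the original schema is not shown, and only the table name, column names, and sample values come from the question.

```sql
-- Assumed schema: the column types are guesses, not taken from the question.
CREATE TABLE myTable (
  ID            INT PRIMARY KEY,
  M1            DECIMAL(10,4) NOT NULL,
  M1_Percentile DECIMAL(5,2) NULL
);

-- The 10 sample rows shown in the question.
INSERT INTO myTable (ID, M1) VALUES
  (3684514, 3.2997),
  (3684515, 3.0476),
  (3684516, 2.6499),
  (3684517, 0.3585),
  (3684518, 1.6919),
  (3684519, 2.8515),
  (3684520, 4.0728),
  (3684521, 4.0224),
  (3684522, 5.8207),
  (3684523, 6.8291);

-- Spot check for ID 3684514: counts the rows with M1 <= 3.2997.
SELECT COUNT(*) AS rows_at_or_below FROM myTable WHERE M1 <= 3.2997;
```

Six of the ten sample rows have M1 <= 3.2997, so the expected percentile for ID 3684514 is 6 / 10 * 100 = 60, which matches the result table.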

Answer:

I can't test this, but you could try something like the following:

-- Untested sketch: percentage of rows whose M1 is <= the current row's M1.
update myTable t
set M1_Percentile = (
    select count(*)
    from myTable t1
    where t1.M1 <= t.M1) * 100 / (
        select count(*)
        from myTable);

Update:

-- "<=" so that rows equal to the current row are counted, as the question asks.
update test t
set m1_pc = (
    (select count(*) from test t1 where t1.M1 <= t.M1) * 100 /
    (select count(*) from test));

This works in Oracle (the only database I have available). I do remember running into that error in MySQL. It is very annoying.
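MySQL generally rejects an UPDATE whose subquery reads from the table being updated (error 1093), which is probably the error referred to above. One possible way around it, assuming MySQL 8.0 or later where window functions are available, is to compute the percentile in a derived table with CUME_DIST() and join it back, much like the join in the question. This is a swapped-in technique rather than the method from the answer, and it has not been timed against the 400,000-row table.

```sql
-- Sketch, assuming MySQL 8.0+ (window functions are not available in earlier versions).
-- CUME_DIST() gives, for each row, the fraction of rows whose M1 is <= that row's M1,
-- so multiplying by 100 yields the percentage the question asks for.
UPDATE myTable AS X
JOIN (
  SELECT
    ID,
    CUME_DIST() OVER (ORDER BY M1) * 100 AS percentile
  FROM myTable
) AS Z
  ON X.ID = Z.ID
SET X.M1_Percentile = Z.percentile;
```

Because the derived table needs one ordered pass over M1 instead of a self-join comparing every row with every other row, it should scale much better than the original query. Rows with equal M1 values receive the same percentile, which fits the "equal to or lower" definition.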

Tags: mysql, database, sql, rank
