本文转载自微信公众号「董泽润的技术笔记」,作者董泽润 。转载本文请联系董泽润的技术笔记公众号。
看到微信群有人问 prd 用什么一致性 hash 算法好,就想起了这篇文章。这是以前做的测试。
那么什么是好的一致性 hash 算法呢?除了性能还要考虑哪些因素呢?根据我自己的经验简单聊一下,有不正确的请大家指正 ^^
关于什么是 Consistent Hashing 网上很多,一般用于对于数据状态有一定要求的缓存场景,比如 web session 数据
节点根据算法,映射到环中。用户请求时,根据 id 求出 hash 值,按顺(逆)时针找到的第一个节点,就是存储数据节点。一般为了更加均匀些,同时防止一个节点下线,导致过多数据移动,会将节点按 replica (weight) 复制多个虚拟节点映射到环中。
了解 LB 的都知道,lvs 这些负载均衡设备能做到请求均匀,但经典实现的一致性 hash 是做不到的。具体场景,可能差别更大。测试脚本放到了 consistenthash balance test github[1],感兴趣的可以看下,测试场景如下:
复制
节点数量:10 虚拟系数:3, 10, 50, 100, 200, 400, 600, 800, 1000 测试算法:MurMurHash, Crc32, Fnv1, CityHash 请求数量:50w
1.
2.
3.
4.
方差可以很好的衡量数据分布程度,可以看到,Crc32 是最差的,Fnv1 在数量很少时也差,MurMurHash 和 CityHash 在虚拟节点超过 100 个后迅速收敛。
diff ratio 指数据分布最多的与最小的差值,除以平均值。从数据分布可以看到, MurMurHash, CityHash 首选的 hash 算法。replica 虚节点数量可以调大一些,毕竟动态增减节点不是常态
冒泡哥提到了一种实现,提前预计算好节点顺序,可以减少分布 diff ratio. 他在生产环境,只用 20 replica 就降低到了 5% 当然了前提是,增加减节点,必须符合一致性 hash 的思想,数据尽可能少的移动。
复制
MurMur replica:3 var:43930 max:72370 min:35411 diff:36959 ratio:73.918000 MurMur replica:10 var:69441 max:85238 min:19257 diff:65981 ratio:131.962000 MurMur replica:50 var:20520 max:59708 min:40669 diff:19039 ratio:38.078000 MurMur replica:100 var:12556 max:57515 min:42045 diff:15470 ratio:30.940000 MurMur replica:200 var:11539 max:55858 min:41998 diff:13860 ratio:27.720000 MurMur replica:400 var:7161 max:53385 min:45323 diff:8062 ratio:16.124000 MurMur replica:600 var:5586 max:51697 min:45913 diff:5784 ratio:11.568000 MurMur replica:800 var:4101 max:52118 min:48041 diff:4077 ratio:8.154000 MurMur replica:1000 var:4369 max:52888 min:48356 diff:4532 ratio:9.064000 Crc32 replica:3 var:76353 max:97097 min:23872 diff:73225 ratio:146.450000 Crc32 replica:10 var:30219 max:61334 min:31891 diff:29443 ratio:58.886000 Crc32 replica:50 var:12231 max:55475 min:43926 diff:11549 ratio:23.098000 Crc32 replica:100 var:22105 max:58828 min:37776 diff:21052 ratio:42.104000 Crc32 replica:200 var:28465 max:59481 min:35807 diff:23674 ratio:47.348000 Crc32 replica:400 var:25440 max:58781 min:37059 diff:21722 ratio:43.444000 Crc32 replica:600 var:33926 max:61694 min:35887 diff:25807 ratio:51.614000 Crc32 replica:800 var:33978 max:61327 min:36914 diff:24413 ratio:48.826000 Crc32 replica:1000 var:27789 max:60515 min:37709 diff:22806 ratio:45.612000 Fnv1 replica:3 var:348461 max:377986 min:5997 diff:371989 ratio:743.978000 Fnv1 replica:10 var:200673 max:231641 min:14451 diff:217190 ratio:434.380000 Fnv1 replica:50 var:41072 max:84924 min:38135 diff:46789 ratio:93.578000 Fnv1 replica:100 var:11567 max:59547 min:47254 diff:12293 ratio:24.586000 Fnv1 replica:200 var:6465 max:53451 min:46127 diff:7324 ratio:14.648000 Fnv1 replica:400 var:5126 max:52454 min:47757 diff:4697 ratio:9.394000 Fnv1 replica:600 var:3748 max:52520 min:48394 diff:4126 ratio:8.252000 Fnv1 replica:800 var:6089 max:52003 min:46211 diff:5792 ratio:11.584000 Fnv1 replica:1000 var:5881 max:53028 min:46902 diff:6126 ratio:12.252000 Cityhash replica:3 var:77979 max:99378 min:11576 diff:87802 ratio:175.604000 Cityhash replica:10 var:60194 max:84045 min:25002 diff:59043 ratio:118.086000 Cityhash replica:50 var:26496 max:67141 min:38121 diff:29020 ratio:58.040000 Cityhash replica:100 var:11862 max:56008 min:44075 diff:11933 ratio:23.866000 Cityhash replica:200 var:9921 max:54711 min:44592 diff:10119 ratio:20.238000 Cityhash replica:400 var:5542 max:53629 min:47391 diff:6238 ratio:12.476000 Cityhash replica:600 var:6823 max:53417 min:46261 diff:7156 ratio:14.312000 Cityhash replica:800 var:5100 max:51615 min:46577 diff:5038 ratio:10.076000 Cityhash replica:1000 var:4884 max:52151 min:47166 diff:4985 ratio:9.970000
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.
12.
13.
14.
15.
16.
17.
18.
19.
20.
21.
22.
23.
24.
25.
26.
27.
28.
29.
30.
31.
32.
33.
34.
35.
36.
这次分享就这些,以后面还会分享更多的内容,如果感兴趣,可以关注并点击左下角的分享转发哦(:
参考资料
[1]
测试脚本: https://github.com/dongzerun/consistenthash_balance_test,