Quick Setup of a Redis Cluster

2018-02-03 10:20:33 | Source: oschina | Author: 第二帅的人


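Preface: Last night a blue moon hung in the sky. Pacing with my shadow and looking up, I thought of a girl a thousand miles away, and longing spilled all over the ground. Then I suddenly realized that I had not written an article in a long time. Time flies, and somehow I had dropped the habit. Writing is a good habit: it reinforces memory, makes sharing easy, and makes things easy to look up later. So I am picking it back up starting now, and today Redis Cluster gets to be the first offering.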

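Redis is a key-value store written by Salvatore Sanfilippo and is one of the most widely used NoSQL databases today.


Redis can be deployed in several modes: standalone, master-slave replication, Sentinel, cluster (proxy-based), and cluster (direct-connect).


Today's topic is the direct-connect kind: Redis Cluster.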


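The Cluster model:


Think of the whole Redis deployment as a single unit that is divided into 16384 small pieces, each called a slot. The keys we use every day are stored in slots. From this we can conclude: 1. the number of slots is fixed; 2. how much data a slot can hold depends on the memory of the machine serving it; 3. the relationship between slot and key is one-to-many. Depending on production needs, the slots are divided into several groups, and each group is managed by a node. From this we can conclude: 1. a cluster is made up of multiple nodes; 2. the nodes are peers; 3. each node manages its own set of slots; 4. to access a slot you first have to locate its node.


In summary, the structure of the cluster is clear: cluster > node > slot > key (see figure).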


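Cluster nodes:


Each node manages a group of slots, so each node is itself a small cluster. A cluster node is implemented with master-slave replication and can be seen as a stripped-down Sentinel: one master with one or more slaves, and master and slave can switch roles. All nodes, and all machines, are connected to one another. The overall Redis structure (the official recommendation is at least three masters):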



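Features:
1. Decentralized architecture (no single node becomes the performance bottleneck), with no proxy layer.
2. Data is distributed across nodes by slot; the nodes together form one shared keyspace, and the distribution can be adjusted dynamically.
3. Scalability: the cluster can scale linearly up to 1000 nodes, and nodes can be added or removed dynamically.
4. High availability: the cluster stays available when some nodes are unavailable. Adding slaves provides backup copies of the data.
5. Automatic failover: nodes exchange state information through a gossip protocol, and a voting mechanism promotes a slave to master.

Drawbacks:
1. Resource isolation is weak, so workloads can easily affect one another.
2. Data is replicated asynchronously, so strong consistency is not guaranteed. The consistency problem exists in theory but can be ignored in most cases, because not every company routinely handles 300,000+ operations per second the way Alibaba does.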

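That is the theory. Now let's build a cluster.


1. Installation


Download: wget http://download.redis.io/releases/redis-3.2.9.tar.gz


Extract: tar xvf redis-3.2.9.tar.gz


Install: make install (run from inside the extracted redis-3.2.9 directory)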


2. Configuration


Minimal redis.conf


port 7000
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
appendonly yes
bind 0.0.0.0

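I only have one machine, so I simulate six machines with different ports. In other words, one machine runs six instances that differ only in port.


The ports are: 7000 7001 7002 7003 7004 7005


The even ports are masters: 7000 7002 7004


The odd ports are slaves: 7001 7003 7005


Each instance has its own configuration folder.



Note: do not forget to change the port in each configuration file.


nodes.conf is generated automatically; you do not need to create it by hand. Use the dir directive to choose the directory it is written to. By default it goes into the directory redis-server is started from, which was src in my case.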
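The six per-port configuration folders can also be generated by a small script instead of by hand. The following is only a sketch: the function name `write_configs` is hypothetical, the layout `<base>/<port>/<port>.conf` is inferred from the start command used in this article, and the extra `dir` line (so that each instance writes its own nodes.conf into its own folder) is an assumption rather than part of the minimal configuration above.

```python
from pathlib import Path

# Minimal settings copied from the article. The "dir" line is an added
# assumption: it keeps each instance's nodes.conf and AOF in its own folder.
TEMPLATE = """\
port {port}
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
appendonly yes
bind 0.0.0.0
dir {workdir}
"""


def write_configs(base_dir, ports=range(7000, 7006)):
    """Create <base_dir>/<port>/<port>.conf for each port; return the paths."""
    written = []
    for port in ports:
        instance_dir = Path(base_dir).resolve() / str(port)
        instance_dir.mkdir(parents=True, exist_ok=True)
        conf_path = instance_dir / f"{port}.conf"
        conf_path.write_text(TEMPLATE.format(port=port, workdir=instance_dir))
        written.append(conf_path)
    return written

# Example: write_configs("cluster-conf") creates cluster-conf/7000/7000.conf
# through cluster-conf/7005/7005.conf.
```

Paths that contain spaces would need quoting in the `dir` line; the sketch does not handle that case.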


3. Startup


[root@beta-hr src]# ./redis-server ../cluster-conf/7002/7002.conf

Once all instances are started:


[root@beta-hr src]# ps -ef | grep redis
root 9661 1 0 10:54 ? 00:00:36 ./redis-server 0.0.0.0:7000 [cluster]
root 9689 1 0 10:55 ? 00:00:36 ./redis-server 0.0.0.0:7002 [cluster]
root 9844 1 0 10:56 ? 00:00:36 ./redis-server 0.0.0.0:7004 [cluster]
root 11278 1 0 11:35 ? 00:00:33 ./redis-server 0.0.0.0:7001 [cluster]
root 11444 1 0 11:37 ? 00:00:33 ./redis-server 0.0.0.0:7003 [cluster]
root 11543 1 0 11:38 ? 00:00:34 ./redis-server 0.0.0.0:7005 [cluster]
root 22591 9872 0 18:17 pts/5 00:00:00 grep --color=auto redis

4. Creating the cluster


./redis-trib.rb create --replicas 1 192.168.49.52:7000 192.168.49.52:7002 192.168.49.52:7004 192.168.49.52:7001 192.168.49.52:7003 192.168.49.52:7005

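Note: redis-trib.rb is written in Ruby, so Ruby and the redis gem must be installed. The version requirement comes from the gem rather than from Redis 3.2.9 itself: version 4 of the redis gem needs Ruby 2.2.2 or later, which is newer than what many distributions ship by default.


At first I tried to install Ruby with rvm and kept getting errors, so in the end I downloaded the source and built it myself. That said, installing through a third-party version manager seems to be the recommended way.


--replicas 1 means each master gets one slave, i.e. a master-to-slave ratio of 1:1.


Start the creation: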


[root@beta-hr src]# ./redis-trib.rb create --replicas 1 192.168.49.52:7000 192.168.49.52:7002 192.168.49.52:7004 192.168.49.52:7001 192.168.49.52:7003 192.168.49.52:7005
>>> Creating cluster
>>> Performing hash slots allocation on 6 nodes...
Using 3 masters:
192.168.49.52:7000
192.168.49.52:7002
192.168.49.52:7004
Adding replica 192.168.49.52:7001 to 192.168.49.52:7000
Adding replica 192.168.49.52:7003 to 192.168.49.52:7002
Adding replica 192.168.49.52:7005 to 192.168.49.52:7004
M: 9e7f2d71d2813b94b5e6616b3ea9dd9719bb4a86 192.168.49.52:7000
slots:0-5460 (5461 slots) master
M: 128b3b42e40433dc87605a36a836df9f4140cfbc 192.168.49.52:7002
slots:5461-10922 (5462 slots) master
M: 2887d8f58899f1697fb1ca93e48eee759288efc7 192.168.49.52:7004
slots:10923-16383 (5461 slots) master
S: 1f95757226e95f684970ab0d3bb790f8b7bef39b 192.168.49.52:7001
replicates 9e7f2d71d2813b94b5e6616b3ea9dd9719bb4a86
S: c273d2e673fe4f33dec28f6cf76cf54f3a7de55b 192.168.49.52:7003
replicates 128b3b42e40433dc87605a36a836df9f4140cfbc
S: 4131643910fb67ab282f118244a2bdeec9d5743d 192.168.49.52:7005
replicates 2887d8f58899f1697fb1ca93e48eee759288efc7
Can I set the above configuration? (type 'yes' to accept):

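Based on the master-to-slave ratio, redis-trib first picks the first three of the six instances as masters; the remaining three become slaves, in order.


It then divides the 16384 slots into three parts and assigns one part to each master: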


7000:0-5460 (5461 slots)


7002:5461-10922 (5462 slots)


7004:10923-16383 (5461 slots)
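The three ranges above are what you get when 16384 slots are divided as evenly as possible among three masters. The sketch below reproduces that arithmetic. `split_slots` is a hypothetical helper written for illustration; it is modelled on the output shown above and is not claimed to be the exact code inside redis-trib.rb.

```python
import math

TOTAL_SLOTS = 16384


def split_slots(master_count):
    """Return a list of inclusive (first, last) slot ranges, one per master."""
    per_master = TOTAL_SLOTS / master_count  # fractional share, e.g. 5461.33
    ranges = []
    first = 0
    cursor = 0.0
    for index in range(master_count):
        # Round the running fractional boundary to the nearest whole slot.
        last = math.floor(cursor + per_master - 1 + 0.5)
        # The final master always takes everything up to the last slot.
        if index == master_count - 1 or last > TOTAL_SLOTS - 1:
            last = TOTAL_SLOTS - 1
        ranges.append((first, last))
        first = last + 1
        cursor += per_master
    return ranges


for first, last in split_slots(3):
    print(f"{first}-{last} ({last - first + 1} slots)")
```

Because 16384 is not divisible by 3, one master ends up with one slot more than the other two, which is why 7002 holds 5462 slots.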


Finally it sets up the master-slave relationships:


1f95757226e95f684970ab0d3bb790f8b7bef39b 192.168.49.52:7001
replicates 9e7f2d71d2813b94b5e6616b3ea9dd9719bb4a86

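Here 7001 has been assigned to 9e7f2d71d2813b94b5e6616b3ea9dd9719bb4a86, which is 7000.


Note: Redis gives every node a node ID and uses that ID, not the IP address, to identify the node uniquely.


If you accept the configuration above, type yes and the creation begins: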


Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join....
>>> Performing Cluster Check (using node 192.168.49.52:7000)
M: 9e7f2d71d2813b94b5e6616b3ea9dd9719bb4a86 192.168.49.52:7000
slots:0-5460 (5461 slots) master
1 additional replica(s)
S: c273d2e673fe4f33dec28f6cf76cf54f3a7de55b 192.168.49.52:7003
slots: (0 slots) slave
replicates 128b3b42e40433dc87605a36a836df9f4140cfbc
M: 128b3b42e40433dc87605a36a836df9f4140cfbc 192.168.49.52:7002
slots:5461-10922 (5462 slots) master
1 additional replica(s)
S: 4131643910fb67ab282f118244a2bdeec9d5743d 192.168.49.52:7005
slots: (0 slots) slave
replicates 2887d8f58899f1697fb1ca93e48eee759288efc7
S: 1f95757226e95f684970ab0d3bb790f8b7bef39b 192.168.49.52:7001
slots: (0 slots) slave
replicates 9e7f2d71d2813b94b5e6616b3ea9dd9719bb4a86
M: 2887d8f58899f1697fb1ca93e48eee759288efc7 192.168.49.52:7004
slots:10923-16383 (5461 slots) master
1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

The line "[OK] All 16384 slots covered." means the cluster was created successfully. Log in to the cluster as follows:


[root@beta-hr src]# ./redis-cli -c -p 7000
127.0.0.1:7000> cluster nodes
c273d2e673fe4f33dec28f6cf76cf54f3a7de55b 192.168.49.52:7003 slave 128b3b42e40433dc87605a36a836df9f4140cfbc 0 1517456656084 5 connected
128b3b42e40433dc87605a36a836df9f4140cfbc 192.168.49.52:7002 master - 0 1517456654080 2 connected 5461-10922
9e7f2d71d2813b94b5e6616b3ea9dd9719bb4a86 192.168.49.52:7000 myself,master - 0 0 1 connected 0-5460
4131643910fb67ab282f118244a2bdeec9d5743d 192.168.49.52:7005 slave 2887d8f58899f1697fb1ca93e48eee759288efc7 0 1517456655081 6 connected
1f95757226e95f684970ab0d3bb790f8b7bef39b 192.168.49.52:7001 slave 9e7f2d71d2813b94b5e6616b3ea9dd9719bb4a86 0 1517456655583 4 connected
2887d8f58899f1697fb1ca93e48eee759288efc7 192.168.49.52:7004 master - 0 1517456654580 3 connected 10923-16383

As you can see, all six instances are connected. The cluster has now been created.
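If you want to inspect this output from a script, each line of cluster nodes can be split into fields. The following is a minimal sketch, assuming the Redis 3.2 line format shown above (id, address, flags, master id, ping-sent, pong-received, config epoch, link state, then slot ranges). The names `ClusterNode` and `parse_cluster_nodes` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ClusterNode:
    node_id: str
    address: str
    flags: List[str]
    master_id: Optional[str]      # None for masters (the field is "-")
    connected: bool
    slots: List[Tuple[int, int]]  # inclusive (first, last) ranges


def parse_cluster_nodes(text: str) -> List[ClusterNode]:
    """Parse the text output of CLUSTER NODES into ClusterNode records."""
    nodes = []
    for line in text.strip().splitlines():
        parts = line.split()
        if len(parts) < 8:
            continue  # not a node line
        slots = []
        for token in parts[8:]:
            if token.startswith("["):
                continue  # bracketed entries describe slots being migrated
            first, _, last = token.partition("-")
            slots.append((int(first), int(last or first)))
        nodes.append(ClusterNode(
            node_id=parts[0],
            address=parts[1],
            flags=parts[2].split(","),
            master_id=None if parts[3] == "-" else parts[3],
            connected=parts[7] == "connected",
            slots=slots,
        ))
    return nodes


# Two lines taken from the output above.
SAMPLE = """\
c273d2e673fe4f33dec28f6cf76cf54f3a7de55b 192.168.49.52:7003 slave 128b3b42e40433dc87605a36a836df9f4140cfbc 0 1517456656084 5 connected
128b3b42e40433dc87605a36a836df9f4140cfbc 192.168.49.52:7002 master - 0 1517456654080 2 connected 5461-10922
"""

for node in parse_cluster_nodes(SAMPLE):
    print(node.address, node.flags, node.slots)
```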


5. Usage


127.0.0.1:7000> set name jenny
-> Redirected to slot [5798] located at 192.168.49.52:7002
OK
192.168.49.52:7002> set name JennyM
OK

When we set the key name to jenny, Redis computes which slot stores the key by taking the CRC16 of the key modulo 16384, i.e. CRC16("name") mod 16384 = 5798.
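The slot computation can be reproduced outside Redis. The sketch below assumes the CRC16 variant described in the Redis Cluster specification (XMODEM: polynomial 0x1021, initial value 0) and the hash-tag rule, under which only the text between the first `{` and the following `}` is hashed when that text is non-empty. The function names are chosen for illustration.

```python
def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16 with polynomial 0x1021 and initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Map a key to one of the 16384 hash slots."""
    raw = key.encode()
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        # Use the hash tag only if it is non-empty, e.g. "{user1}.age".
        if end != -1 and end != start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


print(key_slot("name"))  # the article's session reports slot 5798
print(key_slot("age"))   # the article's session reports slot 741
```

Keys that share a hash tag, such as `{user1}.name` and `{user1}.age`, land in the same slot, which is what makes multi-key operations on them possible in a cluster.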


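Slot 5798 is served by the node on port 7002, while we are logged in to 7000, so the client switches to 7002 automatically (Redirected to ...).


When we set the key name a second time we are already on 7002, so no redirect message appears. Similarly: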


192.168.49.52:7002> set age 18
-> Redirected to slot [741] located at 192.168.49.52:7000
OK
192.168.49.52:7000> set 97fb1ca93e48eee759288efc7 123456
-> Redirected to slot [14385] located at 192.168.49.52:7004
OK
192.168.49.52:7002> get age
-> Redirected to slot [741] located at 192.168.49.52:7000
"18"

At this point a simple Redis cluster is up and running.
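The redirects seen in this section follow directly from the slot ranges assigned at creation time. Under the hood the node answers with a MOVED error naming the slot owner, and redis-cli in -c mode follows it. The sketch below only models the lookup, using the ranges from this article; `owner_of` and `route` are hypothetical helpers, not part of any Redis client.

```python
import bisect

# Slot ranges as assigned when the cluster in this article was created.
SLOT_OWNERS = [
    (0, 5460, "192.168.49.52:7000"),
    (5461, 10922, "192.168.49.52:7002"),
    (10923, 16383, "192.168.49.52:7004"),
]
_RANGE_STARTS = [first for first, _, _ in SLOT_OWNERS]


def owner_of(slot: int) -> str:
    """Return the address of the master that serves the given slot."""
    if not 0 <= slot <= 16383:
        raise ValueError(f"slot out of range: {slot}")
    # Find the last range whose first slot is <= the requested slot.
    index = bisect.bisect_right(_RANGE_STARTS, slot) - 1
    return SLOT_OWNERS[index][2]


def route(connected_to: str, slot: int) -> str:
    """Describe what a cluster-aware client would do for this slot."""
    owner = owner_of(slot)
    if owner == connected_to:
        return "served by the current node"
    return f"-> Redirected to slot [{slot}] located at {owner}"


print(route("192.168.49.52:7000", 5798))
print(route("192.168.49.52:7002", 5798))
```

A real client caches this slot map and refreshes it when it receives a redirect, so that later commands go straight to the right node.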


6. Testing


Stop some of the nodes so that only 7005 and 7002 are left:


127.0.0.1:7002> cluster nodes
2887d8f58899f1697fb1ca93e48eee759288efc7 192.168.49.52:7004 master,fail? - 1517480723887 1517480723284 3 disconnected 10923-16383
4131643910fb67ab282f118244a2bdeec9d5743d 192.168.49.52:7005 slave 2887d8f58899f1697fb1ca93e48eee759288efc7 0 1517480880209 6 connected
1f95757226e95f684970ab0d3bb790f8b7bef39b 192.168.49.52:7001 master,fail - 1517480573304 1517480573104 7 disconnected 0-5460
c273d2e673fe4f33dec28f6cf76cf54f3a7de55b 192.168.49.52:7003 slave,fail 128b3b42e40433dc87605a36a836df9f4140cfbc 1517480710020 1517480708711 5 disconnected
9e7f2d71d2813b94b5e6616b3ea9dd9719bb4a86 192.168.49.52:7000 master,fail - 1517480535444 1517480534943 1 disconnected
128b3b42e40433dc87605a36a836df9f4140cfbc 192.168.49.52:7002 myself,master - 0 0 2 connected 5461-10922

Now try to access it again:


127.0.0.1:7002> get name
(error) CLUSTERDOWN The cluster is down

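The message says the cluster is down. In other words, with the default configuration, if the master and all slaves of one node group are down, their slots are no longer served and the whole cluster stops answering requests.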
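One way to see why the cluster reports CLUSTERDOWN here is to count how many slots still have a reachable master. The sketch below uses the masters and ranges from the cluster nodes output above and assumes the default setting cluster-require-full-coverage yes. It only models slot coverage; a real node also considers itself down when it cannot reach a majority of masters. `uncovered_slots` is a hypothetical helper.

```python
TOTAL_SLOTS = 16384

# Masters and their slot ranges at the time of the failure test
# (7001 had already taken over 0-5460 from 7000).
MASTERS = {
    "192.168.49.52:7001": (0, 5460),
    "192.168.49.52:7002": (5461, 10922),
    "192.168.49.52:7004": (10923, 16383),
}


def uncovered_slots(masters, reachable):
    """Count slots whose master is not in the set of reachable addresses."""
    covered = set()
    for address, (first, last) in masters.items():
        if address in reachable:
            covered.update(range(first, last + 1))
    return TOTAL_SLOTS - len(covered)


# Only 7002 is a reachable master; 7005 is a slave and serves no slots.
missing = uncovered_slots(MASTERS, {"192.168.49.52:7002"})
print(f"{missing} slots uncovered ->", "CLUSTERDOWN" if missing else "ok")
```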


We start 7005 up:


22957:S 01 Feb 18:28:17.606 * Connecting to MASTER 192.168.49.52:7004
22957:S 01 Feb 18:28:17.606 * MASTER <-> SLAVE sync started
22957:S 01 Feb 18:28:17.606 # Error condition on socket for SYNC: Connection refused
22957:S 01 Feb 18:28:18.612 * Connecting to MASTER 192.168.49.52:7004
22957:S 01 Feb 18:28:18.612 * MASTER <-> SLAVE sync started
22957:S 01 Feb 18:28:18.612 # Error condition on socket for SYNC: Connection refused
22957:S 01 Feb 18:28:19.617 * Connecting to MASTER 192.168.49.52:7004

The screen keeps printing errors.


As a slave, it keeps trying to connect to its master 7004, but 7004 has been stopped.


Let's bring 7004 up:


[root@beta-hr src]# ./redis-server ../cluster-conf/7004/7004.conf
23227:M 01 Feb 18:29:46.098 * Background saving started by pid 23230
23230:C 01 Feb 18:29:46.101 * DB saved on disk
23230:C 01 Feb 18:29:46.101 * RDB: 0 MB of memory used by copy-on-write
23227:M 01 Feb 18:29:46.133 * Background saving terminated with success
23227:M 01 Feb 18:29:46.133 * Synchronization with slave 192.168.49.52:7005 succeeded

At this point 7005 stops printing errors and shows that it has connected successfully.


22957:S 01 Feb 18:29:46.096 * MASTER <-> SLAVE sync started
22957:S 01 Feb 18:29:46.096 * Non blocking connect for SYNC fired the event.
22957:S 01 Feb 18:29:46.097 * Master replied to PING, replication can continue...
22957:S 01 Feb 18:29:46.097 * Partial resynchronization not possible (no cached master)
22957:S 01 Feb 18:29:46.099 * Full resync from master: 71c1a3e7c890624b3eae12f6045ed0623de90015:1
22957:S 01 Feb 18:29:46.134 * MASTER <-> SLAVE sync: receiving 123 bytes from master
22957:S 01 Feb 18:29:46.134 * MASTER <-> SLAVE sync: Flushing old data
22957:S 01 Feb 18:29:46.134 * MASTER <-> SLAVE sync: Loading DB in memory
22957:S 01 Feb 18:29:46.134 * MASTER <-> SLAVE sync: Finished with success
22957:S 01 Feb 18:29:46.136 * Background append only file rewriting started by pid 23231
22957:S 01 Feb 18:29:46.160 * AOF rewrite child asks to stop sending diffs.
23231:C 01 Feb 18:29:46.160 * Parent agreed to stop sending diffs. Finalizing AOF...
23231:C 01 Feb 18:29:46.160 * Concatenating 0.00 MB of AOF diff received from parent.
23231:C 01 Feb 18:29:46.160 * SYNC append only file rewrite performed
23231:C 01 Feb 18:29:46.161 * AOF rewrite: 0 MB of memory used by copy-on-write
22957:S 01 Feb 18:29:46.196 * Background AOF rewrite terminated with success
22957:S 01 Feb 18:29:46.196 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
22957:S 01 Feb 18:29:46.196 * Background AOF rewrite finished successfully

Finally, we start all the stopped instances again, in no particular order:


127.0.0.1:7004> cluster nodes
c273d2e673fe4f33dec28f6cf76cf54f3a7de55b 192.168.49.52:7003 slave 128b3b42e40433dc87605a36a836df9f4140cfbc 0 1517482410075 5 connected
1f95757226e95f684970ab0d3bb790f8b7bef39b 192.168.49.52:7001 master - 0 1517482412080 7 connected 0-5460
4131643910fb67ab282f118244a2bdeec9d5743d 192.168.49.52:7005 slave 2887d8f58899f1697fb1ca93e48eee759288efc7 0 1517482409774 6 connected
2887d8f58899f1697fb1ca93e48eee759288efc7 192.168.49.52:7004 myself,master - 0 0 3 connected 10923-16383 [5798-<-128b3b42e40433dc87605a36a836df9f4140cfbc] [8602-<-128b3b42e40433dc87605a36a836df9f4140cfbc] [9698-<-128b3b42e40433dc87605a36a836df9f4140cfbc]
128b3b42e40433dc87605a36a836df9f4140cfbc 192.168.49.52:7002 master - 0 1517482411077 2 connected 5461-10922
9e7f2d71d2813b94b5e6616b3ea9dd9719bb4a86 192.168.49.52:7000 slave 1f95757226e95f684970ab0d3bb790f8b7bef39b 0 1517482410577 7 connected

As you can see, 7001 has become a master and 7000 a slave, which shows that a master-slave switch took place during the outage.


127.0.0.1:7004> get name
-> Redirected to slot [5798] located at 192.168.49.52:7002
"frank"

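Finally, we fetch name once more and can see that the cluster is working normally again.

Later chapters will include, but are not limited to:


1. Redis Cluster: how to add nodes, remove nodes, reshard, and so on.


2. Redis features, basic data structures, persistence with RDB and AOF, and data migration.


3. Using the Jedis client.


4. The Redis protocol in detail.


5. An enterprise-grade Redis Java API in detail.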



