
ZooKeeper node failure: Unable to load database on disk


Preface

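While testing today on virtual machines I had set up myself, I found that one of the three ZooKeeper nodes would not come up. The details are as follows: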

The failed node

    $ZK_HOME/bin/zkServer.sh start 
    
    ZooKeeper JMX enabled by default
    Using config: /opt/module/zookeeper-3.4.10/bin/../conf/zoo.cfg
    Starting zookeeper ... STARTED

The script reports STARTED, but jps shows no ZooKeeper process on this machine. The other two nodes are running normally.
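The jps check can be made scriptable with a small helper. This is only a sketch: the function name has_quorum_peer is hypothetical, and it assumes the server runs in quorum mode, where jps lists the process under the class name QuorumPeerMain.

```shell
# Return 0 if the jps listing read from stdin contains the ZooKeeper
# quorum-mode server process, non-zero otherwise.
# has_quorum_peer is a hypothetical helper name, not part of ZooKeeper.
has_quorum_peer() {
    grep -q 'QuorumPeerMain'
}
```

A typical call would be: jps | has_quorum_peer && echo "running" || echo "not running"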

The ZooKeeper log shows:

    2021-05-10 11:34:38,908 [myid:1] - INFO  [main:Util@190] - Invalid snapshot /opt/module/zookeeper-3.4.10/tmp/version-2/snapshot.380001e0ca len = 0 byte = 0
    2021-05-10 11:34:38,908 [myid:1] - INFO  [main:Util@190] - Invalid snapshot /opt/module/zookeeper-3.4.10/tmp/version-2/snapshot.380000f211 len = 0 byte = 0
    2021-05-10 11:34:38,909 [myid:1] - INFO  [main:Util@190] - Invalid snapshot /opt/module/zookeeper-3.4.10/tmp/version-2/snapshot.3500034cd6 len = 0 byte = 0
    2021-05-10 11:34:38,909 [myid:1] - INFO  [main:FileSnap@83] - Reading snapshot /opt/module/zookeeper-3.4.10/tmp/version-2/snapshot.3500021022
    2021-05-10 11:34:38,917 [myid:1] - ERROR [main:QuorumPeer@648] - Unable to load database on disk
    java.io.IOException: 输入/输出错误
    	at java.io.FileInputStream.readBytes(Native Method)
    	at java.io.FileInputStream.read(FileInputStream.java:255)
    	at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
    	at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
    	at java.io.FilterInputStream.read(FilterInputStream.java:83)
    	at org.apache.zookeeper.server.persistence.FileTxnLog$PositionInputStream.read(FileTxnLog.java:452)
    	at java.io.DataInputStream.readInt(DataInputStream.java:387)
    	at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
    	at org.apache.zookeeper.server.persistence.FileHeader.deserialize(FileHeader.java:64)
    	at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:585)
    	at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:604)
    	at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:570)
    	at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:552)
    	at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.<init>(FileTxnLog.java:531)
    	at org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:358)
    	at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:140)
    	at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
    	at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:601)
    	at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:591)
    	at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:164)
    	at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:111)
    	at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
    2021-05-10 11:34:38,918 [myid:1] - ERROR [main:QuorumPeerMain@89] - Unexpected exception, exiting abnormally
    java.lang.RuntimeException: Unable to run quorum server

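The log shows that the three newest snapshots on the failed machine are invalid (len = 0), and that loading the database from disk then fails with java.io.IOException: 输入/输出错误 ("Input/output error"), so the node exits.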
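To confirm what the "Invalid snapshot ... len = 0" lines report, the empty snapshot files can be listed directly. The following is a minimal sketch; list_empty_snapshots is a hypothetical helper name, and the directory you pass in is assumed to be the version-2 directory under your dataDir.

```shell
# Print every zero-length snapshot file directly inside the given
# version-2 directory, one path per line, sorted by name.
# list_empty_snapshots is a hypothetical helper, not a ZooKeeper tool.
list_empty_snapshots() {
    dir="$1"
    # -size 0 matches only files that contain no data at all
    find "$dir" -maxdepth 1 -type f -name 'snapshot.*' -size 0 | sort
}
```

For the node in this post the call would be: list_empty_snapshots /opt/module/zookeeper-3.4.10/tmp/version-2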

So how do we fix it?

Solution

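Since the other two machines are healthy, we can move the ZooKeeper data folder on the failed machine aside as a backup and let the node copy a fresh snapshot from one of the running nodes when it rejoins the ensemble.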

The steps are as follows:

    mv $ZK_HOME/tmp/version-2   $ZK_HOME/tmp/version-2.bak
    
    $ZK_HOME/bin/zkServer.sh start

Use jps to check that the process has started, and check that a new snapshot has been synced into $ZK_HOME/tmp/version-2.
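The backup and the follow-up check can be wrapped into two small functions. This is a sketch under stated assumptions, not the procedure used above: both function names are hypothetical, the timestamp suffix is my addition so that an earlier backup is not overwritten, and a non-empty snapshot file is only a rough sign that the sync happened, not proof that the file is valid.

```shell
# Move <data_dir>/version-2 aside under a timestamped name and print the
# backup path. data_dir corresponds to $ZK_HOME/tmp in this post.
# backup_version_dir is a hypothetical helper name.
backup_version_dir() {
    data_dir="$1"
    src="$data_dir/version-2"
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    dest="$src.bak.$(date +%Y%m%d%H%M%S)"
    mv "$src" "$dest" && echo "$dest"
}

# Return 0 if <data_dir>/version-2 contains at least one snapshot file
# that is not empty. has_nonempty_snapshot is a hypothetical helper name.
has_nonempty_snapshot() {
    data_dir="$1"
    found=$(find "$data_dir/version-2" -maxdepth 1 -type f \
        -name 'snapshot.*' ! -size 0 2>/dev/null)
    [ -n "$found" ]
}
```

With these, the sequence would be: backup_version_dir "$ZK_HOME/tmp", then $ZK_HOME/bin/zkServer.sh start, then has_nonempty_snapshot "$ZK_HOME/tmp" once the node has had time to sync.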

–by 俩只猴
