[admin@cass048169 ~]$ /usr/install/cassandra/bin/nodetool stopdaemon Cassandra has shutdown. error: 拒绝连接 -- StackTrace -- java.net.ConnectException: 拒绝连接 at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:579) at java.net.Socket.connect(Socket.java:528) at java.net.Socket.<init>(Socket.java:425) at java.net.Socket.<init>(Socket.java:208) at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:40) at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:147) at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:613) at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216) at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202) at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:129) at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source) at javax.management.remote.rmi.RMIConnectionImpl_Stub.close(Unknown Source) at javax.management.remote.rmi.RMIConnector.close(RMIConnector.java:512) at javax.management.remote.rmi.RMIConnector.close(RMIConnector.java:452) at org.apache.cassandra.tools.NodeProbe.close(NodeProbe.java:237) at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:295) at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:206)
[qihuang.zheng@dp0652 ~]$ /usr/install/cassandra/bin/sstableloader -d localhost 1445938244634 Exception in thread "main" java.lang.NullPointerException at org.apache.cassandra.io.sstable.SSTableLoader.<init>(SSTableLoader.java:59) at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:80)
$ cd /home/admin/data/cassandra/data/forseti/velocity/snapshots $ /usr/install/cassandra/bin/sstableloader -d 192.168.6.52 1445938244634 Exception in thread "main" java.lang.NullPointerException at org.apache.cassandra.io.sstable.SSTableLoader.<init>(SSTableLoader.java:59) at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:80)
$ /usr/install/cassandra/bin/sstableloader -d 192.168.6.52 /home/admin/data/cassandra/data/forseti/velocity/snapshots/1445938244634 Could not retrieve endpoint ranges: InvalidRequestException(why:No such keyspace: snapshots) Run with --debug to get full stack trace or --help to get help.
[qihuang.zheng@mysql006070 ~]$ /usr/install/cassandra/bin/sstableloader -d 192.168.6.52 /home/admin/data/cassandra/data/forseti/velocity Established connection to initial hosts Opening sstables and calculating sections to stream Exception in thread "main" FSWriteError in /home/admin/data/cassandra/data/forseti/velocity/forseti-velocity-jb-414-Summary.db at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:122) at org.apache.cassandra.io.sstable.SSTableReader.loadSummary(SSTableReader.java:546) at org.apache.cassandra.io.sstable.SSTableReader.openForBatch(SSTableReader.java:173) at org.apache.cassandra.io.sstable.SSTableLoader$1.accept(SSTableLoader.java:107) at java.io.File.list(File.java:1155) at org.apache.cassandra.io.sstable.SSTableLoader.openSSTables(SSTableLoader.java:68) at org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:150) at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:95) Caused by: java.nio.file.AccessDeniedException: /home/admin/data/cassandra/data/forseti/velocity/forseti-velocity-jb-414-Summary.db at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:118) ... 7 more [qihuang.zheng@mysql006070 ~]$ sudo -u admin /usr/install/cassandra/bin/sstableloader -d 192.168.6.52 /home/admin/data/cassandra/data/forseti/velocity Caused by: java.net.UnknownHostException: mysql006070: 未知的名称或服务 at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method) at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901) at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1293) at java.net.InetAddress.getLocalHost(InetAddress.java:1469) ... 11 more
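Judging from the errors above, sstableloader takes the keyspace and table names from the last two components of the directory it is given, and it needs read (and, for the Summary.db rewrite, write) access to the files, so running it as the file owner against a path ending in <keyspace>/<table> avoids both failures. A minimal sketch, with /tmp/load as a hypothetical staging directory:

# stage the snapshot under a directory laid out as <keyspace>/<table>
mkdir -p /tmp/load/forseti/velocity
cp /home/admin/data/cassandra/data/forseti/velocity/snapshots/1445938244634/* /tmp/load/forseti/velocity/
# run the loader as a user with read/write access to the staged copies
/usr/install/cassandra/bin/sstableloader -d 192.168.6.52 /tmp/load/forseti/velocity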
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space at java.util.TreeMap.put(TreeMap.java:569) at java.util.TreeSet.add(TreeSet.java:255) at org.apache.cassandra.io.compress.CompressionMetadata.getChunksForSections(CompressionMetadata.java:226) at org.apache.cassandra.streaming.messages.OutgoingFileMessage.<init>(OutgoingFileMessage.java:76) at org.apache.cassandra.streaming.StreamTransferTask.addTransferFile(StreamTransferTask.java:56) at org.apache.cassandra.streaming.StreamSession.addTransferFiles(StreamSession.java:340) at org.apache.cassandra.streaming.StreamPlan.transferFiles(StreamPlan.java:138) at org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:178) at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:95) ERROR 13:32:10,104 Error in ThreadPoolExecutor java.lang.OutOfMemoryError: Java heap space at org.apache.cassandra.utils.BackgroundActivityMonitor.readAndCompute(BackgroundActivityMonitor.java:84) at org.apache.cassandra.utils.BackgroundActivityMonitor.getIOWait(BackgroundActivityMonitor.java:125) at org.apache.cassandra.utils.BackgroundActivityMonitor$BackgroundActivityReporter.run(BackgroundActivityMonitor.java:153) at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:80) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744)
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded at org.apache.cassandra.io.compress.CompressionMetadata.getChunksForSections(CompressionMetadata.java:226) at org.apache.cassandra.streaming.messages.OutgoingFileMessage.<init>(OutgoingFileMessage.java:76) at org.apache.cassandra.streaming.StreamTransferTask.addTransferFile(StreamTransferTask.java:56) at org.apache.cassandra.streaming.StreamSession.addTransferFiles(StreamSession.java:340) at org.apache.cassandra.streaming.StreamPlan.transferFiles(StreamPlan.java:138) at org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:178) at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:95)
ERROR [StreamReceiveTask:124] 2016-06-10 18:09:31,717 StreamReceiveTask.java:183 - Error applying streamed data: org.apache.cassandra.io.FSReadError: java.io.IOException: Map failed at org.apache.cassandra.io.util.MmappedSegmentedFile$Builder.createSegments(MmappedSegmentedFile.java:399) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.io.util.MmappedSegmentedFile$Builder.complete(MmappedSegmentedFile.java:365) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.io.util.SegmentedFile$Builder.complete(SegmentedFile.java:174) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.io.sstable.SSTableWriter.finish(SSTableWriter.java:463) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTableWriter.java:447) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTableWriter.java:442) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.streaming.StreamReceiveTask$OnCompletionRunnable.run(StreamReceiveTask.java:141) ~[apache-cassandra-2.1.13.jar:2.1.13] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_51] at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_51] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_51] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_51] at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51] Caused by: java.io.IOException: Map failed at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:888) ~[na:1.7.0_51] at org.apache.cassandra.io.util.MmappedSegmentedFile$Builder.createSegments(MmappedSegmentedFile.java:392) ~[apache-cassandra-2.1.13.jar:2.1.13] ... 11 common frames omitted Caused by: java.lang.OutOfMemoryError: Map failed at sun.nio.ch.FileChannelImpl.map0(Native Method) ~[na:1.7.0_51] at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:885) ~[na:1.7.0_51] ... 12 common frames omitted ERROR [StreamReceiveTask:124] 2016-06-10 18:09:31,717 JVMStabilityInspector.java:117 - JVM state determined to be unstable. Exiting forcefully due to: java.lang.OutOfMemoryError: Map failed
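"Map failed" means the JVM ran out of memory-mapped regions rather than heap. A common mitigation on Linux (a general tuning step, not something recorded in this incident) is to raise the kernel's per-process mmap limit:

# check the current per-process mmap limit
sysctl vm.max_map_count
# raise it (a value commonly recommended for Cassandra nodes); persist it in /etc/sysctl.conf
sudo sysctl -w vm.max_map_count=1048575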
ERROR [CompactionExecutor:10060] 2016-07-10 15:29:45,520 CassandraDaemon.java:229 - Exception in thread Thread[CompactionExecutor:10060,1,main] java.lang.RuntimeException: Out of native memory occured, You can avoid it by increasing the system ram space or by increasing bloom_filter_fp_chance. at org.apache.cassandra.utils.obs.OffHeapBitSet.<init>(OffHeapBitSet.java:48) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.utils.FilterFactory.createFilter(FilterFactory.java:84) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.utils.FilterFactory.getFilter(FilterFactory.java:78) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.io.sstable.SSTableWriter$IndexWriter.<init>(SSTableWriter.java:592) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.io.sstable.SSTableWriter.<init>(SSTableWriter.java:141) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.db.compaction.CompactionTask.createCompactionWriter(CompactionTask.java:308) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:190) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:73) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:263) ~[apache-cassandra-2.1.13.jar:2.1.13] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_51] at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_51] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_51] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_51] at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]
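As the message suggests, the off-heap bloom filter can be shrunk by raising bloom_filter_fp_chance on the affected table; a sketch, using forseti.velocity purely as an illustrative target:

# a higher false-positive chance gives a smaller bloom filter, at the cost of extra disk reads;
# the new value takes effect as sstables are rewritten by compaction or upgradesstables
cqlsh 192.168.47.202 -e "ALTER TABLE forseti.velocity WITH bloom_filter_fp_chance = 0.1;"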
ll /home/admin/cassandra/data/forseti_fp/android_device_session/snapshots
ll /home/admin/cassandra/data/forseti_fp/android_device_session/backups
du -sh /home/admin/cassandra/data/forseti_fp/android_device_session/snapshots
du -sh /home/admin/cassandra/data/forseti_fp/android_device_session/backups
After a system-wide snapshot is performed, you can enable incremental backups on each node to backup data that has changed since the last snapshot: each time an SSTable is flushed, a hard link is copied into a /backups subdirectory of the data directory
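For reference, a minimal way to turn this on, assuming a 2.x-style cassandra.yaml under /usr/install/cassandra/conf:

# on every node, set in cassandra.yaml (read at startup) and restart:
#   incremental_backups: true
# after the next memtable flush, hard links appear under the table's backups/ directory:
ls /home/admin/cassandra/data/forseti_fp/android_device_session/backups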
cd /home/admin/cassandra/forseti_fp
for file in /home/admin/cassandra/data/forseti_fp/*
do
  if test -d $file
  then
    table=`basename $file`
    snap=`ls $file/snapshots`
    if [ -n "$snap" ]; then
      echo $table $snap
      mkdir $table
    fi
  fi
done
Run the snapshot-based full data migration once (it only needs to be executed a single time): nohup sh snap.sh > snap_alltable.log &, and copy /home/admin/cassandra/data/forseti_fp/$table/snapshots/$snap/* to /home/admin/cassandra/forseti_fp/$table.
for file in /home/admin/cassandra/data/forseti_fp/*
do
  if test -d $file
  then
    table=`basename $file`
    snap=`ls $file/snapshots`
    if [ -n "$snap" ]; then
      mv $file/snapshots/$snap/* /home/admin/cassandra/forseti_fp/$table
      if [ "smart_device_map" == "$table" ]||[ "android_device_session_temp" == "$table" ]||[ "android_device_session" == "$table" ]||[ "android_device" == "$table" ]||[ "analysis" == "$table" ]||[ "device_session" == "$table" ]; then
        echo " "
      else
        echo $table $snap
        /usr/install/cassandra/bin/sstableloader -d 192.168.50.20,192.168.50.21,192.168.50.22,192.168.50.23,192.168.50.24 /home/admin/cassandra/forseti_fp/$table
      fi
    fi
  fi
done
for file in /home/admin/cassandra/data/forseti_fp/*
do
  if test -d $file
  then
    table=`basename $file`
    if [ -d "$file/backups" ]; then
      rm /home/admin/cassandra/forseti_fp/$table/*
      cd $file/backups
      ls | xargs -t -I {} mv {} /home/admin/cassandra/forseti_fp/$table/
      if [ "smart_device_map" == "$table" ]||[ "android_device_session_temp" == "$table" ]||[ "android_device" == "$table" ]||[ "analysis" == "$table" ]||[ "device_session" == "$table" ]; then
        echo " "
      else
        echo $table
        /usr/install/cassandra/bin/sstableloader -d 192.168.50.20,192.168.50.21,192.168.50.22,192.168.50.23,192.168.50.24 /home/admin/cassandra/forseti_fp/$table
      fi
    fi
  fi
done
Note: the android_device_session table has to be included as well; its backups must also be kept in sync.
vi increment_timer.sh
ps -fe|grep BulkLoader |grep -v grep
if [ $? -ne 0 ]
then
  start=$(date +%s)
  echo "No BulkLoader running, starting the incremental load..."
  sh increment.sh
  end=$(date +%s)
  timeT=$(( $end - $start ))
  echo "COST:$timeT, start:$start, end: $end"
else
  echo "BulkLoader is still running, please wait"
fi
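To keep the incremental sync running unattended, increment_timer.sh can be driven from cron; the schedule and working directory below are only an illustration:

# hypothetical crontab entry: attempt an incremental load every 10 minutes,
# skipping the run whenever a previous BulkLoader is still active
*/10 * * * * cd /home/admin && sh increment_timer.sh >> increment_timer.log 2>&1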
[admin@192-168-48-228 ~]$ /usr/install/cassandra/bin/sstableloader -d 10.21.21.20 -t 50 /home/admin/cassandra/forseti_fp/device ERROR 11:11:56 Error creating pool to /10.21.21.22:9042 com.datastax.driver.core.TransportException: [/10.21.21.22:9042] Cannot connect at com.datastax.driver.core.Connection$1.operationComplete(Connection.java:156) [cassandra-driver-core-2.2.0-rc2-SNAPSHOT-20150617-shaded.jar:na] at com.datastax.driver.core.Connection$1.operationComplete(Connection.java:139) [cassandra-driver-core-2.2.0-rc2-SNAPSHOT-20150617-shaded.jar:na] at com.datastax.shaded.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680) [cassandra-driver-core-2.2.0-rc2-SNAPSHOT-20150617-shaded.jar:na] at com.datastax.shaded.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603) [cassandra-driver-core-2.2.0-rc2-SNAPSHOT-20150617-shaded.jar:na] at com.datastax.shaded.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563) [cassandra-driver-core-2.2.0-rc2-SNAPSHOT-20150617-shaded.jar:na] at com.datastax.shaded.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424) [cassandra-driver-core-2.2.0-rc2-SNAPSHOT-20150617-shaded.jar:na] at com.datastax.shaded.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:268) [cassandra-driver-core-2.2.0-rc2-SNAPSHOT-20150617-shaded.jar:na] at com.datastax.shaded.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:284) [cassandra-driver-core-2.2.0-rc2-SNAPSHOT-20150617-shaded.jar:na] at com.datastax.shaded.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528) [cassandra-driver-core-2.2.0-rc2-SNAPSHOT-20150617-shaded.jar:na] at com.datastax.shaded.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) [cassandra-driver-core-2.2.0-rc2-SNAPSHOT-20150617-shaded.jar:na] at com.datastax.shaded.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) [cassandra-driver-core-2.2.0-rc2-SNAPSHOT-20150617-shaded.jar:na] at com.datastax.shaded.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) [cassandra-driver-core-2.2.0-rc2-SNAPSHOT-20150617-shaded.jar:na] at com.datastax.shaded.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) [cassandra-driver-core-2.2.0-rc2-SNAPSHOT-20150617-shaded.jar:na] at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51] Caused by: java.net.ConnectException: 拒绝连接: /10.21.21.22:9042 at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.7.0_51] at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739) ~[na:1.7.0_51] at com.datastax.shaded.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224) ~[cassandra-driver-core-2.2.0-rc2-SNAPSHOT-20150617-shaded.jar:na] at com.datastax.shaded.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:281) [cassandra-driver-core-2.2.0-rc2-SNAPSHOT-20150617-shaded.jar:na] ... 6 common frames omitted Established connection to initial hosts Opening sstables and calculating sections to stream ...Streaming relevant part of /home/admin/cassan
[admin@fp-cass048159 device]$ /usr/install/cassandra/bin/sstableloader -d 10.21.21.20 -t 50 /home/admin/cassandra/forseti_fp/$table Could not retrieve endpoint ranges: org.apache.thrift.transport.TTransportException: java.net.ConnectException: 拒绝连接 java.lang.RuntimeException: Could not retrieve endpoint ranges: at org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:342) at org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:156) at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:109) Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: 拒绝连接 at org.apache.thrift.transport.TSocket.open(TSocket.java:187) at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81) at org.apache.cassandra.thrift.TFramedTransportFactory.openTransport(TFramedTransportFactory.java:41) at org.apache.cassandra.tools.BulkLoader$ExternalClient.createThriftClient(BulkLoader.java:380) at org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:302) ... 2 more Caused by: java.net.ConnectException: 拒绝连接 at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:579) at org.apache.thrift.transport.TSocket.open(TSocket.java:182) ... 6 more
[admin@fp-cass048159 ~]$ /usr/install/cassandra/bin/cqlsh 10.21.21.20 Connection error: ('Unable to connect to any servers', {'10.21.21.20': ProtocolError("cql_version '3.2.1' is not supported by remote (w/ native protocol). Supported versions: [u'3.3.1']",)})
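The local cqlsh (shipped with 2.1, speaking CQL 3.2.1) is talking to a 2.2.6 node that only advertises 3.3.1. One workaround, assuming the flag is accepted end to end, is to pin the CQL version explicitly, or simply use the cqlsh bundled with the 2.2.6 install:

# force the CQL version the server advertises
/usr/install/cassandra/bin/cqlsh --cqlversion=3.3.1 10.21.21.20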
nodetool status |grep RAC|awk '{print $2}' | while read ip; do echo $ip; nodetool -h $ip snapshot forseti_fp; done
192.168.48.227  Snapshot directory: 1467716730094
192.168.48.226  Snapshot directory: 1467716742257
192.168.48.176  Snapshot directory: 1467716753938
192.168.48.161  Requested creating snapshot(s) for [forseti_fp] with snapshot name [1467716770566]
192.168.48.228  Snapshot directory: 1467716780650
192.168.48.160  Requested creating snapshot(s) for [forseti_fp] with snapshot name [1467716792487]
192.168.48.175  Requested creating snapshot(s) for [forseti_fp] with snapshot name [1467716802910]
192.168.48.159  Requested creating snapshot(s) for [forseti_fp] with snapshot name [1467716815575]
mkdir data
host=`ifconfig | grep "10.21.21." | awk '/inet addr/{sub("addr:",""); print $2}'`
sed -i -e "s/localhost/$host/g" apache-cassandra-2.2.6/conf/cassandra.yaml
sed -i -e "s#localhost#$host#g" apache-cassandra-2.2.6/conf/cassandra-env.sh
sed -i -e "s/127.0.0.1/$seeds/g" apache-cassandra-2.2.6/conf/cassandra.yaml
sed -i -e "s/Test Cluster/$cluster/g" apache-cassandra-2.2.6/conf/cassandra.yaml
ln -s apache-cassandra-2.2.6 cassandra
cassandra/bin/cassandra
cassandra/bin/nodetool status
cassandra/bin/nodetool -h 10.21.21.10 status
cassandra/bin/nodetool -h 10.21.21.19 status
cassandra/bin/nodetool -h 10.21.21.131 status
/usr/install/hadoop/bin/hadoop fs -cat /user/tongdun/velocity/raw/2016-6-9/part-00000 | head 115.55.206|sdo|sdo_client|ip3|1465401548233|1465401548238838F30790E445243217|{"state":"0","ipAddressProvince":"河南省","ipAddressCity":"鹤壁市","ipAddress":"115.55.206.177","ext_is_gplus_login":"0","ext_is_bingding_mobile":"0","eventType":"Login","eventOccurTime":"1465401548233","deviceId":"00-0C-29-BC-62-D1","accountLogin":"166f751ebf90b9fbb38977d75b1265f6","eventId":"login_client","partnerCode":"sdo","ip3":"115.55.206","location":"鹤壁市","status":"Review"}
select * from velocity_app where attribute='115.55.206' and type='ip3' and partner_code='sdo' and app_name='sdo_client' and sequence_id > '1465401548238838F30790E445243210' and sequence_id < '1465401548238838F30790E445243220';
2015-10-31 13:46:07+0800 [forseti_cluster] WARN: Node 192.168.47.206 is reporting a schema disagreement: {UUID('bc083db4-b544-38d1-a37c-be3e78db1df1'): ['192.168.47.229'], UUID('67462d4d-a9c9-38a1-afd8-1a3917232b8a'): ['192.168.47.203', '192.168.47.228', '192.168.47.227', '192.168.47.202', '192.168.47.225', '192.168.47.204', '192.168.47.205', '192.168.47.206', '192.168.47.221', '192.168.47.222', '192.168.47.224']} 2015-10-31 13:46:07+0800 [] WARN: [control connection] No schema built on connect; retrying without wait for schema agreement 2015-10-31 13:46:13+0800 [] WARN: Host 192.168.47.229 has been marked down 2015-10-31 13:46:14+0800 [] WARN: Failed to create connection pool for new host 192.168.47.229: errors=Timed out creating connection, last_host=None 2015-10-31 13:47:07+0800 [] INFO: Host 192.168.47.229 may be up; will prepare queries and open connection pool
Streaming hanging is a familiar trouble in my cluster (2.0/2.1). In my experience, a node that was restarted recently won't hang the streaming, so each time I want to add or remove a node, I restart all nodes one by one. That approach is not workable in production, since it means restarting every node in the live cluster, so I tried using force to abort the removal directly.
[qihuang.zheng@cass047202 ~]$ nodetool removenode status RemovalStatus: Removing token (-9184682133698409841). Waiting for replication confirmation from [/192.168.47.206,/192.168.47.222,/192.168.47.221,/192.168.47.204,/192.168.47.205,/192.168.47.202,/192.168.47.224,/192.168.47.225].
[qihuang.zheng@cass047202 ~]$ jps 8596 NodeCmd [qihuang.zheng@cass047202 ~]$ kill -9 8596 [qihuang.zheng@cass047202 ~]$ nodetool removenode status RemovalStatus: Removing token (-9184682133698409841). Waiting for replication confirmation from [/192.168.47.206,/192.168.47.222,/192.168.47.221,/192.168.47.204,/192.168.47.205,/192.168.47.202,/192.168.47.224,/192.168.47.225].
4) While the removal is in progress, the removenode command cannot be issued again; the error message, however, points to another way out.
[qihuang.zheng@cass047202 ~]$ /usr/install/cassandra/bin/nodetool status
UN  192.168.47.202  612.13 GB  256  7.4%  abaa0cbc-09d3-4990-8698-ff4d2f2bb4f7  RAC1
DL  192.168.47.226  406.34 GB  256  7.6%  2591cec9-42f9-4f60-a622-00d463910994  RAC1
[qihuang.zheng@cass047202 ~]$ nodetool removenode 2591cec9-42f9-4f60-a622-00d463910994 Exception in thread "main" java.lang.UnsupportedOperationException: This node is already processing a removal. Wait for it to complete, or use 'removenode force' if this has failed. at org.apache.cassandra.service.StorageService.removeNode(StorageService.java:3342) ... at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819) at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
[qihuang.zheng@cass047202 ~]$ nodetool removenode force 2591cec9-42f9-4f60-a622-00d463910994 Missing an argument for removenode (either status, force, or an ID) usage: java org.apache.cassandra.tools.NodeCmd --host <arg> <command>
[qihuang.zheng@cass047202 ~]$ nodetool removenode force RemovalStatus: Removing token (-9184682133698409841). Waiting for replication confirmation from [/192.168.47.206,/192.168.47.222,/192.168.47.221,/192.168.47.204,/192.168.47.205,/192.168.47.202,/192.168.47.224,/192.168.47.225].
CAUTION:
1. Wait at least 72 hours to ensure that old node information is removed from gossip. If removed from the property file too soon, problems may result.
2. auto_bootstrap must not be set to false, otherwise startup fails with: java.lang.RuntimeException: Trying to replace_address with auto_bootstrap disabled will not work, check your configuration
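For completeness, the replace flow that error refers to is enabled by a JVM flag on the replacement node, typically appended where cassandra-env.sh builds JVM_OPTS:

# on the replacement node only, before its first start; remove the flag once bootstrap finishes
JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=192.168.47.229"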
WARN 13:49:37,423 Token -3920394820366823756 changing ownership from /192.168.47.229 to /192.168.48.160 INFO 13:50:40,122 Node /192.168.47.229 is now part of the cluster WARN 13:50:40,127 Not updating token metadata for /192.168.47.229 because I am replacing it INFO 13:50:40,127 Nodes /192.168.47.229 and /192.168.48.160 have the same token -1001533036333769848. Ignoring /192.168.47.229 INFO 13:51:10,272 FatClient /192.168.47.229 has been silent for 30000ms, removing from gossip
Check the node status:
[qihuang.zheng@fp-cass048160 ~]$ nodetool status
--  Address         Load       Tokens  Owns   Host ID                               Rack
UN  192.168.48.161  460.9 GB   256     20.1%  54b3a7e0-f778-4087-98c3-ac84e56f77e6  RAC1
UN  192.168.48.160  6.02 MB    256     21.2%  18a87c56-dcba-4614-adba-804aa7761a06  RAC1
UN  192.168.47.228  5.4 TB     256     19.2%  f3d26148-d2da-479c-ae9e-ae41aced1be9  RAC1
UN  192.168.48.159  483.48 GB  256     19.5%  df9a693b-efc1-41bc-9a42-cf868ea75e65  RAC1
UN  192.168.47.227  686.97 GB  256     20.0%  02575631-6ccb-4803-81fd-5bf7a978726d  RAC1
[qihuang.zheng@192-168-47-227 ~]$ nodetool status
--  Address         Load       Tokens  Owns   Host ID                               Rack
UN  192.168.48.161  460.78 GB  256     20.1%  54b3a7e0-f778-4087-98c3-ac84e56f77e6  RAC1
UN  192.168.47.228  5.4 TB     256     19.2%  f3d26148-d2da-479c-ae9e-ae41aced1be9  RAC1
DN  192.168.47.229  937.05 GB  256     21.2%  18a87c56-dcba-4614-adba-804aa7761a06  RAC1
UN  192.168.48.159  483.41 GB  256     19.5%  df9a693b-efc1-41bc-9a42-cf868ea75e65  RAC1
UN  192.168.47.227  686.93 GB  256     20.0%  02575631-6ccb-4803-81fd-5bf7a978726d  RAC1
1. The odd thing above is that 229 and 160 share the same Host ID, which is why the log keeps printing "have the same token ... Ignoring".
2. 227 does not know about 160 at all; instead it reports 229 as DN.
3. 160 can see every other node, but cannot see 229.
A week later the situation was still exactly the same, and 160's log kept printing Ignoring…
[qihuang.zheng@fp-cass048160 ~]$ nodetool status
--  Address         Load       Tokens  Owns   Host ID                               Rack
UN  192.168.48.161  972.25 GB  256     20.1%  54b3a7e0-f778-4087-98c3-ac84e56f77e6  RAC1
UN  192.168.48.160  250.64 GB  256     21.2%  18a87c56-dcba-4614-adba-804aa7761a06  RAC1
UN  192.168.47.228  5.89 TB    256     19.2%  f3d26148-d2da-479c-ae9e-ae41aced1be9  RAC1
UN  192.168.48.159  1 TB       256     19.5%  df9a693b-efc1-41bc-9a42-cf868ea75e65  RAC1
UN  192.168.47.227  1.15 TB    256     20.0%  02575631-6ccb-4803-81fd-5bf7a978726d  RAC1
[qihuang.zheng@192-168-47-227 ~]$ nodetool status
--  Address         Load       Tokens  Owns   Host ID                               Rack
UN  192.168.48.161  972.16 GB  256     20.1%  54b3a7e0-f778-4087-98c3-ac84e56f77e6  RAC1
UN  192.168.47.228  5.89 TB    256     19.2%  f3d26148-d2da-479c-ae9e-ae41aced1be9  RAC1
DN  192.168.47.229  937.05 GB  256     21.2%  18a87c56-dcba-4614-adba-804aa7761a06  RAC1
UN  192.168.48.159  1 TB       256     19.5%  df9a693b-efc1-41bc-9a42-cf868ea75e65  RAC1
UN  192.168.47.227  1.15 TB    256     20.0%  02575631-6ccb-4803-81fd-5bf7a978726d  RAC1
[qihuang.zheng@192-168-47-227 ~]$ java -jar ./jmxterm.jar Welcome to JMX terminal. Type "help" for available commands. $>open localhost:7199 #Connection to localhost:7199 is opened $>bean org.apache.cassandra.net:type=Gossiper #bean is set to org.apache.cassandra.net:type=Gossiper $>run unsafeAssassinateEndpoint 192.168.47.229 #calling operation unsafeAssassinateEndpoint of mbean org.apache.cassandra.net:type=Gossiper #RuntimeMBeanException: java.lang.NullPointerException
% [Hit Enter to go into Browse Mode] Select a domain: [Enter number for org.apache.cassandra.net] Select an mbean: [Enter number for org.apache.cassandra.net:type=Gossiper] Select an attribute or operation: [Enter number for unsafeAssassinateEndpoint(String p1)] p1 (String): 192.168.47.229
It may also be possible to run it directly (untested): % jmx_invoke -m org.apache.cassandra.net:type=Gossiper unsafeAssassinateEndpoint <STALE-IP-ADDRESS>
[qihuang.zheng@192-168-47-227 ~]$ nodetool status -h 192.168.48.159
--  Address         Load       Tokens  Owns   Host ID                               Rack
UN  192.168.48.161  1.15 TB    256     17.4%  54b3a7e0-f778-4087-98c3-ac84e56f77e6  RAC1
UN  192.168.48.160  461.6 GB   256     15.4%  18a87c56-dcba-4614-adba-804aa7761a06  RAC1
UN  192.168.47.228  6.09 TB    256     16.5%  f3d26148-d2da-479c-ae9e-ae41aced1be9  RAC1
DN  192.168.47.229  937.05 GB  256     17.8%  null                                  RAC1
UN  192.168.48.159  1.2 TB     256     16.4%  df9a693b-efc1-41bc-9a42-cf868ea75e65  RAC1
UN  192.168.47.227  1.34 TB    256     16.5%  02575631-6ccb-4803-81fd-5bf7a978726d  RAC1
Because the primary key of the peers table is peer, a single row can be deleted by its peer value:
cqlsh -e "delete from system.peers where peer='192.168.47.229';" 192.168.47.227 cqlsh -e "delete from system.peers where peer='192.168.47.229';" 192.168.47.228 cqlsh -e "delete from system.peers where peer='192.168.47.229';" 192.168.48.159 cqlsh -e "delete from system.peers where peer='192.168.47.229';" 192.168.48.161
[qihuang.zheng@192-168-47-227 ~]$ nodetool status
--  Address         Load       Tokens  Owns   Host ID                               Rack
UN  192.168.48.161  1.16 TB    256     17.1%  54b3a7e0-f778-4087-98c3-ac84e56f77e6  RAC1
UN  192.168.48.160  473.62 GB  256     16.1%  18a87c56-dcba-4614-adba-804aa7761a06  RAC1
UN  192.168.47.228  6.1 TB     256     16.8%  f3d26148-d2da-479c-ae9e-ae41aced1be9  RAC1
UN  192.168.47.229  44.07 KB   256     16.7%  41211503-27cd-47b0-b7c0-e5a7a4074d34  RAC1  ⬅️
UN  192.168.48.159  1.22 TB    256     17.8%  df9a693b-efc1-41bc-9a42-cf868ea75e65  RAC1
UN  192.168.47.227  1.35 TB    256     15.4%  02575631-6ccb-4803-81fd-5bf7a978726d  RAC1
[qihuang.zheng@192-168-47-227 ~]$ nodetool removenode status RemovalStatus: Removing token (-9189073978895940412). Waiting for replication confirmation from [/192.168.48.161,/192.168.48.160,/192.168.47.228,/192.168.48.159,/192.168.47.227].
[qihuang.zheng@192-168-47-227 ~]$ nodetool ring | grep 9189073978895940412 192.168.47.229 RAC1 Down Leaving 44.07 KB 16.73% -9189073978895940412
[qihuang.zheng@192-168-47-227 ~]$ nodetool removenode force
RemovalStatus: Removing token (-9189073978895940412). Waiting for replication confirmation from [/192.168.48.161,/192.168.48.160,/192.168.47.228,/192.168.48.159,/192.168.47.227].
[qihuang.zheng@192-168-47-227 ~]$ nodetool status
--  Address         Load       Tokens  Owns   Host ID                               Rack
UN  192.168.48.161  1.16 TB    256     20.6%  54b3a7e0-f778-4087-98c3-ac84e56f77e6  RAC1
UN  192.168.48.160  474.38 GB  256     20.2%  18a87c56-dcba-4614-adba-804aa7761a06  RAC1
UN  192.168.47.228  6.1 TB     256     20.4%  f3d26148-d2da-479c-ae9e-ae41aced1be9  RAC1
UN  192.168.48.159  1.22 TB    256     20.1%  df9a693b-efc1-41bc-9a42-cf868ea75e65  RAC1
UN  192.168.47.227  1.35 TB    256     18.8%  02575631-6ccb-4803-81fd-5bf7a978726d  RAC1
[admin@fp-cass048160 ~]$ /usr/install/cassandra/bin/nodetool status
--  Address         Load     Tokens  Owns  Host ID                               Rack
UN  192.168.48.227  2.89 TB  256     ?     f6136233-08b3-4cad-bd37-57ab6dc4622b  RAC1
UN  192.168.48.226  2.81 TB  256     ?     6f4416e1-4493-4909-8427-3738ae72fd82  RAC1
UN  192.168.48.176  6.12 TB  256     ?     08491b7a-8e9f-4b5e-b22b-2e73c455fd3f  RAC1
UN  192.168.48.161  4.48 TB  256     ?     54b3a7e0-f778-4087-98c3-ac84e56f77e6  RAC1
UN  192.168.48.160  4.57 TB  256     ?     18a87c56-dcba-4614-adba-804aa7761a06  RAC1
UN  192.168.48.175  3.21 TB  256     ?     5bbd4400-2132-42b6-91c5-389592b75423  RAC1
UN  192.168.48.159  4.69 TB  256     ?     df9a693b-efc1-41bc-9a42-cf868ea75e65  RAC1
DN  192.168.48.228  2.72 TB  256     ?     dbf53446-35b6-4d18-80e2-396bc633924c  RAC1
[admin@fp-cass048160 ~]$ nohup nodetool removenode dbf53446-35b6-4d18-80e2-396bc633924c &
[admin@fp-cass048160 ~]$ nodetool removenode force
[admin@fp-cass048160 ~]$ /usr/install/cassandra/bin/nodetool status
--  Address         Load     Tokens  Owns  Host ID                               Rack
UN  192.168.48.227  2.89 TB  256     ?     f6136233-08b3-4cad-bd37-57ab6dc4622b  RAC1
UN  192.168.48.226  2.81 TB  256     ?     6f4416e1-4493-4909-8427-3738ae72fd82  RAC1
UN  192.168.48.176  6.12 TB  256     ?     08491b7a-8e9f-4b5e-b22b-2e73c455fd3f  RAC1
UN  192.168.48.161  4.48 TB  256     ?     54b3a7e0-f778-4087-98c3-ac84e56f77e6  RAC1
UN  192.168.48.160  4.57 TB  256     ?     18a87c56-dcba-4614-adba-804aa7761a06  RAC1
UN  192.168.48.175  3.21 TB  256     ?     5bbd4400-2132-42b6-91c5-389592b75423  RAC1
UN  192.168.48.159  4.69 TB  256     ?     df9a693b-efc1-41bc-9a42-cf868ea75e65  RAC1
Node 228 itself, however, thinks all the other nodes are fine. If you try to make it join again, it says it has already joined the ring, yet in reality it no longer belongs to the cluster.
[admin@192-168-48-228 ~]$ /usr/install/cassandra/bin/nodetool status
--  Address         Load     Tokens  Owns  Host ID                               Rack
UN  192.168.48.227  2.88 TB  256     ?     f6136233-08b3-4cad-bd37-57ab6dc4622b  RAC1
UN  192.168.48.226  2.79 TB  256     ?     6f4416e1-4493-4909-8427-3738ae72fd82  RAC1
UN  192.168.48.176  6.1 TB   256     ?     08491b7a-8e9f-4b5e-b22b-2e73c455fd3f  RAC1
UN  192.168.48.161  4.47 TB  256     ?     54b3a7e0-f778-4087-98c3-ac84e56f77e6  RAC1
UN  192.168.48.228  2.72 TB  256     ?     dbf53446-35b6-4d18-80e2-396bc633924c  RAC1
UN  192.168.48.160  4.55 TB  256     ?     18a87c56-dcba-4614-adba-804aa7761a06  RAC1
UN  192.168.48.175  3.2 TB   256     ?     5bbd4400-2132-42b6-91c5-389592b75423  RAC1
UN  192.168.48.159  4.67 TB  256     ?     df9a693b-efc1-41bc-9a42-cf868ea75e65  RAC1
Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless [admin@192-168-48-228 ~]$ /usr/install/cassandra/bin/nodetool join nodetool: This node has already joined the ring.
Network status (netstats): shows a node's active streams.
Normal case, with no streams:
[admin@cass047202 ~]$ nodetool netstats
Mode: NORMAL
Not sending any streams.
Read Repair Statistics:
Attempted: 749
Mismatch (Blocking): 4
Mismatch (Background): 13
Pool Name     Active   Pending   Completed
Commands      n/a      0         581681910
Responses     n/a      0         519983825
[qihuang.zheng@cass047202 ~]$ nodetool info
Token                  : (invoke with -T/--tokens to see all 256 tokens)
ID                     : abaa0cbc-09d3-4990-8698-ff4d2f2bb4f7
Gossip active          : true
Thrift active          : true
Native Transport active: true
Load                   : 618.81 GB
Generation No          : 1445525962
Uptime (seconds)       : 651191
Heap Memory (MB)       : 6167.96 / 15974.44
Off Heap Memory (MB)   : 1154.13
Data Center            : DC1
Rack                   : RAC1
Exceptions             : 50
Key Cache              : size 520120120 (bytes), capacity 536870912 (bytes), 112077861 hits, 182128983 requests, 0.656 recent hit rate, 14400 save period in seconds
Row Cache              : size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds
To look at just the Key Cache line on each node:
./pssh.sh ip_all.txt "/usr/install/cassandra/bin/nodetool info | tail -2 | head -1"
./pssh.sh ip_all.txt "/usr/install/cassandra/bin/nodetool info | sed -n '14p'"
[qihuang.zheng@cass047202 cassandra]$ nodetool cfstats forseti.velocity
Keyspace: forseti
    Read Count: 10470099
    Read Latency: 1.3186399419909973 ms.
    Write Count: 146970362
    Write Latency: 0.06062576270989929 ms.
    Pending Tasks: 0
        Table: velocity
        SSTable count: 2144
        SSTables in each level: [1, 10, 96, 723, 1314, 0, 0, 0, 0]
        Space used (live), bytes: 509031385679
        Space used (total), bytes: 523815500936
        Off heap memory used (total), bytes: 558210701
        SSTable Compression Ratio: 0.23635049381008288
        Number of keys (estimate): 269787648
        Memtable cell count: 271431
        Memtable data size, bytes: 141953019
        Memtable switch count: 1713
        Local read count: 10470099
        Local read latency: 1.266 ms
        Local write count: 146970371
        Local write latency: 0.053 ms
        Pending tasks: 0
        Bloom filter false positives: 534721
        Bloom filter false ratio: 0.13542
        Bloom filter space used, bytes: 180529808
        Bloom filter off heap memory used, bytes: 180512656
        Index summary off heap memory used, bytes: 118613037
        Compression metadata off heap memory used, bytes: 259085008
        Compacted partition minimum bytes: 104
        Compacted partition maximum bytes: 190420296972
        Compacted partition mean bytes: 8656
        Average live cells per slice (last five minutes): 0.0
        Average tombstones per slice (last five minutes): 0.0
[qihuang.zheng@spark047211 ~]$ nodetool -h 192.168.48.159 cfstats forseti_fp.android_device_session
Keyspace: forseti_fp
    Read Count: 3436820
    Read Latency: 0.7271564521854504 ms.
    Write Count: 1242325989
    Write Latency: 0.01608114556074058 ms.
    Pending Flushes: 0
        Table: android_device_session
        SSTable count: 14
        Space used (live): 3315056965329
        Space used (total): 3315312862453
        Space used by snapshots (total): 0
        Off heap memory used (total): 1813094460
        SSTable Compression Ratio: 0.37206192803944754
        Number of keys (estimate): 954623103   (≈ 0.9 billion)
        Memtable cell count: 92654
        Memtable data size: 104729033
        Memtable off heap memory used: 0
        Memtable switch count: 13386
        Local read count: 3436820
        Local read latency: 0.728 ms
        Local write count: 1242326281
        Local write latency: 0.017 ms
        Pending flushes: 0
        Bloom filter false positives: 15
        Bloom filter false ratio: 0.00000
        Bloom filter space used: 607412928
        Bloom filter off heap memory used: 607412816
        Index summary off heap memory used: 138144620
        Compression metadata off heap memory used: 1067537024
        Compacted partition minimum bytes: 125
        Compacted partition maximum bytes: 4866323
        Compacted partition mean bytes: 10051
        Average live cells per slice (last five minutes): 0.10287086386072478
        Maximum live cells per slice (last five minutes): 7.0
        Average tombstones per slice (last five minutes): 0.14188850295078587
        Maximum tombstones per slice (last five minutes): 2164.0
[qihuang.zheng@spark047211 ~]$ nodetool -h 192.168.48.162 cfstats forseti.velocity_app
Keyspace: forseti
    Read Count: 120443046
    Read Latency: 1.1013535886995087 ms.
    Write Count: 2058015125
    Write Latency: 0.016679299918653415 ms.
    Pending Tasks: 0
        Table: velocity_app
        SSTable count: 24
        Space used (live), bytes: 2670044515793
        Space used (total), bytes: 2670239936283
        Off heap memory used (total), bytes: 2958340937
        SSTable Compression Ratio: 0.28255581535232427
        Number of keys (estimate): 1976325248   (≈ 1.9 billion)
        Memtable cell count: 613404
        Memtable data size, bytes: 233170920
        Memtable switch count: 4214
        Local read count: 120443053
        Local read latency: 0.931 ms
        Local write count: 2058015151
        Local write latency: 0.015 ms
        Pending tasks: 0
        Bloom filter false positives: 45097
        Bloom filter false ratio: 0.00732
        Bloom filter space used, bytes: 1230425688
        Bloom filter off heap memory used, bytes: 1230425496
        Index summary off heap memory used, bytes: 578011001
        Compression metadata off heap memory used, bytes: 1149904440
        Compacted partition minimum bytes: 150
        Compacted partition maximum bytes: 158,683,580,810   (≈ 158 GB)
        Compacted partition mean bytes: 5225
[qihuang.zheng@spark047211 ~]$ nodetool -h 192.168.48.162 cfstats forseti.velocity_partner
Keyspace: forseti
    Read Count: 3664048
    Read Latency: 1.265197941730021 ms.
    Write Count: 1963754310
    Write Latency: 0.017543929064120042 ms.
    Pending Tasks: 0
        Table: velocity_partner
        SSTable count: 5798
        SSTables in each level: [1, 11/10, 90, 881, 4813, 0, 0, 0, 0]
        Space used (live), bytes: 1237251258441
        Space used (total), bytes: 1240223108780
        Off heap memory used (total), bytes: 1220821758
        SSTable Compression Ratio: 0.2518164031088747
        Number of keys (estimate): 821395584   (≈ 0.8 billion)
        Memtable cell count: 775752
        Memtable data size, bytes: 265533044
        Memtable switch count: 4217
        Local read count: 3664048
        Local read latency: 1.200 ms
        Local write count: 1963754352
        Local write latency: 0.017 ms
        Pending tasks: 0
        Bloom filter false positives: 599
        Bloom filter false ratio: 0.00000
        Bloom filter space used, bytes: 503778320
        Bloom filter off heap memory used, bytes: 503731936
        Index summary off heap memory used, bytes: 144796454
        Compression metadata off heap memory used, bytes: 572293368
        Compacted partition minimum bytes: 259
        Compacted partition maximum bytes: 91,830,775,932
        Compacted partition mean bytes: 6265
        Average live cells per slice (last five minutes): 1.0
        Average tombstones per slice (last five minutes): 0.0
[qihuang.zheng@spark047211 ~]$ nodetool -h 192.168.48.162 cfstats forseti.velocity_global
Keyspace: forseti
    Read Count: 27019937
    Read Latency: 0.8387888214173111 ms.
    Write Count: 2057412727
    Write Latency: 0.01946266525841317 ms.
    Pending Tasks: 0
        Table: velocity_global
        SSTable count: 5468
        SSTables in each level: [7/4, 12/10, 77, 703, 4667, 0, 0, 0, 0]
        Space used (live), bytes: 1476150066588
        Space used (total), bytes: 1477238480867
        Off heap memory used (total), bytes: 1092884433
        SSTable Compression Ratio: 0.24933201816062378
        Number of keys (estimate): 558798080   (≈ 0.5 billion)
        Memtable cell count: 120625
        Memtable data size, bytes: 38130612
        Memtable switch count: 4250
        Local read count: 27019937
        Local read latency: 1.128 ms
        Local write count: 2057412797
        Local write latency: 0.016 ms
        Pending tasks: 0
        Bloom filter false positives: 1700
        Bloom filter false ratio: 0.00000
        Bloom filter space used, bytes: 319642480
        Bloom filter off heap memory used, bytes: 319598736
        Index summary off heap memory used, bytes: 100862649
        Compression metadata off heap memory used, bytes: 672423048
        Compacted partition minimum bytes: 311
        Compacted partition maximum bytes: 568,591,960,032   (≈ 568 GB)
        Compacted partition mean bytes: 10777
        Average live cells per slice (last five minutes): 0.0
        Average tombstones per slice (last five minutes): 0.0
sstable_size_in_mb defaults to 160 MB and is the target size for SSTables that use the leveled compaction strategy. Although SSTable sizes should be less than or equal to sstable_size_in_mb, it is possible to end up with a larger SSTable during compaction. This happens when the data for a given partition key is exceptionally large: the data is not split into two SSTables, because a partition key only ever lives in a single SSTable.
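For reference, the target size is a per-table compaction option; a sketch against one of the LCS tables above, with the default values shown just to illustrate the syntax:

cqlsh 192.168.48.162 -e "ALTER TABLE forseti.velocity_partner
  WITH compaction = {'class': 'LeveledCompactionStrategy', 'sstable_size_in_mb': 160};"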
The last column shows rows_merged as {sstables: rows}. For example, {1:3, 3:1} means 3 rows were taken from a single SSTable (1:3) and 1 row was taken from 3 SSTables (3:1) to build the one SSTable produced by that compaction operation.
[qihuang.zheng@cass047202 ~]$ nodetool compactionstats
pending tasks: 1
   compaction type   keyspace      table    completed        total   unit   progress
        Compaction    forseti   velocity   4094787925   6749331650   bytes     60.67%
Active compaction remaining time : 0h00m19s
tpstats
Normally there should not be any Dropped messages, but the driver is reporting: All host(s) tried for query failed,… connection has been closed. Could it be that there are too many connections, so new ones are simply being refused?
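A quick way to test that hypothesis is to count established native-protocol connections on the node and compare that with what tpstats reports; this is a rough diagnostic, not part of the original troubleshooting:

# count established client connections to the native transport port 9042
netstat -an | grep ':9042' | grep ESTABLISHED | wc -l
# and check whether the node itself reports dropped messages
nodetool tpstats | grep -A 20 'Message type'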
INFO [SharedPool-Worker-40] 2016-06-23 17:38:19,848 Message.java:532 - Unexpected exception during request; channel = [id: 0x04539b0a, /192.168.47.34:64733 :> /192.168.47.12:9042] java.io.IOException: Error while read(...): Connection reset by peer at io.netty.channel.epoll.Native.readAddress(Native Method) ~[netty-all-4.0.23.Final.jar:4.0.23.Final] at io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.doReadBytes(EpollSocketChannel.java:675) ~[netty-all-4.0.23.Final.jar:4.0.23.Final] at io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.epollInReady(EpollSocketChannel.java:714) ~[netty-all-4.0.23.Final.jar:4.0.23.Final] at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:326) ~[netty-all-4.0.23.Final.jar:4.0.23.Final] at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:264) ~[netty-all-4.0.23.Final.jar:4.0.23.Final] at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) ~[netty-all-4.0.23.Final.jar:4.0.23.Final] at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137) ~[netty-all-4.0.23.Final.jar:4.0.23.Final] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
sstable writer
Loads newly placed SSTables onto the system without a restart.
# Small-batch test (the cp destination is inferred: the files end up in the live table directory refreshed below)
cp /home/admin/md5id_20160601/data/md5_id/data-md5_id-ka-2165* ~/cassandra/data/md5/md5_id-f88d3930345811e694a62bcdca057dae/
cd ~/cassandra/data/md5/md5_id-f88d3930345811e694a62bcdca057dae/
rename data md5 *
nodetool -h 192.168.47.202 refresh md5 md5_id
# Rename all the files (keyspace prefix data -> md5)
cd /home/admin/md5id_20160601/data/md5_id/
rename data md5 *
# List all the distinct sstable numbers
for f in $( ls | cut -d'-' -f-4 | uniq | head -5 ); do
  echo $f
done
# Several at a time, but how do we loop over everything until the directory is fully processed?
# -> use a while loop that keeps running as long as files remain in the directory
ls | cut -d'-' -f-4 | uniq | head -5 | while read f; do
  # group by prefix: one batch holds several data files, and each data file has 8 related component files
  mv $f-* ~/cassandra/data/md5/md5_id-f88d3930345811e694a62bcdca057dae/   # move to Cassandra's target location
done
nodetool -h 192.168.47.202 refresh md5 md5_id   # refresh the sstables
# Simulate moving the files of directory test into directory test1 in several batches
rm -rf test test1 && mkdir test && cd test && touch 10 11 12 13 14 15 20 21 22 23 30 31 32 && cd ~/ && mkdir test1
cd test flag="BEGIN" while [ ! -d test ]; do ls | head -2 | while read f; do mv $f ~/test1 done #确保只需要执行一次, test的文件都被移动到test1后,test下没有文件,判断依据是:ls -A为空 #但是如果没有加任何条件,while循环会无条件判断,因为while中是判断存在文件夹 if [[ "`ls -A ~/test`" = "" && $flag != "OVER" ]]; then flag="OVER" echo "DONE" fi done
# The check can go straight into the while condition -- GOOD
cd test
flag="BEGIN"
while [[ ! -d test && $flag != "OVER" ]]; do
  ls | head -2 | while read f; do
    mv $f ~/test1
  done
  if [ "`ls -A ~/test`" = "" ]; then
    flag="OVER"
  fi
done
echo "OVER..."
# Of course all of the if conditions can be moved into the while condition as well
cd test
while [[ ! -d test && "`ls -A ~/test`" != "" ]]; do
  ls | head -2 | while read f; do
    mv $f ~/test1
  done
done
echo "OVER..."
## Final script
startT=$(date +%s)
cd /home/admin/md5id_20160601/data/md5_id
# keep going while files remain in the source directory
while [[ "`ls -A /home/admin/md5id_20160601/data/md5_id`" != "" ]]; do
  echo "......"
  start=$(date +%s)
  ls | cut -d'-' -f-4 | uniq | head -5 | while read f; do
    cfile="$f-*"
    echo $cfile
    mv $cfile /home/admin/cassandra/data/md5/md5_id-f88d3930345811e694a62bcdca057dae/
  done
  nodetool -h 192.168.47.202 refresh md5 md5_id
  end=$(date +%s)
  time=$(( $end - $start ))
  echo "Batch cost: $time"
done
echo "OVER..."
endT=$(date +%s)
timeT=$(( $endT - $startT ))
echo "Total cost: $timeT"
A single one-liner version: if it runs too long, just Ctrl+Z it, then bg it and keep an eye on it with jobs:
cd /home/admin/md5id_20160601/data/md5_id ls | cut -d'-' -f-4 | uniq | while read f; do echo "batch $f"; mv $f-* /home/admin/cassandra/data/md5/md5_id-f88d3930345811e694a62bcdca057dae/; nodetool -h 192.168.47.202 refresh md5 md5_id; done
# # There is insufficient memory for the Java Runtime Environment to continue. # Native memory allocation (malloc) failed to allocate 350224384 bytes for committing reserved memory. # Possible reasons: # The system is out of physical RAM or swap space # In 32 bit mode, the process size limit was hit # Possible solutions: # Reduce memory load on the system # Increase physical memory or swap space # Check if swap backing store is full # Use 64 bit Java on a 64 bit OS # Decrease Java heap size (-Xmx/-Xms) # Decrease number of Java threads # Decrease Java thread stack sizes (-Xss) # Set larger code cache with -XX:ReservedCodeCacheSize= # This output file may be truncated or incomplete. # # Out of Memory Error (os_linux.cpp:2726), pid=9998, tid=140593796617984 # # JRE version: (7.0_51-b13) (build ) # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.51-b03 mixed mode linux-amd64 compressed oops) # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again #
--------------- T H R E A D ---------------
Current thread (0x00007fde84009800): JavaThread "Unknown thread" [_thread_in_vm, id=10000, stack(0x00007fde8b3e1000,0x00007fde8b4e2000)]
Stack: [0x00007fde8b3e1000,0x00007fde8b4e2000], sp=0x00007fde8b4e01a0, free space=1020k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x992f4a] VMError::report_and_die()+0x2ea V [libjvm.so+0x4931ab] report_vm_out_of_memory(char const*, int, unsigned long, char const*)+0x9b V [libjvm.so+0x81338e] os::Linux::commit_memory_impl(char*, unsigned long, bool)+0xfe V [libjvm.so+0x81383f] os::Linux::commit_memory_impl(char*, unsigned long, unsigned long, bool)+0x4f V [libjvm.so+0x813a2c] os::pd_commit_memory(char*, unsigned long, unsigned long, bool)+0xc V [libjvm.so+0x80daea] os::commit_memory(char*, unsigned long, unsigned long, bool)+0x2a V [libjvm.so+0x87fcd3] PSVirtualSpace::expand_by(unsigned long)+0x53 V [libjvm.so+0x86eaf3] PSOldGen::initialize(ReservedSpace, unsigned long, char const*, int)+0x103 V [libjvm.so+0x299043] AdjoiningGenerations::AdjoiningGenerations(ReservedSpace, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long)+0x3e3 V [libjvm.so+0x8341e0] ParallelScavengeHeap::initialize()+0x550 V [libjvm.so+0x9664ca] Universe::initialize_heap()+0xca V [libjvm.so+0x967699] universe_init()+0x79 V [libjvm.so+0x5a9625] init_globals()+0x65 V [libjvm.so+0x94ef8d] Threads::create_vm(JavaVMInitArgs*, bool*)+0x1ed V [libjvm.so+0x6307e4] JNI_CreateJavaVM+0x74 C [libjli.so+0x2f8e] JavaMain+0x9e
ERROR 01:47:33 Exception encountered during startup org.apache.cassandra.io.FSReadError: java.lang.NullPointerException at org.apache.cassandra.db.ColumnFamilyStore.removeUnfinishedCompactionLeftovers(ColumnFamilyStore.java:668) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:308) [apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:564) [apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:653) [apache-cassandra-2.1.13.jar:2.1.13] Caused by: java.lang.NullPointerException: null at org.apache.cassandra.db.ColumnFamilyStore.removeUnfinishedCompactionLeftovers(ColumnFamilyStore.java:660) ~[apache-cassandra-2.1.13.jar:2.1.13] ... 3 common frames omitted FSReadError in Failed to remove unfinished compaction leftovers (file: /home/admin/cassandra/data/md5/md5_id-f88d3930345811e694a62bcdca057dae/md5-md5_id-ka-1051-Statistics.db). See log for details. at org.apache.cassandra.db.ColumnFamilyStore.removeUnfinishedCompactionLeftovers(ColumnFamilyStore.java:668) at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:308) at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:564) at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:653) Caused by: java.lang.NullPointerException at org.apache.cassandra.db.ColumnFamilyStore.removeUnfinishedCompactionLeftovers(ColumnFamilyStore.java:660) ... 3 more Exception encountered during startup: java.lang.NullPointerException
Polling page always armed Deoptimize 4 GenCollectForAllocation 1 CMS_Initial_Mark 1 CMS_Final_Remark 1 EnableBiasedLocking 1 RevokeBias 77 BulkRevokeBias 5 Exit 1 15 VM operations coalesced during safepoint Maximum sync time 6133 ms Maximum vm operation time (except for Exit VM operation) 521 ms
WARN [SharedPool-Worker-52] 2016-06-19 10:23:30,376 AbstractTracingAwareExecutorService.java:169 - Uncaught exception on thread Thread[SharedPool-Worker-52,5,main]: {} java.lang.RuntimeException: java.lang.RuntimeException: java.io.FileNotFoundException: /home/admin/cassandra/data/md5/md5_id-f88d3930345811e694a62bcdca057dae/md5-md5_id-ka-139-Data.db (打开的文件过>多) at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2244) ~[apache-cassandra-2.1.13.jar:2.1.13] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_51] at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-2.1.13.jar:2.1.13] at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51] Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: /home/admin/cassandra/data/md5/md5_id-f88d3930345811e694a62bcdca057dae/md5-md5_id-ka-139-Data.db (打开的文件过多) at org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:52) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.io.util.CompressedPoolingSegmentedFile.createReader(CompressedPoolingSegmentedFile.java:85) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.io.sstable.SSTableReader.openDataReader(SSTableReader.java:2075) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.io.sstable.SSTableScanner.<init>(SSTableScanner.java:84) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.io.sstable.SSTableScanner.getScanner(SSTableScanner.java:63) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:1859) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.db.RowIteratorFactory.getIterator(RowIteratorFactory.java:67) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.db.ColumnFamilyStore.getSequentialIterator(ColumnFamilyStore.java:2074) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:2191) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.db.RangeSliceCommand.executeLocally(RangeSliceCommand.java:132) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.service.StorageProxy$LocalRangeSliceRunnable.runMayThrow(StorageProxy.java:1567) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2241) ~[apache-cassandra-2.1.13.jar:2.1.13] ... 4 common frames omitted Caused by: java.io.FileNotFoundException: /home/admin/cassandra/data/md5/md5_id-f88d3930345811e694a62bcdca057dae/md5-md5_id-ka-139-Data.db (打开的文件过多) at java.io.RandomAccessFile.open(Native Method) ~[na:1.7.0_51] at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241) ~[na:1.7.0_51] at org.apache.cassandra.io.util.RandomAccessReader.<init>(RandomAccessReader.java:65) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.io.compress.CompressedRandomAccessReader.<init>(CompressedRandomAccessReader.java:70) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:48) ~[apache-cassandra-2.1.13.jar:2.1.13] ... 15 common frames omitted
WARN [epollEventLoopGroup-2-1] 2016-06-19 10:23:54,839 Slf4JLogger.java:151 - An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception. java.io.IOException: Error during accept(...): 打开的文件过多 at io.netty.channel.epoll.Native.accept(Native Method) ~[netty-all-4.0.23.Final.jar:4.0.23.Final] at io.netty.channel.epoll.EpollServerSocketChannel$EpollServerSocketUnsafe.epollInReady(EpollServerSocketChannel.java:102) ~[netty-all-4.0.23.Final.jar:4.0.23.Final] at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:326) [netty-all-4.0.23.Final.jar:4.0.23.Final] at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:264) [netty-all-4.0.23.Final.jar:4.0.23.Final] at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) [netty-all-4.0.23.Final.jar:4.0.23.Final] at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137) [netty-all-4.0.23.Final.jar:4.0.23.Final] at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]
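Both errors come from the process hitting its open-file limit (every sstable component and every client socket counts against it). A hedged checklist for confirming and raising it, assuming a single Cassandra process per host:

# what limit is the running Cassandra process actually under?
cat /proc/$(pgrep -f CassandraDaemon)/limits | grep 'open files'
# how many file descriptors is it holding right now?
ls /proc/$(pgrep -f CassandraDaemon)/fd | wc -l
# raise the limit for the service user, e.g. in /etc/security/limits.conf, then restart:
#   admin  -  nofile  100000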
[admin@cass047202 cassandra]$ cqlsh 192.168.47.202
Connected to forseti_cluster at 192.168.47.202:9042.
[cqlsh 5.0.1 | Cassandra 2.1.13 | CQL spec 3.2.1 | Native protocol v3]
Use HELP for help.
cqlsh> use data;
cqlsh:data> select * from md5_id limit 1;
Warning: schema version mismatch detected, which might be caused by DOWN nodes; if this is not the case, check the schema versions of your nodes in system.local and system.peers. Schema metadata was not refreshed. See log for details.
cqlsh:data> quit
[admin@cass047202 cassandra]$ cqlsh 192.168.47.202
Connection error: ('Unable to connect to any servers', {'192.168.47.202': OperationTimedOut('errors=None, last_host=None',)})
[admin@cass047202 cassandra]$ nodetool status
--  Address         Load       Tokens  Owns  Host ID                               Rack
UN  192.168.47.206  802.88 GB  256     ?     75f42842-e3ac-4bbe-947d-6b7537a521da  RAC1
UN  192.168.47.222  670.08 GB  256     ?     1cc2c236-8def-4f2b-8149-28d591fc6b05  RAC1
UN  192.168.47.204  627.63 GB  256     ?     91ad3d42-4207-46fe-8188-34c3f0b2dbd2  RAC1
UN  192.168.47.221  677.56 GB  256     ?     87e100ed-85c4-44cb-9d9f-2d602d016038  RAC1
UN  192.168.47.205  724.58 GB  256     ?     ac6313c8-e0b5-463b-8f90-55dc0f59e476  RAC1
UN  192.168.47.202  1.72 TB    256     ?     abaa0cbc-09d3-4990-8698-ff4d2f2bb4f7  RAC1
UN  192.168.47.203  1.01 TB    256     ?     19b0b9cc-cad2-4b61-8da6-95423fe94af8  RAC1
UN  192.168.47.224  739.47 GB  256     ?     27e84abe-fb06-47ff-8861-130767ee006b  RAC1
UN  192.168.47.225  677.46 GB  256     ?     216c67cf-de7d-4190-9d0d-441fc16a7f71  RAC1
select key,bootstrapped,broadcast_address,cluster_name,cql_version,data_center,gossip_generation,host_id,listen_address,native_protocol_version,partitioner, rack,release_version,rpc_address,schema_version,thrift_version from system.local;
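Since schema_version is kept in system.local and system.peers, the quickest cross-check is to ask a healthy node which versions it sees, for example:

nodetool -h 192.168.47.203 describecluster
cqlsh 192.168.47.203 -e "select peer, schema_version from system.peers;"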
[admin@cass047203 ~]$ nodetool status
--  Address         Load       Tokens  Owns  Host ID                               Rack
UN  192.168.47.205  724.58 GB  256     ?     ac6313c8-e0b5-463b-8f90-55dc0f59e476  RAC1
DN  192.168.47.202  1.72 TB    256     ?     abaa0cbc-09d3-4990-8698-ff4d2f2bb4f7  RAC1
UN  192.168.47.203  1.01 TB    256     ?     19b0b9cc-cad2-4b61-8da6-95423fe94af8  RAC1
INFO 01:50:05 Harmless error reading saved cache /home/admin/cassandra/saved_caches/KeyCache-ba.db java.lang.RuntimeException: Cache schema version b48bf712-fad2-3951-bb18-aa178e738b30 does not match current schema version 73e0e7c4-03b1-3db6-8967-d7c0144faa5c at org.apache.cassandra.cache.AutoSavingCache.loadSaved(AutoSavingCache.java:188) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.cache.AutoSavingCache$3.call(AutoSavingCache.java:148) [apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.cache.AutoSavingCache$3.call(AutoSavingCache.java:144) [apache-cassandra-2.1.13.jar:2.1.13] at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_51] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_51] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_51] at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]
As long as we do not run select * from md5_id limit 1;, the other nodes do not mark 192.168.47.202 as DN.
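The saved-cache message itself is harmless, but the stale KeyCache can be dropped so it is not re-read on every restart. A minimal sketch, assuming the saved_caches path shown above; deleting the files is only safe while the node is stopped:

# On a running node, invalidate the key cache (it is rebuilt on demand)
$ nodetool invalidatekeycache
# Or, with the node stopped, remove the stale saved-cache files
$ rm /home/admin/cassandra/saved_caches/KeyCache-*.db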
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
  at java.util.Arrays.copyOfRange(Arrays.java:2694)
  at java.lang.String.<init>(String.java:203)
  at java.lang.StringBuilder.toString(StringBuilder.java:405)
  at cn.fraudmetrix.vulcan.rainbowtable.util.AbstractGenData.last1(AbstractGenData.java:220)
  at cn.fraudmetrix.vulcan.rainbowtable.util.AbstractGenData.genIdCardOneProvince(AbstractGenData.java:82)
  at cn.fraudmetrix.vulcan.rainbowtable.util.AbstractGenData.genIdCard(AbstractGenData.java:63)
  at cn.fraudmetrix.vulcan.rainbowtable.util.AbstractGenData.genData(AbstractGenData.java:55)
  at cn.fraudmetrix.vulcan.rainbowtable.sstable.BulkLoadIdCard.main(BulkLoadIdCard.java:116)
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x0000000632f80000, 436731904, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 436731904 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /home/admin/hs_err_pid33681.log
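Both failures come from the BulkLoadIdCard generator rather than Cassandra itself: the heap is too small for the generated data (GC overhead limit exceeded), and the host has no free native memory left for the JVM to commit. A hedged sketch of launching it with an explicit heap sized to the machine (the jar name and heap size are placeholders; the class name is taken from the stack trace above):

# Check free memory before choosing a heap size
$ free -m
# Run the generator with an explicit heap cap; rainbowtable.jar is a placeholder for the application's jar
$ java -Xms4g -Xmx4g -cp "rainbowtable.jar:/usr/install/cassandra/lib/*" \
    cn.fraudmetrix.vulcan.rainbowtable.sstable.BulkLoadIdCard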
  PID USER  PR  NI  VIRT  RES  SHR S  %CPU  %MEM     TIME+ COMMAND
 7625 admin 20   0 21.8g  19g  16m R 101.7  63.6  17:55.41 java
 7603 admin 20   0 21.8g  19g  16m R  99.7  63.6  28:13.74 java
 7627 admin 20   0 21.8g  19g  16m R  99.7  63.6  13:13.43 java
 7628 admin 20   0 21.8g  19g  16m R  99.7  63.6  13:13.73 java
[admin@cass047225 ~]$ ./show-busy-javathreads.sh
Busy(73.4%) thread(7603/0x1db3) stack of java process(7594) under user(admin):
"main" prio=10 tid=0x00007f7468009000 nid=0x1db3 runnable [0x00007f7471aa9000]
   java.lang.Thread.State: RUNNABLE
  at java.util.Arrays.copyOf(Arrays.java:2271)
  at java.lang.StringCoding.safeTrim(StringCoding.java:79)
  at java.lang.StringCoding.encode(StringCoding.java:365)
  at java.lang.String.getBytes(String.java:939)
  at org.apache.cassandra.utils.ByteBufferUtil.bytes(ByteBufferUtil.java:225)
  at org.apache.cassandra.serializers.AbstractTextSerializer.serialize(AbstractTextSerializer.java:49)
  at org.apache.cassandra.serializers.AbstractTextSerializer.serialize(AbstractTextSerializer.java:26)
  at org.apache.cassandra.db.marshal.AbstractType.decompose(AbstractType.java:73)
  at org.apache.cassandra.io.sstable.CQLSSTableWriter.addRow(CQLSSTableWriter.java:142)
  at org.apache.cassandra.io.sstable.CQLSSTableWriter.addRow(CQLSSTableWriter.java:118)
  at cn.fraudmetrix.vulcan.rainbowtable.sstable.BulkLoadIdCard.batchWrite(BulkLoadIdCard.java:55)
  at cn.fraudmetrix.vulcan.rainbowtable.util.AbstractGenData.genIdCardOneProvince(AbstractGenData.java:86)
  at cn.fraudmetrix.vulcan.rainbowtable.util.AbstractGenData.genIdCard(AbstractGenData.java:63)
  at cn.fraudmetrix.vulcan.rainbowtable.util.AbstractGenData.genData(AbstractGenData.java:55)
  at cn.fraudmetrix.vulcan.rainbowtable.sstable.BulkLoadIdCard.main(BulkLoadIdCard.java:116)
Busy(46.8%) thread(7625/0x1dc9) stack of java process(7594) under user(admin): "G1 Concurrent Refinement Thread#0" prio=10 tid=0x00007f746803c800 nid=0x1dc9 runnable
Busy(34.5%) thread(7628/0x1dcc) stack of java process(7594) under user(admin): "Gang worker#1 (G1 Parallel Marking Threads)" prio=10 tid=0x00007f746806b800 nid=0x1dcc runnable
Busy(34.5%) thread(7627/0x1dcb) stack of java process(7594) under user(admin): "Gang worker#0 (G1 Parallel Marking Threads)" prio=10 tid=0x00007f7468069800 nid=0x1dcb runnable
Busy(25.6%) thread(7688/0x1e08) stack of java process(7594) under user(admin):
"Thread-2" prio=10 tid=0x00007f7469e3b000 nid=0x1e08 waiting on condition [0x00007f73f012c000]
   java.lang.Thread.State: WAITING (parking)
  at sun.misc.Unsafe.park(Native Method)
  - parking to wait for <0x00000003fb131618> (a java.util.concurrent.SynchronousQueue$TransferStack)
  at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
  at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:458)
  at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
  at java.util.concurrent.SynchronousQueue.take(SynchronousQueue.java:925)
  at org.apache.cassandra.io.sstable.SSTableSimpleUnsortedWriter$DiskWriter.run(SSTableSimpleUnsortedWriter.java:240)
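The script above only maps busy OS thread IDs from top -H onto a JVM thread dump; the same check can be done by hand. A sketch using the hot thread from the top output (PID 7594 is the Java process, 7603 the busy thread):

# Convert the Linux thread ID to the hex nid used in thread dumps (7603 -> 0x1db3)
$ printf '0x%x\n' 7603
# Find that nid in a jstack dump of the process
$ jstack 7594 | grep -A 20 'nid=0x1db3'

Here the time is split between String encoding inside CQLSSTableWriter.addRow and the G1 marking/refinement threads, which matches the GC-overhead OutOfMemoryError above.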
[/192.168.47.222, /192.168.47.204, /192.168.47.203, /192.168.47.224]
java.util.concurrent.ExecutionException: org.apache.cassandra.streaming.StreamException: Stream failed
  at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
  at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
  at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
  at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:125)
Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
  at org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
  at com.google.common.util.concurrent.Futures$4.run(Futures.java:1172)
  at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
  at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
  at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
  at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
  at org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:208)
  at org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:184)
  at org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:415)
  at org.apache.cassandra.streaming.StreamSession.complete(StreamSession.java:607)
  at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:471)
  at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:256)
  at java.lang.Thread.run(Thread.java:744)
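The loader-side "Stream failed" only lists the hosts whose sessions dropped; the actual cause is logged on the receiving nodes. A sketch for checking one of the failed targets (nodetool netstats is standard; the log path is an assumption to adjust):

# Streaming sessions as seen by a receiving node
$ nodetool -h 192.168.47.222 netstats
# Look for the matching stream error in that node's system log (path is an assumption)
$ grep -i stream /usr/install/cassandra/logs/system.log | tail -n 50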
ERROR [CompactionExecutor:19] 2016-06-29 10:06:19,266 CassandraDaemon.java:229 - Exception in thread Thread[CompactionExecutor:19,1,main]
java.lang.RuntimeException: Not enough space for compaction, estimated sstables = 1, expected write size = 221195610
  at org.apache.cassandra.db.compaction.CompactionTask.checkAvailableDiskSpace(CompactionTask.java:296) ~[apache-cassandra-2.1.13.jar:2.1.13]
  at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:124) ~[apache-cassandra-2.1.13.jar:2.1.13]
  at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-2.1.13.jar:2.1.13]
  at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:73) ~[apache-cassandra-2.1.13.jar:2.1.13]
  at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59) ~[apache-cassandra-2.1.13.jar:2.1.13]
  at org.apache.cassandra.db.compaction.CompactionManager$8.runMayThrow(CompactionManager.java:626) ~[apache-cassandra-2.1.13.jar:2.1.13]
  at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-2.1.13.jar:2.1.13]
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_51]
  at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_51]
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_51]
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_51]
  at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]
ERROR [HintedHandoffManager:1] 2016-06-29 10:06:19,267 CassandraDaemon.java:229 - Exception in thread Thread[HintedHandoffManager:1,1,main]
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: Not enough space for compaction, estimated sstables = 1, expected write size = 221195610
  at org.apache.cassandra.db.HintedHandOffManager.compact(HintedHandOffManager.java:282) ~[apache-cassandra-2.1.13.jar:2.1.13]
  at org.apache.cassandra.db.HintedHandOffManager.scheduleAllDeliveries(HintedHandOffManager.java:522) ~[apache-cassandra-2.1.13.jar:2.1.13]
  at org.apache.cassandra.db.HintedHandOffManager.access$000(HintedHandOffManager.java:93) ~[apache-cassandra-2.1.13.jar:2.1.13]
  at org.apache.cassandra.db.HintedHandOffManager$1.run(HintedHandOffManager.java:182) ~[apache-cassandra-2.1.13.jar:2.1.13]
  at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:118) ~[apache-cassandra-2.1.13.jar:2.1.13]
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_51]
  at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) [na:1.7.0_51]
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) [na:1.7.0_51]
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.7.0_51]
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_51]
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_51]
  at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: Not enough space for compaction, estimated sstables = 1, expected write size = 221195610
  at java.util.concurrent.FutureTask.report(FutureTask.java:122) [na:1.7.0_51]
  at java.util.concurrent.FutureTask.get(FutureTask.java:188) [na:1.7.0_51]
  at org.apache.cassandra.db.HintedHandOffManager.compact(HintedHandOffManager.java:278) ~[apache-cassandra-2.1.13.jar:2.1.13]
  ... 11 common frames omitted
Caused by: java.lang.RuntimeException: Not enough space for compaction, estimated sstables = 1, expected write size = 221195610
  at org.apache.cassandra.db.compaction.CompactionTask.checkAvailableDiskSpace(CompactionTask.java:296) ~[apache-cassandra-2.1.13.jar:2.1.13]
  at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:124) ~[apache-cassandra-2.1.13.jar:2.1.13]
  at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-2.1.13.jar:2.1.13]
  at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:73) ~[apache-cassandra-2.1.13.jar:2.1.13]
  at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59) ~[apache-cassandra-2.1.13.jar:2.1.13]
  at org.apache.cassandra.db.compaction.CompactionManager$8.runMayThrow(CompactionManager.java:626) ~[apache-cassandra-2.1.13.jar:2.1.13]
  at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-2.1.13.jar:2.1.13]
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_51]
  at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_51]
  ... 3 common frames omitted
WARN [STREAM-IN-/127.0.0.1] 2016-06-29 10:06:29,500 CompressedStreamReader.java:115 - [Stream ab888f50-3d8f-11e6-8b24-dff3aadcb7ef] Error while reading partition DecoratedKey(-677751571849691691, 4435344444443536393133383636353144303931304334383341363032413842) from stream on ks='md5' and table='md5_id'.
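"Not enough space for compaction" means free space on the data disk is below the estimated compaction output (roughly 220 MB here). A quick triage sketch using this cluster's data directory; snapshots are a common source of hidden disk usage:

# How full is the data volume?
$ df -h /home/admin/data/cassandra
# Measure and clear old snapshots
$ du -sh /home/admin/data/cassandra/data/*/*/snapshots 2>/dev/null
$ nodetool clearsnapshot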
HiccupMeter: Failed to open log file.
INFO 06:02:01 Classpath: /usr/install/cassandra/bin/../conf:.../usr/install/jHiccup-2.0.6/jHiccup.jar:/usr/install/cassandra/bin/../lib/jamm-0.3.0.jar
INFO 06:02:01 JVM Arguments: [-javaagent:/usr/install/jHiccup-2.0.6/jHiccup.jar=, -ea, -javaagent:/usr/install/cassandra/bin/../lib/jamm-0.3.0.jar, -XX:
WARN 06:02:01 Unable to lock JVM memory (ENOMEM). This can result in part of the JVM being swapped out, especially with mmapped I/O enabled. Increase RLIMIT_MEMLOCK or run Cassandra as root.
WARN 06:02:01 JMX is not enabled to receive remote connections. Please see cassandra-env.sh for more info.
ERROR 06:02:01 Error starting local jmx server: java.rmi.server.ExportException: Port already in use: 7199; nested exception is: java.net.BindException: Address already in use
[qihuang.zheng@dp0652 ~]$ export _JAVA_OPTIONS='-javaagent:/usr/install/jHiccup-2.0.6/jHiccup.jar="-d 20000 -i 1000"' && sudo -u admin /usr/install/cassandra/bin/cassandra
[qihuang.zheng@dp0652 ~]$ Picked up _JAVA_OPTIONS: -javaagent:/usr/install/jHiccup-2.0.6/jHiccup.jar="-d 20000 -i 1000"
HiccupMeter: Failed to open log file.
INFO 06:24:09 JVM Arguments: [-ea, -javaagent:/usr/install/cassandra/bin/../lib/jamm-0.3.0.jar, -Dcassandra.storagedir=/usr/install/cassandra/bin/../data, -javaagent:/usr/install/jHiccup-2.0.6/jHiccup.jar=-d 20000 -i 1000]
WARN 06:24:09 Unable to lock JVM memory (ENOMEM). This can result in part of the JVM being swapped out, especially with mmapped I/O enabled. Increase RLIMIT_MEMLOCK or run Cassandra as root.
WARN 06:24:09 JMX is not enabled to receive remote connections. Please see cassandra-env.sh for more info.
ERROR 06:24:09 Error starting local jmx server: java.rmi.server.ExportException: Port already in use: 7199; nested exception is: java.net.BindException: Address already in use
[qihuang.zheng@dp0652 ~]$ jps -lm
Picked up _JAVA_OPTIONS: -javaagent:/usr/install/jHiccup-2.0.6/jHiccup.jar="-d 20000 -i 1000"
35054 sun.tools.jps.Jps -lm
[qihuang.zheng@dp0652 ~]$ sudo -u admin jps -lm
Picked up _JAVA_OPTIONS: -javaagent:/usr/install/jHiccup-2.0.6/jHiccup.jar="-d 20000 -i 1000"
HiccupMeter: Failed to open log file.
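"Port already in use: 7199" means another Cassandra JVM (or some other process holding the JMX port) is still running, so this second start fails regardless of the jHiccup agent. A sketch for identifying the holder before retrying (use whichever of lsof/ss is installed):

# Which process is listening on the JMX port?
$ sudo lsof -iTCP:7199 -sTCP:LISTEN
# or, equivalently
$ ss -ltnp | grep 7199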
We used jHiccup -p $pid to attach jHiccup to the running Cassandra service, but unfortunately no hlog file was generated under the Cassandra install home, where we expected it. Last week I demonstrated this to Daniel, and he said he would look into it. Here are the steps we ran on the production environment: