This looked like a storage problem, so the exception stack first:
[code]
2016-08-16 05:44:30.265 ERROR 1419 --- [http-nio-9001-exec-283] -
java.lang.RuntimeException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=36, exceptions:
Tue Aug 16 05:44:30 CST 2016, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=68076: row 'h�^A"^@^@^@^@^@e�ё ^D^@' on table 'fileindex' at region=fileindex,,1461811143505.c0d3d5ad7798ea1a5c6081fb35872735.,
hostname=xxxx292,16020,1470015445443, seqNum=2444796
at org.apache.hadoop.hbase.client.AbstractClientScanner$1.hasNext(AbstractClientScanner.java:97) ~[hbase-client-1.1.1.jar!/:1.1.1]
....
at sun.reflect.GeneratedMethodAccessor139.invoke(Unknown Source) ~[na:na]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.7.0_79]
at java.lang.reflect.Method.invoke(Method.java:606) ~[na:1.7.0_79]
at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:222) [spring-web-4.2.4.RELEASE.jar!/:4.2.4.RELEASE]
at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:137) [spring-web-4.2.4.RELEASE.jar!/:4.2.4.RELEASE]
at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:110) [spring-webmvc-4.2.4.RELEASE.jar!/:4.2.4.RELEASE]
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:814) [spring-webmvc-4.2.4.RELEASE.jar!/:4.2.4.RELEASE]
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:737) [spring-webmvc-4.2.4.RELEASE.jar!/:4.2.4.RELEASE]
at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85) [spring-webmvc-4.2.4.RELEASE.jar!/:4.2.4.RELEASE]
at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:959) [spring-webmvc-4.2.4.RELEASE.jar!/:4.2.4.RELEASE]
at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:893) [spring-webmvc-4.2.4.RELEASE.jar!/:4.2.4.RELEASE]
at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:969) [spring-webmvc-4.2.4.RELEASE.jar!/:4.2.4.RELEASE]
at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:871) [spring-webmvc-4.2.4.RELEASE.jar!/:4.2.4.RELEASE]
at javax.servlet.http.HttpServlet.service(HttpServlet.java:648) [tomcat-embed-core-8.0.30.jar!/:8.0.30]
at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:845) [spring-webmvc-4.2.4.RELEASE.jar!/:4.2.4.RELEASE]
at javax.servlet.http.HttpServlet.service(HttpServlet.java:729) [tomcat-embed-core-8.0.30.jar!/:8.0.30]
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:291) [tomcat-embed-core-8.0.30.jar!/:8.0.30]
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) [tomcat-embed-core-8.0.30.jar!/:8.0.30]
at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52) [tomcat-embed-websocket-8.0.30.jar!/:8.0.30]
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239) [tomcat-embed-core-8.0.30.jar!/:8.0.30]
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) [tomcat-embed-core-8.0.30.jar!/:8.0.30]
at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:121) [spring-web-4.2.4.RELEASE.jar!/:4.2.4.RELEASE]
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107) [spring-web-4.2.4.RELEASE.jar!/:4.2.4.RELEASE]
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239) [tomcat-embed-core-8.0.30.jar!/:8.0.30]
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) [tomcat-embed-core-8.0.30.jar!/:8.0.30]
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:212) [tomcat-embed-core-8.0.30.jar!/:8.0.30]
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:106) [tomcat-embed-core-8.0.30.jar!/:8.0.30]
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:502) [tomcat-embed-core-8.0.30.jar!/:8.0.30]
at org.apache.catalina.valves.RemoteIpValve.invoke(RemoteIpValve.java:676) [tomcat-embed-core-8.0.30.jar!/:8.0.30]
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:141) [tomcat-embed-core-8.0.30.jar!/:8.0.30]
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:79) [tomcat-embed-core-8.0.30.jar!/:8.0.30]
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:88) [tomcat-embed-core-8.0.30.jar!/:8.0.30]
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:521) [tomcat-embed-core-8.0.30.jar!/:8.0.30]
at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1096) [tomcat-embed-core-8.0.30.jar!/:8.0.30]
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:674) [tomcat-embed-core-8.0.30.jar!/:8.0.30]
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1500) [tomcat-embed-core-8.0.30.jar!/:8.0.30]
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1456) [tomcat-embed-core-8.0.30.jar!/:8.0.30]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_79]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_79]
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) [tomcat-embed-core-8.0.30.jar!/:8.0.30]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79]
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=36, exceptions:
Tue Aug 16 05:44:30 CST 2016, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=68076: row 'h�^A"^@^@^@^@^@e�ё ^D^@' on table 'fileindex' at region=fileindex,,1461811143505.c0d3d5ad7798ea1a5c6081fb35872735., hostname=xxxx292,16020,1470015445443, seqNum=2444796
at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.throwEnrichedException(RpcRetryingCallerWithReadReplicas.java:271) ~[hbase-client-1.1.1.jar!/:1.1.1]
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:223) ~[hbase-client-1.1.1.jar!/:1.1.1]
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:61) ~[hbase-client-1.1.1.jar!/:1.1.1]
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200) ~[hbase-client-1.1.1.jar!/:1.1.1]
at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320) ~[hbase-client-1.1.1.jar!/:1.1.1]
at org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:403) ~[hbase-client-1.1.1.jar!/:1.1.1]
at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:364) ~[hbase-client-1.1.1.jar!/:1.1.1]
at org.apache.hadoop.hbase.client.AbstractClientScanner$1.hasNext(AbstractClientScanner.java:94) ~[hbase-client-1.1.1.jar!/:1.1.1]
... 54 common frames omitted
Caused by: java.net.SocketTimeoutException: callTimeout=60000, callDuration=68076: row 'h�^A"^@^@^@^@^@e�ё ^D^@' on table 'fileindex' at region=fileindex,,1461811143505.c0d3d5ad7798ea1a5c6081fb35872735., hostname=xxxx292,16020,1470015445443, seqNum=2444796
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:159) ~[hbase-client-1.1.1.jar!/:1.1.1]
at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:64) ~[hbase-client-1.1.1.jar!/:1.1.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_79]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_79]
... 1 common frames omitted
Caused by: java.io.IOException: java.io.IOException: Could not seekToPreviousRow StoreFileScanner[HFileScanner for reader reader=hdfs://ns06cluster/apps/hbase/data/data/default/fileindex/c0d3d5ad7798ea1a5c6081fb35872735/bs/1245e48bd690407893c35f0e0578f909, compression=none, cacheConf=blockCache=LruBlockCache{blockCount=147151, currentSize=9772114048, freeSize=514337664, maxSize=10286451712, heapSize=9772114048, minSize=9772129280, minFactor=0.95, multiSize=4886064640, multiFactor=0.5, singleSize=2443032320, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false, firstKey=\x00\x00\x00\x00\x00\x00\x00\x00\x00`\x06\xBA\xE4\x90\x04\x00/bs:fileKey/1464861184798/Put, lastKey=\xFF\[email protected]"\x00\x00\x00\x00\x00d \x8B\xC4|\x04\x00/bs:fileKey/1469264431944/Put, avgKeyLen=38, avgValueLen=8, entries=690525, length=41267369, cur=h\x8D\x01"\x00\x00\x00\x00\x00e\xB5\xD1\x91 \x04\x01/bs:fileKey/1470964268832/Put/vlen=8/seqid=3148032] to key h\x8D\x01"\x00\x00\x00\x00\x00d\xDA\xCF\xCE<\x04\x13/bs:fileKey/1470045687342/Put/vlen=8/seqid=2554915
at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekToPreviousRow(StoreFileScanner.java:477)
at org.apache.hadoop.hbase.regionserver.ReversedKeyValueHeap.next(ReversedKeyValueHeap.java:136)
at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:596)
at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:147)
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:5587)
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:5738)
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:5525)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2396)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32205)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2114)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101)
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: On-disk size without header provided is 196710, but block header contains 65569. Block offset: -1, data starts with: DATABLK*\x00\x01\x00!\x00\x01\x00\x0D\x00\x00\x00\x00\x00\xBF0\xCD\x01\x00\[email protected]\x00\x00\x01\x00
at org.apache.hadoop.hbase.io.hfile.HFileBlock.validateOnDiskSizeWithoutHeader(HFileBlock.java:500)
at org.apache.hadoop.hbase.io.hfile.HFileBlock.access$700(HFileBlock.java:85)
at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockDataInternal(HFileBlock.java:1625)
at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockData(HFileBlock.java:1470)
at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:438)
at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekBefore(HFileReaderV2.java:674)
at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekBefore(HFileReaderV2.java:647)
at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekToPreviousRow(StoreFileScanner.java:441)
... 13 more
[/code]
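Reading the trace from the bottom up, the innermost cause is not a plain timeout but a corrupted store file: an HFile block whose declared on-disk size (196710) disagrees with what its header records (65569). The failing server-side path (ReversedKeyValueHeap, StoreFileScanner.seekToPreviousRow) is only reached by reversed scans, so the client was scanning the 'fileindex' table backwards over column family 'bs'. For orientation, a minimal sketch of that kind of call against the 1.1.x client API; the class name and scan details are illustrative assumptions, not the actual service code:
[code]
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class ReversedScanSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("fileindex"))) {
            Scan scan = new Scan();
            scan.setReversed(true);               // reversed scans drive seekToPreviousRow on the regionserver
            scan.addFamily(Bytes.toBytes("bs"));  // column family seen in the stack trace
            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result row : scanner) {
                    // iteration is where the client-side AbstractClientScanner$1.hasNext()
                    // surfaces the RetriesExhaustedException shown above
                    System.out.println(Bytes.toStringBinary(row.getRow()));
                }
            }
        }
    }
}
[/code]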


REWRITING UBER ENGINEERING: THE OPPORTUNITIES MICROSERVICES PROVIDE
https://eng.uber.com/building-tincup/?from=timeline&isappinstalled=0

https://github.com/uber/tchannel

As the title says: after configuring an 8192-bit digital certificate for Tengine, the -t config test passes, but after -s reload every worker process sits at 100% CPU.

Switching to a 4096-bit certificate brings things back to normal.

The Tengine version in question:
[code]
#/usr/local/nginx/sbin/nginx -v
Tengine version: Tengine/2.1.2 (nginx/1.6.2)
[/code]

Noting it here for the record.
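For a rough sense of why the key size matters this much: the cost of an RSA private-key operation grows steeply with the modulus, so every TLS handshake with an 8192-bit key is many times more expensive for the server than with a 4096-bit key. Whether the 100% CPU here is only that cost or an outright bug in this Tengine/OpenSSL combination, this note does not determine. A standalone JCA probe, entirely my own sketch and unrelated to Tengine's code, to feel the difference:
[code]
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

public class RsaCostProbe {
    public static void main(String[] args) throws Exception {
        byte[] data = new byte[64];
        for (int bits : new int[] {2048, 4096, 8192}) {
            KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
            kpg.initialize(bits);
            KeyPair kp = kpg.generateKeyPair();   // generating an 8192-bit key can itself take a long time
            Signature sig = Signature.getInstance("SHA256withRSA");
            int rounds = 20;
            long start = System.nanoTime();
            for (int i = 0; i < rounds; i++) {
                sig.initSign(kp.getPrivate());
                sig.update(data);
                sig.sign();                       // one private-key operation, the expensive part of each handshake
            }
            System.out.printf("%d-bit RSA: %.1f ms per signature%n",
                    bits, (System.nanoTime() - start) / 1e6 / rounds);
        }
    }
}
[/code]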

It started with an exception-log alert from one of our Java services in production:
[code]
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method) [na:1.7.0_79]
at java.lang.Thread.start(Thread.java:714) [na:1.7.0_79]
at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:949) [na:1.7.0_79]
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369) [na:1.7.0_79]
at com.squareup.okhttp.ConnectionPool.put(ConnectionPool.java:189) ~[okhttp-2.7.1.jar!/:na]
at com.squareup.okhttp.OkHttpClient$1.put(OkHttpClient.java:89) ~[okhttp-2.7.1.jar!/:na]
at com.squareup.okhttp.internal.http.StreamAllocation.findConnection(StreamAllocation.java:179) ~[okhttp-2.7.1.jar!/:na]
at com.squareup.okhttp.internal.http.StreamAllocation.findHealthyConnection(StreamAllocation.java:126) ~[okhttp-2.7.1.jar!/:na]
at com.squareup.okhttp.internal.http.StreamAllocation.newStream(StreamAllocation.java:95) ~[okhttp-2.7.1.jar!/:na]
at com.squareup.okhttp.internal.http.HttpEngine.connect(HttpEngine.java:283) ~[okhttp-2.7.1.jar!/:na]
at com.squareup.okhttp.internal.http.HttpEngine.sendRequest(HttpEngine.java:224) ~[okhttp-2.7.1.jar!/:na]
at com.squareup.okhttp.Call.getResponse(Call.java:286) ~[okhttp-2.7.1.jar!/:na]
at com.squareup.okhttp.Call$ApplicationInterceptorChain.proceed(Call.java:243) ~[okhttp-2.7.1.jar!/:na]
at com.squareup.okhttp.Call.getResponseWithInterceptorChain(Call.java:205) ~[okhttp-2.7.1.jar!/:na]
at com.squareup.okhttp.Call.execute(Call.java:80) ~[okhttp-2.7.1.jar!/:na]
at com.qiniu.http.Client.send(Client.java:195) ~[qiniu-java-sdk-7.0.10.jar!/:na]
at com.qiniu.http.Client.post(Client.java:132) ~[qiniu-java-sdk-7.0.10.jar!/:na]
at com.qiniu.http.Client.post(Client.java:115) ~[qiniu-java-sdk-7.0.10.jar!/:na]
at com.qiniu.storage.BucketManager.post(BucketManager.java:319) ~[qiniu-java-sdk-7.0.10.jar!/:na]
at com.qiniu.storage.BucketManager.ioPost(BucketManager.java:309) ~[qiniu-java-sdk-7.0.10.jar!/:na]
at com.qiniu.storage.BucketManager.fetch(BucketManager.java:263) ~[qiniu-java-sdk-7.0.10.jar!/:na]
[/code]
Logging on to the server, the shell itself immediately threw an error: cannot allocate memory.
[code]
[root@*****85 logs]# less stdout.log
-bash: fork: Cannot allocate memory
[/code]
It really did look like we were out of memory. Yet free -m showed there was still more than 1 GB left:
[code]
[root@*****85 logs]# free -m
             total       used       free     shared    buffers     cached
Mem:         31996      30343       1652          3         47      16751
-/+ buffers/cache:      13544      18451
Swap:        20479         10      20469
[/code]
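"unable to create new native thread", together with bash failing to fork while free still shows over a gigabyte available, usually points to a per-process limit (thread/process count, ulimit -u, or address-space limits) rather than the machine actually running out of RAM. One quick data point is how many threads the JVM itself is holding. A minimal in-process probe, my own sketch with a hypothetical class name; the same numbers can also be read with jstack or by counting entries under /proc/<pid>/task:
[code]
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class ThreadCountProbe {
    public static void main(String[] args) {
        // Reports thread counts for the JVM this runs in.
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        System.out.println("live threads  = " + threads.getThreadCount());
        System.out.println("peak threads  = " + threads.getPeakThreadCount());
        System.out.println("total started = " + threads.getTotalStartedThreadCount());
    }
}
[/code]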


Ran into an HBase overload problem today. It has been dealt with, so here is a record of the process for future reference.

Symptoms:
nginx: each worker's CPU load sat between 10% and 20%, versus roughly 5% normally.
Main service: CPU swung between 5% and 200%, versus a steady ~200% normally; dmesg and /var/log/messages showed "possible SYN flooding on port xxx. Sending cookies."
HBase: one regionserver was pegged at 2400% CPU (24 cores) with wa at 0% and sys at 3%, while the other nodes were loaded normally. Restarting that node brought it back, but another node would then fall into the same state.

Locating the problem:
GC was the first suspect: we had run into HBase memory pressure before, where nonstop GC drove CPU very high with no I/O load. So we pulled up the GC log (the same activity can also be monitored live with jstat):
[code]
2016-08-01T09:31:44.167+0800: 23811.563: [GC2016-08-01T09:31:44.167+0800: 23811.563: [ParNew: 442182K->22966K(471872K), 0.0379880 secs] 7170427K->6752954K(25113408K), 0.0381800 secs] [Times: user=0.65 sys=0.00, real=0.03 secs]
2016-08-01T09:31:55.191+0800: 23822.587: [GC2016-08-01T09:31:55.191+0800: 23822.587: [ParNew: 442422K->24807K(471872K), 0.0374700 secs] 7172410K->6757782K(25113408K), 0.0376640 secs] [Times: user=0.65 sys=0.00, real=0.03 secs]
2016-08-01T09:32:06.618+0800: 23834.014: [GC2016-08-01T09:32:06.618+0800: 23834.014: [ParNew: 444263K->25807K(471872K), 0.0333280 secs] 7177238K->6759917K(25113408K), 0.0335430 secs] [Times: user=0.57 sys=0.00, real=0.04 secs]
Heap
par new generation total 471872K, used 381454K [0x00000001f8000000, 0x0000000218000000, 0x0000000218000000)
eden space 419456K, 84% used [0x00000001f8000000, 0x000000020db4fe68, 0x00000002119a0000)
from space 52416K, 49% used [0x0000000214cd0000, 0x0000000216603d68, 0x0000000218000000)
to space 52416K, 0% used [0x00000002119a0000, 0x00000002119a0000, 0x0000000214cd0000)
concurrent mark-sweep generation total 24641536K, used 6734110K [0x0000000218000000, 0x00000007f8000000, 0x00000007f8000000)
concurrent-mark-sweep perm gen total 131072K, used 46044K [0x00000007f8000000, 0x0000000800000000, 0x0000000800000000)
[/code]
The GC log showed this was not the issue, and restarting the regionserver node later confirmed it further: GC was not the cause.
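Besides reading the log files and jstat, the same collector counters can be polled over JMX. A small sketch, entirely mine and not part of the original troubleshooting; it reads the JVM it runs in, so for a regionserver it would have to run in-process or connect through the remote JMX port:
[code]
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcProbe {
    public static void main(String[] args) throws InterruptedException {
        while (true) {
            // One line per collector, e.g. ParNew and ConcurrentMarkSweep on a CMS setup like the one above.
            for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
                System.out.printf("%-20s count=%d time=%dms%n",
                        gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
            }
            Thread.sleep(5000);
        }
    }
}
[/code]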


A colleague reported today that Spring's @Async annotation had no effect: the annotated call still ran synchronously.
The calling code looked roughly like this:
[code]
public void TestAsync() {
    ...
    this.testPrivate();
    ...
}

@Async
private void testPrivate() {
    ... // long-running code
}
[/code]
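The usual explanation for this pattern is that Spring applies @Async through a proxy: a self-invocation like this.testPrivate() never leaves the object, so it bypasses the proxy, and a private method cannot be intercepted by the proxy at all. A hedged sketch of the common rearrangement, with class and method names of my own choosing and assuming @EnableAsync is declared on some configuration class:
[code]
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Service;

// LongTaskService.java -- the async work lives on its own bean,
// so callers reach it through the Spring proxy.
@Service
public class LongTaskService {

    @Async
    public void runLongTask() {
        // long-running code
    }
}

// AsyncCaller.java -- injecting the bean and calling it from outside
// is what lets @Async take effect.
@Service
public class AsyncCaller {

    private final LongTaskService longTaskService;

    @Autowired
    public AsyncCaller(LongTaskService longTaskService) {
        this.longTaskService = longTaskService;
    }

    public void testAsync() {
        longTaskService.runLongTask();   // returns immediately; runs on the async executor
    }
}
[/code]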
