[code]
[hbase-1.1.3]# bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 1
...
File System Counters
FILE: Number of bytes read=321913437
FILE: Number of bytes written=327363226
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
Map-Reduce Framework
Map input records=10
Map output records=10
Map output bytes=160
Map output materialized bytes=240
Input split bytes=1470
Combine input records=0
Combine output records=0
Reduce input groups=10
Reduce shuffle bytes=240
Reduce input records=10
Reduce output records=10
Spilled Records=20
Shuffled Maps =10
Failed Shuffles=0
Merged Map outputs=10
GC time elapsed (ms)=1326
CPU time spent (ms)=0
Physical memory (bytes) snapshot=0
Virtual memory (bytes) snapshot=0
Total committed heap usage (bytes)=3442950144
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
HBase Performance Evaluation
Elapsed time in milliseconds=13909
Row count=1048570
File Input Format Counters
Bytes Read=34515
File Output Format Counters
Bytes Written=126
[/code]
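As a rough reading of the two `HBase Performance Evaluation` counters above, the sketch below converts them into an average write rate. It assumes the elapsed-time counter can be treated as wall-clock time for the whole run; in MapReduce mode that counter may be aggregated across map tasks, so the figure is only an estimate. The function name `rows_per_second` is made up for this illustration.

```python
# Counter values copied from the PerformanceEvaluation output above.
elapsed_ms = 13909
row_count = 1048570


def rows_per_second(rows: int, elapsed_ms: int) -> float:
    """Average throughput, assuming elapsed_ms is wall-clock time (an assumption)."""
    return rows / (elapsed_ms / 1000.0)


print(f"{rows_per_second(row_count, elapsed_ms):.0f} rows/s")
```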


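The short-video collection service used to deduplicate by exact match on the file hash.
A new situation has come up: the same video gets a different watermark from each website and ends up being treated as several distinct videos.
What needs to be done: recognize these near-duplicate short videos and deduplicate them.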

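Overall approach:
Take a key frame of the video (e.g. the first frame), take a key region of it (e.g. the middle row of a 3x3 grid), pick a number of key points (e.g. 1024, sampled evenly), digitize them (e.g. turn each point into one base-64 character: 256 gray levels / 4), and store the result as a fingerprint that can be compared.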
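The approach above can be sketched in a few lines. This is only an illustration under assumptions: the frame is already decoded into a 2D list of gray values (0 to 255), the "middle row of the 3x3 grid" is taken as the middle third of the rows, and the 64-character alphabet and the name `fingerprint` are arbitrary choices, not something the original specifies.

```python
import string

# 64 symbols; the ordering is an arbitrary choice for this sketch.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def fingerprint(frame, points=1024):
    """Build a fingerprint string from a 2D list of gray values (0-255)."""
    height = len(frame)
    # Middle row of a 3x3 grid: the middle third of the frame's rows.
    band = frame[height // 3:(2 * height) // 3]
    pixels = [v for row in band for v in row]
    if not pixels:
        raise ValueError("frame is too small to contain a middle band")
    # Evenly spaced sample positions (integer arithmetic keeps indexes in range).
    sampled = [pixels[i * len(pixels) // points] for i in range(points)]
    # 256 gray levels / 4 -> 64 buckets, one character per sampled point.
    return "".join(ALPHABET[v // 4] for v in sampled)


# Synthetic 90x160 frame with a uniform mid-gray value, for demonstration only.
demo_frame = [[128] * 160 for _ in range(90)]
demo_fp = fingerprint(demo_frame)
print(len(demo_fp), demo_fp[:8])
```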

Exceptions:
The first frame of a movie is usually fixed studio information; this is rare for short videos.

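The first attempt was exact matching, which allows an exact lookup in the database.
In practice: unreliable. However the transcoded gray values are bucketed, some values always end up crossing a bucket boundary. Even reducing everything to pure black and white gives inconsistent results. Dead end.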

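A brief description of the boundary-crossing problem:
Adding a watermark involves lossy transcoding, so gray values may drift within a small range. Comparing two watermarked videos means comparing two drifted values, so the drifts add up. Some videos have been watermarked several times, which stacks even more drift.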
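A tiny illustration of why this breaks exact matching, assuming the 256 / 4 bucketing described earlier (the helper name `bucket` is made up for this sketch): a drift of a single gray level can change the bucket, while a larger drift inside one bucket goes unnoticed.

```python
def bucket(gray: int) -> int:
    """Map a gray value (0-255) to one of 64 buckets of width 4."""
    return gray // 4


# Drift of 1 across a bucket boundary: the fingerprint character changes.
print(bucket(127), bucket(128))

# Drift of 3 inside one bucket: the fingerprint character stays the same.
print(bucket(124), bucket(127))
```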

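Borrowing from how GPS geohash deals with grid-edge cases, I built three fingerprints at the 256 / 4 step (truncating, rounding up, and rounding to nearest) and cross-compared them to work around the boundary-crossing problem.
In practice: equally unreliable. The drift occasionally exceeds the bucket range, and reducing to extreme black and white does not help either.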
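One plausible reading of the three-fingerprint idea is sketched below; the original does not show its implementation, so the function names and the "match if any pair of fingerprints is equal" rule are assumptions. The second comparison shows the failure mode described above: once a single point drifts by more than the bucket width, none of the three variants line up.

```python
def three_fingerprints(values):
    """Return the floor, ceiling and rounded bucket sequences for gray values."""
    floor = tuple(v // 4 for v in values)
    ceil = tuple(min(-(-v // 4), 63) for v in values)      # capped at bucket 63
    rounded = tuple(min((v + 2) // 4, 63) for v in values)  # capped at bucket 63
    return {floor, ceil, rounded}


def cross_match(a, b):
    """Treat two samples as the same video if any of their fingerprints agree."""
    return bool(three_fingerprints(a) & three_fingerprints(b))


# Drift of 1 across a boundary: the ceiling variants still agree.
small_drift = cross_match([127, 64, 200], [128, 64, 200])

# Drift of 5 on one point (wider than a bucket): no variant agrees.
large_drift = cross_match([120, 64], [125, 64])

print(small_drift, large_drift)
```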

The basic conclusion: exact matching is not a reliable approach.


First, a look at what it is for
[code]
$ man sysctl
SYSCTL(8) SYSCTL(8)

NAME
sysctl - configure kernel parameters at runtime

SYNOPSIS
sysctl [-n] [-e] variable ...
sysctl [-n] [-e] [-q] -w variable=value ...
sysctl [-n] [-e] [-q] -p <filename>
sysctl [-n] [-e] -a
sysctl [-n] [-e] -A

DESCRIPTION
sysctl is used to modify kernel parameters at runtime. The parameters available are those listed under /proc/sys/. Procfs is required for sysctl(8) support in Linux. You can use sysctl(8) to both
read and write sysctl data.

PARAMETERS
variable
The name of a key to read from. An example is kernel.ostype. The ’/’ separator is also accepted in place of a ’.’.

variable=value
To set a key, use the form variable=value, where variable is the key and value is the value to set it to. If the value contains quotes or characters which are parsed by the shell, you may
need to enclose the value in double quotes. This requires the -w parameter to use.

-n Use this option to disable printing of the key name when printing values.

-e Use this option to ignore errors about unknown keys.

-N Use this option to only print the names. It may be useful with shells that have programmable completion.

-q Use this option to not display the values set to stdout.

-w Use this option when you want to change a sysctl setting.

-p Load in sysctl settings from the file specified or /etc/sysctl.conf if none given. Specifying - as filename means reading data from standard input.

-a Display all values currently available.

-A Same as -a

EXAMPLES
/sbin/sysctl -a

/sbin/sysctl -n kernel.hostname

/sbin/sysctl -w kernel.domainname="example.com"

/sbin/sysctl -p /etc/sysctl.conf

NOTES
Please note that modules loaded after sysctl is run may override the settings (example: sunrpc.* settings are overridden when the sunrpc module is loaded). This may cause some confusion during boot
when the settings in sysctl.conf may be overridden. To prevent such a situation, sysctl must be run after the particular module is loaded (e.g., from /etc/rc.d/rc.local or by using the install
directive in modprobe.conf)

FILES
/proc/sys /etc/sysctl.conf

SEE ALSO
sysctl.conf(5), modprobe.conf(5)

AUTHOR
George Staikos

21 Sep 1999 SYSCTL(8)
[/code]
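Since the man page says the parameters are simply the files under /proc/sys/, a key can also be read directly from that tree. The sketch below assumes a Linux host with procfs mounted; `key_to_path` and `read_sysctl` are names invented for this example, and the naive split does not handle keys whose components themselves contain dots (some interface names do).

```python
from pathlib import Path

PROC_SYS = Path("/proc/sys")


def key_to_path(key, root=PROC_SYS):
    """Map a sysctl key such as 'kernel.ostype' to its file under /proc/sys."""
    # Both '.' and '/' are accepted as separators, as the man page notes.
    return Path(root).joinpath(*key.replace("/", ".").split("."))


def read_sysctl(key, root=PROC_SYS):
    """Return the current value of a sysctl key as a string."""
    return key_to_path(key, root).read_text().strip()


path = key_to_path("kernel.ostype")
print(path)
if path.exists():  # only present on Linux with procfs mounted
    print(read_sysctl("kernel.ostype"))
```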


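On the Linux host running nginx, HTTP requests were timing out in large numbers; after adjusting two parameters in sysctl.conf, things returned to normal.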


[code]
net.nf_conntrack_max = 655360
net.netfilter.nf_conntrack_max = 655350
[/code]
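To see how close a host is to the conntrack limit, the current entry count can be compared with the configured maximum. This is a minimal sketch, assuming the nf_conntrack module is loaded so that `nf_conntrack_count` and `nf_conntrack_max` exist under /proc/sys/net/netfilter; the function name `conntrack_usage` is made up for this example.

```python
from pathlib import Path


def conntrack_usage(base="/proc/sys/net/netfilter"):
    """Return (current entries, maximum entries, fill ratio) of the conntrack table."""
    base = Path(base)
    count = int((base / "nf_conntrack_count").read_text())
    limit = int((base / "nf_conntrack_max").read_text())
    return count, limit, count / limit


# Only runs where the conntrack files exist (module loaded, Linux host).
if Path("/proc/sys/net/netfilter/nf_conntrack_count").exists():
    count, limit, ratio = conntrack_usage()
    print(f"{count}/{limit} entries in use ({ratio:.1%})")
```

A ratio that stays near 1.0 under load would be consistent with the timeouts described above, though the original does not state the exact cause.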

All the related configuration is recorded here for future reference.

Open files limit


[code]
# cat /etc/security/limits.conf | grep -v "\#"
* soft nproc 655350
* hard nproc 655350
* soft nofile 655350
* hard nofile 655350
[/code]
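Entries in limits.conf only apply to sessions started after the change, so it can be worth checking what a process actually sees. The sketch below uses Python's standard `resource` module on Linux; the helper name `describe` is an invention for this example.

```python
import resource


def describe(value: int) -> str:
    """Render a limit, showing 'unlimited' for RLIM_INFINITY."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


# nofile in limits.conf corresponds to RLIMIT_NOFILE, nproc to RLIMIT_NPROC.
nofile_soft, nofile_hard = resource.getrlimit(resource.RLIMIT_NOFILE)
nproc_soft, nproc_hard = resource.getrlimit(resource.RLIMIT_NPROC)

print("nofile soft/hard:", describe(nofile_soft), describe(nofile_hard))
print("nproc  soft/hard:", describe(nproc_soft), describe(nproc_hard))
```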

Kernel parameters


[code]
# less sysctl.conf | grep -v "\#"
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.shmmax = 68719476736
kernel.shmall = 4294967296
net.ipv4.tcp_fin_timeout = 1
net.ipv4.tcp_keepalive_time = 1200
net.ipv4.tcp_mem = 94500000 915000000 927000000
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_synack_retries = 1
net.ipv4.tcp_syn_retries = 1
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.netdev_max_backlog = 262144
net.core.somaxconn = 262144
net.ipv4.tcp_max_orphans = 3276800
net.ipv4.tcp_max_syn_backlog = 262144
net.core.wmem_default = 8388608
net.core.rmem_default = 8388608
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 1
net.nf_conntrack_max = 655360
net.netfilter.nf_conntrack_max = 655350
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 10
net.netfilter.nf_conntrack_tcp_timeout_established = 600
net.ipv4.tcp_max_tw_buckets = 5000
net.ipv4.ip_local_port_range = 1024 65535
[/code]

net.ipv4.tcp_tw_reuse and net.ipv4.tcp_tw_recycle only take effect when net.ipv4.tcp_timestamps = 1.
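The dependency on tcp_timestamps can be checked mechanically against a sysctl.conf. The sketch below is an assumption-laden illustration: `parse_sysctl` and `ineffective_tw_options` are names invented here, and the parser only handles plain `key = value` lines and comments. Note also that tcp_tw_recycle was removed in Linux 4.12, so on newer kernels that key no longer exists.

```python
def parse_sysctl(text):
    """Parse sysctl.conf-style text into a dict of key -> value strings."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def ineffective_tw_options(settings):
    """List TIME_WAIT options that are enabled while tcp_timestamps is not 1."""
    if settings.get("net.ipv4.tcp_timestamps") == "1":
        return []
    options = ("net.ipv4.tcp_tw_reuse", "net.ipv4.tcp_tw_recycle")
    return [key for key in options if settings.get(key) == "1"]


sample = """
# excerpt in the style of the configuration above
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 1
"""
print(ineffective_tw_options(parse_sysctl(sample)))
```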

A quick memo for future use:
[code]
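# Create a dedicated group and a user with a home directory
groupadd faker
useradd -m -d /home/faker -g faker faker
# Install the public key used for SSH login
mkdir /home/faker/.ssh
echo "ssh-rsa ...(rest of key omitted)..." > /home/faker/.ssh/authorized_keys
# With StrictModes, sshd ignores keys if these are group- or world-writable
chmod 700 /home/faker/.ssh
chmod 600 /home/faker/.ssh/authorized_keys
chown -R faker:faker /home/faker/.ssh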
[/code]

If this user needs root privileges, just use visudo