About neohope

Always striving; never thought about giving up...

Basic HBase Operations 03

Continuing from the previous post: a major difference between HBase and a traditional database is that HBase can keep multiple versions of a value rather than only the latest one. Versions are told apart by the timestamp attribute.

Let's look at this concretely. First we add a row visit100 and set the column personinfo:name several times, querying the latest value after each put:

hbase(main):011:0> put 'patientvisit','visit100','personinfo:name','A'
0 row(s) in 0.0890 seconds

hbase(main):012:0> get 'patientvisit','visit100','personinfo:name'
COLUMN                                        CELL                                                                                                                               
 personinfo:name                              timestamp=1454342758182, value=A                                                                                                   
1 row(s) in 0.0310 seconds

hbase(main):013:0> put 'patientvisit','visit100','personinfo:name','B'
0 row(s) in 0.0100 seconds

hbase(main):014:0> get 'patientvisit','visit100','personinfo:name'
COLUMN                                        CELL                                                                                                                               
 personinfo:name                              timestamp=1454342796166, value=B                                                                                                   
1 row(s) in 0.0230 seconds

hbase(main):015:0> put 'patientvisit','visit100','personinfo:name','B'
0 row(s) in 0.0140 seconds

hbase(main):016:0> get 'patientvisit','visit100','personinfo:name'
COLUMN                                        CELL                                                                                                                               
 personinfo:name                              timestamp=1454342802829, value=B                                                                                                   
1 row(s) in 0.0120 seconds

As you can see, the timestamp changes each time the column is set:

timestamp      value
1454342758182  A
1454342796166  B
1454342802829  B
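
The newest put wins on a plain get, but the older cells are still retrievable, up to the column family's VERSIONS limit (which defaults to 1 in HBase 1.1, so the family must have been created or altered with a larger VERSIONS for history to survive). A sketch of asking the shell for up to three versions:

hbase(main):017:0> get 'patientvisit','visit100',{COLUMN => 'personinfo:name', VERSIONS => 3}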

Continue reading Basic HBase Operations 03

Basic HBase Operations 02

Continuing from the previous post: what does the data in the patientvisit table actually end up looking like?
For readability, I've split it into two tables:

Row Key   | personinfo                                               | personinfoex
          | empi     name      sex     birthday    address           | patid
visit001  | empi001  zhangsan  male    1999-12-31  shanghai xxx road | pat001
visit002  | empi002  lisi      male    2000-01-01  beijing           | pat002
visit003  | empi003  wangwu    female  1999-12-30  guangzhou         | pat002

Row Key   | visitinfo                      | visitinfoex
          | visitid   visittime            | visitdocid  visitdocname
visit001  | visit001  2015-07-25 10:10:00  | doc001      Dr. Yang
visit002  | visit002  2015-07-26 11:11:00  | doc001      Dr. Yang
visit003  | visit003  2015-07-27 13:13:00  | doc002      Dr. Li
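
For reference, a minimal sketch of how a table like this could be created and one row filled in from the HBase shell (the table and family names come from the layout above; the original post's exact commands may differ):

hbase(main):001:0> create 'patientvisit','personinfo','personinfoex','visitinfo','visitinfoex'
hbase(main):002:0> put 'patientvisit','visit001','personinfo:name','zhangsan'
hbase(main):003:0> put 'patientvisit','visit001','personinfoex:patid','pat001'
hbase(main):004:0> put 'patientvisit','visit001','visitinfo:visittime','2015-07-25 10:10:00'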

Continue reading Basic HBase Operations 02

Basic HBase Operations 01

First, how HBase differs from a traditional relational database at the logical level:
1. An HBase table definition declares no columns, only column families (for now, think of a column family as a set of columns). When creating a table you specify only the column families; adding a column inside a family needs no prior declaration, you just use it.
2. In HBase, rows are located by key, and scans likewise run over keys. Choosing the row key is therefore critical: the table design plus the key choice largely determine whether the data can be used efficiently.
3. In HBase, row key + column family name + column name identifies exactly one cell (a shell sketch follows below)
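
A tiny shell sketch of points 1 and 3 (t1, cf1, and the rest are hypothetical names):

hbase(main):001:0> create 't1','cf1'                    # only the column family is declared
hbase(main):002:0> put 't1','row1','cf1:anycol','v1'    # the column needs no prior declaration
hbase(main):003:0> get 't1','row1','cf1:anycol'         # row key + family + column = one cell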

From this you can see:
1. By putting certain constraints on rows, HBase gains flexible column handling and solves the column-scalability problem.
2. In practice we often record the main table and its related tables together in the columns, duplicates and all, trading wasted space for saved time to solve the query-efficiency problem.
Put differently, data is growing faster than hardware improves: a single machine can no longer process such volumes fast enough, so only distributed, concurrent processing across many nodes meets the speed requirement.
3. Distribution has a cost: inter-node communication, however small, becomes a bottleneck. From this angle, if the data volume is not actually that large, a distributed setup can lose to a well-tuned single node.
4. The upside of distributed processing is that a pile of mid-range machines can match one high-performance computer, with data redundancy built in, which reduces maintenance per node; total maintenance effort still rises, though. So adoption comes down to weighing data volume against maintenance cost.

OK, I digress... let's continue.

The first step, of course, is to look at the help:

hadoop@hadoop-master:~/Deploy/hbase-1.1.2$ bin/hbase shell

hbase(main):061:0> help
HBase Shell, version 1.1.2, rcc2b70cf03e3378800661ec5cab11eb43fafe0fc, Wed Aug 26 20:11:27 PDT 2015
Type 'help "COMMAND"', (e.g. 'help "get"' -- the quotes are necessary) for help on a specific command.
Commands are grouped. Type 'help "COMMAND_GROUP"', (e.g. 'help "general"') for help on a command group.

COMMAND GROUPS:
  Group name: general
  Commands: status, table_help, version, whoami

  Group name: ddl
  Commands: alter, alter_async, alter_status, create, describe, disable, disable_all, drop, drop_all, enable, enable_all, exists, get_table, is_disabled, is_enabled, list, show_filters

  Group name: namespace
  Commands: alter_namespace, create_namespace, describe_namespace, drop_namespace, list_namespace, list_namespace_tables

  Group name: dml
  Commands: append, count, delete, deleteall, get, get_counter, get_splits, incr, put, scan, truncate, truncate_preserve

  Group name: tools
  Commands: assign, balance_switch, balancer, balancer_enabled, catalogjanitor_enabled, catalogjanitor_run, catalogjanitor_switch, close_region, compact, compact_rs, flush, major_compact, merge_region, move, split, trace, unassign, wal_roll, zk_dump

  Group name: replication
  Commands: add_peer, append_peer_tableCFs, disable_peer, disable_table_replication, enable_peer, enable_table_replication, list_peers, list_replicated_tables, remove_peer, remove_peer_tableCFs, set_peer_tableCFs, show_peer_tableCFs

  Group name: snapshots
  Commands: clone_snapshot, delete_all_snapshot, delete_snapshot, list_snapshots, restore_snapshot, snapshot

  Group name: configuration
  Commands: update_all_config, update_config

  Group name: quotas
  Commands: list_quotas, set_quota

  Group name: security
  Commands: grant, revoke, user_permission

  Group name: visibility labels
  Commands: add_labels, clear_auths, get_auths, list_labels, set_auths, set_visibility

SHELL USAGE:
Quote all names in HBase Shell such as table and column names.  Commas delimit
command parameters.  Type <RETURN> after entering a command to run it.
Dictionaries of configuration used in the creation and alteration of tables are
Ruby Hashes. They look like this:

  {'key1' => 'value1', 'key2' => 'value2', ...}

and are opened and closed with curley-braces.  Key/values are delimited by the
'=>' character combination.  Usually keys are predefined constants such as
NAME, VERSIONS, COMPRESSION, etc.  Constants do not need to be quoted.  Type
'Object.constants' to see a (messy) list of all constants in the environment.

If you are using binary keys or values and need to enter them in the shell, use
double-quote'd hexadecimal representation. For example:

  hbase> get 't1', "key\x03\x3f\xcd"
  hbase> get 't1', "key\003\023\011"
  hbase> put 't1', "test\xef\xff", 'f1:', "\x01\x33\x40"

The HBase shell is the (J)Ruby IRB with the above HBase-specific commands added.
For more on the HBase Shell, see http://hbase.apache.org/book.html
hbase(main):062:0> 

Continue reading Basic HBase Operations 01

Setting Up an HBase Cluster

1. Get a Hadoop cluster up and running

2. Check HBase's compatibility with Hadoop and download the matching versions (*if you want to follow the later posts, I recommend hadoop-2.5.2, hbase-1.1.2, hive-1.2.1, spark-2.0.0)
S: supported
X: not supported
NT: not tested

                    HBase-0.94.x  HBase-0.98.x*  HBase-1.0.x**  HBase-1.1.x  HBase-1.2.x
Hadoop-1.0.x        X             X              X              X            X
Hadoop-1.1.x        S             NT             X              X            X
Hadoop-0.23.x       S             X              X              X            X
Hadoop-2.0.x-alpha  NT            X              X              X            X
Hadoop-2.1.0-beta   NT            X              X              X            X
Hadoop-2.2.0        NT            S              NT             NT           NT
Hadoop-2.3.x        NT            S              NT             NT           NT
Hadoop-2.4.x        NT            S              S              S            S
Hadoop-2.5.x        NT            S              S              S            S
Hadoop-2.6.0        X             X              X              X            X
Hadoop-2.6.1+       NT            NT             NT             NT           S
Hadoop-2.7.0        X             X              X              X            X
Hadoop-2.7.1+       NT            NT             NT             NT           S

*  Support for Hadoop 1.1+ is deprecated.
** Hadoop 1.x is not supported.
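
To check what you are actually running against this matrix, a quick sketch (assuming the deployment layout used elsewhere in these posts):

hadoop version
bin/hbase version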

Continue reading Setting Up an HBase Cluster

Setting Up a Cassandra Cluster

1. Prepare the environment

VirtualBox4
Debian8
JDK8u60
Cassandra3

2. Install the virtual machines and the Guest Additions

su
apt-get install gcc
apt-get install linux-headers-$(uname -r)
apt-get install build-essential
./VBoxLinuxAdditions.run

Set up a shared folder and copy the needed files into the VM.
Alternatively, once ssh is set up in the VM, copy the files over with scp or WinSCP.

3. Configure two network adapters: the first host-only with a static IP, the second NAT with DHCP.
Edit /etc/network/interfaces:

auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
address 172.16.172.23
netmask 255.255.0.0
gateway 172.16.172.2

auto eth1
iface eth1 inet dhcp

Edit the hosts file:

#/etc/hosts
127.0.0.1	localhost
172.16.172.23	node01
172.16.172.24	node02
172.16.172.25	node03

Edit the hostname:

#/etc/hostname
node01

If needed (usually it isn't), edit /etc/resolv.conf:

nameserver xxx.xxx.xxx.xxx

Restart the network interfaces so the new configuration takes effect:

su
# ifdown/ifup re-read /etc/network/interfaces; a plain ifconfig down/up would not pick up the new static config
ifdown eth0 && ifup eth0
ifdown eth1 && ifup eth1

Continue reading Setting Up a Cassandra Cluster

A Simple eXistDB Trigger Example 03

  • Point the XCONF file at the path of an XQuery file
  • Embed the XQuery inside the XCONF file
  • Point the XCONF file at a Java class

The third approach uses the XCONF file to tell eXistDB which operations on which collection should fire, with the trigger pointing at a Java class.

1. First, write the trigger's Java class, package it into a jar, and drop it under %existdb_home%\lib\user.
TriggerTest.java

package com.neohope.existdb.test;


import org.exist.collections.Collection;
import org.exist.collections.triggers.DocumentTrigger;
import org.exist.collections.triggers.SAXTrigger;
import org.exist.collections.triggers.TriggerException;
import org.exist.dom.DocumentImpl;
import org.exist.dom.NodeSet;
import org.exist.security.PermissionDeniedException;
import org.exist.security.xacml.AccessContext;
import org.exist.storage.DBBroker;
import org.exist.storage.txn.Txn;
import org.exist.xmldb.XmldbURI;
import org.exist.xquery.CompiledXQuery;
import org.exist.xquery.XPathException;
import org.exist.xquery.XQueryContext;

import java.util.ArrayList;
import java.util.Map;

public class TriggerTest extends SAXTrigger implements DocumentTrigger {

    private String logCollection = "xmldb:exist:///db/Triggers";
    private String logFileName = "logj.xml";
    private String logUri;

    // Reads the optional LogFileName trigger parameter and derives the log document URI.
    @Override
    public void configure(DBBroker broker, Collection parent, Map parameters)
            throws TriggerException {
        super.configure(broker, parent, parameters);

        ArrayList<String> objList =  (ArrayList<String>)parameters.get("LogFileName");
        if(objList!=null && objList.size()>0)
        {
            logFileName= objList.get(0);
        }

        logUri = logCollection+"/"+logFileName;
    }

    @Override
    public void beforeCreateDocument(DBBroker broker, Txn transaction, XmldbURI uri) throws TriggerException {
        LogEvent(broker,uri.toString(),"beforeCreateDocument");
    }

    @Override
    public void afterCreateDocument(DBBroker broker, Txn transaction, DocumentImpl document) throws TriggerException {
       LogEvent(broker, document.getDocumentURI(),"afterCreateDocument");
    }

    @Override
    public void beforeUpdateDocument(DBBroker broker, Txn transaction, DocumentImpl document) throws TriggerException {
       LogEvent(broker,document.getDocumentURI(), "beforeUpdateDocument");
    }

    @Override
    public void afterUpdateDocument(DBBroker broker, Txn transaction, DocumentImpl document) throws TriggerException {
       LogEvent(broker, document.getDocumentURI(),"afterUpdateDocument");
    }

    @Override
    public void beforeMoveDocument(DBBroker broker, Txn transaction, DocumentImpl document, XmldbURI newUri) throws TriggerException {
       LogEvent(broker, document.getDocumentURI(),"beforeMoveDocument");
    }

    @Override
    public void afterMoveDocument(DBBroker broker, Txn transaction, DocumentImpl document, XmldbURI newUri) throws TriggerException {
       LogEvent(broker, document.getDocumentURI(),"afterMoveDocument");
    }

    @Override
    public void beforeCopyDocument(DBBroker broker, Txn transaction, DocumentImpl document, XmldbURI newUri) throws TriggerException {
       LogEvent(broker, document.getDocumentURI(),"beforeCopyDocument");
    }

    @Override
    public void afterCopyDocument(DBBroker broker, Txn transaction, DocumentImpl document, XmldbURI newUri) throws TriggerException {
       LogEvent(broker, document.getDocumentURI(),"afterCopyDocument");
    }

    @Override
    public void beforeDeleteDocument(DBBroker broker, Txn transaction, DocumentImpl document) throws TriggerException {
       LogEvent(broker, document.getDocumentURI(),"beforeDeleteDocument");
    }

    @Override
    public void afterDeleteDocument(DBBroker broker, Txn transaction, XmldbURI uri) throws TriggerException {
       LogEvent(broker, uri.toString(),"afterDeleteDocument");
    }

    @Override
    public void beforeUpdateDocumentMetadata(DBBroker broker, Txn txn, DocumentImpl document) throws TriggerException {
       LogEvent(broker, document.getDocumentURI(),"beforeUpdateDocumentMetadata");
    }

    @Override
    public void afterUpdateDocumentMetadata(DBBroker broker, Txn txn, DocumentImpl document) throws TriggerException {
       LogEvent(broker, document.getDocumentURI(),"afterUpdateDocumentMetadata");
    }

    // Appends a <trigger> element describing the event to the log document via an XQuery update.
    private void LogEvent(DBBroker broker, String uriFile, String logContent) throws TriggerException {
        String xQuery = "update insert <trigger event=\""+logContent+"\" uri=\""+uriFile+"\" timestamp=\"{current-dateTime()}\"/> into doc(\""+logUri+"\")/TriggerLogs";

        try {
            XQueryContext  context  = broker.getXQueryService().newContext(AccessContext.TRIGGER);
            CreateLogFile(broker,context);
            CompiledXQuery compiled = broker.getXQueryService().compile(context,xQuery);
            broker.getXQueryService().execute(compiled, NodeSet.EMPTY_SET);
        } catch (XPathException e) {
            e.printStackTrace();
        } catch (PermissionDeniedException e) {
            e.printStackTrace();
        }
    }

    // Creates the log document with an empty <TriggerLogs/> root if it does not yet exist.
    private void CreateLogFile(DBBroker broker, XQueryContext context)
    {
        String xQuery = "if (not(doc-available(\""+logUri+"\"))) then xmldb:store(\""+logCollection+"\", \""+logFileName+"\", <TriggerLogs/>) else ()";

        try {
            CompiledXQuery compiled = broker.getXQueryService().compile(context,xQuery);
            broker.getXQueryService().execute(compiled, NodeSet.EMPTY_SET);
        } catch (XPathException e) {
            e.printStackTrace();
        } catch (PermissionDeniedException e) {
            e.printStackTrace();
        }
    }
}
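
The XCONF file that registers this class is covered after the break; as a hedged sketch following the pattern of example 02 below, it might look roughly like this (the event list and parameter value are assumptions):

<collection xmlns="http://exist-db.org/collection-config/1.0">
    <triggers>
        <trigger event="create" class="com.neohope.existdb.test.TriggerTest">
            <!-- read by configure(); the value here is hypothetical -->
            <parameter name="LogFileName" value="logj.xml"/>
        </trigger>
    </triggers>
</collection>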

Continue reading A Simple eXistDB Trigger Example 03

A Simple eXistDB Trigger Example 02

  • Point the XCONF file at the path of an XQuery file
  • Embed the XQuery inside the XCONF file
  • Point the XCONF file at a Java class

The second approach uses the XCONF file to tell eXistDB which operations on which collection should fire, embedding the XQuery statements directly in the XCONF file.

1. In the configuration collection that corresponds to the collection you want to trigger on, add an xconf file; the file name is arbitrary, though the official recommendation is collection.xconf. The configuration collection mirrors the original one: under /db/system/config/db, create the same collection hierarchy as under /db.
For example, to monitor /db/cda02, add a collection.xconf under /db/system/config/db/cda02.
collection.xconf

<collection xmlns="http://exist-db.org/collection-config/1.0">
    <triggers>
        <trigger event="create" class="org.exist.collections.triggers.XQueryTrigger">
            <parameter name="query" value="
             xquery version '3.0';

             module namespace trigger='http://exist-db.org/xquery/trigger';
             declare namespace xmldb='http://exist-db.org/xquery/xmldb';

             declare function trigger:before-create-document($uri as xs:anyURI)
             {
                 local:log-event('before', 'create', 'document', $uri)
             };

             declare function trigger:after-create-document($uri as xs:anyURI)
             {
                 local:log-event('after', 'create', 'document', $uri)
             };
             
             declare function trigger:before-delete-document($uri as xs:anyURI)
             {
                 local:log-event('before', 'delete', 'document', $uri)
             };
             
             declare function trigger:after-delete-document($uri as xs:anyURI)
             {
                 local:log-event('after', 'delete', 'document', $uri)
             };
             
             declare function local:log-event($type as xs:string, $event as xs:string, $object-type as xs:string, $uri as xs:string)
             {
                 let $log-collection := '/db/Triggers'
                 let $log := 'log02.xml'
                 let $log-uri := concat($log-collection, '/', $log)
                 return
                 (
                     (: util:log does not work at all
                         util:log('warn', 'trigger fired'),
                     :)
                     
                     (: create the log file if it does not exist :)
                     if (not(doc-available($log-uri))) then
                         xmldb:store($log-collection, $log, &lt;triggers/&gt;)
                     else ()
                     ,
                     (: log the trigger details to the log file :)
                      update insert &lt;trigger event='{string-join(($type, $event, $object-type), '-')}' uri='{$uri}' timestamp='{current-dateTime()}'/&gt; into doc($log-uri)/triggers
                  )
              };"/>
        </trigger>
    </triggers>
</collection>

Continue reading A Simple eXistDB Trigger Example 02

A Simple eXistDB Trigger Example 01

  • Point the XCONF file at the path of an XQuery file
  • Embed the XQuery inside the XCONF file
  • Point the XCONF file at a Java class

The first approach uses the XCONF file to tell eXistDB which operations on which collection should fire, with the trigger pointing at an XQM file.

1. First, write the trigger's xqm file; for example, mine is stored at /db/Triggers/TriggerTest01.xqm.
TriggerTest01.xqm

xquery version "3.0";

module namespace trigger="http://exist-db.org/xquery/trigger";

declare namespace xmldb="http://exist-db.org/xquery/xmldb";

declare function trigger:before-create-collection($uri as xs:anyURI)
{
    local:log-event("before", "create", "collection", $uri)
};

declare function trigger:after-create-collection($uri as xs:anyURI)
{
    local:log-event("after", "create", "collection", $uri)
};

declare function trigger:before-copy-collection($uri as xs:anyURI, $new-uri as xs:anyURI)
{
    local:log-event("before", "copy", "collection", concat("from: ", $uri, " to:", $new-uri))
};

declare function trigger:after-copy-collection($new-uri as xs:anyURI, $uri as xs:anyURI)
{
    local:log-event("after", "copy", "collection", concat("from: ", $uri, " to:", $new-uri))
};

declare function trigger:before-move-collection($uri as xs:anyURI, $new-uri as xs:anyURI)
{
    local:log-event("before", "move", "collection", concat("from: ", $uri, " to:", $new-uri))
};

declare function trigger:after-move-collection($new-uri as xs:anyURI, $uri as xs:anyURI)
{
    local:log-event("after", "move", "collection", concat("from: ", $uri, " to:", $new-uri))
};

declare function trigger:before-delete-collection($uri as xs:anyURI)
{
    local:log-event("before", "delete", "collection", $uri)
};

declare function trigger:after-delete-collection($uri as xs:anyURI)
{
    local:log-event("after", "delete", "collection", $uri)
};

declare function trigger:before-create-document($uri as xs:anyURI)
{
    local:log-event("before", "create", "document", $uri)
};

declare function trigger:after-create-document($uri as xs:anyURI)
{
    local:log-event("after", "create", "document", $uri)
};

declare function trigger:before-update-document($uri as xs:anyURI)
{
    local:log-event("before", "update", "document", $uri)
};

declare function trigger:after-update-document($uri as xs:anyURI)
{
    local:log-event("after", "update", "document", $uri)
};

declare function trigger:before-copy-document($uri as xs:anyURI, $new-uri as xs:anyURI)
{
    local:log-event("before", "copy", "document", concat("from: ", $uri, " to: ", $new-uri))
};

declare function trigger:after-copy-document($new-uri as xs:anyURI, $uri as xs:anyURI)
{
    local:log-event("after", "copy", "document", concat("from: ", $uri, " to: ", $new-uri))
};

declare function trigger:before-move-document($uri as xs:anyURI, $new-uri as xs:anyURI)
{
    local:log-event("before", "move", "document", concat("from: ", $uri, " to: ", $new-uri))
};

declare function trigger:after-move-document($new-uri as xs:anyURI, $uri as xs:anyURI)
{
    local:log-event("after", "move", "document", concat("from: ", $uri, " to: ", $new-uri))
};

declare function trigger:before-delete-document($uri as xs:anyURI)
{
    local:log-event("before", "delete", "document", $uri)
};

declare function trigger:after-delete-document($uri as xs:anyURI)
{
    local:log-event("after", "delete", "document", $uri)
};

declare function local:log-event($type as xs:string, $event as xs:string, $object-type as xs:string, $uri as xs:string)
{
    let $log-collection := "/db/Triggers"
    let $log := "log01.xml"
    let $log-uri := concat($log-collection, "/", $log)
    return
    (
        (: create the log file if it does not exist :)
        if (not(doc-available($log-uri))) then
            xmldb:store($log-collection, $log, <triggers/>)
        else ()
        ,
        (: log the trigger details to the log file :)
        update insert <trigger event="{string-join(($type, $event, $object-type), '-')}" uri="{$uri}" timestamp="{current-dateTime()}"/> into doc($log-uri)/triggers
    )
};
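
The XCONF file that points at this module is covered after the break; a hedged sketch in the style of example 02 above, using a url parameter instead of an inline query (the parameter name is my assumption):

<collection xmlns="http://exist-db.org/collection-config/1.0">
    <triggers>
        <trigger event="create" class="org.exist.collections.triggers.XQueryTrigger">
            <!-- assumed: point the trigger at the module stored above -->
            <parameter name="url" value="xmldb:exist:///db/Triggers/TriggerTest01.xqm"/>
        </trigger>
    </triggers>
</collection>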

Continue reading A Simple eXistDB Trigger Example 01

Integrating Redis with a Tomcat Cluster

When site traffic climbs sharply, the usual answer is to scale out with a cluster.
For stateless applications, fronting them with nginx is enough.
But for stateful applications, anything with login state for example, scaling behind nginx also raises the problem of session sharing.

You can share sessions with apache+tomcat, but it is inefficient and error-prone.
This post instead uses nginx + tomcat + redis + tomcat-redis-session-manager.
The principle is simple:

1. tomcat-redis-session-manager extends
org.apache.catalina.valves.ValveBase;
org.apache.catalina.session.ManagerBase;
org.apache.catalina.session.StandardSession;
and swaps them in via Tomcat configuration.

2. On attribute set, the whole Tomcat session is split into SessionSerializationMetadata + RedisSession, serialized to byte[], and stored in Redis with the session id as the key.

3. On attribute get, the byte[] is fetched from Redis by session id and deserialized back into SessionSerializationMetadata + RedisSession for Tomcat to use. Application code itself stays untouched, as the sketch below shows.
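
Because the swap happens at the Valve/Manager level, application code stays plain servlet code; a minimal sketch (the servlet and attribute names are made up):

import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpSession;

// Nothing Redis-specific here: with RedisSessionManager configured in context.xml,
// the session behind getSession() is a RedisSession, and what setAttribute stores
// is what ends up serialized to byte[] in Redis under the session id.
public class LoginServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        HttpSession session = req.getSession();
        session.setAttribute("user", "zhangsan");
        resp.getWriter().println("session id: " + session.getId());
    }
}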

Configuration is just as simple:
1. Download the tomcat-redis-session-manager source from github

2. Build with gradle

# on the master branch, remove the signing and uploadArchives sections first, or the build fails
# releases don't need this
gradle build

3. Copy the three jars into Tomcat's lib folder

#%TOMCAT_HOME%/lib
tomcat-redis-session-manager-master-2.0.0.jar
commons-pool2-2.2.jar
jedis-2.5.2.jar

4. Edit context.xml, add the following, and you're done (host/port point at your Redis instance):

<!--%TOMCAT_HOME%/conf/context.xml-->
  <Valve className="com.orangefunction.tomcat.redissessions.RedisSessionHandlerValve" />
  <Manager className="com.orangefunction.tomcat.redissessions.RedisSessionManager" 
  host="localhost" port="6379" database="0" maxInactiveInterval="60"/>

Pro: no application changes are needed.
Con: each request spends extra time on (serialize + save to Redis) and (fetch from Redis + deserialize).