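First, how HBase differs from a traditional relational database at the logical level:
1. An HBase table definition does not declare columns; it declares only column families (for now, think of a column family as a collection of columns). So when you create a table, you specify just the column families. Adding a new column to a family needs no prior declaration; you simply write to it.
2. In HBase, rows are located by key, and scans are driven by key as well. The choice of row key is therefore critical. The table layout and the row key design together effectively determine whether the data can be accessed efficiently.
3. In HBase, row key + column family name + column name identifies a single cell. (Strictly speaking, a cell can also hold several timestamped versions; a read returns the latest one by default.)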
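The three points above can be sketched with a toy in-memory model. This is only an illustration, not the HBase client API: the class `TinyTable`, its methods, and the sample row keys are all hypothetical, and versions/timestamps are left out. It shows that column families are fixed at creation, that column names are free-form, and that a scan is a range over sorted row keys.

```python
class TinyTable:
    """Toy model of an HBase table (hypothetical; not the real HBase API)."""

    def __init__(self, families):
        # Only column families are declared when the table is created.
        self.families = set(families)
        # row key -> {(family, qualifier): value}
        self.rows = {}

    def put(self, row_key, column, value):
        # A column is written as "family:qualifier", as in the HBase shell.
        family, qualifier = column.split(":", 1)
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        # The qualifier itself needs no prior declaration.
        self.rows.setdefault(row_key, {})[(family, qualifier)] = value

    def get(self, row_key, column):
        # Row key + family + qualifier addresses exactly one cell.
        family, qualifier = column.split(":", 1)
        return self.rows[row_key][(family, qualifier)]

    def scan(self, start_row, stop_row):
        # Rows are visited in sorted key order; stop_row is exclusive.
        # (HBase compares keys as bytes; for ASCII keys, str order matches.)
        for key in sorted(self.rows):
            if start_row <= key < stop_row:
                yield key, self.rows[key]


table = TinyTable(["info", "stats"])
table.put("user#0001", "info:name", "alice")
table.put("user#0001", "info:email", "alice@example.com")  # new column, no DDL
table.put("user#0002", "info:name", "bob")
table.put("order#0001", "stats:total", "42")

# All rows whose key starts with "user#": '$' is the character after '#'.
user_keys = [key for key, _ in table.scan("user#", "user$")]
```

Because the scan is just a key range, putting the entity type at the front of the row key is what makes "all users" a cheap contiguous read in this sketch; a different key layout would scatter those rows.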
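From this you can already see:
1. By placing some constraints on rows, HBase makes columns flexible, which solves the problem of column extensibility.
2. In practice, we often write a main table and several of its related tables together into the columns of one row, duplicates and all. We trade wasted space for saved time, which solves the problem of query efficiency.
In other words, data is growing faster than hardware is improving. A single machine can no longer process that volume of data, so the only way to reach the required speed is to go distributed: split the work across many nodes and process it in parallel.
3. Distributed processing has a price. Communication between nodes, however small, is a bottleneck. From that angle, if the data volume is not that large, distributed processing can actually perform worse than optimizing a single node.
4. The advantage of distributed processing is that a pile of ordinary machines can match the processing speed of one high-performance computer, with data redundancy built in, which saves some of the effort otherwise spent protecting data. Overall, though, the maintenance burden still goes up. Whether to adopt it comes down to weighing data volume against maintenance cost.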
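The space-for-time trade in point 2 of this list can be illustrated with a small sketch. The tables, field names, and column names below are made up for the example; the point is only that the wide row copies customer fields into every order at write time, so a read needs one lookup instead of two.

```python
# Relational style: two tables, joined at read time.
customers = {"c1": {"name": "alice", "city": "Beijing"}}
orders = {
    "o1": {"customer_id": "c1", "item": "book"},
    "o2": {"customer_id": "c1", "item": "pen"},
}


def order_with_customer_joined(order_id):
    """Read one order plus its customer: two lookups (the 'join')."""
    order = orders[order_id]
    customer = customers[order["customer_id"]]
    return {**order, **customer}


def build_wide_rows():
    """Denormalize: copy the customer fields into every order row."""
    wide = {}
    for order_id, order in orders.items():
        customer = customers[order["customer_id"]]
        wide[order_id] = {
            "order:item": order["item"],
            "customer:name": customer["name"],  # duplicated per order
            "customer:city": customer["city"],  # duplicated per order
        }
    return wide


wide_rows = build_wide_rows()

# One lookup by row key returns everything; no join at read time.
row = wide_rows["o2"]
```

The cost is visible in the data: the customer's name and city are stored once per order, and a change to the customer would have to be rewritten in every copy.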
Oh, I've wandered off topic... let's continue.
The first step, of course, is to look at the help:
hadoop@hadoop-master:~/Deploy/hbase-1.1.2$ bin/hbase shell
hbase(main):061:0> help
HBase Shell, version 1.1.2, rcc2b70cf03e3378800661ec5cab11eb43fafe0fc, Wed Aug 26 20:11:27 PDT 2015
Type 'help "COMMAND"', (e.g. 'help "get"' -- the quotes are necessary) for help on a specific command.
Commands are grouped. Type 'help "COMMAND_GROUP"', (e.g. 'help "general"') for help on a command group.
COMMAND GROUPS:
Group name: general
Commands: status, table_help, version, whoami
Group name: ddl
Commands: alter, alter_async, alter_status, create, describe, disable, disable_all, drop, drop_all, enable, enable_all, exists, get_table, is_disabled, is_enabled, list, show_filters
Group name: namespace
Commands: alter_namespace, create_namespace, describe_namespace, drop_namespace, list_namespace, list_namespace_tables
Group name: dml
Commands: append, count, delete, deleteall, get, get_counter, get_splits, incr, put, scan, truncate, truncate_preserve
Group name: tools
Commands: assign, balance_switch, balancer, balancer_enabled, catalogjanitor_enabled, catalogjanitor_run, catalogjanitor_switch, close_region, compact, compact_rs, flush, major_compact, merge_region, move, split, trace, unassign, wal_roll, zk_dump
Group name: replication
Commands: add_peer, append_peer_tableCFs, disable_peer, disable_table_replication, enable_peer, enable_table_replication, list_peers, list_replicated_tables, remove_peer, remove_peer_tableCFs, set_peer_tableCFs, show_peer_tableCFs
Group name: snapshots
Commands: clone_snapshot, delete_all_snapshot, delete_snapshot, list_snapshots, restore_snapshot, snapshot
Group name: configuration
Commands: update_all_config, update_config
Group name: quotas
Commands: list_quotas, set_quota
Group name: security
Commands: grant, revoke, user_permission
Group name: visibility labels
Commands: add_labels, clear_auths, get_auths, list_labels, set_auths, set_visibility
SHELL USAGE:
Quote all names in HBase Shell such as table and column names. Commas delimit
command parameters. Type <RETURN> after entering a command to run it.
Dictionaries of configuration used in the creation and alteration of tables are
Ruby Hashes. They look like this:
{'key1' => 'value1', 'key2' => 'value2', ...}
and are opened and closed with curley-braces. Key/values are delimited by the
'=>' character combination. Usually keys are predefined constants such as
NAME, VERSIONS, COMPRESSION, etc. Constants do not need to be quoted. Type
'Object.constants' to see a (messy) list of all constants in the environment.
If you are using binary keys or values and need to enter them in the shell, use
double-quote'd hexadecimal representation. For example:
hbase> get 't1', "key\x03\x3f\xcd"
hbase> get 't1', "key\003\023\011"
hbase> put 't1', "test\xef\xff", 'f1:', "\x01\x33\x40"
The HBase shell is the (J)Ruby IRB with the above HBase-specific commands added.
For more on the HBase Shell, see http://hbase.apache.org/book.html
hbase(main):062:0>
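The binary-key examples in the help text use hexadecimal (`\x03`) and octal (`\003`) escapes inside double-quoted strings. As a rough illustration of which bytes those escapes stand for, here is the same data written as Python byte literals. This is a side-by-side sketch, not HBase shell code; the helper `describe` is hypothetical.

```python
# The keys and value from the help text, as Python bytes.
hex_key = b"key\x03\x3f\xcd"    # hexadecimal escapes
octal_key = b"key\003\023\011"  # octal escapes
value = b"\x01\x33\x40"


def describe(raw):
    """Render each byte as two hex digits, separated by spaces."""
    return " ".join(f"{byte:02x}" for byte in raw)


hex_dump = describe(hex_key)      # 6b 65 79 03 3f cd
octal_dump = describe(octal_key)  # 6b 65 79 03 13 09
value_dump = describe(value)      # 01 33 40
```

Note that the two `get` examples address different keys: the octal escapes `\023\011` are bytes 0x13 and 0x09, not 0x3f and 0xcd. The double quotes matter because, in Ruby, single-quoted strings do not interpret these escapes.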