Basic HBase Operations 03

Continuing from the previous post: one big difference between HBase and a traditional database is that HBase can keep multiple versions of a value, not just the latest one. Versions are distinguished by the timestamp attribute.

Let's look at this concretely. First we add a row visit100, then set the column personinfo:name several times, querying the latest value after each put:

hbase(main):011:0> put 'patientvisit','visit100','personinfo:name','A'
0 row(s) in 0.0890 seconds

hbase(main):012:0> get 'patientvisit','visit100','personinfo:name'
COLUMN                                        CELL                                                                                                                               
 personinfo:name                              timestamp=1454342758182, value=A                                                                                                   
1 row(s) in 0.0310 seconds

hbase(main):013:0> put 'patientvisit','visit100','personinfo:name','B'
0 row(s) in 0.0100 seconds

hbase(main):014:0> get 'patientvisit','visit100','personinfo:name'
COLUMN                                        CELL                                                                                                                               
 personinfo:name                              timestamp=1454342796166, value=B                                                                                                   
1 row(s) in 0.0230 seconds

hbase(main):015:0> put 'patientvisit','visit100','personinfo:name','B'
0 row(s) in 0.0140 seconds

hbase(main):016:0> get 'patientvisit','visit100','personinfo:name'
COLUMN                                        CELL                                                                                                                               
 personinfo:name                              timestamp=1454342802829, value=B                                                                                                   
1 row(s) in 0.0120 seconds

As you can see, every time the column is set, the timestamp changes accordingly:

timestamp value
1454342758182 A
1454342796166 B
1454342802829 B
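
A plain get only ever returns the newest cell. Whether the older values can still be read back depends on the column family's VERSIONS setting (in HBase 1.x a family keeps only one version per column by default). Assuming the family retains several versions, the shell can fetch them with get 'patientvisit','visit100',{COLUMN=>'personinfo:name',VERSIONS=>3}, and a minimal Java sketch (assuming the HBase client configuration, e.g. hbase-site.xml, is on the classpath) looks like this:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class GetVersionsTest {
	public static void main(String[] args) throws IOException {
		// reads hbase-site.xml from the classpath
		Configuration conf = HBaseConfiguration.create();
		try (Connection conn = ConnectionFactory.createConnection(conf);
				Table table = conn.getTable(TableName.valueOf("patientvisit"))) {
			Get get = new Get(Bytes.toBytes("visit100"));
			get.addColumn(Bytes.toBytes("personinfo"), Bytes.toBytes("name"));
			get.setMaxVersions(3); // ask for up to 3 versions instead of only the latest
			Result result = table.get(get);
			for (Cell cell : result.getColumnCells(Bytes.toBytes("personinfo"), Bytes.toBytes("name"))) {
				// print "timestamp => value" for each stored version
				System.out.println(cell.getTimestamp() + " => " + Bytes.toString(CellUtil.cloneValue(cell)));
			}
		}
	}
}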


Basic HBase Operations 02

Continuing from the previous post: what does the data in the patientvisit table end up looking like?
To make it easier to read, I have split the data into two tables:

Row Key personinfo personinfoex
empi name sex birthday address patid
visit001 empi001 zhangsan male 1999-12-31 shanghai xxx road pat001
visit002 empi002 lisi male 2000-01-01 beijing pat002
visit003 empi003 wangwu female 1999-12-30 guangzhou pat002

Row Key visitinfo visitinfoex
visitid visittime visitdocid visitdocname
visit001 visit001 2015-07-25 10:10:00 doc001 Dr. Yang
visit002 visit002 2015-07-26 11:11:00 doc001 Dr. Yang
visit003 visit003 2015-07-27 13:13:00 doc002 Dr. Li
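
To check what actually landed in the table, a full scan from the Java client can dump every cell. Here is a minimal sketch (it assumes the HBase client configuration is on the classpath; it simply prints row key, family:qualifier, and value):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class ScanDumpTest {
	public static void main(String[] args) throws IOException {
		Configuration conf = HBaseConfiguration.create();
		try (Connection conn = ConnectionFactory.createConnection(conf);
				Table table = conn.getTable(TableName.valueOf("patientvisit"));
				ResultScanner scanner = table.getScanner(new Scan())) {
			for (Result row : scanner) {
				for (Cell cell : row.rawCells()) {
					// print "rowkey family:qualifier = value" for every stored cell
					System.out.println(Bytes.toString(CellUtil.cloneRow(cell)) + " "
							+ Bytes.toString(CellUtil.cloneFamily(cell)) + ":"
							+ Bytes.toString(CellUtil.cloneQualifier(cell)) + " = "
							+ Bytes.toString(CellUtil.cloneValue(cell)));
				}
			}
		}
	}
}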


Basic HBase Operations 01

First, a few ways in which HBase differs from a traditional relational database at the logical level:
1. An HBase table definition does not declare columns, only column families (for now, think of a column family as a collection of columns). So when creating a table you only specify the column families, and new columns inside a family need no prior declaration; you simply use them (see the sketch after this list).
2. In HBase, rows are located by key, and scans also run over keys, so the choice of row key matters a great deal. The table design and the row key design largely determine whether the data can be used efficiently.
3. In HBase, row key + column family + column qualifier identifies exactly one cell.
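
For example, creating the patientvisit table used in the follow-up posts needs nothing more than the table name and its column families. The shell command is simply create 'patientvisit','personinfo','personinfoex', and a minimal sketch with the HBase 1.x Java client (assuming the client configuration is on the classpath) looks like this:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class CreateTableTest {
	public static void main(String[] args) throws IOException {
		Configuration conf = HBaseConfiguration.create();
		try (Connection conn = ConnectionFactory.createConnection(conf);
				Admin admin = conn.getAdmin()) {
			// only column families are declared; individual columns are created on first put
			HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("patientvisit"));
			desc.addFamily(new HColumnDescriptor("personinfo"));
			desc.addFamily(new HColumnDescriptor("personinfoex"));
			admin.createTable(desc);
		}
	}
}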

In fact, from the points above you can see:
1. By constraining how rows work, HBase gains flexible column handling and solves the column-extension problem.
2. In real applications we often write the main table and its related tables into the columns together, duplicates and all, trading wasted space for saved time and thereby solving the query-efficiency problem.
Put another way, data is growing faster than hardware improves; a single machine can no longer process such volumes fast enough, so the only way to meet the speed requirement is distributed processing, splitting the work across many nodes and running it concurrently.
3. Distributed processing has its own cost: inter-node communication, however small, becomes a bottleneck. From that angle, if the data volume is not actually that large, a distributed setup can easily do worse than a well-tuned single node.
4. The upside of distributed processing is that a pile of ordinary machines can match the throughput of a single high-end machine, and the built-in data-redundancy mechanism reduces per-node maintenance; overall, though, the maintenance burden still rises. Whether to adopt it comes down to weighing data volume against maintenance cost.

OK, I have wandered off topic... let's get back to it.

The first step, of course, is to look at the help:

hadoop@hadoop-master:~/Deploy/hbase-1.1.2$ bin/hbase shell

hbase(main):061:0> help
HBase Shell, version 1.1.2, rcc2b70cf03e3378800661ec5cab11eb43fafe0fc, Wed Aug 26 20:11:27 PDT 2015
Type 'help "COMMAND"', (e.g. 'help "get"' -- the quotes are necessary) for help on a specific command.
Commands are grouped. Type 'help "COMMAND_GROUP"', (e.g. 'help "general"') for help on a command group.

COMMAND GROUPS:
  Group name: general
  Commands: status, table_help, version, whoami

  Group name: ddl
  Commands: alter, alter_async, alter_status, create, describe, disable, disable_all, drop, drop_all, enable, enable_all, exists, get_table, is_disabled, is_enabled, list, show_filters

  Group name: namespace
  Commands: alter_namespace, create_namespace, describe_namespace, drop_namespace, list_namespace, list_namespace_tables

  Group name: dml
  Commands: append, count, delete, deleteall, get, get_counter, get_splits, incr, put, scan, truncate, truncate_preserve

  Group name: tools
  Commands: assign, balance_switch, balancer, balancer_enabled, catalogjanitor_enabled, catalogjanitor_run, catalogjanitor_switch, close_region, compact, compact_rs, flush, major_compact, merge_region, move, split, trace, unassign, wal_roll, zk_dump

  Group name: replication
  Commands: add_peer, append_peer_tableCFs, disable_peer, disable_table_replication, enable_peer, enable_table_replication, list_peers, list_replicated_tables, remove_peer, remove_peer_tableCFs, set_peer_tableCFs, show_peer_tableCFs

  Group name: snapshots
  Commands: clone_snapshot, delete_all_snapshot, delete_snapshot, list_snapshots, restore_snapshot, snapshot

  Group name: configuration
  Commands: update_all_config, update_config

  Group name: quotas
  Commands: list_quotas, set_quota

  Group name: security
  Commands: grant, revoke, user_permission

  Group name: visibility labels
  Commands: add_labels, clear_auths, get_auths, list_labels, set_auths, set_visibility

SHELL USAGE:
Quote all names in HBase Shell such as table and column names.  Commas delimit
command parameters.  Type <RETURN> after entering a command to run it.
Dictionaries of configuration used in the creation and alteration of tables are
Ruby Hashes. They look like this:

  {'key1' => 'value1', 'key2' => 'value2', ...}

and are opened and closed with curley-braces.  Key/values are delimited by the
'=>' character combination.  Usually keys are predefined constants such as
NAME, VERSIONS, COMPRESSION, etc.  Constants do not need to be quoted.  Type
'Object.constants' to see a (messy) list of all constants in the environment.

If you are using binary keys or values and need to enter them in the shell, use
double-quote'd hexadecimal representation. For example:

  hbase> get 't1', "key\x03\x3f\xcd"
  hbase> get 't1', "key\003\023\011"
  hbase> put 't1', "test\xef\xff", 'f1:', "\x01\x33\x40"

The HBase shell is the (J)Ruby IRB with the above HBase-specific commands added.
For more on the HBase Shell, see http://hbase.apache.org/book.html
hbase(main):062:0> 


Setting Up an HBase Cluster

1. Have a Hadoop cluster up and running

2. Check HBase's compatibility with your Hadoop version and download a matching release (*if you plan to follow the later posts, I recommend hadoop-2.5.2, hbase-1.1.2, hive-1.2.1, and spark-2.0.0)
S: supported
X: not supported
NT: not tested

                      HBase-0.94.x   HBase-0.98.x   HBase-1.0.x   HBase-1.1.x   HBase-1.2.x
Hadoop-1.0.x          X              X              X             X             X
Hadoop-1.1.x          S              NT             X             X             X
Hadoop-0.23.x         S              X              X             X             X
Hadoop-2.0.x-alpha    NT             X              X             X             X
Hadoop-2.1.0-beta     NT             X              X             X             X
Hadoop-2.2.0          NT             S              NT            NT            NT
Hadoop-2.3.x          NT             S              NT            NT            NT
Hadoop-2.4.x          NT             S              S             S             S
Hadoop-2.5.x          NT             S              S             S             S
Hadoop-2.6.0          X              X              X             X             X
Hadoop-2.6.1+         NT             NT             NT            NT            S
Hadoop-2.7.0          X              X              X             X             X
Hadoop-2.7.1+         NT             NT             NT            NT            S
(Notes: for HBase-0.98.x, support for Hadoop 1.1+ is deprecated; HBase-1.0.x does not support Hadoop 1.x.)


Setting Up a Cassandra Cluster

1. Prerequisites

VirtualBox4
Debian8
JDK8u60
Cassandra3

2. Install the virtual machines and the Guest Additions

su
apt-get install gcc
apt-get install linux-headers-$(uname -r)
apt-get install build-essential
./VBoxLinuxAdditions.run

Set up a shared folder and copy the required files into the VM.
Alternatively, once ssh is configured on the VM, you can copy the files over with scp or WinSCP.

3. Configure two network adapters: the first host-only with a static IP, the second NAT using DHCP.
Edit the configuration file /etc/network/interfaces:

auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
address 172.16.172.23
netmask 255.255.0.0
gateway 172.16.172.2

auto eth1
iface eth1 inet dhcp

Edit the hosts file:

#/etc/hosts
127.0.0.1	localhost
172.16.172.23	node01
172.16.172.24	node02
172.16.172.25	node03

Edit the hostname:

#/etc/hostname
node01

If needed (usually it is not), edit the configuration file /etc/resolv.conf:

nameserver xxx.xxx.xxx.xxx

Restart the network interfaces:

su
ifconfig eth0 down
ifconfig eth0 up
ifconfig eth1 down
ifconfig eth1 up


Hadoop CRUD Operations (Java)

All the required jars can be found in the Hadoop distribution. The example below needs at least these:

commons-cli-1.2.jar
commons-collections-3.2.1.jar
commons-configuration-1.6.jar
commons-io-2.4.jar
commons-lang-2.6.jar
commons-logging-1.1.3.jar
guava-11.0.2.jar
hadoop-auth-2.7.1.jar
hadoop-common-2.7.1.jar
hadoop-hdfs-2.7.1.jar
htrace-core-3.1.0-incubating.jar
log4j-1.2.17.jar
protobuf-java-2.5.0.jar
servlet-api.jar
slf4j-api-1.7.10.jar
slf4j-log4j12-1.7.10.jar

The code is as follows:

package com.neohope.hadoop.test;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HDFSTest {

	static Configuration hdfsConfig;
	static {
		hdfsConfig = new Configuration();
		hdfsConfig.addResource(new Path("etc/hadoop/core-site.xml"));
		hdfsConfig.addResource(new Path("etc/hadoop/hdfs-site.xml"));
	}

	// Create a directory
	public static void createDirectory(String dirPath) throws IOException {
		FileSystem fs = FileSystem.get(hdfsConfig);
		Path p = new Path(dirPath);
		try {
			fs.mkdirs(p);
		} finally {
			fs.close();
		}
	}

	// Delete a directory (deleteOnExit removes the path when the FileSystem is closed below)
	public static void deleteDirectory(String dirPath) throws IOException {
		FileSystem fs = FileSystem.get(hdfsConfig);
		Path p = new Path(dirPath);
		try {
			fs.deleteOnExit(p);
		} finally {
			fs.close();
		}
	}

	// Rename a directory
	public static void renameDirectory(String oldDirPath, String newDirPath)
			throws IOException {
		renameFile(oldDirPath, newDirPath);
	}

	// List the files in a directory
	public static void listFiles(String dirPath) throws IOException {
		FileSystem hdfs = FileSystem.get(hdfsConfig);
		Path listf = new Path(dirPath);
		try {
			FileStatus statuslist[] = hdfs.listStatus(listf);
			for (FileStatus status : statuslist) {
				System.out.println(status.getPath().toString());
			}
		} finally {
			hdfs.close();
		}
	}

	// Create an empty file
	public static void createFile(String filePath) throws IOException {
		FileSystem fs = FileSystem.get(hdfsConfig);
		Path p = new Path(filePath);
		try {
			fs.createNewFile(p);
		} finally {
			fs.close();
		}
	}

	// Delete a file (deleteOnExit removes the path when the FileSystem is closed below)
	public static void deleteFile(String filePath) throws IOException {
		FileSystem fs = FileSystem.get(hdfsConfig);
		Path p = new Path(filePath);
		try {
			fs.deleteOnExit(p);
		} finally {
			fs.close();
		}
	}

	// Rename a file
	public static void renameFile(String oldFilePath, String newFilePath)
			throws IOException {
		FileSystem fs = FileSystem.get(hdfsConfig);
		Path oldPath = new Path(oldFilePath);
		Path newPath = new Path(newFilePath);
		try {
			fs.rename(oldPath, newPath);
		} finally {
			fs.close();
		}
	}

	// Upload a local file to HDFS
	public static void putFile(String localPath, String hdfsPath)
			throws IOException {
		FileSystem fs = FileSystem.get(hdfsConfig);
		Path src = new Path(localPath);
		Path dst = new Path(hdfsPath);
		try {
			fs.copyFromLocalFile(src, dst);
		} finally {
			fs.close();
		}
	}

	// Download a file from HDFS to the local filesystem
	public static void getFile(String hdfsPath, String localPath)
			throws IOException {
		FileSystem fs = FileSystem.get(hdfsConfig);
		Path src = new Path(hdfsPath);
		Path dst = new Path(localPath);
		try {
			fs.copyToLocalFile(false, src, dst, true);
		} finally {
			fs.close();
		}
	}

	// Read a file and print its contents
	public static void readFile(String hdfsPath) throws IOException {
		FileSystem hdfs = FileSystem.get(hdfsConfig);
		Path filePath = new Path(hdfsPath);

		InputStream in = null;
		BufferedReader buff = null;
		try {
			in = hdfs.open(filePath);
			buff = new BufferedReader(new InputStreamReader(in));
			String str = null;
			while ((str = buff.readLine()) != null) {
				System.out.println(str);
			}
		} finally {
			if (buff != null) buff.close();
			if (in != null) in.close();
			hdfs.close();
		}
	}

	public static void main(String[] args) throws IOException {
		System.setProperty("HADOOP_USER_NAME", "hadoop");
		// createDirectory("hdfs://hadoop-master:9000/usr");
		// createDirectory("hdfs://hadoop-master:9000/usr/hansen");
		// createDirectory("hdfs://hadoop-master:9000/usr/hansen/test");
		// renameDirectory("hdfs://hadoop-master:9000/usr/hansen/test","hdfs://hadoop-master:9000/usr/hansen/test01");
		// createFile("hdfs://hadoop-master:9000/usr/hansen/test01/hello.txt");
		// renameFile("hdfs://hadoop-master:9000/usr/hansen/test01/hello.txt","hdfs://hadoop-master:9000/usr/hansen/test01/hello01.txt");
		// putFile("hello.txt","hdfs://hadoop-master:9000/usr/hansen/test01/hello02.txt");
		// getFile("hdfs://hadoop-master:9000/usr/hansen/test01/hello02.txt","hello02.txt");
		// readFile("hdfs://hadoop-master:9000/usr/hansen/test01/hello02.txt");
		listFiles("hdfs://hadoop-master:9000/usr/hansen/test01/");
	}

}
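
The class above only creates empty files or copies local files in; it never writes content directly. As a complement, a hypothetical writeFile helper (not part of the original code) could be added to HDFSTest roughly like this, using FileSystem.create; it needs one extra import, org.apache.hadoop.fs.FSDataOutputStream:

	// Hypothetical addition to HDFSTest: write a UTF-8 string into an HDFS file
	public static void writeFile(String hdfsPath, String content) throws IOException {
		FileSystem fs = FileSystem.get(hdfsConfig);
		Path p = new Path(hdfsPath);
		FSDataOutputStream out = null;
		try {
			out = fs.create(p, true); // overwrite if the file already exists
			out.write(content.getBytes("UTF-8"));
		} finally {
			if (out != null) out.close();
			fs.close();
		}
	}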

Notes on Building the Hadoop Linux Native Libraries

First, a note: if you just want to use the Linux native libraries, Hadoop already ships with them.

Second, if you do want to build them yourself, I recommend building from the Hadoop source tree following the official instructions rather than hand-rolling it the way I did...

If you enjoy tinkering, read on:

1. Copy the following files and directories, keeping the source-tree layout

hadoop-2.5.2-src\hadoop-common-project\hadoop-common\src\main\native
hadoop-2.5.2-src\hadoop-common-project\hadoop-common\src\CMakeLists.txt
hadoop-2.5.2-src\hadoop-common-project\hadoop-common\src\config.h.cmake
hadoop-2.5.2-src\hadoop-common-project\hadoop-common\src\JNIFlags.cmake
hadoop-2.5.2-src\hadoop-hdfs-project\hadoop-hdfs\src\main\native
hadoop-2.5.2-src\hadoop-hdfs-project\hadoop-hdfs\src\CMakeLists.txt (the relative path to the JNIFlags.cmake it depends on may need adjusting)
hadoop-2.5.2-src\hadoop-hdfs-project\hadoop-hdfs\src\config.h.cmake

2. Build libhadoop
2.1. Check and install the dependencies

# gcc, make, and a JDK are required; most people will already have these
# zlib is required
apt-get install zlib1g-dev
# cmake is required
apt-get install cmake

2.2. Generate the Makefile with cmake

cmake ./src/ -DGENERATED_JAVAH=~/Build/hadoop-2.5.2-src/build/hadoop-common-project/hadoop-common/native/javah -DJVM_ARCH_DATA_MODEL=64 -DREQUIRE_BZIP2=false -DREQUIRE_SNAPPY=false

2.3. Generate the header files with javah
Three jars are needed: hadoop-common, hadoop-annotations, and guava.

javah org.apache.hadoop.io.compress.lz4.Lz4Compressor
javah org.apache.hadoop.io.compress.lz4.Lz4Decompressor
javah org.apache.hadoop.io.compress.zlib.ZlibCompressor
javah org.apache.hadoop.io.compress.zlib.ZlibDecompressor
javah org.apache.hadoop.io.nativeio.NativeIO 
javah org.apache.hadoop.io.nativeio.SharedFileDescriptorFactory
javah org.apache.hadoop.net.unix.DomainSocket
javah org.apache.hadoop.net.unix.DomainSocketWatcher
javah org.apache.hadoop.security.JniBasedUnixGroupsMapping
javah org.apache.hadoop.security.JniBasedUnixNetgroupsMapping
javah org.apache.hadoop.util.NativeCrc32

Copy the generated header files into the corresponding C source directories.

2.4. Build

make

3. Build libhdfs
3.1. Generate the Makefile with cmake

cmake ./src/ -DGENERATED_JAVAH=~/Build/hadoop-2.5.2-src/build/hadoop-common-project/hadoop-common/native/javah -DJVM_ARCH_DATA_MODEL=64 -DREQUIRE_LIBWEBHDFS=false -DREQUIRE_FUSE=false

3.2. Build

make

4. Copy the build output to HADOOP_HOME/lib/mynative

5. Edit /etc/profile and add the following line

export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/mynative"

6. Reload the configuration

source /etc/profile
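
To double-check that the native library is actually picked up, you can run bin/hadoop checknative -a, or test it from Java with a minimal sketch like this (NativeCodeLoader is part of hadoop-common):

import org.apache.hadoop.util.NativeCodeLoader;

public class NativeCheck {
	public static void main(String[] args) {
		// prints true when libhadoop was found on java.library.path and loaded
		System.out.println("native hadoop loaded: " + NativeCodeLoader.isNativeCodeLoaded());
	}
}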

Done!

Notes on Building the Hadoop Windows Native Libraries

1. First, download the hadoop-2.5.2-src source code

Copy the directory hadoop-2.5.2-src\hadoop-common-project\hadoop-common\src\main\native
Copy the directory hadoop-2.5.2-src\hadoop-common-project\hadoop-common\src\main\winutils

2. Set the JAVA_HOME and PATH environment variables

3. Generate the javah header files
Unpack hadoop-common-2.5.1.jar, then run:

javah org.apache.hadoop.util.NativeCrc32
javah org.apache.hadoop.io.compress.lz4.Lz4Compressor
javah org.apache.hadoop.io.compress.lz4.Lz4Decompressor
javah org.apache.hadoop.io.nativeio.NativeIO
javah org.apache.hadoop.security.JniBasedUnixGroupsMapping

4. Open winutils.sln, change the output path to ../bin, and build

5. Open native.sln, change the output path to ../bin, fix the path of the winutils.lib reference, and build

6. Copy the exe and dll files to HADOOP_HOME/bin, and you are done

Common problems:
1. The build's target platform must match the bitness of your JVM (x86 vs x64); otherwise the dll cannot be loaded.
2. When something goes wrong, try running winutils.exe first; if it will not start, install the Visual C++ redistributable matching the VS version used for the build and it should work.
3. If you get the message "unable to load native-hadoop library for your platform", just point the JVM at the native library directory via java.library.path in its startup parameters.

If you are in a hurry, you can download the 2.5.2 native binaries from my GitHub: hadoop-windows-native

Setting Up a Hadoop Environment (Part 3)

1. Create directories

bin/hadoop fs -ls /
bin/hadoop fs -mkdir /usr
bin/hadoop fs -mkdir /usr/neohope
bin/hadoop fs -mkdir /usr/neohope/test

2. Copy a file from the local filesystem to HDFS

mkdir ~/test
echo hello hadoop >> ~/test/hello.txt
bin/hadoop fs -put ~/test/hello.txt /usr/neohope/test/

3. View the remote file

bin/hadoop fs -ls /usr/neohope/test
bin/hadoop fs -cat /usr/neohope/test/hello.txt

4. Copy the file from HDFS back to the local filesystem

bin/hadoop fs -get /usr/neohope/test/hello.txt ~/test/hello1.txt
cat ~/test/hello1.txt

5. Command syntax

hadoop@hadoop-master:~/hadoop-2.7.1$ bin/hadoop fs
Usage: hadoop fs [generic options]
	[-appendToFile <localsrc> ... <dst>]
	[-cat [-ignoreCrc] <src> ...]
	[-checksum <src> ...]
	[-chgrp [-R] GROUP PATH...]
	[-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
	[-chown [-R] [OWNER][:[GROUP]] PATH...]
	[-copyFromLocal [-f] [-p] [-l] <localsrc> ... <dst>]
	[-copyToLocal [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
	[-count [-q] [-h] <path> ...]
	[-cp [-f] [-p | -p[topax]] <src> ... <dst>]
	[-createSnapshot <snapshotDir> [<snapshotName>]]
	[-deleteSnapshot <snapshotDir> <snapshotName>]
	[-df [-h] [<path> ...]]
	[-du [-s] [-h] <path> ...]
	[-expunge]
	[-find <path> ... <expression> ...]
	[-get [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
	[-getfacl [-R] <path>]
	[-getfattr [-R] {-n name | -d} [-e en] <path>]
	[-getmerge [-nl] <src> <localdst>]
	[-help [cmd ...]]
	[-ls [-d] [-h] [-R] [<path> ...]]
	[-mkdir [-p] <path> ...]
	[-moveFromLocal <localsrc> ... <dst>]
	[-moveToLocal <src> <localdst>]
	[-mv <src> ... <dst>]
	[-put [-f] [-p] [-l] <localsrc> ... <dst>]
	[-renameSnapshot <snapshotDir> <oldName> <newName>]
	[-rm [-f] [-r|-R] [-skipTrash] <src> ...]
	[-rmdir [--ignore-fail-on-non-empty] <dir> ...]
	[-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
	[-setfattr {-n name [-v value] | -x name} <path>]
	[-setrep [-R] [-w] <rep> <path> ...]
	[-stat [format] <path> ...]
	[-tail [-f] <file>]
	[-test -[defsz] <path>]
	[-text [-ignoreCrc] <src> ...]
	[-touchz <path> ...]
	[-truncate [-w] <length> <path> ...]
	[-usage [cmd ...]]

Generic options supported are
-conf <configuration file>     specify an application configuration file
-D <property=value>            use value for given property
-fs <local|namenode:port>      specify a namenode
-jt <local|resourcemanager:port>    specify a ResourceManager
-files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.

The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]

Setting Up a Hadoop Environment (Part 2)

1. Unpack Hadoop

su hadoop
cd ~
tar -zxvf /home/neohope/Desktop/hadoop-2.7.1.tar.gz

2. Edit the configuration files under /home/hadoop/hadoop-2.7.1/etc/hadoop/
2.1. core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop-master:9000</value>
  </property>

  <property>
    <name>fs.default.name</name>
    <value>hdfs://hadoop-master:9000</value>
  </property>

  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/home/hadoop/hadoop-2.7.1/tmp</value>
  </property>

  <property>
    <name>io.file.buffer.size</name>
    <value>131702</value>
  </property>

</configuration>

2.2. hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/hadoop/hadoop-2.7.1/hdfs/name</value>
  </property>


  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/hadoop/hadoop-2.7.1/hdfs/data</value>
  </property>


  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>


  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop-master:9001</value>
  </property>


  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>

</configuration>

2.3. mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>

  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>hadoop-master:10020</value>
  </property>

  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>hadoop-master:19888</value>
  </property>

  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>2048</value>      
  </property>

  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>2048</value>      
  </property>

</configuration>

2.4. yarn-site.xml

<?xml version="1.0"?>

<!-- Site specific YARN configuration properties -->

<configuration>

  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>

  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>

  <property>
    <name>yarn.resourcemanager.address</name>
    <value>hadoop-master:8032</value>
  </property>

  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>hadoop-master:8030</value>
  </property>

  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>hadoop-master:8031</value>
  </property>

  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>hadoop-master:8033</value>
  </property>

  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>hadoop-master:8088</value>
  </property>

  <property>
    <name>yarn.resourcemanager.resource.memory-mb</name>
    <value>2048</value>
  </property>

  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>2048</value>
  </property>

</configuration>

2.5. slaves

#localhost
hadoop-slave01
hadoop-slave02

3. Set the Java path in the configuration files under /home/hadoop/hadoop-2.7.1/etc/hadoop/
3.1. hadoop-env.sh

# The java implementation to use.
#export JAVA_HOME=${JAVA_HOME}
export JAVA_HOME=/usr/java/jdk1.7.0_79

3.2. yarn-env.sh

# some Java parameters
# export JAVA_HOME=/home/y/libexec/jdk1.6.0/
if [ "$JAVA_HOME" != "" ]; then
  #echo "run java in $JAVA_HOME"
  #JAVA_HOME=$JAVA_HOME
  JAVA_HOME=/usr/java/jdk1.7.0_79
fi

4. Distribute the hadoop directory to each slave

scp -r /home/hadoop/hadoop-2.7.1 hadoop@hadoop-slave01:~/
scp -r /home/hadoop/hadoop-2.7.1 hadoop@hadoop-slave02:~/

5. Initialize the master node (format the NameNode)

cd ~/hadoop-2.7.1
bin/hdfs namenode -format

6. Start Hadoop

cd ~/hadoop-2.7.1
sbin/start-dfs.sh
sbin/start-yarn.sh

7. Check the Hadoop processes

/usr/java/jdk1.7.0_79/bin/jps

8. View the cluster information

http://10.10.10.3:8088

9. View the HDFS filesystem information

http://10.10.10.3:50070

10. Commonly used Hadoop ports

Port  Purpose
9000 fs.defaultFS
9001 dfs.namenode.rpc-address
50070 dfs.namenode.http-address
50470 dfs.namenode.https-address
50100 dfs.namenode.backup.address
50105 dfs.namenode.backup.http-address
50090 dfs.namenode.secondary.http-address
50091 dfs.namenode.secondary.https-address
50020 dfs.datanode.ipc.address
50075 dfs.datanode.http.address
50475 dfs.datanode.https.address
50010 dfs.datanode.address
8480 dfs.journalnode.rpc-address
8481 dfs.journalnode.https-address
8032 yarn.resourcemanager.address
8088 yarn.resourcemanager.webapp.address
8090 yarn.resourcemanager.webapp.https.address
8030 yarn.resourcemanager.scheduler.address
8031 yarn.resourcemanager.resource-tracker.address
8033 yarn.resourcemanager.admin.address
8042 yarn.nodemanager.webapp.address
8040 yarn.nodemanager.localizer.address
8188 yarn.timeline-service.webapp.address
10020 mapreduce.jobhistory.address
19888 mapreduce.jobhistory.webapp.address
2888 ZooKeeper: port the leader listens on for follower connections
3888 ZooKeeper: used for leader election
2181 ZooKeeper: port that listens for client connections
60010 hbase.master.info.port
60000 hbase.master.port
60030 hbase.regionserver.info.port
60020 hbase.regionserver.port
8080 hbase.rest.port
10000 hive.server2.thrift.port
9083 hive.metastore.uris