Monthly Archive: February 2015

[Translation] Deploying MapReduce v2 (YARN) on a Cluster

Original article: http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_ig_yarn_cluster_deploy.html#topic_11_4

Environment: three machines, hadoop01 through hadoop03; hadoop01 hosts the ResourceManager (RM) and the JobHistory Server.

I. Edit mapred-site.xml

Add the following between the <configuration> tags to declare that YARN, rather than the MapReduce v1 framework, is in use:

  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>

II. Required parameters in yarn-site.xml

1. Add the following so that the ResourceManager is configured with the correct host. The aggregated-log location must have a matching directory created in HDFS beforehand (see step 2 below), or jobs will hang without reporting any error.

<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop01</value>
  </property>

  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>

  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>

  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>

  <property>
    <description>List of directories to store localized files in.</description>
    <name>yarn.nodemanager.local-dirs</name>
    <value>file:///var/lib/hadoop-yarn/cache/${user.name}/nm-local-dir</value>
  </property>

  <property>
    <description>Where to store container logs.</description>
    <name>yarn.nodemanager.log-dirs</name>
    <value>file:///var/log/hadoop-yarn/containers</value>
  </property>

  <property>
    <description>Where to aggregate logs to.</description>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>hdfs://hadoop01:8020/var/log/hadoop-yarn/apps</value>
  </property>

  <property>
    <description>Classpath for typical applications.</description>
    <name>yarn.application.classpath</name>
    <value>
      $HADOOP_CONF_DIR,
      $HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,
      $HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,
      $HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,
      $HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*
    </value>
  </property>
</configuration>

2. Create the corresponding directory in HDFS:

sudo -u hdfs hadoop fs -mkdir -p /var/log/hadoop-yarn
sudo -u hdfs hadoop fs -chown yarn:mapred /var/log/hadoop-yarn

III. Configure the JobHistory Server

If YARN is used on the cluster in place of MRv1, the MapReduce JobHistory Server must also be running.

1. Configure the following parameters in mapred-site.xml:

Property                               Recommended value                  Description
mapreduce.jobhistory.address           historyserver.company.com:10020    The address of the JobHistory Server (host:port)
mapreduce.jobhistory.webapp.address    historyserver.company.com:19888    The address of the JobHistory Server web application (host:port)
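As XML, these entries could look like the following in mapred-site.xml (substituting hadoop01, this cluster's JobHistory Server host, for the placeholder hostname in the table):

```xml
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>hadoop01:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>hadoop01:19888</value>
  </property>
```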


2. Configure the following parameters in core-site.xml so that proxying is enabled for the mapred user:

Property                        Recommended value   Description
hadoop.proxyuser.mapred.groups  *                   Allows the mapred user to move files belonging to users in these groups
hadoop.proxyuser.mapred.hosts   *                   Allows the mapred user to move files belonging to users on these hosts
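In core-site.xml these two entries would look like:

```xml
  <property>
    <name>hadoop.proxyuser.mapred.groups</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.mapred.hosts</name>
    <value>*</value>
  </property>
```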

Reference on the JobHistory Server: http://dongxicheng.org/mapreduce-nextgen/hadoop-2-0-jobhistory-log/

3. Create the corresponding directory in HDFS:

sudo -u hdfs hadoop fs -mkdir -p /user/history
sudo -u hdfs hadoop fs -chmod -R 1777 /user/history
sudo -u hdfs hadoop fs -chown mapred:hadoop /user/history

IV. Configure the staging directory

1. Add the following to mapred-site.xml:

  <property>
    <name>yarn.app.mapreduce.am.staging-dir</name>
    <value>/user</value>
  </property>

2. Create the directory in HDFS (these are the same commands as in the JobHistory Server section, since the history directories live under the staging directory; skip them if already run):

sudo -u hdfs hadoop fs -mkdir -p /user/history
sudo -u hdfs hadoop fs -chmod -R 1777 /user/history
sudo -u hdfs hadoop fs -chown mapred:hadoop /user/history

V. Deploy the configuration files to the other two nodes

scp core-site.xml mapred-site.xml yarn-site.xml root@hadoop02:/etc/hadoop/conf/
scp core-site.xml mapred-site.xml yarn-site.xml root@hadoop03:/etc/hadoop/conf/

VI. Install the start/stop scripts

1. On hadoop01, install the packages providing the ResourceManager and JobHistory Server start/stop scripts:

yum install hadoop-yarn-resourcemanager.x86_64
yum install hadoop-mapreduce-historyserver.x86_64

2. On hadoop01 through hadoop03, install the package providing the NodeManager start/stop script:

yum install hadoop-yarn-nodemanager.x86_64

VII. Start the services

1. Start the ResourceManager on hadoop01:

[root@hadoop01 conf]# service hadoop-yarn-resourcemanager start
starting resourcemanager, logging to /var/log/hadoop-yarn/yarn-yarn-resourcemanager-hadoop01.out
Started Hadoop resourcemanager:[  OK  ]

2. Start the NodeManager on hadoop01 through hadoop03:

sudo service hadoop-yarn-nodemanager start

3. Start the JobHistory Server on hadoop01:

service hadoop-mapreduce-historyserver start

VIII. Test YARN

1. Create a test directory and file:

[root@hadoop01 hadoop-mapreduce]# su - hdfs
-sh-4.1$ hadoop fs -mkdir /wordcount
-sh-4.1$ cd /tmp/
-sh-4.1$ mkdir wordcount
-sh-4.1$ cd wordcount/
-sh-4.1$ ls
-sh-4.1$ echo "hello world, good bye world" > file01.txt
-sh-4.1$ ls
file01.txt
-sh-4.1$ cat file01.txt 
hello world, good bye world
-sh-4.1$ hadoop fs -mkdir /wordcount/input
-sh-4.1$ hadoop fs -copyFromLocal file01.txt /wordcount/input
-sh-4.1$ hadoop fs -ls /wordcount/input
Found 1 items
-rw-r--r--   3 hdfs supergroup         28 2015-02-27 10:30 /wordcount/input/file01.txt

2. Run a test job on YARN:

-sh-4.1$ yarn jar
RunJar jarFile [mainClass] args...
-sh-4.1$ yarn jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.5.0-cdh5.3.1.jar wordcount /wordcount/input /wordcount/output
15/02/27 10:36:02 INFO client.RMProxy: Connecting to ResourceManager at hadoop01/10.62.228.211:8032
15/02/27 10:36:03 INFO input.FileInputFormat: Total input paths to process : 1
15/02/27 10:36:03 INFO mapreduce.JobSubmitter: number of splits:1
15/02/27 10:36:03 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1424968955716_0001
15/02/27 10:36:04 INFO impl.YarnClientImpl: Submitted application application_1424968955716_0001
15/02/27 10:36:04 INFO mapreduce.Job: The url to track the job: http://hadoop01:8088/proxy/application_1424968955716_0001/
15/02/27 10:36:04 INFO mapreduce.Job: Running job: job_1424968955716_0001

3. Check the results:

[root@hadoop01 conf]# hadoop fs -ls /wordcount/output
Found 2 items
-rw-r--r--   3 hdfs supergroup          0 2015-02-27 17:19 /wordcount/output/_SUCCESS
-rw-r--r--   3 hdfs supergroup         29 2015-02-27 17:19 /wordcount/output/part-r-00000
[root@hadoop01 conf]# hadoop fs -cat /wordcount/output/part-r-00000
bye     1
good    1
hello   1
world   2
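As a local sanity check of the numbers above, the same counts can be reproduced with standard shell tools (an illustrative approximation of the wordcount job, splitting on spaces and commas):

```shell
# Split the test sentence into one word per line (treating the comma as a
# separator, as the job output above suggests) and count each word.
echo "hello world, good bye world" \
  | tr -s ' ,' '\n' \
  | sort | uniq -c
```

This prints a count of 1 for bye, good, and hello, and 2 for world, matching part-r-00000.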

IX. Troubleshooting

1. Job hangs at map 0% reduce 0%:

hdfs namenode -format

then delete each DataNode's local storage directory. (Note: this reformats HDFS and destroys all existing data.)

2. Job hangs at map 100% reduce 0%:

Edit /etc/hosts and change the loopback line from:

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4

by removing the extra domain aliases, to just:

127.0.0.1   localhost
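A minimal sketch of the same edit with sed, run against a copy of the hosts file (the path /tmp/hosts.test is just for illustration; point HOSTS at /etc/hosts only once you have verified the result):

```shell
# Work on a throwaway copy of the hosts file first.
HOSTS=/tmp/hosts.test
printf '127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4\n' > "$HOSTS"

# Trim the 127.0.0.1 line down to just "localhost".
sed -i 's/^127\.0\.0\.1.*/127.0.0.1   localhost/' "$HOSTS"

cat "$HOSTS"
```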


Linux: Viewing Logged-in Users and Sending Them Messages

View the currently logged-in users:

w

[root@banana ~]# w
 16:32:14 up 31 days,  1:43,  2 users,  load average: 0.00, 0.01, 0.00
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU WHAT
root     tty1     :0               04Jan15 31days 22.68s 22.68s /usr/bin/Xorg :
root     pts/0    192.168.1.1      16:31    0.00s  0.04s  0.01s w

Amusingly, you can also send messages to other logged-in users.

Machine 1

[root@banana ~]# w
 16:55:07 up 31 days,  2:06,  3 users,  load average: 0.00, 0.00, 0.00
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU WHAT
root     tty1     :0               04Jan15 31days 22.69s 22.69s /usr/bin/Xorg :
root     pts/0    IP2      16:31    6.00s  0.04s  0.04s -bash
root     pts/1    IP1    16:53    0.00s  0.04s  0.01s w
[root@banana ~]# write root pts/0
hello world
# press Ctrl+C to stop input

Machine 2

[root@banana ~]# 
Message from root@banana.ctbj.com on pts/1 at 16:58 ...
hello world
EOF