I. Installation modes
As with hadoop 1.x, there are several modes in which hadoop can be installed:
1. standalone
Just run it on one machine; everything, including MapReduce, runs locally in a single JVM.
2. pseudo-distributed
Set it up with HDFS on a single node; this case comes in two flavors:
a. HDFS only
In this case MapReduce still runs in local mode; you can see the job names take the form job_localxxxxxx.
b. HDFS with YARN
This behaves the same as the distributed mode.
3. distributed/cluster mode
Compared to item 2, this mode only adds a few more configuration properties and more than one node.
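The switch between these modes is essentially made by fs.defaultFS in core-site.xml. A minimal sketch (host1:9000 matches the example value used in section II below; with the default file:/// everything stays local):

```xml
<!-- core-site.xml: file:/// (the default) keeps the file system local;
     an hdfs:// URI turns HDFS on (pseudo-distributed or cluster mode). -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://host1:9000</value>
  </property>
</configuration>
```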
II. Configuration for cluster mode

| file | property | value | default val | summary |
| --- | --- | --- | --- | --- |
| core-site.xml | hadoop.tmp.dir | /usr/local/hadoop/data-2.5.1/tmp | /tmp/hadoop-${user.name} | Path to a temporary directory. Sub-directories such as filecache, usercache and nmPrivate are created under it, so it should not be left pointing at /tmp in a production environment. |
| core-site.xml | fs.defaultFS | hdfs://host1:9000 | file:/// | The name of the default file system; this determines the installation mode (the deprecated equivalent is fs.default.name). A URI whose scheme and authority determine the FileSystem implementation: the scheme selects the config property (fs.SCHEME.impl) naming the FileSystem implementation class, and the authority determines the host, port, etc. |
| hdfs-site.xml | dfs.nameservices | hadoop-cluster1 | | Comma-separated list of nameservices. Here it names a single NN only, not an HA pair. |
| hdfs-site.xml | dfs.namenode.secondary.http-address | host1:50090 | 0.0.0.0:50090 | The secondary namenode http server address and port. |
| hdfs-site.xml | dfs.namenode.name.dir | file:///usr/local/hadoop/data-2.5.1/dfs/name | file://${hadoop.tmp.dir}/dfs/name | Determines where on the local filesystem the DFS name node should store the name table (fsimage). If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy. |
| hdfs-site.xml | dfs.datanode.data.dir | file:///usr/local/hadoop/data-2.5.1/dfs/data | file://${hadoop.tmp.dir}/dfs/data | Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored. |
| hdfs-site.xml | dfs.replication | 1 | 3 | The replication factor assigned to data blocks. |
| hdfs-site.xml | dfs.webhdfs.enabled | true | true | Enable WebHDFS (REST API) in Namenodes and Datanodes. |
| yarn-site.xml | yarn.nodemanager.aux-services | mapreduce_shuffle | | The auxiliary service name; a valid service name may only contain a-zA-Z0-9_ and cannot start with a number. |
| yarn-site.xml | yarn.resourcemanager.address | host1:8032 | ${yarn.resourcemanager.hostname}:8032 | The address of the applications manager interface in the RM. |
| yarn-site.xml | yarn.resourcemanager.scheduler.address | host1:8030 | ${yarn.resourcemanager.hostname}:8030 | The scheduler address of the RM. |
| yarn-site.xml | yarn.resourcemanager.resource-tracker.address | host1:8031 | ${yarn.resourcemanager.hostname}:8031 | |
| yarn-site.xml | yarn.resourcemanager.admin.address | host1:8033 | ${yarn.resourcemanager.hostname}:8033 | The admin address of the RM. |
| yarn-site.xml | yarn.resourcemanager.webapp.address | host1:50030 | ${yarn.resourcemanager.hostname}:8088 | The web UI address of the RM; here it is set to the job tracker address used in hadoop 1.x. |
| mapred-site.xml | mapreduce.framework.name | yarn | local | The runtime framework for executing MapReduce jobs. Can be one of local, classic or yarn. |
| mapred-site.xml | mapreduce.jobhistory.address | host1:10020 | 0.0.0.0:10020 | MapReduce JobHistory Server IPC host:port. |
| mapred-site.xml | mapreduce.jobhistory.webapp.address | host1:19888 | 0.0.0.0:19888 | MapReduce JobHistory Server Web UI host:port. |
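The two properties that actually move MapReduce onto YARN can be sketched as a minimal pair of config fragments (values taken from the table above; everything else can stay at its default for a first test):

```xml
<!-- mapred-site.xml: run MR jobs on YARN instead of the local runner -->
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

<!-- yarn-site.xml: the shuffle auxiliary service the NodeManagers must host -->
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
```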
III. Results of running MR on YARN
Below are the logs from a MapReduce job run in pseudo-distributed mode:
hadoop@ubuntu:/usr/local/hadoop/hadoop-2.5.1$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar wordcount wc wc-out
14/11/05 18:19:23 INFO client.RMProxy: Connecting to ResourceManager at namenode/192.168.1.25:8032
14/11/05 18:19:24 INFO input.FileInputFormat: Total input paths to process : 22
14/11/05 18:19:24 INFO mapreduce.JobSubmitter: number of splits:22
14/11/05 18:19:24 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1415182439385_0001
14/11/05 18:19:25 INFO impl.YarnClientImpl: Submitted application application_1415182439385_0001
14/11/05 18:19:25 INFO mapreduce.Job: The url to track the job: http://namenode:50030/proxy/application_1415182439385_0001/
14/11/05 18:19:25 INFO mapreduce.Job: Running job: job_1415182439385_0001
14/11/05 18:19:32 INFO mapreduce.Job: Job job_1415182439385_0001 running in uber mode : false
14/11/05 18:19:32 INFO mapreduce.Job: map 0% reduce 0%
14/11/05 18:19:44 INFO mapreduce.Job: map 9% reduce 0%
14/11/05 18:19:45 INFO mapreduce.Job: map 27% reduce 0%
14/11/05 18:19:54 INFO mapreduce.Job: map 32% reduce 0%
14/11/05 18:19:55 INFO mapreduce.Job: map 45% reduce 0%
14/11/05 18:19:56 INFO mapreduce.Job: map 50% reduce 0%
14/11/05 18:20:02 INFO mapreduce.Job: map 55% reduce 17%
14/11/05 18:20:03 INFO mapreduce.Job: map 59% reduce 17%
14/11/05 18:20:05 INFO mapreduce.Job: map 68% reduce 20%
14/11/05 18:20:06 INFO mapreduce.Job: map 73% reduce 20%
14/11/05 18:20:08 INFO mapreduce.Job: map 73% reduce 24%
14/11/05 18:20:11 INFO mapreduce.Job: map 77% reduce 24%
14/11/05 18:20:12 INFO mapreduce.Job: map 82% reduce 24%
14/11/05 18:20:13 INFO mapreduce.Job: map 91% reduce 24%
14/11/05 18:20:14 INFO mapreduce.Job: map 95% reduce 30%
14/11/05 18:20:16 INFO mapreduce.Job: map 100% reduce 30%
14/11/05 18:20:17 INFO mapreduce.Job: map 100% reduce 100%
14/11/05 18:20:18 INFO mapreduce.Job: Job job_1415182439385_0001 completed successfully
14/11/05 18:20:18 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=54637
FILE: Number of bytes written=2338563
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=59677
HDFS: Number of bytes written=28233
HDFS: Number of read operations=69
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=22
Launched reduce tasks=1
Data-local map tasks=22
Total time spent by all maps in occupied slots (ms)=185554
Total time spent by all reduces in occupied slots (ms)=30206
Total time spent by all map tasks (ms)=185554
Total time spent by all reduce tasks (ms)=30206
Total vcore-seconds taken by all map tasks=185554
Total vcore-seconds taken by all reduce tasks=30206
Total megabyte-seconds taken by all map tasks=190007296
Total megabyte-seconds taken by all reduce tasks=30930944
Map-Reduce Framework
Map input records=1504
Map output records=5727
Map output bytes=77326
Map output materialized bytes=54763
Input split bytes=2498
Combine input records=5727
Combine output records=2838
Reduce input groups=1224
Reduce shuffle bytes=54763
Reduce input records=2838
Reduce output records=1224
Spilled Records=5676
Shuffled Maps =22
Failed Shuffles=0
Merged Map outputs=22
GC time elapsed (ms)=1707
CPU time spent (ms)=14500
Physical memory (bytes) snapshot=5178937344
Virtual memory (bytes) snapshot=22517506048
Total committed heap usage (bytes)=3882549248
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=57179
File Output Format Counters
Bytes Written=28233
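The job id above, job_1415182439385_0001, has the YARN-style form job_&lt;clusterTimestamp&gt;_&lt;sequence&gt;, as opposed to the job_localxxxxxx ids produced by the local runner mentioned in section I. A small hypothetical helper (not part of hadoop, just an illustration of the naming pattern) to tell the two apart:

```python
import re

def yarn_job(job_id: str) -> bool:
    """True for a YARN/classic job id (job_<clusterTimestamp>_<sequence>),
    False for a LocalJobRunner id (job_localXXXXXXXXXX_YYYY)."""
    return re.fullmatch(r"job_\d+_\d+", job_id) is not None

print(yarn_job("job_1415182439385_0001"))    # True  -> ran on YARN
print(yarn_job("job_local1402021382_0001"))  # False -> LocalJobRunner
```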
FAQs
1. 2014-01-22 09:38:20,733 INFO [AsyncDispatcher event handler] rmapp.RMAppImpl (RMAppImpl.java:transition(788)) - Application application_1390354688375_0001 failed 2 times due to AM Container for appattempt_1390354688375_0001_000002 exited with exitCode: 127 due to: Exception from container-launch:
This can occur if you have not set JAVA_HOME in yarn-env.sh and hadoop-env.sh; remember to restart yarn after setting it :)
2. Two jobs appear when running the 'grep' example.
This is normal! At first I thought something was wrong, but when I ran wordcount again the result showed a single job, so it is simply the nature of the grep example (it chains a search job and a sort job).
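The JAVA_HOME fix from FAQ 1 can be sketched as follows (the JDK path is only an example; use whatever your machine has):

```shell
# append to both hadoop-env.sh and yarn-env.sh (path is an example)
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64

# then restart yarn so the daemons pick it up
stop-yarn.sh
start-yarn.sh
```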