
Reading the Hadoop source: the shell startup flow

 

Open the bin/hadoop file and you will see that a config file is loaded first:

  either libexec/hadoop-config.sh or bin/hadoop-config.sh

The former is loaded if it exists; otherwise the latter is used.
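The selection pattern can be sketched like this. Note this is a paraphrase of the shape of the logic, not the verbatim bin/hadoop text, and the temp directory and file contents are fabricated purely so the sketch runs on its own:

```shell
# Demonstrates the "prefer libexec, fall back to bin" selection used by bin/hadoop.
# The temp dir and the fake hadoop-config.sh files are for illustration only.
tmp=$(mktemp -d)
mkdir -p "$tmp/libexec" "$tmp/bin"
echo 'loaded=libexec' > "$tmp/libexec/hadoop-config.sh"
echo 'loaded=bin'     > "$tmp/bin/hadoop-config.sh"

# The selection logic, shaped like the one in bin/hadoop:
if [ -e "$tmp/libexec/hadoop-config.sh" ]; then
  . "$tmp/libexec/hadoop-config.sh"
else
  . "$tmp/bin/hadoop-config.sh"
fi

echo "config loaded from: $loaded"
rm -rf "$tmp"
```

When both files exist, libexec wins; the bin copy is only a fallback.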

 

At the end of that config script you will see that HADOOP_HOME is the same as HADOOP_PREFIX:

export HADOOP_HOME=${HADOOP_PREFIX}

 

OK, now let's take a glance at the shell startup flow of distributed mode:

  namenode format -> start-dfs -> start-mapred
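The three steps above map to these commands (hadoop-1.x script names, run from $HADOOP_HOME; shown for orientation only, since they need a configured cluster to actually run):

```shell
# Distributed-mode startup flow, step by step (hadoop-1.x):
bin/hadoop namenode -format   # step 1: format the name directory
bin/start-dfs.sh              # step 2: start HDFS (namenode, datanodes, secondary namenode)
bin/start-mapred.sh           # step 3: start MapReduce (jobtracker, tasktrackers)
```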

 

step 1 - namenode format

 The corresponding command is "hadoop namenode -format", and the related entry class is:

 org.apache.hadoop.hdfs.server.namenode.NameNode

 

Well, what is the NameNode (NN) responsible for? Description copied from the code:

 * NameNode serves as both directory namespace manager and
 * "inode table" for the Hadoop DFS.  There is a single NameNode
 * running in any DFS deployment.  (Well, except when there
 * is a second backup/failover NameNode.)
 *
 * The NameNode controls two critical tables:
 *   1)  filename->blocksequence (namespace)
 *   2)  block->machinelist ("inodes")
 *
 * The first table is stored on disk and is very precious.
 * The second table is rebuilt every time the NameNode comes
 * up.
 *
 * 'NameNode' refers to both this class as well as the 'NameNode server'.
 * The 'FSNamesystem' class actually performs most of the filesystem
 * management.  The majority of the 'NameNode' class itself is concerned
 * with exposing the IPC interface and the http server to the outside world,
 * plus some configuration management.
 *
 * NameNode implements the ClientProtocol interface, which allows
 * clients to ask for DFS services.  ClientProtocol is not
 * designed for direct use by authors of DFS client code.  End-users
 * should instead use the org.apache.nutch.hadoop.fs.FileSystem class.
 *
 * NameNode also implements the DatanodeProtocol interface, used by
 * DataNode programs that actually store DFS data blocks.  These
 * methods are invoked repeatedly and automatically by all the
 * DataNodes in a DFS deployment.
 *
 * NameNode also implements the NamenodeProtocol interface, used by
 * secondary namenodes or rebalancing processes to get partial namenode's
 * state, for example partial blocksMap etc.
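The two critical tables from the comment above can be modeled as a pair of lookups: the namespace (persisted in fsimage/edits) maps a filename to its block sequence, and the block map (rebuilt from datanode block reports at startup) maps a block to the machines holding it. A toy sketch, with all filenames, block ids, and datanode names made up:

```shell
# Toy model of the NameNode's two tables (illustration only; the real code
# lives in FSNamesystem and friends, in Java).

blocks_of() {            # table 1: filename -> block sequence (stored on disk)
  case "$1" in
    /user/a.txt) echo "blk_1 blk_2" ;;
  esac
}

machines_of() {          # table 2: block -> machine list (rebuilt at startup)
  case "$1" in
    blk_1) echo "dn1 dn2 dn3" ;;
    blk_2) echo "dn2 dn3 dn4" ;;
  esac
}

# Resolving a read goes filename -> blocks -> machines:
for b in $(blocks_of /user/a.txt); do
  echo "$b -> $(machines_of "$b")"
done
```

This also shows why only table 1 is "very precious": table 2 can always be reconstructed from what the datanodes report.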

 

 

The files produced by formatting are listed here:

hadoop@leibnitz-laptop:/cc$ ll data/hadoop/hadoop-1.0.1/cluster-hadoop/mapred/local/


hadoop@leibnitz-laptop:/cc$ ll data/hadoop/hadoop-1.0.1/cluster-hadoop/dfs/name/current/
-rw-r--r-- 1 hadoop hadoop    4 2012-05-01 15:41 edits
-rw-r--r-- 1 hadoop hadoop 2474 2012-05-01 15:41 fsimage
-rw-r--r-- 1 hadoop hadoop    8 2012-05-01 15:41 fstime
-rw-r--r-- 1 hadoop hadoop  100 2012-05-01 15:41 VERSION

 

 

hadoop@leibnitz-laptop:/cc$ ll data/hadoop/hadoop-1.0.1/cluster-hadoop/dfs/name/image/
-rw-r--r-- 1 hadoop hadoop  157 2012-05-01 15:41 fsimage

 

 

OK, let's see what each of these files keeps.

edits: FSEditLog maintains a log of the namespace modifications (similar to transaction logs).

(these files are managed by the FSImage class, described below)

fsimage: FSImage handles checkpointing and logging of the namespace edits.

fstime: keeps the last checkpoint time.

VERSION: File VERSION contains the following fields:

  1. node type
  2. layout version
  3. namespaceID
  4. fs state creation time
  5. other fields specific for this node type

   The version file is always written last during storage directory updates. The existence of the version file indicates that all other files have been successfully written in the storage directory, so the storage is valid and does not need to be recovered.
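The VERSION file itself is a plain Java-properties-style text file. A quick way to pull a field out of it from the shell, using a sample modeled on a hadoop-1.x name dir (all values below are fabricated for illustration):

```shell
# Fabricated VERSION file in the hadoop-1.x name-dir format:
cat > VERSION.sample <<'EOF'
#Tue May 01 15:41:38 CST 2012
namespaceID=1998898907
cTime=0
storageType=NAME_NODE
layoutVersion=-32
EOF

# Extract a single field, e.g. the namespaceID:
ns=$(grep '^namespaceID=' VERSION.sample | cut -d= -f2)
echo "namespaceID is $ns"
rm VERSION.sample
```

The namespaceID is the one worth knowing: datanodes record it too, and a mismatch (e.g. after re-formatting the namenode without wiping the data dirs) keeps them from joining the cluster.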

 

A directory named 'previous.checkpoint' will appear under certain conditions; the code comment describes it:

     * previous.checkpoint is a directory, which holds the previous
     * (before the last save) state of the storage directory.
     * The directory is created as a reference only; it does not play a role
     * in state recovery procedures, and is recycled automatically,
     * but it may be useful for manual recovery of a stale state of the system.

Its content looks like this:
hadoop@leibnitz-laptop:/cc$ ll data/hadoop/hadoop-1.0.1/cluster-hadoop/dfs/name/previous.checkpoint/
-rw-r--r-- 1 hadoop hadoop  293 2012-04-25 02:26 edits
-rw-r--r-- 1 hadoop hadoop 2934 2012-04-25 02:26 fsimage
-rw-r--r-- 1 hadoop hadoop    8 2012-04-25 02:26 fstime
-rw-r--r-- 1 hadoop hadoop  100 2012-04-25 02:26 VERSION

 

 

Also, I found an important class named "Lease", described as follows:

   * A Lease governs all the locks held by a single client.
   * For each client there's a corresponding lease, whose
   * timestamp is updated when the client periodically
   * checks in.  If the client dies and allows its lease to
   * expire, all the corresponding locks can be released.
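The expiry rule reduces to a timestamp comparison. A toy sketch (the real logic lives in the LeaseManager class, in Java; the limit value and timings here are hypothetical):

```shell
# Toy sketch of the lease-expiry check (illustration only).
LEASE_LIMIT=3600                     # hypothetical: seconds a client may go without checking in
now=$(date +%s)
last_checkin=$((now - 7200))         # pretend the client last renewed 2 hours ago

if [ $((now - last_checkin)) -gt "$LEASE_LIMIT" ]; then
  expired=yes                        # locks held by this client may now be released
else
  expired=no
fi
echo "lease expired: $expired"
```

So a client that keeps checking in keeps its locks indefinitely, while a dead client's locks are reclaimed automatically once the limit passes.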

 

 

 
