Hadoop Study Notes (1): Installing Hadoop (Pseudo-Distributed)

2018-01-31 10:58:35 Source: oschina Author: zlikun


Pseudo-distributed installation of Hadoop. Reference: http://hadoop.apache.org/docs/r2.7.5/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation


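Configuration files
# All of Hadoop's configuration files are listed here; both pseudo-distributed and fully distributed modes are set up through these files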
# $HADOOP_HOME/etc/hadoop/
├── capacity-scheduler.xml
├── configuration.xsl
├── container-executor.cfg
├── core-site.xml
├── hadoop-env.cmd
├── hadoop-env.sh
├── hadoop-metrics2.properties
├── hadoop-metrics.properties
├── hadoop-policy.xml
├── hdfs-site.xml
├── httpfs-env.sh
├── httpfs-log4j.properties
├── httpfs-signature.secret
├── httpfs-site.xml
├── kms-acls.xml
├── kms-env.sh
├── kms-log4j.properties
├── kms-site.xml
├── log4j.properties
├── mapred-env.cmd
├── mapred-env.sh
├── mapred-queues.xml.template
├── mapred-site.xml.template
├── slaves
├── ssl-client.xml.example
├── ssl-server.xml.example
├── yarn-env.sh
└── yarn-site.xml
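Configuring HDFS
# etc/hadoop/core-site.xml
# Note: dfs.permissions is the deprecated name of dfs.permissions.enabled
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://v108.zlikun.com:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/var/hadoop/tmp</value>
    </property>
    <property>
        <name>dfs.permissions</name>
        <value>false</value>
    </property>
</configuration>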
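# etc/hadoop/hdfs-site.xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address</name>
        <value>v108.zlikun.com:9000</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-bind-host</name>
        <value>0.0.0.0</value>
    </property>
</configuration>
# Full list of the configuration options used above:
# http://hadoop.apache.org/docs/r2.7.5/hadoop-project-dist/hadoop-common/core-default.xml
# http://hadoop.apache.org/docs/r2.7.5/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml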
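To double-check what was actually written to a site file, the property names and values can be pulled back out with a few lines of shell. This is a minimal sketch and not part of the original setup: `list_props` is a hypothetical helper, and it assumes each `<name>` and `<value>` element sits on its own line (the usual layout), so it is not a general XML parser.

```shell
# list_props FILE: print "name=value" for every property in a Hadoop *-site.xml.
# Hypothetical helper; assumes one <name> or <value> element per line.
list_props() {
  awk '
    /<name>/  { gsub(/.*<name>|<\/name>.*/, "");   name = $0; next }
    /<value>/ { gsub(/.*<value>|<\/value>.*/, ""); print name "=" $0 }
  ' "$1"
}

# Usage (from $HADOOP_HOME):
#   list_props etc/hadoop/core-site.xml
#   list_props etc/hadoop/hdfs-site.xml
```

Hadoop can also report the value it resolved for a single key, for example `bin/hdfs getconf -confKey fs.defaultFS`.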
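# The start/stop scripts (sbin/start-dfs.sh and the like) log in to each node over SSH to start and stop its daemons, so SSH between the nodes must work without a password prompt
# Set up passwordless SSH login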
$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys
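Key-based login can still fall back to a password prompt when the home directory, `~/.ssh`, or `authorized_keys` is writable by group or others, since OpenSSH's default `StrictModes` setting rejects such files. The sketch below checks the mode bits; `no_group_other_write` is a hypothetical helper name, and it relies only on the mode string printed by `ls -ld`.

```shell
# no_group_other_write PATH: succeed if PATH exists and neither group nor
# others have write permission on it (hypothetical helper).
no_group_other_write() {
  # ls -ld prints e.g. "-rw-------"; characters 6 and 9 are the group and
  # other write bits.
  case "$(ls -ld "$1" 2>/dev/null | cut -c1-10)" in
    ?????-??-?) return 0 ;;
    *)          return 1 ;;
  esac
}

# Usage:
#   for p in "$HOME" "$HOME/.ssh" "$HOME/.ssh/authorized_keys"; do
#     no_group_other_write "$p" || echo "too permissive (or missing): $p"
#   done
#   ssh -o BatchMode=yes localhost true && echo "passwordless SSH works"
```

`BatchMode=yes` makes `ssh` fail instead of prompting, so the last command may also fail on a first connection until the host key has been accepted once.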
# Format the NameNode. The format succeeded if the log contains the line `Storage directory /var/hadoop/tmp/dfs/name has been successfully formatted.`
$ bin/hdfs namenode -format
18/01/30 08:50:38 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = v108.zlikun.com/192.168.1.108
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 2.7.5
STARTUP_MSG: classpath = /opt/hadoop/etc/hadoop:/opt/hadoop/share/hadoop/common/lib/commons-compress-1.4.1.jar:/opt/hadoop/share/hadoop/common/lib/commons-cli-1.2.jar:/opt/hadoop/share/hadoop/common/lib/jettison-1.1.jar:/opt/hadoop/share/hadoop/common/lib/curator-framework-2.7.1.jar:/opt/hadoop/share/hadoop/common/lib/java-xmlbuilder-0.4.jar:/opt/hadoop/share/hadoop/common/lib/slf4j-api-1.7.10.jar:/opt/hadoop/share/hadoop/common/lib/commons-digester-1.8.jar:/opt/hadoop/share/hadoop/common/lib/httpclient-4.2.5.jar:/opt/hadoop/share/hadoop/common/lib/api-asn1-api-1.0.0-M20.jar:/opt/hadoop/share/hadoop/common/lib/protobuf-java-2.5.0.jar:/opt/hadoop/share/hadoop/common/lib/hadoop-auth-2.7.5.jar:/opt/hadoop/share/hadoop/common/lib/jersey-server-1.9.jar:/opt/hadoop/share/hadoop/common/lib/mockito-all-1.8.5.jar:/opt/hadoop/share/hadoop/common/lib/commons-httpclient-3.1.jar:/opt/hadoop/share/hadoop/common/lib/jersey-core-1.9.jar:/opt/hadoop/share/hadoop/common/lib/xmlenc-0.52.jar:/opt/hadoop/share/hadoop/common/lib/jackson-mapper-asl-1.9.13.jar:/opt/hadoop/share/hadoop/common/lib/jersey-json-1.9.jar:/opt/hadoop/share/hadoop/common/lib/curator-client-2.7.1.jar:/opt/hadoop/share/hadoop/common/lib/avro-1.7.4.jar:/opt/hadoop/share/hadoop/common/lib/commons-net-3.1.jar:/opt/hadoop/share/hadoop/common/lib/jackson-xc-1.9.13.jar:/opt/hadoop/share/hadoop/common/lib/log4j-1.2.17.jar:/opt/hadoop/share/hadoop/common/lib/gson-2.2.4.jar:/opt/hadoop/share/hadoop/common/lib/hamcrest-core-1.3.jar:/opt/hadoop/share/hadoop/common/lib/commons-io-2.4.jar:/opt/hadoop/share/hadoop/common/lib/commons-configuration-1.6.jar:/opt/hadoop/share/hadoop/common/lib/activation-1.1.jar:/opt/hadoop/share/hadoop/common/lib/api-util-1.0.0-M20.jar:/opt/hadoop/share/hadoop/common/lib/jets3t-0.9.0.jar:/opt/hadoop/share/hadoop/common/lib/apacheds-i18n-2.0.0-M15.jar:/opt/hadoop/share/hadoop/common/lib/hadoop-annotations-2.7.5.jar:/opt/hadoop/share/hadoop/common/lib/jetty-util-6.1.26.jar:/opt/hadoop/share/hadoop/com
mon/lib/commons-collections-3.2.2.jar:/opt/hadoop/share/hadoop/common/lib/zookeeper-3.4.6.jar:/opt/hadoop/share/hadoop/common/lib/jackson-core-asl-1.9.13.jar:/opt/hadoop/share/hadoop/common/lib/commons-beanutils-core-1.8.0.jar:/opt/hadoop/share/hadoop/common/lib/jsch-0.1.54.jar:/opt/hadoop/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar:/opt/hadoop/share/hadoop/common/lib/commons-math3-3.1.1.jar:/opt/hadoop/share/hadoop/common/lib/servlet-api-2.5.jar:/opt/hadoop/share/hadoop/common/lib/commons-logging-1.1.3.jar:/opt/hadoop/share/hadoop/common/lib/jsr305-3.0.0.jar:/opt/hadoop/share/hadoop/common/lib/commons-beanutils-1.7.0.jar:/opt/hadoop/share/hadoop/common/lib/xz-1.0.jar:/opt/hadoop/share/hadoop/common/lib/jaxb-api-2.2.2.jar:/opt/hadoop/share/hadoop/common/lib/jetty-sslengine-6.1.26.jar:/opt/hadoop/share/hadoop/common/lib/curator-recipes-2.7.1.jar:/opt/hadoop/share/hadoop/common/lib/snappy-java-1.0.4.1.jar:/opt/hadoop/share/hadoop/common/lib/guava-11.0.2.jar:/opt/hadoop/share/hadoop/common/lib/httpcore-4.2.5.jar:/opt/hadoop/share/hadoop/common/lib/junit-4.11.jar:/opt/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar:/opt/hadoop/share/hadoop/common/lib/jackson-jaxrs-1.9.13.jar:/opt/hadoop/share/hadoop/common/lib/paranamer-2.3.jar:/opt/hadoop/share/hadoop/common/lib/netty-3.6.2.Final.jar:/opt/hadoop/share/hadoop/common/lib/jsp-api-2.1.jar:/opt/hadoop/share/hadoop/common/lib/asm-3.2.jar:/opt/hadoop/share/hadoop/common/lib/stax-api-1.0-2.jar:/opt/hadoop/share/hadoop/common/lib/apacheds-kerberos-codec-2.0.0-M15.jar:/opt/hadoop/share/hadoop/common/lib/commons-codec-1.4.jar:/opt/hadoop/share/hadoop/common/lib/jetty-6.1.26.jar:/opt/hadoop/share/hadoop/common/lib/htrace-core-3.1.0-incubating.jar:/opt/hadoop/share/hadoop/common/lib/commons-lang-2.6.jar:/opt/hadoop/share/hadoop/common/hadoop-common-2.7.5.jar:/opt/hadoop/share/hadoop/common/hadoop-common-2.7.5-tests.jar:/opt/hadoop/share/hadoop/common/hadoop-nfs-2.7.5.jar:/opt/hadoop/share/hadoop/hdfs:/opt/hadoop/share/had
oop/hdfs/lib/xml-apis-1.3.04.jar:/opt/hadoop/share/hadoop/hdfs/lib/commons-cli-1.2.jar:/opt/hadoop/share/hadoop/hdfs/lib/netty-all-4.0.23.Final.jar:/opt/hadoop/share/hadoop/hdfs/lib/protobuf-java-2.5.0.jar:/opt/hadoop/share/hadoop/hdfs/lib/jersey-server-1.9.jar:/opt/hadoop/share/hadoop/hdfs/lib/jersey-core-1.9.jar:/opt/hadoop/share/hadoop/hdfs/lib/xmlenc-0.52.jar:/opt/hadoop/share/hadoop/hdfs/lib/jackson-mapper-asl-1.9.13.jar:/opt/hadoop/share/hadoop/hdfs/lib/leveldbjni-all-1.8.jar:/opt/hadoop/share/hadoop/hdfs/lib/log4j-1.2.17.jar:/opt/hadoop/share/hadoop/hdfs/lib/commons-io-2.4.jar:/opt/hadoop/share/hadoop/hdfs/lib/jetty-util-6.1.26.jar:/opt/hadoop/share/hadoop/hdfs/lib/jackson-core-asl-1.9.13.jar:/opt/hadoop/share/hadoop/hdfs/lib/xercesImpl-2.9.1.jar:/opt/hadoop/share/hadoop/hdfs/lib/servlet-api-2.5.jar:/opt/hadoop/share/hadoop/hdfs/lib/commons-logging-1.1.3.jar:/opt/hadoop/share/hadoop/hdfs/lib/jsr305-3.0.0.jar:/opt/hadoop/share/hadoop/hdfs/lib/guava-11.0.2.jar:/opt/hadoop/share/hadoop/hdfs/lib/commons-daemon-1.0.13.jar:/opt/hadoop/share/hadoop/hdfs/lib/netty-3.6.2.Final.jar:/opt/hadoop/share/hadoop/hdfs/lib/asm-3.2.jar:/opt/hadoop/share/hadoop/hdfs/lib/commons-codec-1.4.jar:/opt/hadoop/share/hadoop/hdfs/lib/jetty-6.1.26.jar:/opt/hadoop/share/hadoop/hdfs/lib/htrace-core-3.1.0-incubating.jar:/opt/hadoop/share/hadoop/hdfs/lib/commons-lang-2.6.jar:/opt/hadoop/share/hadoop/hdfs/hadoop-hdfs-nfs-2.7.5.jar:/opt/hadoop/share/hadoop/hdfs/hadoop-hdfs-2.7.5-tests.jar:/opt/hadoop/share/hadoop/hdfs/hadoop-hdfs-2.7.5.jar:/opt/hadoop/share/hadoop/yarn/lib/commons-compress-1.4.1.jar:/opt/hadoop/share/hadoop/yarn/lib/guice-3.0.jar:/opt/hadoop/share/hadoop/yarn/lib/commons-cli-1.2.jar:/opt/hadoop/share/hadoop/yarn/lib/jettison-1.1.jar:/opt/hadoop/share/hadoop/yarn/lib/protobuf-java-2.5.0.jar:/opt/hadoop/share/hadoop/yarn/lib/jersey-server-1.9.jar:/opt/hadoop/share/hadoop/yarn/lib/jersey-core-1.9.jar:/opt/hadoop/share/hadoop/yarn/lib/jackson-mapper-asl-1.9.13.jar:/opt/hadoop/share
/hadoop/yarn/lib/jersey-json-1.9.jar:/opt/hadoop/share/hadoop/yarn/lib/jackson-xc-1.9.13.jar:/opt/hadoop/share/hadoop/yarn/lib/leveldbjni-all-1.8.jar:/opt/hadoop/share/hadoop/yarn/lib/guice-servlet-3.0.jar:/opt/hadoop/share/hadoop/yarn/lib/log4j-1.2.17.jar:/opt/hadoop/share/hadoop/yarn/lib/commons-io-2.4.jar:/opt/hadoop/share/hadoop/yarn/lib/activation-1.1.jar:/opt/hadoop/share/hadoop/yarn/lib/jetty-util-6.1.26.jar:/opt/hadoop/share/hadoop/yarn/lib/commons-collections-3.2.2.jar:/opt/hadoop/share/hadoop/yarn/lib/zookeeper-3.4.6.jar:/opt/hadoop/share/hadoop/yarn/lib/jersey-guice-1.9.jar:/opt/hadoop/share/hadoop/yarn/lib/jackson-core-asl-1.9.13.jar:/opt/hadoop/share/hadoop/yarn/lib/jaxb-impl-2.2.3-1.jar:/opt/hadoop/share/hadoop/yarn/lib/javax.inject-1.jar:/opt/hadoop/share/hadoop/yarn/lib/jersey-client-1.9.jar:/opt/hadoop/share/hadoop/yarn/lib/servlet-api-2.5.jar:/opt/hadoop/share/hadoop/yarn/lib/commons-logging-1.1.3.jar:/opt/hadoop/share/hadoop/yarn/lib/jsr305-3.0.0.jar:/opt/hadoop/share/hadoop/yarn/lib/xz-1.0.jar:/opt/hadoop/share/hadoop/yarn/lib/jaxb-api-2.2.2.jar:/opt/hadoop/share/hadoop/yarn/lib/guava-11.0.2.jar:/opt/hadoop/share/hadoop/yarn/lib/zookeeper-3.4.6-tests.jar:/opt/hadoop/share/hadoop/yarn/lib/jackson-jaxrs-1.9.13.jar:/opt/hadoop/share/hadoop/yarn/lib/netty-3.6.2.Final.jar:/opt/hadoop/share/hadoop/yarn/lib/asm-3.2.jar:/opt/hadoop/share/hadoop/yarn/lib/stax-api-1.0-2.jar:/opt/hadoop/share/hadoop/yarn/lib/aopalliance-1.0.jar:/opt/hadoop/share/hadoop/yarn/lib/commons-codec-1.4.jar:/opt/hadoop/share/hadoop/yarn/lib/jetty-6.1.26.jar:/opt/hadoop/share/hadoop/yarn/lib/commons-lang-2.6.jar:/opt/hadoop/share/hadoop/yarn/hadoop-yarn-api-2.7.5.jar:/opt/hadoop/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-2.7.5.jar:/opt/hadoop/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-2.7.5.jar:/opt/hadoop/share/hadoop/yarn/hadoop-yarn-registry-2.7.5.jar:/opt/hadoop/share/hadoop/yarn/hadoop-yarn-server-web-proxy-2.7.5.jar:/opt/hadoop/share/hadoop/yarn
/hadoop-yarn-client-2.7.5.jar:/opt/hadoop/share/hadoop/yarn/hadoop-yarn-server-tests-2.7.5.jar:/opt/hadoop/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.7.5.jar:/opt/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.5.jar:/opt/hadoop/share/hadoop/yarn/hadoop-yarn-server-sharedcachemanager-2.7.5.jar:/opt/hadoop/share/hadoop/yarn/hadoop-yarn-server-applicationhistoryservice-2.7.5.jar:/opt/hadoop/share/hadoop/yarn/hadoop-yarn-server-common-2.7.5.jar:/opt/hadoop/share/hadoop/yarn/hadoop-yarn-common-2.7.5.jar:/opt/hadoop/share/hadoop/mapreduce/lib/commons-compress-1.4.1.jar:/opt/hadoop/share/hadoop/mapreduce/lib/guice-3.0.jar:/opt/hadoop/share/hadoop/mapreduce/lib/protobuf-java-2.5.0.jar:/opt/hadoop/share/hadoop/mapreduce/lib/jersey-server-1.9.jar:/opt/hadoop/share/hadoop/mapreduce/lib/jersey-core-1.9.jar:/opt/hadoop/share/hadoop/mapreduce/lib/jackson-mapper-asl-1.9.13.jar:/opt/hadoop/share/hadoop/mapreduce/lib/avro-1.7.4.jar:/opt/hadoop/share/hadoop/mapreduce/lib/leveldbjni-all-1.8.jar:/opt/hadoop/share/hadoop/mapreduce/lib/guice-servlet-3.0.jar:/opt/hadoop/share/hadoop/mapreduce/lib/log4j-1.2.17.jar:/opt/hadoop/share/hadoop/mapreduce/lib/hamcrest-core-1.3.jar:/opt/hadoop/share/hadoop/mapreduce/lib/commons-io-2.4.jar:/opt/hadoop/share/hadoop/mapreduce/lib/hadoop-annotations-2.7.5.jar:/opt/hadoop/share/hadoop/mapreduce/lib/jersey-guice-1.9.jar:/opt/hadoop/share/hadoop/mapreduce/lib/jackson-core-asl-1.9.13.jar:/opt/hadoop/share/hadoop/mapreduce/lib/javax.inject-1.jar:/opt/hadoop/share/hadoop/mapreduce/lib/xz-1.0.jar:/opt/hadoop/share/hadoop/mapreduce/lib/snappy-java-1.0.4.1.jar:/opt/hadoop/share/hadoop/mapreduce/lib/junit-4.11.jar:/opt/hadoop/share/hadoop/mapreduce/lib/paranamer-2.3.jar:/opt/hadoop/share/hadoop/mapreduce/lib/netty-3.6.2.Final.jar:/opt/hadoop/share/hadoop/mapreduce/lib/asm-3.2.jar:/opt/hadoop/share/hadoop/mapreduce/lib/aopalliance-1.0.jar:/opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.7.5.jar:/opt/hadoop/share/h
adoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.5.jar:/opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.5-tests.jar:/opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.7.5.jar:/opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-2.7.5.jar:/opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.5.jar:/opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-plugins-2.7.5.jar:/opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.7.5.jar:/opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-app-2.7.5.jar:/opt/hadoop/contrib/capacity-scheduler/*.jar
STARTUP_MSG: build = https://shv@git-wip-us.apache.org/repos/asf/hadoop.git -r 18065c2b6806ed4aa6a3187d77cbe21bb3dba075; compiled by 'kshvachk' on 2017-12-16T01:06Z
STARTUP_MSG: java = 1.8.0_151
************************************************************/
18/01/30 08:50:38 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
18/01/30 08:50:38 INFO namenode.NameNode: createNameNode [-format]
Formatting using clusterid: CID-a8cec172-6f1b-4aa5-8f73-f99ad2bb29b2
18/01/30 08:50:38 INFO namenode.FSNamesystem: No KeyProvider found.
18/01/30 08:50:38 INFO namenode.FSNamesystem: fsLock is fair: true
18/01/30 08:50:38 INFO namenode.FSNamesystem: Detailed lock hold time metrics enabled: false
18/01/30 08:50:39 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
18/01/30 08:50:39 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
18/01/30 08:50:39 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
18/01/30 08:50:39 INFO blockmanagement.BlockManager: The block deletion will start around 2018 Jan 30 08:50:39
18/01/30 08:50:39 INFO util.GSet: Computing capacity for map BlocksMap
18/01/30 08:50:39 INFO util.GSet: VM type= 64-bit
18/01/30 08:50:39 INFO util.GSet: 2.0% max memory 966.7 MB = 19.3 MB
18/01/30 08:50:39 INFO util.GSet: capacity= 2^21 = 2097152 entries
18/01/30 08:50:39 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
18/01/30 08:50:39 INFO blockmanagement.BlockManager: defaultReplication= 1
18/01/30 08:50:39 INFO blockmanagement.BlockManager: maxReplication= 512
18/01/30 08:50:39 INFO blockmanagement.BlockManager: minReplication= 1
18/01/30 08:50:39 INFO blockmanagement.BlockManager: maxReplicationStreams= 2
18/01/30 08:50:39 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
18/01/30 08:50:39 INFO blockmanagement.BlockManager: encryptDataTransfer = false
18/01/30 08:50:39 INFO blockmanagement.BlockManager: maxNumBlocksToLog = 1000
18/01/30 08:50:39 INFO namenode.FSNamesystem: fsOwner= root (auth:SIMPLE)
18/01/30 08:50:39 INFO namenode.FSNamesystem: supergroup = supergroup
18/01/30 08:50:39 INFO namenode.FSNamesystem: isPermissionEnabled = true
18/01/30 08:50:39 INFO namenode.FSNamesystem: HA Enabled: false
18/01/30 08:50:39 INFO namenode.FSNamesystem: Append Enabled: true
18/01/30 08:50:39 INFO util.GSet: Computing capacity for map INodeMap
18/01/30 08:50:39 INFO util.GSet: VM type= 64-bit
18/01/30 08:50:39 INFO util.GSet: 1.0% max memory 966.7 MB = 9.7 MB
18/01/30 08:50:39 INFO util.GSet: capacity= 2^20 = 1048576 entries
18/01/30 08:50:39 INFO namenode.FSDirectory: ACLs enabled? false
18/01/30 08:50:39 INFO namenode.FSDirectory: XAttrs enabled? true
18/01/30 08:50:39 INFO namenode.FSDirectory: Maximum size of an xattr: 16384
18/01/30 08:50:39 INFO namenode.NameNode: Caching file names occuring more than 10 times
18/01/30 08:50:39 INFO util.GSet: Computing capacity for map cachedBlocks
18/01/30 08:50:39 INFO util.GSet: VM type= 64-bit
18/01/30 08:50:39 INFO util.GSet: 0.25% max memory 966.7 MB = 2.4 MB
18/01/30 08:50:39 INFO util.GSet: capacity= 2^18 = 262144 entries
18/01/30 08:50:39 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
18/01/30 08:50:39 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
18/01/30 08:50:39 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension = 30000
18/01/30 08:50:39 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10
18/01/30 08:50:39 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10
18/01/30 08:50:39 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
18/01/30 08:50:39 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
18/01/30 08:50:39 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
18/01/30 08:50:39 INFO util.GSet: Computing capacity for map NameNodeRetryCache
18/01/30 08:50:39 INFO util.GSet: VM type= 64-bit
18/01/30 08:50:39 INFO util.GSet: 0.029999999329447746% max memory 966.7 MB = 297.0 KB
18/01/30 08:50:39 INFO util.GSet: capacity= 2^15 = 32768 entries
18/01/30 08:50:39 INFO namenode.FSImage: Allocated new BlockPoolId: BP-1709412250-192.168.1.108-1517320239405
18/01/30 08:50:39 INFO common.Storage: Storage directory /var/hadoop/tmp/dfs/name has been successfully formatted.
18/01/30 08:50:39 INFO namenode.FSImageFormatProtobuf: Saving image file /var/hadoop/tmp/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
18/01/30 08:50:39 INFO namenode.FSImageFormatProtobuf: Image file /var/hadoop/tmp/dfs/name/current/fsimage.ckpt_0000000000000000000 of size 321 bytes saved in 0 seconds.
18/01/30 08:50:39 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
18/01/30 08:50:39 INFO util.ExitUtil: Exiting with status 0
18/01/30 08:50:39 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at v108.zlikun.com/192.168.1.108
************************************************************/
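Since the format log is long, the one line that matters can be filtered out of a saved copy. A minimal sketch, assuming the output was saved to a file such as `format.log` (a hypothetical name); `formatted_dirs` is likewise a hypothetical helper.

```shell
# formatted_dirs: read a `namenode -format` log on stdin and print every
# storage directory reported as successfully formatted.
formatted_dirs() {
  sed -n 's/.*Storage directory \(.*\) has been successfully formatted\..*/\1/p'
}

# Usage:
#   formatted_dirs < format.log
```

No output means the success line was not found in the log.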
# Inspect the formatted directory
$ ls -l /var/hadoop/tmp/
total 0
drwxr-xr-x. 3 root root 18 Jan 30 08:50 dfs
Running HDFS
# Start HDFS; three processes should be running afterwards (in pseudo-distributed mode all nodes live on the same machine)
$ sbin/start-dfs.sh
$ jps
9091 DataNode
9242 SecondaryNameNode
8973 NameNode
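Checking the `jps` listing by eye works for three daemons, but it can also be scripted. The sketch below reads `jps` output on stdin and prints the name of every expected daemon that is absent; `missing_daemons` is a hypothetical helper, and it assumes the default `jps` format of "PID ClassName" per line.

```shell
# missing_daemons NAME...: read `jps` output on stdin and print each NAME
# that does not appear as the second field of any line.
missing_daemons() {
  jps_out=$(cat)
  for d in "$@"; do
    printf '%s\n' "$jps_out" |
      awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' ||
      echo "$d"
  done
}

# Usage:
#   jps | missing_daemons NameNode DataNode SecondaryNameNode
```

Matching on the whole second field keeps `NameNode` from being satisfied by `SecondaryNameNode`.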
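# HDFS status should now be available in a browser at
# http://192.168.1.108:50070/
# If the page cannot be reached, the firewall may be blocking port 50070; here the firewall is simply stopped (do not do this in production)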
$ firewall-cmd --state
running
$ systemctl stop firewalld
$ firewall-cmd --state
not running
# Disable the firewall outright so that it no longer starts at boot
$ systemctl disable firewalld
Removed symlink /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
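Rather than turning the firewall off, only the ports that are needed can be opened. The sketch below just prints the `firewall-cmd` invocations so they can be reviewed first; `open_port_cmds` is a hypothetical helper, and the port list in the usage line is an assumption based on the web UIs used in these notes (50070 for the NameNode, 8088 for the ResourceManager).

```shell
# open_port_cmds PORT...: print the firewall-cmd commands that would
# permanently open each TCP port and then reload the firewall rules.
open_port_cmds() {
  for port in "$@"; do
    echo "firewall-cmd --permanent --add-port=${port}/tcp"
  done
  echo "firewall-cmd --reload"
}

# Usage (review the output, then run it as root):
#   open_port_cmds 50070 8088
#   open_port_cmds 50070 8088 | sh
```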
# Create a directory in HDFS (a user home directory; Hadoop keeps home directories under /user, and the root account is used here)
$ bin/hdfs dfs -mkdir -p /user/root
# Upload a local file to HDFS
$ bin/hdfs dfs -put input/lang.txt lang.txt
# List the uploaded file
$ bin/hdfs dfs -ls /user/root
Found 1 items
-rw-r--r--   1 root supergroup         59 2018-01-30 09:01 /user/root/lang.txt
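Relative HDFS paths such as `lang.txt` are resolved against the current user's home directory, `/user/<username>`, which is why the file ends up at `/user/root/lang.txt`. A small sketch of that rule follows; `hdfs_abs` is a hypothetical helper that only mimics the resolution and does not talk to HDFS, and it assumes the default `/user` home prefix.

```shell
# hdfs_abs USER PATH: print the absolute HDFS path that PATH refers to
# when used by USER (absolute paths are returned unchanged).
hdfs_abs() {
  user=$1
  path=$2
  case "$path" in
    /*) printf '%s\n' "$path" ;;
    *)  printf '/user/%s/%s\n' "$user" "$path" ;;
  esac
}

# Usage:
#   hdfs_abs root lang.txt
#   hdfs_abs root output
```

The same rule explains why a job output directory given as `output` is created under `/user/root/output`.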
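# Run the word-count example; this time the input file is in HDFS (YARN is not configured yet, so the job still runs in the local job runner, hence the job_local... ID in the log)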
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.5.jar wordcount lang.txt output
18/01/30 09:06:01 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
18/01/30 09:06:01 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
18/01/30 09:06:01 INFO input.FileInputFormat: Total input paths to process : 1
18/01/30 09:06:01 INFO mapreduce.JobSubmitter: number of splits:1
18/01/30 09:06:01 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local147097953_0001
18/01/30 09:06:02 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
18/01/30 09:06:02 INFO mapreduce.Job: Running job: job_local147097953_0001
18/01/30 09:06:02 INFO mapred.LocalJobRunner: OutputCommitter set in config null
18/01/30 09:06:02 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
18/01/30 09:06:02 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
18/01/30 09:06:02 INFO mapred.LocalJobRunner: Waiting for map tasks
18/01/30 09:06:02 INFO mapred.LocalJobRunner: Starting task: attempt_local147097953_0001_m_000000_0
18/01/30 09:06:02 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
18/01/30 09:06:02 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
18/01/30 09:06:02 INFO mapred.MapTask: Processing split: hdfs://v108.zlikun.com:9000/user/root/lang.txt:0+59
18/01/30 09:06:02 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
18/01/30 09:06:02 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
18/01/30 09:06:02 INFO mapred.MapTask: soft limit at 83886080
18/01/30 09:06:02 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
18/01/30 09:06:02 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
18/01/30 09:06:02 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
18/01/30 09:06:02 INFO mapred.LocalJobRunner:
18/01/30 09:06:02 INFO mapred.MapTask: Starting flush of map output
18/01/30 09:06:02 INFO mapred.MapTask: Spilling map output
18/01/30 09:06:02 INFO mapred.MapTask: bufstart = 0; bufend = 99; bufvoid = 104857600
18/01/30 09:06:02 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214360(104857440); length = 37/6553600
18/01/30 09:06:02 INFO mapred.MapTask: Finished spill 0
18/01/30 09:06:02 INFO mapred.Task: Task:attempt_local147097953_0001_m_000000_0 is done. And is in the process of committing
18/01/30 09:06:02 INFO mapred.LocalJobRunner: map
18/01/30 09:06:02 INFO mapred.Task: Task 'attempt_local147097953_0001_m_000000_0' done.
18/01/30 09:06:02 INFO mapred.Task: Final Counters for attempt_local147097953_0001_m_000000_0: Counters: 23
File System Counters
FILE: Number of bytes read=296004
FILE: Number of bytes written=586165
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=59
HDFS: Number of bytes written=0
HDFS: Number of read operations=5
HDFS: Number of large read operations=0
HDFS: Number of write operations=1
Map-Reduce Framework
Map input records=1
Map output records=10
Map output bytes=99
Map output materialized bytes=92
Input split bytes=111
Combine input records=10
Combine output records=7
Spilled Records=7
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=18
Total committed heap usage (bytes)=165744640
File Input Format Counters
Bytes Read=59
18/01/30 09:06:02 INFO mapred.LocalJobRunner: Finishing task: attempt_local147097953_0001_m_000000_0
18/01/30 09:06:02 INFO mapred.LocalJobRunner: map task executor complete.
18/01/30 09:06:02 INFO mapred.LocalJobRunner: Waiting for reduce tasks
18/01/30 09:06:02 INFO mapred.LocalJobRunner: Starting task: attempt_local147097953_0001_r_000000_0
18/01/30 09:06:02 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
18/01/30 09:06:02 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
18/01/30 09:06:02 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@8fe6e1c
18/01/30 09:06:02 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=363285696, maxSingleShuffleLimit=90821424, mergeThreshold=239768576, ioSortFactor=10, memToMemMergeOutputsThreshold=10
18/01/30 09:06:02 INFO reduce.EventFetcher: attempt_local147097953_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
18/01/30 09:06:02 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local147097953_0001_m_000000_0 decomp: 88 len: 92 to MEMORY
18/01/30 09:06:02 INFO reduce.InMemoryMapOutput: Read 88 bytes from map-output for attempt_local147097953_0001_m_000000_0
18/01/30 09:06:02 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 88, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->88
18/01/30 09:06:02 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning
18/01/30 09:06:02 INFO mapred.LocalJobRunner: 1 / 1 copied.
18/01/30 09:06:02 INFO reduce.MergeManagerImpl: finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs
18/01/30 09:06:02 INFO mapred.Merger: Merging 1 sorted segments
18/01/30 09:06:02 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 79 bytes
18/01/30 09:06:02 INFO reduce.MergeManagerImpl: Merged 1 segments, 88 bytes to disk to satisfy reduce memory limit
18/01/30 09:06:02 INFO reduce.MergeManagerImpl: Merging 1 files, 92 bytes from disk
18/01/30 09:06:02 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
18/01/30 09:06:02 INFO mapred.Merger: Merging 1 sorted segments
18/01/30 09:06:02 WARN io.ReadaheadPool: Failed readahead on ifile
EBADF: Bad file descriptor
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posix_fadvise(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posixFadviseIfPossible(NativeIO.java:267)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX$CacheManipulator.posixFadviseIfPossible(NativeIO.java:146)
at org.apache.hadoop.io.ReadaheadPool$ReadaheadRequestImpl.run(ReadaheadPool.java:206)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
18/01/30 09:06:02 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 79 bytes
18/01/30 09:06:02 INFO mapred.LocalJobRunner: 1 / 1 copied.
18/01/30 09:06:02 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
18/01/30 09:06:02 INFO mapred.Task: Task:attempt_local147097953_0001_r_000000_0 is done. And is in the process of committing
18/01/30 09:06:02 INFO mapred.LocalJobRunner: 1 / 1 copied.
18/01/30 09:06:02 INFO mapred.Task: Task attempt_local147097953_0001_r_000000_0 is allowed to commit now
18/01/30 09:06:02 INFO output.FileOutputCommitter: Saved output of task 'attempt_local147097953_0001_r_000000_0' to hdfs://v108.zlikun.com:9000/user/root/output/_temporary/0/task_local147097953_0001_r_000000
18/01/30 09:06:02 INFO mapred.LocalJobRunner: reduce > reduce
18/01/30 09:06:02 INFO mapred.Task: Task 'attempt_local147097953_0001_r_000000_0' done.
18/01/30 09:06:02 INFO mapred.Task: Final Counters for attempt_local147097953_0001_r_000000_0: Counters: 29
File System Counters
FILE: Number of bytes read=296220
FILE: Number of bytes written=586257
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=59
HDFS: Number of bytes written=58
HDFS: Number of read operations=8
HDFS: Number of large read operations=0
HDFS: Number of write operations=3
Map-Reduce Framework
Combine input records=0
Combine output records=0
Reduce input groups=7
Reduce shuffle bytes=92
Reduce input records=7
Reduce output records=7
Spilled Records=7
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=5
Total committed heap usage (bytes)=165744640
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Output Format Counters
Bytes Written=58
18/01/30 09:06:02 INFO mapred.LocalJobRunner: Finishing task: attempt_local147097953_0001_r_000000_0
18/01/30 09:06:02 INFO mapred.LocalJobRunner: reduce task executor complete.
18/01/30 09:06:03 INFO mapreduce.Job: Job job_local147097953_0001 running in uber mode : false
18/01/30 09:06:03 INFO mapreduce.Job:  map 100% reduce 100%
18/01/30 09:06:03 INFO mapreduce.Job: Job job_local147097953_0001 completed successfully
18/01/30 09:06:03 INFO mapreduce.Job: Counters: 35
File System Counters
FILE: Number of bytes read=592224
FILE: Number of bytes written=1172422
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=118
HDFS: Number of bytes written=58
HDFS: Number of read operations=13
HDFS: Number of large read operations=0
HDFS: Number of write operations=4
Map-Reduce Framework
Map input records=1
Map output records=10
Map output bytes=99
Map output materialized bytes=92
Input split bytes=111
Combine input records=10
Combine output records=7
Reduce input groups=7
Reduce shuffle bytes=92
Reduce input records=7
Reduce output records=7
Spilled Records=14
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=23
Total committed heap usage (bytes)=331489280
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=59
File Output Format Counters
Bytes Written=58
# View the result
$ bin/hdfs dfs -cat output/*
erlang	1
golang	1
java	3
javascript	1
lua	1
ruby	1
rust	2
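For a file this small, the result can be cross-checked without Hadoop by counting words with standard Unix tools. This is only a sketch of what the job computes, not how MapReduce executes it: `local_wordcount` is a hypothetical helper, and the sample sentence in the usage line is made up to be consistent with the counts shown (the actual content of lang.txt is not given in these notes).

```shell
# local_wordcount: read text on stdin, print "word<TAB>count" sorted by word.
# Steps: split on spaces/tabs into one token per line, drop empty lines,
# sort in byte order, count identical lines, reorder the two columns.
local_wordcount() {
  tr -s ' \t' '\n\n' |
    sed '/^$/d' |
    LC_ALL=C sort |
    uniq -c |
    awk '{ print $2 "\t" $1 }'
}

# Usage:
#   local_wordcount < input/lang.txt
#   echo 'java golang rust java erlang lua ruby javascript rust java' | local_wordcount
```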
# Stop HDFS
$ sbin/stop-dfs.sh
Stopping namenodes on [v108.zlikun.com]
v108.zlikun.com: stopping namenode
localhost: stopping datanode
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: stopping secondarynamenode
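Configuring YARN
# Copy mapred-site.xml.template to mapred-site.xml
$ cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml
# etc/hadoop/mapred-site.xml: have MapReduce jobs scheduled by YARN
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>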
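# etc/hadoop/yarn-site.xml
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>4096</value>
    </property>
    <property>
        <name>yarn.nodemanager.resource.cpu-vcores</name>
        <value>2</value>
    </property>
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>604800</value>
    </property>
    <property>
        <name>yarn.nodemanager.remote-app-log-dir</name>
        <value>/tmp/logs</value>
    </property>
</configuration>
# Full list of the configuration options used above:
# http://hadoop.apache.org/docs/r2.7.5/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml
# http://hadoop.apache.org/docs/r2.7.5/hadoop-yarn/hadoop-yarn-common/yarn-default.xml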
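The value 604800 used for `yarn.log-aggregation.retain-seconds` is simply seven days expressed in seconds. A one-line sketch for deriving other retention periods; `days_to_seconds` is a hypothetical helper.

```shell
# days_to_seconds N: print the number of seconds in N days.
days_to_seconds() {
  echo $(( $1 * 24 * 60 * 60 ))
}

# Usage:
#   days_to_seconds 7
```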
Running YARN
# Start HDFS and YARN
$ sbin/start-dfs.sh
$ sbin/start-yarn.sh
# Check the processes (there should be five)
$ jps
10977 NameNode
11523 NodeManager
11101 DataNode
11262 SecondaryNameNode
11407 ResourceManager
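# Likewise, YARN's status can be viewed in a browser
# http://192.168.1.108:8088/cluster
# Delete the previously generated output (a MapReduce job's output directory must not already exist); the word-count example is run again below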
$ bin/hdfs dfs -rm -r output
18/01/30 09:10:18 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
Deleted output
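Because a MapReduce job refuses to write into an existing output directory, it can be convenient to remove the directory only when it is actually there. A minimal sketch: `remove_if_exists` is a hypothetical helper, and the `HDFS` variable (defaulting to `bin/hdfs`) is an assumption made so the launcher can be swapped out. It relies on `hdfs dfs -test -d`, which exits with status 0 when the path is a directory.

```shell
# remove_if_exists DIR: delete DIR from HDFS only if it is an existing directory.
# HDFS names the hdfs launcher to run (assumed default: bin/hdfs).
remove_if_exists() {
  if ${HDFS:-bin/hdfs} dfs -test -d "$1"; then
    ${HDFS:-bin/hdfs} dfs -rm -r "$1"
  fi
}

# Usage (from $HADOOP_HOME):
#   remove_if_exists output
```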
# Run the word-count example again, this time scheduled by YARN
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.5.jar wordcount lang.txt output
18/01/30 10:18:28 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
18/01/30 10:18:29 INFO input.FileInputFormat: Total input paths to process : 1
18/01/30 10:18:29 INFO mapreduce.JobSubmitter: number of splits:1
18/01/30 10:18:29 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1517321368440_0002
18/01/30 10:18:29 INFO impl.YarnClientImpl: Submitted application application_1517321368440_0002
18/01/30 10:18:29 INFO mapreduce.Job: The url to track the job: http://v108.zlikun.com:8088/proxy/application_1517321368440_0002/
18/01/30 10:18:29 INFO mapreduce.Job: Running job: job_1517321368440_0002
18/01/30 10:18:38 INFO mapreduce.Job: Job job_1517321368440_0002 running in uber mode : false
18/01/30 10:18:38 INFO mapreduce.Job:  map 0% reduce 0%
18/01/30 10:18:44 INFO mapreduce.Job:  map 100% reduce 0%
18/01/30 10:18:49 INFO mapreduce.Job:  map 100% reduce 100%
18/01/30 10:18:50 INFO mapreduce.Job: Job job_1517321368440_0002 completed successfully
18/01/30 10:18:50 INFO mapreduce.Job: Counters: 49
File System Counters
... ...
Job Counters
... ...
Map-Reduce Framework
Map input records=1
Map output records=10
Map output bytes=99
Map output materialized bytes=92
Input split bytes=111
Combine input records=10
Combine output records=7
Reduce input groups=7
Reduce shuffle bytes=92
Reduce input records=7
Reduce output records=7
Spilled Records=14
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=146
CPU time spent (ms)=1900
Physical memory (bytes) snapshot=328867840
Virtual memory (bytes) snapshot=4159520768
Total committed heap usage (bytes)=219676672
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=59
File Output Format Counters
Bytes Written=58
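Individual counters can be pulled out of the job output for a quick sanity check, for example that `Reduce output records` equals the number of lines in the result. The sketch below assumes the job's console output was saved to a file such as `job.log` (a hypothetical name); `counter` is a hypothetical helper that matches on the text before the `=` sign.

```shell
# counter NAME: read job output on stdin and print the value(s) of counter NAME.
counter() {
  awk -F= -v name="$1" '
    NF == 2 {
      key = $1
      gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim surrounding whitespace
      if (key == name) print $2
    }
  '
}

# Usage:
#   counter 'Reduce output records' < job.log
#   bin/hdfs dfs -cat output/* | wc -l
```

When a log lists the same counter more than once (the local run above prints per-task and job totals), every occurrence is printed in order.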
UI preview

HDFS admin UI
HDFS file browser

YARN monitoring UI


These notes are written in blog form, so they contain many comments; if you spot any mistakes, please leave a comment so I can correct them.
