I. Required Software
1. Ubuntu 14.04
   Three hosts:
   192.168.71.136  cloud01
   192.168.71.135  cloud02
   192.168.71.137  cloud03
2. jdk-7u51-linux-i586.tar.gz
3. hadoop-2.2.0.tar.gz
Baidu Cloud download link: pan.baidu.com/s/1pKADKNL
II. Setup Steps
Single-Node Setup
1. Change the hostnames. Set the three machines' hostnames to cloud01, cloud02, and cloud03 respectively:
sudo gedit /etc/hostname  (reboot afterwards)
2. Add the address entries to the hosts file:
192.168.71.134 cloud01
192.168.71.129 cloud02
192.168.71.130 cloud03
sudo gedit /etc/hosts
3. Install Java (on each machine)
Create the directory and copy the Java tarball into it:
sudo mkdir /usr/java
Extract it:
sudo tar -zxvf <filename>
Edit the profile:
sudo gedit /etc/profile
Add the following lines:
export JAVA_HOME=/usr/java/jdk1.7.0_51
export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH
Apply the changes:
source /etc/profile
Check that the installation succeeded:
java -version
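If java -version does not report 1.7.0_51, the following quick check can help spot a profile problem; this is a minimal sketch that assumes the JDK was extracted to /usr/java/jdk1.7.0_51 as above:
# Sketch: confirm the variables set in /etc/profile took effect in the current shell
echo $JAVA_HOME          # should print /usr/java/jdk1.7.0_51
which java               # should resolve to a path under $JAVA_HOME/bin
ls $JAVA_HOME/bin/java   # should exist; if not, re-check the extracted directory name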
4. Install Hadoop
Copy the tarball to your home directory and extract it:
sudo tar -zxvf <filename>
After extracting, grant permissions on the resulting directory: chmod -R 777 <directory name>
At this point the single-node installation is complete. Verify it by running the following inside the extracted hadoop-2.2.0 directory:
./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi 10 20
Pseudo-Distributed Setup (continuing from the above)
5. Install ssh
Run: sudo apt-get install ssh
Create a .ssh directory in your home directory: mkdir .ssh
Enter that directory: cd .ssh
ssh-keygen -t rsa  (press Enter at every prompt)
cat id_rsa.pub >> authorized_keys
sudo service ssh restart
Test it: ssh localhost (it should log in without asking for a password)
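If you prefer not to press Enter through every prompt, ssh-keygen can also be run non-interactively. This is a sketch of an equivalent sequence, assuming the default key path and an empty passphrase (not part of the original steps):
# Sketch: generate the key pair without prompts and authorize it locally
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys   # ssh ignores an authorized_keys file that is group/world-writable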
6. Configure Hadoop
First, create the following directories under your home directory (a one-line command is sketched after this list):
~/hddata/dfs/name
~/hddata/dfs/data
~/hddata/tmp
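All three can be created at once with mkdir -p; a minimal sketch, assuming the hduser home directory used throughout this guide:
# Sketch: create the NameNode, DataNode, and tmp directories in one command
mkdir -p ~/hddata/dfs/name ~/hddata/dfs/data ~/hddata/tmp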
Then, inside the hadoop-2.2.0 directory, modify the following configuration files:
gedit etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_51
gedit etc/hadoop/core-site.xml
<configuration>
  <property><name>fs.default.name</name><value>hdfs://localhost:9000</value></property>
  <property><name>hadoop.tmp.dir</name><value>/home/hduser/hddata/tmp</value></property>
</configuration>
gedit etc/hadoop/hdfs-site.xml
<configuration>
  <property><name>dfs.namenode.name.dir</name><value>/home/hduser/hddata/dfs/name</value></property>
  <property><name>dfs.datanode.data.dir</name><value>/home/hduser/hddata/dfs/data</value></property>
  <property><name>dfs.replication</name><value>1</value></property>
</configuration>
cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml
gedit etc/hadoop/mapred-site.xml
<configuration>
  <property><name>mapred.job.tracker</name><value>localhost:54311</value></property>
  <property><name>mapred.map.tasks</name><value>10</value></property>
  <property><name>mapred.reduce.tasks</name><value>2</value></property>
</configuration>
Format the NameNode:
./bin/hdfs namenode -format
Start all the daemons:
./sbin/start-all.sh
Check the running processes:
jps
3776 ResourceManager
3354 NameNode
3645 SecondaryNameNode
3467 DataNode
3895 NodeManager
4382 Jps
To test, open localhost:50070 in a browser.
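Besides the web UI, a quick HDFS smoke test confirms that the NameNode and DataNode are actually serving requests. A minimal sketch, run from the hadoop-2.2.0 directory; the /user/hduser path and the test file name are illustrative assumptions:
# Sketch: write a small file into HDFS and list it back
./bin/hdfs dfs -mkdir -p /user/hduser           # create a home directory in HDFS
echo "hello hdfs" > /tmp/smoke.txt
./bin/hdfs dfs -put /tmp/smoke.txt /user/hduser/
./bin/hdfs dfs -ls /user/hduser                 # the file should appear in the listing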
At this point the pseudo-distributed setup is complete.
Cluster Setup
1. Extract the cluster configuration files and start all three machines in the virtual machine environment.
2. Set a static IP address on each machine; double-check the gateway and DNS settings.
3. Edit the hosts file on each machine:
sudo gedit /etc/hosts
192.168.71.136 cloud01
192.168.71.135 cloud02
192.168.71.137 cloud03
Note: delete the 127.0.1.1 entry on the second line near the top of the original file; do this on every machine. The resulting file is sketched below.
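What /etc/hosts might look like after this step, assuming the default Ubuntu layout and the IP addresses above:
127.0.0.1       localhost
# (the original 127.0.1.1 <hostname> line has been removed)
192.168.71.136  cloud01
192.168.71.135  cloud02
192.168.71.137  cloud03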
4. Set up a public/private key pair on each machine
sudo apt-get install ssh
mkdir .ssh
cd .ssh
ssh-keygen -t rsa
cat id_rsa.pub>>authorized_keys
sudo service ssh restart
ssh localhost
If a .ssh directory already exists, delete it first (rm -rf .ssh).
5. Send the master's public key and append it to each machine's authorized_keys file
cd .ssh
scp authorized_keys hduser@cloud02:~/.ssh/authorized_keys_from_cloud01
scp authorized_keys hduser@cloud03:~/.ssh/authorized_keys_from_cloud01
Then, on cloud02 and cloud03 respectively, run the following:
cd .ssh
cat authorized_keys_from_cloud01 >> authorized_keys
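As an alternative to the scp/cat steps above (not part of the original write-up, but standard with OpenSSH), ssh-copy-id installs the key directly; a minimal sketch run on cloud01:
# Sketch: push cloud01's public key to the other nodes in one step each
ssh-copy-id hduser@cloud02    # prompts for hduser's password once
ssh-copy-id hduser@cloud03
ssh cloud02 hostname          # should print cloud02 without asking for a password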
6. Install the JDK on every machine.
7. Install hadoop-2.2.0 on the master (tar -zxvf hadoop-2.2.0.tar.gz).
8. Create the following three directories in the home folder of every machine (a loop-based sketch follows the scp commands below):
~/hddata/dfs/name
~/hddata/dfs/data
~/hddata/tmp
scp -r ~/hddata hduser@cloud02:~/
scp -r ~/hddata hduser@cloud03:~/
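Instead of copying the (empty) directory tree with scp, the same layout can be created directly on each slave over ssh; a minimal sketch, assuming the hostnames and hduser account used throughout:
# Sketch: create the hddata directories on cloud02 and cloud03 remotely
for h in cloud02 cloud03; do
  ssh hduser@$h 'mkdir -p ~/hddata/dfs/name ~/hddata/dfs/data ~/hddata/tmp'
done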
9. Modify the 7 configuration files on the master
cd hadoop-2.2.0
(1) gedit etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_51
(2) gedit etc/hadoop/yarn-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_51
(3) gedit etc/hadoop/slaves
cloud01
cloud02
cloud03
(4) gedit etc/hadoop/core-site.xml
<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://cloud01:9000</value></property>
  <property><name>io.file.buffer.size</name><value>131072</value></property>
  <property><name>hadoop.tmp.dir</name><value>/home/hduser/hddata/tmp</value></property>
</configuration>
(5) gedit etc/hadoop/hdfs-site.xml
<configuration>
  <property><name>dfs.namenode.secondary.http-address</name><value>cloud01:9001</value></property>
  <property><name>dfs.namenode.name.dir</name><value>/home/hduser/hddata/dfs/name</value></property>
  <property><name>dfs.datanode.data.dir</name><value>/home/hduser/hddata/dfs/data</value></property>
  <property><name>dfs.replication</name><value>2</value></property>
  <property><name>dfs.webhdfs.enabled</name><value>true</value></property>
</configuration>
(6) cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml
gedit etc/hadoop/mapred-site.xml
<configuration>
  <property><name>mapreduce.framework.name</name><value>yarn</value></property>
  <property><name>mapreduce.jobhistory.address</name><value>cloud01:10020</value></property>
  <property><name>mapreduce.jobhistory.webapp.address</name><value>cloud01:19888</value></property>
</configuration>
(7) gedit etc/hadoop/yarn-site.xml
<configuration>
  <property><name>yarn.nodemanager.aux-services</name><value>mapreduce_shuffle</value></property>
  <property><name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name><value>org.apache.hadoop.mapred.ShuffleHandler</value></property>
  <property><name>yarn.resourcemanager.address</name><value>cloud01:8132</value></property>
  <property><name>yarn.resourcemanager.scheduler.address</name><value>cloud01:8130</value></property>
  <property><name>yarn.resourcemanager.resource-tracker.address</name><value>cloud01:8131</value></property>
  <property><name>yarn.resourcemanager.admin.address</name><value>cloud01:8133</value></property>
  <property><name>yarn.resourcemanager.webapp.address</name><value>cloud01:8188</value></property>
</configuration>
10. Send the hadoop-2.2.0 folder from the master to the other two machines
scp -r hadoop-2.2.0 hduser@cloud02:~/
scp -r hadoop-2.2.0 hduser@cloud03:~/
11. Format the NameNode
cd hadoop-2.2.0
./bin/hdfs namenode -format
12. Start Hadoop
./sbin/start-all.sh
Check the file/block layout and the cluster report:
./bin/hdfs fsck / -files -blocks
./bin/hdfs dfsadmin -report
Web UIs (using the ports configured above):
http://192.168.71.136:50070
http://192.168.71.136:8188
Start the JobHistory server (configured in mapred-site.xml above):
./sbin/mr-jobhistory-daemon.sh start historyserver
13. Run the pi example
./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi 10 20