Understand Hadoop HDFS with a single comic!
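The default placement rules for 3 replicas:
- If the writing client is a DataNode (DN) in the HDFS cluster, the 1st replica is stored on the client's own host;
- If the writing client is not a DN in the HDFS cluster, the 1st replica is stored on a randomly chosen DN in the cluster;
- The 2nd replica is stored on a DN in a different rack from the 1st replica;
- The 3rd replica is stored on another DN in the same rack as the 2nd replica.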
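The four rules above can be sketched in a few lines of Java. This is a simplified illustration only, not Hadoop's actual implementation: the class `ReplicaPlacementSketch`, the `Node` type and the `chooseTargets` method are hypothetical names invented here, and the sketch ignores node load, free space and the fallback logic a real cluster needs (for example, it simply stops when there is no second rack).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical, simplified model of the default 3-replica rules.
// It is NOT the real BlockPlacementPolicyDefault API.
class ReplicaPlacementSketch {

    // A DataNode described by its name and the rack it sits in.
    // Nodes are compared by identity, which is enough for this sketch.
    static final class Node {
        final String name;
        final String rack;

        Node(String name, String rack) {
            this.name = name;
            this.rack = rack;
        }
    }

    // Returns up to 3 targets for one block. `writer` may be null or a node
    // outside the cluster, which models a client that is not a DataNode.
    static List<Node> chooseTargets(Node writer, List<Node> cluster, Random rnd) {
        List<Node> targets = new ArrayList<>();

        // 1st replica: the writer's own node if it is a DataNode, else a random one.
        Node first = (writer != null && cluster.contains(writer))
                ? writer
                : cluster.get(rnd.nextInt(cluster.size()));
        targets.add(first);

        // 2nd replica: a random node on a different rack from the 1st.
        List<Node> remote = new ArrayList<>();
        for (Node n : cluster) {
            if (!n.rack.equals(first.rack)) {
                remote.add(n);
            }
        }
        if (remote.isEmpty()) {
            return targets; // assumption: the sketch gives up on a single-rack cluster
        }
        Node second = remote.get(rnd.nextInt(remote.size()));
        targets.add(second);

        // 3rd replica: a different node on the same rack as the 2nd.
        List<Node> sameRack = new ArrayList<>();
        for (Node n : cluster) {
            if (n.rack.equals(second.rack) && n != second) {
                sameRack.add(n);
            }
        }
        if (!sameRack.isEmpty()) {
            targets.add(sameRack.get(rnd.nextInt(sameRack.size())));
        }
        return targets;
    }
}
```

Calling `chooseTargets` with a writer that is part of `cluster` always puts that writer first in the returned list; passing `null` as the writer models an external client, so the first target is picked at random.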
/**
 * The class is responsible for choosing the desired number of targets
 * for placing block replicas.
 * The replica placement strategy is that if the writer is on a datanode,
 * the 1st replica is placed on the local machine,
 * otherwise a random datanode. The 2nd replica is placed on a datanode
 * that is on a different rack. The 3rd replica is placed on a datanode
 * which is on a different node of the rack as the second replica.
 */
@InterfaceAudience.Private
public class BlockPlacementPolicyDefault extends BlockPlacementPolicy {

  private static final String enableDebugLogging =
      "For more information, please enable DEBUG log level on "
      + BlockPlacementPolicy.class.getName();

  private static final ThreadLocal<StringBuilder> debugLoggingBuilder
      = new ThreadLocal<StringBuilder>() {
        @Override
        protected StringBuilder initialValue() {
          return new StringBuilder();
        }
      };
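Points to note about the number of block replicas:
- The effective number of replicas cannot exceed the number of DataNodes in the cluster
- Each DataNode can hold only one replica of a given block
- With the default policy on a multi-rack cluster, each rack holds at most 2 replicas of a given block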
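
As a rough illustration, the three constraints above can be written as a small check over a proposed placement. The class `ReplicaConstraintSketch` and the method `isValid` are hypothetical names used only for this example; the per-rack limit of 2 is taken directly from the list above rather than computed the way Hadoop does internally.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper that checks a placement against the three notes above.
class ReplicaConstraintSketch {

    // Assumed limit, copied from the text: at most 2 replicas of a block per rack.
    static final int MAX_REPLICAS_PER_RACK = 2;

    // Each entry of `placement` is a {dataNodeName, rackName} pair for one replica.
    static boolean isValid(List<String[]> placement, int dataNodeCount) {
        // Note 1: no more replicas than DataNodes in the cluster.
        if (placement.size() > dataNodeCount) {
            return false;
        }
        Set<String> usedNodes = new HashSet<>();
        Map<String, Integer> replicasPerRack = new HashMap<>();
        for (String[] replica : placement) {
            // Note 2: a DataNode holds at most one replica of the block.
            if (!usedNodes.add(replica[0])) {
                return false;
            }
            // Note 3: a rack holds at most MAX_REPLICAS_PER_RACK replicas.
            int onRack = replicasPerRack.merge(replica[1], 1, Integer::sum);
            if (onRack > MAX_REPLICAS_PER_RACK) {
                return false;
            }
        }
        return true;
    }
}
```

A placement such as `{dn1, rackA}`, `{dn3, rackB}`, `{dn4, rackB}` on a four-node cluster passes the check, while putting all three replicas on `rackB` or listing `dn1` twice fails it.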