Discussion:
Failed to activate NameNode when configuring HA
清如许
2014-09-28 18:56:36 UTC
Hi,

I'm new to Hadoop and have run into a problem while configuring HA.
Below are the relevant settings from my core-site.xml:

<property>
<name>dfs.nameservices</name>
<value>ns1,ns2</value>
</property>
<property>
<name>dfs.ha.namenodes.ns1</name>
<value>nn1,nn3</value>
</property>
<property>
<name>dfs.ha.namenodes.ns2</name>
<value>nn2,nn4</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns1.nn1</name>
<value>namenode1:9000</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns1.nn3</name>
<value>namenode3:9000</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns2.nn2</name>
<value>namenode2:9000</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns2.nn4</name>
<value>namenode4:9000</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://datanode2:8485;datanode3:8485;datanode4:8485/ns1</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.ns1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hduser/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/home/hduser/mydata/hdfs/journalnode</value>
</property>

(The two nameservices, ns1 and ns2, are there so that I can configure federation later. For now I only want to launch ns1 on namenode1 and namenode3.)

After configuring, I did the following:
First, I started a JournalNode on datanode2, datanode3, and datanode4.
Second, I formatted datanode1 and started the NameNode on it.
Then I ran 'hdfs namenode -bootstrapStandby' on the other NameNode and started the NameNode on it.

Everything seems fine, except that no NameNode is active. I then tried to activate one by running
hdfs haadmin -transitionToActive nn1 on namenode1,
but strangely it says "Illegal argument: Unable to determine the nameservice id."

Could anyone tell me why it cannot determine the nameservice for nn1 from my configuration?
Is there something wrong in my configuration?

Thanks a lot!!!
Matt Narrell
2014-09-28 22:28:52 UTC
I’m pretty sure HDFS HA is limited to two NameNodes (not four), one designated active and one standby. Secondly, I believe these properties should be in hdfs-site.xml, NOT core-site.xml.

Furthermore, I think your HDFS nameservices are misconfigured. Consider the following:

<?xml version="1.0"?>
<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/var/data/hadoop/hdfs/nn</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/var/data/hadoop/hdfs/dn</value>
</property>

<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.nameservices</name>
<value>hdfs-cluster</value>
</property>

<property>
<name>dfs.ha.namenodes.hdfs-cluster</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.hdfs-cluster.nn1</name>
<value>namenode1:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.hdfs-cluster.nn1</name>
<value>namenode1:50070</value>
</property>
<property>
<name>dfs.namenode.rpc-address.hdfs-cluster.nn2</name>
<value>namenode2:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.hdfs-cluster.nn2</name>
<value>namenode2:50070</value>
</property>

<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://journalnode1:8485;journalnode2:8485;journalnode3:8485/hdfs-cluster</value>
</property>

<property>
<name>dfs.client.failover.proxy.provider.hdfs-cluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>

<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hadoop/.ssh/id_rsa</value>
</property>
</configuration>
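
With the HA properties moved to hdfs-site.xml, core-site.xml would hold only the cluster-wide entry point. The fragment below is a minimal sketch of that file, not a tested configuration. It assumes the logical nameservice name hdfs-cluster from the example above, and the ZooKeeper hostnames zk1, zk2, and zk3 are placeholders that do not appear anywhere in this thread. The ha.zookeeper.quorum entry would be needed only because dfs.ha.automatic-failover.enabled is set to true in the example.

```xml
<?xml version="1.0"?>
<configuration>
<!-- Clients address the logical nameservice rather than a single NameNode host -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://hdfs-cluster</value>
</property>
<!-- Used by automatic failover; zk1, zk2, zk3 are hypothetical hostnames -->
<property>
<name>ha.zookeeper.quorum</name>
<value>zk1:2181,zk2:2181,zk3:2181</value>
</property>
</configuration>
```

If automatic failover is not wanted, the ZooKeeper entry can be dropped and dfs.ha.automatic-failover.enabled set to false, in which case one NameNode has to be made active by hand with hdfs haadmin.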

mn