Monday, September 3, 2012

Hadoop could not obtain block

Just got a new, exciting exception:

java.lang.IllegalStateException: hdfs://mypath0/part-03511
at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirValueIterator$1.apply(SequenceFileDirValueIterator.java:131)
at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirValueIterator$1.apply(SequenceFileDirValueIterator.java:1)
at com.google.common.collect.Iterators$9.transform(Iterators.java:845)
at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48)
at com.google.common.collect.Iterators$6.hasNext(Iterators.java:583)
at com.google.common.collect.ForwardingIterator.hasNext(ForwardingIterator.java:43)
at org.apache.mahout.clustering.classify.ClusterClassifier.readFromSeqFiles(ClusterClassifier.java:208)
at org.apache.mahout.clustering.iterator.CIMapper.setup(CIMapper.java:28)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.io.IOException: Could not obtain block: blk_-3246019420168585051_14555 file=/mypath0/part-03511
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:2269)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:2063)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:2224)
at java.io.DataInputStream.readFully(DataInputStream.java:178)


Looking into your datanode log, you will find something like this:

2012-03-03 14:28:55,095 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(112.28.22.231:50010, storageID=DS-314214910-127.0.0.2-50010-1346671568278, infoPort=50075, ipcPort=50020):DataXceiver
java.io.IOException: xceiverCount 275 exceeds the limit of concurrent xcievers 256
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:92)
        at java.lang.Thread.run(Thread.java:662)
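
If you want to see how often a datanode is actually hitting the limit, a quick grep for the message above does the job. The log path below is just an assumption; adjust it to wherever your installation writes the datanode logs:

        # count how many times the xceiver limit was exceeded (log path is an assumption)
        grep -c "exceeds the limit of concurrent xcievers" /var/log/hadoop/*datanode*.log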



That is typically a configuration issue on your datanode: the default limit of 256 concurrent xceivers is too low for the load. Try setting the dfs.datanode.max.xcievers property (the misspelling is part of the official property name) to a value higher than 256 in your hdfs-site.xml and restart the datanodes.
Check the Hadoop book page on this property for detailed information!
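
As an example, a minimal hdfs-site.xml entry could look like the snippet below. The value 4096 is just a commonly used choice, not an official recommendation; pick whatever fits your cluster size and workload:

        <property>
          <name>dfs.datanode.max.xcievers</name>
          <value>4096</value>
        </property>

Note that the change only takes effect after the datanodes are restarted, and that newer Hadoop releases rename this setting to dfs.datanode.max.transfer.threads.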