...
No Format |
---|
hama/bin/hama jar YOUR_JAR.jar |
to your Hama cluster.
If you sort the result by PageRank in descending order, you can see the following top 10 sites:
No Format |
---|
1 0.00222 United_States
2 0.00141 2007
3 0.00136 2008
4 0.00126 Geographic_coordinate_system
5 0.00101 United_Kingdom
6 0.00087 2006
7 0.00074 France
8 0.00073 Wikimedia_Commons
9 0.00066 Wiktionary
10 0.00065 Canada
|
Note that you can decode any indices you see by using the page titles files.
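A minimal sketch of the sorting and decoding step, in Python. It assumes the job output is one tab-separated `vertex<TAB>rank` pair per line and that the page titles are available as an index-to-title mapping. Both formats are assumptions, so adjust the parsing to what your output and titles files actually contain. The function name `top_pages` and the sample indices are hypothetical. The ranks and titles are taken from the list above.

```python
def top_pages(result_lines, n=10, titles=None):
    """Return the n highest-ranked (vertex, rank) pairs, best first."""
    ranked = []
    for line in result_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Assumed line format: "<vertex>\t<rank>"
        vertex, rank = line.split("\t")
        ranked.append((vertex, float(rank)))

    # Sort descending by rank and keep only the first n entries.
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    top = ranked[:n]

    if titles is not None:
        # Replace a vertex index with its title when the mapping knows it.
        top = [(titles.get(vertex, vertex), rank) for vertex, rank in top]
    return top


# Hypothetical indices; ranks and titles are from the top 10 list.
sample = ["12\t0.00141", "7\t0.00222", "30\t0.00136"]
sample_titles = {"7": "United_States", "12": "2007", "30": "2008"}

for position, (title, rank) in enumerate(
        top_pages(sample, titles=sample_titles), start=1):
    print(position, f"{rank:.5f}", title)
```

Without a `titles` mapping, the function returns the raw vertex identifiers unchanged.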
Troubleshooting
If your job does not execute, your cluster may not have enough resources (task slots).
Symptoms may look like this in the bsp master log:
No Format |
---|
2012-05-27 20:00:51,228 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2012-05-27 20:00:51,288 INFO org.apache.hama.bsp.JobInProgress: num BSPTasks: 16
2012-05-27 20:00:51,305 INFO org.apache.hama.bsp.JobInProgress: Job is initialized.
2012-05-27 20:00:51,313 ERROR org.apache.hama.bsp.SimpleTaskScheduler: Scheduling of job Pagerank could not be done successfully. Killing it!
2012-05-27 20:01:08,334 INFO org.apache.hama.bsp.JobInProgress: num BSPTasks: 16
2012-05-27 20:01:08,339 INFO org.apache.hama.bsp.JobInProgress: Job is initialized.
2012-05-27 20:01:08,340 ERROR org.apache.hama.bsp.SimpleTaskScheduler: Scheduling of job Pagerank could not be done successfully. Killing it!
|
This was run on an 8-slot cluster, but the job required 16 slots because of the 64 MB block size of HDFS. You can either re-upload the file with a larger block size so that the number of blocks matches the available slots, or increase the number of slots in your Hama cluster.
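The slot arithmetic can be checked before submitting a job. The following Python sketch assumes one BSP task per HDFS block, as the log above suggests, and uses a hypothetical input file of 1024 MB (16 blocks of 64 MB). The function name `required_tasks` is made up for illustration.

```python
def required_tasks(file_size_bytes, block_size_bytes):
    """Number of blocks (and thus tasks, assuming one task per block)."""
    # Ceiling division using integers only.
    return -(-file_size_bytes // block_size_bytes)


MB = 1024 * 1024
file_size = 1024 * MB   # hypothetical input size
available_slots = 8     # slots in the cluster from the example

for block_size in (64 * MB, 128 * MB):
    tasks = required_tasks(file_size, block_size)
    verdict = "fits" if tasks <= available_slots else "too many tasks"
    print(f"{block_size // MB} MB blocks -> {tasks} tasks: {verdict}")
```

Under these assumptions, doubling the block size halves the number of tasks, which is why re-uploading the file with a larger block size is one of the two fixes.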