Friday, September 12, 2008

Problem Set 0: Getting onto the cloud

Got comments, questions, gripes, issues, etc. on Problem Set 0: Getting onto the cloud in my cloud computing course?

Post here!

2 comments:

Unknown said...

My comment is: Don't underestimate anything with Zero.

Seriously, I had a painful time with Problem 0 and want to share my problems here.
1. I forgot to include the first line "---begin" and the last line "----" in the keypair file. When I logged in, it asked me for a "passphrase". After I included them, it worked.
2. Don't make the keypair file's permissions too open (for example, chmod 777). It won't accept the keypair.
3. I didn't have ssh in cygwin. When I started the cluster, it always said: no command found. When I started it again, it said it couldn't find the .hadooop-zone-master-cluster file. After I installed ssh in cygwin, it worked.
4. Don't include spaces in your directory path for EC2; if you want them, set protection.

Forgot to mention: my OS is Windows XP.
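
A few of the pitfalls above (missing first/last lines, permissions that are too open, spaces in the path) can be checked before trying to log in. The following is only a rough sketch, not part of the course scripts: the function name check_keypair_file is made up for illustration, and it assumes the key is a PEM-style file whose first line starts with "-----BEGIN" and whose last line starts with "-----END". Permission bits may also be reported differently under cygwin than on Linux.

```python
import os
import stat


def check_keypair_file(path):
    """Return a list of problems found with a private key file.

    Hypothetical helper: the checks mirror the pitfalls listed in the
    comment above and assume a PEM-style key file.
    """
    problems = []

    # Pitfall 4: spaces in the path can confuse the scripts.
    if " " in path:
        problems.append("path contains a space")

    # Pitfall 1: the first and last lines of the key must be present.
    with open(path, "r") as f:
        lines = [line.strip() for line in f if line.strip()]
    if not lines or not lines[0].startswith("-----BEGIN"):
        problems.append("missing first BEGIN line")
    if not lines or not lines[-1].startswith("-----END"):
        problems.append("missing last END line")

    # Pitfall 2: ssh rejects keys readable by group or others.
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        problems.append("file is accessible to group/others (mode %o)" % mode)

    return problems
```

An empty list means none of these particular problems were detected; it does not guarantee that the key itself is valid.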

Unknown said...

Another student and I experienced a problem where we would get exceptions every time we tried to replicate something (input files, running a Hadoop job, etc.). This occurred because the computer we were told was the master was actually a slave. Sometimes we got the correct master, but more often than not we got a slave. A workaround is to first start up a cluster with 0 nodes and then add nodes (this can be done with the same command that you use to start a cluster).

If you encounter this problem and do not want to shut down your cluster, you can manually edit the files in your temp directory (defined in env.sh) to contain the correct IP address, which you can find by trial and error from the slave machines listed by ec2-describe-instances.

If someone can find a complete solution to this problem (instead of a workaround) it would be appreciated.
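
The trial-and-error search for the real master might be automated along these lines. This is only a sketch under assumptions: it guesses that the true master is the one machine accepting TCP connections on the NameNode port, and the port number depends on your Hadoop configuration, so it is left as a parameter. The names port_is_open and find_master are made up for illustration, and the candidate addresses have to be copied by hand from the instance listing.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable names.
        return False


def find_master(candidates, port, probe=port_is_open):
    """Return the first candidate that answers on the given port, or None.

    Assumption: only the real master listens on this port. The probe
    argument can be swapped out, e.g. for a dry run without a network.
    """
    for host in candidates:
        if probe(host, port):
            return host
    return None
```

If find_master returns None, either no machine is acting as master yet or the port guess is wrong for your setup.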
