Intro Guide to AWS

February 1, 2016. Booz Allen, Data Science

This guide walks you through using spot instances on Amazon Web Services (AWS) to save money when training Data Science Bowl (DSB) models with MXNet. A spot instance is a virtual machine hosted on the Amazon cloud that you bid for. If you are outbid, the instance is terminated and all data associated with it is lost. Some steps may require an external search (Google, Bing, and so on). For instance, this guide does not cover setting up an AWS account; we assume you already have one and start from there.

1. Go to https://aws.amazon.com/ -> My Account -> AWS Management Console. Enter the username and password for your AWS account.

fig1

 

2. Click on the EC2 link.

fig2

 

3. Click on the Spot Requests link. Then click the Request Spot Instances link.

fig3

 

4. Under Community AMIs, search for mxnet and select MXNet.jl_GPU – ami-eda8fc87.

fig4

 

5. Select the g2.8xlarge instance and click Next: Configure Instance Details.

fig5

 

6. Select the maximum price you are willing to pay per hour. I entered $0.60 in this case, but the price is up to you. The higher your maximum price, the lower the chance that your instance will be terminated by a higher bid. You may also select other options that are documented by Amazon. Note that this example uses the Northern Virginia region; your exact options and prices will vary.

fig6

 

7. Add storage however you like here. In this example, I simply entered 900 GB for the root volume. You don’t need this much space to download and extract the DSB files; the extra is a safety margin.

fig7

 

8. I don’t do anything on this Tag Spot Request page. I just click Next.

fig8

 

9. I don’t do anything on the Configure Security Group page. You may, however, want to tighten the security settings for your instance here.

fig9

 

10. Now I just review the request on the Review Spot Instance Request page.

fig10

 

11. Use wget to download the Data Science Bowl data quickly onto your AWS machine. Searching Kaggle will give you the steps to accomplish this.
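
If you would rather script the request than click through the console, steps 4 through 7 can be approximated with the boto3 library. This is only a minimal sketch and not the method used in this guide. The region `us-east-1` (Northern Virginia), the key pair name `my-key`, and the root device name `/dev/sda1` are assumptions, so check them against your own account and the AMI before submitting anything.

```python
def build_spot_request(max_price="0.60", key_name="my-key"):
    """Build keyword arguments for boto3's EC2 request_spot_instances call."""
    return {
        "SpotPrice": max_price,               # step 6: maximum hourly price, as a string
        "InstanceCount": 1,
        "Type": "one-time",
        "LaunchSpecification": {
            "ImageId": "ami-eda8fc87",        # step 4: MXNet.jl_GPU community AMI
            "InstanceType": "g2.8xlarge",     # step 5
            "KeyName": key_name,              # assumption: an existing EC2 key pair
            "BlockDeviceMappings": [
                {
                    "DeviceName": "/dev/sda1",  # assumption: the AMI's root device name
                    "Ebs": {"VolumeSize": 900, "DeleteOnTermination": True},  # step 7
                }
            ],
        },
    }


def submit_spot_request(params, region="us-east-1"):
    """Send the request to AWS. Needs boto3 and configured AWS credentials."""
    import boto3  # imported here so the builder above works without boto3 installed

    ec2 = boto3.client("ec2", region_name=region)
    return ec2.request_spot_instances(**params)
```

Calling `submit_spot_request(build_spot_request())` places a real bid and can incur charges, so inspect the dictionary returned by `build_spot_request()` first.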

 

GPU instance prices may range from $0.07 to $0.40 per hour, depending on which GPU instance you select and the going rate in your AWS region. Running a GPU instance 24/7 for 50 days could cost around $0.40 × 24 × 50 = $480 at the high end.
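
To plug in your own numbers, the same arithmetic can be written as a small helper. This is a rough sketch that assumes a constant hourly rate, which real spot prices do not guarantee; the function name `spot_cost` is made up for illustration.

```python
def spot_cost(hourly_rate, hours_per_day=24, days=50):
    """Estimated total cost in dollars at a constant hourly rate."""
    return hourly_rate * hours_per_day * days


# The low and high ends of the price range quoted above.
print(f"Low end:  ${spot_cost(0.07):.2f}")   # Low end:  $84.00
print(f"High end: ${spot_cost(0.40):.2f}")   # High end: $480.00
```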

After the steps above, you may want additional help, which you can find via web search or on a site like Stack Overflow. Typically, you access your machine over ssh from the terminal, using your pem file; the instructions are on the AWS console. You may need to run chmod 400 yourpemfile first. You can use scp to transfer files to and from your machine. You might want to do this often in case your instance is outbid and terminated: all files that are not backed up or stored on S3 are lost. In future posts we may add other ways to leverage MXNet and AWS, such as using AWS S3 for storage.
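
The chmod, ssh, and scp steps can be sketched as a few helpers that build the commands for you. Everything specific here is an assumption for illustration: the login user `ubuntu` (the AWS console shows the right one for your AMI), the host `ec2-host.example.com`, and the `models` and `./backup` paths.

```python
import os
import shlex
import stat


def restrict_key(pem_path):
    """Same effect as `chmod 400`: the key becomes readable by its owner only."""
    os.chmod(pem_path, stat.S_IRUSR)


def ssh_command(pem_path, host, user="ubuntu"):
    """Argument list for logging in to the instance."""
    return ["ssh", "-i", pem_path, f"{user}@{host}"]


def scp_download_command(pem_path, host, remote_path, local_dir, user="ubuntu"):
    """Argument list for copying a remote file or directory to the local machine."""
    return ["scp", "-i", pem_path, "-r", f"{user}@{host}:{remote_path}", local_dir]


# Print the backup command instead of running it, so nothing touches the network.
backup = scp_download_command(
    "yourpemfile", "ec2-host.example.com", "models", "./backup"
)
print(shlex.join(backup))
```

Passing one of these lists to `subprocess.run(cmd, check=True)` would actually run it. Running the backup command on a schedule is one way to limit what you lose if the instance is outbid.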

MXNet provides documentation for S3 here.

—Written by Mike Kim