
Wednesday, May 23, 2012

boto cheat sheet

I've been using Python boto for nearly a year and have been greatly impressed with it from the get-go. However, I usually find myself forgetting key methods and parameters for the few AWS services I use the most, namely SQS, DynamoDB, S3 and EC2. So, in order to avoid going through the documentation every time, I made a cheat sheet for the most commonly used methods and functionality of these services. You can now download it here. If you see something amiss or inaccurate, please let me know.
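
To give a flavor of the kind of one-liners the cheat sheet covers, here are a few of the calls I reach for most often (the bucket, queue, key and table names below are just placeholders):
import boto
from boto.sqs.message import Message

AWS_KEY = '[YOUR_AWS_KEY_ID_HERE]'
AWS_SECRET = '[YOUR_AWS_SECRET_KEY_HERE]'

# S3: fetch an object's contents
s3 = boto.connect_s3(AWS_KEY, AWS_SECRET)
data = s3.get_bucket('my-bucket').get_key('some/key').get_contents_as_string()

# SQS: push a message onto a queue
sqs = boto.connect_sqs(AWS_KEY, AWS_SECRET)
sqs.get_queue('my-queue').write(Message(body='hello'))

# EC2: list all reservations/instances
ec2 = boto.connect_ec2(AWS_KEY, AWS_SECRET)
reservations = ec2.get_all_instances()

# DynamoDB: read an item by its hash key
ddb = boto.connect_dynamodb(AWS_KEY, AWS_SECRET)
item = ddb.get_table('my-table').get_item(hash_key='some-id')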

Friday, March 23, 2012

Another of my small personal pet projects is up: RDZR

About three days ago I thought up a very simple use case to tackle as a personal project: a URL "minifying"/"shrinking" service. It so happened that my head was still fresh from web2py, which I had been using for a work-related project. So, I took on the task of implementing it as fast as I could -- given the natural time constraints that come with being a father of two rambunctious kids and having a full-time job :)

With the framework question figured out, I moved on to the next one: where to host it? AWS was a natural choice given that I've been using their services for about a year and a half with very few problems. However, I thought I'd give GAE (Google App Engine) a try, given that web2py has built-in GAE deployment (including a database layer for their data store). But, as it turns out, GAE doesn't offer what are, in my opinion, very basic options (like providing/renting IP addresses and assigning them to an app so that one can have DNS A-records pointing to it), and it lacks SSL for custom domains. So, after bumping my head against their wall, I went with AWS -- and yes, it was as easy as pie to set up: fire up the instance, associate an Elastic IP, point Route 53 to that address and presto!

With that out of the way, I tackled the code. After, say, a couple of hours I was done with it. About forty lines of Python (although I'm sure hard-core Pythonistas could compress it further through conventions and idioms) and a few tweaks to web2py's routing were all it took to have the whole service completed -- with error handling and all that jazz. Then came my least favorite part of the process: dealing with the presentation layer (i.e. HTML/CSS). Even though I'm using most of web2py's default layout and CSS, it was still rather annoying adding my own dash of HTML/CSS to it. But that's done.

So, without fanfare, I present to you RDZR: Yet Another URL ReDuZR. It can be reached through http://rdzr.co (or, if you prefer, its SSL flavor: https://rdzr.co).

Technologies used: Python, Web2Py, SQLite, Apache + mod_wsgi + mod_ssl
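
For the curious, the core of a web2py URL shortener like this really does fit in roughly the advertised line count. The snippet below is my own illustrative reconstruction under web2py's usual conventions, not RDZR's actual code; the table, field and function names are made up:
# models/db.py
db = DAL('sqlite://storage.sqlite')
db.define_table('shortlink',
                Field('slug', unique=True),
                Field('url', 'text'))

# controllers/default.py
import random, string

def index():
    # form to submit a long URL; on success, generate and store a random slug
    form = SQLFORM.factory(Field('url', requires=IS_URL()))
    if form.process().accepted:
        slug = ''.join(random.choice(string.ascii_lowercase + string.digits)
                       for _ in range(6))
        db.shortlink.insert(slug=slug, url=form.vars.url)
        return dict(short=URL('default', 'r', args=slug, scheme=True, host=True))
    return dict(form=form)

def r():
    # look up the slug and redirect to the original URL
    row = db.shortlink(slug=request.args(0)) if request.args else None
    if not row:
        raise HTTP(404)
    redirect(row.url)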

Saturday, January 21, 2012

AWS surprises with DynamoDB

While none of the features I was hoping for were announced (and I do hope they address those soon), they did come through with a very nifty announcement: DynamoDB, which so far has made quite the splash in the cloud developer community. Most AWS API libraries on GitHub and IRC rooms have been lit up with activity for the last couple of days. I jumped in and helped the boto folks out a wee bit with releasing their DynamoDB support (in a very short amount of time). Using a preliminary release of boto, I set out to test DynamoDB's performance (especially against SimpleDB). It simply blows SimpleDB out of the water. Read/write query times were in the 5-6 ms region, while deletes were a bit over 10 ms on average. From our office (which includes round-trip network latency), numbers were about 60 ms for reads/writes and about 70 ms for deletes; this was with 10/10 provisioned throughput, which is almost completely covered under their free usage tier. Overall, I'm fairly impressed with DynamoDB and I think it'll make AWS buckets of money. Redis and Mongo (although they have completely different object models) are somewhat infamous for not being particularly performant when running on EBS and EC2, and many were looking for alternatives. DynamoDB offers that alternative and, further, provides scaling and replication out of the box.
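
For a rough idea of what my timing checks looked like, here is a minimal sketch using boto's new DynamoDB support (the table name, schema and item below are illustrative, not the actual benchmark code):
import time
import boto

AWS_KEY = '[YOUR_AWS_KEY_ID_HERE]'
AWS_SECRET = '[YOUR_AWS_SECRET_KEY_HERE]'

conn = boto.connect_dynamodb(AWS_KEY, AWS_SECRET)
table = conn.get_table('perf-test')  # created beforehand with 10 read / 10 write units

# time a single write
start = time.time()
item = table.new_item(hash_key='user-42', attrs={'score': 100})
item.put()
print 'write: %.1f ms' % ((time.time() - start) * 1000)

# time a single read
start = time.time()
table.get_item(hash_key='user-42')
print 'read: %.1f ms' % ((time.time() - start) * 1000)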

I will post a tutorial and perhaps more detailed numbers in coming days.

Tuesday, January 17, 2012

Things I'm hoping to hear in tomorrow's AWS Cloud Event

AWS is making big noise about tomorrow's "AWS Cloud Event" webcast. They are marketing it as "Find out what's next in the AWS Cloud". So, marketing and hype aside, here's a short list of things I'd like to hear them announce:

  • Rails/.Net/Python BeanStalk containers
    • They need these if they want to compete with Google App Engine and other PaaS providers

  • Multi-Region services
    • So far all their services are region-specific, which has its advantages but also adds a lot of complexity to resource management code. It would be nice, for example, if they allowed a VPC to span two or more regions while each of its subnets remained region- and AZ-specific

  • Ability to launch RDS instances inside a VPC
    • RDS sounds like a promising service to use, but the inability to launch instances inside a VPC makes it a deal breaker for many

  • MS SQL Server as an option for RDS
    • Again, RDS sounds like a promising service, but there are many applications and customers that run on a SQL Server backend database. Having this option would add a strong selling point to RDS

  • Region-to-region direct connection
    • Even if they don't announce any multi-region service support, having a fat region-to-region pipe (with no internet hops) would be huge. It would make disaster recovery replication and redundancy much simpler and faster

While AWS currently has a dominant position in the "Cloud Computing" arena (and much deserved, btw), the features suggested above would make it even more of a no-brainer for developers to use and for managers to appreciate.

Monday, December 19, 2011

Setting up Queue-size-based auto scaling groups in AWS

One of AWS' coolest features is the ability to scale in and out according to custom criteria. It can be based on machine load, number of requests, and so forth. For the sake of this tutorial, we are going to focus on queue-size-based auto scaling: once we have a certain number of messages in a queue, an alarm will go off and trigger the auto scaling policy; however, this feature can also be used in conjunction with some sort of load balancing mechanism. In the AWS ecosystem, we can accomplish this using some of their off-the-shelf services, namely SQS, CloudWatch, and AutoScaling. As of this writing, there is no console/UI access to AutoScaling groups or any of their underlying pieces, so we are going to use boto, a great and easy-to-use Python-based AWS API library.

For all the steps that follow, please remember that, as a general rule, AWS services are region-specific. That is to say, these resources are only visible and usable within the region in which they were created. Also important to bear in mind is that each of these services has a cost associated with it as per AWS pricing (see each of the product pages above for pricing details).

First, let's set up the queue we're going to post messages to, receive messages from, and base our auto scaling on. To do this, log into your AWS management console, head to the SQS tab and click on Create New Queue. A modal window will pop up like the one shown below:

[Screenshot: the SQS Create New Queue dialog]

A bit on the parameters of the queue creation:

Default Visibility Timeout: The number of seconds (up to 12 hours) a message will remain invisible to other consumers once it has been received.

Message Retention Period: The time (up to 14 days) after which a message will be automatically deleted if it hasn't been deleted by the receiver(s).

Maximum Message Size: The maximum message size (in KB, up to 64).

Delay Delivery: The time (in seconds) that the queue will hold a message before it is delivered for the first time.
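
As an aside, if you prefer to create the queue in code rather than through the console, a boto equivalent would look roughly like this (the queue name and settings are just examples):
import boto.sqs
from boto.sqs.message import Message

AWS_KEY = '[YOUR_AWS_KEY_ID_HERE]'
AWS_SECRET = '[YOUR_AWS_SECRET_KEY_HERE]'

# connect to SQS in the region we'll be working in
sqs = boto.sqs.connect_to_region('us-west-2', aws_access_key_id=AWS_KEY,
                                 aws_secret_access_key=AWS_SECRET)
queue = sqs.create_queue('scaling-demo-queue', visibility_timeout=30)

# post a quick test message
queue.write(Message(body='hello from boto'))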

Our next steps will include setting up the auto scaling group (and all the underlying services) and then setting up CloudWatch to handle the monitoring and issuing of alarms.

So, assuming you have Python and boto already installed, we're going to create a script to do the heavy lifting for us. The way auto scaling works in boto is as follows: first you need a Launch Configuration (LC). A Launch Configuration, as its name states, is metadata about what you want to launch every time the alarm is triggered (i.e. which AMI, security groups, kernel, userdata and so forth). Then you need an Auto Scaling Group (ASG). ASGs are the imaginary "containers" for your auto scaling instances and contain information about Availability Zones (AZs), LCs and group size parameters. Then, in order to actually do the scaling, you'll need at least one Scaling Policy (SP). SPs describe the desired scaling behavior of a group when certain criteria are met or an alarm is set off. The last piece of the puzzle is a CloudWatch alarm, which I will address later.

So, back to our script. First, import the necessary modules:
from boto.ec2.autoscale import AutoScaleConnection, LaunchConfiguration, AutoScalingGroup
from boto.ec2.regioninfo import RegionInfo
from boto.ec2.autoscale.policy import AdjustmentType, MetricCollectionTypes, ScalingPolicy

As an aside, while boto lets you set your AWS credentials in a boto config file, I like having the credentials within the scripts themselves to make things more direct and explicit; feel free to use the boto config if that's your preference.

The first thing we need to do is establish an auto scaling connection to our region of choice -- in this example, the Oregon region (aka us-west-2). To do so:
AWS_KEY = '[YOUR_AWS_KEY_ID_HERE]'
AWS_SECRET = '[YOUR_AWS_SECRET_KEY_HERE]'

reg = RegionInfo(name='us-west-2', endpoint='autoscaling.us-west-2.amazonaws.com')
conn = AutoScaleConnection(AWS_KEY, AWS_SECRET, region=reg, debug=0)

We then need to create the LC. In the code below I added many parameters for the sake of illustration, but not all of them are required by either AWS or boto. I believe that the only required fields are name and image_id. Bear in mind, though, that if you choose to use these optional parameters, they need to be accurate, or else you'll get an error from the create launch configuration API request.
lc = LaunchConfiguration(name="LC-name", image_id="ami-12345678",
                         instance_type="m1.large", key_name="Your-Key-Pair-Name",
                         security_groups=['sg-12345678', 'sg-87654321'])
conn.create_launch_configuration(lc)

The next step is to set up the ASG. Choose your min and max size carefully, especially if your scenario scales based on a queue that can be directly or indirectly DDoS-attacked. While you wouldn't want your site to be unresponsive to your customers, you also wouldn't want would-be attackers to scale you up into a very hefty bill. So, as a good practice, set an upper bound on your scaling groups.
ag = AutoScalingGroup(group_name="your-sg-name",
                      availability_zones=['us-west-2a', 'us-west-2b'],
                      launch_config=conn.get_all_launch_configurations(names=['LC-name'])[0],
                      min_size=0, max_size=10)
conn.create_auto_scaling_group(ag)

We are almost done with the auto scaling setup; however, without a way to trigger scaling, all is for naught. To this end, AWS lets you set different scaling criteria in the form of Scaling Policies (SPs). Any self-respecting auto scaling scheme has some sort of symmetry; that is to say, for every scale-up there's a scale-down. If you don't have a scale-down, chances are you won't be entirely happy with your monthly bill, and you'll be wasting resources/capacity. The way we set up the SPs with boto is as follows:
sp_up = ScalingPolicy(name='AS-UPSCALE', adjustment_type='ChangeInCapacity',
                      as_name='your-sg-name', scaling_adjustment=1, cooldown=30)
conn.create_scaling_policy(sp_up)

sp_down = ScalingPolicy(name='AS-DOWNSCALE', adjustment_type='ChangeInCapacity',
                        as_name='your-sg-name', scaling_adjustment=-1, cooldown=30)
conn.create_scaling_policy(sp_down)

Before I continue, I will say that the whole topic of SPs is, as of this writing, sparsely covered in AWS' documentation. I found some general information, but nothing at the level of detail most people want when trying to understand SPs and their nuances.

Alright, if everything thus far has gone according to plan, we should be ready to move on to the next step. For this part, we will use the AWS management console. We could, of course, do it via the API, but I like to use the console whenever possible. So, log into the management console, click on the CloudWatch tab and make sure you are working in the right region.

On the left navigation bar, click on Alarms, then click on Create Alarm. A Create Alarm Wizard modal window will pop up. In the search field, next to the All Metrics dropdown, type "SQS". This will bring up the metrics associated with the queue we built at the beginning of this tutorial. For the sake of this exercise, click on NumberOfMessagesReceived (though you are welcome to try other options/metrics if you wish). After selecting the row, click Continue. Give the alarm a name and description, and in the threshold section set it to ">= 10 for 5 minutes".

In the next step of the wizard, we are going to configure the actions to take once this criterion has been met. Set the "When Alarm State is" column to ALARM, set the "Take action" column to Auto Scaling Policy, and set the "Action details" to the scaling group we just created. A new dropdown menu will appear where you choose which policy to apply (see this screenshot -- sorry, but the image was too wide for this blog layout). This will be our up-scale policy.

To set up the down-scale step, on the last column of the configure actions step of the Create Alarm Wizard, click on "ADD ACTION". In this new row, select "OK" from the "When Alarm State is" dropdown menu; then, just as above, select "Auto Scaling Policy" from the "Take action" column dropdown menu, select your AS group in the "Action details" dropdown, and select your down-scale policy from the policy dropdown. Click Continue. In the next step, check that your metrics, alarms and actions are correct. Finally, click on Create Alarm.
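
If you'd rather wire the alarm up from the script instead of the console, a boto sketch might look roughly like the following. The alarm name, threshold and queue dimension are my own illustrative choices, and it reuses the AutoScaleConnection conn from earlier to look up the policy ARNs:
import boto.ec2.cloudwatch
from boto.ec2.cloudwatch import MetricAlarm

cw = boto.ec2.cloudwatch.connect_to_region('us-west-2', aws_access_key_id=AWS_KEY,
                                           aws_secret_access_key=AWS_SECRET)

# look up the ARNs of the scaling policies created above
scale_up = conn.get_all_policies(as_group='your-sg-name', policy_names=['AS-UPSCALE'])[0]
scale_down = conn.get_all_policies(as_group='your-sg-name', policy_names=['AS-DOWNSCALE'])[0]

# alarm on >= 10 messages received over a 5-minute period
alarm = MetricAlarm(name='queue-depth-alarm', namespace='AWS/SQS',
                    metric='NumberOfMessagesReceived', statistic='Sum',
                    comparison='>=', threshold=10, period=300, evaluation_periods=1,
                    dimensions={'QueueName': 'scaling-demo-queue'},
                    alarm_actions=[scale_up.policy_arn],   # ALARM state -> scale up
                    ok_actions=[scale_down.policy_arn])    # OK state -> scale back down
cw.create_alarm(alarm)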

Now that we are completely done setting up the auto scaling, you might want to test it. The easiest way would be to send a couple hundred messages to the queue via the API/boto and watch it scale up, then delete the messages and watch it scale down; a deeper walkthrough of that is something I might address in a later post.
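
For reference, flooding the queue from boto is just a short loop (this reuses the example queue created earlier; the message bodies don't matter for the alarm):
import boto.sqs
from boto.sqs.message import Message

sqs = boto.sqs.connect_to_region('us-west-2', aws_access_key_id=AWS_KEY,
                                 aws_secret_access_key=AWS_SECRET)
queue = sqs.get_queue('scaling-demo-queue')

for i in range(200):
    queue.write(Message(body='test message %d' % i))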

Hope this tutorial was of help and easy to follow. For comments and suggestions, ping me via Twitter @WallOfFire





Wednesday, September 28, 2011

DIY Basic AWS EC2 Dashboard using Apache, Python, Flask and boto (Part II)

In part I of this tutorial we covered the basic stack setup as well as basic boto/Flask usage. In this part, I'll show how to handle POSTs, request instance details and work with dynamic URLs.

So, presumably, you already have your index page working as your EC2 instance dashboard. Building on that, let's say that you want to see the details of any of the instances on display. As per my earlier explanation of the idiosyncrasies of AWS' API, in order to see such details you need two basic pieces of information: the region and the instance id. The way we are going to pass this information from the dashboard page to Flask is through URLs. Flask has pretty neat ad-hoc URL routing with replaceable variables that you can then use in your code. So, in this case, in the dashboard page we are going to dynamically generate a link that contains both the region end-point and the specific instance id. If you look closely at the index.html template, the link looks as follows:
<a href="/details/{{region['Name']}}/{{instance['Id']}}">See Details</a>

So, now we need to tell Flask to "listen" to that URL pattern. To do so, add the following line to your [WEB_APP_NAME].py file (please bear in mind that this is for the sake of illustration only, so I'm putting code style aside):
@app.route("/details/<region_nm>/<instance_id>")

This tells Flask to match incoming requests against that URL pattern and bind the incoming parameters to the variable names inside the angle brackets "<" ">". Right below that line, declare the function you are going to use, explicitly declaring the parameters you expect:
def details(region_nm=None, instance_id=None):

That is it insofar as Flask is concerned. Now that we, presumably, have all the data we need, we can leverage boto to do the "hard work" for us. As seen in part I of this tutorial, whenever we need to issue calls to the AWS API, the first thing to do is open a connection to whatever region we are interested in. So, we go ahead and do so:
rconn = boto.ec2.connect_to_region(region_nm, aws_access_key_id=akeyid, aws_secret_access_key=seckey)

With that active regional connection, we then query the API for the details of a particular instance (in this case, using the instance_id passed in):
instance = rconn.get_all_instances([instance_id])[0].instances[0]

The API call above takes a list of instance ids (in our case we are only interested in one) and returns not the instances themselves but a list of their parent reservations; that is what the first [0] is for. We then index into that reservation's instances collection, which contains the single instance object we requested. Once we have the instance object, it should be clear that you can extract any information you want or need from it (type, tags, state, etc.).
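
Putting those pieces together, the whole details view might look roughly like this (the template name, credential variables and app setup are placeholders for whatever you already have in your [WEB_APP_NAME].py):
import boto.ec2
from flask import Flask, render_template

app = Flask(__name__)
akeyid = '[YOUR_AWS_KEY_ID_HERE]'
seckey = '[YOUR_AWS_SECRET_KEY_HERE]'

@app.route("/details/<region_nm>/<instance_id>")
def details(region_nm=None, instance_id=None):
    # connect to the region the dashboard link pointed us at
    rconn = boto.ec2.connect_to_region(region_nm, aws_access_key_id=akeyid,
                                       aws_secret_access_key=seckey)
    # get_all_instances returns reservations; unwrap the single instance we asked for
    instance = rconn.get_all_instances([instance_id])[0].instances[0]
    return render_template('details.html', instance=instance, region_nm=region_nm)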

Now, what if we wanted to make changes to any of the instances, or wanted to, say, start/stop any of them? It's actually not unlike what we've been doing thus far: we tell Flask's routing what to look for, then use that machinery to get the data we want. For this step, as a matter of generally accepted good practice, we are going to send the information by submitting a form through a POST request. Our route should look something like this:
@app.route("/change", methods=['POST'])

We can now define our function to handle the request:
def change():

Within this method we can now leverage Flask's request object to get the form data, for instance:
instance_id = request.form['instance_id']
state_toggle = request.form['sstoggle']
type_change = request.form['type_change']

And finally, to make changes to an instance, all you need to do is modify the instance's attribute key-value pairs. Let's say we wanted to change the instance type; to do so, we simply change the value via boto's modify_attribute method, as shown below:
instance.modify_attribute('instanceType', 'm1.large')

One thing to bear in mind is that, regrettably, the AWS API does not provide a list of valid types for each instance. So, if you are dealing with a mix of 32- and 64-bit machines, it is possible to assign an instance a type it is not compatible with, so you must be mindful of that. Since the API provides no such list, you will also need to hard-code the instance types yourself. A reference to the official instance type list can be found here (pay attention to the API name field).

To change the instance state, however, you do not use the attributes collection directly. Instead, use the API calls provided by boto to start/stop/terminate:

instance.stop()
instance.start()
...
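
Tying the POST handling together, a complete change() handler might look roughly like this (it reuses the imports and credentials from the details sketch above; the extra 'region' form field and the redirect target are my own additions for the sake of a self-contained example):
from flask import request, redirect

@app.route("/change", methods=['POST'])
def change():
    # hypothetical 'region' field so we know which endpoint to talk to
    region_nm = request.form['region']
    instance_id = request.form['instance_id']
    state_toggle = request.form['sstoggle']
    type_change = request.form.get('type_change')

    rconn = boto.ec2.connect_to_region(region_nm, aws_access_key_id=akeyid,
                                       aws_secret_access_key=seckey)
    instance = rconn.get_all_instances([instance_id])[0].instances[0]

    if type_change:
        # the API won't stop you from picking an incompatible type, so validate first
        instance.modify_attribute('instanceType', type_change)
    if state_toggle == 'stop':
        instance.stop()
    elif state_toggle == 'start':
        instance.start()

    return redirect('/details/%s/%s' % (region_nm, instance_id))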


Hope this tutorial was of help. If you have questions or comments, feel free to ping me on Twitter @WallOfFire.