As we understand it from the discussion on stage, a Think Cloud is a “body of knowledge”: a real-time information base of the Amazon cloud that can be pivoted all the way down to individual threads and data concurrency. It would be an index that acts as a control point, helping to define the movement of data through servers and compute tasks. It looks at the journey from the data’s point of view, including data about the environment itself, how the environment repairs itself when damaged, and how it keeps data concurrency intact.
Graham identifies four forces: 1. The Internet’s open platform fosters innovation at hacker speeds instead of big company speeds. 2. Moore’s Law worked its magic on Internet bandwidth. 3. Piracy taught a new generation of users it’s more convenient to watch shows on a computer screen. 4. Social applications made everybody from grandmas to 14-year-old girls want computers — in a three-word-nutshell, Facebook killed TV.
NYTimes has a good blog post about the nation’s new CIO and his desire to embrace cloud computing:
Mr. Kundra also said that he would push the government to embrace cloud computing — having work done on large servers rather than on desktop PCs. He acknowledged that there are privacy and security issues with some cloud-computing efforts, particularly when the computers are not all operated by the government. But he said that should not stop the government from taking advantage of the speed and efficiency such systems offer.
SC has a good write up on cloud computing security:
Cloud computing, at least as a concept, is being driven largely by economics. It is generally less costly to run applications, add capacity and increase storage in the cloud, rather than investing in new hardware and software, and bringing on additional staff and beefing up networking.
“Cloud computing will happen because it has too much of an economic incentive and developer support – applications can be quickly added and developers can have a single place to maintain source code,” says Vatsal Sonecha, VP, business development & product management at TriCipher.
Overall, incentives include application-deployment speed, lower costs and fast prototyping. These are strong drivers. So much so that Gartner predicts that by 2012, 80 percent of Fortune 1000 companies will pay for some cloud computing service, and 30 percent of them will pay for a cloud computing infrastructure.
That is not to say that entire data centers will be moving to the cloud, at least in the largest companies. But for certain solutions, the cost benefits are hard to ignore.
I wanted to touch briefly on the security concerns of having Scalr accessible via the Internet. If you are running your own install of Scalr, this is an important factor to consider before even adding the first farm. For my own sake I will not be getting into my exact setup, but will instead talk about a few approaches to locking down access to Scalr.
Possibly the best approach is to limit access to the Scalr interface to the internal network, requiring users to connect through OpenVPN or some other VPN solution to reach internal resources, Scalr included. If you are hosting Scalr on an AWS instance, be sure to set the security group to allow only the port your VPN is running on. You can find a quick and dirty howto for OpenVPN on an EC2 instance at Google Books.
Another option is to use SSL and mod_access (Apache 1.3), or its renamed equivalent in Apache 2.2, mod_authz_host, to limit who has access to the Scalr interface. At the very least you should use SSL to access Scalr. You can also add a layer of authentication for good measure using Apache Basic Authentication.
Because Scalr controls the rest of your AWS setup, it is by far the one thing you want to lock down as much as possible.
I wanted to touch again on the use of Subversion (SVN) to populate the /var/www of app servers on Scalr. Basically the issue is how to get your web content onto a new instance once Scalr has automagically launched it due to high load. Scalr will launch another app role instance once a server reaches a load threshold you have previously set. So the instance gets started for me, but once it has launched, its /var/www still needs to be populated before that server can serve content behind the load balancer.
This is where SVN and Scalr Scripting come into play. I keep all my site content in an SVN repo, and I link to whatever production tag I want to be live at that time. To get the directory populated, I add a simple bash script that does an svn checkout of that tag, and attach it to the “OnHostUp” option. This way, once the server sends its SNMP trap saying it is up, the script will be executed. This is also a helpful means of updating your servers to a newer build. I DO NOT check out the tag directly into /var/www. Instead, /var/www is a symlink into /var/svn, where the tags are checked out. So when it is time to roll out a new production tag, I simply check out the new tag to /var/svn and redo the symlink to point at the new tag. This way, if there is an issue that was not foreseen in the QA process, I can roll back to a known good tag by redoing the symlink. This is an easy but very effective way of using Scalr scripts and SVN to manage content loading on servers.
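The checkout-then-symlink rollout described above can be sketched roughly as follows. This is only an illustration, not the actual OnHostUp script (which is bash): the function names, the `repo_base/tags/<tag>` URL layout, and the injectable `checkout` parameter are all assumptions made for the example. It also assumes the live path (/var/www in the text) is a symlink or does not exist yet, since a real directory cannot be replaced this way.

```python
import os
import subprocess
from pathlib import Path


def svn_checkout(tag_url: str, dest: Path) -> None:
    """Check out one tag using the svn command-line client."""
    subprocess.run(["svn", "checkout", "--quiet", tag_url, str(dest)], check=True)


def point_symlink(link: Path, target: Path) -> None:
    """Repoint `link` at `target` by renaming a temporary symlink over it."""
    tmp = link.with_name(link.name + ".tmp")
    if tmp.is_symlink() or tmp.exists():
        tmp.unlink()  # clear a leftover from an interrupted run
    os.symlink(target, tmp)
    os.replace(tmp, link)  # rename swaps the link in a single step on POSIX


def deploy(repo_base, tag, checkout_root, live_link, checkout=svn_checkout):
    """Check out `tag` under `checkout_root` if missing, then make it live.

    Assumes tags live at <repo_base>/tags/<tag> (hypothetical layout).
    """
    dest = Path(checkout_root) / tag
    if not dest.exists():
        checkout(f"{repo_base}/tags/{tag}", dest)
    point_symlink(Path(live_link), dest)
    return dest
```

Rolling back is then just another `deploy` call with the previous tag: the old checkout is still on disk, so only the symlink changes.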
Since I started using Scalr to manage my Amazon Web Services farms, I have been wanting more monitoring in terms of statistical information on services, traffic, disk usage, and uptime, to name a few. Scalr has a built-in means of basic event notification (host up, host down, etc.) and provides very basic load statistics via RRDtool. In the past I have used Zabbix for most projects I have worked on, so I wanted to be able to use it with Scalr. I am still testing the setup I am going to speak of, so please keep that in mind. This is NOT a howto, but more of a brainstorm on how I plan to get Zabbix integrated into my Scalr setup. The Zabbix documentation (PDF) covers a few ways to use auto-discovery (page 173). For example, you can have Zabbix monitor a block of IPs to find new Zabbix Agents running. So here is what I will have my Zabbix Server do:
- Look for new Zabbix Agents on my AWS internal IP range.
- If the system.uname contains “Scalr”, add the host to the Scalr server group
- Server must be up for 30+ minutes
There will be other stipulations for getting a server added to Zabbix. I will have system templates for each of my Scalr AMI roles. Once a server is added to Zabbix, it will be placed in its respective group and monitored for the items and triggers listed in the system template. There will also be a rule to remove old instances from Zabbix 24 hours after the host down trigger is received. This way I will not have a bunch of old instances that were once monitored still cluttering the Zabbix database. If you happen to also have Windows AWS instances, you can add a rule to monitor these as well. The AMI just needs to have the Zabbix Windows Agent installed.
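To make the add and remove conditions concrete, here is a small sketch of the decision logic only. It is not Zabbix configuration and does not call any Zabbix API; in Zabbix itself these conditions would be set up as discovery rules and actions. The `DiscoveredHost` type and both function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

MIN_UPTIME = timedelta(minutes=30)        # "Server must be up for 30+ minutes"
REMOVE_AFTER_DOWN = timedelta(hours=24)   # drop hosts 24 hours after host down


@dataclass
class DiscoveredHost:
    ip: str
    uname: str                            # value reported for system.uname
    uptime: timedelta
    down_since: Optional[datetime] = None # when the host down trigger fired


def should_add(host: DiscoveredHost) -> bool:
    """Add only Scalr-built hosts that have been up long enough."""
    return "Scalr" in host.uname and host.uptime >= MIN_UPTIME


def should_remove(host: DiscoveredHost, now: datetime) -> bool:
    """Remove a host once it has been down for the full retention window."""
    return host.down_since is not None and now - host.down_since >= REMOVE_AFTER_DOWN
```

The 30-minute wait presumably keeps short-lived instances, launched and terminated during a brief load spike, from ever being registered.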