Archive for Linux
Apache mod_proxy_balancer Self Registration : Part 3
I’ll start off by going over the basic high level architecture for my self registration procedure:
There is a register.php script residing on the load balancer, accessible via HTTP.
There is a deregister.php script residing on the load balancer, accessible via HTTP.
There is a register_with_lb.pl script residing on the web server, in /usr/local/bin/.
There is a deregister_with_lb.pl script residing on the web server, in /usr/local/bin/.
There is a MySQL database that stores the current configuration state; it has two stored procedures, register_lb and deregister_lb.
register.php
No changes were made to register.php as described in this post, though I’m considering some alterations to increase its security.
deregister.php
The biggest difference between register.php and deregister.php (aside from their purpose) is where the insert/delete database code is called from and why. By the time the web server calls register.php, the web server will have already inserted information about itself into the database, including its hash. I made the decision that I did not want the load balancer responsible for inserting servers into the database. It merely checks that the requesting server inserted itself, and then regenerates balancer_members.conf.
In the case of deregister.php, I decided I wanted the server making the call to still be in the database so the script could verify its identity before removing it and regenerating balancer_members.conf. And since the deregistration SQL is contained within a stored procedure, I needed to make some changes to the script (as compared to register.php) regarding the database.
Specifically, the standard mysql library cannot call stored procedures. So I had to convert it to using mysqli, which is a similar, though more OO approach. The portion of the code that regenerates the balancer_members.conf is similar enough that I won’t re-list it here, but I will show how to connect using mysqli, and how to call a stored procedure.
$mysqli = new mysqli($dbhost, $dbuser, $dbpass, $dbname);
// Stop here if the connection failed; nothing below can work without it.
if (mysqli_connect_errno()) {
    printf("Connect failed: %s\n", mysqli_connect_error());
    exit();
}
// Only servers that are still in the database are allowed to deregister.
$query = "SELECT count(*) as count FROM " . $dbtable . " WHERE ip='" . $_SERVER['REMOTE_ADDR'] . "';";
$result = $mysqli->query($query);
$row = $result->fetch_row();
echo $row[0];
if ($row[0] >= 1) {
    // Remove the calling server through the stored procedure.
    $del_query = "call deregister_lb('" . $_SERVER['REMOTE_ADDR'] . "');";
    $del_result = $mysqli->query($del_query);
    //<code for regenerating the conf file removed here>
    echo exec('echo "' . $file . '" > /etc/httpd/conf.d/balancer_members.conf');
    echo exec("sudo /usr/local/bin/reload_httpd");
}
As you can see, I’m using the actual REMOTE_ADDR to determine the validity of the request.
(de)register_lb.sql Stored Procedure
Here is the code for the deregister_lb stored procedure:
DROP PROCEDURE IF EXISTS deregister_lb $$
CREATE PROCEDURE deregister_lb ( _ip VARCHAR(100) )
BEGIN
DELETE FROM lb2_members
WHERE ip=_ip;
END $$
and also for the register_lb stored procedure:
DROP PROCEDURE IF EXISTS register_lb $$
CREATE PROCEDURE register_lb (
_hostname VARCHAR(100),
_ip VARCHAR(40),
_loadfactor INT,
_hash VARCHAR(100)
)
BEGIN
DECLARE already_exists INT DEFAULT 0;
SELECT count(*) INTO already_exists FROM lb2_members WHERE hash=_hash;
IF already_exists=1 THEN
UPDATE lb2_members
SET hostname=_hostname, ip=_ip, loadfactor=_loadfactor
WHERE hash=_hash;
ELSE
INSERT INTO lb2_members (ip, hostname, loadfactor, hash)
VALUES (_ip, _hostname, _loadfactor, _hash);
END IF;
END $$
Note that I’ve omitted the code that changes the delimiter to $$ instead of a semicolon.
register_with_lb.pl
This Perl script uses the Perl DBI module for accessing the database. I had to get that installed on my web server since it wasn’t already. Normally you can install Perl packages using the cpan command, in which case you would issue the following commands to install DBI and a MySQL driver for it:
cpan DBI
cpan DBD::mysql
If it’s the first time you’ve run cpan, you will need to go through some configuration. It’s pretty much self explanatory, and I just accepted all of the defaults. Everything installed correctly except for the MySQL driver, which I ended up having to install from source. If I had executed the command:
yum install mysql-devel.i386
first, then my cpan install of DBD::mysql might have worked, but I didn’t realize that until installing from source. In case you ever need to install a perl module from source, particularly the DBD::mysql driver, enter these commands (which I think is basically what cpan does):
yum install mysql-devel.i386   # (only required in this particular instance)
wget http://www.cpan.org/modules/by-module/DBD/DBD-mysql-4.011.tar.gz
gzip -cd DBD-mysql-4.011.tar.gz | tar xf -
cd DBD-mysql-4.011             # (or whatever version you downloaded)
perl Makefile.PL
make
make install
Here is how you connect to the database and call a stored procedure:
use DBI;
my $dsn = "DBI:mysql:host=mysql.host;database=lb_register";
my $dbh = DBI->connect($dsn, "lbuser", "lbpasswd")
    or die "Cannot connect to MySQL server\n";
# Placeholders let DBI quote the values instead of splicing them into the SQL by hand.
my $sql = "call register_lb(?, ?, ?, ?)";
$dbh->do($sql, undef, $localhost, $localip, $loadfactor, $hash);
$dbh->disconnect();
After that, register_with_lb.pl opens a socket to the load balancer and makes an HTTP request over the socket. There are probably easier ways to do this; I just happened to have the socket code lying around and was glad to be able to reuse it. Here’s the gist of it, in case you’re interested:
use URI;
use IO::Socket::INET;
# HTTP line ending, and the blank line that ends the request headers
my $EOL = "\015\012";
my $EOR = $EOL x 2;
# Parse the URI.
my $url = URI->new("http://load.balancer.com/register/register.php?hash=" . $hash);
# Pull the pieces of the request out of the URL
my $host = $url->host;
my $port = $url->port;
my $resource = $url->path;
my $query = $url->query;
# Initialize the socket
my $socket = IO::Socket::INET->new(Proto => "tcp", PeerAddr => $host, PeerPort => $port);
unless ($socket) { die "Error connecting to $host" }
$socket->autoflush(1);
# Format the request; "Connection: close" makes the server hang up after
# responding, so the read loop below terminates
my $request = "GET " . $resource . (($query) ? "?" . $query : "") . " HTTP/1.1" . $EOL
    . "Host: " . $host . $EOL
    . "User-agent: register_script" . $EOL
    . "Connection: close" . $EOR;
# Use send() to make the request, and output the response.
# Not necessary in this example, but informational.
if ( $socket->send($request) ) {
    while ( <$socket> ) { print }
}
# Close the socket
close $socket;
The above code pretty much sums up deregister_with_lb.pl. Since no database calls are made, it simply makes a call to the deregister script. The line you would change is as follows:
my $url = URI->new("http://my.balancer.com/register/deregister.php");
Then make the files executable, and copy them to be used by the startup script described in the previous post:
chmod a+x register_with_lb.pl
chmod a+x deregister_with_lb.pl
cp register_with_lb.pl /usr/local/bin/
cp deregister_with_lb.pl /usr/local/bin/
I don’t show it here, but right now my IP addresses are hard coded. There are a number of ways you can find out your actual IP address from within Perl; I’m just not doing that right now.
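As for the "easier ways" to make the HTTP request mentioned above, one option is to let curl do the work. This is only a sketch, not what my scripts actually do: the function names are made up for illustration, the URL is the one from the Perl example, and it assumes the hash only contains URL-safe characters (hex digits, for instance).

```shell
#!/bin/sh
# Hypothetical helper: build the register URL for a given hash.
# Host name and path are the ones used in the Perl example.
register_url() {
    printf 'http://load.balancer.com/register/register.php?hash=%s\n' "$1"
}

# Hypothetical helper: request the URL with curl.
#   --fail        exit non-zero on HTTP errors (4xx/5xx)
#   --silent      hide the progress meter
#   --show-error  still print an error message on failure
#   --max-time    give up after this many seconds
register_with_lb() {
    curl --fail --silent --show-error --max-time 10 "$(register_url "$1")"
}

# Example (not run here): register_with_lb "$hash"
```

The non-zero exit status from --fail would make it easy for a startup script to notice that registration did not go through.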
Securing the register scripts
As an additional security measure, I’ve restricted access to the /register/ location on the load balancer to the IP address range I expect my web servers to be from, like this:
<Location /register>
    Order Deny,Allow
    Deny from all
    Allow from 10.0.0.
</Location>
And now you have a web server that can register automatically (if you’ve gone through the previous two posts as well) with a mod_proxy_balancer load balancer.
Update
I did some searching around to find a way to determine your IP address from inside the perl script. This is a simple way if your server has a public IP address and reverse DNS set up correctly for that IP address:
use Socket;
use Sys::Hostname;
my $host = hostname();
my $addr = inet_ntoa(scalar gethostbyname($host || 'localhost'));
If your slave web servers are on a private network, the above command will return the loopback IP address (127.0.0.1) which isn’t useful for the load balancer (I wonder if it would start an infinite loop and crash the load balancer?). I found a function that prints out the IP address by parsing it out from the results of the ifconfig command.
It seemed a little long to just rip off and copy verbatim. So here’s a link to that code (which is what I’m using now) in case you’d like to use it. Perl script to get IP address.
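To give an idea of the ifconfig-parsing approach without copying that script, here is a rough shell sketch of the same idea. It is not the linked Perl code: the function name is hypothetical, and it assumes the address appears on a line starting with "inet", either as "inet addr:10.0.0.5" (older ifconfig output) or "inet 10.0.0.5" (newer output).

```shell
#!/bin/sh
# Hypothetical helper: read ifconfig-style output on stdin and print the
# first IPv4 address that is not a loopback (127.x.x.x) address.
first_non_loopback_ip() {
    awk '
        $1 == "inet" {
            ip = $2
            sub(/^addr:/, "", ip)   # older output prefixes the address with "addr:"
            if (!found && ip !~ /^127\./) {
                print ip
                found = 1
            }
        }
    '
}

# Example (not run here): ifconfig | first_non_loopback_ip
```

If the server has several interfaces, this simply takes the first one listed, which may or may not be the one the load balancer can reach.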
Apache mod_proxy_balancer Running Scripts at Startup/Shutdown : Part 2
I think I have to break the self-registration into two posts; it took a lot longer than I expected last night. This post deals with getting scripts to run at startup and shutdown on Linux. I did this on Fedora; I imagine the process would be similar on Ubuntu, etc.
This is actually the last thing I did, but I’m ordering it first because it’s a bit more general. My goal was to have a script that I could call as follows to register with my load balancer manually:
service lb_register start|stop|status|restart|reload|force-reload
And then also to register this service to be called at startup and shutdown, to register and deregister with the load balancer respectively.
Let’s start with creating the script. You can find tutorials for creating shell scripts just about anywhere on the web. I found it easier to take an existing script from the /etc/init.d/ directory, clear it out and start from that shell. Ironically the one that I chose was the startup script for Pound, another load balancer (a very good one, I might add). It shouldn’t matter which one you use, since all we’re really after is the format. The basic format is essentially the same for all of them: you have your function definitions at the top, and a case statement at the bottom.
The cases match the option parameters (start and stop are very common, as well as the others listed above in the callout). And the case blocks generally call one or more of the functions. I won’t list the case statement in this post as they are all very similar. There are some important things to note in the function definitions, though.
Here is the definition for my start() function:
start() {
    echo -n $"Registering with the load balancer: "
    /usr/local/bin/register_with_lb.pl
    touch /var/lock/subsys/lb_register
}
You can see that I call a Perl script (which presumably registers with the load balancer). Then I touch a file in the /var/lock/subsys/ directory. This is very important if you want a script to automatically run at shutdown. At shutdown the rc script will check for the presence of this file; if it is there, it will call stop(). If it is not there, it assumes the service is stopped already and will not call stop() on the service.
My stop function:
stop() {
    echo -n $"Deregistering with the load balancer: "
    /usr/local/bin/deregister_with_lb.pl
    rm -f /var/lock/subsys/lb_register
}
This calls the deregister perl script, and then removes the lock file for good housekeeping. There is one more thing about the script itself I want to mention before moving on to the registration. There is a line in the comments of most of the scripts (if not all) in /etc/init.d/ that will look something like this:
# chkconfig: - 85 10
or
# chkconfig: 2345 55 15
These are directives to chkconfig (the command we’ll be getting to in a moment) on how to set this script up. The first field, either a - or a group of numbers, deals with the run levels. It tells chkconfig in which run levels this script will be turned “on” by default; it is assumed to be “off” in the omitted ones, and a - means it is not turned on in any run level by default.
The second number is the startup priority; different scripts can have different startup priorities in case there are prerequisite dependencies. Lower numbers start first, so of the two examples above, the script containing the bottom directive (55) would start before the script containing the top one (85). The last number is the shutdown priority, which works the same way during the shutdown process.
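If it helps to see the three fields pulled apart, here is a tiny sketch that reads a chkconfig header line and labels each part. The function name is made up for illustration, and it assumes the header has the "# chkconfig: LEVELS START STOP" shape of the two examples above.

```shell
#!/bin/sh
# Hypothetical helper: label the fields of a "# chkconfig:" header line.
describe_chkconfig() {
    printf '%s\n' "$1" | awk '
        $1 == "#" && $2 == "chkconfig:" {
            # "-" means the service is not turned on in any run level by default
            levels = ($3 == "-") ? "none" : $3
            printf "runlevels=%s start=%s stop=%s\n", levels, $4, $5
        }'
}

describe_chkconfig '# chkconfig: - 85 10'
describe_chkconfig '# chkconfig: 2345 55 15'
```

The first call reports no default run levels with priorities 85 and 10; the second reports run levels 2345 with priorities 55 and 15.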
Once you have your script, you will need to run the chkconfig command. To add your script to the startup process:
chkconfig --add lb_register
And should you need to remove it:
chkconfig --del lb_register
If you’ve done everything correctly, register_with_lb.pl will be called at startup, and deregister_with_lb.pl will be called at shutdown. If they aren’t, check that you’re touching the lock file, and that your scripts are executable. You will also be able to make calls like this to register and deregister manually:
service lb_register start
service lb_register restart
service lb_register stop
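For anyone curious how those arguments get routed, the case statement I skipped earlier generally looks something like the sketch below. This is not copied from my script: start() and stop() are stand-ins for the real functions, the dispatch() wrapper is only there so the example can be shown on its own, and treating restart/reload/force-reload as "stop then start" and using the lock file for status are assumptions.

```shell
#!/bin/sh
# Stand-ins for the real start() and stop() functions.
start() { echo "start called"; }
stop()  { echo "stop called"; }

# Hypothetical wrapper around the case statement; a real init script
# usually runs the case directly on "$1".
dispatch() {
    case "$1" in
        start)
            start
            ;;
        stop)
            stop
            ;;
        restart|reload|force-reload)
            stop
            start
            ;;
        status)
            # Assumption: the lock file doubles as an "am I registered?" marker.
            if [ -f /var/lock/subsys/lb_register ]; then
                echo "lb_register is registered"
            else
                echo "lb_register is not registered"
            fi
            ;;
        *)
            echo "Usage: lb_register {start|stop|status|restart|reload|force-reload}"
            return 2
            ;;
    esac
}

# In an init script the last line would be something like: dispatch "$1"
```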
I should mention a few sites that helped me out quite a bit:
Linux Init Processes
Introduction to BASH Programming
An interesting script for creating new scripts
Apache mod_proxy_balancer Self Registration : Part 1
Load balancers are great, but they become even more powerful when servers have the ability to self-register when they come online, and deregister when they go offline. This is especially true with services such as EC2, when the size of the server group might grow or shrink in response to need. This is a tutorial describing my particular (partially insecure at the moment) solution for allowing self-registration with Apache’s mod_proxy_balancer. Specifically, this covers the load balancer side of the equation. Tomorrow I hope to get a post out describing the server side.
Here is my flowchart for how self registration will work:
1. Server comes online.
2. A startup script will register itself with the MySQL database (including hostname, ip, loadfactor, and a hash that it will generate in some way).
3. The server will then call a PHP script on the load balancer: “register/register.php”.
4. The PHP script will verify that a server sent the request.
5. The PHP script will query the database to get the current list of balancer members, and regenerate the balancer_members.conf file.
6. The PHP script will then issue a command to reload Apache’s configuration files.
Deregistration, which my PHP script as presented doesn’t handle yet, will work as follows:
1. Server sends its hash to the PHP script, and shuts down.
2. The PHP script will check the hash against the database.
3. The PHP script will remove the server from the database.
4. The PHP script will repeat steps 5 and 6 above.
First, set up the database and create a user with sufficient privileges.
CREATE DATABASE lb_register;
GRANT ALL ON lb_register.* TO 'lbuser'@'%' IDENTIFIED BY 'password';
USE lb_register;
CREATE TABLE lb2_members(
    ip VARCHAR(20) NOT NULL PRIMARY KEY,
    hostname VARCHAR(100) NOT NULL,
    loadfactor INT NOT NULL DEFAULT 0,
    hash VARCHAR(40)
);
Second, create the PHP script.
$dbhost = "mysql.host.com";
$dbuser = "lbuser";
$dbpass = "password";
$dbname = "lb_register";
$dbtable = "lb2_members";
$conn = mysql_connect($dbhost, $dbuser, $dbpass) or die (mysql_error());
mysql_select_db($dbname);
// Escape the hash before putting it into the SQL; it comes straight from the request.
$hash = mysql_real_escape_string(isset($_GET['hash']) ? $_GET['hash'] : '', $conn);
$query = "SELECT count(*) as count FROM " . $dbtable . " WHERE hash='" . $hash . "';";
$result = mysql_query($query);
$row = mysql_fetch_assoc($result);
if ($row['count'] >= 1) {
    // Rebuild the whole member list from the database.
    $file = "<Proxy balancer://mycluster>" . "\n";
    $member_query = "SELECT hostname, loadfactor FROM " . $dbtable . ";";
    $member_result = mysql_query($member_query);
    while ($row = mysql_fetch_array($member_result, MYSQL_BOTH)) {
        $file .= " BalancerMember http://" . $row['hostname'] . " ";
        $file .= ($row['loadfactor'] > 1) ? ("loadfactor=" . $row['loadfactor'] . "\n") : "\n";
    }
    $file .= "</Proxy>";
    // Write the new config, then ask Apache to reload it.
    exec('echo "' . $file . '" > /etc/httpd/conf.d/balancer_members.conf');
    exec("sudo /usr/local/bin/reload_httpd");
}
mysql_close($conn);
You can tell a few things about the server configuration by looking at the script:
1. User apache will need to be able to write to the “/etc/httpd/conf.d/balancer_members.conf” file.
2. User apache will need to be able to execute the script “/usr/local/bin/reload_httpd”.
3. User apache will need sudoer rights.
4. This script was used for debugging, and not by a server that is actually registering… you can see that deregistration is not handled yet.
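To make the shape of the generated file a bit more concrete, here is a small shell sketch of the same loop the PHP script runs. It is an illustration only, not part of my setup: the function name is hypothetical, and it assumes the members arrive on stdin as "hostname loadfactor" pairs, one per line, with a numeric loadfactor.

```shell
#!/bin/sh
# Hypothetical helper: read "hostname loadfactor" pairs on stdin and print
# the <Proxy> block, mirroring the loop in the PHP script.
render_balancer_members() {
    echo "<Proxy balancer://mycluster>"
    while read -r hostname loadfactor; do
        [ -n "$hostname" ] || continue   # skip blank lines
        # Like the PHP version, only mention loadfactor when it is above 1.
        if [ "${loadfactor:-0}" -gt 1 ]; then
            echo " BalancerMember http://$hostname loadfactor=$loadfactor"
        else
            echo " BalancerMember http://$hostname"
        fi
    done
    echo "</Proxy>"
}

# Example with made-up host names:
printf 'web1.example.com 1\nweb2.example.com 3\n' | render_balancer_members
```

With those two made-up members, the output is a Proxy block containing one plain BalancerMember line for web1.example.com and one with loadfactor=3 for web2.example.com.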
To grant write privileges to apache, I changed the owner of the balancer_members.conf to apache.
chown apache /etc/httpd/conf.d/balancer_members.conf
This is probably the least secure aspect of my solution: if the apache user were compromised, any directives could be written to this file. I’m not sure how big a threat this is, but it’s something that concerns me at least enough to think about this some more (and invite suggestions).
Next is to grant apache privileges to execute “/usr/local/bin/reload_httpd”. We could accomplish this the same as we did above, but then it wouldn’t allow apache to execute what’s inside of the script, which is this:
#!/bin/bash
service httpd reload
unless we give execution rights to apache on service, which we don’t want. What we also don’t want is for apache to be able to write to the file reload_httpd. So what I ended up doing was to make root the owner of reload_httpd and remove write privileges for all (so apache couldn’t change it), and then add apache to the sudoers file, granting rights to execute this script without a password (which is why the PHP script calls it through sudo).
visudo
is the generally accepted way to edit the sudoers file. And I added this line:
apache ALL=(ALL) NOPASSWD: /usr/local/bin/reload_httpd
I’m open to more secure ways of implementing this aspect as well, as I don’t consider myself a sudo configuration expert. As written, this line lets apache run the /usr/local/bin/reload_httpd script (on any host, as any user) without a password; it does not grant apache rights to run any other command through sudo.
I also had to comment out the line:
#Defaults requiretty
to allow sudo to function properly from a script not executed in a terminal.
Finally I had to disable proxying for the register script in my balancer.conf file:
ProxyPass /register/ !
And then your server is configured to dynamically update its list of balancer members. You can check by going to the balancer-manager, if you’ve got that enabled. Next I will discuss how to handle the web server side of things.
VirtuaWin for Windows
I’ve really come to appreciate the “workspace” feature in OS X and Linux. However, I do most of my development in Windows which (oddly) has no similar feature. I’ve tried a few applications that give Windows this ability; my favorite is an open source project called VirtuaWin.
Their project is hosted on SourceForge, where you can find more information and links to download: http://virtuawin.sourceforge.net/.
Things I like about it:
1) It’s minimalistic and not too flashy. Switching spaces is similar to OS X.
2) It’s very stable. It works on Vista, and has never crashed.
3) It has a handy “always show” flag you can apply to a specific window.
Things I don’t like about it:
1) Configuration/customization takes some time (default settings are OK, though).
Book Review: Programming Amazon Web Services
I don’t know if they’re just a more established tech book publishing company, but I usually have a good experience with O’Reilly books. Programming Amazon Web Services, subtitled S3, EC2, SQS, FPS, and SimpleDB, by James Murty, was great. 5 stars.
I enjoyed this book mainly because I love using Amazon’s web services for recreation and work. If I didn’t enjoy Amazon’s web services in the first place I probably would have found the book excessively detailed. In chapter 5 the author writes, “This chapter delves into the nitty-gritty aspects of running a Linux server in EC2,” and he ain’t kidding! This book really gets down into the API (and this is true for all the services treated in the book, not just EC2).
So if you’re looking to do some casual computing on EC2 or S3, you’d probably be better off without this book. I’d recommend installing the Firefox plugins for EC2 and S3, and going from there. Here’s a link (from the web site of a class I took last fall) that will probably be useful to someone in that situation. On that page you’ll find links to some tutorial pages, and a webcast or two.
On the other hand, if your intention is one of the following:
- Author a tool similar to the Firefox EC2 plugin.
- Create complex scripts to manage your EC2 instances or S3 buckets.
- Write a code library for any of the Amazon web services.
- Increase your understanding of what’s going on when you use the Firefox plugins.
Then this is the book for you.
That said, this book exposed me to FPS and SimpleDB for the first time (never had a chance to use either). As far as EC2, S3, and SQS go… I didn’t really learn how to do anything new with them from this book, per se. But it did significantly increase the depth of my understanding regarding each of these services. There’s a benefit to depth of knowledge with these kinds of technologies, because I’m sure I’ll encounter a problem in the future that can be solved with these tools, whose solution I might have overlooked before.

