We recently needed to start and stop Redis programmatically, on an arbitrary port. Unfortunately, the version of the Redis server that we’re using doesn’t support specifying the port as a command line argument. The two options we had were to either write out a temporary config file and point redis-server at it, or to pipe the config into the redis-server command. We chose the latter, as writing temporary config files all over the place seemed messy.
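As a rough sketch of the piping approach (not our exact code), the config can be written to the server's standard input. The helper names redis_config and start_redis are made up for illustration, and the sketch assumes that redis-server is on the PATH and that passing "-" as the config file makes it read the config from stdin:

```ruby
# Hypothetical helper: build a minimal config for the requested port.
def redis_config(port)
  "port #{port}\ndaemonize no\n"
end

# Hypothetical helper: start redis-server with its config piped to stdin.
# Assumes `redis-server -` reads the config from standard input.
# Returns the pid of the spawned server.
def start_redis(port)
  reader, writer = IO.pipe
  # pgroup: true puts the server in a process group of its own.
  pid = Process.spawn("redis-server", "-", in: reader, pgroup: true)
  reader.close                      # the child holds its own copy
  writer.write(redis_config(port))  # send the config down the pipe
  writer.close                      # EOF tells the server the config is complete
  pid
end
```

Calling start_redis(6380) would then return a pid to hold on to for stopping the server later.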
The created process group has an ID that’s the same as the process ID returned by the spawn call. Now we just needed to find a way to send a signal to all the processes in a certain group.
This turned out to be easy, as Process.kill treats negative arguments as process group IDs instead of process IDs:
Process.kill("TERM", -pid)
So now we can spawn a process without having to worry about tracking the sub-processes it spawns, and still be able to clean them all up at a later time.
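Putting the pieces together, here is a minimal sketch assuming a Unix-like system with sh and sleep available. The "sleep 30 & wait" command is only a stand-in for a real process that starts sub-processes of its own:

```ruby
# Start a shell in its own process group. The shell starts a background
# sleep, standing in for a sub-process that we never track ourselves.
pid = Process.spawn("sleep 30 & wait", pgroup: true)

# With pgroup: true, the new group's ID is the same as the spawned pid.
group_id = Process.getpgid(pid)

# A negative pid addresses the whole group, so this signals the shell
# and anything it has started.
Process.kill("TERM", -pid)

# Reap the shell and keep its exit status for inspection.
_, status = Process.wait2(pid)
```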
Having just installed MongoDB on an Ubuntu instance running on Amazon EC2, I was expecting everything to just work out of the box. Unfortunately this wasn’t the case, and I received a very helpful message:
terminate called after throwing an instance of 'std::runtime_error'
what(): locale::facet::_S_create_c_locale name not valid
A quick search on Google showed that I was not the first person with this issue, and that the solution was to check that my default locale was correctly set in the default file:
/etc/default/locale
Then run the locale-gen command, which will correct everything and set up your required locale so you can run MongoDB. FAIL..
What took me the next hour to find, and is not in the Ubuntu help, is that locale-gen takes a parameter, the locale to be installed, so you need to run (for my system):
We have several processes at Reevoo that need to run round the clock. They’re pretty long running but variable, so aren’t really suitable for running with cron. The solution to this is to background them, but how do we run background tasks that need access to our Rails application’s models?
We settled on the Daemons gem, a gem originally hosted on RubyForge that uses Process::fork to fork your daemon code into the background and close all open file descriptors, making Ruby code behave as a standard UNIX daemon. Unfortunately the Daemons gem is pretty old now (it doesn’t look like it’s been updated since March 2008) and is missing a few essential features.
Firstly, we found that in certain situations the daemon won’t respond to a standard SIGTERM, which makes managing it with most monitoring systems a real pain. It can also cause deployment headaches when you have to manually kill -9 various processes and clear up their log files on every code update. Sysadmin headache!
The other main problem is the configurability of the log and pid file directories. Previously you could only create your daemon with one directory configured, and everything would live there. We like to store our applications’ pid files in shared memory (/dev/shm on our RHEL and CentOS boxes). This is basically a paranoia check to ensure that if any runaway code ever forces the machine to reboot, we have no stale pid file issues and Monit can start all the applications back up with minimal sysadmin intervention. Obviously storing daemon logs in shared memory is not optimal.
Pre Rails 2.3, when you wanted to set the value of a cookie in a test you had to:
def test_should_set_user_name
  # set cookie value
  @request.cookies['visitor_name'] = CGI::Cookie.new('visitor_name', 'Dave')
  post :login, :name => 'Dan'
  assert_equal 'Dan', cookies['visitor_name'].value
end
But in 2.3 you can forget about all that CGI::Cookie crap and concentrate on the sweet code you actually want to test:
def test_should_set_user_name
  # set cookie value
  @request.cookies['visitor_name'] = 'Dave'
  post :login, :name => 'Dan'
  assert_equal 'Dan', cookies['visitor_name'].value
end
Pretty!
The problem comes when you want to ensure that ‘Dave’ has been logged out. Consider this test:
def test_should_log_user_out
  @request.cookies['visitor_name'] = 'Dave'
  post :logout
  assert_nil cookies['visitor_name']
end
This test always passes! No matter whether the cookie is removed in the logout action or not.
The solution
‘cookies’ is defined in ActionController::TestProcess as:
def cookies
  @response.cookies
end
When you ask for cookies in your test, what you’re actually getting is @response.cookies, which doesn’t include the values of the cookies set in the request. This means that cookies['visitor_name'], as far as the test is concerned, is never set, so assert_nil passes whether or not the logout action removes the cookie.
I’ve written a simple patch which has recently been merged into Rails. It merges the @response cookies into the @request cookies and returns the result, rather than just returning @response.cookies. So ActionController::TestProcess becomes:
def cookies
  @request.cookies.merge(@response.cookies)
end
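To see why the merge matters, here is a small sketch that models the two cookie jars as plain hashes rather than the real Rails objects. The variable names are invented for illustration, and representing a deleted cookie as a nil value in the response is an assumption made to keep the example simple:

```ruby
# Plain hashes standing in for the request and response cookie jars.
request_cookies  = { 'visitor_name' => 'Dave' }
response_cookies = {}  # a logout action that forgot to delete the cookie

# Old behaviour: only the response is consulted, so the lookup is nil
# and assert_nil would pass even though Dave is still logged in.
stale_old = response_cookies['visitor_name']

# Patched behaviour: the request cookie shows through, so assert_nil
# would fail, as it should.
stale_new = request_cookies.merge(response_cookies)['visitor_name']

# Assumption: a logout that does delete the cookie appears as a nil value
# in the response, which overrides the request value in the merge.
response_after_logout = { 'visitor_name' => nil }
cleared = request_cookies.merge(response_after_logout)['visitor_name']
```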
We recently moved a Xen virtual machine from one Xen host to another. The process involves simply copying across its disk partition (in this case an LVM2 partition using the device mapper) and copying across its config file:
1) Copy across the VM partition device. On target host:
Some further explanation is probably needed for the first step. Creating the logical volume on the target host in advance ensures that the partition, once copied across, remains a block device rather than an image file. Netcat (nc) provides a fast mechanism to transfer the partition over the ether (use SSH if you’re sensitive about your data). As for the flags to dd, the block size (bs) is set to 16065 bytes, matching the number of sectors in a cylinder, so I’m told (it worked for me), and the nocreat flag tells dd not to create the output file, so it can only write to the block device that already exists.
Note: remember to shut down the VM first! This is an offline copy. If you want to minimise downtime I’m sure an LVM snapshot would work too.
Here at Reevoo we (like many others) use Apache as our webserver of choice and with this comes the venerable mod_rewrite.
Mod_rewrite can be used for a lot more than just redirecting pages, though: you can use it for forward and reverse proxying, redirection and URL rewriting based on various factors such as the HTTP host or request URI. However, there are myriad ways in which to shoot yourself in the foot!
This is very cool and makes working with mod_rewrite much less painful than it can be!
The original code is a series of gists hosted on the vigetlabs GitHub page. To make them easier to use and manage I packaged them up as a gem, which you can install as follows:
sudo gem sources -a http://gems.github.com &&
sudo gem install shadowaspect-http_redirect_test
and use it in your code like this:
require 'http_redirect_test'
Have fun!
detenc is a fast character encoding detector for Western European text. It can determine whether a file is encoded in US-ASCII, UTF-8, ISO-8859-15, WINDOWS-1252, or something else. It can distinguish ISO-8859-15 and WINDOWS-1252 where there is enough information: this means that Euro signs are handled correctly.
The program was written to help normalise the encoding of very large data feeds (of the order of several gigabytes) at Reevoo. It uses very little memory and can determine the encoding of a two-gigabyte file in under a minute.
We process a lot of data feeds from retailers here at Reevoo. If we’re lucky, we get to specify the format. Often, though, we have to make do with feeds that are already available. The quality of these can be variable, which means that we need to be liberal in what we accept—but not so liberal that we start importing bogus data.
One of the significant variables is character encoding. This is a poorly understood topic in general, and our experience reflects this. We get feeds in:
As an aside, ISO 8859-1 is also a possibility. However, given that it doesn’t include the Euro sign, we can reasonably assume that any feeds we receive today are likely to be in ISO 8859-15 (which is very similar).
We need to turn everything into the canonical encoding — UTF-8 — before we start processing. Up until recently, we’d been using iconv for this, attempting each encoding in turn, and falling back to the next on failure. The naive detector loaded the file into memory, fed it through iconv, and wrote it back out. This didn’t work too well on big feeds — and by ‘big’ I mean 2 GB+. Working line by line over 30 million lines was still not good enough.
So I wrote a small C program to do the job. Detecting ASCII is easy: the high bit is never set. UTF-8 is a little harder, but can be done very reliably thanks to the self-synchronising characteristics of its byte sequences. Windows 1252 and ISO 8859-15 have a significant overlap, meaning that text may be in both; in this case, the program selects ISO 8859-15. However, a text that uses a byte value defined in one but not the other can only be in one encoding. Finally, a text may include byte values outside any of these ranges, in which case it’s unknown.
The program can scan a 2GB file in under a minute, which is a big improvement, and certainly good enough. It uses a few hundred kilobytes of memory, making it about 10,000 times better than the original naive implementation! It also features what I consider to be a legitimate use of goto in the UTF-8 validating state machine.
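The detection rules described above can be sketched in a few lines of Ruby. This is an illustration of the decision order only, not the C program itself: it reads the whole input into memory, leans on Ruby's built-in UTF-8 validation instead of a hand-written state machine, and the name detect_encoding is made up. It also assumes that bytes 0x80 to 0x9F never appear in ISO 8859-15 text, and that 0x81, 0x8D, 0x8F, 0x90 and 0x9D are the bytes Windows 1252 leaves unassigned:

```ruby
# Bytes that Windows 1252 leaves unassigned (assumption stated above).
CP1252_UNASSIGNED = [0x81, 0x8D, 0x8F, 0x90, 0x9D].freeze

# Hypothetical helper: classify a string of raw bytes.
def detect_encoding(data)
  bytes = data.bytes

  # ASCII: the high bit is never set.
  return "US-ASCII" if bytes.all? { |b| b < 0x80 }

  # UTF-8: let Ruby validate the byte sequences.
  return "UTF-8" if data.dup.force_encoding(Encoding::UTF_8).valid_encoding?

  high = bytes.select { |b| b >= 0x80 }

  # Every high byte is in the range both encodings share: pick ISO 8859-15.
  return "ISO-8859-15" if high.all? { |b| b >= 0xA0 }

  # Bytes in 0x80..0x9F are only meaningful in Windows 1252.
  return "WINDOWS-1252" if high.none? { |b| CP1252_UNASSIGNED.include?(b) }

  "UNKNOWN"
end
```

The Euro sign shows the distinction at work: it is byte 0xA4 in ISO 8859-15 and byte 0x80 in Windows 1252, so a feed containing 0x80 can only be Windows 1252.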
In a previous article Joel spoke about the problems we were having with our load balancing between Apache and Mongrel and his bybusy mod that attempts to solve the problem.