I started stressing a RoR project that has grown pretty big. Some serious hands-on testing showed that functionality was working well and performance was fine, but sometimes I would just get weird results from the database or ActiveRecord. I would create a new AR object, save it, use it a few times, update it, then it would suddenly just disappear. I would start getting ActiveRecord::RecordNotFound exceptions doing a Thingy.find(1234), when thingy#1234 definitely existed in the database. It would take a restart of Phusion Passenger, or one of the workers timing out, before I would start seeing the object again, and if I refreshed a page with Thingy.all(:conditions=>c) the results would change, then change back. I'm using MySQL, so it's not exactly what I was expecting to see.
I had issues in the past with some forking of processes that could just run through to conclusion in the background - they were removed. I made sure that there were good Thingy.transaction do end blocks covering my updates. Still, things were getting worse, not better.
Eventually I ended up hunting around the code from the dim and distant past. That's the stuff I don't touch because it "just works". Well, I rolled up to an interesting section in a class:
sql = ActiveRecord::Base.connection();
sql.execute "SET autocommit=0";
sql.begin_db_transaction
sql.delete 'delete from a_table where some_conditions'
sql.update sqlstring
sql.commit_db_transaction
This was valid, as the SQL going on in the sqlstring was complex to say the least. But since I've removed this from the main flow of the application things seem to have settled down considerably.
I'm guessing that my standard transactions were getting caught up in my attempt to borrow a connection from the pool explicitly and who knows what was happening. Or maybe Passenger was losing its connection and recreating a connection. I don't know, but I'm not doing it again!
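One plausible explanation, and it is only an assumption since nothing above confirms it: SET autocommit=0 is a session setting, so it stays on the pooled connection after the block finishes. Every later statement on that worker then runs inside an implicit, never-committed transaction, and with InnoDB's default isolation level that worker keeps reading an old snapshot until it is restarted. A minimal sketch of the alternative is to let the connection adapter open and close the transaction itself. The helper name replace_rows is hypothetical; transaction, delete and update are the methods ActiveRecord connection adapters expose.

```ruby
# Hypothetical helper (not from the original code): runs the delete and the
# update inside one transaction block, so the adapter issues BEGIN and COMMIT
# (or ROLLBACK if an exception is raised) without changing the session's
# autocommit setting.
def replace_rows(connection, delete_sql, update_sql)
  connection.transaction do
    connection.delete(delete_sql)
    connection.update(update_sql)
  end
end

# In the application this might be called as:
#   replace_rows(ActiveRecord::Base.connection,
#                'delete from a_table where some_conditions', sqlstring)
```

The connection is passed in as an argument so the helper can be exercised with a stand-in object, without a database.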
This blog records all the crazy stuff I've done to get software to work. No guarantees for you based on having no expertise on my part. I hope you enjoy, or at least find something useful if you stumbled here by accident. My other blog is far more 'professional' and targeted at getting me some consulting work in business process improvement and generally fixing business issues with technology, so this blog gives me an outlet for all the other stuff.
Wednesday, December 1, 2010
Wednesday, October 27, 2010
Counting pages
I've been working on a Ruby on Rails project for a while. One area of it has morphed into a bit of document management, and for some users it is important to know how many pages a specific document has in it. At least for PDFs and TIFFs.
Well, ImageMagick is one approach, letting you load the document then review its properties. But as anybody who has used it will know, unless you are careful, this can be a huge memory sink. In fact I use ImageMagick 'convert' as a way to force my machine to run out of memory during testing, to see if it fails gracefully.
So, I hunted around a bit and came up with these programs: tiffdump and pdfinfo. I also considered tiffinfo, although the 'rawness' of tiffdump just seemed more appealing when parsing out the data I needed.
To install them (on Ubuntu):
sudo apt-get install libtiff-tools poppler-utils
Then use the command line programs from Ruby, something like this:
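require 'webrick'  # for WEBrick::HTTPUtils

path = '/home/someone/somewhere/somefile.xxx'
mime_type = WEBrick::HTTPUtils.mime_type(path, WEBrick::HTTPUtils::DefaultMimeTypes)
if mime_type=='image/tiff'
  # one 'Directory' line per page; note the double quotes, '\n' in single quotes is not a newline
  return `tiffdump '#{path}' | grep 'Directory'`.count("\n")
elsif mime_type=='application/pdf'
  # pdfinfo prints a line like 'Pages:   12'; to_i ignores the surrounding whitespace
  return `pdfinfo '#{path}' | grep 'Pages'`.split(':')[1].to_i
else
  # whatever
end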
Not pretty, not clever, but a lot faster than RMagick, and a lot easier than the Ghostscript approaches I've seen discussed but never actually working.
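The same idea can be sketched with the parsing split out from the shelling out, so the counting logic can be checked without the tools installed. This is only a sketch under some assumptions: the method names are hypothetical, the file type is guessed from the extension rather than through WEBrick, and the parsing assumes tiffdump prints one line starting with "Directory N:" per page and pdfinfo prints a "Pages:" line. Open3.capture2 is called with separate arguments so the path never passes through a shell.

```ruby
require 'open3'

# Assumes tiffdump prints one line like "Directory 0: offset 8 ..." per page.
def tiff_pages_from(dump_output)
  dump_output.lines.count { |line| line =~ /\ADirectory \d+:/ }
end

# Assumes pdfinfo prints a line like "Pages:          12".
# Returns nil when no such line is present.
def pdf_pages_from(info_output)
  line = info_output.lines.find { |l| l.start_with?('Pages:') }
  line && line.split(':', 2)[1].to_i
end

# Hypothetical wrapper: picks the tool from the file extension and
# returns nil for anything that is not a TIFF or a PDF.
def page_count(path)
  case File.extname(path).downcase
  when '.tif', '.tiff'
    out, _status = Open3.capture2('tiffdump', path)
    tiff_pages_from(out)
  when '.pdf'
    out, _status = Open3.capture2('pdfinfo', path)
    pdf_pages_from(out)
  end
end
```

Calling page_count('/home/someone/somewhere/somefile.pdf') would run pdfinfo on that file and return the page count as an integer.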