The dangers of multiple Drupal database connections

Our latest venture, www.analyticsseo.com, is a pretty complex app for a Drupal site. There are two parts to the system:

  1. a back-end application, which crawls the web, analyses sites and competitors, and does all kinds of clever stuff
  2. a front-end application, which summarises this data, generates tasks, and manages workloads

Fortunately, Drupal makes it really easy to work with multiple databases. Simply add two database connection strings in the settings.php:

$db_url['default'] = 'mysqli://user1:password1@server1/database1';
$db_url['backend'] = 'mysqli://user2:password2@server2/database2';

Then in the code:

db_set_active('backend');
// do stuff in backend database
db_set_active();
// do stuff in front-end database

and all is sweet.

Until you do something where it isn't sweet without realising it.

It's really important to keep the code between the db_set_active calls really targetted on database interaction - don't do anything else between those calls! If you call other Drupal functions, there's a good chance that they will do some stuff based on the assumption that they are connection to the default database, when they are not. This is particularly true of theme functions, which can get terribly confused.

I made the mistake of calling a theme function in my db_set_active block like this:

db_set_active('backend');
// get data from backend database
// put data into form elements
$form['site_pager'] = array(
  '#type' => 'markup',
  '#value' => theme('pager', null, $results_per_page, 1),
);
db_set_active();

and ended up getting a hugely unhelpful

Fatal error: Call to undefined function zen_menu_item_link() in /var/www/seo/sites/all/modules/drupal-contrib/dhtml_menu/dhtml_menu.module on line 79

Looking at the code in dhtml_menu.module didn't help much, nor did dumping a debug_backtrace() in the function, and nor did Google. Fortunately I had a very similar function elsewhere which was working, so by comparing the two I was able to spot the difference and straighten things out.

So the moral is, while using multiple databases in Drupal is conceptually really easy, always remember:

  1. set your secondary database active
  2. get your data
  3. get the hell out of that secondary database as fast as possible

Comments

Thank you, Mark! Very helpfull article.

After spending a couple of hours trying to figure out what kind of mistake we [me and a friend] made when swapping databases, I finaly found the answer.

In our case, calling node_load in the secondary database, was leading Drupal on node_save warnings when running on primary database. (yes, it´s all about a data migration module =)

Now we've decided to get the nodes we wish to migrate by calling a webservice hosted in the another Drupal instance.

I just posted a comment on this in the Drupal forums, with a link back to your blog. I found that same problem, and am concerned that that is actually the wrong approach - what if a remote or other database takes TIME to return a result, or you have high load on your site? Whoopsey!

Check it out: forum post

Ryan