deploying bucardo and solving a schema mismatch problem

In the process of deploying bucardo 4.4.0 to replicate a fairly large django database using postgresql 8.4, I ran into a tricky issue.  Although I knew for certain that the database schemas were identical, bucardo still complained about a mismatch in the default value for the first table's primary key column. For some reason bucardo was identifying the reference to the sequence with the "public" schema explicitly named on one database and without "public" on the other.  In this case, schema refers to a namespace and not the database layout sense of the word. The goat validation was throwing the following error. ... MCP Warning: Source database for sync "databaseA" has column "id" of table "public.auth_group" with a DEFAULT of "nextval('auth_group_id_seq'::regclass)", but target database "databaseB" has a DEFAULT of "nextval('public.auth_group_id_seq'::regclass)" ... I was sure that the databases were identical because they were both restored from the same backup file.  I did find that I was running 8.4.1 on one postgresql server and 8.4.3 on the other. I synced up the postgresql versions, but I did not track down any of the relevant driver versions, because I was able to find a simple work around. I searched for the error string in the bucardo perl source code and found the problematic section.  Since I knew the tables were identical, I felt that skipping this part of the validation would be harmless, so I made the following change to Bucardo.pm 2032 #if ($scol->{atthasdef} and $fcol->{atthasdef} and $scol->{def} ne $fcol->{def}) { 2033 if (1 == 2) { After I installed my updated bucardo, I kicked off replication again with a successful result.

displaying a google maps api v3 map in a django application with geodjango and postgis

In order to create a Google Maps API v3 map with django templates and views you can use the python modules django.contrib.gis.maps and specifically django.contrib.gis.maps.google. We will create a view called show_map that takes a request object and a query string that we will use to retrieve polygons from postgis, our spatial database. If we have a database of time zones and we would like to display a map with a specific time zone, we first need to select it from the database.  For this query, we are using the standard django ORM to pull out the desired polygon. We have defined the Zone object in our app's models.py file.

querying the geodatabase using django ORM

In[24]: results = Time_Zone.objects.filter(timezone_id=query_string) timezone_id is the criteria we are using to select all objects where the time zone id matches the query string exactly. The geometry or polygon shapes of the records that have been returned are available in the polygon field you have specified in the model definition. Because we are using the filter method of Zone's object manager, we can pull multiple records that match our search criteria. We can get at the polygon data for each record using some properties that are available on the geometry columns.

geojson

In[24]: print results[0].geom.geojson Out[24]: '{ "type": "MultiPolygon", "coordinates": [ [ [ [ -6.091862, 4.991835 ], [ -6.089009, 4.990883 ], [ -6.079963, 4.988334 ], [ -6.074483, 4.985070 ], [ -6.071546, 4.979528 ], [ -6.072025, 4.977002 ], [ -6.076686, 4.968072 ], [ -6.083112, 4.967894 ], [ -6.100063, 4.971250 ], [ -6.103525, 4.968749 ], [ -6.107572, 4.978899 ], [ -6.113694, 4.987916 ], [ -6.113905, 4.990215 ], [ -6.112429, 4.991211 ], [ -6.105550, 4.991084 ], [ -6.099033, 4.994057 ], [ -6.091862, 4.991835 ] ] ] ] }'

kml

In [25]: results[0].geom.kml Out[25]: ' -6.09186172485,4.99183511734,0 -6.08900928497,4.99088335037,0 -6.07996273041,4.98833370209,0 -6.07448291779,4.98507022858,0 -6.07154560089,4.97952795029,0 -6.07202482224,4.97700214386,0 -6.07668590546,4.9680724144,0 -6.08311223984,4.9678940773,0 -6.10006284714,4.97125005722,0 -6.10352468491,4.96874904633,0 -6.1075720787,4.97889947891,0 -6.11369419098,4.98791646957,0 -6.113904953,4.99021482468,0 -6.11242866516,4.9912109375,0 -6.10554981232,4.99108362198,0 -6.09903335571,4.9940571785,0 -6.09186172485,4.99183511734,0'

google maps api

In order to display the polygons on our google map, we will need to pass them to our code written in javascript. We will then use methods from the google maps api v3 to construct and display our polygons on the map. We will also need a little more information about the polygons we will be displaying in order to adjust the center and zoom level of our map. First we will need to instantiate a google zoom object. There are probably easier ways to do this, perhaps subclassing the geodjango admin views, but this is quick and effective. And then render our template with the appropriate context. gz = GoogleZoom() return render_to_response('map.html', { 'all_tz': all_tz, 'zoom': gz.get_zoom(all_tz.unionagg()) })

template javascript

To display our maps using google's javascript mapping API, we need to initialize our map and the add our polygons to it. We will iterate through our list of polygons use the gpoly method of our Time_Zone object to definte the points that make up each polygon. We'll talk about the gpoly method in a bit. function initialize() { var latlng = new google.maps.LatLng(, ); var myOptions = { zoom: , center: latlng, mapTypeId: google.maps.MapTypeId.TERRAIN }; var map = new google.maps.Map(document.getElementById("map_canvas"), myOptions)

google.maps.Polygon

A google.maps.Polygon object is slightly different from a geojson representation. They are both a string containing a list of points that represent the vertices of the polygon. The geojson polygon that is displayed above needs to be transformed into the following string to be interpretable by the google Polygon constructor. '[new google.maps.LatLng(4.99183511734,-6.09186172485),new google.maps.LatLng(4.99088335037,-6.08900928497),new google.maps.LatLng(4.98833370209,-6.07996273041),new google.maps.LatLng(4.98507022858,-6.07448291779),new google.maps.LatLng(4.97952795029,-6.07154560089),new google.maps.LatLng(4.97700214386,-6.07202482224),new google.maps.LatLng(4.9680724144,-6.07668590546),new google.maps.LatLng(4.9678940773,-6.08311223984),new google.maps.LatLng(4.97125005722,-6.10006284714),new google.maps.LatLng(4.96874904633,-6.10352468491),new google.maps.LatLng(4.97889947891,-6.1075720787),new google.maps.LatLng(4.98791646957,-6.11369419098),new google.maps.LatLng(4.99021482468,-6.113904953),new google.maps.LatLng(4.9912109375,-6.11242866516),new google.maps.LatLng(4.99108362198,-6.10554981232),new google.maps.LatLng(4.9940571785,-6.09903335571),new google.maps.LatLng(4.99183511734,-6.09186172485)]' To do that with django, I import the GPolygon object from django.contrib.gis.maps.google.overlays. It takes a geodjango multipolygon object and returns a string which is acceptable to the google maps api polygon constructor paths argument. I've added a gpoly method to my geodjango object that wraps the GPolygon object and returns the correct string.

supporting google maps api v3

Unfortunately, geodjango seems to be written with v2 of the google maps api in mind, at least in the current version. In order to transfer the output of the GPolygon method, we simply do a replace in the string. The gpoly method looks like the following. def gpoly(self): gpoly = GPolygon(self.geom[0]) return gpoly.points.replace('GLatLng', 'google.maps.LatLng')

installing bucardo the hard way

It would seem that the bucardo_ctl installer script for version 4.4.0 has some problems. I spent a little time trying to fix the script, but it was too much of a pain. I found a great write up by Richard Martin that describes getting bucardo running without bucardo_ctl.  I simplified his procedure to replicate the three tables from the pgbench data that have primary keys.   From there you should be able to adopt it for any database.  You can use bucardo to replicate tables that do not have primary keys, but primary keys are generally a good thing to have. I did modify the bucardo.schema file to remove the account creation stuff.  The script tries to switch to the bucardo db as the bucardo user after it creates it and then can cause a problem with most default postgres installs that use ident auth. Feel free to download my modified bucardo.schema.  For this example, I have used the database names 'db_left' and 'db_right'.  They are both on localhost, so we have to worry less about getting across a network. Be sure to use better passwords if you are setting this up for reals. Please do things like use ssl/ssh tunnels. The first step is to create the databases and populate them with pgbench data. For ubuntu 9.10 with postgresql 8.4, pgbench is in /usr/lib/postgresql/8.4/bin/ createdb db_left createdb db_right pgbench -i db_left pgbench -i db_right /usr/lib/postgresql/8.4/bin/pgbench -i db_left /usr/lib/postgresql/8.4/bin/pgbench -i db_right Next, run the following script in psql or your postgresql client of choice. create user bucardo SUPERUSER; create database bucardo owner bucardo; alter user bucardo with password 'bucardo'; \connect db_left; create language plpgsql; \connect db_right; create language plpgsql; \connect bucardo; create language plpgsql; create language plperlu; \i mod_bucardo.schema INSERT INTO db(name, dbname, dbhost, dbuser, dbpass) VALUES ('left','db_left','127.0.0.1','bucardo', 'bucardo'); INSERT INTO db(name, dbname, dbhost, dbuser, dbpass) VALUES ('right','db_right','127.0.0.1','bucardo', 'bucardo'); INSERT INTO dbgroup(name) VALUES ('remote_dbs'); INSERT INTO dbmap(db,dbgroup) VALUES ('left','remote_dbs'); INSERT INTO dbmap(db,dbgroup) VALUES ('right','remote_dbs'); INSERT INTO goat(db, schemaname, tablename, standard_conflict) VALUES('left', 'public', 'pgbench_accounts', 'source'); INSERT INTO goat(db, schemaname, tablename, standard_conflict) VALUES('left', 'public', 'pgbench_branches', 'source'); INSERT INTO goat(db, schemaname, tablename, standard_conflict) VALUES('left', 'public', 'pgbench_tellers', 'source'); INSERT INTO herd(name) VALUES ('HERD_NAME'); insert into herdmap(herd,goat) select 'HERD_NAME', id from goat; INSERT INTO sync(name,source,targetdb,synctype,checktime) VALUES ('left','HERD_NAME','right','swap','10 minutes'); update sync set status='active' where name ='left'; This will set up basic replication between db_left and db_right. To kick off the bucardo process, use the following command: sudo bucardo_ctl --dbuser='bucardo' --dbhost='localhost' --dbpass='bucardo' start "full_copy" And to check the status to make sure it is running: sudo bucardo_ctl --dbuser='bucardo' --dbhost='localhost' --dbpass='bucardo' status Days back: 3 User: bucardo Database: bucardo Host: localhost PID of Bucardo MCP: 7771 Name Type State PID Last_good Time I/U/D Last_bad Time ====+=====+=====+====+=========+=====+=====+========+==== left| S |idle |7775|4m27s |0s |0/0/0|unknown | Sorry about the backticks in those last commands. I think that is the code snippet wordpress plugin :( To test the replication insert a record into one of the database and look for it to be replicated. \c db_left insert into pgbench_branches (bid, bbalance, filler) values (2, 10, 'test'); \c db_right select * from pgbench_branches; insert into pgbench_branches (bid, bbalance, filler) values (3, 10, 'the other way'); \c db_left select * from bgpench_branches; bid | bbalance | filler -----+----------+------------------------------------------------------------------------------------------ 1 | 0 | 2 | 10 | test 3 | 10 | the other way (3 rows) It may take a few seconds for the new record to show up. There should also be a logfile in your current working directory that will have a lot of diagnostic info. And that is multimaster replication.