Posted on November 24, 2007
After using PL/Proxy in production for a bit and working on some scaling and partitioning projects, I’ve come up with a few observations:
- The PL/PgSQL based configuration is flexible enough to not be considered a hack, but the lack of real in depth documentation leaves one to discover why on their own.
- If you have an environment, like mine, where you’re constantly expanding, building out new infrastructure and have the potential to add new pgBouncer servers on your DB tier for PL/Proxy, Do not use the CONNECT methodology. Instead invest the time to write your configuration functions and use the CLUSTER, RUN ON method instead.
- Plan for more time in debugging PL/PgSQL functions. While generally it’s just replacing the body and language in your functions, it does take more organization and testing to make sure your new PL/PgSQL functions and PL/Proxy counterparts work flawlessly in production.
- Single database machine development environments shouldn’t prevent you from testing PL/Proxy functionality. Setup pgBouncer and various databases to mimic your production environment. We’ve skipped this step in development and paid the price during rollout.
- Even with Marko’s suggestions we’ve yet to find a good way to handle the client_encoding issue. Unfortunately what this means is as we scale, we’re still stuck in SQL_ASCII land and the job to convert to UTF-8 just gets bigger and bigger. Note my dirty hack of using a wrapper function kicks out log messages that normal PLProxy based functions do not, which clutter up our Postgres logs.
- It’s solid. PL/Proxy is still everything I thought it was when I rolled it out and I’ve been very happy with the reliability and scaling it buys me.
I’m looking forward to Marko’s next release of PLProxy which I'm guessing will hit sometime in December. If I read his notes right, it may address my client_encoding issue.
Filed under: PostgreSQL |
Tagged with: PLProxy |
Posted on October 06, 2007
This evening I rolled out our first use of plProxy tonight to facilitate cross PostgreSQL server communication. It was a bumpy start going from development to production so I thought I would detail a few problems I ran into:
- The version of flex on RHEL 4.0 (don’t ask - it’s related to vendor support issues with our SAN) doesn't like the plProxy install process. scanner.l doesn't parse properly and thus scanner.c is never made. I got around this by running it on a more current Gentoo based system. I then tested the compiled version of plproxy on our failover/standby PostgreSQL server.
Our older primary production database is still SQL_ASCII due to the amount of work involved migrating it to UTF-8, which is one of our mid-term projects. plProxy will not communicate between databases if the encodings do not match. I wrote a wrapper function that sets client_encoding to SQL_ASCII on our UTF-8 database sessions for the function that is being proxied. Here's an example of what I ended up doing:
CREATE OR REPLACE FUNCTION myFunction(IN uid1 int, IN uid2 int) RETURNS bool AS $$
CONNECT 'host=pgbouncer01 dbname=production user=www port=6000';
$$ LANGUAGE plproxy;
CREATE OR REPLACE FUNCTION myFunctionWithEncoding(IN uid1 int, IN uid2 int) RETURNS bool AS $$
DECLARE
result RECORD;
BEGIN
SET client_encoding='SQL_ASCII';
SELECT INTO result * FROM myFunction(uid1, uid2);
SET client_encoding='UTF-8';
RETURN result.boolcol;
END;
$$ LANGUAGE plpgsql;
This is a pretty dirty hack, but it gets the job done. In addition, I probably didn’t need to set the encoding back to UTF-8 but I thought it better to be safe. I'll be happy once I get our production database set to UTF-8. The main gotcha I found in plProxy is this encoding issue. Something that will be resolved for us in the long run, but will have to be a hack in the short run.
Per Skype’s setup, we use pgBouncer between the servers for plProxy and cut down on the overhead required for startups. One of the nice things about this is since the pgBouncer instance is only used for inter-PostgreSQL server communication, I’ve been able to determine that we’re doing approximately 100 executions per second. While this is pretty low compared to our PostgreSQL usage in other places, it gives me a good feeling about our strategy with plProxy. After closely watching machine stats for the first 30 minutes of production, it appears that plProxy has very little if any impact on machine resources in our infrastructure.
Filed under: PostgreSQL |
Tagged with: PLProxy |