Reindexing Concurrently

Posted on May 09, 2008

Index bloat can be a major pain in heavy OLTP databases. With REINDEX being a blocking operation in Postgres, you cannot reasonably reclaim index bloat in a large database by hand without going offline. Thus enter concurrentReindex.php. This PHP CLI script will concurrently reindex all non-unique, non-system-catalog indexes in a given database. I have only tested this against a Postgres 8.2 database, but in theory it should work against 8.2 or higher. Feedback, enhancements and patches welcome.
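The post does not show how concurrentReindex.php does its work. One common way to rebuild a non-unique index without a blocking REINDEX, assumed here, is to build a copy with CREATE INDEX CONCURRENTLY, drop the bloated original and rename the copy into place. The sketch below is in Python rather than PHP; the function name, the `_new` suffix and the simple-identifier restriction are all hypothetical choices for illustration.

```python
import re


def concurrent_reindex_statements(index_name, index_def, suffix="_new"):
    """Build SQL that rebuilds a non-unique index via a concurrent copy.

    index_def is the text returned by pg_get_indexdef() for the index,
    e.g. 'CREATE INDEX idx ON public.t USING btree (col)'.
    Assumes a simple unquoted index name that is visible on the search path.
    """
    match = re.match(r"CREATE INDEX (\S+) ON ", index_def)
    if match is None or match.group(1) != index_name:
        # Unique indexes ('CREATE UNIQUE INDEX ...') land here and are skipped.
        raise ValueError("not a plain index definition for " + index_name)

    temp_name = index_name + suffix
    # Reuse everything after the index name so columns, access method and
    # any WHERE clause are copied unchanged.
    build = "CREATE INDEX CONCURRENTLY " + temp_name + index_def[match.end(1):]
    return [
        build,
        "DROP INDEX " + index_name,
        "ALTER INDEX " + temp_name + " RENAME TO " + index_name,
    ]
```

Note that CREATE INDEX CONCURRENTLY cannot run inside a transaction block, and the final DROP and RENAME still take a brief exclusive lock, so this shortens the blocking window rather than removing it.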

Follow-up on PDO and libpq Bug

Posted on April 18, 2008

A while back I blogged about a bug with PDO and prepared statements. After digging a little, it appears that native prepared statement support was added as of libpq.so.4 (PostgreSQL 8.0 tree). If PDO detects libpq.so.4 at compile time, it will use the libpq prepare instead of its own internal prepare. What then happens is that random prepared statements fail under heavy use. Again, the only way around this right now is to use libpq.so.3 from PostgreSQL 7.4. Now to figure out who to bug about it. I have a feeling the PDO folks will point to libpq and the libpq maintainers will point to PDO.

Zend Studio 6

Posted on April 18, 2008

I'll admit, I was not happy with the jump from Zend Development Environment 5.5 to Zend Studio for Eclipse 6.0. I've tried other Eclipse-based IDE implementations, most notably Aptana. I found them to be slow, buggy and not very reliable. Eclipse is complicated and the interface designers clearly shouldn't be designing interfaces. I have to give Eclipse its due though, it is very flexible and robust.

I voiced my concern about Zend Studio 6 to my contacts at Zend, including the product manager for Zend Studio. To me it was too far a departure from ZDE. From a business perspective I understand why they discontinued ZDE and moved to Zend Studio: money. Why pay programmers to maintain the core development environment when you can pay them to enhance an already popular and extendable Eclipse? That being said, as a developer I feel alienated and left out in the cold.

I decided about two months ago to give Zend Studio a fair shake. Not the load-it-up, poke-around-in-it, find-what-I-disliked-and-move-on evaluation I initially did. What I have found since is that I like some things about it, but the negative aspects far outweigh the positive.

Eclipse, and Zend Studio on top of it, is slow, unstable and prone to lockups. The remote development support, mounting remote filesystems over SSH or SFTP, is unreliable and unusable. Remote projects and the various Zend Studio specific add-ons feel more like a kludgy afterthought than a core part of the overall experience. The frequent lockups, the instability and the time it takes to work on remote filesystems, in comparison to ZDE 5.5, have made me switch back. Even the random errors in ZDE due to the Leopard upgrade have far less impact than the issues with Zend Studio for Eclipse.

I do have to hand it to Zend Studio though, and it's more credit to Eclipse than to Zend: the XML editing in Eclipse is hands down better than ZDE.

ZendCon Day One, Session Two

Posted on October 09, 2007

I'm at ZendCon in San Francisco today and just sat down at my second session of the day - "Zend Framework Quick Start." After initial reservations about sitting through sessions on technologies that don't interest me, it occurred to me that with Zend behind it, the Zend Framework will likely become the 800-pound gorilla in the framework space. Ultimately I decided to go so I could see how they handle MVC and what kind of things I could take from their implementation. Interestingly, the presenter started off by saying that Zend Framework isn't something a site like Tagged would want to use.

The first session I went to, which I will not name, was so poor that I walked out. I had hoped to get something from it, but between language-barrier communication issues and an oversimplified approach to the subject, there was no real information to be had beyond what the concept it covered was.

Parallel Reindexing

Posted on September 24, 2007

One of the issues we’ve had to deal with at myYearbook is how to reduce index bloat in PostgreSQL. The reindex database command does this pretty handily, but since it is performed table by table in serial, it can be pretty slow when you're sporting 30GB+ tables. To solve this I threw together this little CLI PHP script.

The idea is pretty straightforward - get a list of all the tables with indexes from the PostgreSQL system catalog for a given database, split them up into chunks and run multiple reindexes at the same time. If you find it useful or have ideas to improve it, drop me a note and let me know.

Here is an example of usage:

[gmr@gmr-imac ~]$ ./parallelReindex.php 

Error: Required parameters not set.

Usage: parallelReindex.php parameters

Parameters:

  -host      Specify the database host to connect to - required
  -port      Specify the port to connect to
  -dbname    Specify the database name to connect to - required
  -user      Specify the database user to connect as - required
  -password  Specify the database user password
  -threads   Set the number of parallel reindex tasks to run (default 15)

Example:

  ./parallelReindex.php -host localhost -user postgres -dbname test
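
The chunk-and-run-in-parallel idea described above can be sketched roughly as follows. This is Python rather than the PHP of the actual script, and it is only an illustration: the function names, the round-robin split and the `run_sql` callback (standing in for whatever executes a statement on its own database connection) are assumptions, not taken from parallelReindex.php.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(tables, chunk_count):
    """Deal the table names round-robin into at most chunk_count lists."""
    if not tables:
        return []
    chunk_count = max(1, min(chunk_count, len(tables)))
    return [tables[i::chunk_count] for i in range(chunk_count)]


def reindex_in_parallel(tables, threads, run_sql):
    """Reindex each chunk serially, with all chunks running at the same time.

    run_sql is a hypothetical callable that executes one SQL statement;
    in real use each worker would need its own database connection.
    Returns the number of tables processed.
    """
    chunks = split_into_chunks(tables, threads)
    if not chunks:
        return 0

    def work(chunk):
        for table in chunk:
            run_sql("REINDEX TABLE " + table)
        return len(chunk)

    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        return sum(pool.map(work, chunks))
```

A round-robin split keeps the chunks close to equal in table count but not in size; with a few 30GB+ tables in the mix, ordering the list by table size before splitting would likely balance the work better.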

PHP, not just for websites

Posted on February 13, 2006
Some idiot on GameSurge decided to SYN flood the site today. After the lovely parting message “<JacKer> say bye to your site”, a minor SYN flood from a total of 4 IP addresses hit the webserver. I ssh'ed in, realizing I hadn't copied over the old iptables firewall rules to the new webserver box and thus the auto-SYN-flood filter wouldn't kick in. After spending a few minutes coding this PHP script, which runs from the CLI, I was able to test it and watch it filter out the 4 IP addresses spewing SYN packets.

It works by first running netstat and gathering the IP addresses in a state of SYN_RECV. It then gets a list of already filtered IP addresses from iptables. If an IP address has more than 3 connections in the SYN_RECV state and is not already being dropped by iptables, it gets added to a list and dropped. I plan on making this a little more sophisticated in the future; for example, one cool thing to do would be to look for IP addresses in the same subnet and drop the subnet if there are enough to justify it. Anyway, if you find this helpful let me know. I've only tested it on Gentoo Linux with PHP 5.1 but I can't imagine it wouldn't work on any BSD-based system.
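
The detection step can be sketched as below, in Python rather than the PHP of the original script. It assumes Linux-style `netstat -nt` output with IPv4 addresses, where the foreign address is the fifth column and the state is the sixth; the function name and the way the already-dropped addresses are passed in are hypothetical.

```python
from collections import Counter


def find_syn_flooders(netstat_output, already_dropped, threshold=3):
    """Return IPs with more than `threshold` SYN_RECV connections.

    netstat_output is the text of `netstat -nt` (assumed Linux layout:
    proto, recv-q, send-q, local address, foreign address, state).
    already_dropped is the set of IPs iptables is already filtering.
    """
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[5] == "SYN_RECV":
            # The foreign address looks like '10.0.0.5:51234'; keep the IP part.
            ip = fields[4].rsplit(":", 1)[0]
            counts[ip] += 1

    return sorted(
        ip for ip, seen in counts.items()
        if seen > threshold and ip not in already_dropped
    )
```

Each returned address could then be handed to something like `iptables -I INPUT -s <ip> -j DROP`; the exact rule the original script adds is not given in the post, so that command is an assumption too.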

Switch to PHP 5.1.1 Now

Posted on November 28, 2005
If you needed a reason to switch to PHP 5.1 besides Framewerk, how about speed improvements? In some testing I'm doing right now on the use of single vs. double quotes and such, I've found that PHP 5.1.1 is up to 3 times faster than PHP 4.4.0 on the same hardware. That's a significant speed improvement if you ask me. More details to come.

#5 - Listener Questions and Comments and DOM

Posted on November 03, 2005
In this podcast I answer some listener questions, respond to comments, and experiment with going into a little more technical detail than normal in talking about PHP's DOM functionality.

Gentoo 2005.1, MySQL vs. Oracle, PostgreSQL 8.1 and Random PHP Stuff

Posted on October 12, 2005
Anecdotal stories about installing Gentoo 2005.1, a bit on Oracle's purchase of InnoBase, PostgreSQL 8.1, and what I've been working on this week in PHP.

#2 - OS/X, PDO, and Framewerk

Posted on October 01, 2005
In my second podcast I cover my impressions of OS/X after using it for six months, PHP's new PHP Data Objects, Framewerk, tracking down segfaults in PHP, and a new website project idea.

Zend IDE: A Surprise

Posted on July 12, 2005
I've downloaded evaluation copies of it before. I've gone as far as to install it. But I never spent much time with it. As a vi type guy, I've had a tendency to stay away from big IDEs for PHP development. That is, I did until I started playing with KDevelop. While it didn't get me 100% of the way to where I wanted to be, the integrated CVS tools and object browsing were enough to get me hooked.

With my switch to the Mac Mini and my still messed up key mappings, I decided it was time to try the Zend IDE out for real. Installation was easy enough, though as of right now the local server isn't working quite right, and I don't have a remote one set up yet. So, I've not even touched the debugging portions of it yet.

But what I have done is started editing my code in it. The editor is surprisingly fast on this machine for being a Java application. Seems that they're working hard to change my biases against Java-based applications. The code completion bit is fast and works great. The file and object browsers work great, and if that wasn't enough to justify using it, the integrated SQL connectivity is really nice. I like having the ability to browse a tree of my database objects and run queries right from my editor instead of alt-tabbing to a terminal window or pgAdmin III.

Like the Mac Mini, I'm going to give it a bit to see if it sticks, but I must say, it's a good start.