Very fast 100 million row database tables

Posted 12 years ago

One common way to speed up a database table once it gets BIG is to reorganize its data, physically reordering the rows on disk around a frequently queried column. Done natively with PostgreSQL's CLUSTER command, though, this is an intensive operation that takes an exclusive table lock, freezing any inserts until it finishes.
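
For reference, here's what that native, lock-heavy reorder looks like (the table and index names below are placeholders):

-- PostgreSQL's built-in CLUSTER physically rewrites the table in index order.
-- It holds an ACCESS EXCLUSIVE lock the whole time, blocking reads and writes.
CLUSTER big_table USING big_table_user_id_idx;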

Our "user_answers" table contains over 100 million records (4 million users times an average of 27 questions answered). Lately, this table has been averaging 1,500 inserts a minute. A 15 minute freeze on the table, while partition does it's thing, just won't due.

Luckily there's pg_reorg, one of my favorite add-on modules for PostgreSQL. In a nutshell, pg_reorg builds a reorganized replica of the table, then quickly swaps it in for the live table once it's done, with almost zero impact on inserts. I've noticed up to 300% faster reads on some queries after properly reordering a table.

Installing pg_reorg for PostgreSQL 9.1 was a breeze. Installing for 9.2 required a patch, which can be found here.

Here's my Ubuntu 12.04 install script for PostgreSQL 9.1:

$ sudo apt-get install postgresql-contrib-9.1 postgresql-server-dev-9.1 libpam-dev libedit-dev
$ wget http://pgfoundry.org/frs/download.php/3104/pg_reorg-1.1.7.tar.gz
$ tar xvzf pg_reorg-1.1.7.tar.gz
$ cd pg_reorg
$ make
$ sudo make install

Then add it to your database:

$ psql -f /usr/share/postgresql/9.1/extension/pg_reorg.sql -d db_isidewith
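
A quick sanity check that the script worked, assuming it installs its helper objects into the default "reorg" schema:

-- If this returns a row, the pg_reorg SQL script ran successfully.
SELECT nspname FROM pg_namespace WHERE nspname = 'reorg';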

Then reorganize a table:

$ /usr/lib/postgresql/9.1/bin/pg_reorg -t user_answers -o user_id -d db_isidewith
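
The -o flag tells pg_reorg which column to order the rows by. One way to verify the reorder took is to check the planner's correlation statistic for that column after a fresh ANALYZE; it should sit at or near 1.0:

-- correlation of 1.0 means the table's physical row order on disk
-- matches the order of user_id values.
ANALYZE user_answers;
SELECT tablename, attname, correlation
FROM pg_stats
WHERE tablename = 'user_answers' AND attname = 'user_id';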

What is your favorite method for speeding up large tables?