Ten New Features that Make PostgreSQL 8.3 a Must-Have

PostgreSQL 8.3's new and nifty features could turn the head of even the MySQL faithful, making it a must-have upgrade.

What do you expect from a software upgrade? Bugfixes? Dozens of new, useful features, all clearly documented? Double or triple performance increases?

The latest version of PostgreSQL provides all of those things, plus an active, growing community of developers, administrators and end-users. The growth of the community is reflected in the breadth and depth of changes in version 8.3, likely to be released January 2008.

This article describes 10 new features system administrators and application developers can immediately use. Each feature includes examples for implementation, and descriptions are meant to be introductions to both PostgreSQL and the database-specific technology implemented. Examples use data from the recent Perl Survey (www.perlsurvey.org).

You can find more information about PostgreSQL at www.postgresql.org.

Statistics Collection

Because of many improvements in the performance of the collector process, statistics collection is now enabled by default. Database statistics are tracked in the pg_stat_* system views, which are described in Chapter 26 of the PostgreSQL documentation (http://tinyurl.com/yrxbhr).

One way to identify performance problems is to look for long-running queries. A new column, xact_start, was added to pg_stat_activity to help determine how long a transaction associated with a particular query has been running.

Here is an example query that looks at xact_start, query_start and backend_start:

SELECT datname, procpid, usename, current_query, xact_start, query_start, backend_start
FROM pg_stat_activity;

The return values are below, separated by ‘|’, with the headers stripped away for clarity:

test |    5552 | selena | SELECT datname,procpid,usename, \
current_query, xact_start, query_start, backend_start \
FROM pg_stat_activity; | 2007-12-31 08:01:08.043345-08 | \
2007-12-31 08:01:19.514915-08 | 2007-12-28 16:35:43.270908-08

There is a slight time difference between xact_start and query_start. In this case I used BEGIN to start a transaction before I issued the SELECT query. You can also see that the client connection the query was issued from has been running for a few days.
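Building on xact_start, a query along these lines (a sketch; ordering by transaction age is one reasonable choice) surfaces transactions that have been open suspiciously long:

```sql
SELECT procpid, usename, now() - xact_start AS xact_age, current_query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
ORDER BY xact_age DESC;
```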

Sequential scans are another source of performance problems. A sequential scan walks through a database table one row at a time searching for matching data, which can be computationally expensive compared to a query that uses an index.

You can look at pg_stat_user_tables to find information about sequential scans occurring in your system. An unexpected increase in sequential scans typically means one of two things to an administrator: a query needs to be rewritten, or a new index on the database table is needed.

SELECT relname, seq_scan, seq_tup_read, idx_scan, idx_tup_fetch
FROM pg_stat_user_tables
ORDER BY seq_scan DESC;

          relname             | seq_scan | seq_tup_read | idx_scan | idx_tup_fetch
------------------------------+----------+--------------+----------+---------------
response                      |       26 |       133475 |       11 |            48
industries_response           |        1 |         7053 |        0 |             0
perl_versions_response        |        0 |            0 |        0 |             0
platforms_response            |        0 |            0 |        0 |             0
country                       |        0 |            0 |        0 |             0
programming_languages_response|        0 |            0 |        0 |             0

If this query on pg_stat_user_tables is run repeatedly over time, you can see when the number of sequential scans increases or decreases as queries and usage patterns change. In combination with query logging, you could then pinpoint the queries causing problems.

Per-statement statistics can be reported to the server log with log_parser_stats, log_planner_stats, log_executor_stats and the catch-all: log_statement_stats. These are all disabled by default. Such log entries may provide a rough indicator of problem areas with no programming or query writing required.

Two third-party monitoring tools released in 2007 are also available: a Simple Network Management Protocol (SNMP) agent and a command-line tool called ptop.

Statistics collection overhead has been estimated to be about 1%. You can turn statistics collection off in the server configuration file. Be aware that turning statistics collection off can have side-effects, and is not recommended for typical use in version 8.3.
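If you do need to disable or trim collection, the relevant postgresql.conf settings in 8.3 are track_activities and track_counts; note that autovacuum relies on track_counts, which is one of the side effects mentioned above:

```
track_activities = on   # per-query information in pg_stat_activity
track_counts = on       # table/index access counts; required by autovacuum
```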

Server log output in CSV

In PostgreSQL 8.3, server logs can be saved in Comma Separated Values (CSV) format. With this feature, you can load the logfile directly into a spreadsheet application or a database table without any special text processing. Server configuration for this feature is done in the postgresql.conf file, located in the $PGDATA directory for your database instance.

1 logging_collector		= on
2 log_destination		= csvlog
3 log_min_error_statement	= error
4 log_error_verbosity		= verbose
5 log_directory			= '/var/log/postgresql'
6 log_filename			= 'postgresql-%a'
7 log_rotation_age		= 1440
8 log_truncate_on_rotation	= on
9 log_rotation_size		= 0   # disabled

In lines 1 and 2, set logging_collector to on (this parameter requires a database restart) and log_destination to csvlog. By default, system logging is turned off.

A table definition you’ll see later in this article requires setting log_min_error_statement to error and log_error_verbosity to verbose in lines 3 and 4.

In line 5, set log_directory, which controls where log files are written, to an appropriate directory. I’ve chosen /var/log/postgresql, but this is flexible.

Looking at lines 6, 7, and 8, log_filename and log_rotation_age (in minutes) are set to values that make the filenames easy to predict, which simplifies automating the loading of logs into a database. Setting log_truncate_on_rotation to on makes sure old data doesn’t get mixed up with new data in the same file.

Setting log_filename to postgresql-%a creates one logfile per day, named postgresql-Mon.csv, postgresql-Tue.csv, and so on. After seven days, the first logfile in the series is overwritten rather than appended to, because log_truncate_on_rotation is on.

In line 9 I disable size-based log rotation by setting log_rotation_size to 0 (the default is 10MB). By disabling this type of rotation, your filenames will remain predictable and make automating database loads easier.

To generate some log data for testing, try setting log_duration to on, which will log the duration of every query executed on the database. You could also log different types of queries that are executed (log_statement), or connections to your database (log_connections).
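These settings also live in postgresql.conf; a sketch that logs plenty of activity for testing follows (don't leave log_statement = 'all' enabled on a busy production server):

```
log_duration = on        # log the duration of every completed statement
log_statement = 'all'    # log the text of every statement
log_connections = on     # log each successful connection
```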

The PostgreSQL documentation offers this example table and command for loading CSV logfiles:

CREATE TABLE postgres_log (
  log_time timestamp(3) with time zone,
  user_name text,
  database_name text,
  process_id integer,
  connection_from text,
  session_id text,
  session_line_num bigint,
  command_tag text,
  session_start_time timestamp with time zone,
  virtual_transaction_id text,
  transaction_id bigint,
  error_severity text,
  sql_state_code text,
  message text,
  detail text,
  hint text,
  internal_query text,
  internal_query_pos integer,
  context text,
  query text,
  query_pos integer,
  location text,
  PRIMARY KEY (session_id, session_line_num)
);

COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;

Combined with the configuration above, you can call the COPY command in a shell script and use cron to load your logfiles daily.
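A minimal crontab entry along these lines would load the previous day's file shortly after rotation (assumptions: the configuration above and a database named test; note that % characters must be escaped in crontab):

```
# m h dom mon dow  command
5 0 * * *  psql -d test -c "COPY postgres_log FROM '/var/log/postgresql/postgresql-$(date -d yesterday +\%a).csv' WITH csv;"
```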


Sorting NULLs

Sorting NULLs can be a major issue for portability between databases. MySQL (like MS SQL Server) sorts NULLs as less than other values, while PostgreSQL (like Oracle and DB2) sorts NULLs as greater than other values. The SQL standard doesn’t specify which ordering is correct, but that doesn’t make the difference any less annoying.

As of 8.3, PostgreSQL allows you to choose where NULLs end up with an addition to the ORDER BY clause. Let’s look at an example (assuming the survey’s response table has an income column):

SELECT income FROM response ORDER BY income DESC NULLS LAST;

With NULLS LAST, all the responses where a person chose not to state their income will appear after responses where an income was stated.

Full-Text Search Integration

Full-text search (FTS) is invaluable whether you’re storing documentation, hosting a Web site or managing an enterprise resource planning suite. Tsearch2, a search package based on OpenFTS, was available as a contrib module prior to 8.3. It required a little extra work to get it compiled and running, but now it’s fully integrated and requires no special database configuration to use.

Full-text search in PostgreSQL supports a data type called tsvector. This data type is a sorted list of lexemes gathered from a source document. A lexeme roughly corresponds to the root of a word without its prefixes or suffixes. For example, fluent, fluentness, fluently and fluency are all forms of the lexeme “fluent.”

Using lexemes instead of the actual words in search data types makes the search functions more flexible and useful than the more familiar SQL search operators such as ~, ~*, LIKE and ILIKE. Those SQL operators treat search strings literally, and don’t have the linguistic smarts to know that ‘fluency’ and ‘fluently’ have the same root word.

To use full-text search, you must first convert your document to the tsvector data type with the to_tsvector() function. Then, to search the document, you create a search query of type tsquery with the to_tsquery() function. This example searches for the word ‘fluent’ and uses @@ (the MATCH operator):

CREATE TABLE response (
	comments TEXT
);

SELECT comments FROM response
WHERE to_tsvector('english', comments) @@ to_tsquery('english', 'fluent');

You’ll notice the first argument in both functions is ‘english’. This specifies which dictionary PostgreSQL will use to parse and search the document. PostgreSQL allows multiple dictionaries, which are configured at run-time in postgresql.conf.
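If you’d rather not repeat the dictionary name in every call, 8.3 also lets you set a default text search configuration, either per session (as below) or in postgresql.conf:

```sql
SET default_text_search_config = 'pg_catalog.english';

-- The one-argument forms now use the default configuration
SELECT comments FROM response
WHERE to_tsvector(comments) @@ to_tsquery('fluent');
```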

You can even create your own, special-purpose dictionaries for doing things like normalizing similar URLs, or searching in domain-specific languages. If you are looking for non-English dictionaries, they are available for eight languages at http://tinyurl.com/6lljm. The site also includes several detailed tutorials, including one for creating your own, specialized dictionaries.

Simple queries that use tsvector conversions on text work fine for small amounts of data. As the number and size of documents grow, you will need to create indexes to speed up searches. Creating an index is straightforward:

CREATE INDEX comments_idx ON response USING gin(to_tsvector('english', comments));

Generalized Inverted Index (GIN) is an index type well suited to composite values such as tsvector. You can find more information about GIN indexes in the PostgreSQL documentation.

The comments_idx index we just created will be used on any queries that include to_tsvector('english', comments).
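GIN is not the only choice: the same expression can also be indexed with GiST, which is generally faster to update but slower to search, a reasonable trade-off for tables with frequent writes:

```sql
CREATE INDEX comments_gist_idx ON response
USING gist(to_tsvector('english', comments));
```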

Creating a separate column to hold the tsvector is another way to speed up full-text search. If you do this, you need triggers to automatically update the tsvector column when the related text is updated. You can write your own trigger functions, but PostgreSQL conveniently provides two: tsvector_update_trigger and tsvector_update_trigger_column.

Assuming your table looks like this:

CREATE TABLE response (
	comments TEXT,
	tsv_comments tsvector
);

You would define your trigger this way:

CREATE TRIGGER tsv_comments_update BEFORE INSERT OR UPDATE ON response
FOR EACH ROW EXECUTE PROCEDURE
tsvector_update_trigger(tsv_comments, 'pg_catalog.english', comments);

Another very useful function is ts_headline, which allows you to show fragments of the returned document with the search terms highlighted. The example below wraps all instances of fluency in HTML bold tags:

SELECT ts_headline(comments, to_tsquery('english', 'fluency'))
FROM (SELECT comments FROM response
      WHERE to_tsvector('english', comments)
      @@ to_tsquery('english', 'fluency')) AS matches;

There are many more functions that allow you to do things like rewrite query search terms (ts_rewrite), gather document statistics (ts_stat), or examine exactly what the query generator uses for lexemes (parsetree).
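As one example, ts_stat takes a query (passed as a string) that returns tsvector values and reports how often each lexeme appears, which is handy for finding the most common terms in your documents:

```sql
SELECT word, ndoc, nentry
FROM ts_stat('SELECT to_tsvector(''english'', comments) FROM response')
ORDER BY nentry DESC
LIMIT 10;
```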

There were some minor syntax changes between the contrib module and integration into version 8.3, so if you were already using tsearch2, have a look in the contrib directory for help with migrating your existing installation.

XML data type

The XML data type included in 8.3 supports the ANSI SQL:2003 standard and requires a special compile-time option for use. It was initially developed through a Google Summer of Code project in 2006, and is now fully supported with an array of helper functions and supporting documentation.

To create a value of type xml, you would use the function xmlparse():

xmlparse(DOCUMENT '<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">...</plist>');

As with other PostgreSQL data types, many functions are available to work with the XML data type. These include xmlforest() and xmlelement(). For exporting XML data from a database table, you might use xmlforest():

SELECT xmlforest (
"confirmation_key" as "Key",
"sex" as "Gender",
"country_of_residence" as "Country")
FROM response
where notify = 1;

This will produce XML formatting along these lines for each row returned from the query (sample values shown):

<Key>4b5c6d</Key><Gender>male</Gender><Country>Australia</Country>

If you wanted to wrap each of these individual lines in a higher level XML element, you can use xmlelement():

SELECT xmlelement (
 name respondent, xmlforest (
"confirmation_key" as "Key",
"sex" as "Gender",
"country_of_residence" as "Country" ))
FROM response
where notify = 1;

This will wrap each element in the forest in the element ‘respondent’ (sample values shown):

<respondent><Key>4b5c6d</Key><Gender>male</Gender><Country>Australia</Country></respondent>

PostgreSQL 8.3 also has functions for mapping tables, queries, and cursors to XML. Each has the option of returning XML documents or fragments. Many helper functions are also available for generating the matching schema for the data generated by any of the mapping functions. Section 9.1 of the PostgreSQL 8.3 documentation describes each of these utilities in detail.
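For instance, table_to_xml maps an entire table to a single XML document; its arguments control NULL handling, forest-versus-document output, and a target namespace:

```sql
SELECT table_to_xml('country', true, false, '');
```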

Use of this data type requires PostgreSQL to be built with configure --with-libxml.

ENUM data type

An enumerated data type is an ordered and sortable list of items. You can compare items contained in this data type using standard operators like =, <, >.

If you search the Web for “enum” and “postgresql” you will find many useful tutorials for getting around the fact that PostgreSQL has, until now, not supported them. Suggestions included creating a DOMAIN with a constraint, creating a lookup table and using foreign keys, or writing a bit of code to create your very own ENUM type. You don’t have to do that anymore! Here’s how you use ENUM:

CREATE TYPE headset_diameter
AS ENUM ( '1', '1-1/8', '1-1/4', '1-1/2');

My headset_diameter ENUM is a list of diameters in inches. This is useful for producing sorted lists of items based on the diameter, without storing a decimal. Sort order is determined by the order the values are listed when the ENUM is created.
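Once the type exists, it behaves like any other column type. A hypothetical table using it:

```sql
CREATE TABLE headset (model TEXT, diameter headset_diameter);
INSERT INTO headset VALUES ('Threadless Deluxe', '1-1/8');

-- Sorts and compares in declaration order, not alphabetically
SELECT * FROM headset WHERE diameter > '1' ORDER BY diameter;
```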

New type-casting behavior

PostgreSQL is known for its strict adherence to standards. A common practice is to convert, or cast, values from one data type to another to make SQL queries easier to write. In the past, if a non-character value was supplied to an operator or function that could use TEXT, the value would be automatically converted to a string without reporting an error. In version 8.3, values from non-character types are no longer automatically cast. Instead, an error will be thrown.

To help ease the transition, a generalized conversion was created to manage the casts to and from any string type (TEXT, CHAR, VARCHAR) for every other supported data type.

For example, substr() expects TEXT as its first argument, and a DATE value is no longer cast automatically (http://tinyurl.com/39qmne):

test=# SELECT substr(current_date, 1, 4);
ERROR:  function substr(date, integer, integer) does not exist
LINE 1: SELECT substr(current_date, 1, 4);
               ^
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.

Now, you must explicitly cast values if you would like to use them this way:

test=# SELECT substr(current_date::TEXT, 1, 4);
 substr
--------
 2007
(1 row)

You can also use CAST() to convert types explicitly.
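For example, the same query written with standard CAST syntax:

```sql
SELECT substr(CAST(current_date AS TEXT), 1, 4);
```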

What’s new in PL/PgSQL

PostgreSQL supports many procedural languages. PL/PgSQL, PL/Tcl, PL/Perl, and PL/Python are all available in the standard distribution. Support for Java, PHP, Ruby, R, Scheme, and Unix shells is also available through third parties.

PL/PgSQL got some special attention in 8.3 with RETURN QUERY, which is syntactic sugar for FOR...LOOP statements like the following:

FOR foo IN SELECT country, years_programming FROM response LOOP
    RETURN NEXT foo;
END LOOP;

In PostgreSQL 8.3, this loop can be replaced with the following:

RETURN QUERY SELECT country, years_programming FROM response;

FOR...LOOP statements and RETURN QUERY in PL/PgSQL all use cursors to conserve memory. Cursors are a way of accessing data returned from a query a single row at a time.
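In context, a complete (hypothetical) set-returning function using RETURN QUERY might look like this:

```sql
CREATE OR REPLACE FUNCTION programmers_by_country()
RETURNS SETOF response AS $$
BEGIN
    RETURN QUERY SELECT * FROM response ORDER BY country;
END;
$$ LANGUAGE plpgsql;
```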

Added MOVE to PL/PgSQL

Another useful PL/PgSQL feature is the addition of MOVE for use with cursors. In the past, you could use FETCH on a scrollable cursor to move forward and backward through the query data; this, however, would also return the data from each row. MOVE works exactly like FETCH but does not return the data.

DECLARE programmers CURSOR FOR SELECT key, country,
    years_programming, salary FROM response ORDER BY country;

MOVE RELATIVE 4 FROM programmers;

MOVE allows arbitrary movement through query results without the overhead of returning the data. In long loops of procedural code, this can be a performance aid.

Updatable Cursors

Another new feature in both the core database and PL/PgSQL is updatable cursors. In the past, PostgreSQL cursors were read-only. If a cursor is “simple” — that is, it does not contain a JOIN or a GROUP BY clause — it is possible to update or delete the data in the row the cursor is pointing to.

When creating a cursor you’d like to update, you may want to add FOR UPDATE to the SELECT statement to prevent rows from being deleted or altered while you are examining them.

DECLARE programmers CURSOR FOR SELECT key, country, years_programming,
    salary FROM response FOR UPDATE;

To update the row the cursor is currently positioned on:

UPDATE response SET salary = salary + 5000 WHERE CURRENT OF programmers;

When you use FOR UPDATE in a cursor declaration, the rows the cursor reads are locked. This prevents the data you’re about to update from being altered while you’re in the middle of changing it.
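Putting the pieces together at the SQL level, a minimal (hypothetical) session using an updatable cursor looks like this:

```sql
BEGIN;
DECLARE programmers CURSOR FOR
    SELECT key, country, salary FROM response FOR UPDATE;
FETCH 1 FROM programmers;  -- position on the first row (and lock it)
UPDATE response SET salary = salary + 5000 WHERE CURRENT OF programmers;
COMMIT;
```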

Wrapping Up

You now have ten new reasons to try the latest version of PostgreSQL:

  1. Statistics collection is now enabled by default.
  2. Server log output is available in CSV format, making analysis and maintenance easier.
  3. The NULL sort ordering option.
  4. New type-casting behavior and generalized casts to standard string types.
  5. A new enumerated data type (ENUM).
  6. XML data type and many supporting functions.
  7. Full Text Search (FTS) integrated into the core database code.
  8. The PL/PgSQL RETURN QUERY statement.
  9. The PL/PgSQL MOVE command.
  10. Last but not least: updatable cursors, in both the core database and PL/PgSQL.

If that’s not enough, there are many major performance improvements for any database workload, most requiring no special knowledge or configuration to use. Here’s a short list: HOT (Heap-Only Tuples), multi-worker autovacuum, asynchronous commit, and synchronized sequential scans.

If you’re interested in finding out more, the release notes are a good place to start. The latest distribution packages for PostgreSQL are available for download.
