Example of a custom aggregate in PostgreSQL
Yesterday I switched my development environment to PostgreSQL 8.4, and so today I foolishly used the PostgreSQL 8.4 manual while I was developing, without thinking that I might be using some new functionality. Silly me!
What I wanted to do was to convert a column of words into a comma-delimited list (for readability, not for export), to get output something like this:
id | tags -----+---------------------------------------------- 141 | DAViCal, FOSS, Programming, CalDAV, Releases 138 | Family, Life, Kids 137 | Kids, Family, Rants 136 | Life, FOSS, Debian, lca 135 | Releases, FOSS, Packages, Debian, DAViCal
Where the table has two columns 'id' and 'tag', like:
id | tag -----+------------- 141 | Releases 141 | Programming 141 | CalDAV 141 | FOSS 141 | DAViCal 138 | Kids 138 | Life 138 | Family 137 | Kids 137 | Family 137 | Rants 136 | Debian 136 | lca 136 | Life 136 | FOSS 135 | Packages 135 | Releases 135 | DAViCal 135 | Debian 135 | FOSS
I looked at this and thought: that's just the job for an aggregate function! It's like sum(), except it concatenates!
Looking through the PostgreSQL 8.4 manual I discovered an aggregate function called array_agg() which didn't quite do what I want, but it was willing to convert the column into an array. Looking a little further I found array_to_string() which did the rest of the job for me, so I ended up with syntax like this:
SELECT id, array_to_string(array_agg(tag),', ') FROM tagging GROUP BY id
Perfect!
Or so it seemed, until I rolled it into my staging site and discovered that array_agg() is a new function... It seemed like such an obvious thing to want to do that I had assumed it must have been around forever!
So: back to the drawing board, where I wrote a
list_of( TEXT )
aggregate as follows:
CREATE OR REPLACE FUNCTION concat( TEXT, TEXT ) RETURNS TEXT AS $$
SELECT $1 || (CASE WHEN $1 = '' THEN '' ELSE ', ' END) || $2;
$$
LANGUAGE SQL STRICT IMMUTABLE;
CREATE AGGREGATE list_of ( TEXT ) (
SFUNC = concat,
STYPE = TEXT,
INITCOND = ''
);
First I had to create the concat(TEXT,TEXT) function to concatenate each value onto the earlier ones, and then I had to create the list_of(TEXT) aggregate using that function. In the end it was surprisingly easy, so I may well find myself creating custom aggregates for things like this more in the future. Most of my time was, in fact, spent puzzling through the manual trying to figure it out.
With it all created I can now use the following SQL in my code:
SELECT id, list_of(tag) FROM tagging GROUP BY id
Perfect! Well, nearly :-)
Having got to this point it seemed to me that I should be able to specify a different join text, as I could when I was using the array_to_string() function. In fact this proved not to add much complexity to the problem at all, as follows:
CREATE OR REPLACE FUNCTION concat_delimited( TEXT, TEXT, TEXT ) RETURNS TEXT AS $$
SELECT $1 || (CASE WHEN $1 = '' THEN '' ELSE $3 END) || $2;
$$
LANGUAGE SQL STRICT IMMUTABLE;
CREATE AGGREGATE delimited_list_of ( TEXT, TEXT ) (
SFUNC = concat_delimited,
STYPE = TEXT,
INITCOND = ''
);
Now I can use the following SQL, allowing me to specify the delimiter:
SELECT id, delimited_list_of(tag, ' = ') FROM tagging GROUP BY id
And lo:
id | delimited_list_of -----+-------------------------------------------------- 141 | Releases = Programming = CalDAV = FOSS = DAViCal 138 | Kids = Life = Family 137 | Kids = Family = Rants 136 | Debian = lca = Life = FOSS 135 | Packages = Releases = DAViCal = Debian = FOSS
I've created the functions as 'IMMUTABLE', since I know that whatever gets passed into concat(TEXT,TEXT) or concat_delimited(TEXT,TEXT,TEXT) the output will always be the same for that set of arguments.
I've also created the functions as 'STRICT', since that's what I tend to do by default, but in this case it is the right thing too. The function does not need to handle NULL values explicitly since the aggregate function won't be called for NULL values, and the existing state will be retained for the next non-NULL call.
We can confirm that, too:
INSERT INTO tagging(id,tag) VALUES(137,NULL); SELECT id, delimited_list_of(tag, ' = ') FROM tagging GROUP BY id; id | delimited_list_of -----+-------------------------------------------------- 141 | Releases = Programming = CalDAV = FOSS = DAViCal 138 | Kids = Life = Family 137 | Kids = Family = Rants 136 | Debian = lca = Life = FOSS 135 | Packages = Releases = DAViCal = Debian = FOSS
Perfect!
Recent comments
2 weeks 2 days ago
2 weeks 2 days ago
2 weeks 2 days ago
6 weeks 6 days ago
9 weeks 1 day ago
9 weeks 1 day ago
9 weeks 2 days ago
9 weeks 6 days ago
9 weeks 6 days ago
10 weeks 5 hours ago