Category Archives: PL/pgSQL

Checking for “Skew”

Shared Nothing / MPP Database Architecture
Greenplum database is a “Shared Nothing” database in which data is spread out across many segments located on multiple segment hosts. No two segments have the same data and data is distributed based on a distribution key.
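
Distribution keys are declared with the DISTRIBUTED BY clause when a table is created. Here is a minimal sketch (the columns are just for illustration):

CREATE TABLE customer
(customer_id int NOT NULL,
 customer_name text)
DISTRIBUTED BY (customer_id);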

For example, let’s say you have a customer table in Greenplum with the distribution key of (customer_id). Your Greenplum cluster has 16 segments so your data would actually be physically located like this:

Segment    Customer ID
0          1
1          2
2          3
3          4
4          5
5          6
6          7
7          8
8          9
9          10
10         11
11         12
12         13
13         14
14         15
15         16

When a query such as “SELECT COUNT(*) FROM customer” is executed, 16 processes are created (one for each segment) and the rows are counted in parallel across all segments. This is the design behind a Shared Nothing or Massively Parallel Processing (MPP) database.

Skew
So what is skew? In my customer table example, I have even distribution. That is, each segment has the same number of rows. If I distributed instead by state_code and all of my customers are in Tennessee, then all of the data would be in a single segment and thus be skewed.

Skewed tables slow down processing significantly because queries against them don’t take advantage of the entire cluster.

Do I need to Check For Skew?
If you set your distribution key to the logical or physical Primary Key of the table, checking for skew isn’t needed. This will give you great distribution of data.

There are times when you don’t have a PK on a table or you want to set the distribution key to something else because you commonly join two very large tables together. For example, customer and orders tables may perform better if distributed by customer_id so that the data is collocated. However, if the majority of your orders are by one customer, you will have skew in the orders table.
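
As a sketch of that co-location strategy (the orders columns are assumptions for illustration), distributing both tables on customer_id lets joins on that column happen segment-locally without redistributing rows:

CREATE TABLE orders
(order_id bigint NOT NULL,
 customer_id int NOT NULL,
 order_amount numeric)
DISTRIBUTED BY (customer_id);

--joins on customer_id are co-located with the customer table
SELECT c.customer_id, SUM(o.order_amount)
FROM customer c
JOIN orders o ON o.customer_id = c.customer_id
GROUP BY c.customer_id;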

Checking for Skew Method 1
This is the most common and straightforward way to check for skew.

SELECT gp_segment_id, COUNT(*)
FROM customer
GROUP BY gp_segment_id
ORDER BY 1;

This will count the rows per segment and give you a good indication of the skew. The downside is that it physically counts the rows of one table at a time, which takes time if you have a large number of tables.

Checking for Skew Method 2
I developed this second approach, which looks at the file sizes of each table on each segment. It then outputs only the tables that have at least one segment with more than 20% more bytes than expected.
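
For example, on the 16-segment cluster above, even distribution would put roughly 1/16 = 6.25% of a table’s bytes on each segment, so the view flags any table where a single segment holds more than 1.2 * 6.25% = 7.5% of the total size. This is the v_skew_amount := 1.2*(1/v_num_segments) calculation in the function below.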

Execute the following script in your database.

CREATE OR REPLACE FUNCTION fn_create_db_files() RETURNS void AS
$$
DECLARE 
        v_function_name text := 'fn_create_db_files';
        v_location int;
        v_sql text;
        v_db_oid text;
        v_num_segments numeric;
        v_skew_amount numeric;
BEGIN
        v_location := 1000;
        SELECT oid INTO v_db_oid 
        FROM pg_database 
        WHERE datname = current_database();

        v_location := 2000;
        v_sql := 'DROP VIEW IF EXISTS vw_file_skew';

        v_location := 2100;
        EXECUTE v_sql;
        
        v_location := 2200;
        v_sql := 'DROP EXTERNAL TABLE IF EXISTS db_files';

        v_location := 2300;
        EXECUTE v_sql;

        v_location := 3000;
        v_sql := 'CREATE EXTERNAL WEB TABLE db_files ' ||
                '(segment_id int, relfilenode text, filename text, ' ||
                'size numeric) ' ||
                'execute E''ls -l $GP_SEG_DATADIR/base/' || v_db_oid || 
                ' | ' ||
                'grep gpadmin | ' ||
                E'awk {''''print ENVIRON["GP_SEGMENT_ID"] "\\t" $9 "\\t" ' ||
                'ENVIRON["GP_SEG_DATADIR"] "/' || v_db_oid || 
                E'/" $9 "\\t" $5''''}'' on all ' || 'format ''text''';

        v_location := 3100;
        EXECUTE v_sql;

        v_location := 4000;
        SELECT count(*) INTO v_num_segments 
        FROM gp_segment_configuration 
        WHERE preferred_role = 'p' 
        AND content >= 0;

        v_location := 4100;
        v_skew_amount := 1.2*(1/v_num_segments);
        
        v_location := 4200;
        v_sql := 'CREATE OR REPLACE VIEW vw_file_skew AS ' ||
                 'SELECT schema_name, ' ||
                 'table_name, ' ||
                 'max(size)/sum(size) as largest_segment_percentage, ' ||
                 'sum(size) as total_size ' ||
                 'FROM	( ' ||
                 'SELECT n.nspname AS schema_name, ' ||
                 '      c.relname AS table_name, ' ||
                 '      sum(db.size) as size ' ||
                 '      FROM db_files db ' ||
                 '      JOIN pg_class c ON ' ||
                 '      split_part(db.relfilenode, ''.'', 1) = c.relfilenode ' ||
                 '      JOIN pg_namespace n ON c.relnamespace = n.oid ' ||
                 '      WHERE c.relkind = ''r'' ' ||
                 '      GROUP BY n.nspname, c.relname, db.segment_id ' ||
                 ') as sub ' ||
                 'GROUP BY schema_name, table_name ' ||
                 'HAVING sum(size) > 0 and max(size)/sum(size) > ' || 
                 v_skew_amount::text || ' ' || 
                 'ORDER BY largest_segment_percentage DESC, schema_name, ' ||
                 'table_name';

        v_location := 4300;
        EXECUTE v_sql; 

EXCEPTION
        WHEN OTHERS THEN
                RAISE EXCEPTION '(%:%:%)', v_function_name, v_location, sqlerrm;
END;
$$
language plpgsql;

Now execute the function so it will create the necessary objects in your database.

SELECT fn_create_db_files();

Now select from the view to see the tables that are skewed.

SELECT * FROM vw_file_skew ORDER BY 3 DESC;
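
If you only want to check a single schema, you can filter the view as well (the schema name here is just a placeholder):

SELECT *
FROM vw_file_skew
WHERE schema_name = 'retail'
ORDER BY largest_segment_percentage DESC;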

Summary
When tuning a database, a Greenplum DBA typically only needs to worry about skew rather than tuning individual queries as in other databases. Once the tables are created with good distribution keys, the tuning job is 90% or more done. If you think one particular table is not distributed well, use Method 1. If you need to validate an entire database quickly, use Method 2.

Autonomous Transactions for Logging

Updated July 17, 2013
***************
1. External Web Table to read data should run on HOST, not MASTER.
2. The /data/log_error.sh file needs to be created on every segment host.
3. Need to chmod 755 the .sh file on the segment hosts.
4. Need to touch /data/error.log on all of the segment hosts because the file needs to exist to prevent an error when selecting from the external table that reads it.
***************

An Autonomous Transaction is not a feature of Greenplum database, but it can be achieved with some of the features unique to Greenplum. If you aren’t familiar with an Autonomous Transaction, it is basically a transaction started from within another transaction that commits independently of the first.

What is the point? Well, in database programming, you may run into an EXCEPTION that you want to RAISE to the calling program but you also want to log the error. Raising an ERROR in Greenplum rolls back your transaction so the INSERT to a log table also gets rolled back.

Here is a way to create a log table that enables you to RAISE an ERROR in your code but the INSERT doesn’t get rolled back!

First create the error_log table:

DROP EXTERNAL TABLE IF EXISTS error_log;

CREATE WRITABLE EXTERNAL WEB TABLE error_log
(
function_name text,
location int,
error_message text
)
EXECUTE '/data/log_error.sh'
FORMAT 'CSV' DISTRIBUTED RANDOMLY;

Next create the /data/log_error.sh file:

#!/bin/bash
read in
echo $in >> /data/error.log

Now you can read this error.log file if you want or you can create another EXTERNAL TABLE to read the data like this:

DROP EXTERNAL TABLE IF EXISTS my_errors;

CREATE EXTERNAL WEB TABLE my_errors
(
function_name text,
location int,
error_message text
)
EXECUTE 'cat /data/error.log' ON /*MASTER*/ HOST
FORMAT 'CSV';

So now to test this out:

gpdb=# BEGIN;
BEGIN
gpdb=# INSERT INTO error_log VALUES ('fn_build_mart', 7500, 'ERROR:  division by zero');
INSERT 0 1
gpdb=# ROLLBACK;
ROLLBACK
gpdb=# SELECT * FROM my_errors;
 function_name | location |      error_message      
---------------+----------+-------------------------
 fn_build_mart |     7500 | ERROR: division by zero
(1 row)

So even though I rolled back my transaction, the INSERT into the WRITABLE EXTERNAL WEB TABLE didn’t get rolled back, and thus you have effectively created an Autonomous Transaction in Greenplum.
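
Here is a minimal sketch of how I would use this inside a function (the function name, location, and failing statement are hypothetical): the EXCEPTION block writes to error_log before re-raising, and because error_log is a writable external web table, the log record survives the rollback.

CREATE OR REPLACE FUNCTION fn_build_mart() RETURNS void AS
$$
DECLARE
    v_function_name text := 'fn_build_mart';
    v_location int;
BEGIN
    v_location := 7500;
    --force a division by zero error for demonstration purposes
    PERFORM 1/0;

EXCEPTION
    WHEN OTHERS THEN
        --this INSERT is not rolled back when the exception is raised
        INSERT INTO error_log VALUES (v_function_name, v_location, sqlerrm);
        RAISE EXCEPTION '(%:%:%)', v_function_name, v_location, sqlerrm;
END;
$$
LANGUAGE plpgsql;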

How Do I Determine The Language For A Function?

This is the third in a series of “How do I do X in Greenplum database” and today’s topic is How do I determine the language for a function.

Inline SQL
A common way to use a function is to transform data directly in the SELECT portion of a query. Here is a quick example.

CREATE TABLE person
(id int NOT NULL,
 fname text,
 lname text,
 company_name text)
DISTRIBUTED BY (id);

--sample data
INSERT INTO person VALUES (1, 'Jon', 'Roberts', null);
INSERT INTO person VALUES (2, null, 'Smith', null);
INSERT INTO person VALUES (3, null, null, 'ABC Company');

Now the use case is to retrieve the “name” of the person, but it might be a combination of the three text columns. A business decision was made to format the name as follows:

SELECT CASE WHEN company_name IS NULL AND fname IS NOT NULL AND lname IS NOT NULL THEN lname || ', ' || fname
            WHEN fname IS NULL AND lname IS NOT NULL THEN lname
            WHEN company_name IS NOT NULL AND fname IS NOT NULL AND lname IS NOT NULL THEN company_name || ': ' || lname || ', ' || fname
            WHEN company_name IS NOT NULL AND fname IS NULL AND lname IS NOT NULL THEN company_name || ': ' || lname 
            WHEN company_name IS NOT NULL AND fname IS NULL AND lname IS  NULL THEN company_name END as name
FROM person;

     name     
--------------
 Roberts, Jon
 ABC Company
 Smith
(3 rows)

It would be much easier and more consistent to add this logic to a function so it is centralized in a single place and the SQL is a bit easier to write. This is when the “SQL” language is preferred. Here is an example; notice that parameters are referred to as $1, $2, and $3 rather than by name, because the SQL language does not support named parameters.

CREATE OR REPLACE function fn_get_name(p_fname text, p_lname text, p_company_name text) RETURNS text AS
$$
SELECT CASE WHEN $3 IS NULL AND $1 IS NOT NULL AND $2 IS NOT NULL THEN $2 || ', ' || $1
            WHEN $1 IS NULL AND $2 IS NOT NULL THEN $2
            WHEN $3 IS NOT NULL AND $1 IS NOT NULL AND $2 IS NOT NULL THEN $3 || ': ' || $2 || ', ' || $1
            WHEN $3 IS NOT NULL AND $1 IS NULL AND $2 IS NOT NULL THEN $3 || ': ' || $2 
            WHEN $3 IS NOT NULL AND $1 IS NULL AND $2 IS  NULL THEN $3 END as name
$$
LANGUAGE sql;

And now using this function.

SELECT fn_get_name(fname, lname, company_name) AS name FROM person;
     name     
--------------
 Roberts, Jon
 ABC Company
 Smith
(3 rows)

Transformation
The next use case is when transforming data, and PL/pgSQL does a great job at this. You can reference parameters by name, declare variables, handle errors, etc. The code looks a lot like Oracle PL/SQL too.

CREATE OR REPLACE FUNCTION fn_update_person() RETURNS void AS
$$
DECLARE
    v_rowcount int;
BEGIN
    UPDATE person
    SET fname = initcap(fname),
        lname = initcap(lname),
        company_name = initcap(company_name);

    GET DIAGNOSTICS v_rowcount = ROW_COUNT;

    RAISE INFO '% Rows Updated.', v_rowcount;

END;
$$
LANGUAGE plpgsql VOLATILE;

Notice how I’m not looping through the data or issuing a commit after so many records. There are two reasons for this. First, Greenplum is designed for Big Data, so manipulating lots of data in a single transaction is ideal. Second, a function in Greenplum executes in a single transaction; you can’t start or commit a separate transaction inside a function because the function call itself is already in one.
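
Running it is just a SELECT; with the three sample rows loaded earlier, the INFO message should report three rows updated (output sketched from psql):

SELECT fn_update_person();
INFO:  3 Rows Updated.
 fn_update_person 
------------------
 
(1 row)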

Summary
I hope this helps. Use the SQL language when transforming data inline in a SELECT, and use the more robust PL/pgSQL language for procedural data transformations.

How Do I Create Dynamic SQL in Greenplum Database?

This is the second in a series of “How do I do X in Greenplum database” and today’s topic is How do I create dynamic SQL.

Method 1: psql
The psql program is a command line utility similar to isql, sqlcmd, or sqlplus. You can use it to dynamically generate SQL statements and then pipe them to another instance of psql, which executes each statement in a separate transaction.

Example:
In this example, I will create a script dynamically that will analyze each table in the retail schema.

psql -A -t -c "SELECT 'ANALYZE ' || table_schema || '.' || table_name || ';' FROM information_schema.tables WHERE table_schema = 'retail'" | psql -a
ANALYZE retail.d_customers;
ANALYZE
ANALYZE retail.d_payment_methods;
ANALYZE
ANALYZE retail.d_products;
ANALYZE
ANALYZE retail.f_sales;
ANALYZE

psql is executed from the shell with -A (unaligned table output mode), -t (print rows only), and -c (command to execute). The dynamic SQL is generated and then piped (|) to a second psql invocation with -a, which echoes each generated command as it executes it.

This is a nice feature when you need to have each statement executed in a separate transaction.

Method 2: PL/pgSQL
The second way of doing this same task is in PL/pgSQL.

CREATE OR REPLACE FUNCTION retail.fn_analyze() RETURNS void AS
$$

DECLARE
    v_function_name text := 'retail.fn_analyze';
    v_location int;
    v_rec record;

BEGIN
    v_location := 1000;
    FOR v_rec IN SELECT 'analyze ' || table_schema || '.' || table_name AS sql
                 FROM information_schema.tables 
                 WHERE table_schema = 'retail' LOOP

        RAISE INFO '%: %', clock_timestamp(), v_rec.sql;        
        EXECUTE v_rec.sql;

    END LOOP;
    
EXCEPTION
    WHEN OTHERS THEN
        RAISE EXCEPTION '(%:%:%)', v_function_name, v_location, sqlerrm;
END;

$$
LANGUAGE plpgsql;

And the output of the execution.

psql -c "select retail.fn_analyze()"
INFO:  2012-09-11 11:21:45.580327-05: analyze retail.d_customers
INFO:  2012-09-11 11:21:46.102631-05: analyze retail.d_payment_methods
INFO:  2012-09-11 11:21:46.225238-05: analyze retail.d_products
INFO:  2012-09-11 11:21:46.556235-05: analyze retail.f_sales
 fn_analyze 
------------
 
(1 row)

Note that all of the work done in a function is in a single transaction. If there is a problem and the function fails, all of the work done up to that point will be rolled back. So in the case of analyzing tables, the psql technique is probably a better choice but for other use cases, the function technique is better.

How Do I Get The Number Of Rows Inserted?

This is the first in a series of “How do I do X in Greenplum database” and today’s first topic is How do I get the number of rows Inserted (or Updated/Deleted).

Knowing the number of rows Inserted, Updated, or Deleted is pretty common in database programming.  You may want to key off of this to determine the next step in a procedure or just capture the number of rows in an ETL process.

Sybase and Microsoft SQL Server use a Global Variable to do this which is @@ROWCOUNT.  It is dynamically populated after each statement execution.  In Oracle, you can use SQL%ROWCOUNT to do the same thing.

In Greenplum and in a PL/pgSQL function, you can get the rowcount by using GET DIAGNOSTICS.  Here is an example.

CREATE OR REPLACE FUNCTION fn_update_person() RETURNS void AS
$$
DECLARE
    v_rowcount int;
BEGIN
    UPDATE person
    SET state_code = UPPER(state_code);

    GET DIAGNOSTICS v_rowcount = ROW_COUNT;

    RAISE INFO '% Rows Updated.', v_rowcount;

END;
$$
language plpgsql;

There you go. Pretty simple to get the ROW_COUNT in Greenplum.

Merge Statements

Merge is a feature in most databases but I have never liked it much.  My source table always had duplicates in it which prevented me from using it.

Here is an example of such with Microsoft SQL Server (BTW, Oracle behaves the same way):

CREATE TABLE stage
(
id int identity not null,
provider_id int NOT NULL,
provider text NOT NULL
);

insert into stage (provider_id, provider) values (1, 'bubba');
insert into stage (provider_id, provider) values (2, 'gump');
insert into stage (provider_id, provider) values (3, 'foo');
insert into stage (provider_id, provider) values (4, 'bar');
insert into stage (provider_id, provider) values (4, 'jon');

CREATE TABLE target
(
provider_id int NOT NULL primary key,
provider text NOT NULL
);

My stage table is where new records get loaded from a file, and my target table is what I’m trying to keep up to date.  So using a MERGE statement, I should be able to insert the new records and, if they exist already, update them.

merge target
using (select * from stage) as stage
on (target.provider_id = stage.provider_id)
when matched then
update set provider = stage.provider
when not matched then
insert (provider_id, provider)
values (stage.provider_id, stage.provider);

That is the syntax for MERGE in SQL Server and when executed, I get this error message:

Msg 2627, Level 14, State 1, Line 1

Violation of PRIMARY KEY constraint 'PK__target__00E213107132C993'. Cannot insert duplicate key in object 'dbo.target'.

The statement has been terminated.

Why?  Well my stage table has two records with the same provider_id.  When this happens, I don’t want to error but instead, I want the last value in the file to be used.  Unfortunately, MERGE doesn’t handle this.  Instead, you have to write code.

In Greenplum, here is how you do a “smart” MERGE that handles real-world data where the stage (source) table has duplicates:

CREATE TABLE stage
(
id serial not null,
provider_id int NOT NULL,
provider text NOT NULL
)
DISTRIBUTED BY (provider_id);

insert into stage (provider_id, provider) values (1, 'bubba');
insert into stage (provider_id, provider) values (2, 'gump');
insert into stage (provider_id, provider) values (3, 'foo');
insert into stage (provider_id, provider) values (4, 'bar');
insert into stage (provider_id, provider) values (4, 'jon');

CREATE TABLE target
(
provider_id int NOT NULL primary key,
provider text NOT NULL
)
DISTRIBUTED BY (provider_id);

Now create the function:

create or replace function fn_merge_target() returns void as
$$
declare
    v_function text := 'fn_merge_target';
    v_location int;

begin
    v_location := 1000;
    --delete the old records
    delete from target t
    using stage s
    where t.provider_id = s.provider_id
    and t.provider <> s.provider;

    v_location := 2000;
    --inserts
    insert into target (provider_id, provider)

    select sub.provider_id, sub.provider
    from (  select s.*,
                   rank() over (partition by s.provider_id order by s.id desc) as rank
            from stage s
            left join target t on s.provider_id = t.provider_id
            where t.provider is null  ) as sub
    where sub.rank = 1;

exception
    when others then
        raise exception '(%:%:%)', v_function, v_location, sqlerrm;
end;
$$
language plpgsql;

Execute the function:

select fn_merge_target();

And now query target:

select * from target order by provider_id;

I have 4 rows and for provider_id 4, I have the last version from the stage table rather than failing the command.
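
Given the sample data above, the target table should look like this, with provider_id 4 picking up ‘jon’ (the stage row with the highest id):

 provider_id | provider 
-------------+----------
           1 | bubba
           2 | gump
           3 | foo
           4 | jon
(4 rows)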

In summary, MERGE sounds nice but I’ve never been able to use it.  My source or stage table always has duplicates, and instead of failing, I want to apply the last version of each primary key record.  I demonstrated an easy-to-use way to provide a much smarter MERGE that is optimized for Greenplum and uses one of the built-in analytical functions (rank()).

PL/pgSQL Functions

Overview

Greenplum leverages PostgreSQL internals, which brings great benefits in security, Multi-Version Concurrency Control, and procedural languages.  Greenplum supports a wide variety of languages, but my favorite by far has to be PL/pgSQL.

PL/pgSQL feels a lot like Oracle PL/SQL.  It has the same “Ada” look and feel which makes it easy for an Oracle developer to start writing PL/pgSQL.

Since Greenplum supports the same language that PostgreSQL does, you can find excellent information online about different techniques for writing quality PL/pgSQL functions.  For example, here is good documentation on how to port from Oracle PL/SQL to PL/pgSQL => http://www.postgresql.org/docs/8.2/interactive/plpgsql-porting.html

Writing Functions

I follow the same template for each PL/pgSQL function I write, and below are the conventions I always follow.

Naming Conventions

  • Functions – prefix with fn_ to denote it is a function
  • Table names – all lower case with underscores.  The data dictionary stores objects in all lower case so it is easier to read table names with underscores.
  • Variables – prefix with v_
  • Parameters – prefix with p_

Error Handling

  • Function Name – I add the function name to each function I write so that I can include it in the exception message.  This is very helpful when you have functions executing other functions and you need to know which function really failed.
  • Location – I add a location variable to functions and update it frequently to denote where I am in the code.  I use the location value to know where the error occurred.
  • Exception Block – I raise the exception with:
  • RAISE EXCEPTION '(%:%:%)', v_function_name, v_location, sqlerrm;

The parentheses surround the error message from this function.  If the function executes another function and the second function is the one that failed, the nested parentheses help indicate which function failed, the location of the error, and the error message that occurred.
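
For example (the function names and locations are hypothetical), if fn_inner fails at location 3000 and was called by fn_outer at location 2000, the error bubbles up as:

(fn_outer:2000:(fn_inner:3000:division by zero))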

Template


CREATE OR REPLACE FUNCTION fn_name() RETURNS void AS
$$

DECLARE
    v_function_name text := 'fn_name';
    v_location int;

BEGIN
    v_location := 1000;

EXCEPTION
    WHEN OTHERS THEN
        RAISE EXCEPTION '(%:%:%)', v_function_name, v_location, sqlerrm;
END;

$$
LANGUAGE plpgsql;

As you develop your code, increment the v_location variable so that if you do get an error, it is raised with the approximate location in your code.

More examples of my PL/pgSQL code can be found in the Outsourcer code.  I have about 8 functions that you can reference for examples of writing PL/pgSQL functions.