Monthly Archives: February 2012

Killing Idle Sessions

A lot of users will open a connection to a database and leave it open for days.  They forget about these sessions and open more connections, until eventually they hit the max_connections limit and can’t create any new connections to Greenplum.

An easy fix for this is to kill sessions programmatically that have been idle for more than an hour.  Of course, you can adjust this time period as appropriate.

Here is a simple SQL statement to identify the idle sessions:

SELECT procpid
FROM pg_stat_activity
WHERE current_query = '<IDLE>'
AND clock_timestamp() - query_start > interval '1 hour';

One way to kill these idle sessions is with a single command:

eval `psql -A -t -c "SELECT 'kill ' || procpid FROM pg_stat_activity WHERE current_query = '<IDLE>' AND clock_timestamp() - query_start > interval '1 hour'"`

Put that command into a shell script (be sure to source the greenplum_path.sh script first) and create a cron job that runs every 5 minutes.
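If your Greenplum version includes the pg_terminate_backend() function (it exists in newer PostgreSQL releases and recent Greenplum builds — check your version first), a rough alternative is to skip the OS-level kill and do everything in SQL; the function returns true for each backend it signals:

SELECT pg_terminate_backend(procpid)
FROM pg_stat_activity
WHERE current_query = '<IDLE>'
AND clock_timestamp() - query_start > interval '1 hour';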

If you want to track who was killed and at what time, you can use this shell script and, again, a cron job to execute it every 5 minutes.

#!/bin/bash
source /usr/local/greenplum-db/greenplum_path.sh
# find sessions idle for more than an hour; each row comes back as procpid|usename
for i in $( psql -A -t -c "SELECT procpid, usename FROM pg_stat_activity WHERE current_query = '<IDLE>' AND clock_timestamp() - query_start > interval '1 hour'" ); do
        procpid=$(echo $i | cut -d\| -f1)
        usename=$(echo $i | cut -d\| -f2)

        # log who was killed and when, then kill the backend process
        echo `date`\|$procpid\|$usename >> kill.log
        kill $procpid
done

Rolling partitions with Polymorphic Storage

Greenplum’s polymorphic storage is a very useful feature of the Greenplum database: tables can be created and altered with a mixture of columnar, compressed, and row storage.

Polymorphic Storage

Here is an example of creating one of these tables and then managing this over time.

CREATE TABLE edw.sales_fact
(
  store_id integer,
  sales_person_id integer,
  product_id integer,
  sale_date date,
  sale_price numeric,
  product_cost numeric,
  spiff numeric
)
DISTRIBUTED BY (store_id, sales_person_id, product_id, sale_date)
PARTITION BY RANGE(sale_date) 
        (
        START ('2011-01-01'::timestamp) INCLUSIVE END ('2011-02-01'::timestamp) EXCLUSIVE WITH (appendonly=true, orientation=column, compresstype=zlib, compresslevel=5),
        START ('2011-02-01'::timestamp) INCLUSIVE END ('2011-03-01'::timestamp) EXCLUSIVE WITH (appendonly=true, orientation=column, compresstype=quicklz),
        START ('2011-03-01'::timestamp) INCLUSIVE END ('2011-04-01'::timestamp) EXCLUSIVE
        )
;

As you can see, this table has three partitions with each having different storage techniques.

  • January 2011 is columnar and compressed with ZLib
  • February 2011 is columnar and compressed with QuickLZ
  • March 2011 is stored as a row store and is not compressed

As time goes on, the DBA may want to make March 2011 compressed and then add a partition for April 2011. To do this, we first need to move the March 2011 data to a new table and exchange the partition. The last step is to add the new April 2011 partition.

CREATE TABLE edw.sales_fact_march
(
  store_id integer,
  sales_person_id integer,
  product_id integer,
  sale_date date,
  sale_price numeric,
  product_cost numeric,
  spiff numeric
)
WITH (appendonly=true, orientation=column, compresstype=quicklz)
DISTRIBUTED BY (store_id, sales_person_id, product_id, sale_date);

INSERT INTO edw.sales_fact_march 
SELECT * FROM edw.sales_fact 
WHERE sale_date >= '2011-03-01' 
AND sale_date < '2011-04-01';

ANALYZE edw.sales_fact_march;

This next step is optional but you can now TRUNCATE the March 2011 partition because all of the data is in the sales_fact_march table. Plus, we need to exchange the two and it is faster to exchange with a truncated table.

TRUNCATE TABLE edw.sales_fact_1_prt_3;

Note: I got the partition table name from pg_partitions.
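For reference, a query along these lines against the pg_partitions view returns both the underlying partition table name and the rank that is used a little further down:

SELECT partitiontablename, partitionrank
FROM pg_partitions
WHERE schemaname = 'edw'
AND tablename = 'sales_fact'
ORDER BY partitionrank;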

Now exchange my row-stored partition for the columnar, compressed one.

ALTER TABLE edw.sales_fact 
EXCHANGE PARTITION FOR (RANK(3)) 
WITH TABLE edw.sales_fact_march 
WITH VALIDATION;

Note: I got the RANK from pg_partitions.

We no longer need the temporary March 2011 data so we can drop this table.

DROP TABLE edw.sales_fact_march;

Lastly, add an empty April 2011 partition.

ALTER TABLE edw.sales_fact 
ADD PARTITION START ('2011-04-01'::timestamp) INCLUSIVE 
END ('2011-05-01'::timestamp) EXCLUSIVE;

The new DDL for the table looks like this:

CREATE TABLE edw.sales_fact
(
  store_id integer,
  sales_person_id integer,
  product_id integer,
  sale_date date,
  sale_price numeric,
  product_cost numeric,
  spiff numeric
)
WITH (
  OIDS=FALSE
)
DISTRIBUTED BY (store_id, sales_person_id, product_id, sale_date)
PARTITION BY RANGE(sale_date) 
          (
          START ('2011-01-01'::date) END ('2011-02-01'::date) WITH (appendonly=true, orientation=column, compresstype=zlib, compresslevel=5) , 
          START ('2011-02-01'::date) END ('2011-03-01'::date) WITH (appendonly=true, orientation=column, compresstype=quicklz), 
          START ('2011-03-01'::date) END ('2011-04-01'::date) WITH (appendonly=true, orientation=column, compresstype=quicklz), 
          START ('2011-04-01'::date) END ('2011-05-01'::date)
          )
;

Loading data with External Tables and gpfdist

Straight from the Greenplum Admin Guide:

gpfdist is Greenplum’s parallel file distribution program. It is used by readable external tables and gpload to serve external table files to all Greenplum Database segments in parallel. It is used by writable external tables to accept output streams from Greenplum Database segments in parallel and write them out to a file.

In this post, I’m going to demonstrate how to use a readable external table to load data that uses gpfdist. This makes it easy for Greenplum users to load massive amounts of data by just writing SQL!

First, start gpfdist as a background process.

gpfdist -d /Users/gpadmin -p 8080 &

Create my Table in Greenplum.

CREATE TABLE foo
(bar_id int not null,
 bar text,
 bar_description text)
DISTRIBUTED BY (bar_id);

Now create my External Table.

CREATE EXTERNAL TABLE ext_foo
(LIKE foo) LOCATION ('gpfdist://localhost:8080/demo/foo.txt')
FORMAT 'TEXT' (DELIMITER AS '|' NULL AS 'null');
  • I could have spelled out all of the columns, but I used a shortcut with “LIKE foo”
  • The location indicates it is using gpfdist and the host is local with port 8080
  • My file is located at /Users/gpadmin/demo/foo.txt, but gpfdist is serving /Users/gpadmin, so I only need to specify the demo subdirectory and the file name
  • You can pick TEXT or CSV format, but TEXT will execute faster
  • I used pipes as my delimiter and spelled out null for my null values

Insert the data.

INSERT INTO foo SELECT * FROM ext_foo;

You are done!

Other tips:

  • When creating the External Table, you can also specify an error_table (created dynamically if not found) with a reject limit.
LOG ERRORS INTO error_table SEGMENT REJECT LIMIT count ROWS
  • If you have a really large file to load, you can split the file into multiple files and then specify multiple locations in the External Table.  You can even specify different hosts for each file, with multiple gpfdists running at once (both of these tips are shown in the sketch after this list).
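For example, a sketch of an External Table that combines both tips — multiple gpfdist locations plus an error table with a reject limit — might look like this. The etl1/etl2 host names, file names, and err_foo table are made up for illustration (the error table is created dynamically if it doesn’t exist):

CREATE EXTERNAL TABLE ext_foo_multi
(LIKE foo) LOCATION (
  'gpfdist://etl1:8080/demo/foo1.txt',
  'gpfdist://etl2:8080/demo/foo2.txt')
FORMAT 'TEXT' (DELIMITER AS '|' NULL AS 'null')
LOG ERRORS INTO err_foo SEGMENT REJECT LIMIT 100 ROWS;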

And remember, inserting data this way pushes the data directly to the segments.  It bypasses the MASTER server, so it scales much better than other databases’ loading utilities.  Greenplum can load 10TB/hour for a full rack DCA when using gpfdist!

LDAP Security

Security is always divided into two main actions: Authentication and Authorization.  Authentication is proving you are who you say you are, usually by providing a username and a password.  Authorization is the process of approving or denying access to objects, such as SELECT on a table.  Greenplum always handles Authorization internally.  You can’t delegate this to a third party, but you can with Authentication.

Authentication is handled in Greenplum just like it is in PostgreSQL.  Edit your pg_hba.conf file on the MASTER and provide information about how users should be authenticated.  Typically, companies are using an LDAP solution for security and this is also usually the hub in a hub and spoke security model.  Furthermore, Microsoft Active Directory is usually the LDAP solution picked because so many desktops are using Windows.

The security hub has a company’s password rules like failed attempts, password complexity, password expiration, etc. and you can leverage this very easily with Greenplum’s authentication model.  Simply edit your pg_hba.conf file and at the very bottom of the file (it is read sequentially), add this:

host all all 0.0.0.0/0 ldap ldapserver=<ldap server> ldapprefix="<domain>\" ldapsuffix=""

For example:

  • LDAP server = ldap1
  • Windows Domain = myco

So your entry at the bottom of the pg_hba.conf file would look like this:

host all all 0.0.0.0/0 ldap ldapserver=ldap1 ldapprefix="myco\" ldapsuffix=""

During a login attempt, the pg_hba.conf file is read and the first matching criteria is used.  So if you want to use basic password authentication for an ETL user or trust for local users, simply put these entries before the LDAP entry.  Then the last line is the LDAP entry that is the “catch-all” which authenticates users to your Active Directory.
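For example (the gpadmin account, etl_user account, and subnet below are just placeholders), the bottom of your pg_hba.conf might end up looking something like this:

# local admin connections and the ETL account use built-in authentication
local  all  gpadmin                 trust
host   all  etl_user   10.1.1.0/24  md5
# catch-all: everyone else authenticates against Active Directory via LDAP
host   all  all        0.0.0.0/0    ldap ldapserver=ldap1 ldapprefix="myco\" ldapsuffix=""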

How does it work?  It takes the username and password passed in and attempts to connect to the LDAP server with the ldapprefix (domain) specified.  If the user can connect to the LDAP server, the user is authenticated and then connected to Greenplum.  It is that simple.  Once in the database, Greenplum handles Authorization as normal.

Merge Statements

Merge is a feature in most databases but I have never liked it much.  My source table always had duplicates in it which prevented me from using it.

Here is an example of such with Microsoft SQL Server (BTW, Oracle behaves the same way):

CREATE TABLE stage
(
id int identity not null,
provider_id int NOT NULL,
provider text NOT NULL
);

insert into stage (provider_id, provider) values (1, 'bubba');
insert into stage (provider_id, provider) values (2, 'gump');
insert into stage (provider_id, provider) values (3, 'foo');
insert into stage (provider_id, provider) values (4, 'bar');
insert into stage (provider_id, provider) values (4, 'jon');

CREATE TABLE target
(
provider_id int NOT NULL primary key,
provider text NOT NULL
);

My stage table is where new records get loaded from a file and then my target table is what I’m trying to keep up to date.  So using a MERGE statement, I should be able to insert the new records and if they exist already, update it.

merge target
using (select * from stage) as stage
on (target.provider_id = stage.provider_id)
when matched then
update set provider = stage.provider
when not matched then
insert (provider_id, provider)
values (stage.provider_id, stage.provider);

That is the syntax for MERGE in SQL Server and when executed, I get this error message:

Msg 2627, Level 14, State 1, Line 1

Violation of PRIMARY KEY constraint 'PK__target__00E213107132C993'. Cannot insert duplicate key in object 'dbo.target'.

The statement has been terminated.

Why?  Well, my stage table has two records with the same provider_id.  When this happens, I don’t want an error; instead, I want the last value in the file to be used.  Unfortunately, MERGE doesn’t handle this.  Instead, you have to write code.

In Greenplum, here is how you do a “smart” MERGE that handles real-world data where the stage (source) table has duplicates:

CREATE TABLE stage
(
id serial not null,
provider_id int NOT NULL,
provider text NOT NULL
)
DISTRIBUTED BY (provider_id);

insert into stage (provider_id, provider) values (1, 'bubba');
insert into stage (provider_id, provider) values (2, 'gump');
insert into stage (provider_id, provider) values (3, 'foo');
insert into stage (provider_id, provider) values (4, 'bar');
insert into stage (provider_id, provider) values (4, 'jon');

CREATE TABLE target
(
provider_id int NOT NULL primary key,
provider text NOT NULL
)
DISTRIBUTED BY (provider_id);

Now create the function:

create or replace function fn_merge_target() returns void as
$$
declare
    v_function text := 'fn_merge_target';
    v_location int;

begin
    v_location := 1000;
    --delete the old records
    delete from target t
    using stage s
    where t.provider_id = s.provider_id
    and t.provider <> s.provider;

    v_location := 2000;
    --inserts
    insert into target (provider_id, provider)
    select sub.provider_id, sub.provider
    from (
        select s.*,
               rank() over (partition by s.provider_id order by s.id desc) as rank
        from stage s
        left join target t on s.provider_id = t.provider_id
        where t.provider is null
    ) as sub
    where sub.rank = 1;

exception
    when others then
        raise exception '(%:%:%)', v_function, v_location, sqlerrm;
end;
$$
language plpgsql;

Execute the function:

select fn_merge_target();

And now query target:

select * from target order by provider_id;

I have 4 rows and for provider_id 4, I have the last version from the stage table rather than failing the command.
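With the sample data above, the output should look something like this (provider_id 4 picks up ‘jon’ because it has the highest id in the stage table):

 provider_id | provider
-------------+----------
           1 | bubba
           2 | gump
           3 | foo
           4 | jon
(4 rows)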

In summary, MERGE sounds nice but I’ve never been able to use it.  My source or stage table always has duplicates, and instead of failing, I want to apply the last version of each primary key record.  I demonstrated an easy-to-use way to provide a much smarter MERGE that is optimized for Greenplum and uses one of the built-in analytical functions (rank()).

Archiving Data

A customer recently had a pretty common question about archiving old data in Greenplum.  They wanted to purge old data in their OLTP system so that it remains small and fast.  Historical data that isn’t changing, stays in Greenplum.

An easy way to do this is with a view and two tables.  The view would UNION ALL the data between the two tables.  Unfortunately, Greenplum treats UNION ALL much like other databases do: each query separated by the UNION ALL is executed serially.  The optimal way to handle this is to partition the data.
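For reference, the view-based approach described above (and ruled out here) would be a sketch like this, with made-up table and view names:

CREATE VIEW auths_all AS
SELECT * FROM auths_current
UNION ALL
SELECT * FROM auths_archive;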

So let’s say the customer has a schema called edw and a table called auths.

create schema edw;

CREATE TABLE edw.auths
(id int not null,
fname varchar,
lname varchar,
episodedate timestamp)
DISTRIBUTED BY (id);

Insert some sample data.

INSERT INTO edw.auths VALUES (1, 'foo', 'bar', '2001-05-01'::timestamp);
INSERT INTO edw.auths VALUES (2, 'foo', 'bar', '2012-12-01'::timestamp);
INSERT INTO edw.auths VALUES (3, 'too', 'cool', '2010-01-01'::timestamp);
INSERT INTO edw.auths VALUES (4, 'too', 'cool', '2010-01-02'::timestamp);
INSERT INTO edw.auths VALUES (5, 'old', 'data', '2004-01-02'::timestamp);

Next, rename this schema.

ALTER SCHEMA edw RENAME TO edw_archive;

Create the new edw schema.

CREATE SCHEMA edw;

In this case, anything older than 2011-01-01 will be archived and anything newer will be in the current OLTP system.

CREATE TABLE edw.auths
(LIKE edw_archive.auths)
PARTITION BY RANGE (episodedate)
(START ('2000-01-01'::timestamp) INCLUSIVE END ('2010-01-01'::timestamp) EXCLUSIVE WITH (appendonly=true, orientation=column, compresstype=zlib, compresslevel=5),
DEFAULT PARTITION auths_current);

I’m using a pretty high level of compression and column orientation for the archive data.  This will help performance and use less disk space for the archived data.  The default partition has the current data in it and it remains a row store that isn’t compressed.   This is a good example of Polymorphic storage where partitions within the same table are stored in different ways.

Insert the old data into the new table.

INSERT INTO edw.auths SELECT * FROM edw_archive.auths;

Just to show that everything works.

SELECT * FROM edw.auths;

With Outsourcer, you will want to change the job that loads edw.auths so it inserts data into the default partition and doesn’t touch the archive partition.  To do this, get the names of the underlying inherited tables.

SELECT * FROM pg_partitions WHERE schemaname = 'edw' AND tablename = 'auths';

You can see the two tables independently.

--new data

SELECT * FROM edw.auths_1_prt_auths_current;

--archive data

SELECT * FROM edw.auths_1_prt_2;

To change Outsourcer, first remove the old job for edw.auths and insert the new one with the target of edw.auths_1_prt_auths_current.

DELETE FROM os.job WHERE (target).schema_name = 'edw' AND (target).table_name = 'auths';

INSERT INTO os.job(
refresh_type, target, source)
VALUES ('refresh',
('edw', 'auths_1_prt_auths_current'),
('sqlserver', 'jonnywin', null, null, 'edw', 'dbo', 'auths', 'sa', 'sa'));

Now when you execute this job, it will truncate the current partition, insert the data from the OLTP system, and then analyze the partition.  The archive partition will be left untouched.  Querying data across the archived and current data will execute much faster than a UNION ALL because it runs in parallel and it is compressed and column oriented.
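For example, a simple query like this one touches both the archive partition and the current (default) partition, and Greenplum scans them in parallel across the segments:

SELECT date_trunc('year', episodedate) AS yr, count(*)
FROM edw.auths
GROUP BY 1
ORDER BY 1;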

PL/pgSQL Functions

Overview

Greenplum leverages PostgreSQL internals, which brings great benefits in security, Multi-Version Concurrency Control, and languages.  Greenplum supports a wide variety of languages, but my favorite by far has to be PL/pgSQL.

PL/pgSQL feels a lot like Oracle PL/SQL.  It has the same “Ada” look and feel which makes it easy for an Oracle developer to start writing PL/pgSQL.

Since Greenplum supports the same language that PostgreSQL does, you can find excellent information online about different techniques for writing quality PL/pgSQL functions.  For example, here is good documentation on how to port from Oracle PL/SQL to PL/pgSQL => http://www.postgresql.org/docs/8.2/interactive/plpgsql-porting.html

Writing Functions

I follow the same template for each PL/pgSQL function I write, and below are the conventions I always follow.

Naming Conventions

  • Functions – prefix with fn_ to denote it is a function
  • Table names – all lower case with underscores.  The data dictionary stores objects in all lower case, so it is easier to read table names with underscores.
  • Variables – prefix with v_
  • Parameters – prefix with p_

Error Handling

  • Function Name – I add the function name to each function I write so that I can include it in the exception message.  This is very helpful when you have functions executing other functions and you need to know which function really failed.
  • Location – I prefer to add a location variable to functions and update it frequently to denote where I am in the code.  I use the location value to know where the error occurred.
  • Exception Block – I raise the exception with:
RAISE EXCEPTION '(%:%:%)', v_function_name, v_location, sqlerrm;

The parentheses surround the error message from this function.  If the function executes another function and the second function is the one that failed, the parentheses help indicate which function failed, the location of the error, and the error message that occurred.
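For example, if a (hypothetical) fn_outer fails at location 2000 because the fn_inner it called failed at location 1000 with a division by zero, the re-raised message nests like this:

(fn_outer:2000:(fn_inner:1000:division by zero))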

Template

 

CREATE OR REPLACE FUNCTION fn_name() RETURNS void AS
$$

DECLARE
    v_function_name text := 'fn_name';
    v_location int;

BEGIN
    v_location := 1000;

EXCEPTION
    WHEN OTHERS THEN
        RAISE EXCEPTION '(%:%:%)', v_function_name, v_location, sqlerrm;
END;

$$
LANGUAGE plpgsql;

As you develop your code, increment the v_location variable so that if you do get an error, it will be raised with the approximate location in your code.

More examples of my PL/pgSQL code can be found in the Outsourcer code.  I have about 8 functions that you can reference for examples of writing PL/pgSQL functions.

Ugly Data

It seems that everyone has it. Ugly data. Data in transactional systems that isn’t so clean, like control characters or rarely used characters that are used programmatically. This ugly data typically errors out during a load process, requiring someone to cleanse the data by writing scripts and then load it.

This is time consuming and feels like you are re-inventing the wheel over and over again.

Outsourcer does the cleansing for you! As the data is written to Greenplum, control characters are removed and special characters are escaped properly.

Here is a demo of Outsourcer in action loading ugly data.

Oracle Sessions

Just like for SQL Server, you may want to know what is going on in Oracle from inside Greenplum.  Here is an External Web Table that uses Outsourcer to give you that information.

First create a connection for Outsourcer:

INSERT INTO os.ext_connection
(type, server_name, instance_name, port, database_name, user_name, pass)
VALUES
('oracle', 'jonnywin', null, 1521, 'xe', 'os_test', 'os_password');

--get the ID you just inserted
SELECT id 
FROM os.ext_connection 
WHERE type = 'oracle' 
AND server_name = 'jonnywin'
AND port = 1521
AND database_name = 'xe';

--in my example, the value is 3.

Next create the schema in Greenplum for this table:

CREATE SCHEMA oracle;

This will create the External Table in Greenplum for Outsourcer:

--notice the double escaping of the dollar sign.
SELECT os.fn_create_ext_table('oracle.sessions', ARRAY[
'username varchar(30)','osuser varchar(30)','machine varchar(64)',
'terminal varchar(30)','program varchar(48)',
'module varchar(48)','sql_text text','logon_time timestamp',
'service_name varchar(64)'], 
3,
'SELECT s.USERNAME, s.OSUSER, s.MACHINE, s.TERMINAL, s.PROGRAM, s.MODULE,
a.sql_text, s.LOGON_TIME, s.service_name 
FROM v\\\\$session s, v\\\\$sqlarea a 
WHERE a.address = s.sql_address');

Now you can execute the query to get the session information:

select * from oracle.sessions;

More information on External Tables is in the Outsourcer documentation.