Category Archives: Greenplum

S3 External Tables

S3 is Amazon’s Simple Storage Service, an inexpensive cloud storage offering that has quickly become popular for cold data and backups. Greenplum now has External Tables that can read and write data in S3, so you can leverage this storage service with Greenplum. Here is how you do it!

Configuration
1. You will need an Amazon account with your Access Key ID and Secret Access Key. If you have the aws CLI installed and configured, just cat ~/.aws/credentials

2. Get your default region. This is in ~/.aws/config

3. You’ll need a bucket in this region, which can be created with the AWS web interface.

4. You’ll need a configuration file. Here is an example (be sure to change the secret and accessid).

[default]
secret = <secret>
accessid = <accessid>
threadnum = 4
chunksize = 67108864
low_speed_limit = 10240
low_speed_time = 60
encryption = true

5. Copy the configuration to every Segment Host in your cluster.

for i in $(cat segment_hosts.txt); do scp s3_demo.conf $i:/home/gpadmin; done

Writing to S3

1. Create the Writable External Table using the s3 protocol, an AWS URL with the correct region, and the configuration file that is found on every Segment Host.

CREATE WRITABLE EXTERNAL TABLE public.demo_write
(id int, fname text, lname text)
LOCATION ('s3://s3-us-east-1.amazonaws.com/pivotalguru/demo config=/home/gpadmin/s3_demo.conf')
FORMAT 'TEXT' (DELIMITER '|' NULL AS '');

2. Execute an INSERT statement:

INSERT INTO demo_write 
SELECT i, 'Jon_' || i, 'Roberts_' || i
FROM generate_series(1,10000) as i;

Note: Each Segment will create a file in your S3 bucket with the prefix given in the LOCATION of the Writable External Table. In this demo, each file is prefixed with “demo”. An example filename is “demo9767abbb3.txt”.

Reading from S3
1. Create the External Table with the same location and configuration as before.

CREATE EXTERNAL TABLE public.demo_read
(id int, fname text, lname text)
LOCATION ('s3://s3-us-east-1.amazonaws.com/pivotalguru/demo config=/home/gpadmin/s3_demo.conf')
FORMAT 'TEXT' (DELIMITER '|' NULL AS '');

2. Select the data.

gpadmin=# select * from demo_read limit 10;
 id  |  fname  |    lname    
-----+---------+-------------
  58 | Jon_58  | Roberts_58
  90 | Jon_90  | Roberts_90
 122 | Jon_122 | Roberts_122
 191 | Jon_191 | Roberts_191
 207 | Jon_207 | Roberts_207
 239 | Jon_239 | Roberts_239
 271 | Jon_271 | Roberts_271
 319 | Jon_319 | Roberts_319
 335 | Jon_335 | Roberts_335
 351 | Jon_351 | Roberts_351
(10 rows)

Time: 1139.538 ms
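
To verify the round trip, a quick count should match the 10,000 rows that were written (assuming the bucket prefix only contains this demo data):

SELECT count(*) FROM demo_read;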

Tips
1. An S3 External Table that references a single file will only use a single Segment to read the data. Instead, try to have at least one file per Segment for an S3 External Table.
2. S3 External Tables support gzip compression only.
3. Use S3 External Tables for cold storage or to create a backup of a table or query using a Writable External Table, as in the sketch below.
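
For example, a minimal sketch of tip 3 that backs up a table to S3 with a Writable External Table (my_table and its columns are hypothetical; reuse the same config file):

CREATE WRITABLE EXTERNAL TABLE public.my_table_backup
(id int, fname text, lname text)
LOCATION ('s3://s3-us-east-1.amazonaws.com/pivotalguru/my_table_backup config=/home/gpadmin/s3_demo.conf')
FORMAT 'TEXT' (DELIMITER '|' NULL AS '');

INSERT INTO my_table_backup SELECT * FROM my_table;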

Greenplum Append Optimized Tables

Greenplum has two major types of table storage techniques. The first is inherited from PostgreSQL, uses Multi-Version Concurrency Control (MVCC), and is referred to as HEAP storage. The second is a more efficient technique called Append Optimized (AO). This blog post discusses the performance benefits of AO over HEAP tables.

HEAP
This is the default storage technique when you create a table in Greenplum. It handles UPDATE and DELETE commands just like PostgreSQL: the old row is marked invalid and, for an UPDATE, a new version of the row is inserted into the table. This hides the deleted or stale row from new queries but provides read consistency to sessions that may have started querying the table prior to the UPDATE or DELETE.

Here is an example of a HEAP table.

CREATE TABLE heap_table
(id int,
 fname text,
 lname text,
 address1 text,
 address2 text,
 city text,
 state text,
 zip text)
DISTRIBUTED BY (id);

But how does this HEAP table store data? How much space does it use?

INSERT INTO heap_table (id, fname, lname, address1, address2, city, state, zip) 
SELECT i, 'Jon_' || i, 'Roberts_' || i, i || ' Main Street', 'Apartment ' || i, 'New York', 'NY', i::text
FROM generate_series(1, 10000) AS i;

ANALYZE heap_table;

SELECT sotdsize
FROM gp_toolkit.gp_size_of_table_disk
WHERE sotdschemaname = 'public'
AND sotdtablename = 'heap_table';

1114112

So it uses 1,114,112 bytes in my small single node test for this generated data of 10,000 records.
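
To see the MVCC behavior described above, here is a hedged sketch: updating every row leaves the old row versions on disk until VACUUM reclaims the space (the exact size you see will vary).

UPDATE heap_table SET city = 'Boston';

-- the table is larger now because the old row versions are still on disk
SELECT sotdsize
FROM gp_toolkit.gp_size_of_table_disk
WHERE sotdschemaname = 'public'
AND sotdtablename = 'heap_table';

-- reclaim the space held by the dead rows
VACUUM heap_table;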

AO
AO storage was originally designed to be Append Only, but starting with Greenplum version 4.3 these tables became Append Optimized because they now allow UPDATE and DELETE like a HEAP table while retaining the efficient storage.

Create the same table and data but as an AO table.

CREATE TABLE ao_table
(id int,
 fname text,
 lname text,
 address1 text,
 address2 text,
 city text,
 state text,
 zip text)
WITH (appendonly=true)
DISTRIBUTED BY (id);


INSERT INTO ao_table (id, fname, lname, address1, address2, city, state, zip) 
SELECT i, 'Jon_' || i, 'Roberts_' || i, i || ' Main Street', 'Apartment ' || i, 'New York', 'NY', i::text
FROM generate_series(1, 10000) AS i;

ANALYZE ao_table;

SELECT sotdsize
FROM gp_toolkit.gp_size_of_table_disk
WHERE sotdschemaname = 'public'
AND sotdtablename = 'ao_table';

973016

This is now only 973,016 bytes, a roughly 12.7% reduction in the disk space used. For large tables, the performance difference will be more noticeable because it will take less time to scan your disks.
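
And because these tables are now Append Optimized rather than Append Only, UPDATE and DELETE work on them as well. A quick sketch, assuming Greenplum 4.3 or later:

UPDATE ao_table SET city = 'Boston' WHERE id = 1;
DELETE FROM ao_table WHERE id = 2;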

AO tables can also be compressed, which HEAP tables cannot. What does that look like?

CREATE TABLE ao_compressed_table
(id int,
 fname text,
 lname text,
 address1 text,
 address2 text,
 city text,
 state text,
 zip text)
WITH (appendonly=true, compresstype=quicklz)
DISTRIBUTED BY (id);


INSERT INTO ao_compressed_table (id, fname, lname, address1, address2, city, state, zip) 
SELECT i, 'Jon_' || i, 'Roberts_' || i, i || ' Main Street', 'Apartment ' || i, 'New York', 'NY', i::text
FROM generate_series(1, 10000) AS i;

ANALYZE ao_compressed_table;

SELECT sotdsize
FROM gp_toolkit.gp_size_of_table_disk
WHERE sotdschemaname = 'public'
AND sotdtablename = 'ao_compressed_table';

234344

It dropped further to 234,344 bytes. This is a huge decrease in the disk space used and measures another 75.9% reduction! This trades CPU cycles for IO scans which, in most cases, will provide better query times.
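
Greenplum can also report the compression ratio directly with get_ao_compression_ratio. A quick check (the exact ratio will vary with your data):

SELECT get_ao_compression_ratio('ao_compressed_table');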

For the last test, make the table column oriented as well. This is also a feature only available for AO tables.

CREATE TABLE ao_co_compressed_table
(id int,
 fname text,
 lname text,
 address1 text,
 address2 text,
 city text,
 state text,
 zip text)
WITH (appendonly=true, compresstype=quicklz, orientation=column)
DISTRIBUTED BY (id);


INSERT INTO ao_co_compressed_table (id, fname, lname, address1, address2, city, state, zip) 
SELECT i, 'Jon_' || i, 'Roberts_' || i, i || ' Main Street', 'Apartment ' || i, 'New York', 'NY', i::text
FROM generate_series(1, 10000) AS i;

ANALYZE ao_co_compressed_table;

SELECT sotdsize
FROM gp_toolkit.gp_size_of_table_disk
WHERE sotdschemaname = 'public'
AND sotdtablename = 'ao_co_compressed_table';

196840

It dropped again! This time to 196,840 bytes or another 16%.

The total percentage difference between a HEAP table and the AO table that is column oriented and compressed is 82.33%!
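
To compare all four tables side by side, a query like this against the same gp_toolkit view works:

SELECT sotdtablename, sotdsize
FROM gp_toolkit.gp_size_of_table_disk
WHERE sotdschemaname = 'public'
AND sotdtablename IN ('heap_table', 'ao_table', 'ao_compressed_table', 'ao_co_compressed_table')
ORDER BY sotdsize DESC;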

Summary
Use AO tables. They are faster than HEAP tables and also let you compress and column orient your larger tables for even better performance.

Outsourcer 5.2.0 and Pivotal HDB 2.0.1

Pivotal HDB 2.0.1 (based on Apache HAWQ)
The newest release of Pivotal HDB is fantastic! It adds new features and resolves some issues as well. Here are the release notes.

JDBC Bug
One resolved issue, a JDBC problem documented in HAWQ-738, affected Outsourcer. Pivotal HDB 2.0.0 was released in May, and I found that Outsourcer suffered from stability issues under load and was generally slower navigating the UI than with HDB 1.3 or Greenplum Database. The issue was quickly resolved that same month, but the fix wasn’t publicly available until October of 2016 with this new release.

Quicklz
Quicklz is a compression library that uses GPL licensing. It is bundled with Greenplum Database and Pivotal HDB 1.3 and 2.0.0. Starting with Pivotal HDB 2.0.1, Quicklz has been removed. Why? Because of that GPL license: HAWQ is an Apache project and can’t ship GPL-licensed code.

In HDB 1.3 and 2.0.0, the guidance was to use Quicklz compression for row oriented tables, but starting with 2.0.1 you should use Snappy compression. It is an easy change too:

CREATE TABLE mytable (id INT, fname TEXT) 
WITH (APPENDONLY=TRUE, COMPRESSTYPE=SNAPPY) 
DISTRIBUTED RANDOMLY;

Note: When upgrading to 2.0.1, be sure to first change your Quicklz-compressed tables to either Zlib, Parquet with Snappy, or no compression at all. After the upgrade, you can change back to row orientation with Snappy compression, along the lines of the sketch below. More details are in this section of the release notes.
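
Here is a minimal sketch of that process for a single table (mytable is hypothetical; see the release notes for the full procedure):

-- before the upgrade: move the data off Quicklz (Zlib shown here)
CREATE TABLE mytable_zlib
WITH (APPENDONLY=TRUE, COMPRESSTYPE=ZLIB)
AS SELECT * FROM mytable
DISTRIBUTED RANDOMLY;

DROP TABLE mytable;
ALTER TABLE mytable_zlib RENAME TO mytable;

-- after upgrading to 2.0.1: recreate as a row oriented table with Snappy
CREATE TABLE mytable_snappy
WITH (APPENDONLY=TRUE, COMPRESSTYPE=SNAPPY)
AS SELECT * FROM mytable
DISTRIBUTED RANDOMLY;

DROP TABLE mytable;
ALTER TABLE mytable_snappy RENAME TO mytable;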

Future Releases
Future releases of Pivotal HDB should come quicker because the licensing hurdle is now complete. This means developers can focus on enhancements and fixes rather than licensing.

Outsourcer 5.2.0
This new release officially supports Pivotal HDB 2.0.1. It changes the compression for row oriented tables to Snappy instead of Quicklz. If you are using Pivotal HDB 2.0.0 or Greenplum Database, Outsourcer will still use Quicklz compression.

Please consider upgrading from Pivotal HDB 2.0.0 to 2.0.1, especially if you are using Outsourcer. When I test Outsourcer with Pivotal HDB 2.0, I use build 22425 rather than build 22126, which is what you can download from Pivotal’s site. Build 22126 has the JDBC bug while 22425 and newer builds do not. And when you upgrade to 2.0.1, also upgrade to Outsourcer 5.2.0.

Download 5.2.0!
Documentation
Source Code

New Project: gpresize

I was working with a customer recently who wanted to try more segments per host than originally configured in their cluster, without adding more hosts. This is possible with gpexpand, but only if mirroring has not been enabled. Disabling mirroring, however, is not a supported feature of Greenplum, though it is possible. So, to facilitate this analysis, I created “gpresize”.

Steps
– Remove mirrors (an unsupported operation in Greenplum)
– Back up the database with gpcrondump
– Expand the cluster using gpexpand
– Shrink the database by reinitializing it and restoring from the backup
– Add mirroring back using gpaddmirrors and gpinitstandby

Please don’t use this in a Production cluster. I’m using unsupported commands to do this work, and it is intended for evaluation purposes only: to help you easily increase the number of segments per host and then revert if needed.

Github: https://github.com/pivotalguru/gpresize

Getting the filename with gpfdist

Occasionally, I see a request to have the filename added to the data read by gpfdist. Here is a way to do it!

First, create a YML file named “transform_config.yml” with the following:

---
VERSION: 1.0.0.1
TRANSFORMATIONS:
  transformation_input:
     TYPE: input 
     CONTENT: data
     COMMAND: /bin/bash transform.sh

Next, create that “transform.sh” file. This is just a simple example that reads all txt files, but you can also pass a parameter to this script; the filename in the External Table LOCATION gets passed to the script.

#!/bin/bash
set -e
for i in $(ls *.txt); do awk '{print FILENAME"|"$0}' $i; done

Create two test files (test_file_1.txt and test_file_2.txt).

cat test_file_1.txt 
1|foo1
2|foo2
3|foo3
4|foo4
5|foo5
6|foo6
7|foo7
8|foo8
9|foo9
10|foo10
cat test_file_2.txt 
11|foo11
12|foo12
13|foo13
14|foo14
15|foo15
16|foo16
17|foo17
18|foo18
19|foo19
20|foo20

Start gpfdist in the background.

gpfdist -p 8999 -c transform_config.yml > mylog 2>&1 < mylog &

Create the External Table, but be sure to change the hostname. Note that “foo” in the LOCATION is the filename. I’m ignoring it in this example, but this is how you can pass a parameter to the script: you add %filename% to the YML file as the parameter to the script.

CREATE EXTERNAL TABLE ext_transform_table 
(filename text, id int, descrption text) 
LOCATION ('gpfdist://gpdbsne:8999/foo#transform=transformation_input') 
FORMAT 'text' (DELIMITER '|');

Now select from the External Table.

select * from ext_transform_table;
    filename     | id | descrption 
-----------------+----+------------
 test_file_1.txt |  1 | foo1
 test_file_1.txt |  2 | foo2
 test_file_1.txt |  3 | foo3
 test_file_1.txt |  4 | foo4
 test_file_1.txt |  5 | foo5
 test_file_1.txt |  6 | foo6
 test_file_1.txt |  7 | foo7
 test_file_1.txt |  8 | foo8
 test_file_1.txt |  9 | foo9
 test_file_1.txt | 10 | foo10
 test_file_2.txt | 11 | foo11
 test_file_2.txt | 12 | foo12
 test_file_2.txt | 13 | foo13
 test_file_2.txt | 14 | foo14
 test_file_2.txt | 15 | foo15
 test_file_2.txt | 16 | foo16
 test_file_2.txt | 17 | foo17
 test_file_2.txt | 18 | foo18
 test_file_2.txt | 19 | foo19
 test_file_2.txt | 20 | foo20
(20 rows)

HAWQ 2.0 Generally Available and Outsourcer 5.1.4

HAWQ, known commercially as Pivotal HDB, just had a major release that I’m really excited about.

Major Features
– Based on Apache HAWQ, with added support for Quicklz table compression plus PL/R, PL/Java, and pgCrypto
– Elastic runtime which means more segments (resources) can be allocated automatically based on the complexity of the query
– YARN integration
– Dynamic sizing of the cluster
– Block level storage which enables maximum parallelism
– Single HDFS directory per table which makes it easier to share and manage data
– Fault tolerance enhancements make it easier and quicker to add or remove data nodes
– HDFS catalog caching
– HCatalog integration which greatly simplifies accessing Hive data
– New management interface with “hawq” commands
– Support for Ambari 2.2.2
– Plugin support for Kerberos
– Better logging for runaway query termination

Product Page
Documentation
Download

Outsourcer 5.1.4
I have also updated Outsourcer to take advantage of HAWQ 2.0/Pivotal HDB 2.0. In HAWQ 2.0/Pivotal HDB 2.0, tables should be distributed randomly in order to take advantage of many of the new features. Starting with version 5.1.4, Outsourcer will make all tables distributed randomly when the database is HAWQ 2.0/Pivotal HDB 2.0. For Greenplum and HAWQ 1.3, tables will still be distributed by the source’s primary key if one is found.
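
As a hypothetical illustration (not actual Outsourcer output), the difference in the generated DDL is the distribution clause:

-- HAWQ 2.0 / Pivotal HDB 2.0
CREATE TABLE customer (id INT, fname TEXT) DISTRIBUTED RANDOMLY;

-- Greenplum and HAWQ 1.3, when the source has a primary key on id
CREATE TABLE customer (id INT, fname TEXT) DISTRIBUTED BY (id);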

Documentation
Download 5.1.4
Source Code