Category Archives: Oracle

Outsourcer 5.1.1

5.1.1 enhances Append jobs to use Big Integer in addition to Integer data types. Additionally, you can now use Timestamp data types.

Be sure to always use an ordered sequence in Oracle and an ordered identity in SQL Server when using an Append job. Timestamp is useful when you are using the system timestamp in Oracle or SQL Server to append new data.

Download
Documentation
Source Code

Outsourcer 5.0.9

5.0.9 adds support for HAWQ 2.0.

I’m looking for feedback on how best to handle table distribution for tables created in HAWQ 2.0. Outsourcer automatically sets distribution keys based on the source primary keys so it always uses hash when there is a PK found. HAWQ 2.0 supports hash and random distribution as before but random distribution allows a cluster to be resized without having to redistribute the data.

– Should I keep the code as-is?
– Should I set a global environment variable to allow you to set all tables to be created random or not?
– Should I update nearly every UI screen as well as the job and queue tables to have a random boolean that is only used for HAWQ 2.0?

Download
Documentation
Source Code

New Project: gplink

I created a new project that simplifies the process to create Greenplum or Hawq External Tables using gpfdist to stream data from any valid JDBC source. It is like pointing gpfdist at Sqoop to pull data without landing a file but gplink ensures that the data is cleansed first so that the data will be readable by Greenplum or Hawq.

This will work with PostgreSQL (yeah!), MySQL (yuck), DB2, Informix, etc. You will have to download all third party JDBC drivers separately.

gplink v0.0.2
Source Code

This is a new project so I’m looking forward to any feedback you can provide.