Monitoring Sessions

I recently found out that one of my assumptions in Greenplum was completely wrong! After all of these years, I didn’t understand how pg_stat_activity worked!

I always thought that each row in this system table equated to a single session so, if I wanted to get the number of sessions logged in, I could do this:

select count(*) from pg_stat_activity;

But, this is wrong! One row actually means one server process and not one session.

When a session submits a query, pg_stat_activity may report multiple rows for that single session as it execute the query. Why? A single query can spawn multiple backend processes and these backend processes can show up in pg_stat_activity.

Example – Query 22 from TPC-DS
More Information on TPC-DS

select i_product_name
,i_brand
,i_class
,i_category
,avg(inv_quantity_on_hand) qoh
from inventory
,date_dim
,item
where inv_date_sk=d_date_sk
and inv_item_sk=i_item_sk
and d_month_seq between 1212 and 1212 + 11
group by rollup(i_product_name
,i_brand
,i_class
,i_category)
order by qoh, i_product_name, i_brand, i_class, i_category
limit 100;

First, look at pg_stat_activity and see that I have two sessions open. One is executing the simple query from pg_stat_activity (sess_id=1013) and the other is idle (sess_id=1011).

gpadmin=# select procpid, sess_id, substring(current_query, 1, 20) from pg_stat_activity;
 procpid | sess_id |      substring       
---------+---------+----------------------
   31027 |    1013 | select procpid, sess
   30804 |    1011 | <idle>
(2 rows)

Now, execute Query 22 from sess_id=1011.

gpadmin=# select procpid, sess_id, substring(current_query, 1, 20) from pg_stat_activity;
 procpid | sess_id |      substring       
---------+---------+----------------------
   31027 |    1013 | select procpid, sess
   30804 |    1011 | select i_product_nam
   31137 |    1011 | select i_product_nam
(3 rows)

I now see that sess_id=1011 has started a second process to complete the query. It is still a single database session but pg_stat_activity now shows two rows for this session.

Even after the query is over, I can still see the idle process created for sess_id=1011.

gpadmin=# select procpid, sess_id, substring(current_query, 1, 20) from pg_stat_activity;
 procpid | sess_id |      substring       
---------+---------+----------------------
   31027 |    1013 | select procpid, sess
   30804 |    1011 | <idle>
   31137 |    1011 | <idle>
(3 rows)

So how do you get the number of sessions from pg_stat_activity? With this query:

gpadmin=# select count(*) from (
gpadmin(#     select sess_id from pg_stat_activity group by sess_id
gpadmin(# ) as sub;
 count 
-------
     2
(1 row)

One thought on “Monitoring Sessions

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.