Example: Aggregate pushdown v2
This example shows aggregate pushdown between the foreign tables of the same remote HIVE/SPARK server to that remote HIVE/SPARK server.
Tables on HIVE/SPARK server:
0: jdbc:hive2://localhost:10000> describe emp; +-----------+------------+----------+--+ | col_name | data_type | comment | +-----------+------------+----------+--+ | empno | int | NULL | | ename | string | NULL | | job | string | NULL | | mgr | int | NULL | | hiredate | date | NULL | | sal | int | NULL | | comm | int | NULL | | deptno | int | NULL | +-----------+------------+----------+--+ 8 rows selected (0.747 seconds) 0: jdbc:hive2://localhost:10000> describe dept; +-----------+------------+----------+--+ | col_name | data_type | comment | +-----------+------------+----------+--+ | deptno | int | NULL | | dname | string | NULL | | loc | string | NULL | +-----------+------------+----------+--+ 3 rows selected (0.067 seconds)
Tables on Postgres server:
-- load extension first time after install CREATE EXTENSION hdfs_fdw; -- create server object CREATE SERVER hdfs_server FOREIGN DATA WRAPPER hdfs_fdw OPTIONS(host 'localhost', port '10000', client_type 'spark', auth_type 'LDAP'); -- create user mapping CREATE USER MAPPING FOR public SERVER hdfs_server OPTIONS (username 'user1', password 'pwd123'); -- create foreign table CREATE FOREIGN TABLE dept ( deptno INTEGER, dname VARCHAR(14), loc VARCHAR(13) ) SERVER hdfs_server OPTIONS (dbname 'fdw_db', table_name 'dept'); CREATE FOREIGN TABLE emp ( empno INTEGER, ename VARCHAR(10), job VARCHAR(9), mgr INTEGER, hiredate DATE, sal INTEGER, comm INTEGER, deptno INTEGER ) SERVER hdfs_server OPTIONS (dbname 'fdw_db', table_name 'emp');
Enable/disable GUC for AGGREGATE pushdown queries at the session level:
SET hdfs_fdw.enable_aggregate_pushdown to true; -- aggregate functions EXPLAIN (VERBOSE, COSTS OFF) SELECT deptno, COUNT(*),SUM(sal),MAX(sal),MIN(sal),AVG(sal) FROM emp GROUP BY deptno HAVING deptno IN (10,20) ORDER BY deptno;
Output
QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------ Sort Output: deptno, (count(*)), (sum(sal)), (max(sal)), (min(sal)), (avg(sal)) Sort Key: emp.deptno -> Foreign Scan Output: deptno, (count(*)), (sum(sal)), (max(sal)), (min(sal)), (avg(sal)) Relations: Aggregate on (k_test.emp) Remote SQL: SELECT , count(*), sum(), max(), min(), avg() FROM . WHERE ( IN (10,20)) GROUP BY (7 rows)
Note
You can also enable/disable aggregate pushdown at the server/table level using a GUC. For more information, see CREATE SERVER and CREATE FOREIGN TABLE.