Quantcast
Channel: Percona Database Performance Blog
Viewing all articles
Browse latest Browse all 1786

New Features in PostgreSQL 14: Bulk Inserts for Foreign Data Wrappers

$
0
0
PostgreSQL 14 Bulk Inserts for Foreign Data Wrappers

Foreign Data Wrapper based on SQL-MED is one the coolest features of PostgreSQL. The feature set of foreign data wrapper is expanding since version 9.1. We know that the PostgreSQL 14 beta is out and GA will be available shortly, therefore it is helpful to study the upcoming features of PostgreSQL 14. There are a lot of them, along with some improvements in foreign data wrapper. A new performance feature, “Bulk Insert“, is added in PostgreSQL 14. The API is extended and allows bulk insert of the data into the foreign table, therefore, using that API, any foreign data wrapper now can implement Bulk Insert. It is definitely more efficient than inserting individual rows.

The API contains two new functions, which can be used to implement the bulk insert.

There is no need to explain these functions here because it is useful for people interested in having that functionality into their foreign data wrapper like mysql_fdw, mongo_fdw, and oracle_fdw. If someone is interested to see that, they can see it in the PostgreSQL documentation. But the good news is, postgres_fdw already implement that and have that in PostgreSQL 14.

There is a new server option is added which is batch_size, and you can specify that when creating the foreign server or creating a foreign table.

  • Create a postgres_fdw extension

CREATE EXTENSION postgres_fdw;

  • Create a foreign server without batch_size

CREATE SERVER postgres_svr 
       FOREIGN DATA WRAPPER postgres_fdw 
       OPTIONS (host '127.0.0.1');

CREATE USER MAPPING FOR vagrant
       SERVER postgres_svr
       OPTIONS (user 'postgres', password 'pass');

CREATE FOREIGN TABLE foo_remote (a INTEGER,
                                 b CHAR,
                                 c TEXT,
                                 d VARCHAR(255))
       SERVER postgres_svr
       OPTIONS(table_name 'foo_local');

EXPLAIN (VERBOSE, COSTS OFF) insert into foo_remote values (generate_series(1, 1), 'c', 'text', 'varchar');
                                                QUERY PLAN                                                 
-----------------------------------------------------------------------------------------------------------
 Insert on public.foo_remote
   Remote SQL: INSERT INTO public.foo_local(a, b, c, d) VALUES ($1, $2, $3, $4)
   Batch Size: 1
   ->  ProjectSet
         Output: generate_series(1, 1), 'c'::character(1), 'text'::text, 'varchar'::character varying(255)
         ->  Result
(6 rows)

  • Execution time with batch_size not specified

EXPLAIN ANALYZE 
        INSERT INTO foo_remote
        VALUES (generate_series(1, 100000000),
                'c',
                'text',
                'varchar');
                                                       QUERY PLAN                                                        
-------------------------------------------------------------------------------------------------------------------------
Insert on foo_remote  (cost=0.00..500000.02 rows=0 width=0) (actual time=4591443.250..4591443.250 rows=0 loops=1)
   ->  ProjectSet  (cost=0.00..500000.02 rows=100000000 width=560) (actual time=0.006..31749.132 rows=100000000 loops=1)
         ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.002..0.002 rows=1 loops=1)
Planning Time: 4.988 ms
Execution Time: 4591447.101 ms -- timing is important
(5 rows)

  • Create a foreign table with batch_size = 10, in case no batch_size is specified with server creation

CREATE FOREIGN TABLE foo_remote (a INTEGER,
                                 b CHAR,
                                 c TEXT,
                                 d VARCHAR(255))
        SERVER postgres_svr OPTIONS(table_name 'foo_local', batch_size '10');

  • Create a foreign server with batch_size = 10, now every table of that server will use the batch_size 10

CREATE SERVER postgres_svr_bulk
       FOREIGN DATA WRAPPER postgres_fdw
       OPTIONS (host '127.0.0.1', batch_size = '10'); -- new option batch_size

CREATE USER MAPPING FOR vagrant
       SERVER postgres_svr
       OPTIONS (user 'postgres', password 'pass');

CREATE FOREIGN TABLE foo_remote (a INTEGER,
                                 b CHAR,
                                 c TEXT,
                                 d VARCHAR(255))
        SERVER postgres_svr OPTIONS(table_name 'foo_local');

EXPLAIN (VERBOSE, COSTS OFF) insert into foo_remote_bulk values (generate_series(1, 1), 'c', 'text', 'varchar');

                                                QUERY PLAN                                                 

-----------------------------------------------------------------------------------------------------------
Insert on public.foo_remote_bulk
   Remote SQL: INSERT INTO public.foo_local_bulk(a, b, c, d) VALUES ($1, $2, $3, $4)
   Batch Size: 10
   ->  ProjectSet
         Output: generate_series(1, 1), 'c'::character(1), 'text'::text, 'varchar'::character varying(255)
         ->  Result
(6 rows)

  • Execution time with batch_size = 10:

EXPLAIN ANALYZE
        INSERT INTO foo_remote_bulk
        VALUES (generate_series(1, 100000000),
                'c',
                'text',
                'varchar');
                                                       QUERY PLAN                                                        
-------------------------------------------------------------------------------------------------------------------------
 Insert on foo_remote_bulk  (cost=0.00..500000.02 rows=0 width=0) (actual time=822224.678..822224.678 rows=0 loops=1)
   ->  ProjectSet  (cost=0.00..500000.02 rows=100000000 width=560) (actual time=0.005..10543.845 rows=100000000 loops=1)
         ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.002 rows=1 loops=1)
 Planning Time: 0.250 ms
 Execution Time: 822239.178 ms -- timing is important
(5 rows)

Conclusion

PostgreSQL is expanding the feature list of foreign data wrappers, and Bulk Insert is another good addition. As this feature is added to the core, I hope all other foreign data wrappers will implement it as well.


Viewing all articles
Browse latest Browse all 1786

Trending Articles