There are multiple options to migrate your database from Oracle® to PostgreSQL. One of the simplest and flexible one is to use an open source tool ora2pg. It is well documented and does its job as expected. You can run it with a verbose output option to get more feedback during migration. At the end of first probe run, I wanted to identify tables, which took the most time for the migration to optimise it with a multiprocess option. Below is the output sample from ora2pg COPY run (DATA_LIMIT=10000):
1 2 3 4 5 6 7 8 9 10 11 |
... DEBUG: Formatting bulk of 10000 data for PostgreSQL. Extracted records from table TABLE_X : total_records = 20000 (avg: 20000 recs/sec) Dumping data from table TABLE_X into PostgreSQL... Setting client_encoding to UTF8... ... Extracted records from table TABLE_Y: total_records = 230000 (avg: 57500 recs/sec) DEBUG: Creating output for 10000 tuples DEBUG: Sending COPY bulk output directly to PostgreSQL backend Dumping data from table TABLE_Y into PostgreSQL... ... |
It is pretty strait-forward to parse useful information out of it with open source command line tools:
1 |
grep -i "Extracted records from table" nohup.out | tr " " ";" > migration.csv && R --file=toptables.r |
The R script toptables.r
to aggregate log data:
1 2 3 4 5 6 |
install.packages("dplyr") library(dplyr) MyData <- read.csv(file="migration.csv", header=FALSE, sep=";") result <- arrange(MyData, V5) %>% group_by(V5) %>% summarise( rec_per_sec = mean(V10), total_recs = max(V8), duration_sec = total_recs/rec_per_sec ) %>% arrange(desc(duration_sec)) data.frame(result) %>% head(5) |
Results output:
1 2 3 4 5 6 |
V5 rec_per_sec total_recs duration_sec 1 TABLE_A: 664.60751 2926834 4403.85335 2 TABLE_B: 339.55172 1153947 3398.44247 3 TABLE_C: 1493.70566 2640658 1767.85699 4 TABLE_D: 15263.52768 8840800 579.21079 5 TABLE_E: 11785.20840 6180887 524.46141 |
The next step was — to parallelise migration of the biggest tables with JOBS and ORACLE_COPIES options on 8 vCPU host:
1 2 3 4 |
time ora2pg -d -t COPY -c ora2pg.conf --exclude "TABLE_A,TABLE_B,TABLE_C" --jobs 8 --copies 1 --parallel 8 time ora2pg -d -t COPY -c ora2pg.conf --allow "TABLE_A" --jobs 8 --copies 8 --parallel 1 time ora2pg -d -t COPY -c ora2pg.conf --allow "TABLE_B" --jobs 8 --copies 8 --parallel 1 time ora2pg -d -t COPY -c ora2pg.conf --allow "TABLE_C" --jobs 8 --copies 8 --parallel 1 |
Voilà! Data migration runs faster now. Have fun and comment about your experience with ora2pg.