说明
本文延续上一篇文章 云数据库MySQL导入云数据仓库PostgreSQL最佳实践,继续介绍云数据库MySQL导入云数据仓库PostgreSQL的使用问题。其中描述的问题及解决方法同样适用于 腾讯云 云数据仓库 PostgreSQL(CDWPG)。
背景
在上一个实验过程中我们发现,DTS数据同步硬性要求:”schema 和 table 必须提前创建好,如果没有创建好,则会报错”。
但在实际数据同步的场景中,大多的需求是迁移。这种情况下,如果表的数量很多的话,那数据同步的成本是非常大的,因为目标端需要提前构建出全部的表结构。这个时候我们可以视情况选择使用数据同步开源工具 (py-mysql2pgsql)。
安装依赖
在安装py-mysql2pgsql之前,需要安装一些程序的依赖,否则会遇到各种问题。
[root@VM-5-48-centos ~]# yum install python-pip python-devel mysql-devel postgresql-devel -y | |
Loaded plugins: fastestmirror, langpacks | |
Loading mirror speeds from cached hostfile | |
Package python2-pip-8.1.2-14.el7.noarch already installed and latest version | |
Resolving Dependencies | |
--> Running transaction check | |
---> Package mariadb-devel.x86_64 1:5.5.68-1.el7 will be installed | |
--> Processing Dependency: openssl-devel(x86-64) for package: 1:mariadb-devel-5.5.68-1.el7.x86_64 | |
---> Package postgresql-devel.x86_64 0:9.2.24-4.el7_8 will be installed | |
--> Processing Dependency: postgresql-libs(x86-64) = 9.2.24-4.el7_8 for package: postgresql-devel-9.2.24-4.el7_8.x86_64 | |
--> Processing Dependency: postgresql(x86-64) = 9.2.24-4.el7_8 for package: postgresql-devel-9.2.24-4.el7_8.x86_64 | |
--> Processing Dependency: libpq.so.5()(64bit) for package: postgresql-devel-9.2.24-4.el7_8.x86_64 | |
--> Processing Dependency: libpgtypes.so.3()(64bit) for package: postgresql-devel-9.2.24-4.el7_8.x86_64 | |
--> Processing Dependency: libecpg_compat.so.3()(64bit) for package: postgresql-devel-9.2.24-4.el7_8.x86_64 | |
--> Processing Dependency: libecpg.so.6()(64bit) for package: postgresql-devel-9.2.24-4.el7_8.x86_64 | |
---> Package python-devel.x86_64 0:2.7.5-90.el7 will be installed | |
--> Processing Dependency: python(x86-64) = 2.7.5-90.el7 for package: python-devel-2.7.5-90.el7.x86_64 | |
--> Processing Dependency: python2-rpm-macros > 3-30 for package: python-devel-2.7.5-90.el7.x86_64 | |
--> Processing Dependency: python-rpm-macros > 3-30 for package: python-devel-2.7.5-90.el7.x86_64 | |
--> Running transaction check | |
---> Package openssl-devel.x86_64 1:1.0.2k-21.el7_9 will be installed | |
--> Processing Dependency: openssl-libs(x86-64) = 1:1.0.2k-21.el7_9 for package: 1:openssl-devel-1.0.2k-21.el7_9.x86_64 | |
--> Processing Dependency: zlib-devel(x86-64) for package: 1:openssl-devel-1.0.2k-21.el7_9.x86_64 | |
--> Processing Dependency: krb5-devel(x86-64) for package: 1:openssl-devel-1.0.2k-21.el7_9.x86_64 | |
---> Package postgresql.x86_64 0:9.2.24-4.el7_8 will be installed | |
---> Package postgresql-libs.x86_64 0:9.2.24-4.el7_8 will be installed | |
---> Package python.x86_64 0:2.7.5-88.el7 will be updated | |
---> Package python.x86_64 0:2.7.5-90.el7 will be an update | |
--> Processing Dependency: python-libs(x86-64) = 2.7.5-90.el7 for package: python-2.7.5-90.el7.x86_64 | |
---> Package python-rpm-macros.noarch 0:3-34.el7 will be installed | |
--> Processing Dependency: python-srpm-macros for package: python-rpm-macros-3-34.el7.noarch | |
---> Package python2-rpm-macros.noarch 0:3-34.el7 will be installed | |
--> Running transaction check | |
---> Package krb5-devel.x86_64 0:1.15.1-50.el7 will be installed | |
--> Processing Dependency: libkadm5(x86-64) = 1.15.1-50.el7 for package: krb5-devel-1.15.1-50.el7.x86_64 | |
--> Processing Dependency: krb5-libs(x86-64) = 1.15.1-50.el7 for package: krb5-devel-1.15.1-50.el7.x86_64 | |
--> Processing Dependency: libverto-devel for package: krb5-devel-1.15.1-50.el7.x86_64 | |
--> Processing Dependency: libselinux-devel for package: krb5-devel-1.15.1-50.el7.x86_64 | |
--> Processing Dependency: libcom_err-devel for package: krb5-devel-1.15.1-50.el7.x86_64 | |
--> Processing Dependency: keyutils-libs-devel for package: krb5-devel-1.15.1-50.el7.x86_64 | |
---> Package openssl-libs.x86_64 1:1.0.2k-19.el7 will be updated | |
--> Processing Dependency: openssl-libs(x86-64) = 1:1.0.2k-19.el7 for package: 1:openssl-1.0.2k-19.el7.x86_64 | |
---> Package openssl-libs.x86_64 1:1.0.2k-21.el7_9 will be an update | |
---> Package python-libs.x86_64 0:2.7.5-88.el7 will be updated | |
---> Package python-libs.x86_64 0:2.7.5-90.el7 will be an update | |
---> Package python-srpm-macros.noarch 0:3-34.el7 will be installed | |
---> Package zlib-devel.x86_64 0:1.2.7-19.el7_9 will be installed | |
--> Processing Dependency: zlib = 1.2.7-19.el7_9 for package: zlib-devel-1.2.7-19.el7_9.x86_64 | |
--> Running transaction check | |
---> Package keyutils-libs-devel.x86_64 0:1.5.8-3.el7 will be installed | |
---> Package krb5-libs.x86_64 0:1.15.1-46.el7 will be updated | |
---> Package krb5-libs.x86_64 0:1.15.1-50.el7 will be an update | |
---> Package libcom_err-devel.x86_64 0:1.42.9-19.el7 will be installed | |
--> Processing Dependency: libcom_err(x86-64) = 1.42.9-19.el7 for package: libcom_err-devel-1.42.9-19.el7.x86_64 | |
---> Package libkadm5.x86_64 0:1.15.1-50.el7 will be installed | |
---> Package libselinux-devel.x86_64 0:2.5-15.el7 will be installed | |
--> Processing Dependency: libsepol-devel(x86-64) >= 2.5-10 for package: libselinux-devel-2.5-15.el7.x86_64 | |
--> Processing Dependency: pkgconfig(libsepol) for package: libselinux-devel-2.5-15.el7.x86_64 | |
--> Processing Dependency: pkgconfig(libpcre) for package: libselinux-devel-2.5-15.el7.x86_64 | |
---> Package libverto-devel.x86_64 0:0.2.5-4.el7 will be installed | |
---> Package openssl.x86_64 1:1.0.2k-19.el7 will be updated | |
---> Package openssl.x86_64 1:1.0.2k-21.el7_9 will be an update | |
---> Package zlib.x86_64 0:1.2.7-18.el7 will be updated | |
---> Package zlib.x86_64 0:1.2.7-19.el7_9 will be an update | |
--> Running transaction check | |
---> Package libcom_err.x86_64 0:1.42.9-17.el7 will be updated | |
--> Processing Dependency: libcom_err(x86-64) = 1.42.9-17.el7 for package: e2fsprogs-libs-1.42.9-17.el7.x86_64 | |
--> Processing Dependency: libcom_err(x86-64) = 1.42.9-17.el7 for package: e2fsprogs-1.42.9-17.el7.x86_64 | |
--> Processing Dependency: libcom_err(x86-64) = 1.42.9-17.el7 for package: libss-1.42.9-17.el7.x86_64 | |
---> Package libcom_err.x86_64 0:1.42.9-19.el7 will be an update | |
---> Package libsepol-devel.x86_64 0:2.5-10.el7 will be installed | |
---> Package pcre-devel.x86_64 0:8.32-17.el7 will be installed | |
--> Running transaction check | |
---> Package e2fsprogs.x86_64 0:1.42.9-17.el7 will be updated | |
---> Package e2fsprogs.x86_64 0:1.42.9-19.el7 will be an update | |
---> Package e2fsprogs-libs.x86_64 0:1.42.9-17.el7 will be updated | |
---> Package e2fsprogs-libs.x86_64 0:1.42.9-19.el7 will be an update | |
---> Package libss.x86_64 0:1.42.9-17.el7 will be updated | |
---> Package libss.x86_64 0:1.42.9-19.el7 will be an update | |
--> Finished Dependency Resolution | |
Dependencies Resolved | |
=============================================================================================================================================================================================== | |
Package Arch Version Repository Size | |
=============================================================================================================================================================================================== | |
Installing: | |
mariadb-devel x86_64 1:5.5.68-1.el7 os 757 k | |
postgresql-devel x86_64 9.2.24-4.el7_8 os 952 k | |
python-devel x86_64 2.7.5-90.el7 updates 399 k | |
Installing for dependencies: | |
keyutils-libs-devel x86_64 1.5.8-3.el7 os 37 k | |
krb5-devel x86_64 1.15.1-50.el7 os 273 k | |
libcom_err-devel x86_64 1.42.9-19.el7 os 32 k | |
libkadm5 x86_64 1.15.1-50.el7 os 179 k | |
libselinux-devel x86_64 2.5-15.el7 os 187 k | |
libsepol-devel x86_64 2.5-10.el7 os 77 k | |
libverto-devel x86_64 0.2.5-4.el7 os 12 k | |
openssl-devel x86_64 1:1.0.2k-21.el7_9 updates 1.5 M | |
pcre-devel x86_64 8.32-17.el7 os 480 k | |
postgresql x86_64 9.2.24-4.el7_8 os 3.0 M | |
postgresql-libs x86_64 9.2.24-4.el7_8 os 234 k | |
python-rpm-macros noarch 3-34.el7 os 9.1 k | |
python-srpm-macros noarch 3-34.el7 os 8.8 k | |
python2-rpm-macros noarch 3-34.el7 os 8.1 k | |
zlib-devel x86_64 1.2.7-19.el7_9 updates 50 k | |
Updating for dependencies: | |
e2fsprogs x86_64 1.42.9-19.el7 os 701 k | |
e2fsprogs-libs x86_64 1.42.9-19.el7 os 168 k | |
krb5-libs x86_64 1.15.1-50.el7 os 809 k | |
libcom_err x86_64 1.42.9-19.el7 os 42 k | |
libss x86_64 1.42.9-19.el7 os 47 k | |
openssl x86_64 1:1.0.2k-21.el7_9 updates 493 k | |
openssl-libs x86_64 1:1.0.2k-21.el7_9 updates 1.2 M | |
python x86_64 2.7.5-90.el7 updates 96 k | |
python-libs x86_64 2.7.5-90.el7 updates 5.6 M | |
zlib x86_64 1.2.7-19.el7_9 updates 90 k | |
Installed: | |
mariadb-devel.x86_64 1:5.5.68-1.el7 postgresql-devel.x86_64 0:9.2.24-4.el7_8 python-devel.x86_64 0:2.7.5-90.el7 | |
Dependency Installed: | |
keyutils-libs-devel.x86_64 0:1.5.8-3.el7 krb5-devel.x86_64 0:1.15.1-50.el7 libcom_err-devel.x86_64 0:1.42.9-19.el7 libkadm5.x86_64 0:1.15.1-50.el7 | |
libselinux-devel.x86_64 0:2.5-15.el7 libsepol-devel.x86_64 0:2.5-10.el7 libverto-devel.x86_64 0:0.2.5-4.el7 openssl-devel.x86_64 1:1.0.2k-21.el7_9 | |
pcre-devel.x86_64 0:8.32-17.el7 postgresql.x86_64 0:9.2.24-4.el7_8 postgresql-libs.x86_64 0:9.2.24-4.el7_8 python-rpm-macros.noarch 0:3-34.el7 | |
python-srpm-macros.noarch 0:3-34.el7 python2-rpm-macros.noarch 0:3-34.el7 zlib-devel.x86_64 0:1.2.7-19.el7_9 | |
Dependency Updated: | |
e2fsprogs.x86_64 0:1.42.9-19.el7 e2fsprogs-libs.x86_64 0:1.42.9-19.el7 krb5-libs.x86_64 0:1.15.1-50.el7 libcom_err.x86_64 0:1.42.9-19.el7 libss.x86_64 0:1.42.9-19.el7 | |
openssl.x86_64 1:1.0.2k-21.el7_9 openssl-libs.x86_64 1:1.0.2k-21.el7_9 python.x86_64 0:2.7.5-90.el7 python-libs.x86_64 0:2.7.5-90.el7 zlib.x86_64 0:1.2.7-19.el7_9 | |
Complete! |
安装py-mysql2pgsql
这里直接使用pip进行安装即可。
[root@VM-5-48-centos ~]# pip install py-mysql2pgsql | |
Collecting py-mysql2pgsql | |
Downloading http://mirrors.tencentyun.com/pypi/packages/28/49/c82e4313c63b26224950bef47ab19f52a921dd4c279abf31aa3ee9d9cbbe/py-mysql2pgsql-0.1.5.tar.gz (109kB) | |
100% |████████████████████████████████| 112kB 807kB/s | |
Collecting mysql-python>=1.2.3 (from py-mysql2pgsql) | |
Downloading http://mirrors.tencentyun.com/pypi/packages/a5/e9/51b544da85a36a68debe7a7091f068d802fc515a3a202652828c73453cad/MySQL-python-1.2.5.zip (108kB) | |
100% |████████████████████████████████| 112kB 1.6MB/s | |
Collecting psycopg2>=2.4.2 (from py-mysql2pgsql) | |
Downloading http://mirrors.tencentyun.com/pypi/packages/fd/ae/98cb7a0cbb1d748ee547b058b14604bd0e9bf285a8e0cc5d148f8a8a952e/psycopg2-2.8.6.tar.gz (383kB) | |
100% |████████████████████████████████| 389kB 3.2MB/s | |
Requirement already satisfied (use --upgrade to upgrade): pyyaml>=3.10.0 in /usr/lib64/python2.7/site-packages (from py-mysql2pgsql) | |
Collecting argparse (from py-mysql2pgsql) | |
Downloading http://mirrors.tencentyun.com/pypi/packages/f2/94/3af39d34be01a24a6e65433d19e107099374224905f1e0cc6bbe1fd22a2f/argparse-1.4.0-py2.py3-none-any.whl | |
Collecting termcolor>=1.1.0 (from py-mysql2pgsql) | |
Downloading http://mirrors.tencentyun.com/pypi/packages/8a/48/a76be51647d0eb9f10e2a4511bf3ffb8cc1e6b14e9e4fab46173aa79f981/termcolor-1.1.0.tar.gz | |
Installing collected packages: mysql-python, psycopg2, argparse, termcolor, py-mysql2pgsql | |
Running setup.py install for mysql-python ... done | |
Running setup.py install for psycopg2 ... done | |
Running setup.py install for termcolor ... done | |
Running setup.py install for py-mysql2pgsql ... done | |
Successfully installed argparse-1.4.0 mysql-python-1.2.5 psycopg2-2.8.6 py-mysql2pgsql-0.1.5 termcolor-1.1.0 |
配置任务
生成配置文件模板
在全新第一次运行时,会创建一个demo模板mysql2pgsql.yml,然后在模板里按需做一下修改即可,很简单。
[root@VM-5-48-centos ~]# py-mysql2pgsql | |
No configuration file found. | |
A new file has been initialized at: mysql2pgsql.yml | |
Please review the configuration and retry... |
修改配置文件
这里贴一下我的配置文件,需要注意的是socket要注释掉。
[root@VM-5-48-centos ~]# egrep -v "^#" mysql2pgsql.yml | |
mysql: | |
hostname: 10.0.5.25 | |
port: 3306 | |
# socket: /tmp/mysql.sock | |
username: dts_user | |
password: dts_admin | |
database: dts_demo | |
compress: false | |
destination: | |
# if file is given, output goes to file, else postgres | |
file: | |
postgres: | |
hostname: 10.0.5.18 | |
port: 5436 | |
username: dts_user | |
password: dts_admin | |
database: dts_demo | |
supress_data: false | |
supress_ddl: false | |
force_truncate: false |
数据同步
同步数据命令很简单,-v是打印详细过程,-f是指定配置文件。time是Linux系统命令,可以获取后面指令的耗时情况,如果没有记录耗时的需求,可以将其省略。
[root@VM-5-48-centos ~]# time py-mysql2pgsql -v -f mysql2pgsql.yml | |
>>>>>>>>>> STARTING <<<<<<<<<< | |
START CREATING TABLES | |
START - CREATING TABLE user_info | |
FINISH - CREATING TABLE user_info | |
DONE CREATING TABLES | |
START WRITING TABLE DATA | |
START - WRITING DATA TO user_info | |
25073.34 rows/sec [1000000] | |
FINISH - WRITING DATA TO user_info | |
DONE WRITING TABLE DATA | |
START CREATING INDEXES AND CONSTRAINTS | |
START - ADDING INDEXES TO user_info | |
FINISH - ADDING INDEXES TO user_info | |
START - ADDING CONSTRAINTS ON user_info | |
FINISH - ADDING CONSTRAINTS ON user_info | |
DONE CREATING INDEXES AND CONSTRAINTS | |
>>>>>>>>>> FINISHED <<<<<<<<<< | |
real 0m42.928s | |
user 0m39.899s | |
sys 0m0.451s |
100万条数据,同步花了43秒,速度还是挺快的。
[root@VM-5-48-centos ~]# psql -d dts_demo -U dts_user -h 10.0.5.18 -p 5436 | |
Password for user dts_user: | |
psql (9.5.25, server 9.4.24) | |
Type "help" for help. | |
dts_demo=> \d | |
List of relations | |
Schema | Name | Type | Owner | |
----------+------------------+----------+---------- | |
dts_demo | user_info | table | dts_user | |
dts_demo | user_info_id_seq | sequence | dts_user | |
(2 rows) | |
dts_demo=> SELECT COUNT(1) FROM user_info; | |
count | |
--------- | |
1000000 | |
(1 row) | |
dts_demo=> SELECT * FROM user_info LIMIT 20; | |
id | c_user_id | c_name | c_province_id | c_city_id | create_time | |
----+--------------------------------------+----------------------+---------------+-----------+--------------------- | |
2 | 1afd300e-88bc-11eb-9c30-0c42a125994e | Nj27hTrqAwIQUPiO0qXo | 727 | 95 | 2028-03-19 22:05:05 | |
3 | 1afd4041-88bc-11eb-9c30-0c42a125994e | J9rzo41MCC2dM5Whp4Zy | 482 | 22 | 2026-03-19 22:05:05 | |
4 | 1afd4562-88bc-11eb-9c30-0c42a125994e | RX3eSuFHkqXmNJ8hSoas | 517 | 67 | 2023-03-19 22:05:05 | |
6 | 1afd4ebd-88bc-11eb-9c30-0c42a125994e | ydfrgRm1VlPX8FLFSeo5 | 968 | 3 | 2027-03-19 22:05:05 | |
7 | 1afd530c-88bc-11eb-9c30-0c42a125994e | rsMpwgyPk0TiBXO2AFr3 | 585 | 25 | 2027-03-19 22:05:05 | |
8 | 1afd574a-88bc-11eb-9c30-0c42a125994e | H5aqu0qT4xgB06i1341J | 293 | 73 | 2027-03-19 22:05:05 | |
9 | 1afd5cf9-88bc-11eb-9c30-0c42a125994e | Y10PZgc4AzTDjxyY5ke0 | 31 | 60 | 2025-03-19 22:05:05 | |
10 | 1afd61a8-88bc-11eb-9c30-0c42a125994e | 761DXGqU7GUjHpKns2E0 | 732 | 12 | 2022-03-19 22:05:05 | |
13 | 1afd6f01-88bc-11eb-9c30-0c42a125994e | pNCyKUaVYVyQqowgB3kl | 370 | 31 | 2028-03-19 22:05:05 | |
16 | 1afd7bcf-88bc-11eb-9c30-0c42a125994e | j8zjGigivtHUhwDq2OK9 | 172 | 90 | 2025-03-19 22:05:05 | |
18 | 1afd842c-88bc-11eb-9c30-0c42a125994e | 0DZUqdFwtEGifda3AA4p | 480 | 67 | 2028-03-19 22:05:05 | |
19 | 1afd886b-88bc-11eb-9c30-0c42a125994e | 6SRyZ7v0mCP981zBaSIL | 374 | 5 | 2022-03-19 22:05:05 | |
21 | 1afd913b-88bc-11eb-9c30-0c42a125994e | JHbEzIIg037fKPJ0FbK4 | 730 | 93 | 2027-03-19 22:05:05 | |
22 | 1afd9596-88bc-11eb-9c30-0c42a125994e | FSemWreIG6i3eQm7k7qE | 673 | 87 | 2028-03-19 22:05:05 | |
24 | 1afd9ea8-88bc-11eb-9c30-0c42a125994e | YnMBwEvqqI8mg4oAzZ25 | 960 | 4 | 2026-03-19 22:05:05 | |
27 | 1afdab41-88bc-11eb-9c30-0c42a125994e | z1eUGAFq0zFvl2ZFf2ie | 953 | 53 | 2028-03-19 22:05:05 | |
28 | 1afdafc8-88bc-11eb-9c30-0c42a125994e | ZJvabgB2dPk0TfrhVB7D | 500 | 6 | 2028-03-19 22:05:05 | |
29 | 1afdb407-88bc-11eb-9c30-0c42a125994e | Y4fRnay6I454UaZgaSJS | 848 | 10 | 2028-03-19 22:05:05 | |
32 | 1afdc0bc-88bc-11eb-9c30-0c42a125994e | Lam7pt0r0zFs9dqnoJi6 | 300 | 68 | 2028-03-19 22:05:05 | |
33 | 1afdc4fc-88bc-11eb-9c30-0c42a125994e | HPPx2oG7mid4xiGpRSEu | 190 | 94 | 2028-03-19 22:05:05 | |
(20 rows) |
简单验证了一下,确认同步完成。
小结
不管是使用mysql2pgsql,还是DTS,都各有各的优势,也各有各的不足。根据实际需求,选择更符合业务场景的工具,才是最合适的。
正文完