说明
本文延续上一篇文章 云数据库MySQL导入云数据仓库PostgreSQL最佳实践,继续介绍云数据库MySQL导入云数据仓库PostgreSQL的使用问题。其中描述的问题及解决方法同样适用于 腾讯云 云数据仓库 PostgreSQL(CDWPG)。
背景
在上一个实验过程中我们发现,DTS数据同步硬性要求:”schema 和 table 必须提前创建好,如果没有创建好,则会报错”。
但在实际数据同步的场景中,大多的需求是迁移。这种情况下,如果表的数量很多的话,那数据同步的成本是非常大的,因为目标端需要提前构建出全部的表结构。这个时候我们可以视情况选择使用数据同步开源工具 (py-mysql2pgsql)。
安装依赖
在安装py-mysql2pgsql之前,需要安装一些程序的依赖,否则会遇到各种问题。
[root@VM-5-48-centos ~]# yum install python-pip python-devel mysql-devel postgresql-devel -y
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
Package python2-pip-8.1.2-14.el7.noarch already installed and latest version
Resolving Dependencies
--> Running transaction check
---> Package mariadb-devel.x86_64 1:5.5.68-1.el7 will be installed
--> Processing Dependency: openssl-devel(x86-64) for package: 1:mariadb-devel-5.5.68-1.el7.x86_64
---> Package postgresql-devel.x86_64 0:9.2.24-4.el7_8 will be installed
--> Processing Dependency: postgresql-libs(x86-64) = 9.2.24-4.el7_8 for package: postgresql-devel-9.2.24-4.el7_8.x86_64
--> Processing Dependency: postgresql(x86-64) = 9.2.24-4.el7_8 for package: postgresql-devel-9.2.24-4.el7_8.x86_64
--> Processing Dependency: libpq.so.5()(64bit) for package: postgresql-devel-9.2.24-4.el7_8.x86_64
--> Processing Dependency: libpgtypes.so.3()(64bit) for package: postgresql-devel-9.2.24-4.el7_8.x86_64
--> Processing Dependency: libecpg_compat.so.3()(64bit) for package: postgresql-devel-9.2.24-4.el7_8.x86_64
--> Processing Dependency: libecpg.so.6()(64bit) for package: postgresql-devel-9.2.24-4.el7_8.x86_64
---> Package python-devel.x86_64 0:2.7.5-90.el7 will be installed
--> Processing Dependency: python(x86-64) = 2.7.5-90.el7 for package: python-devel-2.7.5-90.el7.x86_64
--> Processing Dependency: python2-rpm-macros > 3-30 for package: python-devel-2.7.5-90.el7.x86_64
--> Processing Dependency: python-rpm-macros > 3-30 for package: python-devel-2.7.5-90.el7.x86_64
--> Running transaction check
---> Package openssl-devel.x86_64 1:1.0.2k-21.el7_9 will be installed
--> Processing Dependency: openssl-libs(x86-64) = 1:1.0.2k-21.el7_9 for package: 1:openssl-devel-1.0.2k-21.el7_9.x86_64
--> Processing Dependency: zlib-devel(x86-64) for package: 1:openssl-devel-1.0.2k-21.el7_9.x86_64
--> Processing Dependency: krb5-devel(x86-64) for package: 1:openssl-devel-1.0.2k-21.el7_9.x86_64
---> Package postgresql.x86_64 0:9.2.24-4.el7_8 will be installed
---> Package postgresql-libs.x86_64 0:9.2.24-4.el7_8 will be installed
---> Package python.x86_64 0:2.7.5-88.el7 will be updated
---> Package python.x86_64 0:2.7.5-90.el7 will be an update
--> Processing Dependency: python-libs(x86-64) = 2.7.5-90.el7 for package: python-2.7.5-90.el7.x86_64
---> Package python-rpm-macros.noarch 0:3-34.el7 will be installed
--> Processing Dependency: python-srpm-macros for package: python-rpm-macros-3-34.el7.noarch
---> Package python2-rpm-macros.noarch 0:3-34.el7 will be installed
--> Running transaction check
---> Package krb5-devel.x86_64 0:1.15.1-50.el7 will be installed
--> Processing Dependency: libkadm5(x86-64) = 1.15.1-50.el7 for package: krb5-devel-1.15.1-50.el7.x86_64
--> Processing Dependency: krb5-libs(x86-64) = 1.15.1-50.el7 for package: krb5-devel-1.15.1-50.el7.x86_64
--> Processing Dependency: libverto-devel for package: krb5-devel-1.15.1-50.el7.x86_64
--> Processing Dependency: libselinux-devel for package: krb5-devel-1.15.1-50.el7.x86_64
--> Processing Dependency: libcom_err-devel for package: krb5-devel-1.15.1-50.el7.x86_64
--> Processing Dependency: keyutils-libs-devel for package: krb5-devel-1.15.1-50.el7.x86_64
---> Package openssl-libs.x86_64 1:1.0.2k-19.el7 will be updated
--> Processing Dependency: openssl-libs(x86-64) = 1:1.0.2k-19.el7 for package: 1:openssl-1.0.2k-19.el7.x86_64
---> Package openssl-libs.x86_64 1:1.0.2k-21.el7_9 will be an update
---> Package python-libs.x86_64 0:2.7.5-88.el7 will be updated
---> Package python-libs.x86_64 0:2.7.5-90.el7 will be an update
---> Package python-srpm-macros.noarch 0:3-34.el7 will be installed
---> Package zlib-devel.x86_64 0:1.2.7-19.el7_9 will be installed
--> Processing Dependency: zlib = 1.2.7-19.el7_9 for package: zlib-devel-1.2.7-19.el7_9.x86_64
--> Running transaction check
---> Package keyutils-libs-devel.x86_64 0:1.5.8-3.el7 will be installed
---> Package krb5-libs.x86_64 0:1.15.1-46.el7 will be updated
---> Package krb5-libs.x86_64 0:1.15.1-50.el7 will be an update
---> Package libcom_err-devel.x86_64 0:1.42.9-19.el7 will be installed
--> Processing Dependency: libcom_err(x86-64) = 1.42.9-19.el7 for package: libcom_err-devel-1.42.9-19.el7.x86_64
---> Package libkadm5.x86_64 0:1.15.1-50.el7 will be installed
---> Package libselinux-devel.x86_64 0:2.5-15.el7 will be installed
--> Processing Dependency: libsepol-devel(x86-64) >= 2.5-10 for package: libselinux-devel-2.5-15.el7.x86_64
--> Processing Dependency: pkgconfig(libsepol) for package: libselinux-devel-2.5-15.el7.x86_64
--> Processing Dependency: pkgconfig(libpcre) for package: libselinux-devel-2.5-15.el7.x86_64
---> Package libverto-devel.x86_64 0:0.2.5-4.el7 will be installed
---> Package openssl.x86_64 1:1.0.2k-19.el7 will be updated
---> Package openssl.x86_64 1:1.0.2k-21.el7_9 will be an update
---> Package zlib.x86_64 0:1.2.7-18.el7 will be updated
---> Package zlib.x86_64 0:1.2.7-19.el7_9 will be an update
--> Running transaction check
---> Package libcom_err.x86_64 0:1.42.9-17.el7 will be updated
--> Processing Dependency: libcom_err(x86-64) = 1.42.9-17.el7 for package: e2fsprogs-libs-1.42.9-17.el7.x86_64
--> Processing Dependency: libcom_err(x86-64) = 1.42.9-17.el7 for package: e2fsprogs-1.42.9-17.el7.x86_64
--> Processing Dependency: libcom_err(x86-64) = 1.42.9-17.el7 for package: libss-1.42.9-17.el7.x86_64
---> Package libcom_err.x86_64 0:1.42.9-19.el7 will be an update
---> Package libsepol-devel.x86_64 0:2.5-10.el7 will be installed
---> Package pcre-devel.x86_64 0:8.32-17.el7 will be installed
--> Running transaction check
---> Package e2fsprogs.x86_64 0:1.42.9-17.el7 will be updated
---> Package e2fsprogs.x86_64 0:1.42.9-19.el7 will be an update
---> Package e2fsprogs-libs.x86_64 0:1.42.9-17.el7 will be updated
---> Package e2fsprogs-libs.x86_64 0:1.42.9-19.el7 will be an update
---> Package libss.x86_64 0:1.42.9-17.el7 will be updated
---> Package libss.x86_64 0:1.42.9-19.el7 will be an update
--> Finished Dependency Resolution
Dependencies Resolved
===============================================================================================================================================================================================
Package Arch Version Repository Size
===============================================================================================================================================================================================
Installing:
mariadb-devel x86_64 1:5.5.68-1.el7 os 757 k
postgresql-devel x86_64 9.2.24-4.el7_8 os 952 k
python-devel x86_64 2.7.5-90.el7 updates 399 k
Installing for dependencies:
keyutils-libs-devel x86_64 1.5.8-3.el7 os 37 k
krb5-devel x86_64 1.15.1-50.el7 os 273 k
libcom_err-devel x86_64 1.42.9-19.el7 os 32 k
libkadm5 x86_64 1.15.1-50.el7 os 179 k
libselinux-devel x86_64 2.5-15.el7 os 187 k
libsepol-devel x86_64 2.5-10.el7 os 77 k
libverto-devel x86_64 0.2.5-4.el7 os 12 k
openssl-devel x86_64 1:1.0.2k-21.el7_9 updates 1.5 M
pcre-devel x86_64 8.32-17.el7 os 480 k
postgresql x86_64 9.2.24-4.el7_8 os 3.0 M
postgresql-libs x86_64 9.2.24-4.el7_8 os 234 k
python-rpm-macros noarch 3-34.el7 os 9.1 k
python-srpm-macros noarch 3-34.el7 os 8.8 k
python2-rpm-macros noarch 3-34.el7 os 8.1 k
zlib-devel x86_64 1.2.7-19.el7_9 updates 50 k
Updating for dependencies:
e2fsprogs x86_64 1.42.9-19.el7 os 701 k
e2fsprogs-libs x86_64 1.42.9-19.el7 os 168 k
krb5-libs x86_64 1.15.1-50.el7 os 809 k
libcom_err x86_64 1.42.9-19.el7 os 42 k
libss x86_64 1.42.9-19.el7 os 47 k
openssl x86_64 1:1.0.2k-21.el7_9 updates 493 k
openssl-libs x86_64 1:1.0.2k-21.el7_9 updates 1.2 M
python x86_64 2.7.5-90.el7 updates 96 k
python-libs x86_64 2.7.5-90.el7 updates 5.6 M
zlib x86_64 1.2.7-19.el7_9 updates 90 k
Installed:
mariadb-devel.x86_64 1:5.5.68-1.el7 postgresql-devel.x86_64 0:9.2.24-4.el7_8 python-devel.x86_64 0:2.7.5-90.el7
Dependency Installed:
keyutils-libs-devel.x86_64 0:1.5.8-3.el7 krb5-devel.x86_64 0:1.15.1-50.el7 libcom_err-devel.x86_64 0:1.42.9-19.el7 libkadm5.x86_64 0:1.15.1-50.el7
libselinux-devel.x86_64 0:2.5-15.el7 libsepol-devel.x86_64 0:2.5-10.el7 libverto-devel.x86_64 0:0.2.5-4.el7 openssl-devel.x86_64 1:1.0.2k-21.el7_9
pcre-devel.x86_64 0:8.32-17.el7 postgresql.x86_64 0:9.2.24-4.el7_8 postgresql-libs.x86_64 0:9.2.24-4.el7_8 python-rpm-macros.noarch 0:3-34.el7
python-srpm-macros.noarch 0:3-34.el7 python2-rpm-macros.noarch 0:3-34.el7 zlib-devel.x86_64 0:1.2.7-19.el7_9
Dependency Updated:
e2fsprogs.x86_64 0:1.42.9-19.el7 e2fsprogs-libs.x86_64 0:1.42.9-19.el7 krb5-libs.x86_64 0:1.15.1-50.el7 libcom_err.x86_64 0:1.42.9-19.el7 libss.x86_64 0:1.42.9-19.el7
openssl.x86_64 1:1.0.2k-21.el7_9 openssl-libs.x86_64 1:1.0.2k-21.el7_9 python.x86_64 0:2.7.5-90.el7 python-libs.x86_64 0:2.7.5-90.el7 zlib.x86_64 0:1.2.7-19.el7_9
Complete!
安装py-mysql2pgsql
这里直接使用pip进行安装即可。
[root@VM-5-48-centos ~]# pip install py-mysql2pgsql
Collecting py-mysql2pgsql
Downloading http://mirrors.tencentyun.com/pypi/packages/28/49/c82e4313c63b26224950bef47ab19f52a921dd4c279abf31aa3ee9d9cbbe/py-mysql2pgsql-0.1.5.tar.gz (109kB)
100% |████████████████████████████████| 112kB 807kB/s
Collecting mysql-python>=1.2.3 (from py-mysql2pgsql)
Downloading http://mirrors.tencentyun.com/pypi/packages/a5/e9/51b544da85a36a68debe7a7091f068d802fc515a3a202652828c73453cad/MySQL-python-1.2.5.zip (108kB)
100% |████████████████████████████████| 112kB 1.6MB/s
Collecting psycopg2>=2.4.2 (from py-mysql2pgsql)
Downloading http://mirrors.tencentyun.com/pypi/packages/fd/ae/98cb7a0cbb1d748ee547b058b14604bd0e9bf285a8e0cc5d148f8a8a952e/psycopg2-2.8.6.tar.gz (383kB)
100% |████████████████████████████████| 389kB 3.2MB/s
Requirement already satisfied (use --upgrade to upgrade): pyyaml>=3.10.0 in /usr/lib64/python2.7/site-packages (from py-mysql2pgsql)
Collecting argparse (from py-mysql2pgsql)
Downloading http://mirrors.tencentyun.com/pypi/packages/f2/94/3af39d34be01a24a6e65433d19e107099374224905f1e0cc6bbe1fd22a2f/argparse-1.4.0-py2.py3-none-any.whl
Collecting termcolor>=1.1.0 (from py-mysql2pgsql)
Downloading http://mirrors.tencentyun.com/pypi/packages/8a/48/a76be51647d0eb9f10e2a4511bf3ffb8cc1e6b14e9e4fab46173aa79f981/termcolor-1.1.0.tar.gz
Installing collected packages: mysql-python, psycopg2, argparse, termcolor, py-mysql2pgsql
Running setup.py install for mysql-python ... done
Running setup.py install for psycopg2 ... done
Running setup.py install for termcolor ... done
Running setup.py install for py-mysql2pgsql ... done
Successfully installed argparse-1.4.0 mysql-python-1.2.5 psycopg2-2.8.6 py-mysql2pgsql-0.1.5 termcolor-1.1.0
配置任务
生成配置文件模板
在全新第一次运行时,会创建一个demo模板mysql2pgsql.yml,然后在模板里按需做一下修改即可,很简单。
[root@VM-5-48-centos ~]# py-mysql2pgsql
No configuration file found.
A new file has been initialized at: mysql2pgsql.yml
Please review the configuration and retry...
修改配置文件
这里贴一下我的配置文件,需要注意的是socket要注释掉。
[root@VM-5-48-centos ~]# egrep -v "^#" mysql2pgsql.yml
mysql:
hostname: 10.0.5.25
port: 3306
# socket: /tmp/mysql.sock
username: dts_user
password: dts_admin
database: dts_demo
compress: false
destination:
# if file is given, output goes to file, else postgres
file:
postgres:
hostname: 10.0.5.18
port: 5436
username: dts_user
password: dts_admin
database: dts_demo
supress_data: false
supress_ddl: false
force_truncate: false
数据同步
同步数据命令很简单,-v是打印详细过程,-f是指定配置文件。time是Linux系统命令,可以获取后面指令的耗时情况,如果没有记录耗时的需求,可以将其省略。
[root@VM-5-48-centos ~]# time py-mysql2pgsql -v -f mysql2pgsql.yml
>>>>>>>>>> STARTING <<<<<<<<<<
START CREATING TABLES
START - CREATING TABLE user_info
FINISH - CREATING TABLE user_info
DONE CREATING TABLES
START WRITING TABLE DATA
START - WRITING DATA TO user_info
25073.34 rows/sec [1000000]
FINISH - WRITING DATA TO user_info
DONE WRITING TABLE DATA
START CREATING INDEXES AND CONSTRAINTS
START - ADDING INDEXES TO user_info
FINISH - ADDING INDEXES TO user_info
START - ADDING CONSTRAINTS ON user_info
FINISH - ADDING CONSTRAINTS ON user_info
DONE CREATING INDEXES AND CONSTRAINTS
>>>>>>>>>> FINISHED <<<<<<<<<<
real 0m42.928s
user 0m39.899s
sys 0m0.451s
100万条数据,同步花了43秒,速度还是挺快的。
[root@VM-5-48-centos ~]# psql -d dts_demo -U dts_user -h 10.0.5.18 -p 5436
Password for user dts_user:
psql (9.5.25, server 9.4.24)
Type "help" for help.
dts_demo=> \d
List of relations
Schema | Name | Type | Owner
----------+------------------+----------+----------
dts_demo | user_info | table | dts_user
dts_demo | user_info_id_seq | sequence | dts_user
(2 rows)
dts_demo=> SELECT COUNT(1) FROM user_info;
count
---------
1000000
(1 row)
dts_demo=> SELECT * FROM user_info LIMIT 20;
id | c_user_id | c_name | c_province_id | c_city_id | create_time
----+--------------------------------------+----------------------+---------------+-----------+---------------------
2 | 1afd300e-88bc-11eb-9c30-0c42a125994e | Nj27hTrqAwIQUPiO0qXo | 727 | 95 | 2028-03-19 22:05:05
3 | 1afd4041-88bc-11eb-9c30-0c42a125994e | J9rzo41MCC2dM5Whp4Zy | 482 | 22 | 2026-03-19 22:05:05
4 | 1afd4562-88bc-11eb-9c30-0c42a125994e | RX3eSuFHkqXmNJ8hSoas | 517 | 67 | 2023-03-19 22:05:05
6 | 1afd4ebd-88bc-11eb-9c30-0c42a125994e | ydfrgRm1VlPX8FLFSeo5 | 968 | 3 | 2027-03-19 22:05:05
7 | 1afd530c-88bc-11eb-9c30-0c42a125994e | rsMpwgyPk0TiBXO2AFr3 | 585 | 25 | 2027-03-19 22:05:05
8 | 1afd574a-88bc-11eb-9c30-0c42a125994e | H5aqu0qT4xgB06i1341J | 293 | 73 | 2027-03-19 22:05:05
9 | 1afd5cf9-88bc-11eb-9c30-0c42a125994e | Y10PZgc4AzTDjxyY5ke0 | 31 | 60 | 2025-03-19 22:05:05
10 | 1afd61a8-88bc-11eb-9c30-0c42a125994e | 761DXGqU7GUjHpKns2E0 | 732 | 12 | 2022-03-19 22:05:05
13 | 1afd6f01-88bc-11eb-9c30-0c42a125994e | pNCyKUaVYVyQqowgB3kl | 370 | 31 | 2028-03-19 22:05:05
16 | 1afd7bcf-88bc-11eb-9c30-0c42a125994e | j8zjGigivtHUhwDq2OK9 | 172 | 90 | 2025-03-19 22:05:05
18 | 1afd842c-88bc-11eb-9c30-0c42a125994e | 0DZUqdFwtEGifda3AA4p | 480 | 67 | 2028-03-19 22:05:05
19 | 1afd886b-88bc-11eb-9c30-0c42a125994e | 6SRyZ7v0mCP981zBaSIL | 374 | 5 | 2022-03-19 22:05:05
21 | 1afd913b-88bc-11eb-9c30-0c42a125994e | JHbEzIIg037fKPJ0FbK4 | 730 | 93 | 2027-03-19 22:05:05
22 | 1afd9596-88bc-11eb-9c30-0c42a125994e | FSemWreIG6i3eQm7k7qE | 673 | 87 | 2028-03-19 22:05:05
24 | 1afd9ea8-88bc-11eb-9c30-0c42a125994e | YnMBwEvqqI8mg4oAzZ25 | 960 | 4 | 2026-03-19 22:05:05
27 | 1afdab41-88bc-11eb-9c30-0c42a125994e | z1eUGAFq0zFvl2ZFf2ie | 953 | 53 | 2028-03-19 22:05:05
28 | 1afdafc8-88bc-11eb-9c30-0c42a125994e | ZJvabgB2dPk0TfrhVB7D | 500 | 6 | 2028-03-19 22:05:05
29 | 1afdb407-88bc-11eb-9c30-0c42a125994e | Y4fRnay6I454UaZgaSJS | 848 | 10 | 2028-03-19 22:05:05
32 | 1afdc0bc-88bc-11eb-9c30-0c42a125994e | Lam7pt0r0zFs9dqnoJi6 | 300 | 68 | 2028-03-19 22:05:05
33 | 1afdc4fc-88bc-11eb-9c30-0c42a125994e | HPPx2oG7mid4xiGpRSEu | 190 | 94 | 2028-03-19 22:05:05
(20 rows)
简单验证了一下,确认同步完成。
小结
不管是使用mysql2pgsql,还是DTS,都各有各的优势,也各有各的不足。根据实际需求,选择更符合业务场景的工具,才是最合适的。
正文完