最近公司在aws的项目要迁移到我们公司云平台,需要将aws上面的rds(oracle)实例迁移到本地云平台,遇到各种各样的问题,下面总结下遇到的所有问题及解决方法。
我这边用了各种各样的方法进行迁移,最终才成功将数据迁回

方法1(失败):navicat 工具,因对oracle数据库不太熟悉,所以首先选择用工具直接同步,此过程步骤简单,只需要在工具上选择源库和目标库同步即可

image
问题:

  • 用工具导出速度很慢,1秒大概导出1000条左右,且会随着导出时长增加而增加
  • 容易中断,我这边首次导出到60多w条数据时工具卡住,无法继续导出
  • 数据导出不全,通过navicat导出发现大部分表数据不全,甚至有些表数据完全没有导入

基于上面遇到的问题,所以舍弃了该方法

方法2(失败):导出sql文件后,使用sqlplus导入

sql文件可以用工具导出,我这边还是使用navicat导出sql文件,导出后如下图是每个表生成一个sql文件
image

使用sqlplus进入需导入数据的库中进行导入发现部分表报错了,报错原因是中有超过3000个字符长度的字段,而sqlplus不支持超过3000的,需要手动更改sql语句,因表数据量较大,故放弃此方法

image

方法3(失败):导出csv文件,再通过sqlload导入

1、使用工具导出csv文件,注意字段分隔符和行分隔符。
2、创建控制文件

该定义解析和导入CSV文件的数据。控制文件需要指定csv文件的位置、字段分隔符、数据格式等,需要和导出时候指定的相同,下面是我导出某个表的一个控制文件,可以看到字段比较多

[oracle@waixuan-prod ~/pub]$ cat epub_p_publications.ctl
LOAD DATA
INFILE 'EPUB_P_PUBLICATIONS.csv'
BADFILE 'epub_p_publications_data.bad'
DISCARDFILE 'epub_p_publications_data.discard'
INTO TABLE EPUBLISHING.EPUB_P_PUBLICATIONS
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' 
TRAILING NULLCOLS
(
PUB_ID,
PUB_PARENT_ID,
PUB_VOLUME_ID,
PUB_ISSUE_ID,
PUBLISHER_ID,
PUB_TITLE,
PUB_CHINESE_TITLE,
PUB_AUTHOR,
PUB_CODE,
PUB_HAVEMARC,
PUB_MARCPATH,
PUB_PAGE_START,
PUB_PAGE_END,
LIST_PRICE,
LCURR,
SALE_PRICE,
SCURR,
BROWSE_PRECENT,
IN_COLLECTION,
CONTENT_TYPE,
CONTENT_COVER,
CREATE_ON TIMESTAMP "DD/MM/YYYY HH24:MI:SS.FF3",
UPDATE_ON TIMESTAMP "DD/MM/YYYY HH24:MI:SS.FF3",
CONTENT_PDF,
PUB_PATH,
PUB_LANG,
CONTENT_BREED,
CONTENT_SUBTITLE,
CONTENT_MEDIA,
CONTENT_ORIGINALNO,
CONTENT_VOLUMESNUM,
CONTENT_NORMS,
CONTENT_EVOLUTION,
HOMEPAGE,
NEWEST,
SELECTED,
SPECIAL,
STATUS,
OA,
FREE,
PUB_DATE,
PUB_SUBJECT,
PUB_LOCAL,
PUB_WEB_URL,
PUB_CREATE_DATE,
SISBN,
HISBN,
CONTENT_ACTIVITY,
PUB_SUBJECT_EN,
SOURCE_ID,
BUY_TIMES,
PUB_YEAR,
PUB_MONTH,
PUB_DAY,
PUB_VOLUME,
START_VOLUME,
END_VOLUME,
PUB_ISSUE,
PUB_DOI,
PUB_KEYWORD,
PUB_REFERENCE,
SUBSCRIPTION_CODE,
CODE_ISSUED_ABROAD,
POST_JOURNAL_CODE,
CN_CODE,
OLD_JOURNAL_CODE,
JOURNAL_CODE,
FORMERLY,
TITLE_PINYIN,
THEME,
AUDIENCE,
PUB_SIZE,
REVISION,
OTHER_AUTHOR,
CHOICE,
COLLECTION_BOOK_SUBCODE,
VOLUME_BOOK_SUBCODE,
IS_COLLECTION_BOOK,
IS_VOLUME_BOOK,
VICE_TITLE,
CONTENT_WORDS,
CONTENT_EDITOR,
CONTENT_PAGE_NUM,
CONTENT_PLACE,
CONTENT_LANG,
ISSUE_METHOD_ID,
FORMAT_ID,
PUBLISH_METHOD_ID,
CONTENT_VOLUME_BOOK_ID,
CONTENT_COLLECTION_BOOK_ID,
DIC_ID,
JOUR_DIC_ID,
CONTENT_FORMAT,
CONTENT_SPONSORS,
JOURNALSTATUS,
PUB_INITIAL,
PUB_ABSTRACT char(99999),
AUTHOR_PINYIN,
CONTENT_SHUMING,
CONTENT_FENCE,
CONTENT_FENCE_TITLE,
CONTENT_BINGLIE_TITLE,
CONTENT_AUTHOR_JIANJIE,
CONTENT_XINXIYUAN,
CONTENT_GONGHUOSHANG,
CONTENT_MUCI,
CONTENT_CONG_SHU,
CONTENT_BEIZHU,
ISINSOLR,
CONTENT_GROUP,
CONTENT_SUBJECTCN_WX,
CONTENT_SUBJECTEN_WX,
CONTENT_SUBJECT_WX,
CONTENT_STATUS_WX,
SHELVES_STATUS
)
3、运行sqlldr命令

需要指定用户名、密码、控制文件、日志文件、支持最大的错误数量,注意要导入的表不能有数据,必须是空表

[oracle@waixuan-prod ~/pub]$ sqlldr userid=epublishing/xxxxxxx control=epub_p_publications.ctl log=epub_p_publications.log errors=9999999999

执行后如下图开始导入,原本以为很顺利,导入速度也比较快,100万数据大概30+秒左右导入完成
image
导入完成发现导入日志有大量报错信息,信息如下:

Record 13110: Rejected - Error on table EPUBLISHING.EPUB_P_PUBLICATIONS, column PUB_ABSTRACT.
second enclosure string not present

报错显示PUB_ABSTRACT 这个字段导入有问题,查了报错的原因可能如下:
1、查看源表该字段发现该字段是long类型,而我ctl文件中指定的该字段为char(99999)类型,如果该字段有任何记录超过该类型就不会被完全导入
2、sqlload的限制,sqlload处理long类型字段时可能会有限制或不支持,看了下oracle官网建议使用clob数据类型代替long,但是ctl文件中不能指定CLOB类型。
基于上面2个问题,发现数据还是无法完整导入,故放弃。

方法4(成功):使用Oracle Data Pump和数据库连接导入数据

该方法是查看aws官网推荐方式,链接如下:https://docs.aws.amazon.com/zh_cn/AmazonRDS/latest/UserGuide/Oracle.Procedural.Importing.DataPump.html#Oracle.Procedural.Importing.DataPumpDBLink.requirements
其实官网推荐有两种方式

  • 1、使用Oracle DataPumpAmazonS3存储桶导入数据
  • 2、使用Oracle DataPump和数据库连接导入数据

我这边首先试的是第一种方式,该方法步骤如下:

  1. RDS for oracle 目标数据库实例上向数据库用户授予权限
  2. 使用DBMS_DATAPUMP将数据导出到转存文件中
  3. 将转存文件上传到AmazonS3存储桶
  4. 将转存文件从AmazonS3桶下载到目标数据库实例
  5. 使用DBMS_DATAPUMP将转存文件导入到目标数据库实例

该种方法我这边尝试失败,卡在步骤3,无法将转存文件上传到S3桶,因项目时间问题没有过多研究,故选择第二种官方推荐方式,该方法流程如下:

1. 在RDS for oracle目标数据库实例上向用户授予权限
2. 向用户授予源数据库权限
3. 在源库使用DBMS_DATAPUMP创建转存文件
DECLARE
  v_hdnl NUMBER;
BEGIN
  v_hdnl := DBMS_DATAPUMP.OPEN( 
    operation => 'EXPORT' , 
    job_mode  => 'SCHEMA' , 
    job_name  => null
  );
  DBMS_DATAPUMP.ADD_FILE( 
    handle    => v_hdnl, 
    filename  => 'sample.dmp'    , 
    directory => 'DATA_PUMP_DIR' , 
    filetype  => dbms_datapump.ku$_file_type_dump_file
  );
  DBMS_DATAPUMP.ADD_FILE( 
    handle    => v_hdnl           , 
    filename  => 'sample_exp.log' , 
    directory => 'DATA_PUMP_DIR'  , 
    filetype  => dbms_datapump.ku$_file_type_log_file
  );
  DBMS_DATAPUMP.METADATA_FILTER(
    v_hdnl              ,
    'SCHEMA_EXPR'       ,
    'IN (''SCHEMA_1'')'
  );
  DBMS_DATAPUMP.METADATA_FILTER(
    v_hdnl,
    'EXCLUDE_NAME_EXPR',
    q'[IN (SELECT NAME FROM sys.OBJ$ 
           WHERE TYPE# IN (66,67,74,79,59,62,46) 
           AND OWNER# IN 
             (SELECT USER# FROM SYS.USER$ 
              WHERE NAME IN ('RDSADMIN','SYS','SYSTEM','RDS_DATAGUARD','RDSSEC')
             )
          )
    ]',
    'PROCOBJ'
  );
  DBMS_DATAPUMP.START_JOB(v_hdnl);
END;
/

也可以使用expdbdirectory 可以使用原有目录DATA_PUMP_DIR,或自行创建
目录可以通过SELECT directory_name, directory_path FROM dba_directories; 命令查询

expdp  dbman1/xxxxxxx@datacenter.cbxxxxxxxx.us-east-2.rds.amazonaws.com/datacent schemas=mailserver  DIRECTORY=DATA_PUMP_DIR DUMPFILE=mailservers.dmp

导出之后会在该RDS实例的DIRECTORY 目录下生成mailservers.dmp,但我这里无法看到,因为是RDS数据库无法和常规自建数据库一样,连接到服务器上查看,所以也无法使用常规方法将dmp文件下载

4. 在源库中创建目标数据库实例的数据库链接

创建名为tolocal的数据库连接,连接到位于我们公司云上的自建数据库实例

CREATE DATABASE LINK tolocal 
  CONNECT TO <master_user_account> IDENTIFIED BY <password>
  USING '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=<dns or ip address of remote db>)
         (PORT=<listener port>))(CONNECT_DATA=(SID=<remote SID>)))';

5. 使用DMBS_FILE_TRANSFER将导出的转存文件复制到目标数据库实例

在源库使用 DBMS_FILE_TRANSFER 将转储文件从源数据库实例复制到目标数据库实例。以下脚本将名为 sample.dmp 的转储文件从源实例复制到名为 tolocal 的目标数据库链接

  • source_directory_object:源数据库目录对象
  • source_file_name:源数据库dmp文件名
  • destination_directory_object:目标数据库目录对象
  • destination_file_name:目标数据库dmp文件名
  • destination_database:指定上面第4步创建的链接名
BEGIN
  DBMS_FILE_TRANSFER.PUT_FILE(
    source_directory_object       => 'DATA_PUMP_DIR',
    source_file_name              => 'sample.dmp',
    destination_directory_object  => 'DATA_PUMP_DIR',
    destination_file_name         => 'sample_copied.dmp', 
    destination_database          => 'to_rds' );
END;
/ 

该过程主要看数据量,执行后要耐心等待一段时间,执行成功后可以到目标oracle实例所在的服务器的DATA_PUMP_DIR 目录对应的路径查看是否有该destination_file_name文件

[oracle@waixuan-test ~]$  ll /data/oracle/product/19.3.000/db_home/rdbms/log/
total 9506324
-rw-r----- 1 oracle oinstall 3442290688 Mar 12 10:46 CONVERTERALLNEWS.dmp
6. 使用Oracle Data Pump impdp方法将转存文件导入到目标数据库实例

在目标库使用下面的命令将转存文件导入到数据库中

  • DIRECTORY:对应第5步的destination_directory_object
  • DUMPFILE:对应第5步的destination_file_name
  • LOGFILE:指定导入的日志文件名
impdp  converterserver/xxxxxx DIRECTORY=yusys_dbdump_dir DUMPFILE=CONVERTER_ALL_TABLES_all.dmp LOGFILE=import_log.log

原以为这样终于可以导入成功。。。但意外来的总是那么突然
image

ORA-39002: invalid operation
ORA-39405: Oracle Data Pump does not support importing from a source database with TSTZ version 33 into a target database with TSTZ version 32.

好吧,继续,查看报错信息发现是源库的时区和目标库的时区不同导致导入失败,查看源库和目标库时区:

-- 源库:
SELECT version FROM v$timezone_file;
33

-- 目标库:
SELECT version FROM v$timezone_file;
32

可以看到源库33,目标库32,所以需要升级目标库的时区,升级方法如下:
首先需要下载时区补丁,这个补丁不确定是否需要购买相关支持才有权限下载,我这边也是通过渠道获得的相关补丁。
解压补丁,将文件上传到需升级时区的数据库服务器上,然后将其解压到Opatch目录下,这个是我的路径

[oracle@waixuan-test /data/oracle/product/19.3.000/db_home/OPatch]$ pwd 
/data/oracle/product/19.3.000/db_home/OPatch

解压补丁包后有两个目录,将这两个目录移动到Opatch目录下


[oracle@waixuan-test /data/oracle/product/19.3.000/db_home/OPatch/buding/28852325]$ ll
total 16
drwxr-x--- 3 oracle oinstall 4096 Aug  9  2019 etc
drwxr-x--- 3 oracle oinstall 4096 Aug  9  2019 files
-rw-rw-r-- 1 oracle oinstall 5779 Aug  9  2019 README.txt

[oracle@waixuan-test /data/oracle/product/19.3.000/db_home/OPatch]$ ll
total 228
drwxrwxrwx 6 oracle oinstall  4096 Apr 12  2019 auto
drwxr-x--- 3 oracle oinstall  4096 Mar 12 14:49 buding
drwxrwxrwx 2 oracle oinstall  4096 Apr 12  2019 config
-rwxr-x--- 1 oracle oinstall   589 Apr 12  2019 datapatch
drwxrwxrwx 2 oracle oinstall  4096 Apr 12  2019 docs
-rwxr-x--- 1 oracle oinstall 23550 Apr 12  2019 emdpatch.pl
drwxr-x--- 3 oracle oinstall  4096 Aug  9  2019 etc
drwxr-x--- 3 oracle oinstall  4096 Aug  9  2019 files
drwxrwxrwx 2 oracle oinstall  4096 Jan 23 08:47 jlib
drwxrwxrwx 5 oracle oinstall  4096 Aug 17  2018 jre
drwxrwxrwx 9 oracle oinstall  4096 Apr 12  2019 modules
drwxrwxrwx 5 oracle oinstall  4096 Apr 12  2019 ocm
-rwxr-x--- 1 oracle oinstall 48493 Apr 12  2019 opatch
-rwxr-x--- 1 oracle oinstall  1442 Apr 12  2019 opatchauto
-rwxrwxrwx 1 oracle oinstall   393 Apr 12  2019 opatchauto.cmd
-rwxrwxrwx 1 oracle oinstall  4290 Apr 12  2019 opatch_env.sh
-rwxr-x--- 1 oracle oinstall  2551 Apr 12  2019 opatch.pl
drwxrwxrwx 4 oracle oinstall  4096 Apr 12  2019 opatchprereqs
-rwxr-x--- 1 oracle oinstall  3159 Apr 12  2019 operr
-rwxrwxrwx 1 oracle oinstall  3177 Apr 12  2019 operr_readme.txt
drwxrwxrwx 2 oracle oinstall  4096 Apr 12  2019 oplan
drwxrwxrwx 3 oracle oinstall  4096 Apr 12  2019 oracle_common
drwxrwxrwx 3 oracle oinstall  4096 Apr 12  2019 plugins
-rw-rw-r-- 1 oracle oinstall  5779 Aug  9  2019 README.txt
drwxrwxrwx 2 oracle oinstall  4096 Jan 23 08:47 scripts
-rw-r----- 1 oracle oinstall    27 Apr 12  2019 version.txt

在此目录下执行

[oracle@iZ2zehl1ro7yd08bzf0h3rZ OPatch]$ ./opatch apply

image
看到下图的提示就是成功了
image
查看已装补丁情况

opatch lsinventory

然后需要执行补丁sql
upg_tzv_check.sql
upg_tzv_apply.sql

[oracle@iZ2zehl1ro7yd08bzf0h3rZ yusys]$ sqlplus / as sysdba 

SQL*Plus: Release 19.0.0.0.0 - Production on Tue Mar 12 10:18:20 2024
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0

SQL> @upg_tzv_check.sql;
INFO: Starting with RDBMS DST update preparation.
INFO: NO actual RDBMS DST update will be done by this script.
INFO: If an ERROR occurs the script will EXIT sqlplus.
INFO: Doing checks for known issues ...
INFO: Database version is 19.0.0.0 .
INFO: Database RDBMS DST version is DSTv32 .
INFO: No known issues detected.
INFO: Now detecting new RDBMS DST version.
A prepare window has been successfully started.
INFO: Newest RDBMS DST version detected is DSTv33 .
INFO: Next step is checking all TSTZ data.
INFO: It might take a while before any further output is seen ...
A prepare window has been successfully ended.
INFO: A newer RDBMS DST version than the one currently used is found.
INFO: Note that NO DST update was yet done.
INFO: Now run upg_tzv_apply.sql to do the actual RDBMS DST update.
INFO: Note that the upg_tzv_apply.sql script will
INFO: restart the database 2 times WITHOUT any confirmation or prompt.
SQL> @upg_tzv_apply.sql;
INFO: If an ERROR occurs the script will EXIT sqlplus.
INFO: The database RDBMS DST version will be updated to DSTv33 .
WARNING: This script will restart the database 2 times
WARNING: WITHOUT asking ANY confirmation.
WARNING: Hit control-c NOW if this is not intended.
INFO: Restarting the database in UPGRADE mode to start the DST upgrade.
Database closed.
Database dismounted.
ORACLE instance shut down.
ORACLE instance started.

Total System Global Area 9932109768 bytes
Fixed Size                 12445640 bytes
Variable Size            1610612736 bytes
Database Buffers         8287944704 bytes
Redo Buffers               21106688 bytes
Database mounted.
Database opened.
INFO: Starting the RDBMS DST upgrade.
INFO: Upgrading all SYS owned TSTZ data.
INFO: It might take time before any further output is seen ...
An upgrade window has been successfully started.
INFO: Restarting the database in NORMAL mode to upgrade non-SYS TSTZ data.
Database closed.
Database dismounted.
ORACLE instance shut down.
ORACLE instance started.

Total System Global Area 9932109768 bytes
Fixed Size                 12445640 bytes
Variable Size            1610612736 bytes
Database Buffers         8287944704 bytes
Redo Buffers               21106688 bytes
Database mounted.
Database opened.
INFO: Upgrading all non-SYS TSTZ data.
INFO: It might take time before any further output is seen ...
INFO: Do NOT start any application yet that uses TSTZ data!
INFO: Next is a list of all upgraded tables:
Table list: "GSMADMIN_INTERNAL"."AQ$_CHANGE_LOG_QUEUE_TABLE_L"
Number of failures: 0
Table list: "GSMADMIN_INTERNAL"."AQ$_CHANGE_LOG_QUEUE_TABLE_S"
Number of failures: 0
Table list: "MDSYS"."SDO_DIAG_MESSAGES_TABLE"
Number of failures: 0
Table list: "DVSYS"."AUDIT_TRAIL$"
Number of failures: 0
Table list: "DVSYS"."SIMULATION_LOG$"
Number of failures: 0
INFO: Total failures during update of TSTZ data: 0 .
An upgrade window has been successfully ended.
INFO: Your new Server RDBMS DST version is DSTv33 .
INFO: The RDBMS DST update is successfully finished.
INFO: Make sure to exit this sqlplus session.
INFO: Do not use it for timezone related selects.

执行过后查看当前数据库时区,和源库相同都是33了!
image
此时再执行导入命令就可以了
image