Explorations in Data Transformation Services
In my efforts to build a simple, straight-forward, data warehouse for my client, I rely on Data Transformation Services to move data from various databases to my aggregated data warehouse. I already posted about a problem I had with moving text columns a few days ago. Today, I'd like to talk to you about other problems I encountered, and how I managed to solve (or work-around) them.
MySQL conversions
One of the databases from which I had to convert data was a MySQL database. Accessing a MySQL database is not a big issue. Simply define an ODBC data-source and use that to connect the database. Installing the MySQL ODBC driver significantly improves the performance. The problem however lay in the conversion of data. MySQL datatypes may look like SQL server datatypes, but they are not always identical. A few examples:
- TinyInt in MySQL can be signed or unsigned. In SQL server, TinyInt is always unsigned.
- MySQL has a MediumInt datatype, which is a 3 byte integer. SQL server does not have that datatype.
On the Technet website you can find a whitepaper describing various issues when migrating from MySQL to SQL server.
Deployment problems
Once the DTS packages were finished, I was confronted with the problem of deploying the packages. When you define database connections inside a package, the logon information and the names of the databases are also stored inside the package. I tried to make this information flexible by introducing Dynamic Properties. At first, this seemed to work. But close investigation proved me wrong. The dynamic properties worked for the logon information, but the package still contained hard-code references to the databases I used to develop the packages. And there was no garantuee that the live database would have the same name as the development database.
I then saved one of the packages as VB6 code. That proved my suspicions on hard-code database references. I included the module in a VB6 testapplication and tried to make the logon and database info in the VB6 code flexible. Using the VB6 application, I then recreated the package under a new name and executed it on the server. The result from that package was identical to my original one. With that in mind, I decided to go that solution. I built a C# application to patch the VB6 code generated. The C# application then combines the patched VB6 code into a new VB6 application. And it works. But I can't stop thinking there must be a better solution for this. So if any of you has any ideas, feel free to comment.