Migration to Aurora just got a lot easier • The Register
Sponsored Moving your proprietary on-premises database to the cloud makes your infrastructure more flexible. Switching to an open source managed database that the cloud service provider takes care of for you also saves your DBA a ton of work in the long run.
In the short term, however, the latter involves heavy lifting. What about all existing transaction codes and schematics in your existing installation? Transitioning all of this to a new format in the cloud takes a lot of work.
So far, this might have given developers and DBAs pause when considering a transition to Aurora, Amazon Web Service’s cloud-native relational database. Developed from the ground up with the cloud in mind, the service offers many benefits, including high availability by default, flexibility, and simple license-free operation.
The cloud giant is already offering its Database Migration Service (DMS) to help with the transition, including a tool that makes it easier to convert schemas between on-premises and managed databases. Now he expects a new service called Babelfish to make the process a lot easier. It offers simple on-the-fly translation between Microsoft SQL Server (MSSQL) and Aurora’s open source PostgreSQL implementation.
The result, much like the Babel fish in Douglas Adams’ The Hitchhiker’s Guide to the Galaxy, is that an app that speaks MSSQL can be heard in Aurora, and vice versa. This should prevent a lot of headaches.
Application migration headaches
Amazon’s Schema Migration Tool simplifies the migration of a database from an on-premises system to Aurora. However, application developers face some subtle differences when migrating their code between different versions of relational databases, such as PostgresQL and MSSQL. For example, data types can vary subtly between PostgreSQL and MSSQL, even though they have the same name. This is because vendors often expand their data types to exceed ANSI standards so that they can offer more functionality to their customers.
A programmer working with the Money data type in MSSQL will experience issues when migrating to PostgreSQL. In Microsoft’s database, it has four-digit precision. In Postgres, he only has two. This can introduce changes in how apps work that might not always appear without extensive testing. Some types of data that exist in one database may not exist at all in others, requiring serious refactoring work on the application side.
Another problem arises with transactional semantics. MSSQL handles some transactions differently from Postgres. The first will examine all records for unique keys when loading all data into a table and ignore duplicates. PostgreSQL is less forgiving and will completely fail the load operation if it finds duplicate records. This can affect the behavior of the application.
Developers will also quickly run into issues with the language syntax in their database. Each has their own variation of language which often varies from Vanilla SQL. Microsoft’s TSQL dialect is different from PostgreSQL, which offers its own PL / pgSQL language. These two languages are procedural languages, including constructs that allow the database to return complex data types, but they include different language primitives.
Developers typically use syntax conversion tools to deal with this problem, automatically generating code to bridge the gap between the language of one database and another. This has drawbacks, however. This makes the code difficult to read, especially if it has already been generated automatically by an object relational management (ORM) tool. This makes the code difficult to verify, especially if the original encoder is no longer there. Additionally, it could violate an organization’s existing coding specifications.
Another way to solve all this tangled mess is to rethink things on the app side. It’s a big undertaking, however, involving rewrites and testing of systems already in production that might have multiple dependencies. It wouldn’t be so bad if you worked with a modular, microservices-based environment, but businesses migrating to Aurora won’t be at this point yet and will be working with traditional monolithic applications.
Reduce friction in the database migration process
Amazon has done its best to facilitate this migration process with the DMS and Schema Conversion Tool (SCT). Once the customer chooses an application to migrate, DMS helps move the application as is to a version capable of communicating with Aurora. AWS uses its SCT to analyze the legacy database access code and estimate the time it will take to migrate it to the Aurora environment, which leaves more time for complexities such as stored procedures. If you have any incompatibilities between your proprietary database code and the open source database, the company will provide experts to consult with you.
Babelfish is the company’s attempt to streamline and improve this process to make migration easier for customers. This is a capability of Aurora that automatically handles translation between MSSQL and PostgreSQL, handles appropriate data type conversions, and supports MSSQL language primitives when communicating with Aurora.
Your MSSQL application still thinks it is talking to Microsoft’s database, but you pointed it to a Babelfish server, which translates its commands for Aurora. Amazon says it will cut development time for customers making the jump to its managed relational cloud database.
In practice, there will often be work to be done upstream. You need to determine the data types that your application uses and then ensure that the correct extensions are installed on the Babelfish server.
Murali Brahmadesam, software director for Aurora at Amazon, promises Babelfish serious benefits. The system allows developers to use MSSQL objects through the PostgreSQL interface, making it easy to deploy existing applications while retaining most of the code.
“Babelfish will reduce the amount of code you have to rewrite by 90% or more for most applications,” says Murali. It also means it will reduce application development times, he says, allowing customers to move to the cloud faster and eliminating the need to change code tests or customer drivers. “Instead of taking a year to migrate your application from SQL Server to PostgreSQL, you can do it in a much shorter time, maybe a month or two,” he says.
Transition with Babelfish
Babelfish will be open-source on GitHub with an Apache 2.0 license so companies can host their own versions, but AWS will also offer it as a managed service. The company is upgrading its schema conversion tool to identify the quantity of a customer’s database schema supported by Babelfish, and will equip TBS to either make changes to existing database code automatically or make recommendations to the development team. “The tooling remains the same, but the customer experience will be better,” says Murali.
The benefit of connecting to Babelfish is that customers have the convenience of a lift-and-shift migration, but with the benefits of a managed database. They can switch to a cloud-native database engine, but not change anything in their application stack until they’re ready. You can even leave your app on-premises if you want, reducing access latency using Amazon’s Direct Connect service, says Murali.
The accelerated transition will appeal to customers eager to escape the draconian license fees of proprietary databases without investing in a major refactoring project, he adds.
“Now they have full control over changes to apps in their own timeline,” he says. “They can continue to develop in TSQL while using Babelfish if they wish, but over time they can upgrade to pgSQL.”
Developers can always add cloud native functionality on the Aurora side. The database can call lambdas on record inserts, allowing developers to extend its functionality with serverless functions in the AWS Cloud without modifying their existing application stack. They can transact using the database interface and schema they are used to through the MSSQL client driver and invoke lambdas for additional functionality on the Aurora side. Murali cites integration with the AWS Sagemaker machine learning service as an example use case.
Amazon’s Babelfish tool will support application developers who see the value in switching to an alternative database engine, but who are concerned about the task of migrating their business logic. It simplifies the onboarding challenge, reduces the friction involved, and leaves customers better prepared to take the leap.
Sponsored by AWS