Utilizing Information Schema for Database Migration

SQL-92 is a milestone in standardizing the SQL query language for relational databases. One of its important features is the definition of Information_Schema, a set of read-only views for retrieving database metadata.

Information_Schema is quite useful when porting one SQL-92 compatible database to another, especially when you have a large number of backend scripts that involve metadata operations.

Information_Schema also brings a lot of convenience when we build database-driven solutions, since these solutions typically query database metadata quite often.
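For example, a backend script that needs column metadata can issue the same query against any database exposing the SQL-92 Information_Schema views. The sketch below lists the columns of a hypothetical employees table; the table name is only an assumption for illustration:

    -- List the columns of a hypothetical "employees" table, with data type
    -- and nullability, using only SQL-92 Information_Schema views.
    SELECT table_schema,
           table_name,
           column_name,
           data_type,
           is_nullable
      FROM information_schema.columns
     WHERE table_name = 'employees'
     ORDER BY ordinal_position;

Running the same statement against the source and target databases is also a quick way to verify that a migrated table arrived with the expected structure.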

Let’s see which major databases support SQL-92 Information_Schema:

RDBMS       Rank    SQL-92 Compatible Information_schema
Oracle      1st     Oracle data dictionary: desc dict;
                    Open source project: ora-info-schema
MySQL 5+    2nd     MySQL Info Schem...
Read More

What can Confluent or Kafka do for data synchronization?

Confluent is a streaming platform based on Apache Kafka. Apache Kafka is a distributed streaming platform that offers three key capabilities:

  1. It lets you publish and subscribe to streams of records. In this respect it is similar to a message queue or enterprise messaging system.
  2. It lets you store streams of records in a fault-tolerant way.
  3. It lets you process streams of records as they occur.

So Kafka is good for:

  1. Building real-time streaming data pipelines that reliably get data between systems or applications
  2. Building real-time streaming applications that transform or react to the streams of data

Now let’s get back to Confluent. Confluent offers Confluent Open Source and Enterprise editions. We may see Confluent as a wrapper of Apache Kafka...

Read More

Recap for Concurrency: Tracking Database Changes

Last month we talked about how to track database changes with one auditing table. If we want to replicate these changes into another database, we might use multi-threading to achieve high performance in data ingestion. However, our previous design tracks changes at the column level, which creates a serious concurrency control issue. Consider a data table in which 10 columns are updated at once: all of these changes are captured and inserted into my_audit_table as separate rows. If another table is updated at the same time, the audit rows from the two transactions become interleaved in my_audit_table, which makes multi-threaded reading even harder.
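To make the concurrency issue concrete, below is a minimal sketch of what a column-level audit table might look like; the layout and column names are assumptions for illustration, not the exact design from last month's post:

    -- Hypothetical column-level audit table: one row per changed column.
    CREATE TABLE my_audit_table (
        audit_id     NUMBER         NOT NULL,   -- populated from a sequence in the trigger
        table_name   VARCHAR2(128)  NOT NULL,   -- which data table was changed
        row_pk       VARCHAR2(256)  NOT NULL,   -- primary key of the changed row
        column_name  VARCHAR2(128)  NOT NULL,   -- which column was changed
        old_value    VARCHAR2(4000),
        new_value    VARCHAR2(4000),
        changed_at   TIMESTAMP      DEFAULT SYSTIMESTAMP,
        CONSTRAINT my_audit_table_pk PRIMARY KEY (audit_id)
    );

    -- A single UPDATE that touches 10 columns produces 10 rows here, and a
    -- concurrent update on another table interleaves its rows with them, so
    -- a multi-threaded reader cannot treat consecutive rows as one change set.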

Is there a better approach we should take? In this case, it's worth re-designing the audit_table. Let's see how it looks (in Oracle PL/SQL):

c...
Read More