Binlog-ETL

Binlog-ETL is a system for synchronising data from MySQL to a Hive data warehouse on Hadoop. Its main feature is that it keeps an up-to-date snapshot of the MySQL tables in the data warehouse while the MySQL databases continue to receive updates.

Binlog

Binlog is short for binary log. The MySQL binary log contains “events” that describe database changes, such as table creation operations or changes to table data. MySQL also uses it for master-slave replication. In this repository, the binary log is used to synchronise MySQL tables to a data warehouse built with Hive and Hadoop.
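To make the idea of replaying change events concrete, here is a minimal sketch of how row-level events could be applied to an existing table snapshot. It is not taken from this repository: the `RowEvent` class, its fields, and `apply_events` are hypothetical names, and real binlog events carry far more detail (table ids, column metadata, before and after images).

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional

Row = Dict[str, object]


@dataclass
class RowEvent:
    """Hypothetical, simplified stand-in for a decoded binlog row event."""
    kind: str            # "insert", "update", or "delete"
    key: int             # primary-key value of the affected row
    row: Optional[Row]   # row contents after the change; None for deletes


def apply_events(snapshot: Dict[int, Row], events: Iterable[RowEvent]) -> Dict[int, Row]:
    """Replay events in log order over a snapshot and return the new snapshot."""
    result = dict(snapshot)  # shallow copy so the previous snapshot stays intact
    for ev in events:
        if ev.kind in ("insert", "update"):
            if ev.row is None:
                raise ValueError(f"{ev.kind} event for key {ev.key} has no row")
            result[ev.key] = ev.row
        elif ev.kind == "delete":
            result.pop(ev.key, None)  # deleting a missing row is a no-op
        else:
            raise ValueError(f"unknown event kind: {ev.kind}")
    return result
```

Because events are replayed in log order, a later change to the same primary key overwrites an earlier one, which is what lets the snapshot track the current state of the source table.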

Features

Requirements

Data Flow

The Binlog-ETL system sits between the MySQL cluster and the Hive data warehouse. It actively requests binlogs from the MySQL cluster and creates snapshots automatically.
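The pull-based flow described above can be sketched as a loop that remembers how far it has read. This is only an illustration under assumptions: `FakeBinlogSource`, `read_after`, and `pull_once` are hypothetical names, the source is an in-memory list rather than a real MySQL connection, and the integer positions stand in for binlog file offsets.

```python
from typing import Callable, List, Tuple

Event = Tuple[int, str]  # (log position, payload)


class FakeBinlogSource:
    """In-memory stand-in for a MySQL server that serves binlog events."""

    def __init__(self, events: List[Event]) -> None:
        self._events = sorted(events)

    def read_after(self, position: int, limit: int) -> List[Event]:
        """Return up to `limit` events whose position is greater than `position`."""
        return [e for e in self._events if e[0] > position][:limit]


def pull_once(
    source: FakeBinlogSource,
    checkpoint: int,
    sink: Callable[[Event], None],
    batch_size: int = 2,
) -> int:
    """Request one batch after `checkpoint`, pass it to `sink`, return the new checkpoint."""
    batch = source.read_after(checkpoint, batch_size)
    for event in batch:
        sink(event)
    # Advance only as far as the last event that was actually delivered.
    return batch[-1][0] if batch else checkpoint
```

Calling `pull_once` repeatedly with the returned checkpoint resumes where the previous call stopped, so a restarted process that persisted its checkpoint would not need to re-read events it has already handled.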