MySQL – Part 1 – HA Concepts

There are several methods for MySQL HA, such as MySQL replication, MySQL Cluster, DRBD replication, Heartbeat + MySQL replication, and other third-party solutions.

I. MYSQL REPLICATION


[Figure: master-slave and master-master replication topologies]

1. Master-slave replication

Enables data to be replicated asynchronously from a master node to one or more slave nodes. The slaves can serve read-only queries.
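As a rough illustration of what this takes in practice, a minimal master-slave setup is sketched below; all host names, account names, passwords, and binary log coordinates are placeholders (the real log file and position come from SHOW MASTER STATUS on the master).

    # /etc/my.cnf on the master (placeholder values)
    [mysqld]
    server-id = 1
    log-bin   = mysql-bin

    # /etc/my.cnf on the slave
    [mysqld]
    server-id = 2
    read_only = 1

    -- On the master: create a replication account
    CREATE USER 'repl'@'%' IDENTIFIED BY 'replpass';
    GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';

    -- On the slave: point it at the master and start replicating
    CHANGE MASTER TO
      MASTER_HOST='master.example.com',
      MASTER_USER='repl',
      MASTER_PASSWORD='replpass',
      MASTER_LOG_FILE='mysql-bin.000001',
      MASTER_LOG_POS=4;
    START SLAVE;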

Advantages

  • Available on all platforms, with no data conflicts
  • Supports replication over slower links, partial links, and across different geographic regions.
  • Suitable for environments with heavy reads, but light writes

Disadvantages

  • No automatic failover. If the master fails, a slave has to be manually promoted to master to take its place.
  • Possible loss of data when the master fails. Because replication is asynchronous, a slow transaction can cause the slave to lag behind the master; if the master fails, the slave might not have received the last few transactions (a quick way to check the lag is shown after this list).
  • All writes have to be made to the master.
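To check how far a slave has fallen behind, you can inspect the replication status on the slave; the field names below follow the classic (pre-8.0.22) replication terminology.

    -- On the slave: replication health and lag
    SHOW SLAVE STATUS\G
    -- Check that Slave_IO_Running and Slave_SQL_Running are both "Yes",
    -- and use Seconds_Behind_Master as a rough indicator of replication lag.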

Recommended uses

  • Scale-out solutions that require a large number of reads
  • Logging/data analysis of live data. You can perform queries on the slave without affecting the operation of the master.
  • Online backup when you need an active copy of the data available.

2. Master-master (multi-master) replication

Enables data to be replicated between master nodes; each master replicates the other's changes (the replication itself is still asynchronous). All nodes are active and can serve both reads and writes.

Advantages

  • Available on all platforms, and a hardware (or software/OS) crash on one master does not lose the data, since the other master holds a copy.
  • Distributes the workload across master nodes (load balancing)

Disadvantages

  • Loosely consistent data. If two nodes update the same row at the same time and each subsequently replicates its change to the other node, that row ends up with a different value on each server: a conflict occurs.
  • MySQL does not perform conflict resolution in such situations, so you have to make sure your application caters for that possibility (a common precaution against auto-increment key collisions is sketched after this list).
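One widely used precaution, assuming two masters, is to give each master its own auto-increment sequence so that inserts on the two nodes can never generate the same primary key; it does not resolve general conflicts, only this particular class of them. The values below are placeholders, and each master is additionally configured as a slave of the other (CHANGE MASTER TO, as in the master-slave sketch above).

    # /etc/my.cnf on master 1
    [mysqld]
    server-id                = 1
    log-bin                  = mysql-bin
    auto_increment_increment = 2   # step size = number of masters
    auto_increment_offset    = 1   # this master generates 1, 3, 5, ...

    # /etc/my.cnf on master 2
    [mysqld]
    server-id                = 2
    log-bin                  = mysql-bin
    auto_increment_increment = 2
    auto_increment_offset    = 2   # this master generates 2, 4, 6, ...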

Recommended uses

  • High-speed applications

II. MYSQL CLUSTER

1. MySQL Cluster NDB storage engine

Provides synchronous data replication across the data nodes. All nodes are active and can serve both reads and writes.

Requires the NDB or NDBCLUSTER storage engine (“NDB” stands for Network Database). The data is spread out across multiple data nodes.
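Only tables created with this engine are stored in the cluster; for example (table and column names are placeholders):

    -- Issued from any SQL node: the rows are partitioned across the data nodes
    CREATE TABLE accounts (
        id      INT NOT NULL PRIMARY KEY,
        balance DECIMAL(12,2) NOT NULL
    ) ENGINE=NDBCLUSTER;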


MySQL Cluster is made of three main kinds of nodes:

  • Management node(s) [ndb_mgmd]: used to check the status of, and to start and stop, the nodes that are part of the cluster. They are also responsible for distributing configuration information about the makeup of the cluster.
  • Data/storage node(s) [ndbd]: store the data and do the low-level work of storing and retrieving it from either disk or RAM.
  • SQL node(s) [mysqld]: responsible for accessing the data. These nodes interface between the application and the cluster, accepting SQL queries and returning the data (a minimal configuration sketch follows this list).
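As a rough sketch, assuming one management node, two data nodes, and one SQL node (all host names and directories are placeholders), the cluster could be described like this:

    # config.ini on the management node
    [ndbd default]
    NoOfReplicas = 2                      # each fragment of data is kept on 2 data nodes

    [ndb_mgmd]
    HostName = mgm.example.com
    DataDir  = /var/lib/mysql-cluster

    [ndbd]
    HostName = data1.example.com
    DataDir  = /var/lib/mysql-cluster

    [ndbd]
    HostName = data2.example.com
    DataDir  = /var/lib/mysql-cluster

    [mysqld]
    HostName = sql1.example.com

    # my.cnf on the SQL node: enable the NDB engine and point it at the management node
    [mysqld]
    ndbcluster
    ndb-connectstring = mgm.example.com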

Advantages

  • Provides automatic failover.
  • High availability (no single point of failure)
  • Distributes the workload across nodes.
  • Data written on one node is synchronously distributed to the other data nodes.

Disadvantages

  • Available on a limited range of platforms.
  • Has a number of known limitations compared to the other storage engines

Recommended uses

  • Applications that need very high availability, such as telecoms and banking.
  • Applications that require an equal or higher number of writes compared to reads.

2. MySQL Cluster with Windows Failover Clustering


MySQL Enterprise Edition is supported with Windows Server Failover Clustering (WSFC). It requires at least 2 nodes and shared storage (the data is stored on the shared storage).

  • Only one of the cluster nodes has access to the data at any point in time.
  • Clients connect to the MySQL service through a virtual IP address (see the example after this list).
  • In the event of a failure, failover occurs automatically.
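From the client's point of view nothing changes on failover; applications simply keep connecting to the virtual IP address rather than to a physical node (the address and account below are placeholders).

    REM Clients always connect through the cluster's virtual IP
    mysql -h 192.0.2.100 -u appuser -p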

Advantages

  • Provides automatic failover
  • Ensures data consistency
  • Supported on the Windows platform

Disadvantages

  • Depends on shared storage
  • Only one node is active

Recommended uses

  • Windows platform

III. MYSQL WITH DRBD REPLICATION (DISK LEVEL)

This solution combines MySQL with DRBD and Heartbeat (or with Pacemaker and Corosync, the newer clustering stack).


DRBD (Distributed Replicated Block Device) is a Linux-based software component that facilitates the replacement of shared storage systems by networked mirroring. DRBD makes it possible to maintain consistency of data among multiple systems in a network:

  • DRBD creates a virtual block device (which is associated with an underlying physical block device) that can be replicated from the primary server to a secondary server. You create a file system on the virtual block device, and this information is then replicated, at the block level, to the secondary server.
  • Because the block device, not the data you are storing on it, is being replicated, the validity of the information is more reliable than with data-only replication solutions. DRBD can also ensure data integrity by only returning from a write operation on the primary server when the data has been written to the underlying physical block device on both the primary and secondary servers (a resource definition is sketched after this list).
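As a sketch, a DRBD resource for the MySQL data directory might look like the following (resource name, host names, devices, and addresses are placeholders; protocol C is the fully synchronous mode described above).

    # /etc/drbd.d/r0.res (DRBD 8.x style, placeholder values)
    resource r0 {
        protocol C;                 # write completes only after both nodes have the data
        on node1 {
            device    /dev/drbd0;   # virtual block device that holds the MySQL datadir
            disk      /dev/sdb1;    # underlying physical block device
            address   192.0.2.1:7788;
            meta-disk internal;
        }
        on node2 {
            device    /dev/drbd0;
            disk      /dev/sdb1;
            address   192.0.2.2:7788;
            meta-disk internal;
        }
    }

The file system holding /var/lib/mysql is then created on the DRBD device and mounted only on whichever node is currently primary.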

Heartbeat is cluster software; it is not a data replication/synchronization solution, but a solution for monitoring servers and switching the active MySQL server automatically in the event of a failure. Heartbeat needs to be combined with MySQL Replication or DRBD to provide automatic failover.

Note 1:

  • There are many kinds of cluster software, such as Heartbeat, Corosync, OpenAIS, and CMAN (also known as the Red Hat Cluster Suite).
  • Heartbeat has been considered deprecated since RHEL 7 / CentOS 7.

Note 2:

  • Corosync is suggested as an alternative.
  • Pacemaker and Corosync combine to provide the clustering layer that sits between the services and the underlying hosts and operating systems:
    • Pacemaker is cluster resource manager software, responsible for starting and stopping services and ensuring that they are running on exactly one host, thus delivering high availability and avoiding data corruption.
    • Corosync provides the underlying messaging infrastructure between the nodes that enables Pacemaker to do its job; it also handles node membership within the cluster and informs Pacemaker of any changes (a sketch of a typical resource stack follows this note).
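As a rough sketch of how the pieces fit together for MySQL on DRBD, the DRBD device, the file system, a virtual IP, and the mysqld service are defined as one Pacemaker resource stack that always runs on the current DRBD primary. The pcs commands below follow the older (RHEL 7-era) syntax; newer pcs releases use promotable clones instead of "pcs resource master", and every name, address, and path here is a placeholder.

    # DRBD resource, with a master/slave (promotable) copy on each node
    pcs resource create p_drbd_mysql ocf:linbit:drbd drbd_resource=r0 op monitor interval=30s
    pcs resource master ms_drbd_mysql p_drbd_mysql \
        master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true

    # File system, virtual IP and mysqld, grouped so they always move together
    pcs resource create p_fs_mysql ocf:heartbeat:Filesystem \
        device=/dev/drbd0 directory=/var/lib/mysql fstype=ext4
    pcs resource create p_ip_mysql ocf:heartbeat:IPaddr2 ip=192.0.2.100 cidr_netmask=24
    pcs resource create p_mysql systemd:mysqld         # assumes a mysqld systemd unit
    pcs resource group add g_mysql p_fs_mysql p_ip_mysql p_mysql

    # The group runs only where DRBD is primary, and only after it has been promoted
    pcs constraint colocation add g_mysql with master ms_drbd_mysql INFINITY
    pcs constraint order promote ms_drbd_mysql then start g_mysql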

Advantages

  • Provides high availability and data integrity across two servers in the event of hardware or system failure.
  • Can ensure data integrity by enforcing write consistency on the primary and secondary nodes.

Disadvantages

  • Only one-way synchronization.
  • Secondary nodes cannot use the DRBD device while data is being replicated, so MySQL on the secondary node cannot be active at the same time.
  • Cannot be used to scale performance, since you cannot redirect reads to the secondary node.

Recommended uses

  • High availability situations where concurrent access to the data is not required, but instant access to the active data in the event of a system or hardware failure is required.

IV. THIRD PARTY SOLUTIONS

Tungsten Replicator, Galera Cluster for MySQL, Oracle VM Template.

V. HA CONSIDERATIONS

The suitability of the various technologies for different scenarios is summarized in the following table.

HA solution              | MySQL Replication (master-slave) | MySQL Replication (master-master) | MySQL Cluster with Windows Failover Clustering | MySQL Cluster (NDB) | MySQL with DRBD
-------------------------|----------------------------------|-----------------------------------|------------------------------------------------|---------------------|----------------
Platform support         | All supported by MySQL | All supported by MySQL | Windows | All supported by MySQL | Linux
Supported storage engine | All (InnoDB required for auto-failover) | All (InnoDB required for auto-failover) | InnoDB | NDB | InnoDB
Shared storage           | No, distributed across nodes | No, distributed across nodes | Yes | No, distributed across nodes | No, distributed across nodes
Auto IP failover         | Yes, with Heartbeat | Yes, with Heartbeat | Yes | Yes | Yes, with Heartbeat
Auto database failover   | No | No | Yes | Yes | Yes, with Corosync
Auto data re-sync        | Yes | Yes | N/A, shared storage | Yes | Yes
Failover time            | 5 seconds + | 5 seconds + | 5 seconds + | 1 second or less | 5 seconds +
Replication mode         | Asynchronous | Asynchronous | N/A, shared storage | Synchronous | Synchronous
No. of nodes             | Master + multiple slaves | Master + master | Active/passive (master + multiple slaves) | 255 (+ multiple slaves) | Active/passive (master + multiple slaves)
Availability design      | 99.9% | 99.9% | 99.95% | 99.999% | 99.99%
