April 14, 2004
Introduction to MySQL Cluster
Listening to Mikael Ronstrom's presentation on MySQL Cluster at MySQL 2004.
Clustering software originated in the telecom business. MySQL can use a new storage engine, the cluster engine, which provides scalability and redundancy on three levels: the number of application servers, the number of MySQL servers and the number of nodes in the cluster.
The database is distributed over many nodes, and each update is applied in as many places as you specify in the configuration. The database is automatically partitioned over the nodes. There is failover capability if a node fails, and support for both high update loads and high read loads.
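A minimal sketch of the idea, not the engine's actual algorithm: rows are assigned to a node by hashing the primary key, and each row is also copied to as many further nodes as the configured replica count asks for. The function name `place_row`, the use of CRC32 and the "next node in the ring" replica placement are all assumptions made for illustration.

```python
import zlib
from typing import List


def place_row(primary_key: str, num_nodes: int, replicas: int) -> List[int]:
    """Return the ids of the nodes that would hold a copy of this row.

    Hypothetical scheme: hash the key to pick a primary node, then put
    the remaining copies on the following nodes, wrapping around.
    """
    if not 1 <= replicas <= num_nodes:
        raise ValueError("replicas must be between 1 and num_nodes")
    # CRC32 is only a stand-in for whatever hash the engine really uses.
    primary = zlib.crc32(primary_key.encode("utf-8")) % num_nodes
    return [(primary + i) % num_nodes for i in range(replicas)]


# Every row lands on `replicas` distinct nodes, so losing one node
# still leaves a copy of each row elsewhere.
for key in ("user:1", "user:2", "user:3"):
    print(key, place_row(key, num_nodes=4, replicas=2))
```

The application never calls anything like this itself; the point is only that placement can be computed from the key, so no manual partitioning is needed.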
MySQL cluster is affordable, easy to use and upgrade to.
Features:
- transactions
- synchronous replication
- auto-sync at restart of NDB node
- if the complete cluster crashes, it can restore from checkpoints in the log
- online backups, no shutdown
- subscribe to row changes, see changes in gui
- two indexes (unique hash and T-tree ordered)
- online index build
- online software upgrade (rolling) - you upgrade the management server, then go through each node and upgrade it, upgrading the application last
All of Cluster is available in 4.1.
Research for the clustering was done from 1991 to 1996, drawn from Ericsson's experience in building reliable solutions for telecom. Development started in 1997, with releases every year or so; as of today, April 14th, version 2.4.5 is released with MySQL integration.
Requirements
- availability - less than 5 minutes of downtime a year
- performance - response time less than 5ms, throughput 10,000 transactions/sec
- scalability - many applications in distributed architecture
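To put the availability requirement in perspective, here is a small sketch that converts a yearly downtime budget into an availability percentage. The function name `availability` is just an illustrative choice, and the year is taken as 365 days.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def availability(downtime_minutes_per_year: float) -> float:
    """Fraction of the year the service is up for a given downtime budget."""
    return 1.0 - downtime_minutes_per_year / MINUTES_PER_YEAR


# 5 minutes of downtime a year works out to roughly 99.999% uptime,
# what is commonly called "five nines".
print(f"{availability(5) * 100:.4f}%")
```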
Performance
- 380,000 write transactions/sec (128-byte records)
- 1.5 million read transactions/sec (128-byte records)
The MySQL server uses the NDB API interface to get data from the cluster. MySQL can use a variety of storage engines, so you can have local MyISAM and InnoDB tables alongside cluster tables.
A record in the cluster storage engine can hold only 8000 bytes and blobs are not supported, but fixing that is high on the priority list. You can easily get data out of the cluster by using mysqldump and loading it into another table type, if you want non-cluster environments with data from the cluster.
Mikael goes into an under-the-hood description of how the engine does reads and writes.
Development is now going into MySQL functionality, allowing you to do anything available in the existing MySQL interface.
Because the storage is in memory, you need twice as much memory as data, which means that for 40G of data Mikael recommends 100G of memory (12G across 10 machines).
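The "twice the data" rule of thumb can be written down as a quick sizing sketch. The helper names and the even split across machines are assumptions for illustration; the result is only the floor implied by the rule, and the figures quoted above are higher than that floor.

```python
def memory_needed_gb(data_gb: float, factor: float = 2.0) -> float:
    """Total memory implied by the 'twice the data' rule of thumb."""
    return data_gb * factor


def per_machine_gb(total_gb: float, machines: int) -> float:
    """Even split of the total across machines (a simplifying assumption)."""
    if machines <= 0:
        raise ValueError("machines must be positive")
    return total_gb / machines


minimum = memory_needed_gb(40)
print(minimum, per_machine_gb(minimum, 10))  # prints: 80.0 8.0
```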
Posted by mike at April 14, 2004 3:55 PM
Comments
Thanks for the info! After reading Jeremy's note on the MySQL cluster and now yours, I got a much clearer picture of what MySQL cluster is and what it is not (I hope). It's definitely not a "Real Application" cluster in the Oracle RAC sense. Since the MySQL cluster concept is "in memory" (without shared storage), I guess that not only will we need to partition our data, but also write cluster-aware applications (not good!) to avoid performance bottlenecks on the interconnect? Am I right?
Do you know what kind of interconnect technology they used for the benchmarks? Gigabit Ethernet?
Regards,
Ales
Posted by: Ales at April 15, 2004 4:21 PM
It is not necessary for the application to partition data. This is done under the hood by a hashing algorithm and MySQL servers will load balance communication towards the NDB storage nodes.
3 interconnects are currently supported.
1) TCP/IP sockets
2) Shared Memory
3) SCI (www.dolphinics.no)
Rgrds Mikael
Posted by: Mikael Ronstrom at April 24, 2004 5:38 PM
Hello,
How is MySQL cluster different from using an OS cluster for MySQL? I mean, how is an application-based cluster solution different from a typical cluster solution, which should be used in a production implementation, and how should we decide?
Regards,
Jaygopal.
Posted by: jaygopal at July 28, 2004 12:14 AM