
April 20, 2005

A Tour of the MySQL Source Code

I'm a little out of my league here, but for the sake of Jay, my co-author on Pro MySQL, I'm attending a presentation on the source code of MySQL by Brian and Monty (that's Aker and Widenius). If you're not in the loop, Brian is the Director of Architecture (and wrote the Slashdot system) and Monty is one of the founders, who wrote the original code.

Monty starts by asking why people want to know about the source code. Some people are interested technically, but others are curious about the process of bringing all the development together.

Server Design

The server is written in C, C++ and ASM. 80% is in C, covering the client library, networking, I/O, etc. The C++ isn't really object oriented; it's used like C but written in C++ for performance.

(a nice graphic to show how things interoperate)
MySQL -> Parse/Optimize/Retrieve Store -> Storage Engine
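To make the layering in that graphic concrete for myself, here is a rough sketch of the idea: the upper layer only talks to an interface, and the storage engine behind it can be swapped. Every name here (StorageEngine, MemoryEngine, Server) is my own invention for illustration and is not MySQL's real engine API.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical storage-engine interface; NOT the real MySQL engine API.
class StorageEngine {
public:
    virtual ~StorageEngine() = default;
    virtual const char *name() const = 0;
    virtual void write_row(const std::string &row) = 0;
    virtual std::vector<std::string> scan() const = 0;
};

// A toy in-memory engine (assumption: rows are just strings).
class MemoryEngine : public StorageEngine {
public:
    const char *name() const override { return "memory"; }
    void write_row(const std::string &row) override { rows_.push_back(row); }
    std::vector<std::string> scan() const override { return rows_; }

private:
    std::vector<std::string> rows_;
};

// Stand-in for the parse/optimize/retrieve layer: it never sees a
// concrete engine type, only the interface.
class Server {
public:
    explicit Server(std::unique_ptr<StorageEngine> engine)
        : engine_(std::move(engine)) {}

    void insert(const std::string &row) { engine_->write_row(row); }
    std::size_t count() const { return engine_->scan().size(); }
    std::string engine_name() const { return engine_->name(); }

private:
    std::unique_ptr<StorageEngine> engine_;
};
```

Swapping the engine means handing Server a different StorageEngine subclass; nothing in Server itself changes.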

Storage Engines

A quick look at the list of storage engines.

Filesystem

What's in MySQL directories.

Kernel

/sql - the running instance itself; the file naming scheme matches the functionality (delete, update, etc.); written in C++
/libmysqld - contains the embedded MySQL server

Storage Engine

/myisam /heap /innobase /merge

In 5.1 these will all be moved into a directory called storage.

Portable Runtime

The portable runtime is a place to centralize functions to run on different systems.

/mysys - where to find open, read, write
/strings (includes character sets)
/dbug - use to generate trace files
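As a small illustration of what "centralizing functions" can look like, here is a sketch of a wrapper layer that hides the POSIX vs. Windows differences behind one set of calls. The port namespace and its function names are hypothetical, made up for this example; they are not the actual /mysys functions.

```cpp
#include <cstddef>
#include <cstdio>
#include <fcntl.h>
#ifdef _WIN32
#  include <io.h>
#  include <sys/stat.h>
#else
#  include <unistd.h>
#endif

// Hypothetical portability layer: callers use port::* and never
// touch the platform-specific calls directly.
namespace port {

enum Mode { READ_ONLY, WRITE_TRUNCATE };

// Returns a file descriptor, or -1 on failure.
inline int open_file(const char *path, Mode mode) {
#ifdef _WIN32
    int flags = (mode == READ_ONLY)
                    ? (_O_RDONLY | _O_BINARY)
                    : (_O_WRONLY | _O_CREAT | _O_TRUNC | _O_BINARY);
    return ::_open(path, flags, _S_IREAD | _S_IWRITE);
#else
    int flags = (mode == READ_ONLY) ? O_RDONLY : (O_WRONLY | O_CREAT | O_TRUNC);
    return ::open(path, flags, 0644);
#endif
}

// Returns the number of bytes read, or -1 on error.
inline long read_file(int fd, void *buf, std::size_t n) {
#ifdef _WIN32
    return ::_read(fd, buf, static_cast<unsigned int>(n));
#else
    return static_cast<long>(::read(fd, buf, n));
#endif
}

// Returns the number of bytes written, or -1 on error.
inline long write_file(int fd, const void *buf, std::size_t n) {
#ifdef _WIN32
    return ::_write(fd, buf, static_cast<unsigned int>(n));
#else
    return static_cast<long>(::write(fd, buf, n));
#endif
}

// Returns 0 on success.
inline int close_file(int fd) {
#ifdef _WIN32
    return ::_close(fd);
#else
    return ::close(fd);
#endif
}

// std::remove is already portable; wrapped only for a uniform API.
inline int remove_file(const char *path) { return std::remove(path); }

}  // namespace port
```

The rest of the code base then calls port::open_file and friends, so a new platform only needs changes in this one place.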

Clients

/libmysql
/libmysql_r
/sql-common - common things between server and clients

/include

libraries

zlib regexp readline vio

test

The QA system.

In 5.0 most of the config stuff is in config/ac-macros. After 5.1 the file tree will be reorganized.

Adding a Function

You'd want to look at these files: lex.h, sql_yacc.yy, item_create.cc, myfunc.cc, myfunc.h and Makefile.am.
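My rough mental model of how those files fit together, as a self-contained sketch: a name table (the lex.h role), a factory (the item_create.cc role), and the function's own class (the myfunc.cc / myfunc.h role). All the types and names below are simplified stand-ins I made up, and the "doubles its argument" behavior is just an assumption for the demo; the real classes in /sql are considerably richer.

```cpp
#include <cstring>
#include <memory>
#include <utility>

// Hypothetical stand-in for an expression node in the parse tree.
struct Item {
    virtual ~Item() = default;
    virtual long long val_int() const = 0;
};

// A literal integer argument.
struct Item_int : Item {
    long long v;
    explicit Item_int(long long x) : v(x) {}
    long long val_int() const override { return v; }
};

// Role of myfunc.h / myfunc.cc: the new function itself.
// Assumption for the demo: MYFUNC(x) returns 2 * x.
struct Item_func_myfunc : Item {
    std::unique_ptr<Item> arg;
    explicit Item_func_myfunc(std::unique_ptr<Item> a) : arg(std::move(a)) {}
    long long val_int() const override { return 2 * arg->val_int(); }
};

// Role of item_create.cc: a factory the parser can call.
std::unique_ptr<Item> create_func_myfunc(std::unique_ptr<Item> a) {
    return std::unique_ptr<Item>(new Item_func_myfunc(std::move(a)));
}

// Role of lex.h: a table mapping SQL function names to factories.
typedef std::unique_ptr<Item> (*Creator)(std::unique_ptr<Item>);
struct FuncSymbol {
    const char *name;
    Creator create;
};
static const FuncSymbol sql_functions[] = {
    {"MYFUNC", create_func_myfunc},
};

// Role of a grammar action in sql_yacc.yy: look the name up and
// build the item; returns null for an unknown function name.
std::unique_ptr<Item> build_call(const char *name, long long arg) {
    for (const FuncSymbol &s : sql_functions) {
        if (std::strcmp(s.name, name) == 0)
            return s.create(std::unique_ptr<Item>(new Item_int(arg)));
    }
    return nullptr;
}
```

In this sketch, adding another function means one new class, one new factory and one new row in the table, which roughly mirrors why several files need touching.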

Adding Field Types

Edit field.cc and field.h, subclass Field and implement 12 methods.

Monty and Brian open it up to folks to suggest different fields they'd like. IPv4, IPv6, compressed blob and GUID come up.

An array datatype has been on the list. Someone has been looking at it, but it hasn't been committed to a certain version.
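To picture the "subclass Field" step, here is a sketch of what an IPv4 field might look like. The Field base class below is a hypothetical, heavily cut-down stand-in with only four methods; the session didn't list the actual 12 methods, so don't read these as the real ones in field.h.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical simplified base class; the real one in field.h differs.
class Field {
public:
    virtual ~Field() = default;
    virtual bool store(const std::string &text) = 0;  // true on success
    virtual std::string val_str() const = 0;
    virtual long long val_int() const = 0;
    virtual std::size_t pack_length() const = 0;      // bytes on disk
};

// Stores a dotted-quad address in 4 bytes instead of up to 15 characters.
class Field_ipv4 : public Field {
public:
    bool store(const std::string &text) override {
        unsigned a, b, c, d;
        char extra;
        // Exactly four numbers must match; a trailing character is an error.
        if (std::sscanf(text.c_str(), "%u.%u.%u.%u%c", &a, &b, &c, &d, &extra) != 4)
            return false;
        if (a > 255 || b > 255 || c > 255 || d > 255)
            return false;
        addr_ = (a << 24) | (b << 16) | (c << 8) | d;
        return true;
    }

    std::string val_str() const override {
        char buf[16];  // "255.255.255.255" plus the terminating NUL
        std::snprintf(buf, sizeof buf, "%u.%u.%u.%u",
                      static_cast<unsigned>(addr_ >> 24),
                      static_cast<unsigned>((addr_ >> 16) & 0xFF),
                      static_cast<unsigned>((addr_ >> 8) & 0xFF),
                      static_cast<unsigned>(addr_ & 0xFF));
        return buf;
    }

    long long val_int() const override { return addr_; }
    std::size_t pack_length() const override { return 4; }

private:
    std::uint32_t addr_ = 0;  // left unchanged when store() fails
};
```

Storing "192.168.0.1" and reading it back with val_str() gives the same text, while val_int() exposes the packed 32-bit value.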

Development Model

SCRUM (agile) is used for development. Tasks are broken down into 30-day chunks, which developers pick up.

There are three queues: raw, worklog and sprint. The worklog is managed by an open source tool called worklog.

First comes a high-level architecture document, then a low-level document, and then the task goes into the backlog queue. Once a task is in the backlog queue, any developer can pick it up, move it into sprint to work on it for a period of time, and then put it back in the backlog.

Once the code is developed, it's reviewed and then put into source. Even changes made from within the company must be approved. It often takes 3-4 iterations for a change to be approved before it is finally put in the tree. A test case is required for approval. When the change is put in the tree, the approving person is held accountable for the contribution, not the writer of the code. The senior developers allocate.

The code commits are sent to a mailing list so the community can review any changes in the system.

The developers of MySQL are expected to spend 80% of their time working on things assigned by the company; for the other 20% they can choose what to work on.

An authorization and authentication module is being written to enable LDAP authentication.

Posted by mike at April 20, 2005 4:25 PM