January 29, 2005

INFORMATION_SCHEMA in MySQL

I don't know how I missed this. MySQL 5.0.2 contains a completely new way to get information about the database from the database, using select statements on the tables in the information_schema database.

For those of us who have been on MySQL for a long time, we've gotten used to the show command and its limited output. The information_schema database aims to be a more consistent method to get at a lot more information about what's going on in the database. Don't be scared: MySQL has decided to leave the show command intact, since there are many of us who use it. In fact, to help us existing folks, MySQL 5.0.3 will have a path from the show command to the information_schema tables with:

SHOW TABLES FROM INFORMATION_SCHEMA;
Gee, that's pretty slick.
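
For a rough idea of what the select-based approach looks like from a script, here is a small Python sketch that builds a query against information_schema.TABLES. The helper name tables_query and the schema name "test" are made up for illustration, the %s placeholder assumes a DB-API driver that uses the format paramstyle, and nothing here actually connects to a server.

```python
# Sketch only: builds the SQL text and parameters, does not open a connection.
# TABLE_SCHEMA, TABLE_NAME and ENGINE are columns of information_schema.TABLES.

def tables_query(schema):
    """Return (sql, params) listing the tables in one database (hypothetical helper)."""
    sql = (
        "SELECT TABLE_NAME, ENGINE "
        "FROM information_schema.TABLES "
        "WHERE TABLE_SCHEMA = %s "
        "ORDER BY TABLE_NAME"
    )
    return sql, (schema,)


# "test" is an assumed schema name, not one mentioned in the post.
sql, params = tables_query("test")
print(sql)
print(params)
```

The pair would be handed to a cursor's execute() call in whatever MySQL driver is in use.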

For those of you who've been on another database, this is going to make managing MySQL a bit easier. SQL Server folks in particular should feel right at home, since SQL Server 2000 follows the same standard that MySQL is working toward (SQL:2003).

My favorite line in the documentation about why a new metadata retrieval method is needed:

Migration is easier because every other DBMS does it this way.
Every other? Do I sense some humor there?

Good for MySQL to be active in making the move easier for others, while also providing existing users a standards-based method for getting at the metadata. It makes us feel a little less odd when the syntax we use is not gibberish to administrators of other vendors' databases.

Posted by mike at 1:40 PM

January 24, 2005

MySQL Adds Federated Storage Engine

This is cool: MySQL has added (scheduled for release with 5.0.3) a new storage engine, called the federated storage engine. The gist: federated tables reside on a remote machine.

SQL statements are issued using the local client, as you would with any query. However, when MySQL gets the data, rather than pulling it from the local filesystem, it contacts another MySQL server and pulls the data from that machine.

The MySQL docs go into detail on how to do it. Essentially, you create a table on one machine, and then on another machine you create an identical table, but indicate the table type is federated. In the comment (that seems like a hack) of the federated table, you specify information on where the data really lives:

mysql://root@remote_host:9306/federated/test_table
No doubt there are performance issues to consider when using this kind of a storage engine, but it seems like this is going to open up a lot of options for breaking up storage and processing. Surely someone out there has been praying for the ability to have a dedicated server just to handle their lone X table.
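
To see what is packed into that comment string, here is a short Python sketch that pulls it apart with the standard library's URL parser. Reading the path as database/table is an assumption based on the shape of the example, and the variable names are just for illustration.

```python
from urllib.parse import urlsplit

# The connection string from the federated table's comment.
url = "mysql://root@remote_host:9306/federated/test_table"

parts = urlsplit(url)

# Assumption: the path is "/<database>/<table>".
database, table = parts.path.lstrip("/").split("/")

print("scheme:  ", parts.scheme)
print("user:    ", parts.username)
print("host:    ", parts.hostname)
print("port:    ", parts.port)
print("database:", database)
print("table:   ", table)
```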

There are a few interesting notes in the limitations section of the documentation:
- federated tables work with MySQL now, but may work with other vendors in the future
- little awareness of the real table (no query cache, alter, update)

Will be interesting to see what comes of this, I'm sure the 2005 MySQL Conference will include some conversation.

Posted by mike at 9:17 PM

January 22, 2005

21,052 Queries per Second on AMD64 MySQL

I've been doing some performance tests on the server I built (I suppose I should stop talking about building that machine from scratch). I wasn't trying to beat some record or metric, was more interested in comparisons between speeds of different queries and different database tools.

For the fun of it, I generated a file with a million select statements against a table in the database. I enabled the query cache and issued something like:

shell> mysql < million_in.sql > million.out
As shown here, I was running the MySQL command-line client, sending in a text file that contained a million select queries and putting the output into a file.
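
A file like that is easy to generate. The sketch below is one way to do it in Python, not the script used for the test: the table name test_table, the id column, the range of ids and the fixed random seed are all assumptions made for illustration.

```python
import io
import random


def write_queries(out, count, max_id=1000, seed=0):
    """Write `count` select statements to the file-like object `out`.

    test_table and its id column are hypothetical; swap in a real table.
    """
    rng = random.Random(seed)  # fixed seed so the file is reproducible
    for _ in range(count):
        out.write("SELECT * FROM test_table WHERE id = %d;\n" % rng.randint(1, max_id))
    return count


# Small demonstration into an in-memory buffer.
demo = io.StringIO()
write_queries(demo, 3)
print(demo.getvalue(), end="")
```

For the real thing, open million_in.sql for writing and pass it in with a count of 1000000.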

As I said, I wasn't attempting to break any records, but when the million-query process finished in 47.5 seconds, I wondered how many queries/second that might be, having seen other people use that metric. Well, 1,000,000 divided by 47.5 gives you 21,052 queries per second.
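
The arithmetic, spelled out as a quick Python check. The two input numbers come from the run above; the per-query time is just the same figures turned around.

```python
queries = 1_000_000
elapsed_seconds = 47.5

# Throughput: queries divided by wall-clock seconds, truncated to a whole number.
qps = queries / elapsed_seconds
print(f"{int(qps):,} queries per second")

# The same numbers expressed as average time per query, in microseconds.
micros_per_query = elapsed_seconds / queries * 1_000_000
print(f"{micros_per_query:.1f} microseconds per query")
```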

Having no idea what that meant, I went to Jeremy's weblog thinking I had seen something there along the lines of queries/second. I was actually quite surprised to find that I was in the same ballpark, albeit two years late.

I'm running the pre-built AMD64 MySQL binary on 64-bit AMD Gentoo Linux. The data directory is on a 10K SCSI disk. What I find interesting is that I only have one line in my MySQL configuration file, the one that turns on the query cache. Besides that, MySQL is running in its generic state.

Posted by mike at 10:14 PM

January 19, 2005

What is taking up the disk space on our servers?

I've been dealing with upgrading some disk space on a few of our TUSK machines and got to wondering just what is taking up space on our system.

Now, I realize we're small beans compared to some, but I still wanted to take a snapshot of what kind of storage we demand now to look back at down the road.

1.3 G - fop generated pdfs (from XML documents)
3.3 G - static pdfs
1.5 G - images on filesystem
7.6 G - other downloadable files (zip, powerpoint)
5.3 G - flashpix images
12 G - streaming media (real, quicktime, mp3)
3 G - ULMS data
25 G - MySQL data files

There are a number of smaller groups of files, didn't include those. All totalled we're around 60G.
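
As a quick check on that total, here are the figures from the list above added up in Python. Only the listed groups are included, so the smaller uncounted files are what push the real number toward 60G.

```python
# Sizes in gigabytes, copied from the list above.
usage_gb = {
    "fop generated pdfs": 1.3,
    "static pdfs": 3.3,
    "images on filesystem": 1.5,
    "other downloadable files": 7.6,
    "flashpix images": 5.3,
    "streaming media": 12,
    "ULMS data": 3,
    "MySQL data files": 25,
}

total = sum(usage_gb.values())
largest = max(usage_gb, key=usage_gb.get)

print(f"total of listed groups: {total:.1f} G")
print(f"largest group: {largest} ({usage_gb[largest] / total:.0%} of the listed total)")
```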

Another reason to document disk usage is an upcoming effort to provide more audio and video recordings of lectures, which may grow our data exponentially in the next year.

Posted by mike at 4:52 PM

The More I Write the More I . . . Play Guitar?

The book writing is in full swing. Working on my 4th chapter right now, due this Friday. I've got a backlog of entries about the writing experience. At some point it will all come out. The schedule is really aggressive, am spending about 4 hours a night writing and more on weekends.

The interesting side-effect of this is that the more time I spend researching technology and poring over clear explanations of things, the more I'm finding myself really interested in playing the guitar. Maybe it's just a nice escape, but I've worked up a number of songs I used to play on my acoustic guitar and take 5 minute breaks here and there to play one or two of them. Kind of strange, I haven't played the guitar very regularly for a few years. Now that I'm completely swamped with other things, I finally have time to play guitar? Doesn't seem to add up.

It kind of goes along with my theory that I do better at everything when I'm busy. The worst semester (grade wise) I ever had was the one where I didn't have a job or any extra commitments. The semesters I did best in school were the ones where I worked 30 hours a week, went to school full time and did all the lighting design and operations for a performance group at college. Crazy how that works.

Posted by mike at 12:26 AM

January 18, 2005

Back in the Good Graces of Gentoo Linux (AMD 64-bit)

I've been using the machine I built from scratch for a week or so now. I was pretty proud that I got it up and running without much of a hitch. I was particularly nervous about getting it to boot. There are no hard drives on the main IDE bus. I have SCSI disks on a PCI card, one for boot and one for data. I also wondered if there would be complications with the AMD 64 Gentoo. My first attempt to bootstrap the system failed with an error, but after resyncing the portage tree and trying a second time the error didn't appear.

I did have one other problem (not counting getting the driver modules loaded). After I had completed the install and gone through setting up GRUB I attempted to boot and got a failure that turned out to be generated by a misconfigured Linux kernel (compiled in the wrong SCSI card drivers). I recompiled the kernel and rebooted and was at the login prompt.

It's been a while since I tackled a Gentoo install, was a good experience. No matter how quickly I work, the Stage 1 install is always a several-day commitment because I don't sit around waiting for things to compile and end up forgetting.

Posted by mike at 11:10 PM

January 13, 2005

MarsEdit Seems Cool

Phil's trying it, and I'm intrigued. Perhaps it's one of those "if I had a better interface I'd use it more" things that doesn't make any lasting difference. I do not like waiting for the page loads in MovableType over the web. So far, MarsEdit seems pretty decent. I like how when you add a link, it automatically fills the prompt in with my latest URL cut from the web browser.

Posted by mike at 6:16 PM

Meetings with MIT on OpenCourseWare

We've been having meetings with the folks from MIT's OpenCourseWare project for a few months now. I have to say, they are a very nice bunch of folks and have some intriguing stories to tell about how opening their courses to the public is affecting MIT and the millions of users worldwide who are tapping into the knowledge.

I have a meeting tomorrow morning with the OCW technical folks and some folks from Akamai to look at how the content is distributed and what infrastructure is necessary to meet the demands on their system.

Yes, this all hints at Tufts doing something along the same lines, doesn't it?

Posted by mike at 12:49 PM

January 12, 2005

Replace Failed Mirrored Boot Drive on Sun 280R

There's nothing like working on a live, mission critical machine. Two weeks ago, our primary database machine failed to come up after some downtime for a UPS replacement in the data center.

I jumped on our console and determined we had a bad boot disk and had to boot from the mirrored disk. Apparently you can tell Solaris to do this automatically, but I think in most cases you want to bring it up manually.

I called Sun, who gave me the option of having someone come over and do it or sending the drive so I could do it myself. I opted to do it myself and came over to the data center today, after I had confirmed the drive had been delivered. The guy I spoke to at Sun pointed me to a tutorial on unixway.com, which describes resolving our issue in exact detail.

The 280R has hot-swap drives, which can be pulled with the machine running. Of course, you want to check and double check that you are pulling out the right drive (no, I've never pulled the wrong one).

Once you've replaced the failed drive with a working one, you partition the drive and then issue a series of DiskSuite commands to reset the RAID state database to start using the partitions on the new disk.

It is all pretty simple, but there's a good deal of stress. Before every step you hope there isn't something that, in hindsight, will explain why that particular command sent the machine into some unforeseen state. Yes, I've had that happen before . . . fortunately during a maintenance window.

Did I ever mention that my dream job is a trail maintenance worker for the forest service?

Posted by mike at 11:00 PM

AMD64 Gentoo Linux on Biostar K8 Motherboard

If you're attempting to put AMD64 Gentoo Linux on a machine with a Biostar K8 motherboard, the onboard ethernet isn't recognized when you boot from the LiveCD (I think I'm using the minimal one).

A look at the specs reveals the onboard LAN is a VIA VT6103. I dug around the drivers list and found two VIA options, the first one did the trick:

modprobe via-rhine
I love it when things just work, but if they don't just work I love it if the first thing you try makes it work.

Oh . . . and if your disk drives aren't showing up because you are using an Adaptec SCSI PCI card (no drives on motherboard):

modprobe aic7xxx

Posted by mike at 8:40 PM

January 11, 2005

Jobs Keynote Webcast Delayed 9 Hours

Bummer. I was planning on watching Steve Jobs' keynote today. I couldn't find a link anywhere on Apple or macworldexpo, ended up doing a Google search and finding out that they announced yesterday that it would be delayed.

Hmm. What could the reason be? Is it technical (post-production work)? Do they want the attendees to feel like they are a private audience? Are there things coming that need a more significant delay before other people hear Jobs say it? You know that the media, webloggers etc will cover this stuff, so now I'll be hearing about it through other sources when I wanted to hear it from the man himself.

Folks at MacWorld suggest it's because QuickTime 7 is coming out, and the keynote will be made available only in QT 7 as a way to get people to upgrade.

In the past, I loved watching the speech and then noticing that as he concluded the Apple website was updated with everything just announced. Good timing is all part of the marketing machine.

This year, the timing will be off for those of us not at the Moscone Center (at least 9 hours).

Posted by mike at 1:10 PM

January 5, 2005

Someday I'll go to MacWorld

Someday I will go to MacWorld, the one in San Francisco where Steve Jobs delivers the keynote that defines what a great keynote is. Watching the streaming video of the keynote is something I look forward to for weeks.

The other night I was at the house of a friend who was telling me about the new items coming at MacWorld. To be honest, I don't like to hear about new Apple products ahead of time. I'd much rather get the news from Steve, who does all the appropriate build up before unveiling new products.

Posted by mike at 11:05 PM

January 3, 2005

Students Using the Tufts Knowledgebase at Midnight, New Year's Eve

I was working on a script to automate a monthly rotation of the Apache logs on our webservers and discovered that there was a student from the dental school at Tufts University looking at dental images on TUSK at midnight on New Year's Eve. Being curious, I scanned through a few hours of the logs and found there was more than one.

I guess if you're serious about your studies, no time is a good time to take a break from reviewing your course material.
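
Scanning a few hours of logs for this kind of thing can be scripted. The Python sketch below is an illustration only, not the rotation script: it assumes the logs are in Apache's Common Log Format, and the sample lines, addresses and paths are invented.

```python
import re
from datetime import datetime

# host ident user [timestamp] "METHOD path ..." (Common Log Format, assumed)
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(?:\S+) (\S+)')


def hits_in_window(lines, start_hour, end_hour):
    """Return (host, timestamp, path) for requests logged with start_hour <= hour < end_hour."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that do not look like log entries
        host, stamp, path = match.groups()
        when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
        if start_hour <= when.hour < end_hour:
            hits.append((host, when, path))
    return hits


# Made-up sample entries; the addresses are from a documentation-only range.
sample = [
    '192.0.2.10 - - [31/Dec/2004:21:15:02 -0500] "GET /index.html HTTP/1.1" 200 5120',
    '192.0.2.44 - - [01/Jan/2005:00:02:41 -0500] "GET /images/dental_017.jpg HTTP/1.1" 200 48213',
    '192.0.2.77 - - [01/Jan/2005:00:40:09 -0500] "GET /images/dental_022.jpg HTTP/1.1" 200 51877',
]

midnight_hits = hits_in_window(sample, 0, 1)
visitors = {host for host, _, _ in midnight_hits}
print(len(midnight_hits), "requests from", len(visitors), "hosts in the midnight hour")
```

The hour is taken as logged, so it reflects the server's time zone rather than the visitor's.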

Update:
Anonymous Coward raises a good point, perhaps the student is in a different time zone? A check on the GeoBytes IP Locator reveals that the student's IP was in the Natick, MA area, not more than 45 minutes from our data center.

Posted by mike at 3:12 PM

September 1752 only had 19 Days

Doing some scripting today, I was looking at the manpage for the Unix program cal and saw this note:

An unusual calendar is printed for September 1752. That is the month 11 days were skipped to make up for lack of leap year adjustments.
I had never heard that, but sure enough, if you use the cal program for September 1752:
   September 1752
 S  M Tu  W Th  F  S
       1  2 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30
This will give me a new thing to check when I'm looking at a calendaring tool or playing with some gadget/widget. Anything that displays a full month for September, 1752 will get a bad mark.
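
Here is what that check might look like in Python. The helper name handles_1752 is made up for illustration. Python's own calendar module makes a handy first victim, since it uses the Gregorian calendar extended backwards and so reports a full 30-day month.

```python
import calendar


def handles_1752(days):
    """True if the listed days of September 1752 leave out the skipped 3rd through 13th."""
    return not any(3 <= day <= 13 for day in days)


# The days the Unix cal program prints for September 1752.
cal_days = [1, 2] + list(range(14, 31))

# The days Python's calendar module reports for the same month.
python_days = list(range(1, calendar.monthrange(1752, 9)[1] + 1))

print(len(cal_days), handles_1752(cal_days))        # 19 True
print(len(python_days), handles_1752(python_days))  # 30 False
```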

Posted by mike at 12:16 PM

January 2, 2005

The Books I Read in 2004

I was hoping to make 2004 a year of reading, and set my sights to read 12 books during the year. Ended up getting through 20.

My book log has the titles. I reviewed the year of reading today, flipping through a few of the books trying to decide if I had a favorite. Perhaps with time certain books will stand out more or less, but right now I think Atlas Shrugged, The Naked & the Dead and The Moon is a Harsh Mistress would all be somewhere in the top three.

I enjoy reading, and try to spend my 45-minute commute relaxing on the train with leisure reading. I used to read a newspaper and/or tech magazines on the train and have noticed this year I have been a little out of the loop on current events.

Posted by mike at 11:08 PM

Building My First PC from Scratch

I'm in the process of building a new computer, to be used for some research at my place in Boston and then moved to Utah to replace one of our existing machines. Most likely will run Gentoo Linux.

I'm trying to find a fine line between ultra cheap and the machine being usable. Since I have a collection of 10K 18G SCSI drives from the office (no longer in use), I figure I'll be set if I can get a cheap case, semi-decent CPU and RAM (have an old CD-ROM for installing OS). I looked at a slew of pre-built options but couldn't find anything that met my needs just right that was cheaper than getting separate parts.

I first put together a machine on newegg and then did some shopping around on a few other sites (tigerdirect) and couldn't find prices better than newegg except on the SCSI card (on eBay).

$33.25 - mATX case (with 300W power supply)
$59.00 - mATX AMD socket 754 motherboard, with video/audio/LAN
$129.00 - AMD Athlon 64 2800+
$136.00 - 1G DDR400 PC-3200 RAM
$8.00 - SCSI card

Add in shipping and we've got a pretty decent database or webserver for $388.96.
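
The numbers, added up in Python. The part prices and the total are from the list above; the shipping figure is only inferred as the difference between the two, it isn't a number quoted anywhere.

```python
from decimal import Decimal

# Part prices in dollars, copied from the list above.
parts = {
    "mATX case": Decimal("33.25"),
    "motherboard": Decimal("59.00"),
    "Athlon 64 2800+": Decimal("129.00"),
    "1G DDR400 RAM": Decimal("136.00"),
    "SCSI card": Decimal("8.00"),
}

subtotal = sum(parts.values(), Decimal("0"))
total = Decimal("388.96")

# Inferred, not quoted: whatever is left over after the parts must be shipping.
shipping = total - subtotal

print("parts:   ", subtotal)
print("shipping:", shipping)
print("total:   ", total)
```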

To be honest, I've never completely built my own PC. I have installed and removed PCI cards, hard drives, CPUs etc, but haven't actually ordered all the parts and assembled a computer from scratch. Excited to go through the process of putting it all together.

Posted by mike at 8:53 PM