
January 30, 2004

MySQL Replication Up and Running (and I'm Impressed)

I've been wanting to get into MySQL replication for a while. We've had conversations here for at least a year about having a slave running on a separate machine. Having the database replicated is good for several reasons, the two most important to us being:
1) In the instance of a hardware problem on the master, the application can be pointed at the slave.
2) Aggregating or indexing data (large, long SQL selects) against the production database can cause interruptions in service, which is rarely a good thing.

We're not quite to the point where we have a dedicated machine for replicating the database (coming summer 2004), but I thought it good to get something set up and let it run for a few months to get a feel for it. So I installed MySQL (4.0.17) on a semi-free box, changed the configuration on our production MySQL (3.23.54a) server to enable it as a master, and got a slave up and running. It really wasn't terribly difficult; the MySQL docs are well done (as always).
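
For my own reference, the setup boils down to a few lines in my.cnf on each side. This is a minimal sketch; the server IDs, hostname, and replication account are made up, and you still need to copy a snapshot of the data to the slave and point it at the right binary log position (CHANGE MASTER TO works for that as well):

# master (3.23.54a) my.cnf
[mysqld]
log-bin            # write updates to the binary log so slaves can read them
server-id = 1      # every server in the replication setup needs a unique ID

# slave (4.0.17) my.cnf
[mysqld]
server-id       = 2                      # must differ from the master's ID
master-host     = db-master.example.edu  # placeholder hostname
master-user     = repl                   # account on the master the slave connects as
master-password = secret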

The most impressive thing to me is the answer to this question: how closely in sync is the slave with the master? I scoured the MySQL docs but could not find a good answer to the question (probably looked right at it, but didn't see it).

I put two terminal windows side by side, with a tail -f on the bin-log for both machines. Although I know in theory there must be a delay between a write on the master and a write on the slave, when tailing the files there is no visible difference. I'm not forcing flushing to the logs, so there is some inconsistent delay between when the update happens in the database and when the log file gets written; that would explain why sometimes I see the update on the slave before the master. In most cases the difference between the master and slave is fractions of a second. I would guess that under heavy load the updates aren't quite as quick.
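
A rougher way to gauge the lag without staring at log files is to compare binary log positions. Both of these commands exist in the versions I'm running, though the exact columns reported differ a bit between 3.23 and 4.0:

SHOW MASTER STATUS;   -- on the master: current binary log file and position
SHOW SLAVE STATUS;    -- on the slave: how far it has read and executed in that log

Roughly speaking, if the file and position the slave reports match what the master reports, the slave has caught up.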

I think that's impressive . . . it makes me think differently about how a MySQL slave could be used. Maybe I'm easy to impress; I haven't spent much time around the "other" databases (other than a short, employer-mandated stint with Oracle).

Posted by mike at 2:47 PM

January 27, 2004

Shibboleth Overview

To get it straight in my head I'm writing out the details of the Shibboleth authentication process. We're working on becoming a target, which means people from other institutions can view our content.

General Overview
To get access to resources at another institution there must be a Shibboleth origin and a target. The origin holds user data and provides attributes about a user to the target, which the target uses to decide whether to grant access. Resources are shared between institutions by establishing a set of attribute acceptance policies.

Details (with acronyms galore)
I have a resource at Tufts, located at http://tusk.tufts.edu/content/1234. Someone from Dew University wants to look at that document. Tufts and Dew University have an agreement where any student in the Dew Master of Public Health program can look at Tufts documents.

The student requests the document from our shib-enabled server. The request is handled initially by the shib Resource Manager (RM), which allows the Shibboleth Indexical Reference Establisher (SHIRE) to step in and use a Where Are You From (WAYF) to determine which origin the user is from (chosen from a dropdown), and get a handle to ask about the user.

The SHIRE passes the handle to the Shibboleth Attribute Requester (SHAR), which gets a set of attributes about the requesting user and passes them back to the RM. The RM uses the attributes to decide whether to grant access.
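
I haven't gotten to configuration yet, but my understanding is that on the target side the end result looks like ordinary Apache access control handed off to the Shibboleth module. A sketch of what I expect the httpd.conf fragment to look like (the directive names are my assumption from the docs, not something I've tested yet):

<Location /content>
  AuthType shibboleth    # hand the access decision to the Shibboleth module (the RM)
  require valid-user     # grant access based on the attributes the SHAR retrieves
</Location>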

OK, that's better. I think I've got my arms around what will be happening. I still have some questions about establishing what attributes will be available, how the resource manager makes its decision, and what is done on subsequent requests. Hopefully all the questions will be answered as I get into setting up the test environment.

Posted by mike at 3:18 PM

Yet Another MT Adjustment

I've been thinking that some of the changes to the weblog weren't done quite right. I dedicated a few hours tonight and set up a new weblog to keep track of links, books, and music. I'm using the new weblog to create very simple summary pages to include here in this weblog (down the right column).

Posted by mike at 12:35 AM

January 26, 2004

Building Shibboleth from Source

A project that's been on my back burner for some time is getting a machine built as a test environment for Shibboleth-based authentication into our site. I've been following the mailing list and have scoured the documentation a few times over to get up to speed on what it takes.

The shib Linux documentation uses RedHat 7, 8 or 9 as a base to start the install, but I'm attempting to install the modules in my Gentoo environment. Just to see how different the binary build is I grabbed the binaries . . . after some playing I got a few of the Gentoo-installed libraries working, but there were so many question marks that I moved on to building from source.

In essence, the shib documentation has you build a complete set of libraries and install them in /opt/shibboleth. Rather than tempt fate by trying to incorporate existing libraries, I built all the sources (even if I already had the current libs in /usr/include).

When I got to compiling OpenSAML the compiler was croaking; I guess OpenSAML 0.9.1 doesn't compile with gcc > 3.2 (I'm using 3.3.2). I checked the head out of cvs, recompiled a newer version of libcurl, and OpenSAML compiled without issue.

Compiling libapreq for shib on Gentoo was slightly different: the Apache include files are in /usr/include/apache, not /usr/local/apache/include as indicated in the shib documentation.

The compile of Shibboleth 1.1 failed miserably, I assume because I was using OpenSAML from the cvs head. I got the latest Shibboleth source from cvs and the compile went through without a hitch.

When I tried to start Apache the mod_shibrm module wasn't there, and I discovered that you need to put --with-apache-13 in the configure args to get the module to build (maybe something wrong with the apxs).
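
For my notes, the general shape of the build once that flag is added looks roughly like this (the prefix comes from the shib docs; any other flags I used aren't captured here):

./configure --prefix=/opt/shibboleth --with-apache-13   # force the Apache 1.3 module to build
make
make install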

And the test script works . . . woohoo! Apache now starts with the mod_shibrm module loaded. Now on to configuring.

Posted by mike at 6:22 PM

January 25, 2004

80G for $19.99

Someday we'll look back and laugh at this, but I see in the Sunday advertising that Office Max has a Western Digital 80G SE drive for $19.99 (after 3 rebates). That is such a steal I'm having a hard time grasping it.

Pete points out that by now there's a good chance they are all gone, but I'm going to try first thing in the morning. I guess I'm a little behind since I don't usually look at the paper until Sunday night, and even if I did see it earlier I don't typically shop on Sunday.

Update: I got to OfficeMax at 8:30am, but all the drives were gone. They did have the 100-disc spindle that was on sale for $20 with a $20 rebate (making them free) so I picked one up.

Posted by mike at 10:52 PM

January 24, 2004

More Evidence Against 64-bit Binaries

Pete pointed this article out to me: some tests to determine the difference between 32-bit and 64-bit versions of applications.

Glad to see that my thoughts were pretty much on track.

Posted by mike at 2:58 PM

January 16, 2004

One Year of Weblog

A note is in order: it's recently been a year since I started these random thoughts.

Posted by mike at 8:59 PM

XML - Cleaning up Special Characters

I've written about using special characters in FOP transformations before, but haven't delved into the continuous headache that special characters pose in our XML authoring and web-rendering environment.

I'm setting out to solve this problem, starting today (although it's 5pm on Friday of a three-day weekend so it will really be on Tuesday). The gist is that we get input from many places, very few of which make sure the XML is well-formed. In attempting to transform the XML we end up with a parser error, caused when our XML parser cannot get past the invalid characters.

Having XML in bad form has caused us many headaches; we are continually cleaning up data *after* someone has reported a 500 error. The better solution is obviously to have a library somewhere that cleans up the data.
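
The likely starting point for that library is a routine that strips the control characters XML 1.0 doesn't allow (everything below 0x20 except tab, newline, and carriage return). A minimal sketch in Perl of what I have in mind; the name and interface are just placeholders:

use strict;
use warnings;

# Remove control characters that are not legal in XML 1.0
# (legal: #x9, #xA, #xD, and #x20 and above).
sub clean_xml_text {
    my ($text) = @_;
    $text =~ s/[\x00-\x08\x0B\x0C\x0E-\x1F]//g;
    return $text;
}

That only covers the invalid-character case; badly encoded entities are a separate cleanup problem.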

I'm starting with an exploration of existing documents on the subject:

A lot of these documents are from more than a year ago; I guess we're a little behind. We started storing our documents in XML back in 2001, but after the initial work to get the authoring environment and rendering done we've been focusing on other things.

Posted by mike at 6:04 PM

January 14, 2004

Using IO::Handle & IPC::Open2

Solved a small problem today that got me up to speed on a few things I haven't looked at in a while, namely Tidy, IO::Handle, and IPC::Open2.

We clean up HTML with Tidy before attempting to convert it to our flavor of XML. Back when we started using Tidy (2000), it was only available as a command-line script. That has since been changed and you can now get it in library form. Tidy does an excellent job of creating good HTML from bad.

We wrote a Perl module, HTML::Tidy (it looks like someone else has since had the same idea). The module is simply a convenient central point for cleaning up HTML files or strings.

The puzzle today was that the library seemed to no longer work; the return value was coming back empty. On further inspection I found that a section of code like this was causing the problem:

use IO::Handle;
use IPC::Open2;

# $tidy_command and $untidy_html are set earlier in the module
my ($read_fh, $write_fh) = (IO::Handle->new(), IO::Handle->new());
open2($read_fh, $write_fh, $tidy_command);
$write_fh->print($untidy_html);
my $tidy_html = <$read_fh>;   # comes back empty: tidy hasn't seen EOF on its input yet
$write_fh->close();
$read_fh->close();

The Perl Cookbook example is very similar and didn't seem to work either. A little poking around and playing with the code and I discovered that nothing gets into the read filehandle until the write filehandle gets an end-of-file signal. A very slight change and we're back in business:

my ($read_fh, $write_fh) = (IO::Handle->new(), IO::Handle->new());
open2($read_fh, $write_fh, $tidy_command);
$write_fh->print($untidy_html);
$write_fh->close();           # close the write side first so tidy sees EOF and produces output
my $tidy_html = <$read_fh>;
$read_fh->close();

This was one of those "how did this ever work" situations. The calling script hadn't been used in many months, maybe even a year. In that time it's quite possible that everything (Tidy, IO::Handle and IPC::Open2) had been updated and the HTML::Tidy broke on some interface issue with one or more of them.

Posted by mike at 6:10 PM

January 9, 2004

RCA a Little Off on iPod Comparison

I'm watching an interview with RCA's Dave Arland showcasing the Lyra Jukebox, a very small 40G music player with FM radio. Price is "just over $400."

When asked why the Lyra might be chosen over the iPod Arland says:

These products work with the PC, so since most consumers have PCs, not Apple computers, you are able to easily take the mp3s files you already have and easily drop them right into a device like this. Our products typically have more memory and they cost less . . .

I guess he must have missed the iTunes for Windows announcement back in October. The clip was filmed at CES, the same event where HP made its announcement about the HP-branded iPod and iTunes being installed on all shipped PCs.

I'm annoyed when people at organizations don't keep up to speed on what the competition is doing.

Posted by mike at 4:45 PM

Final Word - Sun v240 and v210 working with Lightwave SCS820

Update: In finalizing the cabling I realized the cables I was using on these machines were straight-through cables, not ethernet.

I can finally report a conclusion to the battle of getting two Sun machines integrated into our console server infrastructure. We have a Lightwave (now Lantronix) SCS820 secure console server, which we use to remotely administer the machines. Works like a champ.

Up to this point, all of our Sun machines (U60, E250, 280R, SS20) have had a DB25 serial port on the back, which nicely converts to RJ-45 ethernet via a Lightwave adaptor. The v240 and v210 have RJ-45 ports on the back, which led me down a path of many Cat-5 cable types and adaptors trying to find the right configuration. I got close with a straight ethernet cable, but a lot of binary junk on the console and no input kept me looking for the right answer. When I thought I had no more options I dug up the pinouts for both the v240 and the Lightwave and put the RJ-45 crimper to use, but the v240 pinout has two signal grounds while the Lightwave has one SG plus both a DCD and a DSR (which Sun puts on one pin). I built two cables trying different options with the DCD and DSR, but after two tries and no luck I decided I'd spent enough time on this and was going home to get online and maybe on the phone with Lantronix.

In a somewhat humorous last-ditch effort I decided to get out all the adaptors I had and string them all together in some Dr. Seuss-like contraption. In my searching I found an RJ-45 coupler. Wouldn't you know it, I put that between two ethernet cables connected to the console and server and I've got a boot prom.

The final word is, if you need to connect a Sun v240 or a v210 to a Lightwave (Lantronix) SCS820 or SCS1620 (might include more Lantronix models) you need to get Lantronix part 200.0225: RJ45 to RJ45 rolled, Sun Netra and Cisco equipment. The adaptor actually says it's for a Sun Netra, but I guess the v210 and v240 pinouts are the same as on the Netra.

A huge relief to have this figured out.

Posted by mike at 2:52 PM

January 7, 2004

Frustrating Day Battling Pinouts for Sun Servers

After yesterday's trouble with connecting servers to the console server I went back with standard ethernet, rolled, and crossover cables. I attempted every combination possible, using null-modem and DB9 adaptors after straight connections wouldn't get anything.

The closest I got was with an ethernet cable . . . I was able to get the machine to spit out a few words mixed with some binary garbage onto the console, though no keystrokes got through.

Now I'm working through documents trying to better understand pinouts for Sun servers to see if I can actually come up with a theory on what cabling *should* work.

An interesting find in the Sun docs indicates that I need to build my own cable:

the serial port may also be connected to a serial console/terminal server, but may require a custom RJ45-RJ45 cable depending on the pinout of terminal server ports. The pinout of the ALOM serial port is provided in the ALOM Online User Guide. An alternative to creating a custom cable, is to create a chain using the standard serial adapters and cables that come with the ALOM server and the terminal server. This chain may need gender changers and null-modem adapters included in the middle.

Another document explains that either I can buy a special adaptor or build the cable myself.

Posted by mike at 9:08 PM

January 6, 2004

Lynx (almost) Saves the Day - Getting Sun Machines on Console Server

I'm in the data center today, attempting to get some machines (Sun v210, Sun v240) connected to our console server. I notice that Sun machines no longer have a serial port on the back, but two RJ-45 ports: one says SERIAL MGT and the other NETWORK MGT. Hmmm . . . what to do? I don't want to have to go back home to get documentation and then come back.

I don't have a spare public network connection for my laptop, but I can get on the private network by plugging into the switch in our rack and assigning my laptop a 10.0.0 address. I ssh via the private LAN to a machine on the public network, and then ssh once more to another outside box with Lynx. I browse to the Sun site, track down the PDF, scp it back through the chain of machines, and I'm reading the documentation.

The extent of the instructions is:

connect to the server using an RJ-45 patch cable

From poking around a little I find that the correct port is SERIAL MGT, but there is no indication of what kind of cable to connect. I stick in a normal Cat-5 ethernet cable, the only kind I have available, and no luck. I had to leave without completing the mission, unsure of the exact flavor of "RJ-45 patch cable" needed.

After a conversation or two and some poking around it appears that the correct cable is either a rolled or a crossover cable. Back to the data center with new cables as well as my RJ-45 crimper as backup.

Posted by mike at 10:24 PM

January 5, 2004

No More PDA for Me

I'm trying something . . . an experiment. I've been using a Palm for 4+ years now and am putting it back in its box for a while. I've been debating/deliberating (with myself) about what to do about personal organization for some time and decided to revert for a while and see if I can be more satisfied. I think the original idea to have a Palm was driven more by a love for technology than by how well a PDA would suit my needs.

Saturday I purchased a nice looking, 5.5"x8", week-at-a-glance, bound journal (not one of those cheapo spiral bound ones).

What's the Problem with the Palm?
First, I'll make it clear that I'm not willing to spend a lot of time customizing the Palm. When I first got it I put a PDF reader on it and synced some PDFs up (reading on the Palm screen stinks), a document editor, etc. The Palm stinks as a tool for reading or editing documents, especially when my laptop is usually nearby. I had AvantGo installed for a while, but again, reading documents (even short articles) on that small screen wasn't acceptable. Same story for maps, games, etc. After the novelty wore off my Palm usage boiled down to contacts, appointments, some meeting notes, and lists (wish list, todo, etc). I think I used the calculator once or twice.

Contacts and lists work quite well. Easy to get to and change, but not needed nearly as much as the calendaring and meeting notes.

The core problem I had with the current Palm interface is in calendaring: the difficulty of referencing past or future events and organizing notes for those events (I guess it might just be the Palm OS, but I suspect the problems apply across all PDAs).

The Palm was great at showing today: the appointments, reminders, etc. When it comes to the past or the future, the interface makes quick navigation difficult. For example, two months ago I had a meeting and took a few notes. I remember very little about it but know there was some key information I should reference for an upcoming project. A search doesn't help because I don't remember the details. I go into the "month" view and see each month with tiny indications on the days where I have something entered. I click through each month, clicking on all the days with events, trying to track down the note I'm looking for. When I find a possible match I click on the meeting, click on "details," and then click on "notes" to see what is there. If it's not the one, it's back out to the calendar view and on to the next day.

Yeah, maybe it's me; maybe I need a brief course on using the Palm OS in the most efficient manner, but I anticipate that leafing through my pages will be a better experience.

I also am looking forward to the year on paper because I like writing (in controlled quantities) as well as being able to flip through the months and remember the events, something I've never done on my Palm.

This change has shaken a few people. I still believe in and love technology, just not for every single thing I do.

Posted by mike at 5:59 PM

January 1, 2004

Visit to Dr. Seuss's Hometown and the Eric Carle Museum

A short New Year's Day trip to western MA to visit Springfield, the hometown of and memorial to Dr. Seuss (Theodor Seuss Geisel), as well as Amherst, home of the Eric Carle Museum of Picture Book Art (not far from Eric's current residence in Northampton, MA). Yes, a trip centered completely around the kids (what isn't?).

The Ludlow, MA Comfort Inn high-speed internet works as well as anywhere (1377 kbps down, 329 kbps up).

Posted by mike at 11:01 PM

One Explicit Goal for the New Year

I rarely formalize goals, but this year I am hoping to get more serious about balancing my technical reading with non-technical reading. This past year I got through a few leisure reads, but not as many as I think is good for a person.

My mother-in-law and I got to poking around and stumbled onto a "Best of" books list that inspired me to get going on reading some classics. We found three lists (among a slew) that we liked, and I wrote a little script to aggregate the lists, remove duplicates, etc.

Combined Best 20th Century Books
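
The script is nothing fancy; the heart of it is just a hash to collapse duplicate titles across the lists. A sketch with made-up filenames:

use strict;
use warnings;

# Merge one-title-per-line list files, dropping duplicates (case-insensitive).
my %seen;
my @combined;
for my $file ('list1.txt', 'list2.txt', 'list3.txt') {
    open my $fh, '<', $file or die "can't open $file: $!";
    while (my $title = <$fh>) {
        chomp $title;
        push @combined, $title unless $seen{lc $title}++;
    }
    close $fh;
}
print "$_\n" for sort @combined;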

My goal is to squeeze 12 books from this list into my regular reading in 2004.

Posted by mike at 1:43 AM