June 29, 2003

Planning for Portland

Have been putting together a package of Portland information for next week's trip to OSCON. Haven't spent much time in Portland, so I want to maximize non-conference time to see things and get a glimpse of what Portland is all about.

My MAX/TriMet visitor passes arrived this past week. They will come in handy for getting to the hotel from the airport, and I hope to use them to get around and see some stuff. Glad to see the hotel is close to Tom McCall Waterfront Park, which should make for a nice running route.

Asked a couple of friends who either grew up in Portland or have recently lived there (none live there now) to recommend places to see/eat. Powell's is a must; Finnegans Toys (old-school, authentic) and the Fox Tower theater (indie/foreign films?) were also recommended. For eating, Pearl Bakery, Ole Ole and Elephants Deli were passionately spoken of. One friend demanded that I track down Snow White crepes (paragraph at end of article), a cart which appears in a downtown parking lot around lunchtime. Of course the zoo, Rose Gardens and Forest Park might be worth visiting.

Now just need to finish up my presentation (I turned something in by the deadline but think it still needs some work).

Posted by mike at 9:50 PM

June 27, 2003

Branching All Modules in CVS

I'm not sure if this is recommended, but it works. We were branching our CVS code today and I got tired of issuing a branch statement for each CVS module, so I cd'd into the repository root ($CVSROOT) and issued this command:

cvs rtag -b Sep_2003_pre2 *

Much easier than going down the list of modules, and a sure way to get everything tagged.
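
One thing to watch with the wildcard: run from the repository root, `*` also expands to the administrative CVSROOT directory, so that gets handed to rtag along with the real modules. A minimal sketch of a more explicit loop follows. The function names and the DRY_RUN switch are made up for illustration, and it assumes every other top-level directory in the repository is a module you want branched.

```shell
# List top-level module directories in a CVS repository,
# skipping the administrative CVSROOT directory.
list_cvs_modules() {
    repo=$1
    for dir in "$repo"/*/; do
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
        [ "$name" = "CVSROOT" ] && continue
        printf '%s\n' "$name"
    done
}

# Branch every module with the given tag.
# By default this only prints the commands; set DRY_RUN=0 to run them.
branch_all_modules() {
    repo=$1
    tag=$2
    list_cvs_modules "$repo" | while read -r module; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "cvs -d $repo rtag -b $tag $module"
        else
            cvs -d "$repo" rtag -b "$tag" "$module"
        fi
    done
}
```

Calling `branch_all_modules "$CVSROOT" Sep_2003_pre2` prints one rtag command per module, which makes it easy to eyeball the list before doing it for real.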

Posted by mike at 4:08 PM

Triple Head Freak Show

Andy recently got his triple-head configuration going. He was slated to get three identical black monitors, but his announcement of leaving in August nixed the order for the third one (so he's using a spare).

Ever since, he's had a stream of curious onlookers (most of them library staff who are passing by). Even though it looks like overkill to most, the case is pretty easy to make when he explains. The left monitor is used for email/IM, the middle monitor for Emacs/coding and the right monitor is designated for the browser. He's still tweaking the arrangement. The books used for monitor stands have also sparked some interesting conversation about what books are generally considered useless.

Honestly, I don't know why more people don't do dual monitor. You can get dual head cards for under $100 and CRT monitors are cheap (the Dell 19" black pictured here was ~$200). Maybe I'm not in tune and it's more popular than I think.

Andy runs Gentoo; it took him a few hours to get it set up (mostly doing research on how to modify the X11 config file).

Posted by mike at 8:22 AM

June 25, 2003

Our Linux Plan

One of the most exciting pieces of the server improvements plan I'm working up is the possibility of moving down the Linux path. A few of the developers have dreamed of this, and we've dropped hints about it from time to time. I guess with the good press Linux is getting (CIO's front page article caught a few eyes) and worries about Sun's future, the mention of using Linux no longer throws people into a state of hysteria. The project director is letting me drive this decision, with an open mind that Linux may be the better choice.

So what is necessary to prove the case for investing in Linux over Sun?

Concerns & Issues Which Have an Obvious Answer

The number of other high-profile companies who have switched carries a lot of weight. Google is the pillar example; the story of their conversion to Linux dispels a lot of doubt. I found this site full of information about dozens of companies who are using Linux. If Google, ETrade, Amazon, FedEx etc. are doing it then we know the path is well worn.

Financially it's pretty straightforward . . . near impossible to find a high-end Linux box that costs more than a Sun. The highest levels of hardware and software support from vendors like Dell and IBM ring up considerably cheaper.

Support concerns typically arise during Linux conversations. Even though the service level for Linux can be made to match what we have from Sun, there are concerns about how good that support is (we've had good luck with Sun).

So we have to throw away our expensive Sun boxes? No, we continue to host our MySQL database on our SunFire 280R and use our E250 as both a slave backup to the database and a machine to do indexing etc. Having the critical data on these Sun-supported machines eases the pressure of ensuring that our Linux boxes are supported and as robust as the Sun boxes. Using Sun boxes for highly-available data is not a new idea.

Concerns & Issues Without an Obvious Answer

How sure are we about Sun's future? We have a lot of expertise on Sun and are quite comfortable doing administration of the machines. Does it make sense to switch and add another set of variables to our environment?

We subscribe to University Systems Group services at Tufts. They build our machines with Jumpstart, back them up, provide monitoring services and in general are a wealth of knowledge when dealing with sysadmin. Can those services be utilized on the Linux boxes? How much work is it to provide the services for ourselves (if USG doesn't support them)? How much do we alienate ourselves by embracing technology that isn't "standard" within the organization?

Building the machines (carryover from USG question). Since we won't have someone else building our machines, what methods do we put in place to build these boxes? Do we do some sort of ghosting, or use a base install from a CD and have a set of RPMs to dump onto the machine, etc?

We have a console server which connects to serial ports on our Sun boxes. The console server docs claim it can be used with Linux machines as well, what kind of specs are needed on the machine to allow this? Or do Intel boxes typically come with a serial port which by default can be attached to a console server?

As we plan to add load balancing across these machines is there anything about the load balance process that is OS specific (checking availability, CPU load)? Most likely a silly question.

The data center uses Legato for backup, which is available for Linux. Will there be any issues backing up Linux boxes?

Is there any reason to doubt Intel/Linux will be a good long-term solution? Is AMD a better choice than Intel for Linux?

If the Linux movement took a drastic turn and went bad, would investing in it still have been a good idea? The mentality with Sun hardware is that when you buy a box it's going to have to last 10 years to get your money's worth. Not easy to shift to a mentality that buying servers is a shorter-term commitment. If we invest in four or six Linux boxes and Linux went sour tomorrow we'd still be able to let the boxes live out their lifetime. Probably a better choice than buying Sun boxes and finding out tomorrow that Sun's gone under and we've got 10 years until we have "used up" the hardware.

Will CPU, I/O and memory on Linux boxes perform as well as (or better than) Sun? Another silly question?

That's all for now . . . but more to come shortly.

Posted by mike at 4:16 PM

Building Redundancy

We're a pretty small shop. When I came here two years ago everything was on a single Sun Ultra60 (that was on a desk in an office). That included an instance of Apache for production, Real server, MySQL, and a development port running a minimal set of Apache processes for each developer.

A bad arrangement overall.

Over the last two years I have built that into a system with four machines. One webserver, one database server (with attached A1000 hardware RAID array), a test machine and a development machine (all Solaris). I recently stuck additional network cards in each machine and a switch between them for private data transfer between machines (primarily MySQL).

This system is far from ideal. The expectation these days is that under no circumstances should our service be unavailable. While our servers can usually handle the traffic, it is hard to ensure that services will be available because problems in a number of places could bring the system down for hours (or days).

I've been thinking a bunch about adding more hardware to provide redundancy, and how we'd go about that.

Areas where we're doing the right things with our system:
1. All machines have the OS installed on a separate drive
2. All drives are RAID 1 (mirrored)
3. Data is synced out to machines from a central location (making a failed machine easy to rebuild)
4. Machines are in the Tufts "data center" which is physically secure, has UPS, provides nightly backups, and electrical feeds from two separate power grids (the building is on a town line and gets feeds from both towns).

Areas where we could easily get screwed:
1. Lone webserver machine
2. Lone database server machine
3. Each machine has only one connection to public network
4. Each machine has only one connection to private network
5. No backup switch for internal network

In addition, performance could be improved with:
1. Firewall appliance, removing burden from machine CPU
2. Gigabit private network between machines
3. Hardware SSL encryption, removing burden from machine CPU

So I'm writing a proposal to add machines and other pieces of hardware to our system to give us more redundancy and better performance. In general I have support to do this; we even have some funds which could be applied to this. The plan needs to be built in a way that we can bite off chunks of it as funds are available. Most likely it will be something that is realized over the course of a few years.

I think this is an incredibly exciting project to be heading up. I enjoy researching, gathering data, getting new hardware, building new machines etc.

Posted by mike at 2:33 PM

Visit to Krispy Kreme

I made the trip to Krispy Kreme this morning. The line was out the door and there was a group of police officers directing traffic to a large overflow lot across the street. I was not deterred and found once I got in line that the staff was well organized. Only had to wait about 5 minutes. Employees were coming down the line and taking orders, once they got to me I was free to approach a cash register and buy my dozen.

The shop is adjacent to a large parking garage where commuters park and catch the subway, so most people grabbed between one and five dozen and headed for the trains.

The kids and I sat in the shop and ate, I stopped after my 5th, every one as good as the last. Probably shouldn't make this a regular practice.

Posted by mike at 1:55 PM

June 24, 2003

Krispy Kreme has Arrived

Yesterday a Krispy Kreme opened not far from where I live. Apparently I'm not the only one who is excited about this. Hundreds of people gathered starting at 4:30am. One guy even came down from New Hampshire to grab a doughnut.

These are *really* good doughnuts. I first had them in Knoxville, TN and then more recently in Utah with Pete.

Will be visiting there tomorrow . . .

Posted by mike at 9:53 PM

linuxworldexpo.com running on IIS?

I stumbled into the Netcraft site today, which is starting to collect data on the OS/Webserver/Hosting used by various sites.

The O'Reilly stats are a pretty good representation of what Netcraft is trying to represent.

Kruckenberg machine looks about right.

No surprise what Apple's got behind the scenes.

Surprised to see that linuxworldexpo is running IIS. How could that be?

My faith is shaken.

Posted by mike at 8:27 AM

June 23, 2003

Slight Shift in Responsibilities

Have spent a good deal of time lately thinking about my job and the responsibilities I gravitate towards. Over the past few months I have felt an increased level of interest in the future of our project, whereas before that I was fairly happy to sit at a desk and code libraries or interfaces. This has fundamentally changed how I perform, from coding to interacting with the developers/staff/faculty to system administration.

Not sure what brought this on. Maybe getting older and more experienced drives one to look for new challenges. The economy has made me look at a job in a different light and I think about being here at Tufts for a longer time which may spark interest in where we'll be in 5 or 10 years. I think part of it also is that over the course of the 2+ years I've been here I've gravitated into a leadership position. Not that I'm leading the team, but in some ways managing a lot of the technical aspects of the project.

Since we're rethinking the development team I took time to sit down and detail my responsibilities, highlighting areas I would like to develop. The director and I looked at them and tailored the job descriptions for the open positions to complement the shift in my position.

Technically I'm still the Senior Programmer, but have been/am morphing into a technical lead, providing more technical management and direction than building the application.

In my mind a very exciting development.

Posted by mike at 5:06 PM

June 18, 2003

Tracking FTP Ports

After installing the firewall on a machine I decided I needed just one more package from CPAN, so I fired up the CPAN shell. Of course ftp is blocked, so I decided to open it temporarily to get the package and then close it. What followed was a long string of looking at the firewall logs to see what was preventing my packets from going through. I thought it was interesting, so I have listed the unique machine.port entries from the firewall logs.

ourmachine.35110 -> cpanmirror.21
cpanmirror.21 -> ourmachine.35110
cpanmirror.59077 -> ourmachine.113
ourmachine.113 -> cpanmirror.59077
cpanmirror.5 -> ourmachine.35111
cpanmirror.4 -> ourmachine.35113

So what I don't understand is why the remote machine is trying to communicate with port 4 and 5, it appears to be something dynamic. I can ftp to a number of other machines and successfully pull files via ftp, but for some reason this CPAN mirror is trying to use other ports. Why? I don't know, I did a search on Google to see what might turn up for "ftp port 4" but nothing significant.

Not willing to open those ports I guess I'm left in the dark about FTP and how it works, not that it's important to understand a protocol I rarely use and always block.
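
For what it's worth, some of the log is standard FTP behavior. Port 21 is the control connection, and port 113 is the ident service, which some FTP servers query when a client connects. In active mode the client announces a data port with a PORT command written as six comma-separated numbers (four for the IP address, two for the port), and the server then opens a connection back to that port. That may be what the cpanmirror -> ourmachine.35111 lines are, although that connection would normally come from port 20, so it doesn't explain the 4 and 5. A small sketch of how the announced port is decoded follows; the function name and the address are made up for illustration.

```shell
# Decode the data port from an FTP PORT argument "h1,h2,h3,h4,p1,p2".
# The port is p1 * 256 + p2.
ftp_data_port() {
    printf '%s\n' "$1" | awk -F, 'NF == 6 { print $5 * 256 + $6 }'
}

# Example with a hypothetical client address 10.0.0.5:
ftp_data_port 10,0,0,5,137,39    # prints 35111
```

Passive mode turns the data connection around so the client connects out to the server, which tends to get along better with a default-deny inbound rule set.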

Posted by mike at 6:00 PM

June 17, 2003

NFS Stinks (initial experience)

Maybe crawls is a better word. I had this great idea which would simplify some things for our servers, but I'm finding that if NFS is a key piece of my plan, sticking with the old method is probably better.

We have a machine that acts as the primary storage device and database server. The webserver machines are currently set to rsync the data from the primary storage machine and host it locally. There is a delay between the time a file is uploaded to the primary storage machine and when it becomes available on the webservers. More than one complaint has been registered about this delay.

So I got this brilliant idea to do NFS mounting so all machines would share common data. Started with some tests to see how quickly stuff could be pulled from the locally mounted drives. 280Mb file copied off the local drive in avg 20 realtime seconds (.10 usr, 3.8 sys).

Setting up NFS wasn't terribly difficult. Share a drive by putting it in /etc/dfs/dfstab and starting up the NFS server (sudo /etc/init.d/nfs.server start). The client was even easier, mounting the drive using the usual mount command with a little more syntax:

sudo mount -F nfs <server>:/data /data

My initial experience with NFS is that it crawls . . . so slow I can't imagine how it can be useful (the whole time I've been writing this I've been waiting for the file transfer to complete). Maybe I've got something configured wrong, I recently tested the data throughput on the switch between the machines. Maybe after some more review and thoughts I'll give it another shot.

So for now we'll be sticking with rsync (and more user complaints). Will post the NFS times when the transfer is done.

write to NFS took 90 realtime minutes (.08 usr, 4.1 sys)
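
To put the two timings side by side, here is a rough throughput calculation. It assumes the "280Mb" above means 280 megabytes and treats the local copy and the NFS write as comparable transfers, which is only approximately true. The function name is made up for illustration.

```shell
# Average throughput in MB/s for a transfer of size_mb taking secs seconds.
throughput_mb_per_s() {
    awk -v size_mb="$1" -v secs="$2" 'BEGIN { printf "%.2f\n", size_mb / secs }'
}

throughput_mb_per_s 280 20      # local copy, 20 seconds
throughput_mb_per_s 280 5400    # NFS write, 90 minutes = 5400 seconds
```

That works out to about 14 MB/s locally against roughly 0.05 MB/s over NFS, around 270 times slower, which points at a misconfiguration more than at NFS itself.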

Posted by mike at 3:30 PM

June 16, 2003

15 tons in an Hour

I guess I'm getting stronger. Went to the gym today and decided to do the weight machines. My previous mark was ~18,000 pounds. For some reason I get a kick out of expressing that metric in tons (sorry if it is annoying).

Spent an hour on the machines, not really trying to do anything great, an extra set here and there to see if I could best myself. Was surprised when I checked out to see the total at ~31,000, over 15 tons.

It sounds impressive to me but I know people in there are lifting way more weight than I am (some machines I'm still at 50lbs), and by no means do any of my muscles show through my fatty skin layer.

Posted by mike at 2:37 PM

June 12, 2003

Creating New Set of Firewall Rules

I'm going through the process of creating a new set of firewall rules for our machines. The first reason is that we've got a new machine and it needs the firewall configured; the second is that on existing machines we've been experiencing a traffic bottleneck at the firewall. Seems like a good time to start from scratch and make a new set of rules (the previous set had been created before I got here).

Been studying strategies in O'Reilly's Building Internet Firewalls and scouring documentation on best practices. In general, the firewall config is simple, each machine can accept and send out traffic on certain ports, depending on the services running on that machine. We have two private networks (one for MySQL data and the other for network jumpstart), which don't pose a huge risk.

I have fiddled with ipchains a bit on Linux, but never gotten as deep as I've gotten into ipf, which I think offers a really nice set of functionality. My approach is default deny (is there any other sane approach?), listing any acceptable traffic followed by a statement to log and drop other packets.

ipfstat was very helpful, even critical. Allowed me to analyze the filtering on a live machine and order by priority so that I could pull rules which were matched more often to the top, speeding up the packet approval process.

To find stats on rule matching:
outbound traffic rules: sudo ipfstat -oh
inbound traffic rules: sudo ipfstat -ih
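
A small sketch for ranking that output, assuming each line from ipfstat -ih or -oh starts with the hit count followed by the rule text. The function name and the sample rules in the comments are made up for illustration.

```shell
# Sort "hits rule" lines so the most frequently matched rules come first.
rank_rules_by_hits() {
    sort -k1,1nr
}

# Usage on a live machine:
#   sudo ipfstat -ih | rank_rules_by_hits | head
#
# Example with hypothetical counts and rules:
#   printf '3 pass in quick proto tcp from any to any port = 22\n120 pass in quick proto tcp from any to any port = 80\n' | rank_rules_by_hits
```

One caution when reordering: ipf applies the last matching rule unless a rule is marked quick, so moving rules around is only safe when the reordered rules are quick and do not overlap.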

Will see over the next few weeks if the priority I set on the rules matches usage. Will be watching the logs to make sure I haven't shut down some unsuspecting packet, the machine isn't live for another week so have some time to tweak.

I'm not allowing ICMP packets (ping, traceroute), still deliberating on that. Read several docs recommending turning it off, but many of the examples I've seen have it on. Ping is nice, so is traceroute, but is it important to allow those packets considering how little use and benefit comes from them?

Posted by mike at 5:29 PM

Headphones Going on My List

For a long time I thought the Sony Fontopia MDR-EX70LP ear bud headphones were the ones for me, but reading andersja's weblog I see that I have an even better option, the Etymotic ER-6s. They are twice as expensive, but I really would like to get the best available as I spend a good deal of time on the subway and bus, which can be noisy (flights will be more pleasant too).

This will also make it so when I go running or biking on the street I don't have to hear the warning horn of the speeding car and can die instantly instead of enduring the moment of suspense before I'm hit.

Posted by mike at 1:38 PM

User vs Technology Driven Development

Got into a discussion today with a co-worker who heads up the user support for our project (he's got a good set of technical skills). Realized during our discussion that on many of our projects, what I thought was being "user" driven was just on the surface and not penetrating much further than the initial idea.

We don't necessarily plan to be development driven, for the most part we keep our ears open to suggestions, even do some usability studies from time to time. I think we sometimes believe that is good enough and after getting some initial thoughts on the functionality we turn it over to the developer and the user is brought back in to see the final product when it's done (or in final revisions).

We really need to focus on having the end user and user-support people involved to a greater extent, so that when the developer gets involved there are very few functionality decisions to be made. Right now the developer is handed an idea like "let users upload images" with very little additional information. The developer is then responsible for making decisions about how and what the user will do with the system, having no idea what the user really needs. In addition, the programmer is close to technology so the solution frequently comes out reeking of tech know-how and requires numerous revisions just to make the interface friendly, at which point the tool may still not do exactly what is needed.

It sounds obvious writing this all out, but it is amazing how the focus gets lost when racing against a deadline or trying to juggle many projects.

Posted by mike at 1:10 PM

June 11, 2003

Installing Java 1.4.1_03-b02

Installing the latest Java SDK on a Sun box and wanted to note that the order of package installs is important to a smooth install.

1) Remove any existing Java packages which are installed in /usr/j2se (the default Solaris install is OK to leave there at /usr/java1.2)
2) Install the four packages in this order: sudo pkgadd -d . SUNWj3rt SUNWj3dev SUNWj3man SUNWj3dmo
3) Add the 64-bit packages in this order: sudo pkgadd -d . SUNWj3rtx SUNWj3dvx SUNWj3dmx

Sun documentation has more details.

Posted by mike at 9:27 PM

June 6, 2003

Denver Airport Sign Lies, Says "Wireless Here"

I'm sitting right under a sign in the Denver airport that says "wireless internet access here", service provided by AT&T & Intel Centrino. But there's *no* signal. Looks like the marketing got ahead of the technology implementation.

If I was slightly geekier, or had a fellow wireless seeker in the area I would walk around with my laptop open and see if I could pick up a signal.

Posted by mike at 10:02 PM

June 5, 2003

Run Along Boulder Creek

This morning I ran along the Boulder Creek Trail, a 35-mile stretch that runs through Boulder, CO. The Creek is raging from the recent rain, which makes for nice background noise (when I don't have music on). The trail is lined with trees and runs from the downtown area, along the University of Colorado, and then out into undeveloped land. A stark contrast from the Chinatown YMCA, or the streets of greater Boston. The return trip back into Boulder was especially nice, getting glimpses of the mountains to the west of Boulder.

Posted by mike at 4:07 PM

June 4, 2003

What IT Folks do when Network Connection Goes Down

Am sitting toward the back of the room in a presentation at the Internet2 middleware camp. During this first session of the day the network connectivity went down. Interesting, and kind of funny to watch a roomful of IT folks launch configuration tools, command line pings, etc. to explore what might be causing the problem.

Most users ended up closing the laptop after 10 or so minutes but some of the die-hards set up periodic pings in command line windows which sat to the side of their note-taking screen.

Kind of ironic that network availability is an issue at an Internet2 conference.

Posted by mike at 1:25 PM

June 3, 2003

Subclasses of a Programming Position

2/3 of our programming staff (2 people) have announced they are returning to school in the summer/fall. I can think of very few reasons why this is good, but one of them is it allows us to rethink all three positions and bring two new people on under modified positions.

I've been working on ideas for what roles each of the 3 programmers will play (myself included). One of the complaints we hear frequently from our director is that we all focus too much on writing/tuning the underlying Perl modules, which means end users aren't seeing much new or improved functionality. Having someone dedicated to creating/improving user functionality is the highest priority as we're looking to fill these positions.

One idea I've been tossing around is to make one of the positions a "functionality programmer" and the other a "library programmer." I know in the end both programmers are writing handlers/scripts/modules which can fall into both functionality and library, but the idea is to make the development efforts of the programmer driven by one or the other end result.

We took a shot at this in the past, having one of the positions be a "front end" programmer. It didn't work so well because the programmer had already been working for us for a while and, frankly, was quite skilled at writing/refactoring the underlying libraries. Made it hard to limit his work to the front end when he was the most qualified to do the library work.

Will be interesting to start looking at resumes and see if breaking things up like this is even close to reality. I recently helped with interviews for a position in another department at Tufts. The job description was tweaked after the first round of interviews because we realized that we just weren't going to find what had been specified in the posting.

The first position opens in July, the second sometime in August.

Posted by mike at 9:06 AM