
June 8, 2004

Using XML to Dump and Restore Data

For the past three days I've been working on a project to move a collection of documents from the Tufts instance of our software to the University of Natal.

Why? The University of California San Francisco has a huge collection of AIDS-related documents, and had funding to put a subset of them online in South Africa, at the University of Natal, where the Tufts course/content management software is installed. The University of Natal installation was being updated, so the documents were put in our system with a promise that we would transfer them.

Originally I had thought I'd do a somewhat complicated mysqldump. There was no chance of overlapping IDs (our system is using IDs in the 100,000s and theirs are in the 100s), but we decided it would still be better to do an ID-independent dump/import with all the documents and relationships.
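To illustrate what "ID-independent" can mean in practice, here is a minimal sketch of an import that ignores source IDs entirely and rebuilds the relationships from XML nesting. It is written in Python for brevity, not the Perl the real scripts appear to have used, and the element names (`folder`, `document`, `title`), the dictionaries standing in for database tables, and the starting ID of 100 are all assumptions made for the example, not the actual schema.

```python
import itertools
import xml.etree.ElementTree as ET

# Hypothetical dump: nesting carries the folder/document relationships,
# so no source IDs need to appear in the file at all.
XML_DUMP = """
<folder>
  <title>Collection</title>
  <document><title>Overview</title></document>
  <folder>
    <title>Reports</title>
    <document><title>Annual report</title></document>
  </folder>
</folder>
"""

def import_folder(node, parent_id, next_id, folders, documents):
    """Insert one folder, its documents, and its subfolders with fresh IDs."""
    folder_id = next(next_id)
    folders[folder_id] = {"title": node.findtext("title"), "parent_id": parent_id}
    # Documents directly inside this folder point at the newly assigned folder ID.
    for doc in node.findall("document"):
        documents[next(next_id)] = {
            "title": doc.findtext("title"),
            "folder_id": folder_id,
        }
    # Recurse into direct child folders only; deeper levels are handled in turn.
    for child in node.findall("folder"):
        import_folder(child, folder_id, next_id, folders, documents)
    return folder_id

# Dictionaries stand in for the destination tables (an assumption for the sketch).
folders, documents = {}, {}
next_id = itertools.count(100)  # illustrative starting point for destination IDs
root_id = import_folder(ET.fromstring(XML_DUMP.strip()), None, next_id, folders, documents)
```

Because every ID is assigned at import time, the same file could be loaded into any instance regardless of which IDs are already in use there.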

So I used XML to represent the relationships, essentially developing a recursive function that dug into each folder and output all the column values. In the end, I captured around 400 documents at varying levels, up to 5 folders deep. The dump and import scripts took about the same amount of time to write, and after practicing the import a dozen times in a dev environment I copied the tarfile (XML tree and supporting PDFs, etc.) to the University of Natal and imported without issue. I was a little nervous that there might be issues with Perl libraries, since they are running on Linux and we're on Solaris, but it went off without a hitch.
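The recursive dump described above might look roughly like the following. This is a Python sketch and not the original Perl; the `FOLDERS` and `DOCUMENTS` dictionaries, their column names, and the choice to leave ID columns out of the output are hypothetical stand-ins for the real tables, which the post does not describe.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory stand-ins for the folder and document tables.
FOLDERS = {
    1: {"title": "AIDS collection", "parent_id": None},
    2: {"title": "Reports", "parent_id": 1},
}
DOCUMENTS = {
    10: {"title": "Overview", "filename": "overview.pdf", "folder_id": 1},
    11: {"title": "Annual report", "filename": "report.pdf", "folder_id": 2},
}
# Columns that only hold IDs; nesting replaces them in the XML.
ID_COLUMNS = {"parent_id", "folder_id"}

def add_columns(node, row):
    """Write every non-ID column of a row as a child element."""
    for column, value in row.items():
        if column not in ID_COLUMNS:
            ET.SubElement(node, column).text = str(value)

def dump_folder(folder_id):
    """Return an element for the folder, its documents, and all subfolders."""
    node = ET.Element("folder")
    add_columns(node, FOLDERS[folder_id])
    for doc_id in sorted(DOCUMENTS):
        if DOCUMENTS[doc_id]["folder_id"] == folder_id:
            add_columns(ET.SubElement(node, "document"), DOCUMENTS[doc_id])
    # Recurse: each child folder becomes a nested <folder> element.
    for child_id in sorted(FOLDERS):
        if FOLDERS[child_id]["parent_id"] == folder_id:
            node.append(dump_folder(child_id))
    return node

xml_text = ET.tostring(dump_folder(1), encoding="unicode")
```

The supporting files (the PDFs named in `filename` here) would travel alongside the XML in the tarfile, with the XML referring to them by name.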

Posted by mike at June 8, 2004 8:41 PM