We recently had a client ask us if we could help them convert their DB/TextWorks library catalogue database to MARC format, for submission to another search system.
Our initial reaction was "Sure, easy-peasy." After all, we're librarians, we know MARC well, we're experts in DB/TextWorks, and we've done this before. How hard could it be?
Ha! Famous last words, and all that.
To be fair, there are a few wrinkles:
- The conversion is not a one-time affair, but rather something the client would like automated and running on a regular basis.
- Not all records in the database are to be converted.
- MARC is not the simplest of formats. Whether MARC or MARCXML, there are a fairly rigid set of rules that must be followed to create 100% valid, standards-compliant MARC records.
Nonetheless, since MARC has been around rather a long time, there are a plethora of tools available for creating MARC records from other data sources. There's Inmagic's own MARC Transformer, which works directly with DB/TextWorks databases, as well as various free or open source tools from the U.S. Library of Congress and other agencies, MARCEdit, Balboa Software's Data Magician, and others.
We also have our Andornot Data Extraction Utility for automatically exporting data from a DB/TextWorks database and manipulating it into a variety of formats.
Ever optimistic, we figured some combination of these tools could be strung together in sequence without too much effort to create a solution for this client. We wanted to avoid developing a DB/TextWorks-to-MARC conversion program from scratch as it would be quite time consuming, mostly due to the MARC format requirements themselves.
Several tools, upon closer investigation, proved to be too ancient to run reliably on a modern Windows server, or couldn't work with the current version of DB/TextWorks. Others proved almost impossible to use in an automated fashion. They could be useful in a one-time manual conversion to MARC format, but not in the hands-off, automated workflow we needed.
The exploration of this issue was an interesting exercise in seeing how old data formats and old programs age and become harder to work with.
The recipe that baked the cake in the end was:
- Use the Andornot Data Extraction Utility and the Inmagic ODBC driver to extract data from the DB/TextWorks database to a pseudo-MARC plain text format.
- Use a custom-developed PowerShell script to manipulate the records in this file to handle some of the quirks of the ODBC output and to more closely adhere to the MARC format.
- Use the command line interface to MARCEdit to convert the pseudo-MARC to MARC Communications Format files.
- Upload the MARC files over FTP to the destination server.
- Manage all the moving parts through a PowerShell script and log the steps and results to a file for easy troubleshooting in case of problems.
- Run the script nightly as a scheduled task on a Windows server.
When written like that, in hindsight, it sounds so simple. And in the end, it was, and works well. But the journey to arrive at this solution was one of the more challenging small projects we've undertaken, considering how simple the task sounds at first.
We hope this will help you if you have a similar project, but don't hesitate to ask us for help, now that we've worked through this.