MarcEdit

What is MarcEdit for?

Fixing issues with authority control.

A screenshot of a list of subjects from the Queanbeyan-Palerang library service

Whether it’s because of copy cataloging or accident, metadata can become inconsistent.

This example from the Queanbeyan-Palerang library service has separate subject headings for colonisation and colonization. The catalogue search also matches spelling exactly, so whichever spelling the library user does not type will be invisible to them.

MarcEdit has tools to fix these authority records. You can either change variant headings manually or link the data to an external service, for example by checking against authority records at the Library of Congress.
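As a rough illustration of the manual approach, the sketch below collapses variant spellings into one preferred heading. It is not how MarcEdit works internally. The lookup table, the function names and the choice of "Colonization" as the preferred form are all assumptions made for the example.

```python
# Hypothetical table mapping variant spellings to one preferred heading.
# A real authority file would supply these entries; they are illustrative only.
PREFERRED = {
    "colonisation": "Colonization",
    "colonization": "Colonization",
}


def normalise_heading(heading: str) -> str:
    """Return the preferred form of a heading, or the heading unchanged."""
    # Ignore case, surrounding spaces and a trailing full stop when matching.
    key = heading.strip().rstrip(".").lower()
    return PREFERRED.get(key, heading)


def merge_headings(headings):
    """Collapse variant headings into a sorted list of preferred forms."""
    return sorted({normalise_heading(h) for h in headings})


print(merge_headings(["Colonisation", "Colonization.", "Gold mines and mining"]))
```

Headings that are not in the table pass through untouched, so the sketch only merges variants it has been told about.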

Repairing MARC records

MARC records are notoriously fragile. When processing a batch of records, MarcEdit notifies the user of any broken records. The user can then let MarcEdit try to ‘heal’ them, or move them to a separate location to be considered later.
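To give a sense of what "broken" can mean, here is a minimal sketch of a structural check on one binary MARC record. It relies on the standard record layout: a 24-byte leader whose first five characters give the record length and whose positions 12 to 16 give the base address of the data, with the record ending in a terminator byte. The function names are hypothetical, and MarcEdit's own checks are likely more thorough than this.

```python
RECORD_TERMINATOR = b"\x1d"  # ends every binary MARC (ISO 2709) record
FIELD_TERMINATOR = b"\x1e"   # ends the directory and every field


def check_record(raw: bytes) -> list:
    """Return a list of structural problems found in one binary MARC record."""
    if len(raw) < 25:
        return ["record is too short to hold a leader"]
    problems = []
    leader = raw[:24]
    # Leader positions 00-04 hold the total record length as digits.
    if not leader[0:5].isdigit():
        problems.append("record length in the leader is not numeric")
    elif int(leader[0:5]) != len(raw):
        problems.append(f"leader says {int(leader[0:5])} bytes, record has {len(raw)}")
    # Leader positions 12-16 hold the offset where the field data begins.
    if not leader[12:17].isdigit():
        problems.append("base address in the leader is not numeric")
    else:
        base = int(leader[12:17])
        if base > len(raw) or raw[base - 1:base] != FIELD_TERMINATOR:
            problems.append("directory does not end at the base address")
    if not raw.endswith(RECORD_TERMINATOR):
        problems.append("record terminator is missing")
    return problems


def build_sample() -> bytes:
    """Build a tiny, made-up record with a single 001 control field."""
    data = b"12345" + FIELD_TERMINATOR
    directory = b"001" + b"0006" + b"00000" + FIELD_TERMINATOR
    base = 24 + len(directory)
    total = base + len(data) + 1
    leader = f"{total:05d}nam a22{base:05d}   4500".encode("ascii")
    return leader + directory + data + RECORD_TERMINATOR


good = build_sample()
print(check_record(good))        # no problems expected
print(check_record(good[:-3]))   # a truncated copy should be flagged
```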

Translating between data formats

MarcEdit is used to translate data sets between different formats. According to the Library of Congress website, the translations that MarcEdit can perform are:

MARC > Text,
Text > MARC,
MARC > MARC21XMLslim,
MARC21XMLslim > MARC,
MARC > Dublin Core (unqualified),
MARC > EAD (example using the NWDA best practices),
MARC21XML > MODS and
MARC21XML > OAI Dublin Core.
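
The first translation in the list, MARC to text, can be sketched in a few lines. This is only an approximation of the mnemonic text layout (one line per field, starting with = and the tag), not MarcEdit's actual code. The function names and the sample record are made up for the example, and details such as escaping a literal $ in the data are left out.

```python
FT, RT, SF = b"\x1e", b"\x1d", b"\x1f"  # field, record and subfield delimiters


def marc_to_text(raw: bytes) -> str:
    """Render one binary MARC record (UTF-8 data assumed) as mnemonic-style text."""
    leader = raw[:24].decode("ascii")
    base = int(leader[12:17])          # where the field data starts
    directory = raw[24:base - 1]       # 12-byte entries, minus the terminator
    lines = ["=LDR  " + leader]
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12].decode("ascii")
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        # Slice the field out of the data area, dropping its terminator.
        text = raw[base + start:base + start + length - 1].decode("utf-8")
        if tag < "010":
            # Control fields have no indicators or subfields.
            lines.append(f"={tag}  {text}")
        else:
            indicators = text[:2].replace(" ", "\\")  # blank shown as a backslash
            subfields = text[2:].replace("\x1f", "$")
            lines.append(f"={tag}  {indicators}{subfields}")
    return "\n".join(lines)


def build_record(fields) -> bytes:
    """Assemble a small binary record from (tag, field bytes) pairs for testing."""
    directory, data = b"", b""
    for tag, body in fields:
        body += FT
        directory += f"{tag}{len(body):04d}{len(data):05d}".encode("ascii")
        data += body
    directory += FT
    base = 24 + len(directory)
    total = base + len(data) + 1
    leader = f"{total:05d}nam a22{base:05d}   4500".encode("ascii")
    return leader + directory + data + RT


sample = build_record([
    ("001", b"12345"),
    ("245", b"10" + SF + b"aColonization /" + SF + b"cA. Author."),
    ("650", b" 0" + SF + b"aColonization."),
])
print(marc_to_text(sample))
```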

Crucially, MarcEdit allows records to be converted between MARC-8, the character encoding traditionally used in MARC records, and UTF-8, the standard Unicode encoding for characters across scripts.

It’s possible to write plugins to convert between other data types.

Moving data to the web

Uniform Resource Identifiers (URIs), of which URLs are a subtype, need to be added to bibliographic records for those records to be accessible via the Web. In the future, URIs will allow library resources such as digitized archives to be found via search engines like Google.

Testing out the program

Finding a dataset

The Library of Congress has MDSConnect, an open-access source of MARC records.

From MDSConnect Books, I selected Part 1 under 2016, in UTF-8.

Unzipping

When I got the dataset on my computer, I realized I didn’t have a program that could extract the data. I found a program called 7-zip to do it. It was a simple matter of opening the file within 7-zip and selecting a destination file.
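
The same step could be scripted instead of using an archive tool. The sketch below uses Python's standard library and assumes the download is a gzip archive, which is an assumption since the archive type is not stated above. The function name and both paths are hypothetical.

```python
import gzip
import shutil


def extract_gzip(archive_path: str, destination_path: str) -> None:
    """Decompress a gzip archive to destination_path without loading it all into memory."""
    with gzip.open(archive_path, "rb") as source, open(destination_path, "wb") as target:
        shutil.copyfileobj(source, target)


# Hypothetical usage; substitute the real names of the downloaded and extracted files:
# extract_gzip("books-part1.utf8.gz", "books-part1.utf8")
```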

Inside MarcEdit

I noted the file had the extension .utf8, so I selected UTF-8 as the default character encoding within MarcEdit and chose to translate to MARC-8.
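
The file extension is one clue to the encoding, and the record itself carries another. In MARC 21, position 09 of the leader is a for Unicode and blank for MARC-8. Here is a minimal sketch of reading that flag from a binary record. The function name is hypothetical, and whether a conversion tool updates this position is not something the sketch can tell you.

```python
def declared_encoding(raw: bytes) -> str:
    """Report the character encoding declared in Leader/09 of a binary MARC 21 record."""
    if len(raw) < 24:
        raise ValueError("need at least the 24-byte leader")
    flag = raw[9:10]
    if flag == b"a":
        return "UTF-8"    # 'a' means UCS/Unicode
    if flag == b" ":
        return "MARC-8"   # blank means MARC-8
    return "unknown"


# Two made-up leaders that differ only in position 09.
print(declared_encoding(b"00044nam a2200037   4500"))
print(declared_encoding(b"00044nam  2200037   4500"))
```

Comparing this flag before and after a conversion is one cheap way to confirm that the records say what the file name says.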

A screenshot of the MarcEdit Tools page interface. — Note that I chose the dark theme on startup. Not all text is optimized for visibility within this setting.

A screenshot of the results which says 250 thousand records were processed in 41.7 seconds. — It took 41 seconds to convert all 250 000 records to MARC-8.

Once the conversion was complete, I was able to open the records in the Edit mode.

A screenshot of the editor window displaying a MARC record. — Each page displays 100 records, and there are 2501 pages.
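
The figures in these captions can be checked with a little arithmetic. The sketch below works out the conversion rate and the number of 100-record pages that exactly 250,000 records would fill. The function names are made up for the example.

```python
def records_per_second(records: int, seconds: float) -> float:
    """Average conversion rate."""
    return records / seconds


def pages_needed(records: int, per_page: int) -> int:
    """Number of pages needed to show every record (ceiling division)."""
    return -(-records // per_page)


print(round(records_per_second(250_000, 41.7)))  # about 5995 records per second
print(pages_needed(250_000, 100))                # 2500
print(pages_needed(250_001, 100))                # 2501
```

Exactly 250,000 records would fill 2500 pages, so 2501 pages suggests either that the file holds slightly more than 250,000 records or that the count in the results window is rounded.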

Within this window I could edit the records manually, add fields, and search within the file. I tried the validator, which uses a rules file to check whether the MARC records are valid. It does take some time, particularly on large datasets.

Surprisingly, the validator came up with quite a lot of possible errors. The error ‘Invalid data’ came up often, in particular with the qualifier ‘Indicator can only be 013’, which I think might be a problem introduced by translating into MARC-8. It will need further investigation, and will probably make more sense when I have a better understanding of library data in general.
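
A message like "Indicator can only be 013" reads as a rule saying that an indicator must be 0, 1 or 3. In MARC 21 the first indicator of field 100, for example, takes exactly those values (forename, surname, family name). Here is a minimal sketch of that kind of check. The rule table and function name are hypothetical and do not reflect the format of MarcEdit's rules file, and which field actually triggered the message is not known.

```python
# Hypothetical rule table: (tag, indicator position) -> allowed characters.
RULES = {
    ("100", 1): "013",  # MARC 21: 0 forename, 1 surname, 3 family name
    ("100", 2): " ",    # undefined in MARC 21, so it should be blank
}


def check_indicators(tag: str, indicators: str) -> list:
    """Return one message per indicator that breaks a rule; tags without rules pass."""
    errors = []
    for position, value in enumerate(indicators, start=1):
        allowed = RULES.get((tag, position))
        if allowed is not None and value not in allowed:
            errors.append(
                f"{tag} indicator {position} is {value!r}; can only be {allowed!r}"
            )
    return errors


print(check_indicators("100", "1 "))  # valid, so no messages
print(check_indicators("100", "2 "))  # first indicator breaks the rule
```

If the indicator bytes were shifted or altered during a conversion, a check like this would fire on many records at once, which would fit the pattern of errors described above. That remains a guess until the flagged records are inspected.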

Converting from a MARC text file to a MARC file

I used MarcMaker in the Select Operations dropdown box. This converted the file to an actual MARC binary file, which runs much quicker than the text version. I then ran the validator tool again and still got the same kinds of error messages. I think it may be a problem with the first conversion process.
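
For the curious, the text-to-MARC step can be sketched as follows. This is a simplified illustration, not MarcEdit's implementation. It assumes mnemonic-style lines such as =245  10$aTitle, with a backslash standing for a blank indicator, and it recomputes the record length and base address in the leader. The function name and sample text are made up.

```python
FT, RT = b"\x1e", b"\x1d"  # field and record terminators


def text_to_marc(text: str) -> bytes:
    """Build one binary MARC record from mnemonic-style text lines."""
    leader, directory, data = None, b"", b""
    for line in text.splitlines():
        if not line.startswith("="):
            continue
        tag, body = line[1:4], line[6:]  # "=TAG", two spaces, then the content
        if tag == "LDR":
            leader = body
            continue
        if tag >= "010":
            # Restore blank indicators and the real subfield delimiter byte.
            body = body[:2].replace("\\", " ") + body[2:].replace("$", "\x1f")
        field = body.encode("utf-8") + FT
        # Directory entry: tag, field length (4 digits), start offset (5 digits).
        directory += f"{tag}{len(field):04d}{len(data):05d}".encode("ascii")
        data += field
    if leader is None or len(leader) != 24:
        raise ValueError("a 24-character =LDR line is required")
    directory += FT
    base = 24 + len(directory)
    total = base + len(data) + 1
    # Keep the descriptive leader positions, replace the two computed ones.
    leader = f"{total:05d}{leader[5:12]}{base:05d}{leader[17:]}"
    return leader.encode("ascii") + directory + data + RT


sample_text = "\n".join([
    "=LDR  00000nam a2200000   4500",
    "=001  12345",
    "=245  10$aColonization /$cA. Author.",
    "=650  \\0$aColonization.",
])
record = text_to_marc(sample_text)
print(len(record), record[:24])
```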

Conclusion

This is only really a taster of the possibilities of MarcEdit. I’m particularly interested in how I can use it in future for linking URIs and similar tasks. Other features, such as searching within the data set, I know are possible, but I haven’t yet got them to work on my installation. I think it’s unlikely that the issue is with the data: either I have been using the program incorrectly, or the installation went awry. There are several install repair tools available on the MarcEdit website, but figuring out which one I need and how to use it correctly is taking time.