New Search Options for the Railroad Museum of Pennsylvania Library and Archives

Published Nov 20, 2015

The Railroad Museum of Pennsylvania maintains a collection of tens of thousands of resources related to railroading in the Commonwealth of Pennsylvania. The collection is diverse (historical, political, cultural, social, economic, and technological) and emphasizes the development of railroading from the 1830s through the present day. Every manner of printed material is in the collection, from annual reports to timetables, as well as an extensive set of photographs and negatives. A reference library contains books, periodicals, railroad association and union publications, government documents, and trade catalogues.

Public search access has been available for many years through an interface developed by Andornot using our Andornot Starter Kit. However, as with all websites and applications, renewal and refurbishment are necessary every few years to keep up with technology standards and user expectations. In particular, the search logs showed that many user searches returned no records, so we knew that new features were needed to help users connect to resources.

In 2015, the museum began a project with Andornot to develop a new, modern search engine using the Andornot Discovery Interface (AnDI). The new search is now available at http://rrmuseumpa.andornot.com.

"We had two primary objectives – to replace an earlier online catalog search system that was sagging under the growing weight of tens of thousands of new records and images, and to make the system more useful to users who have become accustomed to the more intelligent finding systems currently available in so many places on the web. Andornot delivered admirably on both needs." -- James Alexander, Jr., the museum's webmaster and lead on this project.

Large Collection Needs Advanced Search Features

The new search offers users access to over 270,000 records from both the library and archives databases, which were formerly separate. Of these records, 80,000 have digitized photographs available online. With such a large data set, advanced search features are needed to help researchers uncover resources of interest to them.

AnDI's Apache Solr search engine excels at indexing large data sets. The more records available to it, the better it can analyze how often words occur across them. This frequency analysis is one of the many techniques it uses to deliver the most relevant results first.
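To give a rough feel for what frequency analysis means here, the following is a minimal sketch of term frequency and inverse document frequency (TF-IDF) scoring. It is a simplified illustration, not Solr's actual scoring formula, and the function name `tf_idf_scores` and the sample records are invented for this example.

```python
import math
from collections import Counter

def tf_idf_scores(query_term, documents):
    """Score each document for one query term with a simplified TF-IDF.

    Illustrative only: real search engines add normalization, boosts,
    and other factors on top of this basic idea.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: how many records contain the term at all.
    doc_freq = sum(1 for tokens in tokenized if query_term in tokens)
    if doc_freq == 0:
        return [0.0] * n_docs
    # Terms found in few records get a higher weight than common ones.
    idf = math.log(n_docs / doc_freq) + 1.0
    scores = []
    for tokens in tokenized:
        # Term frequency: share of the record's words that match the term.
        tf = Counter(tokens)[query_term] / len(tokens) if tokens else 0.0
        scores.append(tf * idf)
    return scores

# Hypothetical sample records, not taken from the museum's data.
records = [
    "pennsylvania railroad timetable",
    "pennsylvania railroad annual report",
    "susquehanna bridge photograph",
]
rare = tf_idf_scores("susquehanna", records)
common = tf_idf_scores("pennsylvania", records)
```

In this sketch, a term that appears in only one record ("susquehanna") is weighted more heavily than one that appears in most records ("pennsylvania"), which is why a larger collection gives the engine better statistics to rank with.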

Key to the search process are the facets that allow researchers to narrow their initial search by many criteria, such as the names of railroads, individuals, corporations and other organizations, subjects, geographic places, and dates.
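Conceptually, a facet is a count of how many records in the current result set carry each value of a field, and choosing a facet value filters the results to the matching records. The sketch below shows that idea in plain Python. The field names, sample records, and the helper functions `facet_counts` and `narrow` are assumptions made for illustration; they are not the AnDI or Solr implementation.

```python
from collections import Counter

def facet_counts(records, field):
    """Count how many records carry each value of a multi-valued field."""
    counts = Counter()
    for record in records:
        for value in record.get(field, []):
            counts[value] += 1
    return counts

def narrow(records, field, value):
    """Keep only the records whose field contains the chosen facet value."""
    return [r for r in records if value in r.get(field, [])]

# Hypothetical records with made-up field names.
records = [
    {"title": "Timetable", "railroad": ["Pennsylvania Railroad"], "place": ["Lancaster"]},
    {"title": "Annual report", "railroad": ["Reading Company"], "place": ["Philadelphia"]},
    {"title": "Station photograph", "railroad": ["Pennsylvania Railroad"], "place": ["Strasburg"]},
]

railroad_facet = facet_counts(records, "railroad")
narrowed = narrow(records, "railroad", "Pennsylvania Railroad")
place_facet = facet_counts(narrowed, "place")
```

After narrowing by railroad, the place facet is recomputed over the smaller result set, so each click shows the researcher only the values that still apply.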

As with all AnDI sites, users can view brief and full records, view photographs in a gallery layout, save records to a list, share search results on social media, and of course, access the site as easily from a tablet or phone as a desktop web browser.

The small selection of videos included in search results is published through the museum's YouTube channel to expose the museum to the widest possible audience. A YouTube player is embedded in search results for playback within the new site.

AnDI Handles Spelling Variations

As is to be expected with such a large collection, entered over many years by a variety of people, spelling variations and typographic errors have crept in. AnDI helps users locate resources despite this, using two key features:

1. The Apache Solr search engine in AnDI is very good at parsing terms from records and suggesting correct terms based on what is in the records and what users search for. These appear in search results as spelling corrections and "Did you mean?" suggestions, which a user may click to try a different search.

2. A synonym list created by museum staff relates correct terms to some of the many variations that appear.

For example, the New York, Susquehanna & Western Railway appears in around 7,000 records, but with the name Susquehanna spelled at least 11 different ways. Given that searchers may not enter the correct spelling either, the search problem is not trivial! The combination of the synonym list and Solr's other suggestions and corrections helps ensure that no matter how the data was originally entered or how a user searches for it, AnDI can return relevant and complete results.
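The two features can be pictured with a small sketch: a synonym map that folds known variants onto the correct term, and a fuzzy lookup that proposes a "Did you mean?" term for anything not covered by the map. The misspellings, the vocabulary, and the function names below are hypothetical, and Python's standard `difflib` module stands in for Solr's own spellchecking, which works differently.

```python
import difflib

# Hypothetical synonym list: known variant -> correct term.
# These variants are invented for illustration, not the museum's list.
SYNONYMS = {
    "susquehana": "susquehanna",
    "susquehannah": "susquehanna",
    "susquehanah": "susquehanna",
}

# Hypothetical vocabulary of correctly spelled indexed terms.
INDEXED_TERMS = ["new", "york", "susquehanna", "western", "railway"]

def normalize(term):
    """Map a known spelling variant onto its correct term."""
    term = term.lower()
    return SYNONYMS.get(term, term)

def did_you_mean(term, vocabulary=INDEXED_TERMS):
    """Suggest the closest indexed term, or None if no suggestion is needed."""
    term = normalize(term)
    if term in vocabulary:
        return None  # already a known term, nothing to suggest
    matches = difflib.get_close_matches(term, vocabulary, n=1, cutoff=0.8)
    return matches[0] if matches else None

known_variant = normalize("Susquehannah")   # handled by the synonym list
suggestion = did_you_mean("susqehanna")     # not in the list, found by fuzzy match
```

The synonym list handles variants that staff already know about, while the fuzzy lookup catches new typos entered by searchers, which mirrors how the two features complement each other.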

A video introduction and written search help are both available to introduce users to the site.

Inmagic DB/TextWorks for Back-End Data Management

Behind the scenes, the museum continues to use Inmagic DB/TextWorks to manage these records. This database management system is invaluable to museum staff for managing metadata, selecting standard metadata from validation lists, and giving volunteers access for everyday data entry.

The museum's search engine continues to be hosted by Andornot as part of our managed hosting service.

"While Andornot had available a well-built modern search system in AnDI, they spent a lot of time with us learning about our particular users' needs, helping us think through the most useful processes, and refining the search experience. They know the business of both managing records internally and helping users find what they need.

In the process two things happened – we learned more about the strengths and weaknesses in our data entry processes, and the usefulness and public recognition of our holdings were enhanced through improved web access. The search help video was a real plus, and they worked with us in making our search page both functional and attractive." – James Alexander, Jr.

We're very pleased to continue our work with the Railroad Museum of Pennsylvania. Contact us to discuss upgrades and search options for your museum collections.