Heritage Burnaby Website Wins Heritage BC Award

by Kathy Bryce Wednesday, March 02, 2016 2:23 PM

The City of Burnaby’s Heritage Burnaby website (www.heritageburnaby.ca) has won one of Heritage BC’s 35th Anniversary awards. Heritage Burnaby won in the category of Heritage Education & Awareness for the upgrades in 2015 to the Heritage Burnaby website and search engine.

This site was initially developed by Andornot in 2008, then upgraded in 2015 to use the Andornot Discovery Interface (AnDI). Instead of having to search each collection separately, users can now type in a keyword and instantly see a combined listing of results from the collections of the City of Burnaby Archives, the Burnaby Village Museum, the Office of the City Clerk and Burnaby Heritage Planning. Searches can be narrowed down through facets for repository, type, date, subject, person, place etc. A good example showing the diversity of material is a search on “carousel”, one of Burnaby’s heritage landmark buildings. This retrieves nearly 150 records with photos, sound recordings from the Archives oral history collection, books from the Museum library, and documents submitted to council, as well as the artifact records.

The new search interface is also now more forgiving, with automatic spelling corrections and “did you mean” search suggestions, which are very helpful for proper names and places where the user may be unsure of the correct spelling.

As part of this project several publications on the history of Burnaby were digitized and made full text searchable. A couple of these were indexed at the book chapter level to allow zeroing in to specific pages. These are viewable online with search words highlighted. Museum staff have reported that they are now “finding many wonderful connections between photos, records, landmarks, artifacts, and library resources” that were not apparent before. (Lisa Codd, Curator)

The update also included development of a new website with content managed in an Umbraco CMS, allowing staff to add blog posts and update content easily. The research page provides more information on the types of materials included, and allows users to search only specific collections, or select neighbourhoods on a map, to see all resources from specific areas. The new website design is responsive to provide a mobile friendly interface, and includes features for streaming audio and video files. Behind the scenes, records are maintained in multiple Inmagic DB/TextWorks databases and extracted and indexed by AnDI when approved for public access.

“Everything you wanted to know about Burnaby is at your fingertips,” as a result of this new upgrade! Please contact Andornot if you’d like to discuss options for updating your search interface or combining multiple types of materials into one search.

Nova Scotia Health Authority Launches Provincial Hospitals Library Catalogue

by Jonathan Jacobsen Wednesday, February 24, 2016 12:03 PM

The Nova Scotia Health Authority is the latest Andornot client to launch a library catalogue using the Andornot Discovery Interface, available at http://library.nshealth.ca

The site provides patients, the public and health care professionals with a modern, mobile-friendly way to search for health information resources across the province, in physical libraries and online.

The collection includes the holdings of hospital libraries throughout Nova Scotia, including e-books, e-journals, patient education pamphlets, books, and audio-visual materials.

Previously, the library holdings of the former Capital Health Authority in Halifax alone were searchable online. Now, the new tool allows for searching of province-wide library holdings throughout the newly established Nova Scotia Health Authority, as well as the IWK Health Centre, the regional pediatric and maternity hospital in Halifax.

The Andornot Discovery Interface provides the features users expect in a modern search engine, such as spelling corrections and search term suggestions, relevancy-ranked results, and facets such as Subject, Author, Date, Location, Language and more, to quickly narrow a search.

Behind the public face is a DB/TextWorks database of holdings from individual libraries, as well as a separate database containing patient pamphlets. The catalogue is managed by a team of library staff who co-ordinate data arriving from other libraries.

Contact Andornot to discuss making your collections available online in a modern way.

Using Named Entity Recognition to Generate Searchable Metadata

by Jonathan Jacobsen Tuesday, January 05, 2016 10:53 AM

Ask any librarian and they'll tell you that good metadata makes for a positive and productive search experience for users. Trying to find resources about a historic person or place, produced in a particular time period, and especially about a specific topic, is always more easily achieved when resources have been analyzed and described by a trained professional, with metadata applied from a controlled vocabulary, a process long known as "cataloguing".

Sure, search engines do an ever better job of returning relevant search results based only on the full text of a resource, with little or no metadata, thanks to some pretty sophisticated algorithms. Google is a giant because Google works! And even the Apache Solr search engine in our Andornot Discovery Interface and VuFind is impressive in its ability to parse and return meaningful results from large amounts of non-catalogued, metadata-free text.

But good metadata, applied by a librarian, archivist, curator or other skilled person, is still an even better source of data for a search engine. However, producing it does take time and staff resources. So, many have asked, "what if a computer could help me figure out what this resource is about, who is mentioned in it, and where and when it takes place? What if the computer could extract the full text as well as metadata from a resource?"

We're very interested in some work being done on this. While automated subject analysis is still challenging, work at Stanford University by a Natural Language Processing group has produced a Named Entity Recognition engine that shows great promise. In a nutshell, this engine does a fine job of reading a passage of text, as long as you like, and finding within it the names of people, organizations and locations. 

Here's an example of a passage of text processed by the engine, with entities identified.

The screenshot shows that the engine did a pretty good job of identifying the names of people, organizations and places. This metadata can be used for increased searching options in a search engine, or fed back into a database for review and editing (as the engine may not always be perfect, there's still a role for professional review).

We're researching the possible uses of this with some of our projects, such as those built from the Andornot Discovery Interface (AnDI). When importing the full text of documents, that text will be run through a Named Entity Recognition engine to generate name and place metadata. For unstructured data, this may prove to be a great means of populating the Names facet, for example.
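As a rough illustration of how tagged output could become facet values, the sketch below assumes a recognizer has already labelled each token as PERSON, ORGANIZATION, LOCATION or O (not an entity). The label names, the function name and the sample tokens are assumptions made up for this example; they are not output from the Stanford engine or part of AnDI. The code joins consecutive tokens that share a label into one entity and groups the entities by type.

```python
from collections import defaultdict
from itertools import groupby


def entities_from_tags(tagged_tokens):
    """Group consecutive tokens sharing a label into entities, keyed by label.

    tagged_tokens: list of (token, label) pairs; "O" marks non-entity tokens.
    """
    facets = defaultdict(list)
    for label, run in groupby(tagged_tokens, key=lambda pair: pair[1]):
        if label == "O":
            continue  # skip words that are not part of any entity
        name = " ".join(token for token, _ in run)
        if name not in facets[label]:  # keep each entity once per type
            facets[label].append(name)
    return dict(facets)


# Hypothetical tagger output, hand-written for this example.
sample = [
    ("Lisa", "PERSON"), ("Codd", "PERSON"), ("works", "O"), ("at", "O"),
    ("the", "O"), ("Burnaby", "ORGANIZATION"), ("Village", "ORGANIZATION"),
    ("Museum", "ORGANIZATION"), ("in", "O"), ("Burnaby", "LOCATION"), (".", "O"),
]
print(entities_from_tags(sample))
```

One limitation of this simple grouping: two different entities of the same type that sit directly next to each other would be merged into one, which is another reason a professional review step remains useful.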

Stay tuned to this blog for further results, or contact us to discuss your collections and how they could be made more accessible with AnDI and Named Entity Recognition.

New Search Options for the Railroad Museum of Pennsylvania Library and Archives

by Jonathan Jacobsen Friday, November 20, 2015 10:30 AM

The Railroad Museum of Pennsylvania maintains a collection of tens of thousands of resources related to railroading in the Commonwealth of Pennsylvania. The collection is diverse - historical, political, cultural, social, economic, and technological - and emphasizes its development from the 1830s through the present day. Every manner of printed materials is in the collection, from annual reports to timetables, as well as an extensive set of photographs and negatives. A reference library contains books, periodicals, railroad association and union publications, government documents, and trade catalogues.

Public search access has been available for many years through an interface developed by Andornot using our Andornot Starter Kit. However, as with all websites and applications, renewal and refurbishment is necessary every few years, to keep up with technology standards and user expectations. In particular, we noticed that the search logs indicated no records found for many user searches, so we knew that some new features were needed to help users connect to resources.

In 2015, the museum began a project with Andornot to develop a new, modern search engine using the Andornot Discovery Interface (AnDI). This is now available at http://rrmuseumpa.andornot.com 

"We had two primary objectives – to replace an earlier online catalog search system that was sagging under the growing weight of tens of thousands of new records and images, and to make the system more useful to users who have become accustomed to the more intelligent finding systems currently available in so many places on the web. Andornot delivered admirably on both needs." -- James Alexander, Jr., the museum's webmaster and lead on this project.

Large Collection Needs Advanced Search Features

The new search offers users access to over 270,000 records from both the library and archives databases, which were formerly separate. 80,000 of these records have digitized photographs available online. With such a large data set, advanced search features are needed to help researchers uncover resources of interest to them.

AnDI's Apache Solr search engine excels at indexing large data sets. The more records that are available to it, the better it can analyze words and perform frequency analysis on them, one of the many algorithms it uses to deliver relevant results first.
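To give a feel for what frequency analysis means here, the following is a minimal sketch of a textbook TF-IDF weighting: a term counts for more when it is frequent in one record but rare across the collection. This is a generic illustration only, not Solr's actual scoring, and the function name and sample records are hypothetical.

```python
import math
from collections import Counter


def tf_idf_scores(documents, term):
    """Score each document for one term using a basic TF-IDF weighting."""
    tokenized = [doc.lower().split() for doc in documents]
    # Number of documents that contain the term at least once.
    containing = sum(1 for tokens in tokenized if term in tokens)
    if containing == 0:
        return [0.0] * len(documents)
    # Terms found in fewer documents get a higher weight.
    idf = math.log(len(documents) / containing)
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)[term] / len(tokens) if tokens else 0.0
        scores.append(tf * idf)
    return scores


# Hypothetical record texts for illustration.
docs = [
    "railway timetable",
    "railway railway report",
    "annual report",
    "station photograph",
]
print(tf_idf_scores(docs, "railway"))
```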

Key to the search process are the facets that allow researchers to narrow their initial search by many criteria, such as the names of railroads, individuals, corporations and other organizations, subjects, geographic places, and dates.
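The idea behind facets can be sketched in a few lines: count how many results carry each value of a field, then filter the results when the researcher picks a value. The field names and sample records below are hypothetical and chosen only for illustration; the search engine computes this internally in its own way.

```python
from collections import Counter


def facet_counts(records, field):
    """Count how many records carry each value of a multi-valued field."""
    counts = Counter()
    for record in records:
        for value in set(record.get(field, [])):
            counts[value] += 1
    return counts


def narrow(records, field, value):
    """Keep only the records whose field contains the chosen facet value."""
    return [r for r in records if value in r.get(field, [])]


# Hypothetical search results for illustration.
records = [
    {"title": "Annual report", "railroad": ["Pennsylvania Railroad"], "type": ["Document"]},
    {"title": "Station photograph", "railroad": ["Pennsylvania Railroad"], "type": ["Photograph"]},
    {"title": "Timetable", "railroad": ["Reading Company"], "type": ["Document"]},
]

print(facet_counts(records, "type"))
print([r["title"] for r in narrow(records, "railroad", "Pennsylvania Railroad")])
```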

As with all AnDI sites, users can view brief and full records, view photographs in a gallery layout, save records to a list, share search results on social media, and of course, access the site as easily from a tablet or phone as a desktop web browser.

The small selection of videos included in search results are published through the museum's YouTube channel to expose the museum to the widest possible audience. A YouTube player is embedded in search results for playback within the new site.

AnDI Handles Spelling Variations

As is to be expected with such a large collection, entered over many years by a variety of people, spelling variations and typographic errors have crept in. AnDI helps users locate resources despite this, using two key features:

1. The Apache Solr search engine in AnDI is very, very good at parsing terms from records and suggesting correct terms based on what's in the records and what users search for. These appear in search results as spelling corrections and "Did you mean?" suggestions, which a user may click to try a different search.

2. A synonym list created by museum staff relates correct terms to some of the many variations that appear. 

For example, the New York, Susquehanna & Western Railway appears in around 7,000 records, but with the name Susquehanna spelled at least 11 different ways. Given that searchers may not enter the correct spelling either, the search problem is not trivial! The combination of the synonym list and Solr's other suggestions and corrections helps ensure that no matter how the data was originally entered or how a user searches for it, AnDI can return relevant and complete results.
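The two features can be pictured with a small sketch. It assumes a synonym list stored as a simple mapping from variant to preferred spelling, and it uses Python's difflib as a stand-in for a spelling suggester; Solr's own spellchecking works differently. The variants, function names and indexed terms shown are hypothetical, not the museum's actual list.

```python
import difflib

# Hypothetical synonym list: misspelled variant -> preferred term.
SYNONYMS = {
    "susquehana": "susquehanna",
    "susquehannah": "susquehanna",
}


def normalize(term, synonyms=SYNONYMS):
    """Map a known variant to its preferred spelling; leave other terms alone."""
    key = term.lower()
    return synonyms.get(key, key)


def did_you_mean(term, indexed_terms, cutoff=0.8):
    """Suggest the closest indexed term when the query term is not in the index."""
    key = normalize(term)
    if key in indexed_terms:
        return key
    matches = difflib.get_close_matches(key, indexed_terms, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Hypothetical terms taken from indexed records.
indexed = ["susquehanna", "western", "railway", "york"]

print(did_you_mean("Susquehannah", indexed))  # resolved by the synonym list
print(did_you_mean("susqehanna", indexed))    # resolved by closest match
```

The synonym list handles variants that staff already know about, while the closest-match step catches spellings nobody anticipated.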

A video introduction and written search help are both available to introduce users to the site. 

Inmagic DB/TextWorks for Back-End Data Management

Behind the scenes, the museum continues to use Inmagic DB/TextWorks to manage these records. This database management system is invaluable to them in managing metadata, selecting standard metadata from validation lists, and providing access to volunteers for every-day data entry.

The museum's search engine continues to be hosted by Andornot as part of our managed hosting service.

"While Andornot had available a well-built modern search system in AnDI, they spent a lot of time with us learning about our particular users' needs, helping us think through the most useful processes, and refining the search experience. They know the business of both managing records internally and helping users find what they need. 

In the process two things happened – we learned more about the strengths and weaknesses in our data entry processes, and the usefulness and public recognition of our holdings were enhanced through improved web access.  The search help video was a real plus, and they worked with us in making our search page both functional and attractive." – James Alexander, Jr.

We're very pleased to continue our work with the Railroad Museum of Pennsylvania. Contact us to discuss upgrades and search options for your museum collections.

New Andornot Add-on: Embedded Document Viewer Surfaces Content Within Digital Documents

by Jonathan Jacobsen Tuesday, August 18, 2015 9:45 AM

So often when searching a database, records in the search results include links to PDFs and other electronic documents. Somewhere in the linked documents are pages with information related to the search, but where? And which pages are the most relevant? A user can use their PDF reader’s Find function to search again for keywords in the document, but that’s repetitive and not especially sophisticated. What if there was a better way of reviewing content within linked documents?

The Andornot Embedded Document Viewer breaks every PDF or similar document down into individual pages, with OCRd, indexed, searchable full text content available to searchers. When a user searches a database, the search results can include individual pages of linked documents, with their search terms highlighted, and with the most relevant pages shown, not just the record that links to the resource.
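A minimal sketch of page-level searching, assuming the text of each page has already been extracted (for example by OCR): each page is searched separately, matching terms are wrapped in markers, and pages are ordered by number of hits. The simple hit count stands in for real relevancy ranking, and the function name, the `**` markers and the sample pages are assumptions for illustration, not the viewer's implementation.

```python
import re


def search_pages(pages, query):
    """Rank pages by how often the query terms occur and highlight the matches.

    pages: list of page texts, one string per page.
    Returns (page_number, hit_count, highlighted_text) tuples, best page first.
    """
    terms = [re.escape(t) for t in query.split()]
    if not terms:
        return []
    pattern = re.compile(r"\b(" + "|".join(terms) + r")\b", re.IGNORECASE)
    results = []
    for number, text in enumerate(pages, start=1):
        # subn returns the rewritten text and the number of replacements made.
        highlighted, hits = pattern.subn(lambda m: "**" + m.group(0) + "**", text)
        if hits:
            results.append((number, hits, highlighted))
    # Most hits first; ties keep document order.
    results.sort(key=lambda r: (-r[1], r[0]))
    return results


# Hypothetical page texts for illustration.
pages = [
    "The carousel was built in 1912.",
    "Council minutes for the year.",
    "Restoration of the carousel: the carousel horses were repainted.",
]

for number, hits, text in search_pages(pages, "carousel"):
    print(number, hits, text)
```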

The screenshot below shows search terms highlighted on a page. Additional images and examples are available here.

By viewing individual pages, rather than having to open and review each linked document in its entirety, a user can more quickly assess resources.

Other features include the ability to navigate through the document, zoom in and out of a page, and view thumbnails of all pages.

The Andornot Embedded Document Viewer is often added to the Andornot Discovery Interface search engine. Search results can represent the individual pages of a document that best match the user's search, ranked by relevancy, rather than just the catalogue or parent metadata record for the entire document.

Examples

The Andornot Embedded Document Viewer is incorporated into the following projects, which are also based on the Andornot Discovery Interface:

Contact us for more information about enhancing search and discovery of linked, digitized resources.
