Fulton History's 34 Million Page Online Newspaper Archive Adds Contegra's Server-Side PDF Hit Highlighter to dtSearch® Instant Searching for Immediate Historical Access

Contegra's product lets web users (with only a browser installed) enter a dtSearch request and jump right to highlighted hits superimposed on the original newspaper page image

Fulton History offers an unparalleled historical newspaper collection extending back to the 18th Century.  dtSearch is a leader in enterprise and developer text retrieval software and document filters.  Contegra Systems has decades of data integration experience and is a major provider of custom development implementations for the dtSearch Engine market, including precision faceted search and other advanced data classification.

With the new installation, Fulton History visitors can not only instantly search 34 million newspaper pages, but also automatically home in on the exact place on the original newspaper image that contains the search term.  “With about a million visitors a month to FultonHistory.com, it is critical that the site runs smoothly,” says Tom Tryniski, Fulton History’s Founder.  “Adding Contegra’s highlighter to dtSearch makes the whole user experience run that much more seamlessly.”

One key issue with newspaper scanning is the possibility of optical character recognition (OCR) errors.  A smudge on an old newspaper might result in an OCR program resolving “Titanic” as “Titamic.”  The dtSearch Engine includes its own fuzzy searching algorithm, which lets web visitors adjust search fuzziness to sift through OCR and other typographical errors.  For example, with a fuzziness level of 3, a search for “Titanic” would find not only “Titanic” but also “Titamic.”  With a fuzziness level of 4, a search for “Titanic” would find “Titanic,” “Titamic” and “Titomic.”
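The general idea of widening a search to tolerate OCR errors can be illustrated with a short Python sketch.  This is not dtSearch's algorithm, which is not described here; it substitutes plain Levenshtein edit distance as a stand-in, and the function names, the max_edits parameter and the sample word list are all hypothetical.  The max_edits values below are not dtSearch fuzziness levels.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum inserts, deletes and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def fuzzy_matches(query: str, words: list[str], max_edits: int) -> list[str]:
    """Return the words within max_edits of the query (case-insensitive)."""
    q = query.lower()
    return [w for w in words if edit_distance(q, w.lower()) <= max_edits]


# Hypothetical words as they might come out of an OCR pass.
ocr_words = ["Titanic", "Titamic", "Titomic", "Atlantic"]

print(fuzzy_matches("Titanic", ocr_words, max_edits=1))
# ['Titanic', 'Titamic']
print(fuzzy_matches("Titanic", ocr_words, max_edits=2))
# ['Titanic', 'Titamic', 'Titomic']
```

As in the example above, a looser tolerance picks up more OCR variants of the same word, at the cost of a greater chance of unrelated matches.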

On the Fulton History site, dtSearch’s fuzziness algorithm also extends to the Contegra application’s highlighting of PDF hits.  Adds Mr. Tryniski:  “Because of the potential for OCR errors when scanning old newspapers, dtSearch’s fuzzy searching is really important.”  The Fulton History site makes available other dtSearch search options as well, including stemming, phonic and concept / thesaurus searching. 

# # #

About Fulton History, www.fultonhistory.com

Originally a resource for searching historical newspaper records from upstate New York, Fulton History brings together an ever-expanding collection of American and now Canadian newspapers.  The entire 34 million page collection is available for the general public to search at fultonhistory.com.

About Contegra Systems, www.contegrasystems.com

Established in 1987, Contegra Systems, Inc. is a leading provider of data integration services to Fortune 500 companies and others with extensive data access requirements.  The company routinely transforms substantial collections of mixed data content into robust, user-friendly Web and other electronic products.  The company also routinely undertakes custom development projects involving the dtSearch Engine SDKs, applying Contegra’s server-side PDF hit-highlighting application as well as customized faceted and other advanced data classification implementations.

 

About dtSearch, dtSearch.com

The Smart Choice for Text Retrieval® since 1991, the dtSearch product line has 25+ search options for instantly searching terabytes of text.  Along with enterprise and developer text retrieval, the company has its own document filters, offering parsing, extraction, conversion and searching of a broad range of data formats.  Supported data types encompass databases, website data, popular “Office” formats, compression formats, and emails with attachments.  dtSearch products meet some of the largest-capacity text retrieval needs in the world, including developer APIs spanning multiple platforms.  The products have received hundreds of excellent case studies and press reviews.  (Please see dtSearch.com for these.)  The company has distributors worldwide with coverage on six continents.


Tags: dtSearch, Enterprise Text Retrieval, PDF Highlight, PDF Search


About dtSearch Corp

dtSearch Corp
6852 Tulip Hill Terrace
Bethesda, MD 20816