Anyone who has ever tried to decipher bad handwriting can understand why postal workers have a reputation for inefficiency: They read some 50 million hand‑addressed envelopes a day, or 10 percent of the overall mail. Sargur Srihari, however, has gone a long way toward eliminating that frustrating and time‑consuming task.
Srihari, 52, director of the Center of Excellence for Document Analysis and Recognition (Cedar) at the State University of New York (SUNY) in Buffalo, has developed a handwriting recognition computer program capable of culling useful information from illegible addresses on letters.
Similar programs, particularly the ones on tiny hand‑held computers and personal digital assistants, have been around for a while, but they have the advantage of being able to “feel” how each letter is formed on an electronic pad and to learn an individual’s handwriting style. Such software would be woefully inadequate in untangling the typical envelope address, in which individual characters are often illegible, if not dropped entirely.
Market‑driven research was always high on Srihari’s agenda. He began work on handwriting analysis shortly after he arrived in the United States in 1970. A graduate of the Indian Institute of Science in Bangalore, Srihari headed to Ohio State University for a Ph.D. in computer and information science.
It was during this time that he developed an interest in researching handwritten documents and commercializing the applications.
He wondered how people read newspapers, what captured their attention in a tabloid, or how a reporter read his handwritten notes. A deeper understanding of the processes behind these required some sort of analytical framework. And Srihari was determined to create one.
After joining SUNY’s computer science department as a faculty member in 1978, Srihari helped found Cedar, where he put together a research team that would study handwritten documents using artificial intelligence and mechanical analysis.
En route to creating the Handwritten Address Interpretation (HWAI) software, Srihari generated six patents, authored 150 papers on text recognition and was inducted as a fellow of the prestigious Institute of Electrical and Electronics Engineers (IEEE).
In 1983, the United States Postal Service (USPS) and postal contractors such as Lockheed Martin and Siemens became interested in HWAI.
After funding the program’s development for more than 14 years, the USPS eventually ran a test deployment of the HWAI software in 1999, and later installed it at 255 of its major processing centers in the country.
The HWAI software currently recognizes ZIP codes correctly in roughly 70 percent of handwritten addresses and has achieved full address recognition in 30 percent.
The system may cut post office labor costs by as much as $150 million per year by reducing the need for human intervention in sorting handwritten letters, according to Srihari.
At that rate, one year’s savings will easily justify the USPS’s investment in Srihari’s research.
Srihari also notes that the agency had a billion‑dollar budget surplus in 1999. “We like to think that about one tenth of that billion‑dollar surplus was our contribution,” he says.
How exactly does the software function? It looks at the ZIP code on the envelope, accesses a database of possible street names within that area and then compares that information with the beginning of the handwritten street address.
For instance, if the program deciphers from the ZIP code that the address on the letter is in Camden, New Jersey, and it can read only the first four letters of the street name (CAMD), then it can look up all the street names in that area that begin with those letters and come up with Camden Court.
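The lookup described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the HWAI implementation: the table `STREETS_BY_ZIP`, the function `candidate_streets` and the street names listed are hypothetical, and 08102 is used simply as an example Camden ZIP code.

```python
# Hypothetical street database keyed by ZIP code (illustrative data only).
STREETS_BY_ZIP = {
    "08102": ["Camden Court", "Cooper Street", "Market Street"],
    "14260": ["Capen Hall", "Campus Drive"],
}


def candidate_streets(zip_code, prefix):
    """Return the streets in the ZIP code area whose names start with the
    legible beginning of the handwritten street name."""
    streets = STREETS_BY_ZIP.get(zip_code, [])
    legible = prefix.strip().upper()
    return [name for name in streets if name.upper().startswith(legible)]


# Only the first four letters of the street name could be read.
print(candidate_streets("08102", "CAMD"))
```

When exactly one street matches, as in this example, the address is resolved; when several match, a real system would need further evidence, such as the overall shape of the word, to choose among them.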
HWAI’s uniqueness lies in its ability to consider the overall shape of the word rather than deciphering one character at a time. This has yielded significant information about cursive writing, which has drawn the interest of the Justice Department and the Federal Bureau of Investigation (FBI). The two have expressed interest in developing computer‑assisted handwriting analysis tools for forensic applications.
Armed with a $428,000 grant from the National Institute of Justice in Washington, D.C., Srihari and his team are modifying the HWAI software. The ultimate aim is to be able to provide foolproof analysis that can be used to support the testimony of handwriting experts in courts.
Scientific tools such as those developed by Srihari are considered essential for admitting handwriting evidence in U.S. courts, owing to a number of recent rulings on expert testimony, including one in the JonBenét Ramsey case.
Though handwriting analysts may be able to solve the question of who penned a ransom note or forged a check, their testimony is not admissible as evidence in criminal cases. The reason is that, being human, they cannot claim complete objectivity.
“A human expert may put in his or her own bias unconsciously,” explains Srihari. “We have built the foundation for a handwriting analysis system that will quantify performance and increase confidence in determining a writer’s identity.”
The software basically validates individuality in writing. The idea that everyone’s handwriting is different is taken for granted. Srihari has developed purely scientific criteria for that premise.
He has submitted a paper, “Individuality of Handwriting,” to the Journal of Forensic Sciences. If accepted for publication, the paper will herald a new era in criminal cases, as it will form the basis for admitting expert testimony on handwriting under the so‑called Daubert guidelines set by the U.S. Supreme Court.
Srihari and his team developed the software by first collecting a database of more than 1,000 samples of handwriting from a pool of individuals representing a microcosm of the U.S. population in terms of gender, age and ethnicity.
Multiple samples of handwriting were taken from subjects, who were asked to write the same series of sentences in cursive. Instead of analyzing the sentences visually, the way a human would, Srihari explains, the researchers deconstructed each sample, extracting features from the writing, such as the shapes of individual characters, descenders, and the spaces between lines and words.
The researchers then ran the samples through their software program. “We tested the program by asking it to determine which of two authors wrote a particular sample, based on measurable features,” recalls Srihari. “The program responded correctly 98 percent of the time.”
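A test of this kind can be pictured with a toy example. The sketch below is not Srihari's method, which the article does not detail; it assumes a simple nearest‑average rule, and the feature values (slant in degrees, word spacing and descender length in millimeters) and the names `mean_vector`, `distance` and `likely_writer` are invented for illustration.

```python
import math


def mean_vector(samples):
    """Average each feature across a writer's known samples."""
    count = len(samples)
    return [sum(column) / count for column in zip(*samples)]


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def likely_writer(sample, known):
    """Pick the writer whose average features lie closest to the sample.

    `known` maps a writer's name to a list of feature vectors.
    """
    return min(known, key=lambda name: distance(sample, mean_vector(known[name])))


# Invented measurements: [slant, word spacing, descender length].
known = {
    "writer A": [[10.0, 4.0, 2.0], [12.0, 4.4, 2.2]],
    "writer B": [[25.0, 7.0, 3.5], [27.0, 7.4, 3.9]],
}
print(likely_writer([11.5, 4.1, 2.0], known))
```

A practical system would use many more features and would scale them so that no single measurement dominates the distance.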
Part of the software has been developed at CedarTech, a spin-off from Cedar. “Not all work that needs to be done in this area could be performed in a university setting,” Srihari notes.
Although he believes in conducting research for its own sake ‑ that is, without necessarily exploiting the findings for commercial purposes ‑ Srihari feels that market‑driven research is far more satisfying because there is an ultimate goal.
“Some of the end goals are so challenging that it does not trivialize the science,” he says.
It was to meet these challenges head on that Srihari and his wife, Rohini, also a computer scientist, co‑founded Cymfony, a software company specializing in information extraction technology. Despite years of research, Srihari remains more of an educator than a scientist. He balances the day between his teaching job and what he calls “serious” commercial research.
“Pattern recognition is going to become a multibillion dollar industry and its applications are innumerable,” Srihari says. “We are just beginning to tap into this industry, and this is what drives me.”