
Investigating documents in depth

Feb 09, Technology


Keyword searches in text databases are a standard procedure today. Related content in different documents can now be analyzed on numerous levels using the software tool SWAPit. Researchers will be demonstrating at CeBIT how football news can be evaluated.

Does Ballack actually play any better now that he has signed a lucrative advertising contract? Or has his performance deteriorated instead? Has the disagreement between Kahn and Lehmann improved the two goalkeepers’ performance, or are they tending to stop fewer balls than before? And what effect does this have on their clubs? If scoop-hungry reporters are to assess these issues on a sound basis, rather than just relying on their gut feeling, they need to compare the news in sports magazines with up-to-date statistics, club communications and articles in the tabloids.

Such multi-layered analyses can now be prepared semi-automatically, using the software tool SWAPit developed by scientists at the Fraunhofer Institute for Applied Information Technology FIT in Sankt Augustin near Bonn. This tool makes it possible to discover related content in textual data at a glance, revealing any associated additional information.

“The name SWAPit is derived from the verb ‘to swap’,” explains Andreas Becks of the FIT. “The program challenges users to look at textual information from alternative points of view, enabling them to compare supplementary information related to the documented topics.” To make this possible the tool presents collections of texts as a kind of map, in which similar texts are grouped into clusters. When a user clicks on one of these clusters, the shared features are displayed on the monitor in a field immediately adjacent to the map. “These additional ways of looking at information allow users to analyze their data much more fully. They can compile statistics and discern patterns that were not evident before,” Becks emphasizes.
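
For readers who want a feel for the general idea of grouping similar texts and listing what they share, the following is a minimal sketch in Python. It is not SWAPit's actual method, which the article does not describe. The sample headlines, the stopword list, the similarity threshold and all function names are assumptions made up for illustration, and the greedy word-count clustering is a deliberately simple stand-in for whatever the tool really does.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list; a real system would use a much larger one.
STOPWORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "for", "with"}


def term_counts(text):
    """Lowercase the text, split it into words, drop stopwords, count terms."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(texts, threshold=0.3):
    """Greedy single pass: a text joins the first cluster whose first
    member is similar enough, otherwise it starts a new cluster."""
    vectors = [term_counts(t) for t in texts]
    clusters = []  # each cluster is a list of indices into texts
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters, vectors


def shared_terms(members, vectors):
    """Terms that occur in every text of a cluster (the 'shared features')."""
    common = set(vectors[members[0]])
    for i in members[1:]:
        common &= set(vectors[i])
    return sorted(common)


# Invented sample headlines, not taken from any real news source.
texts = [
    "Goalkeeper saves penalty in cup match",
    "Goalkeeper concedes late penalty in league match",
    "Midfielder signs advertising contract",
    "New advertising contract for midfielder announced",
]

clusters, vectors = cluster(texts)
for members in clusters:
    print(members, shared_terms(members, vectors))
```

Clicking a cluster in the map would correspond, in this toy version, to calling `shared_terms` on that cluster's members. Comparing each new text only with a cluster's first member keeps the sketch short; a production system would more likely compare against a cluster centroid and weight terms by how rare they are.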

Press research is just one possible application of the method known as integrated text and data mining. Other ways of using this software might be to analyze patents for research planning, examine documents on segments of the market or evaluate inquiries at service centers. “But at one point we even had an interdisciplinary cultural project in which SWAPit solved communication problems,” Becks reports. “It showed us how differently various disciplines define the same term.”

The researchers have already tested their prototype with industrial partners in a wide range of sectors. It is compatible with standard text formats such as doc, pdf and html, but could easily be extended to cover other formats if required for concrete marketing purposes, Becks assures us. Interested parties can learn more details at CeBIT in Hanover from March 9 to 15.

Source: Fraunhofer-Gesellschaft

