Although millions of people use Web search engines, researchers show that – by using relatively simple methods – most queries submitted can be classified into one of three categories.
Jim Jansen, assistant professor in Penn State's College of Information Sciences and Technology (IST), worked with IST undergraduate Danielle Booth and Amanda Spink of Queensland University of Technology. They found that Web search engine users do primarily informational, navigational or transactional searching.
Informational searching involves looking for a specific fact or topic; navigational searching seeks to locate a specific Web site; and transactional searching looks for information related to buying a particular product or service.
The research was the first published work of its kind done using actual searching data, with the aim of real-time classification. Researchers analyzed more than 1.5 million queries from hundreds of thousands of search engine users. Findings showed that about 80 percent of queries are informational and about 10 percent each are navigational and transactional.
Jansen and his colleagues arrived at those results by selecting random samples of records and analyzing query length, the order of the query in the session and the search results. These fields helped the team develop an algorithm that classified the searches with a 74-percent accuracy rate.
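A rule-based classifier of the kind described above might be sketched as follows. This is only an illustration, not the researchers' algorithm: the cue words, the domain suffixes, the two-term length threshold and the names `QueryRecord` and `classify_intent` are all assumptions made for this example, and the sketch ignores the search-results field the team also analyzed.

```python
from dataclasses import dataclass

# Hypothetical cue lists chosen for illustration; they are not from the study.
TRANSACTIONAL_TERMS = {"buy", "download", "order", "price", "purchase"}
DOMAIN_SUFFIXES = (".com", ".org", ".net", ".edu", ".gov")


@dataclass
class QueryRecord:
    text: str                 # the query as the user typed it
    position_in_session: int  # 1 for the first query in a session


def classify_intent(record: QueryRecord) -> str:
    """Label a query as informational, navigational or transactional."""
    terms = record.text.lower().split()

    # Assumed rule: shopping or download vocabulary signals a transaction.
    if any(term in TRANSACTIONAL_TERMS for term in terms):
        return "transactional"

    # Assumed rule: a short first query that looks like a site address
    # signals an attempt to reach a specific Web site.
    looks_like_site = any(term.endswith(DOMAIN_SUFFIXES) for term in terms)
    if len(terms) <= 2 and record.position_in_session == 1 and looks_like_site:
        return "navigational"

    # Everything else falls into the largest category.
    return "informational"
```

Because each rule is a simple lookup on the query text and its place in the session, a classifier like this can run as queries arrive, which is the sort of low computational cost the researchers say they were aiming for.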
"Other results have classified comparatively much smaller sets of queries, usually manually," Jansen said. "This research aimed to classify queries automatically.
"Our findings have broad implications for search engines and e-commerce if they can classify the user intent of queries in real time. This is why we wanted a computationally undemanding algorithm," Jansen continued. "It proves the 80/20 rule that 80 percent of the cases can be achieved with these clear-cut methods."
The paper "Determining the informational, navigational and transactional intent of Web queries" will appear in the May 2008 issue of Information Processing & Management. The article is currently available online.
The Penn State researcher said he plans to continue this research using a more complex algorithm that will hopefully yield a 90-percent accuracy rate using similar searching criteria.
Source: Penn State