Featured Researcher - Hugo Zaragoza
Hugo Zaragoza was 17 when he left his hometown of Barcelona, Spain. It would be another 16 years before he returned for good.
His first destination was Canada, where he earned his engineering degree from the University of Toronto. Next was France, where he was awarded a Ph.D. by the University of Paris for his thesis Dynamic Numerical Learning Methods for Textual Information Access. Zaragoza’s final stop: a six-year stint at Microsoft Research in Cambridge, England, where he worked on machine learning and search ranking functions with one of the fathers of modern Information Retrieval, Stephen Robertson.
But after more than a decade and a half living abroad, Zaragoza yearned for home. Plus, the gloomy English weather was fraying his nerves.
On a whim, Zaragoza emailed Ricardo Baeza-Yates, who at the time was teaching at the University of Barcelona. Zaragoza had met Baeza-Yates at several industry conferences and was well aware of his reputation as a leading research scientist. He asked Baeza-Yates if perhaps there were any positions for him at the university.
Baeza-Yates responded with an even better offer. He informed Zaragoza he had just been hired by Yahoo! Research to run a new lab in Barcelona—and was looking to build a team of talented scientists. Two months later, in January 2006, Zaragoza packed his bags and returned home a very happy man.
"This was almost like a dream come true," says the 34-year-old Zaragoza, who is now a senior researcher in the Web Mining group. "The second I heard Yahoo! Research was starting a lab in Barcelona, I knew I was interested."
Zaragoza has loved every minute at Yahoo!. At other companies, the Internet is just one of many things the business does; at Yahoo! it is front and center. "Everyone here is so passionate about search and online content and creating things the world has never seen before," he says.
Zaragoza’s own work at Yahoo! has focused on natural language search. His project, called DeepSearch, is a prototype search engine built to demonstrate some of the capabilities of the Natural Language Retrieval Architecture he and his team are building in Barcelona.
DeepSearch exposes many kinds of "deep" textual information, allowing the search engine to distinguish between, say, a person and a company name, without sacrificing the engine’s typical lightning speed. For example, one can query "Apple" as a fruit or a company. Or, in a sentence like "John hit the ball," the search engine would know that John did the hitting, not the ball, so one could query for all the things that John hits or all the people who hit balls.
A search engine that could make these kinds of distinctions would surely enable new kinds of search applications, reasons Zaragoza. "It is already hard to do this kind of fancy representation of text on a small number of documents, but if you want to do it for millions, or billions, it becomes incredibly harder," he says. "Our work is focused on improving linguistic analysis of text, and allowing very fast queries and on a very large scale."
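To make the "John hit the ball" and "Apple" examples concrete, here is a minimal Python sketch of an index that stores who did what to what, along with typed entity mentions. It is purely illustrative and is not DeepSearch or Yahoo!'s architecture: the names `Fact`, `RoleIndex`, and every method on them are hypothetical, and the annotations are entered by hand rather than produced by linguistic analysis.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One annotated event (hypothetical schema): agent did verb to patient."""
    agent: str
    verb: str
    patient: str
    doc_id: int


class RoleIndex:
    """Toy in-memory index keyed by semantic role and by entity type."""

    def __init__(self):
        self._by_agent_verb = defaultdict(list)    # (agent, verb) -> facts
        self._by_verb_patient = defaultdict(list)  # (verb, patient) -> facts
        self._entity_docs = defaultdict(set)       # (surface, type) -> doc ids

    def add_fact(self, fact):
        self._by_agent_verb[(fact.agent, fact.verb)].append(fact)
        self._by_verb_patient[(fact.verb, fact.patient)].append(fact)

    def tag_entity(self, surface, entity_type, doc_id):
        self._entity_docs[(surface, entity_type)].add(doc_id)

    def things_done_to(self, agent, verb):
        """All patients of `verb` when `agent` is the one acting."""
        facts = self._by_agent_verb.get((agent, verb), [])
        return sorted({f.patient for f in facts})

    def agents_of(self, verb, patient):
        """All agents that performed `verb` on `patient`."""
        facts = self._by_verb_patient.get((verb, patient), [])
        return sorted({f.agent for f in facts})

    def docs_mentioning(self, surface, entity_type):
        """Documents where `surface` was tagged with `entity_type`."""
        return sorted(self._entity_docs.get((surface, entity_type), set()))


# Hand-written annotations standing in for real linguistic analysis.
index = RoleIndex()
index.add_fact(Fact("John", "hit", "ball", doc_id=1))
index.add_fact(Fact("Mary", "hit", "ball", doc_id=2))
index.add_fact(Fact("John", "hit", "wall", doc_id=3))
index.tag_entity("Apple", "company", doc_id=4)
index.tag_entity("Apple", "fruit", doc_id=5)

print(index.things_done_to("John", "hit"))          # things John hits
print(index.agents_of("hit", "ball"))               # people who hit balls
print(index.docs_mentioning("Apple", "company"))    # Apple the company
```

A dictionary lookup like this is trivial for a handful of sentences. The difficulty Zaragoza describes lies in producing such annotations reliably and querying them quickly across millions or billions of documents.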
When not shaping the future of search engine technology, Zaragoza enjoys a day at the beach. In fact, his love of the beach was one of the many reasons he returned home. But unlike the typical Barcelona beach bum, he’d rather dive into the water than lounge in the sun. That’s why he heads one hour out of town to the many small, secluded beaches that dot the coastline. "The water is cleaner and I can swim and snorkel all I want," he says.
These days, Zaragoza is feeling right at home. After all, he’s got the perfect job in the perfect city.