Items where author is affiliated with University of Science and Technology of China
Number of items: 5.
and Xie, Xing
and Duan, Manni
and Hara, Takahiro
and Nishio, Shojiro A Game Based Approach to Assign Geographical Relevance to Web Images.
Geographical context is very important for images. Millions of images on the Web have been already assigned latitude and longitude information. Due to the rapid proliferation of such images with geographical context, it is still difficult to effectively search and browse them, since we do not have ways to decide their relevance. In this paper, we focus on the geographical relevance of images, which is defined as to what extent the main objects in an image match landmarks at the location where the image was taken. Recently, researchers have proposed to use game based approaches to label large scale data such as Web images. However, previous works have not shown the quality of collected game logs in detail and how the logs can improve existing applications. To answer these questions, we design and implement a Web-based and multi-player game to collect human knowledge while people are enjoying the game. Then we thoroughly analyze the game logs obtained during a three week study with 147 participants and propose methods to determine the image geographical relevance. In addition, we conduct an experiment to compare our methods with a commercial search engine. Experimental results show that our methods dramatically improve image search relevance. Furthermore, we show that we can derive geographically relevant objects and their salient portion in images, which is valuable for a number of applications such as image location recognition.
and Yang, Linjun
and Yu, Nenghai
and Hua, Xian-Sheng Learning to Tag.
Social tagging provides valuable and crucial information for large-scale web image retrieval. It is ontology-free and easy to obtain; however, irrelevant tags frequently appear, and users typically will not tag all semantic objects in the image, which is also called semantic loss. To avoid noise and compensate for the semantic loss, tag recommendation has been proposed in the literature. However, current recommendation simply ranks the related tags based on the single modality of tag co-occurrence on the whole dataset, which ignores other modalities, such as visual correlation. This paper proposes a multi-modality recommendation based on both tag and visual correlation, and formulates the tag recommendation as a learning problem. Each modality is used to generate a ranking feature, and the Rankboost algorithm is applied to learn an optimal combination of these ranking features from different modalities. Experiments on Flickr data demonstrate the effectiveness of this learning-based multi-modality recommendation strategy.
and Nie, Zaiqing
and Liu, Xiaojiang
and Zhang, Bo
and Wen, Ji-Rong StatSnowball: a Statistical Approach to Extracting Entity Relationships.
Traditional relation extraction methods require pre-specified relations and relation-specific human-tagged examples. Bootstrapping systems significantly reduce the number of training examples, but they usually apply heuristic-based methods to combine a set of strict hard rules, which limit the ability to generalize and thus generate a low recall. Furthermore, existing bootstrapping methods do not perform open information extraction (Open IE), which can identify various types of relations without requiring pre-specifications. In this paper, we propose a statistical extraction framework called Statistical Snowball (StatSnowball), which is a bootstrapping system and can perform both traditional relation extraction and Open IE. StatSnowball uses the discriminative Markov logic networks (MLNs) and softens hard rules by learning their weights in a maximum likelihood estimate sense. MLN is a general model, and can be configured to perform different levels of relation extraction. In StatSnowball, pattern selection is performed by solving an l1-norm penalized maximum likelihood estimation, which enjoys well-founded theories and efficient solvers. We extensively evaluate the performance of StatSnowball in different configurations on both a small but fully labeled data set and large-scale Web data. Empirical results show that StatSnowball can achieve a significantly higher recall without sacrificing the high precision during iterations with a small number of seeds, and the joint inference of MLN can improve the performance. Finally, StatSnowball is efficient and we have developed a working entity relation search engine called Renlifang based on it.
and Jiang, Daxin
and Pei, Jian
and Chen, Enhong
and Li, Hang Towards Context-Aware Search by Learning a Very Large Variable Length Hidden Markov Model from Search Logs.
Capturing the context of a user’s query from the previous queries and clicks in the same session may help understand the user’s information need. A context-aware approach to document re-ranking, query suggestion, and URL recommendation may improve users’ search experience substantially. In this paper, we propose a general approach to context-aware search. To capture contexts of queries, we learn a variable length Hidden Markov Model (vlHMM) from search sessions extracted from log data. Although the mathematical model is intuitive, how to learn a large vlHMM with millions of states from hundreds of millions of search sessions poses a grand challenge. We develop a strategy for parameter initialization in vlHMM learning which can greatly reduce the number of parameters to be estimated in practice. We also devise a method for distributed vlHMM learning under the map-reduce model. We test our approach on a real data set consisting of 1.8 billion queries, 2.6 billion clicks, and 840 million search sessions, and evaluate the effectiveness of the vlHMM learned from the real data on three search applications: document re-ranking, query suggestion, and URL recommendation. The experimental results show that our approach is both effective and efficient.
and Wang, Lu
and Guo, Xiaolin
and Pan, Aimin
and Zhu, Bin B. WPBench: A Benchmark for Evaluating the Client-side Performance of Web 2.0 Applications.
In this paper, a benchmark called WPBench is reported to evaluate the responsiveness of Web browsers for modern Web 2.0 applications. In WPBench, variations of servers and networks are removed and the benchmark result is the closest to what Web users would perceive. To achieve this, WPBench records users’ interactions with typical Web 2.0 applications, and then replays Web navigations when benchmarking browsers. The replay mechanism can emulate the actual user interactions and the characteristics of the servers and the networks in a consistent way independent of browsers so that any browser compliant with the standards can be benchmarked fairly. In addition to describing the design and generation of WPBench, we also report the WPBench comparison results on the responsiveness performance for three popular Web browsers: Internet Explorer, Firefox and Chrome.
About this site
This website has been set up for WWW2009 by Christopher Gutteridge of the University of Southampton, using our EPrints software.
Add your Slides, Posters, Supporting data, whatnots...
If you are presenting a paper or poster and have slides or supporting material you would like to have permanently made public at this website, please email
email@example.com - Include the file(s), a note to say if they are presentations, supporting material or whatnot, and the URL of the paper/poster from this site. eg. http://www2009.eprints.org/128/
It's impractical to add all the workshops at WWW2009 by hand, but if you can provide me with the metadata in a machine readable way, I'll have a go at importing it. If you are good at slinging XML, my ideal import format is visible at http://www2009.eprints.org/import_example.xml
We (Southampton EPrints Project) intend to preserve the files and HTML pages of this site for many years; however, we will turn it into flat files for long term preservation. This means that at some point in the months after the conference the search, metadata-export, JSON interface, OAI etc. will be disabled as we "fossilize" the site. Please plan accordingly. Feel free to ask nicely for us to keep the dynamic site online longer if there's a really good (or cool) use for it...
- WWW2009 EPrints supports OAI 2.0 with a base URL of http://www2009.eprints.org/cgi/oai2
- The JSON URL is http://www2009.eprints.org/cgi/json?callback=function&eprintid=number
To prevent Google from killing the server by hammering these tools, the /cgi/ URLs are disallowed in robots.txt. Ask Chris if you want an exception made.
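As a rough illustration of how the two endpoints listed above might be addressed from a script, the following Python sketch only builds request URLs and does not contact the server, since the dynamic interfaces may already have been disabled. The helper names `oai_url` and `json_url` and the callback name `handle` are hypothetical. The `Identify` and `ListRecords` verbs and the `oai_dc` metadata prefix are assumed from the OAI-PMH 2.0 protocol rather than taken from this site, and the eprint ID 128 is borrowed from the example paper URL given earlier on this page.

```python
from urllib.parse import urlencode

# Base URLs as quoted on this page.
OAI_BASE = "http://www2009.eprints.org/cgi/oai2"
JSON_BASE = "http://www2009.eprints.org/cgi/json"


def oai_url(verb, **params):
    """Build an OAI-PMH request URL (hypothetical helper).

    `verb` is an OAI-PMH 2.0 verb such as "Identify" or "ListRecords";
    extra keyword arguments become additional query parameters.
    """
    query = {"verb": verb, **params}
    return OAI_BASE + "?" + urlencode(query)


def json_url(eprintid, callback):
    """Build a JSON interface URL (hypothetical helper).

    `callback` is the name of the function the response is wrapped in,
    and `eprintid` is the numeric ID of the item.
    """
    return JSON_BASE + "?" + urlencode({"callback": callback, "eprintid": eprintid})


if __name__ == "__main__":
    print(oai_url("Identify"))
    print(oai_url("ListRecords", metadataPrefix="oai_dc"))
    print(json_url(128, "handle"))
```

Because the /cgi/ paths are excluded in robots.txt, any script that actually fetches these URLs should be run sparingly and with Chris's agreement for anything resembling a crawl.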
Feel free to contact me (Christopher Gutteridge) with any other queries or suggestions. ...Or if you do something cool with the data which we should link to!
These are not directly related to the EPrints set up, but may be of use to delegates.
- Social tool links
- I've put links in the page header to the WWW2009 stuff on Flickr, Facebook and to a page which will let you watch the #www2009 tag on Twitter. This is not really the right place for them, but they have not yet made it onto the main conference homepage. Send me any suggestions for new links.