Items where author is affiliated with IBM T.J. Watson Research Center
Number of items: 5.
and Lim, Lipyeow
and Kementsietsidis, Anastasios
and Wang, Min A Declarative Framework for Semantic Link Discovery over Relational Data.
In this paper, we present a framework for online discovery of semantic links from relational data. Our framework is based on declarative specification of the linkage require- ments by the user, that allows matching data items in many real-world scenarios. These requirements are translated to queries that can run over the relational data source, potentially using the semantic knowledge to enhance the accuracy of link discovery. Our framework lets data publishers to easily find and publish high-quality links to other data sources, and therefore could significantly enhance the value of the data in the next generation of web.
and Sun, Jimeng
and Castro, Paul
and Konuru, Ravi
and Sundaram, Hari
and Kelliher, Aisling Extracting Community Structure through Relational Hypergraphs.
Social media websites promote diverse user interaction on media objects as well as user actions with respect to other users. The goal of this work is to discover community structure in rich media social networks, and observe how it evolves over time, through analysis of multi-relational data. The problem is important in the enterprise domain where extracting emergent community structure on enterprise social media, can help in forming new collaborative teams, aid in expertise discovery, and guide long term enterprise reorganization. Our approach consists of three main parts: (1) a relational hypergraph model for modeling various social context and interactions; (2) a novel hypergraph factorization method for community extraction on multi-relational social data; (3) an online method to handle temporal evolution through incremental hypergraph factorization. Extensive experiments on real-world enterprise data suggest that our technique is scalable and can extract meaningful communities. To evaluate the quality of our mining results, we use our method to predict users’ future interests. Our prediction outperforms baseline methods (frequency counts, pLSA) by 36-250% on the average, indicating the utility of leveraging multi-relational social context by using our method.
and Fan, Wei
and Peng, Jing
and Verscheure, Olivier
and Ren, Jiangtao Latent Space Domain Transfer between High Dimensional Overlapping Distributions.
Transferring knowledge from one domain to another is challenging due to a number of reasons. Since both conditional and marginal distribution of the training data and test data are non-identical, model trained in one domain, when directly applied to a different domain, is usually low in accuracy. For many applications with large feature sets, such as text document, sequence data, medical data, image data of different resolutions, etc. two domains usually do not contain exactly the same features, thus introducing large numbers of “missing values” when considered over the union of features from both domains. In other words, its marginal distributions are at most overlapping. In the same time, these problems are usually high dimensional, such as, several thousands of features. Thus, the combination of high dimensionality and missing values make the relationship in conditional probabilities between two domains hard to measure and model. To address these challenges, we propose a framework that ﬁrst brings the marginal distributions of two domains closer by “ﬁlling up” those missing values of disjoint features. Afterwards, it looks for those comparable sub-structures in the “latent-space” as mapped from the expanded feature vector, where both marginal and conditional distribution are similar. With these sub-structures in latent space, the proposed approach then ﬁnd common concepts that are transferable across domains with high probability. During prediction, unlabeled instances are treated as “queries”, the mostly related labeled instances from outdomain are retrieved, and the classiﬁcation is made by weighted voting using retrieved out-domain examples. We formally show that importing feature values across domains and latentsemantic index can jointly make the distributions of two related domains easier to measure than in original feature space, the nearest neighbor method employed to retrieve related out domain examples is bounded in error when predicting in-domain examples. Software and datasets are available for download.
and Laredo, Jim
and Bhattacharya, Kamal
and Pasquale, Liliana
and Wassermann, Bruno REST-Based Management of Loosely Coupled Services.
Applications increasingly make use of the distributed platform that the World Wide Web provides – be it as a Software-as-a-Service such as salesforce.com, an application infrastructure such as facebook.com, or a computing infrastructure such as a “cloud”. A common characteristic of applications of this kind is that they are deployed on infrastructure or make use of components that reside in different management domains. Current service management approaches and systems, however, often rely on a centrally managed configuration management database (CMDB), which is the basis for centrally orchestrated service management processes, in particular change management and incident management. The distribution of management responsibility of WWW based applications requires a decentralized approach to service management. This paper proposes an approach of decentralized service management based on distributed configuration management and service process co-ordination, making use RESTful access to configuration information and ATOM-based distribution of updates as a novel foundation for service management processes.
and Iyengar, Arun
and Mikalsen, Thomas
and Rouvellou, Isabelle
and Nahrstedt, Klara A Trust Management Framework for Service-Oriented Environments.
Many reputation management systems have been developed under the assumption that each entity in the system will use a variant of the same scoring function. Much of the previous work in reputation management has focused on providing robustness and improving performance for a given reputation scheme. In this paper, we present a reputation-based trust management framework that supports the synthesis of trust-related feedback from many different entities while also providing each entity with the ﬂexibility to apply different scoring functions over the same feedback data for customized trust evaluations. We also propose a novel scheme to cache trust values based on recent client activity. To evaluate our approach, we implemented our trust management service and tested it on a realistic application scenario in both LAN and WAN distributed environments. Our results indicate that our trust management service can effectively support multiple scoring functions with low overhead and high availability.
About this site
This website has been set up for WWW2009 by Christopher Gutteridge of the University of Southampton, using our EPrints software.
Add your Slides, Posters, Supporting data, whatnots...
If you are presenting a paper or poster and have slides or supporting material you would like to have permentently made public at this website, please email
firstname.lastname@example.org - Include the file(s), a note to say if they are presentations, supporting material or whatnot, and the URL of the paper/poster from this site. eg. http://www2009.eprints.org/128/
It's impractical to add all the workshops at WWW2009 by hand, but if you can provide me with the metadata in a machine readable way, I'll have a go at importing it. If you are good at slinging XML, my ideal import format is visible at http://www2009.eprints.org/import_example.xml
We (Southampton EPrints Project) intend to preserve the files and HTML pages of this site for many years, however we will turn it into flat files for long term preservation. This means that at some point in the months after the conference the search, metadata-export, JSON interface, OAI etc. will be disabled as we "fossilize" the site. Please plan accordingly. Feel free to ask nicely for us to keep the dynamic site online longer if there's a rally good (or cool) use for it...
- WWW2009 EPrints supports OAI 2.0 with a base URL of http://www2009.eprints.org/cgi/oai2
- The JSON URL is http://www2009.eprints.org/cgi/json?callback=function&eprintid=number
To prevent google killing the server by hammering these tools, the /cgi/ URL's are denied to robots.txt - ask Chris if you want an exception made.
Feel free to contact me (Christopher Gutteridge) with any other queries or suggestions. ...Or if you do something cool with the data which we should link to!
These are not directly related to the EPrints set up, but may be of use to delegates.
- Social tool links
- I've put links in the page header to the WWW2009 stuff on flickr, facebook and to a page which will let you watch the #www2009 tag on Twitter. Not really the right place, but not yet made it onto the main conference homepage. Send me any suggestions for new links.