Demo Accepted Papers

  • BestTime: Finding Representatives in Time Series Datasets (Stephan Spiegel, David Schultz, Sahin Albayrak)

    Authors

    Stephan Spiegel, TU Berlin
    David Schultz, TU Berlin
    Sahin Albayrak, TU Berlin

    Abstract

    Given a set of time series, we aim to find representatives that best capture the recurring temporal patterns contained in the data. We demonstrate BestTime, a Matlab application that uses recurrence quantification analysis to find time series representatives.

  • GrammarViz 2.0: a tool for grammar-based pattern discovery in time series (Pavel Senin, Jessica Lin, Xing Wang, Tim Oates, Manfred Lerner, Arnold Boedihardjo, Crystal Chen, Susan Frankenstein, Sunil Gandhi)

    Authors

    Pavel Senin, UH Manoa
    Jessica Lin, George Mason University, USA
    Xing Wang
    Tim Oates
    Manfred Lerner
    Arnold Boedihardjo
    Crystal Chen
    Susan Frankenstein
    Sunil Gandhi

    Abstract

    The problem of discovering frequent and anomalous patterns in time series has received a lot of attention in the past decade. Addressing a common limitation of existing techniques, which require the pattern length to be known in advance, we recently proposed grammar-based algorithms for efficient discovery of variable-length frequent and rare patterns. In this paper we present GrammarViz 2.0, an interactive tool that, based on our previous work, implements algorithms for grammar-driven mining and visualization of variable-length time series patterns.

  • KnowNow: a Serendipity-based Educational Tool for Learning Time-Linked Knowledge (Luigi Di Caro, Livio Robaldo, Nicoletta Bersia)

    Authors

    Luigi Di Caro, University of Turin
    Livio Robaldo, University of Turin
    Nicoletta Bersia, Telecom Italia Lab

    Abstract

    In this paper we present KnowNow, a tool that lets users navigate text corpora through dynamic semantic information networks, created in real time according to delimited time ranges. In educational scenarios, students are often asked to write short essays on different topics linked by temporal information. This usually involves a combination of several aspects to be evaluated, such as knowledge, imagination, structure and presentation. In light of this, the introduction of Natural Language Understanding techniques together with cross-topic navigation and visualization tools can considerably help students to retrieve, link, and create well-structured and original contributions, as we demonstrate by using KnowNow.

  • Spá: a web-based viewer for text mining in Evidence Based Medicine (Joël Kuiper, Iain Marshall, Byron Wallace, Morris Swertz)

    Authors

    Joël Kuiper, University of Groningen
    Iain Marshall
    Byron Wallace
    Morris Swertz

    Abstract

    Summarizing the evidence about medical interventions is an immense undertaking, in part because unstructured PDF documents remain the main vehicle for disseminating scientific findings. Clinicians and researchers must therefore manually extract and synthesise information from these documents. We introduce Spá, a web-based viewer that enables automated annotation and summarisation of PDFs via machine learning. To illustrate its functionality, we use Spá to semi-automate the assessment of bias in clinical trials. Because Spá has a modular architecture, the tool may be widely useful in other domains with a PDF-based literature, including law, physics, and biology.

  • Khiops CoViz: a tool for visual exploratory analysis of k-coclustering results (Bruno Guerraz, Marc Boullé, Dominique Gay, Fabrice Clérot)

    Authors

    Bruno Guerraz, Orange
    Marc Boullé, Orange Labs
    Dominique Gay, Orange
    Fabrice Clérot, Orange

    Abstract

    Identifying and visually analyzing interesting interactions between variables in large-scale data sets through $k$-coclustering is of high importance. We present Khiops CoViz, a tool for visual analysis of interesting relationships between two or more variables (categorical and/or numerical). The visualization of a $k$-variable coclustering takes the form of a grid/matrix whose dimensions are partitioned: categorical variables are grouped into clusters and numerical variables are discretized. The tool offers several kinds of visualization, at various scales, for the grid representation of coclustering results, based on several criteria, each of which provides different insights into the data.

  • MinUS: Mining User Similarity with Trajectory Patterns (Jun Pang, Xihui Chen, Piotr Kordy, Ruipeng Lu)

    Authors

    Jun Pang, University of Luxembourg
    Xihui Chen, University of Luxembourg
    Piotr Kordy, University of Luxembourg
    Ruipeng Lu, Shandong University

    Abstract

    The development of positioning systems and wireless connectivity has made it possible to collect users’ fine-grained movement data. This movement data can be applied in a broad range of services. In this paper, we present a novel tool for calculating users’ similarity based on their movements. This tool, MinUS, integrates trajectory pattern mining technologies with state-of-the-art research on discovering user similarity from trajectory patterns. Specifically, with MinUS, we provide a platform to manage movement datasets, and to construct and compare users’ trajectory patterns. Tool users can compare results given by a series of user similarity metrics, which allows them to learn the importance and limitations of different similarity metrics and promotes studies in related areas, e.g., location privacy. Additionally, MinUS can be used by researchers as a tool for preliminary processing of movement data and for parameter tuning in trajectory pattern mining.

  • WebDR: A Web Workbench for Data Reduction (Stefanos Ougiaroglou, Georgios Evangelidis)

    Authors

    Stefanos Ougiaroglou, University of Macedonia
    Georgios Evangelidis, University of Macedonia

    Abstract

    Data reduction is a common preprocessing task in the context of k-nearest-neighbour classification. This paper presents WebDR, a web-based application in which several data reduction techniques have been integrated and can be executed online. WebDR allows the performance of the classification process to be evaluated through a web interface. It can therefore be used in academia for educational and experimental purposes.

  • BMaD – A Boolean Matrix Decomposition Framework (Andrey Tyukin, Stefan Kramer, Jörg Wicker)

    Authors

    Andrey Tyukin, Johannes Gutenberg-Universität Mainz
    Stefan Kramer, University of Mainz
    Jörg Wicker, Johannes Gutenberg-Universität Mainz

    Abstract

    Boolean matrix decomposition is a method to obtain a compressed representation of a matrix with Boolean entries. We present a modular framework that unifies several Boolean matrix decomposition algorithms, and provide methods to evaluate their performance. The main advantages of the framework are its modular approach, which allows the steps of a Boolean matrix decomposition to be combined flexibly, and its capability of handling missing values. The framework is licensed under the GPLv3 and is freely available for download.

  • Branty: a social media ranking tool for brands (Alexandros Arvanitidis, Anna Serafi, Athina Vakali, Grigorios Tsoumakas)

    Authors

    Alexandros Arvanitidis
    Anna Serafi, AUTH
    Athina Vakali, Aristotle Univ. Thessaloniki
    Grigorios Tsoumakas, Aristotle University of Thessaloniki

    Abstract

    In the competitive world of popular brands, a strong presence in social media is of major importance for customer engagement and product advertising. Up to now, many tools and applications enable end-users to observe and monitor their company’s web profile, their statistics, as well as their market outreach and competition status. This work goes beyond individual brand statistics since it automates a brand ranking process based on opinions emerging in social media users’ posts. The Twitter streaming API is exploited to track micro-blogging activity for a number of famous brands, with emphasis on users’ opinions and interactions. The social impact is captured from 3 different perspectives (objective counts, opinion reckoning, influence analysis), which estimate a score assigned to each brand via a multi-criteria algorithm. The results are then exposed in a Web application as a list of the most social brands on Twitter. But are conventional metrics, such as followers, enough to measure the social impact of a brand? Different usage scenarios of our application reveal that the social presence of a brand is more complex than current social impact frameworks care to admit.

  • Propositionalization Online (Nada Lavrac, Matic Perovšek, Anže Vavpetič)

    Authors

    Nada Lavrac, Jožef Stefan Institute
    Matic Perovšek, Jožef Stefan Institute
    Anže Vavpetič, Jožef Stefan Institute

    Abstract

    Inductive Logic Programming and Relational Data Mining address the task of inducing models or patterns from multi-relational data. An established relational data mining approach is propositionalization, characterized by transforming a relational database into a single-table representation. The paper presents a propositionalization toolkit implemented in the web-based data mining platform ClowdFlows. As a contemporary integration platform it enables workflow construction and execution, provides open access to Aleph, RSD, RelF and Wordification feature construction engines, and enables RDM performance comparison through cross-validation and ViperCharts results visualization.

  • PYTHIA: Employing Lexical and Semantic Features for Sentiment Analysis (Ioannis Katakis, Iraklis Varlamis, George Tsatsaronis)

    Authors

    Ioannis Katakis, HUA
    Iraklis Varlamis, Harokopio University of Athens
    George Tsatsaronis, TUD, BIOTEC

    Abstract

    Sentiment analysis methods aim at identifying the polarity of a piece of text, e.g., passage, review, snippet, by analyzing lexical features at the level of the terms or the sentences. However, many of the previous works do not utilize features that can offer a deeper understanding of the text, e.g., negation phrases. In this work we demonstrate a novel piece of software, namely PYTHIA, which combines semantic and lexical features at the term and sentence level and integrates them into machine learning models in order to predict the polarity of the input text. Experimental evaluation of PYTHIA in a benchmark movie reviews dataset shows that the suggested combination performs favorably against previous related methods. An online demo is publicly available.

  • Interactive Medical Miner: Interactively exploring subpopulations in epidemiological datasets (Uli Niemann, Myra Spiliopoulou, Henry Völzke, Jens-Peter Kühn)

    Authors

    Uli Niemann, OVGU Magdeburg
    Myra Spiliopoulou, University of Magdeburg
    Henry Völzke
    Jens-Peter Kühn

    Abstract

    We present our Interactive Medical Miner, a tool for classification and model drill-down, designed to study epidemiological data. Our tool encompasses supervised learning (with decision trees and classification rules), utilities for data selection, and a rich panel with options for inspecting individual classification rules and for studying the distribution of variables in each of the target classes. Since some of the epidemiological data available to the medical researcher may still be unlabeled (e.g. because the medical recordings for some part of the cohort are still in progress), our Interactive Medical Miner also supports the juxtaposition of labeled and unlabeled data. The set of methods and the scientific workflow supported by our tool have been published in [1].

  • Insight4News: Connecting News to Relevant Social Conversations (Georgiana Ifrim, Bichen Shi, Neil Hurley)

    Authors

    Georgiana Ifrim, University College Dublin, Ireland
    Bichen Shi, Insight Centre, University College Dublin
    Neil Hurley, Insight Centre, University College Dublin

    Abstract

    We present the Insight4News system that connects news articles to social conversations, as echoed in microblogs such as Twitter. Insight4News tracks feeds from mainstream media, e.g., BBC, Irish Times, and extracts relevant topics that summarize the tweet activity around each article, recommends relevant hashtags, and presents complementary views and statistics on the tweet activity, related news articles, and timeline of the story with regard to Twitter reaction. The user can track their own news article or a topic-focused Twitter stream. While many systems tap into the social knowledge of Twitter to help users stay on top of the information wave, none is available for connecting news to relevant Twitter content on a large scale, in real time, with high precision and recall. Insight4News builds on our award-winning Twitter topic detection approach and several machine learning components to deliver news in a social context.

    Keywords: news tracking, social media, Twitter, summarization