
- T1: Medical Mining for Clinical Knowledge Discovery
- T2: Patterns in Noisy and Multidimensional Relations and Graphs
- T3: The Pervasiveness of Machine Learning in Omics Science
- T4: Conformal Predictions for Reliable Machine Learning

- T5: The Lunch is Never Free: How Information Theory, MDL, and Statistics are Connected
- T6: Information Theoretic Methods in Data Mining
- T7: Machine Learning with Analogical Proportions
- T8: Preference Learning Problems
- T9: Deep Learning

Medical data mining is a mature area of research, characterized by both simple and very elaborate methods, mostly dedicated to solving concrete problems of disease diagnosis, disease description, or treatment success prediction. Clinical knowledge discovery encompasses the analysis of epidemiological data and of clinical and administrative data on patients; clinical decision support builds upon findings from these data. We elaborate on how data mining can contribute to such findings, discuss challenges of model learning, data availability and data provenance, and identify open challenges posed by Big Medical Data.

## Outline
- Self-presentation of the Tutorialists and Overview of the Domain (all)
- Knowledge Discovery from Epidemiological Data – Myra Spiliopoulou
- Knowledge Discovery from Clinical and Administrative Data – Pedro Pereira Rodrigues
- Knowledge Discovery Challenges on Big Medical Data – Ernestina Menasalvas
- Knowledge Discovery for Clinical Decision Support – Pedro Pereira Rodrigues
- Concluding Remarks
## Affiliations
Pedro Pereira Rodrigues

In this tutorial, we will consider generalizations of closed itemset mining toward n-ary relations and toward noise tolerance. Declarative aspects (in particular, how to define “noise”) as well as procedural aspects (how to efficiently traverse the pattern space) will be discussed.

## Outline
- Patterns in Multidimensional Relations and Graphs: Defining Them
- Introduction: a bottom-up approach toward an inductive database system
- From binary relations to n-ary relations, a natural generalization
- From crisp relations to fuzzy relations, tolerating noise globally or per-element, absolutely or relatively
- Constraints for readability, quality and efficiency
- From n-ary relations to collections of graphs through the symmetry constraint
- Patterns in vertex-multilabeled graphs
- Conclusion: Summary
- Patterns in Multidimensional Relations and Graphs: Mining Them
- Introduction: Constraints for both a greater expressiveness and a greater scalability
- Mining the closed itemsets, a generic algorithm and its extension to one definition of fault-tolerance
- Classes of constraints, definitions and enforcements
- Mining patterns in fuzzy n-ary relations
- Mining patterns in collections of graphs
- Mining patterns in vertex-multilabeled graphs
- Conclusion: Summary and perspectives
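As a companion to the outline above, here is a minimal sketch (not the tutorial's own code) of the starting point that gets generalized: mining frequent closed itemsets from a toy binary relation. The dataset and the naive enumeration strategy are purely illustrative; an itemset is closed when no strict superset occurs in exactly the same set of transactions.

```python
from itertools import combinations

def closed_itemsets(transactions, min_support):
    """Naive enumeration of frequent closed itemsets.

    An itemset is closed when no strict superset covers exactly the
    same transactions (i.e. has the same support)."""
    items = sorted({i for t in transactions for i in t})
    cover = lambda s: frozenset(idx for idx, t in enumerate(transactions) if s <= t)
    frequent = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            s = frozenset(combo)
            c = cover(s)
            if len(c) >= min_support:
                frequent[s] = c
    # keep only itemsets no strict superset of which has the same cover;
    # an equal-cover superset is always at least as frequent, so checking
    # within `frequent` suffices
    return {s: len(c) for s, c in frequent.items()
            if not any(s < t and frequent[t] == c for t in frequent)}

transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
for itemset, support in sorted(closed_itemsets(transactions, 2).items(),
                               key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), support)
```

The tutorial's subject is precisely what this toy version lacks: scaling such enumeration to n-ary relations, tolerating noise in the cover, and pushing constraints into the traversal.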

Biology has become an enormously data-rich subject. Data are generated in many flavors and follow the particularities of the omics perspective adopted in experimental studies. For instance, genomics is the field of study dealing with genomes and is mostly associated with the static view (the genes and where they are placed along the genome). The dynamic view is brought by the transcriptomics perspective, i.e., gene expression and its regulation. Finally, interactomics is usually associated with gene products, proteins, and their interactions; however, it can also be seen as a huge graph network with layers of interaction integrating distinct omics perspectives. Applications of unsupervised and/or supervised machine learning (ML) techniques to omics science abound in the literature. In this tutorial, we discuss machine learning on omics data, putting the emphasis on (i) mapping and (ii) learning omics patterns. We consider three main kinds of omics data: genomics, transcriptomics and interactomics. For each perspective, we first present the biological problem, then the data mapping (from a biological problem to a machine learning problem), the core ML methods employed, and their implementation in the R language.

## Outline
- Introduction and overview of the omics science
- Machine learning in genomics data – foundations, methods and applications
- Machine learning in transcriptomics data – foundations, methods and applications
- Machine learning in interactomics data – foundations, methods and applications
- Outlook – summary and future challenges
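The tutorial implements its methods in R; purely for illustration, the "data mapping" step it describes can be sketched in Python as follows. A transcriptomics experiment becomes a samples × genes expression matrix with a phenotype label per sample, classified here with a nearest-centroid rule. All sample names, gene values and labels below are made up.

```python
import math

# Hypothetical toy transcriptomics data: each sample is a vector of
# expression levels for three genes, labelled with a phenotype.
expression = {
    "sample1": ([5.1, 0.2, 3.3], "tumor"),
    "sample2": ([4.8, 0.1, 3.0], "tumor"),
    "sample3": ([1.0, 2.9, 0.4], "normal"),
    "sample4": ([0.9, 3.1, 0.6], "normal"),
}

def centroids(data):
    """Mean expression profile per class label."""
    sums, counts = {}, {}
    for vec, label in data.values():
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def classify(vec, cents):
    """Nearest-centroid classification under Euclidean distance."""
    dist = lambda a, b: math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(cents, key=lambda label: dist(vec, cents[label]))

cents = centroids(expression)
print(classify([5.0, 0.3, 3.1], cents))  # close to the "tumor" profile
```

The point is the mapping, not the classifier: once the biological question is phrased as a labelled matrix, any supervised ML method can replace the centroid rule.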

Reliable estimation of confidence remains a significant challenge as learning algorithms proliferate into demanding real-world pattern recognition applications. The Conformal Predictions framework is a recent development in machine learning that associates reliable measures of confidence with results in classification and regression. The framework is founded on the principles of algorithmic randomness, transductive inference and hypothesis testing, and has several desirable properties for use in real-world applications, such as the calibration of the obtained confidence values in an online setting. Further, the framework can be applied on top of any existing classification or regression method (neural networks, Support Vector Machines, ridge regression, etc.), making it a very general approach. Over the last few years, there has been growing interest in applying this framework to real-world problems such as clinical decision support, medical diagnosis, sea surveillance, network traffic classification, and face recognition.

## Outline
- Expose the audience to the basic theory of the framework
- Demonstrate examples of how the framework can be applied in real world problems
- Provide sample adaptations of the framework to related machine learning problems such as active learning, anomaly detection, feature selection and model selection
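To give a feel for the framework ahead of the tutorial, here is a minimal sketch of one common variant, inductive (split) conformal regression: nonconformity scores (absolute residuals) are computed on a held-out calibration set, and their (1 − ε)-quantile widens the point prediction into an interval. The stand-in "regressor" and all numbers are illustrative assumptions, not the tutorial's material.

```python
import math

def fit_mean(ys):
    # stand-in regressor: predicts the training mean regardless of x;
    # any classifier/regressor could be plugged in here
    mu = sum(ys) / len(ys)
    return lambda x: mu

def conformal_interval(predict, calibration, epsilon):
    """Interval [y_hat - q, y_hat + q], where q is the ceil((1-eps)(n+1))-th
    smallest calibration nonconformity score (absolute residual)."""
    scores = sorted(abs(y - predict(x)) for x, y in calibration)
    k = math.ceil((1 - epsilon) * (len(scores) + 1)) - 1
    q = scores[min(k, len(scores) - 1)]
    return lambda x: (predict(x) - q, predict(x) + q)

train_y = [1.0, 1.2, 0.8, 1.1]
calib = [(0, 1.05), (0, 0.9), (0, 1.3), (0, 0.95)]  # (x, y) pairs; x unused here
predictor = fit_mean(train_y)
lo, hi = conformal_interval(predictor, calib, epsilon=0.25)(0)
print(lo, hi)
```

Under the usual exchangeability assumption, intervals built this way cover the true value with probability at least 1 − ε, which is the calibration property the abstract refers to.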

Model-based techniques are becoming increasingly popular in many diverse data mining subfields such as sequence mining, graph mining, and pattern mining. One particularly popular approach, due to its interpretability and practicality, is the Minimum Description Length (MDL) principle, which is rooted in information theory. In this tutorial we present the basic concepts of MDL, information theory, and Bayesian statistics, with emphasis on how they are connected and on the consequences of these connections. These connections provide additional insights into the MDL principle and information theory, provide a stronger theoretical background, and allow us to use tools from statistics, but they also point out limitations that are not immediately apparent.

## Outline
- Introduction
- Information Theory and Statistics
- Definitions of Kullback-Leibler divergence and entropy [Mac02]
- Shannon theory on lower bound for encoding [Sha48]
- Huffman encoding
- Connection of entropy to the log-likelihood
- Maximum entropy models
- Definition with the emphasis on subjectivity of the model [Csi75]
- Maximum entropy model as a log-linear model [Csi75]
- Algorithm for solving maximum entropy model [DR72, Csi75]
- Comparing two log-linear models leads to a G-test
- Mutual information as an example of G-test
- MDL and Bayesian model selection
- Bayesian model selection and BIC [Sch78]
- Kolmogorov complexity [LV93]
- Practical version based on Information Theory [Gru07]
- MDL and connection to Bayesian model selection [Mac02]
- Refined MDL and connection to BIC [Ris96, Gru07]
- Conclusions
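Two of the quantities the outline builds on can be sketched in a few lines. The following illustrative Python snippet (with a made-up source distribution) computes Shannon entropy, the lower bound on expected code length, and the Kullback-Leibler divergence, the expected extra bits paid for encoding with the wrong model, which is exactly the link between code lengths and log-likelihoods exploited by MDL.

```python
import math

def entropy(p):
    """Shannon entropy in bits: the lower bound on expected code length."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits: the expected number
    of extra bits paid when encoding data from p with a code optimal for q."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.25, 0.25]   # true source distribution (illustrative)
q = [1/3, 1/3, 1/3]     # model distribution
print(entropy(p))             # 1.5 bits
print(entropy(p) + kl(p, q))  # expected code length under the code for q
```

Note that entropy(p) + KL(p || q) is the cross-entropy, i.e., the negative expected log-likelihood of the model q, which is the connection highlighted in the outline.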

Selecting a model for a given set of data is at the heart of what data analysts do, whether they are statisticians, machine learners or data miners. However, the philosopher Hume already pointed out that the ‘Problem of Induction’ is unsolvable: there are infinitely many functions that pass through any finite set of points. So it is not surprising that there are many different principled approaches to guide the search for a good model. Well-known examples are Bayesian Statistics and Statistical Learning Theory.

## Outline
- Basics of Information Theory
- Patterns and Information Theory
- Information Theory for Data Mining Tasks
- Information Theory and Descriptive Data Mining

Reasoning by analogy has been recognized as a major cognitive capability of the human mind, and has been studied in AI, among other fields. In the last decade, there has been a renewal of interest in the notion of analogical proportion, i.e., statements of the form “a is to b as c is to d”. Formal models of analogical proportions have been proposed in various settings including sets, lattices, trees, etc. In logical terms, an analogical proportion states that “a differs from b as c differs from d” and vice versa. This shows that analogy making is a matter of both similarity and dissimilarity. Analogical proportions provide a symbolic counterpart to numerical proportions. Instead of dealing exclusively with numbers, analogical proportions transpose the “rule of three” to symbolic items, allowing a fourth item to be induced when only the three others are known. This is the core of analogy-based learning methods. Its interest lies in the “creative” nature of the process, which looks at similar items (as in neighborhood-based methods) but also takes advantage of dissimilar, yet “parallel”, cases. The aim of this tutorial is to provide the audience with:
- an overview of computationally oriented models of analogical reasoning
- technical knowledge about the use of analogical proportions for inductive tasks
## Outline
- Historical introduction
- Analogical proportions
- Analogy-based classification
- Analogy-based problem solving
- Current research and perspectives
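The symbolic "rule of three" mentioned above admits a very small sketch in the Boolean setting (an illustrative toy, not the tutorial's formal treatment): a : b :: c : x has a solution iff a = b or a = c, and applying this componentwise over feature vectors induces the fourth item from the three known ones.

```python
def solvable(a, b, c):
    """In the Boolean setting, a : b :: c : x has a solution iff a == b or a == c."""
    return a == b or a == c

def solve(a, b, c):
    """The unique x with a : b :: c : x ('a differs from b as c differs from x').
    Valid only when solvable(a, b, c) holds."""
    return c if a == b else b

def induce_fourth(a, b, c):
    """Componentwise 'rule of three' over Boolean vectors; None when unsolvable."""
    if not all(solvable(ai, bi, ci) for ai, bi, ci in zip(a, b, c)):
        return None
    return [solve(ai, bi, ci) for ai, bi, ci in zip(a, b, c)]

# "a is to b as c is to d" on toy feature vectors:
a, b, c = [1, 0, 1], [1, 1, 1], [0, 0, 1]
print(induce_fourth(a, b, c))  # [0, 1, 1]
```

Analogy-based classification follows the same idea: triples of labelled examples whose features form a proportion with the query vote for the label that completes the proportion on the class attribute.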

We will start with an overview of the various preference learning problems that have emerged in recent years, including instance ranking and label ranking. We will see how these problems can be formulated as (possibly convex) optimization problems, or reduced to other well-known machine learning problems. Then we will discuss the main preference models and how to learn them. In particular, we will first introduce ordinal preference models, including CP-nets and lexicographic preference networks, and then discuss utility-based models such as generalized additive independence (GAI) networks. Finally, to broaden the talk, we will mention how preference learning may be used in other settings such as Markov Decision Processes or Computational Social Choice.

## Outline
- Some Preference Learning Problems
- Instance ranking
- Label ranking
- Learning Ordinal Preference Models
- CP-nets
- Lexicographic preferences
- Learning Utility-based Models
- GAI-nets
- Beyond
- Learning Preferences in Markov Decision Processes
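One classical reduction mentioned above, instance ranking reduced to binary classification, can be sketched as follows (an illustrative toy, not the tutorial's code): a linear utility u(x) = w · x is learned from pairwise preferences "x is preferred to y" via perceptron updates on the difference vectors x − y, and items are then ranked by their utility. Features and preference pairs are made up.

```python
def learn_utility(preferences, dim, epochs=50):
    """Perceptron on difference vectors: for each stated preference
    (preferred, other), push w so that w . (preferred - other) > 0."""
    w = [0.0] * dim
    for _ in range(epochs):
        for preferred, other in preferences:
            diff = [p - o for p, o in zip(preferred, other)]
            # misranked pair: the preferred item does not get higher utility
            if sum(wi * di for wi, di in zip(w, diff)) <= 0:
                w = [wi + di for wi, di in zip(w, diff)]
    return w

# toy items described by two hypothetical features
prefs = [([1.0, 0.0], [0.0, 1.0]),   # high feature 1 beats high feature 2
         ([0.8, 0.1], [0.2, 0.9])]
w = learn_utility(prefs, dim=2)
score = lambda x: sum(wi * xi for wi, xi in zip(w, x))
items = [[0.1, 0.9], [0.9, 0.2], [0.5, 0.5]]
print(sorted(items, key=score, reverse=True))
```

Replacing the perceptron update with a hinge or logistic loss on the same difference vectors yields the convex formulations the abstract alludes to.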

Deep learning is one of the most rapidly growing areas of machine learning. It concerns the learning of multiple layers of representation that gradually transform the input into a form where a given task can be performed more effectively. Deep learning has recently been responsible for an impressive number of state-of-the-art results in a wide array of domains, including object detection and recognition, speech recognition, natural language processing, bioinformatics and reinforcement learning.

## Outline
In this tutorial we will cover the foundations of deep learning: neural networks, convolutional neural networks, recurrent neural networks, autoencoders and Boltzmann machines. We will discuss why models with many layers of representation can be hard to learn and present strategies that have been developed to overcome these challenges. We will also discuss more recent innovations, including dropout training, which has proved to be an extremely effective regularization technique for training neural networks. Finally, we will cover some concrete and successful applications of deep learning.
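The neural-network foundations the tutorial starts from fit in a deliberately tiny sketch: one hidden layer trained with plain backpropagation on XOR, a task no single linear layer can solve. The architecture, seed and hyperparameters below are arbitrary illustrative choices, not the tutorial's material.

```python
import math, random

random.seed(0)
sig = lambda z: 1.0 / (1.0 + math.exp(-z))

X = [(0, 0), (0, 1), (1, 0), (1, 1)]
Y = [0, 1, 1, 0]
H = 4  # hidden units
W1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(H)]
b1 = [0.0] * H
W2 = [random.uniform(-1, 1) for _ in range(H)]
b2 = 0.0

def forward(x):
    h = [sig(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    return h, sig(sum(w * hi for w, hi in zip(W2, h)) + b2)

def loss():
    return sum((forward(x)[1] - y) ** 2 for x, y in zip(X, Y))

before = loss()
lr = 0.5
for _ in range(5000):
    for x, y in zip(X, Y):
        h, out = forward(x)
        d_out = (out - y) * out * (1 - out)  # squared-error gradient at output
        for j in range(H):
            # backpropagate through hidden unit j (using W2[j] before updating it)
            d_h = d_out * W2[j] * h[j] * (1 - h[j])
            W2[j] -= lr * d_out * h[j]
            for i in range(2):
                W1[j][i] -= lr * d_h * x[i]
            b1[j] -= lr * d_h
        b2 -= lr * d_out
print(before, "->", loss())
```

The same gradient mechanics, stacked over many layers and stabilized by tricks such as dropout, is what the rest of the tutorial builds on.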