
Active and semi-supervised data domain description

  • Nico Görnitz*
  • Marius Kloft
  • Ulf Brefeld

*Corresponding author for this work

Research output: Contributions to collected editions/works > Article in conference proceedings > Research > peer-review

30 Citations (Scopus)

Abstract

Data domain description techniques aim at deriving concise descriptions of objects belonging to a category of interest. For instance, the support vector domain description (SVDD) learns a hypersphere enclosing the bulk of provided unlabeled data such that points lying outside of the ball are considered anomalous. However, relevant information such as expert and background knowledge remains unused in the unsupervised setting. In this paper, we rephrase data domain description as a semi-supervised learning task, that is, we propose a semi-supervised generalization of data domain description (SSSVDD) to process unlabeled and labeled examples. The corresponding optimization problem is non-convex. We translate it into an unconstrained, continuous problem that can be optimized accurately by gradient-based techniques. Furthermore, we devise an effective active learning strategy to query low-confidence observations. Our empirical evaluation on network intrusion detection and object recognition tasks shows that our SSSVDDs consistently outperform baseline methods in relevant learning settings.
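The ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' SSSVDD formulation: it assumes a plain input-space (non-kernelized) hypersphere, a simple hinge-style objective with one extra term for labeled points, ordinary subgradient descent, and a generic "closest to the boundary" query rule standing in for the active learning strategy. All names (`fit_sphere`, `score`, `query_index`, `C_u`, `C_l`) are hypothetical and chosen only for this illustration.

```python
import numpy as np

def fit_sphere(X_u, X_l=None, y_l=None, C_u=0.1, C_l=1.0, lr=0.01, n_iter=2000):
    """Subgradient descent on an unconstrained hypersphere objective.

    Illustrative objective (an assumption, not the paper's exact problem):
        R2 + C_u * sum_i max(0, d_i - R2) + C_l * sum_j max(0, y_j * (d_j - R2))
    where d is the squared distance to the center c, y_j = +1 marks a labeled
    normal point and y_j = -1 a labeled anomaly. The y_j = -1 term is what
    makes the problem non-convex in c.
    """
    if X_l is None:
        X_l, y_l = np.empty((0, X_u.shape[1])), np.empty(0)
    X_l, y_l = np.asarray(X_l, dtype=float), np.asarray(y_l, dtype=float)

    c = X_u.mean(axis=0)                              # start at the data mean
    R2 = np.mean(np.sum((X_u - c) ** 2, axis=1))      # start at mean sq. distance
    for _ in range(n_iter):
        diff_u, diff_l = X_u - c, X_l - c
        d_u = np.sum(diff_u ** 2, axis=1)
        d_l = np.sum(diff_l ** 2, axis=1)
        out_u = d_u > R2                    # unlabeled points outside the ball
        act_l = y_l * (d_l - R2) > 0        # labeled points on the wrong side
        grad_R2 = 1.0 - C_u * out_u.sum() - C_l * y_l[act_l].sum()
        grad_c = (-2.0 * C_u * diff_u[out_u].sum(axis=0)
                  - 2.0 * C_l * (y_l[act_l][:, None] * diff_l[act_l]).sum(axis=0))
        R2 = max(R2 - lr * grad_R2, 0.0)    # keep the squared radius non-negative
        c = c - lr * grad_c
    return c, R2

def score(X, c, R2):
    """Positive values lie outside the ball (anomalous), negative inside."""
    return np.sum((X - c) ** 2, axis=1) - R2

def query_index(X_pool, c, R2):
    """Margin-style heuristic: pick the pool point nearest the ball's surface."""
    return int(np.argmin(np.abs(score(X_pool, c, R2))))

# Toy data: 200 unlabeled points, one labeled anomaly, one labeled normal point.
rng = np.random.default_rng(0)
X_u = rng.normal(size=(200, 2))
X_l = np.array([[1.5, 1.5], [0.0, 0.0]])
y_l = np.array([-1.0, 1.0])

c, R2 = fit_sphere(X_u, X_l, y_l)
far_score = score(np.array([[10.0, 10.0]]), c, R2)[0]   # a point far from the data
i = query_index(X_u, c, R2)                              # next point to ask an expert about
```

In this sketch, a point is flagged as anomalous when its score is positive, and the index returned by `query_index` identifies the unlabeled observation whose membership is least certain under the current ball. Once labeled, that point would be moved from `X_u` into `X_l` and the model refit.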

Original language: English
Title of host publication: Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2009, Bled, Slovenia, September 7-11, 2009, Proceedings, Part I
Editors: Wray Buntine, Marko Grobelnik, Dunja Mladenic, John Shawe-Taylor
Number of pages: 16
Place of Publication: Berlin, Heidelberg
Publisher: Springer Verlag
Publication date: 01.07.2009
Pages: 407-422
ISBN (Print): 978-3-642-04179-2
ISBN (Electronic): 978-3-642-04180-8
DOIs
Publication status: Published - 01.07.2009
Externally published: Yes
Event: European Conference on Machine Learning and Knowledge Discovery in Databases - 2009 - Bled, Slovenia
Duration: 07.09.2009 - 11.09.2009
https://www.k4all.org/event/european-conference-on-machine-learning-and-principles-and-practice-of-knowledge-discovery-in-databases/

Research areas and keywords

  • Informatics
  • Active Learning
  • Background knowledge
  • Baseline methods
  • Continuous problems
  • Data domain description
  • Empirical evaluations
  • Gradient based
  • Learning settings
  • Network intrusion detection
  • Optimization problems
  • Semi-supervised learning
  • Support vector domain description
  • Unlabeled data
  • Business informatics
