ECML/PKDD2003 Discovery Challenge

A Collaborative Effort in Knowledge Discovery from Databases

Call for Contributions

 

http://lisp.vse.cz/challenge/ecmlpkdd2003/chall2003.htm


Motivation

Knowledge discovery in real-world databases requires a broad scope of techniques and forms of knowledge. Both the knowledge and the applied methods should fit the discovery tasks and should adapt to the knowledge hidden in the data. The ECML/PKDD2003 Discovery Challenge will encourage a collaborative research effort, a broad and unified view of knowledge and methods of discovery, and an emphasis on business problems and their solutions.

The idea of the Discovery Challenge came from Jan Zytkow, who suggested organizing such an event during PKDD'99 in Prague. In contrast to the KDD Cups held within the KDD Conferences, the Discovery Challenge stresses the aspect of collaboration.

The Discovery Challenge should constitute a collection of data and problems that serves as a common ground for better comparison and discussion of the applicability of KDD methods to real-world problems, from both the KDD and the application viewpoints. The main goals of the Discovery Challenge are

 

Time and place

The Discovery Challenge will be held as a workshop during the ECML/PKDD2003 Conference, September 22-26, 2003, Cavtat-Dubrovnik, Croatia. Only those registered for ECML/PKDD2003 can participate in the Discovery Challenge.

 

Data sets

Two data sets from the medical domain are used for the Discovery Challenge. Participants in the Challenge can analyze either of these data sets. To get access to the data, you have to fill in the registration form.

 

Discovery Challenge guidelines

  1. Each participant can use any KDD techniques and discover as much knowledge as possible.
    Ideally each submitted contribution will include
    • the proposed business objectives (goals that may be of interest to database users),
    • a brief summary of the data mining effort; this summary may include the data preprocessing tasks (such as data extraction, sampling, data integration and homogenization, data cleaning, and data transformation), the data mining step, and the evaluation criteria approved,
    • presentation of the discovered knowledge, and
    • an explanation of how database users can apply the discovered knowledge.
    Since the results may be unexpected, the final applications may be different from those initially proposed.
  2. In order to reach a common framework for comparisons, the presentation of the discovered knowledge should include a clear summary of the predictions it makes possible. Ideally, such a summary shows parts of the entire dataset that can be removed from data because they can be predicted by the discovered knowledge and the remaining data.
  3. All presentations will be given during the Discovery Challenge Workshop. The time allocated for each presentation will be about 20 minutes.
  4. Ample time will be provided during and after the special sessions for interaction between participants. The discussion will be aimed at a joint representation of knowledge and method, and on a synthesis of all contributions.
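
As an illustration of the summary described in guideline 2, the following is a minimal sketch of how a participant might report which values of a data set become redundant once a discovered rule is known. The toy records, the column names, and the rule itself are hypothetical and are not taken from the Challenge data; the sketch only assumes that a rule either predicts the target value of a record or does not apply to it.

```python
# Hypothetical toy table; columns and values are illustrative only
# and do not come from the Discovery Challenge data sets.
records = [
    {"age": 34, "test_a": "high", "diagnosis": "positive"},
    {"age": 51, "test_a": "low", "diagnosis": "negative"},
    {"age": 47, "test_a": "high", "diagnosis": "negative"},
    {"age": 29, "test_a": "high", "diagnosis": "positive"},
]


def rule(record):
    """Hypothetical discovered rule: a high test_a value implies a positive diagnosis.

    Returns the predicted diagnosis, or None when the rule does not apply.
    """
    if record["test_a"] == "high":
        return "positive"
    return None


def summarize(records, target, rule):
    """Report which target values could be dropped and recovered from the rule."""
    predictions = [rule(r) for r in records]
    # A value is removable only if the rule applies and reproduces it exactly.
    removable = [
        i for i, (r, p) in enumerate(zip(records, predictions))
        if p is not None and p == r[target]
    ]
    # Number of records the rule applies to, whether or not it is right.
    covered = sum(1 for p in predictions if p is not None)
    return {
        "removable": removable,
        "covered": covered,
        "share_removable": len(removable) / len(records),
    }


summary = summarize(records, "diagnosis", rule)
print(summary)
```

A table of this kind, reported per rule and per target attribute, would give the workshop a comparable measure of how much of the data each contribution can account for.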

Submitted papers should be in English and should be formatted according to the Springer-Verlag Lecture Notes in Artificial Intelligence guidelines. Authors' instructions and style files can be downloaded from http://www.springer.de/comp/lncs/authors.html (no copyright form is requested; use the style for proceedings). The maximum length of papers is 12 pages. Papers must be submitted electronically (as PostScript or PDF files), either by e-mail to Petr Berka or using the “submit paper” option on the Discovery Challenge Webpage. The deadline for submission is June 30, 2003. An acceptance notification will follow. The deadline for camera-ready papers is July 11, 2003.

 

Acknowledgment

The ECML/PKDD2003 Discovery Challenge is supported by

Petr Berka
Jan Rauch
Shusaku Tsumoto