Software

Common Data Mining Framework

A software framework that provides a unified data mining perspective. The project is currently code-named Compass.

Introduction

The Common Data Mining Framework is an attempt to create an encompassing software framework for data mining, designed with flexibility in mind. It is meant to be extensible with different algorithms, data types, and environments. The focus of the project is not on implementing as many data mining algorithms as possible, but rather on:

Layered Software Architecture

To achieve these goals, a layered software framework is proposed. The lowest layer holds the functionality common to most data mining tasks. The middle layer provides the actual data mining logic (the algorithms) and represents the bulk of the framework. The top layer offers high-level functionality by building on the abstractions of the lower layers, and can be used to build complete data mining applications or more complex systems.

Figure: Layered Architecture of the Common Data Mining Framework
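The three layers can be illustrated with a minimal Python sketch. None of the names below (Dataset, Classifier, MajorityClassifier, evaluate) come from Compass itself; they are hypothetical stand-ins chosen only to show how each layer might build on the one beneath it.

```python
from abc import ABC, abstractmethod
from collections import Counter
from typing import Hashable, List, Sequence, Tuple

Row = Tuple[Sequence[float], Hashable]  # (features, label)


# --- Bottom layer: functionality common to most data mining tasks ---
class Dataset:
    """Hypothetical container for labelled rows (not a Compass class)."""

    def __init__(self, rows: Sequence[Row]):
        self.rows = list(rows)

    def labels(self) -> List[Hashable]:
        return [label for _, label in self.rows]

    def split(self, ratio: float) -> Tuple["Dataset", "Dataset"]:
        # Split in order: the first `ratio` share of rows, then the rest.
        cut = int(len(self.rows) * ratio)
        return Dataset(self.rows[:cut]), Dataset(self.rows[cut:])


# --- Middle layer: the data mining logic (algorithms) ---
class Classifier(ABC):
    """Assumed common interface that every algorithm would implement."""

    @abstractmethod
    def fit(self, data: Dataset) -> None: ...

    @abstractmethod
    def predict(self, features: Sequence[float]) -> Hashable: ...


class MajorityClassifier(Classifier):
    """Trivial example algorithm: always predicts the most frequent label."""

    def fit(self, data: Dataset) -> None:
        self._label = Counter(data.labels()).most_common(1)[0][0]

    def predict(self, features: Sequence[float]) -> Hashable:
        return self._label


# --- Top layer: high-level functionality built on the lower layers ---
def evaluate(classifier: Classifier, data: Dataset, train_ratio: float = 0.5) -> float:
    """Train on one part of the data and return accuracy on the rest."""
    train, test = data.split(train_ratio)
    classifier.fit(train)
    hits = sum(classifier.predict(f) == y for f, y in test.rows)
    return hits / len(test.rows)
```

In this sketch the top-layer function depends only on the Classifier interface, so a new algorithm could be added to the middle layer without touching either of the other two layers.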

Multi-form Reusability

From a different perspective, the functionality provided by the framework should be reusable in different forms. A user of the framework should be able to access the functionality in the following forms:

To achieve this kind of reusability, the framework will provide most of its functionality as a class library. Wrapping the library in a command line tool or a web service then gives different front-ends over the same back-end functionality.

Figure: Multi-form Reusability
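The idea of one library behind several front-ends might look roughly like the sketch below. The function names (summarize, cli_main, handle_request) and the JSON payload shape are assumptions made for illustration, not part of Compass. The web service form is reduced to a plain request-handling function so that the example stays self-contained.

```python
import argparse
import json
from statistics import mean
from typing import Dict, List, Optional, Sequence


# --- Class library form: the actual back-end functionality ---
def summarize(values: Sequence[float]) -> Dict[str, object]:
    """Hypothetical library call; stands in for any data mining routine."""
    return {"count": len(values), "mean": mean(values)}


# --- Command line form: a thin wrapper that parses arguments ---
def cli_main(argv: Optional[List[str]] = None) -> str:
    parser = argparse.ArgumentParser(description="Summarize numbers (illustrative).")
    parser.add_argument("values", nargs="+", type=float)
    args = parser.parse_args(argv)
    return json.dumps(summarize(args.values))


# --- Web service form: a thin wrapper that decodes a JSON request body ---
def handle_request(body: str) -> str:
    payload = json.loads(body)
    values = [float(v) for v in payload["values"]]  # assumed payload shape
    return json.dumps(summarize(values))
```

Both wrappers only translate input and output, so the command line tool and the web service return the same result for the same data.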

Implemented Functionality