Weka

Weka is a machine learning library designed to solve a range of data mining problems. It lets you apply algorithms to datasets directly, and also call those algorithms from your own applications written in Java.



Weka stands for Waikato Environment for Knowledge Analysis. It was developed at the University of Waikato (New Zealand).

About the Weka tool

Project goals: to create a modern environment for developing machine learning methods and applying them to real data, and to make those methods accessible to a wide audience. The idea is to let specialists in applied fields use machine learning to extract useful knowledge directly from data, including relatively large volumes of it.

Weka users are researchers in machine learning and the applied sciences. It is also used for teaching.

Weka includes tools for data preprocessing, classification, regression, clustering, feature selection, association rule mining, and visualization. It is also well suited to developing new machine learning approaches.

Implementation of Weka software

Weka is open-source software developed by an international scientific community and distributed under the GNU General Public License (GPL).

The software is written entirely in Java. Weka expects the source data as a feature matrix: one row per object and one column per feature. It can access SQL databases through Java Database Connectivity (JDBC) and use the result of an SQL query as its data source. Weka does not support processing of linked tables; however, many tools exist that combine separate linked tables into a single table, which can then be loaded directly into Weka.
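To make the feature matrix idea concrete, here is a minimal sketch that renders a small matrix as ARFF text, the plain-text dataset format Weka reads natively. The class name ArffSketch, the method toArff, and the sample values are illustrative assumptions, not part of Weka. The sketch assumes numeric features, one nominal class column, and names without spaces (names containing spaces would need quoting).

```java
import java.util.Arrays;
import java.util.List;

public class ArffSketch {

    /**
     * Renders a numeric feature matrix plus one nominal class label per row as ARFF text.
     * Hypothetical helper for illustration; it does not validate or quote names.
     */
    public static String toArff(String relation, List<String> featureNames,
                                List<String> classValues, double[][] rows, String[] labels) {
        StringBuilder sb = new StringBuilder();
        sb.append("@relation ").append(relation).append("\n\n");
        // One @attribute line per column of the feature matrix.
        for (String name : featureNames) {
            sb.append("@attribute ").append(name).append(" numeric\n");
        }
        // The class attribute lists its allowed values in braces.
        sb.append("@attribute class {").append(String.join(",", classValues)).append("}\n\n");
        sb.append("@data\n");
        // One comma-separated line per object (row).
        for (int i = 0; i < rows.length; i++) {
            for (double v : rows[i]) {
                sb.append(v).append(',');
            }
            sb.append(labels[i]).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String arff = toArff("iris_sample",
                Arrays.asList("sepal_length", "sepal_width"),
                Arrays.asList("setosa", "versicolor"),
                new double[][] {{5.1, 3.5}, {7.0, 3.2}},
                new String[] {"setosa", "versicolor"});
        System.out.print(arff);
    }
}
```

Saving the printed text to a file with the .arff extension should give a dataset that can be opened in Weka.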

Functionality and features of Weka machine learning



Weka's main user interface is the Explorer; the same functionality is also available through the component-based Knowledge Flow interface and from the command line. A separate Experimenter application compares the predictive performance of machine learning algorithms on a given set of tasks.

The Explorer contains several tabbed panels.

The preprocess panel imports data from a database, a CSV file, and so on, and applies filtering algorithms, for example converting numeric attributes into discrete ones, or removing objects and attributes that match defined criteria.

The classify panel applies classification and regression algorithms (Weka calls both of them classifiers) to a dataset, evaluates their predictive ability, and visualizes erroneous predictions, ROC curves, and, where possible, the model itself (decision trees in particular).

The associate panel finds all significant relationships between attributes.

The cluster panel provides access to the k-means algorithm, the EM algorithm for Gaussian mixture models, and other clustering methods.

The select attributes panel provides access to feature selection methods.

The visualize panel builds a scatter plot matrix and lets you select and zoom individual plots, among other things.
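As an illustration of what converting a numeric attribute into a discrete one means, here is a plain-Java sketch of equal-width binning. It is not Weka's own filter code; the class name EqualWidthBinning and the method discretize are hypothetical, and equal-width binning is only one of several possible discretization strategies.

```java
public class EqualWidthBinning {

    /**
     * Maps each value to a bin index in [0, bins - 1] by splitting the
     * observed range [min, max] into intervals of equal width.
     */
    public static int[] discretize(double[] values, int bins) {
        if (values.length == 0 || bins < 1) {
            throw new IllegalArgumentException("need at least one value and one bin");
        }
        double min = values[0];
        double max = values[0];
        for (double v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double width = (max - min) / bins;
        int[] result = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            // If all values are equal, width is 0 and everything goes to bin 0.
            int bin = (width == 0) ? 0 : (int) ((values[i] - min) / width);
            // The maximum value would land in bin index "bins", so clamp it.
            result[i] = Math.min(bin, bins - 1);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] binned = discretize(new double[] {0, 2.5, 5, 7.5, 10}, 2);
        System.out.println(java.util.Arrays.toString(binned)); // prints [0, 0, 1, 1, 1]
    }
}
```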

Weka tutorial

This page offers a detailed Weka tutorial in two forms: a PDF document to read and a series of video tutorials to watch.

Weka integration with Java

Weka provides direct access to its library of implemented algorithms, which makes it easy to use them from other Java-based systems. For example, the algorithms can be called from MATLAB. In particular, tools for accessing Weka algorithms from MATLAB are implemented in machine learning packages such as Spider and MATLABArsenal.

Weka API. How to use Weka in your Java source code:
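The following is a minimal sketch rather than a definitive recipe. It assumes the Weka library (weka.jar) is on the classpath, that the dataset path is passed as the first program argument, and that the class attribute is the last column and is nominal, which the J48 decision tree requires. The class name WekaApiSketch and the method crossValidatedAccuracy are illustrative choices.

```java
import java.util.Random;

import weka.classifiers.Evaluation;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class WekaApiSketch {

    /** Returns the cross-validated accuracy (percent correct) of a J48 decision tree. */
    public static double crossValidatedAccuracy(Instances data, int folds, long seed)
            throws Exception {
        J48 tree = new J48();
        Evaluation eval = new Evaluation(data);
        // Trains and tests copies of the classifier on each fold.
        eval.crossValidateModel(tree, data, folds, new Random(seed));
        return eval.pctCorrect();
    }

    public static void main(String[] args) throws Exception {
        // args[0] is a hypothetical path to a dataset, for example an ARFF or CSV file.
        Instances data = DataSource.read(args[0]);
        // Assumption: the class attribute is the last column.
        if (data.classIndex() == -1) {
            data.setClassIndex(data.numAttributes() - 1);
        }
        System.out.printf("10-fold accuracy: %.2f%%%n",
                crossValidatedAccuracy(data, 10, 1L));

        // Train on the full dataset and print the learned tree.
        J48 tree = new J48();
        tree.buildClassifier(data);
        System.out.println(tree);
    }
}
```

Other classifiers can be substituted for J48 in the same pattern, since the evaluation code only depends on the common classifier interface.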



To use Weka from systems built on other platforms, you can call its algorithms through the command-line interface.
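As a rough sketch of that route, a Weka classifier can be started as an ordinary Java process. The example below only assembles the command; the file names weka.jar and data.arff are placeholders, and the class name WekaCommandLineSketch is hypothetical. The -t option names the training file and -x sets the number of cross-validation folds.

```java
import java.util.Arrays;
import java.util.List;

public class WekaCommandLineSketch {

    /** Builds the command that runs the J48 classifier with cross-validation. */
    public static List<String> j48Command(String wekaJar, String arffFile, int folds) {
        return Arrays.asList(
                "java", "-cp", wekaJar,
                "weka.classifiers.trees.J48",
                "-t", arffFile,                  // training data
                "-x", Integer.toString(folds));  // number of cross-validation folds
    }

    public static void main(String[] args) {
        List<String> command = j48Command("weka.jar", "data.arff", 10);
        System.out.println(String.join(" ", command));
        // To actually launch it from Java, something like
        //   new ProcessBuilder(command).inheritIO().start().waitFor();
        // could be used, provided both placeholder files exist.
    }
}
```

Any system that can start a process and read its output can invoke the same command, which is what makes this route platform-independent.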
