Using The Experimenter


The Experimenter interface to Weka is designed for experiments that compare several learning schemes on one or more datasets. Its output is data that can be used to compare these learning schemes visually and numerically, and to carry out significance tests. To demonstrate how to use the Experimenter, we will conduct an experiment comparing two tree-learning algorithms on the birth-weight dataset.

 

The Experimenter interface has three main areas, accessed via tabs at the top left of the window: the Setup area, where the experiment parameters are set; the Run area, where the experiment is started and its progress monitored; and the Analyze area, where the results of the experiment are studied.

Setting up the Experiment

Please refer to reference section 3.6.1

The Setup window has seven main areas that are used to configure the experiment. Starting from the top, these areas are Experiment Configuration Mode, Results Destination, Experiment Type, Iteration Control, Datasets, Algorithms and lastly the Notes area.

 

Experiment Configuration Mode:

We will be using the simple experimental interface mode, as we do not require the extra features the advanced mode offers. We will start by creating a new experiment and then defining its parameters. Clicking the ‘New’ button at the top of the window creates a new, blank experiment. After we have finished setting up the experiment, we save it using the ‘Save’ button. Experiment settings are saved in either the EXP format or the more familiar XML format. These files can be opened later to recall all the experiment configuration settings.

 

Choose Destination:

The results of the experiment will be stored in a data file. This area allows one to specify the name and format of that file. This step is not necessary if one does not intend to use the data outside of the Experimenter or to examine it at a later date. Results can be stored in the ARFF or CSV format, and they can also be sent to an external database.
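As a rough illustration of what can be done with a results file outside the Experimenter, the sketch below averages one metric per learning scheme from a CSV file. The column names (Key_Dataset, Key_Run, Key_Fold, Key_Scheme, Percent_correct) are assumptions about the layout of the exported file, and the rows are made-up placeholder values, not real results; check the header of your own file before relying on it.

```python
import csv
import io
from collections import defaultdict

# Made-up placeholder rows; the column names are assumed, not guaranteed.
SAMPLE_RESULTS = """Key_Dataset,Key_Run,Key_Fold,Key_Scheme,Percent_correct
example,1,1,SchemeA,80.0
example,1,2,SchemeA,90.0
example,1,1,SchemeB,70.0
example,1,2,SchemeB,75.0
"""

def mean_by_scheme(csv_file, metric="Percent_correct"):
    """Average the chosen metric over all rows, grouped by learning scheme."""
    values = defaultdict(list)
    for row in csv.DictReader(csv_file):
        values[row["Key_Scheme"]].append(float(row[metric]))
    return {scheme: sum(v) / len(v) for scheme, v in values.items()}

# io.StringIO stands in for open("results.csv", newline="") on a real file.
means = mean_by_scheme(io.StringIO(SAMPLE_RESULTS))
for scheme, mean in sorted(means.items()):
    print(scheme, mean)
```

With a real results file, the same function can be pointed at any numeric column by changing the metric argument.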

Set Experiment Type:

There are three types of experiment available in the simple interface. They differ in how the data is split into training and testing sets. The options are cross-validation, random split, and split with order preserved (i.e. the instances are not shuffled before the split, so the data keeps its original order: instance #1 followed by instance #2 and so on, with the first part used for training and the remainder for testing). We will use cross-validation in our example.
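The three ways of splitting the data can be sketched in a few lines. This is only an illustration of the idea, not Weka's own code: the function names, the seed handling and the way folds are dealt out are assumptions, and "order preserved" is taken to mean that the data is cut at a fixed position without shuffling.

```python
import random

def cross_validation_folds(n_instances, k, seed=1):
    """Yield (train, test) index lists; each instance is tested exactly once."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # deal instances into k folds
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]

def random_split(n_instances, train_fraction, seed=1):
    """Shuffle the instances, then cut them into a training and a test part."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    cut = int(n_instances * train_fraction)
    return indices[:cut], indices[cut:]

def order_preserved_split(n_instances, train_fraction):
    """Cut the instances in their original order, without shuffling."""
    indices = list(range(n_instances))
    cut = int(n_instances * train_fraction)
    return indices[:cut], indices[cut:]

print(order_preserved_split(10, 0.66))
# ([0, 1, 2, 3, 4, 5], [6, 7, 8, 9])
for train, test in cross_validation_folds(10, 5):
    print(sorted(test), "held out for testing")
```

In cross-validation every instance is used for testing exactly once, which is why it is a common choice for small datasets.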

Iteration Control:

For the randomized experiment types, the user has the option of randomizing the data again and repeating the experiment. The ‘Number of Repetitions’ value controls how many times this takes place.
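To see how repetitions and folds combine, the sketch below reshuffles the data once per repetition and records one entry per fold. It is an illustration only; using the repetition number as the random seed is an assumption made here so that each repetition gets a different, but reproducible, shuffle.

```python
import random

def repeated_cross_validation(n_instances, folds, repetitions):
    """Return one (run, fold, test_indices) record per fold of each repetition."""
    records = []
    for run in range(1, repetitions + 1):
        indices = list(range(n_instances))
        random.Random(run).shuffle(indices)  # assumed: seed = repetition number
        for fold in range(folds):
            records.append((run, fold + 1, sorted(indices[fold::folds])))
    return records

records = repeated_cross_validation(n_instances=20, folds=10, repetitions=10)
print(len(records))  # 100: 10 repetitions x 10 folds
```

Each algorithm is therefore trained and tested repetitions x folds times on every dataset, which is worth keeping in mind when estimating how long an experiment will run.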

 

Add data set(s):

In this section, the user adds the datasets that will be used in the experiment. Only ARFF files can be used here and, as mentioned before, the Experimenter expects a fully prepared and cleaned dataset. There is no option for choosing the classification variable here; the last attribute is always used as the class attribute. In our example, the birth-weight dataset is the only one we will use.
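Because the last attribute is always treated as the class, it is worth checking the attribute order of an ARFF file before adding it. The sketch below reads the attribute names from an ARFF header and reports the last one. The attribute names and data rows are hypothetical and are not the real birth-weight data; the parser is deliberately simple and does not handle quoted attribute names that contain spaces.

```python
# Hypothetical ARFF content, for illustration only.
ARFF_TEXT = """\
@relation birth-weight-example
@attribute mother_age numeric
@attribute smoker {yes,no}
@attribute low_birth_weight {yes,no}
@data
23,no,no
31,yes,yes
"""

def attribute_names(arff_text):
    """Collect attribute names from the header, stopping at the @data line."""
    names = []
    for line in arff_text.splitlines():
        parts = line.split()
        if not parts:
            continue
        keyword = parts[0].lower()  # ARFF keywords are case-insensitive
        if keyword == "@attribute":
            names.append(parts[1])
        elif keyword == "@data":
            break
    return names

names = attribute_names(ARFF_TEXT)
class_attribute = names[-1]
print("Class attribute:", class_attribute)
# Class attribute: low_birth_weight
```

If the intended class is not the last attribute, the file needs to be reordered (for example in the Explorer's preprocessing step) before it is used in the Experimenter.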

 

Add Algorithms:

In this section, the user adds the classification algorithms to be employed in the experiment. The procedure here to select an algorithm and choose its options is exactly the same as in the Explorer. The difference here is that more than one algorithm can be specified.

 

Algorithms are added by clicking on the ‘Add’ button in the Algorithms section of the window. This pops up a window in which the user selects the algorithm. This window also displays the available options for the selected algorithm.

 

The first time the window is displayed, the ZeroR rule algorithm will be selected. This is shown in the picture above. The user can select a different algorithm by clicking on the ‘Choose’ button. Clicking on the ‘More’ button will display help about the selected algorithm and a description of its available options. For our example, we will add the J48 algorithm with the option for binary splits turned on, and the REPTree algorithm. Individual algorithms can be edited or deleted by selecting the algorithm in the list and then clicking on the ‘Edit’ or ‘Delete’ button. Finally, any extra notes or comments about the experiment setup can be added by clicking on the ‘Notes’ button at the bottom of the window and entering the information in the window provided.

Saving the Experiment Setup:

At this point, we have entered all the options needed to start our experiment. We will now save the experiment setup so that we do not have to re-enter this information. This is done by clicking on the ‘Save Options’ button at the bottom of the window. These settings can be loaded at another time if one wishes to redo or modify the experiment.