Graphical Outputs
Explorer also has built-in graphical visualization tools that can greatly assist in analyzing the data. Five such tools are available in Explorer, each serving either one specific purpose or several.
As the picture on the left shows, these tools are accessed by clicking the right mouse button on a specific result from the result list and then from the menu selecting the desired visualization tool.
Except for the Visualize Tree tool, all the other tools share a common interface. Whether visualizing classifier errors, margin, threshold or cost curves, the interface is the same. In all these tools, Weka generates a two-dimensional graph. In each graph, the user can choose from a list of variables to show as the x-axis or y-axis. What changes from tool to tool are the choices for the X and Y axes and the color options. We will first describe this interface in general and then elaborate on each tool.
Points on the graph are known as instances. In the classifier errors visualization tool, the word instance means the same as before: an observation in the data. In the other tools, however, the word means something different. For example, when viewing threshold (ROC) curves, the points (instances) actually refer to different threshold levels. Please refer to Reference Sections 3.5.1-3.5.5 for a full size picture of each of these visualization tools.
The picture on the left is the top part
of the interface. The X dropdown at the top is used to select what the x-axis
of the graph will be. The Y dropdown performs a similar function for the
y-axis.
The Color dropdown is used to select which attribute will be used to color the dots on the graph. Colors are assigned according to the levels of the attribute selected for Color. If the selected attribute is numerical, the colors are taken from a color gradient, with the minimum and maximum values of the attribute given different colors. Points with values in between are given a mix of the two colors depending on how close they are to either extreme.
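The gradient coloring described above can be pictured as a linear blend between two end colors. The following is only a minimal sketch: the function name `gradient_color`, the blue and red end colors, and the linear blending rule are assumptions made for illustration, not Weka's actual implementation.

```python
def gradient_color(value, vmin, vmax, low_rgb=(0, 0, 255), high_rgb=(255, 0, 0)):
    """Blend two RGB colors according to where value sits between vmin and vmax.

    The end colors and the linear blend are illustrative assumptions.
    """
    if vmax == vmin:
        # All values are identical, so there is nothing to blend.
        return low_rgb
    # Fraction of the way from the minimum (0.0) to the maximum (1.0).
    t = (value - vmin) / (vmax - vmin)
    t = min(1.0, max(0.0, t))
    # Mix each color channel in proportion to t.
    return tuple(round(lo + t * (hi - lo)) for lo, hi in zip(low_rgb, high_rgb))
```

A value at the minimum gets the first color, a value at the maximum gets the second, and anything in between gets a proportional mix of the two.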
The Select dropdown works with the three buttons (Reset, Clear, Save). The default is the Select Instance mode. In this mode, clicking on a point will pop up a window that gives detailed information on that point (or points, if more than one lies in the same place on the graph).

The other modes are used to select a certain area of the graph. The options are to select a rectangle, or the more general polygon (left) and polyline (right). Once the area has been selected (area in grey), the user can click the submit button and the graph zooms in on this area alone. The clear button is used to clear the selection area and the reset button sets the graph back to its original state. This tool can be used to select a subset of instances visually. This subset of instances can be saved into a separate data file by clicking on the save button.
The Jitter slider on the top right is used to scatter points randomly around their original spot on the graph. This is useful to spot areas of concentration on the graph where many points may lie on top of each other.

The picture above is from the middle right of the visualization window. It has been enlarged to make it easier to see. The use of this is to assist in selecting attributes for the X and Y-axes. The points on each strip show how the instances are scattered for different values of the attribute.
Each attribute occupies one strip. Each strip is a one-dimensional mini-graph with the instances scattered randomly in the vertical direction so that it is easy to see concentrations of them. The last strip, for example, is the class variable weight. Since this variable has two levels, we see two areas of concentration. The points would normally lie on top of each other, but they have been scattered vertically so that concentrations of points are visible.
The coloring of the instances is the same as the graph and is controlled with the same Color dropdown. Left clicking on a strip will set the x-axis of the graph to be the attribute the strip belongs to. Right clicking on the strip will do the same for the y-axis. The strip will then show an X or Y to indicate this. If both X and Y are set on the same attribute the strip will indicate this by showing a ‘B’ on the strip the attribute belongs to.
At the bottom of the window is a color legend showing what the different colors on the graph signify. The picture on the right is from the birth weight data when visualizing classifier errors. The class variable was weight, and its two levels are color coded accordingly. The color for a given level can be changed by clicking on it, which pops up a color selection window. The background color, however, cannot be changed.
Graphical Outputs: Visualize Classifier Errors
This visualization tool is used to illustrate the errors the model generated. Using this tool, the user can try to see if there was something in common or peculiar about the misclassified instances.

The graphs one can view with this tool help in understanding, visually, the kind of errors the model made. They can show how these errors are related to the different attributes. This kind of information helps when trying to decide what to do with the misclassified cases. It can be used to see if the misclassified cases had something in common, or if they were particularly concentrated around certain values of an attribute.
The graph on the left was generated from the low birth weight data using the J48 algorithm to build the model. The x-axis is ftv and the y-axis is age. Each point in this graph represents an observation in the data. Since in our example there are 180 instances in the data file, there are 180 points on the graph. In this tool, the color variable is normally set to the class variable (weight). The low weight instances are red and the normal weight instances are light blue. A correctly predicted instance is depicted as an x point and an incorrectly classified instance is depicted as a square point. When the window first opens, X and Y are both set to instance number, but the user can choose any of the attributes in the data as the X or Y-axis. The user can also choose the class variable, the predicted class variable and the instance number for either axis.
When the Select dropdown is in instance mode (the default), clicking on a point will open a window that gives detailed information on the instances that lie on that point. For each instance, its list of attributes and its predicted and actual class are shown. The other modes of the Select dropdown are used for data selection, and work in the same way as described earlier.
Graphical Outputs: Visualize Margin Curve
This visualization tool is used to illustrate the
prediction margin. The prediction margin is the difference between the
predicted probability for the actual class and the highest probability
predicted for the other classes.
In our birth weight example, there were 180 tests performed, i.e. 18 tests for each of the 10 models created with the cross-validation (180 points in the graph).
For each instance then, the margin is the difference between the probability predicted for the actual class and the probability predicted for the next most likely class. If for example one low birth weight instance reached a leaf node where 70% of the cases were normal birth weight the margin would be calculated as follows. The actual class of this instance is ‘low’ and in this node, it is expected to occur 30% of the time. The next highest predicted class is ‘normal’ and this is expected to occur 70% of the time. The margin value for this instance would then be 0.30 – 0.70 = -0.40.
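The margin calculation above can be sketched in a few lines. This is a minimal illustration, assuming the predicted class probabilities are available as a dictionary; the function name `prediction_margin` and the data layout are hypothetical and not part of Weka.

```python
def prediction_margin(class_probs, actual):
    """Margin = probability predicted for the actual class
    minus the highest probability predicted for any other class."""
    p_actual = class_probs[actual]
    p_best_other = max(p for label, p in class_probs.items() if label != actual)
    return p_actual - p_best_other


# The leaf node from the example: 30% low, 70% normal, actual class 'low'.
margin = prediction_margin({"low": 0.30, "normal": 0.70}, "low")
```

A negative margin means the model gave more probability to a wrong class than to the actual one, so the instance is misclassified.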
For each test instance then, the margin is calculated, and the instances are sorted from lowest margin to highest. This is shown in the graph: the lowest margin corresponded to a difference of about -0.3, and the highest prediction margin was 0.42. In essence, this graph runs from the largest mistakes the model made to its most confident predictions (i.e. predictions which had the highest probability of success).
There are four variables defined here (margin, current, cumulative and instance number) that can be used to construct the margin curve. The margin attribute contains the prediction margin value and current contains the number of instances with a specific margin value. Lastly, cumulative contains the number of instances with a margin less than or equal to the current margin. Since the instances are sorted from lowest margin to highest margin this means that instance number and cumulative show the same information. The default coloring-variable is the margin value.
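As a rough illustration of how the margin, current and cumulative variables relate to each other, the sketch below builds them from a list of margins. The function name `margin_curve`, the row layout, and the choice of one row per distinct margin value are assumptions made for brevity; they are not taken from Weka's own code.

```python
from collections import Counter


def margin_curve(margins):
    """Return one row per distinct margin value, sorted from lowest to highest."""
    counts = Counter(margins)
    rows = []
    cumulative = 0
    for m in sorted(counts):
        # current: number of instances with exactly this margin.
        current = counts[m]
        # cumulative: number of instances with a margin <= this margin.
        cumulative += current
        rows.append({"margin": m, "current": current, "cumulative": cumulative})
    return rows


rows = margin_curve([-0.4, 0.2, 0.2, 0.4])
```

In the last row, cumulative equals the total number of test instances, since every margin is less than or equal to the highest one.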
Graphical Outputs: Visualize Threshold Curve
The threshold value is the minimum probability required to
classify an instance. For example, if the threshold value for low birth weight prediction
is 0.5 then an instance must have a predicted probability of at least 0.5 for
it to be classified as low.
This visualization tool is used to illustrate the prediction tradeoffs under different threshold levels. For each class level, points on this graph are generated by varying the threshold value.
Weka generates a number of threshold levels from zero to one and for each calculates the following performance values: True Positives, False Negatives, False Positives, True Negatives, False Positive Rate, True Positive Rate, Precision, Recall, Fallout, F-Measure and Threshold.
Each variable can be used for the x or y-axis, and the color variable is threshold. Using this graph, it is then possible to do ROC curve analysis (True Positive Rate vs. False Positive Rate) or to visualize the precision/recall tradeoff. The area under the ROC curve is shown at the top. Higher numbers here indicate better model performance, as the model is able to reach high true positive rates while keeping false positive rates low.
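To make the performance values concrete, here is a minimal sketch of how the points of a threshold curve could be computed for one class. The function names, the evenly spaced thresholds, and the input layout are assumptions for illustration; Weka's own choice of threshold levels may differ.

```python
def threshold_point(scores, labels, threshold):
    """Confusion counts and derived rates for one threshold level.

    scores: predicted probability of the positive class for each instance.
    labels: True where the instance is actually positive.
    """
    tp = fp = fn = tn = 0
    for score, is_positive in zip(scores, labels):
        # An instance is classified positive if its probability is
        # at least the threshold.
        predicted_positive = score >= threshold
        if predicted_positive and is_positive:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif is_positive:
            fn += 1
        else:
            tn += 1
    tpr = tp / (tp + fn) if tp + fn else 0.0  # True Positive Rate (Recall)
    fpr = fp / (fp + tn) if fp + tn else 0.0  # False Positive Rate (Fallout)
    precision = tp / (tp + fp) if tp + fp else 0.0
    f_measure = (2 * precision * tpr / (precision + tpr)) if precision + tpr else 0.0
    return {"threshold": threshold, "tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "tpr": tpr, "fpr": fpr, "precision": precision, "f_measure": f_measure}


def threshold_curve(scores, labels, steps=10):
    """One point per threshold level, evenly spaced from 0 to 1 (an assumption)."""
    return [threshold_point(scores, labels, i / steps) for i in range(steps + 1)]
```

Plotting `fpr` against `tpr` over all points gives the ROC curve, while plotting `tpr` (recall) against `precision` shows the precision/recall tradeoff.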
Graphical Outputs: Visualize Cost Curve
Cost curves were introduced by Chris Drummond[1] and Robert C. Holte[2] at the Knowledge Discovery and Data Mining Conference in 2000, where they presented their paper[3] on the method. They propose cost curves as an alternative to ROC analysis and describe several advantages of the method over ROC curves in describing model performance.
This tool is used to visualize cost curves for different threshold levels. The color variable is the threshold value. The two other variables defined are the probability cost function and the normalized expected cost.
The formula for the probability cost function is shown below:
$$PCF(+) = \frac{P(+)\,C(-|+)}{P(+)\,C(-|+) + P(-)\,C(+|-)}$$
· PCF(+) is the probability cost function for positive cases.
· P(+) is the probability of a positive example, and P(-) = 1 - P(+) is the probability of a negative example.
· C(-|+) is the cost of classifying a positive example as negative.
· C(+|-) is the cost of classifying a negative example as positive.
· In the case where misclassification costs are equal, PCF(+) = P(+).
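The probability cost function can be sketched as a small helper, following Drummond and Holte's definition. The function and parameter names are hypothetical: `cost_false_negative` is the cost of classifying a positive example as negative, and `cost_false_positive` is the cost of classifying a negative example as positive.

```python
def probability_cost(p_pos, cost_false_negative, cost_false_positive):
    """PCF(+): the share of the total expected misclassification cost
    that is attributable to positive examples."""
    # Expected cost contributed by positives that are misclassified.
    pos_term = p_pos * cost_false_negative
    # Expected cost contributed by negatives that are misclassified.
    neg_term = (1.0 - p_pos) * cost_false_positive
    return pos_term / (pos_term + neg_term)
```

With equal costs the result reduces to the probability of a positive example. If missing a positive case is made more expensive, PCF(+) moves toward 1.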
The range of values of PCF(+) represents the possible operating points. An operating point refers to a specific combination of misclassification costs and class probabilities. Each point on the ROC curve is represented by a line on the cost curve, and each line connecting two points on the ROC curve corresponds to the point of intersection between their two cost lines. Connecting the lowest of these intersections forms the lower envelope of the cost lines, which corresponds to the ROC convex hull.
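The correspondence between ROC points and cost lines can be sketched as follows. The expression for the normalized expected cost is the one given by Drummond and Holte and is not stated in this text; the function names and the sample ROC points are hypothetical.

```python
def cost_line(fpr, tpr, pcf):
    """Normalized expected cost of one ROC point (fpr, tpr) at operating point pcf.

    The false negative rate (1 - tpr) is weighted by PCF(+) and the
    false positive rate by 1 - PCF(+), which gives a straight line in pcf.
    """
    return (1.0 - tpr) * pcf + fpr * (1.0 - pcf)


def lower_envelope(roc_points, pcf):
    """Lowest cost achievable at pcf by any of the given ROC points."""
    return min(cost_line(fpr, tpr, pcf) for fpr, tpr in roc_points)


# Hypothetical ROC points, including the two end points of the ROC curve.
roc_points = [(0.0, 0.0), (0.2, 0.8), (1.0, 1.0)]
best_at_half = lower_envelope(roc_points, 0.5)
```

Each ROC point contributes a line that starts at its false positive rate when PCF(+) is 0 and ends at its false negative rate when PCF(+) is 1. Evaluating `lower_envelope` over a grid of PCF(+) values traces the bottom edge of all the lines.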
The two trivial classifiers, i.e. the classifier that designates all cases as positive and the one that designates all cases as negative, have cost curves that go from (0,1) to (1,0) and from (0,0) to (1,1) respectively. Model performance is compared against the two trivial classifiers by looking for the range of operating points (values of PCF(+)) where the convex hull is beneath the two diagonal lines. Similarly, when comparing two convex hulls, one can measure the range of operating points where one shape is beneath the other. This shows how cost curves can make it easier to compare different models or to find optimal operating ranges for a given model. At this time, however, there is no documentation in Weka about how these graphs are generated in the system. In addition, the representation is not very clear and could be improved to make these graphs easier to interpret.
Graphical Outputs: Visualize Tree
This tool is used to visualize the generated tree model. The picture on the left is the model generated from the birth weight data. At each of the nodes of the tree, the name of the attribute used to split is shown.
In the picture on the right, the first split (root node) was made on the ptl attribute. The branches indicate what value was used to split the node. In this case, instances with ptl=one went down the left branch and everything else went down the right branch.
At each leaf of the tree, its class designation is shown. It is followed by two numbers in brackets: the first indicates the number of instances in the leaf and the second indicates the number of misclassified cases.
If the SaveInstanceData option was enabled it is then possible to visualize the instances in the node. To do this the user will click on a node or leaf and this will open a window where one can visualize the instances in that node or leaf. This window is the same as the visualize classifier errors tool that was described earlier.