Next: RowFunctionView Up: Views Previous: Views   Contents

FunctionView

The FunctionView transforms a dataset by applying a unary function to every value in it. A unary function takes in a single real number and produces a single real number. You can use many of the functions in the Python math module or define your own with lambda functions.
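To make the idea of an element-wise unary function concrete, here is a minimal sketch in plain Python. It is not how FunctionView is implemented; the helper name apply_unary and the sample data (the 3 by 3 dataset implied by the outputs later in this section) are assumptions made only for illustration.

```python
import math

def apply_unary(rows, func):
    """Return new rows with func applied to every element (hypothetical helper)."""
    return [[func(value) for value in row] for row in rows]

# Assumed sample data: rows of 1s, 2s and 3s, as implied by the outputs in this section.
sample = [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0], [3.0, 3.0, 3.0]]

# A function from the math module and a user-defined lambda both qualify,
# because each maps one real number to one real number.
logged = apply_unary(sample, math.log)
squared = apply_unary(sample, lambda x: x * x)
```

Any callable with this one-number-in, one-number-out shape should be usable in the same way.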

For our first example, let's take our dataset and create a FunctionView which log-transforms the data. To do this we need the log function, which lives in the Python math module, so we import that module first.

>>> import math

Now let's test to see that the log function is there and working correctly.

>>> math.log(1)
0.0

Looks good! Now we can use this function to build our FunctionView.

>>> func_view = FunctionView(ds, math.log)

And that's all that needs to be done. In general, creating any view is a one-line operation. Let's inspect this function view and use some of the methods from the Dataset class on it as well to make sure it behaves as expected.

>>> func_view.setName('Log-Transformed Data')
>>> func_view.getName()
'Log-Transformed Data'
>>> func_view
FunctionView: Log-Transformed Data, 3 by 3
>>> func_view.writeDataset()
0.0     0.0     0.0
0.69314718056   0.69314718056   0.69314718056
1.09861228867   1.09861228867   1.09861228867
>>> func_view.getNumRows()  
3
>>> func_view.getNumCols()
3
>>> func_view.getRowData(1)
[ 0.69314718, 0.69314718, 0.69314718,]
>>> func_view.getColData(1)
[ 0.        , 0.69314718, 1.09861229,]

Everything looks to be working as it should, so let's try something a bit different. The log-transformed view we created takes the natural log of the values, but what if we wanted the base-2 log of the data? This is simple to do with a lambda function.

Remember the basic identity $\log_a b = \frac{\ln b}{\ln a}$. If we create a lambda function which computes this with $a = 2$ and pass it as the argument to the FunctionView constructor, we will have our log-base-2 view. Let's try.

 
>>> log2_view = FunctionView(ds, lambda x: math.log(x) / math.log(2))
>>> log2_view.getData()
[[ 0.       , 0.       , 0.       ,]
 [ 1.       , 1.       , 1.       ,]
 [ 1.5849625, 1.5849625, 1.5849625,]]
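The change-of-base identity can also be checked outside of any view. The sketch below compares the identity against math.log2 from the standard library (available in Python 3.3 and later); the helper name log_base is an assumption for illustration.

```python
import math

def log_base(x, base):
    """Logarithm of x in the given base, via log_a(b) = ln(b) / ln(a)."""
    return math.log(x) / math.log(base)

values = [1.0, 2.0, 3.0]
via_identity = [log_base(v, 2) for v in values]
via_builtin = [math.log2(v) for v in values]

# Compare with a tolerance rather than ==, since the two routes may round differently.
agree = all(math.isclose(a, b, abs_tol=1e-12)
            for a, b in zip(via_identity, via_builtin))
```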

Perfect. Now, for one last example, let's attach another FunctionView to the first one we created. This time, we'll use the exponential unary function and see if we can get our original dataset back.

>>> exp_view = FunctionView(func_view, math.exp)
>>> exp_view.getData()
[[ 1., 1., 1.,]
 [ 2., 2., 2.,]
 [ 3., 3., 3.,]]
>>> exp_view.getData() == ds.getData()
[[1,1,1,]
 [1,1,1,]
 [0,0,0,]]

As you can see, even though it appears that we reconstructed the original dataset, an element-wise equality comparison shows that the third row is not exactly equal. This is not a problem specific to Python or the MLX schema, but a generic limitation of floating-point arithmetic on any computer system: the results of log and exp are rounded, so applying one after the other does not always return the exact starting value. It is good to be reminded of the limitations of floating point from time to time.
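A common way around this is to compare with a tolerance instead of exact equality. The following is a minimal sketch using plain lists and math.isclose (Python 3.5 and later); the helper rows_close and the tolerance of 1e-9 are assumptions, not part of the schema.

```python
import math

# Assumed sample data, matching the dataset implied by the outputs above.
original = [[1.0] * 3, [2.0] * 3, [3.0] * 3]

# Round trip through log and exp, as the chained views do.
round_trip = [[math.exp(math.log(v)) for v in row] for row in original]

def rows_close(a, b, rel_tol=1e-9):
    """Element-wise comparison that tolerates small rounding differences."""
    return [[math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]

close = rows_close(round_trip, original)
all_close = all(all(row) for row in close)
```

The right tolerance depends on the data and on how many operations are chained, so treat the value here as a starting point only.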

The last thing to do before moving on is to remove the views we have created. Python uses reference counting for its garbage collection, and the internal structure of the schema is complicated enough that circular references are introduced. This means that the links between objects must be explicitly broken before the objects can be reclaimed by the garbage collector. Fortunately, the Dataset class and, by extension, the View classes offer a removeView() method, which takes an instance of a View object and removes it. The View passed in must be a child of the object on which the method is invoked. With this knowledge, let's clean up our views.

>>> func_view.getViews()
[FunctionView: None, 3 by 3]
>>> func_view.removeView(exp_view)
>>> func_view.getViews()
[]
>>> ds.getViews()
[FunctionView: Log-Transformed Data, 3 by 3, 
 FunctionView: None, 3 by 3]
>>> ds.removeView(func_view)
>>> ds.removeView(log2_view)
>>> ds.getViews()
[]

Notice that we've also used the getViews() method, which returns a list of the views currently attached to a dataset. Related to this is the getView() method, which retrieves a view by name. We'll look at this method in the next section.
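To see why the links had to be broken explicitly, consider the sketch below. It uses a hypothetical Node class with parent and child links, not the schema's own classes, and it relies on CPython behaviour. The optional cycle collector in the gc module is switched off so that only reference counting is at work, which is the situation described above.

```python
import gc
import weakref

class Node:
    """Hypothetical stand-in for a dataset or view with parent/child links."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)  # parent -> child and child -> parent: a cycle

    def remove_child(self, child):
        """Break both directions of the link, in the spirit of removeView()."""
        self.children.remove(child)
        child.parent = None

def reclaimed(break_link):
    """Return True if the child is freed by reference counting alone."""
    was_enabled = gc.isenabled()
    gc.disable()  # leave only reference counting active
    try:
        parent = Node("dataset")
        child = Node("view", parent)
        probe = weakref.ref(child)  # observes the child without keeping it alive
        if break_link:
            parent.remove_child(child)
        del parent, child
        return probe() is None
    finally:
        if was_enabled:
            gc.enable()
        gc.collect()  # clean up any cycle left behind by this demonstration
```

Calling reclaimed(True) breaks the link first, so the objects are freed as soon as the last name is deleted; with reclaimed(False) the two objects keep each other alive until a cycle collection runs.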


Lucas Scharenbroich 2003-08-27