Week 7: Getting started with Filter-RKHS
This week I revisited the RKHS method to confirm that we are on the right track and got started on the implementation of the Filter-RKHS method. Here is a day-wise summary of my progress throughout the week:
Monday & Tuesday:
Analysed the RFF code to see whether it could reduce the time taken for calculations. Implemented an RFF calculation method according to the formula:

k(x, y) ≈ z(x)ᵀ z(y),  where  z(x) = √(2/R) · [cos(ω₁ᵀx + b₁), …, cos(ω_Rᵀx + b_R)]ᵀ,  ω_r ~ p(ω),  b_r ~ Uniform[0, 2π]
As described in this article, z(x)ᵀ z(y) is the approximate equivalent of k(x, y), the basic kernel function. The kernel techniques we employ use an iterative approach instead of being performed through matrix operations; as a result, they call the k(x, y) function repeatedly over the length of the dataset to produce results. I could not figure out how replacing the RKHS kernel calculations with RFF calculations would speed things up. If anything, it should be slower, because while k(x, y) is a straightforward calculation, the RFF method iterates over R randomly selected features, as the formula above indicates. I must have gone wrong in my calculations somewhere or misunderstood some concepts, so I will be revisiting this method after discussing it with Roger Sir.
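For reference, here is a minimal, self-contained sketch of the RFF approximation described above. This is a toy version, not the project code; the function names and parameters are my own, and it assumes an RBF kernel (whose matching spectral distribution is Gaussian):

```python
import numpy as np

def rbf_kernel(x, y, gamma=1.0):
    # Exact RBF kernel: k(x, y) = exp(-gamma * ||x - y||^2)
    return np.exp(-gamma * np.sum((x - y) ** 2))

def make_rff(d, R, gamma=1.0, seed=0):
    # Sample R random frequencies w_r ~ N(0, 2*gamma*I) and phases b_r ~ U[0, 2*pi];
    # this is the spectral distribution that matches the RBF kernel above.
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, np.sqrt(2.0 * gamma), size=(R, d))
    b = rng.uniform(0.0, 2.0 * np.pi, size=R)

    def z(x):
        # z(x) = sqrt(2/R) * cos(W @ x + b), so that z(x) . z(y) ~= k(x, y)
        return np.sqrt(2.0 / R) * np.cos(W @ x + b)

    return z

z = make_rff(d=1, R=2000, gamma=0.5, seed=42)
x, y = np.array([0.3]), np.array([1.1])
print(rbf_kernel(x, y, gamma=0.5))  # exact kernel value
print(z(x) @ z(y))                  # RFF approximation (improves as R grows)
```

For what it's worth, the speedup usually cited for RFF in the literature comes from computing z(x) once per data point and reusing it, so each pair evaluation becomes an O(R) dot product instead of a fresh kernel call; whether that actually helps in our iterative setup is exactly the question to take up with Roger Sir.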
Wednesday:
To reinforce that the RKHS method we have been using so far is reliable, I tested it with models other than the Y = X² model we have been using for weeks now. Here are some of those results (a sketch of the kind of test harness appears after the figures):
[Figures: RKHS results for the models Y = sinh(X), Y = exp(X), Y = tanh(X), and Y = tanh(X) + X²]
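To illustrate the kind of test being run, here is a minimal sketch. It uses a standard Gaussian-kernel Nadaraya-Watson smoother as a stand-in for our actual RKHS routine (which is not reproduced here); the names, sample sizes, and noise level are all my own assumptions:

```python
import numpy as np

def nw_estimate(x0, X, Y, sigma=0.2):
    # Gaussian-kernel weighted average of Y around X = x0
    # (an RKHS-style smoother, standing in for our actual routine).
    w = np.exp(-0.5 * ((X - x0) / sigma) ** 2)
    return np.sum(w * Y) / np.sum(w)

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, 5000)
models = {
    "sinh(X)":       np.sinh,
    "exp(X)":        np.exp,
    "tanh(X)":       np.tanh,
    "tanh(X) + X^2": lambda x: np.tanh(x) + x ** 2,
}
grid = np.linspace(-2.0, 2.0, 9)
for name, f in models.items():
    Y = f(X) + rng.normal(0.0, 0.1, X.size)           # noisy samples of the model
    est = np.array([nw_estimate(x0, X, Y) for x0 in grid])
    err = np.mean(np.abs(est - f(grid)))              # average error vs. ground truth
    print(f"Y = {name:>13}: average |error| = {err:.4f}")
```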
Thursday & Friday:
The ProbSpace Z|Y, X=0 curve was produced by first filtering the data points on X = 0 through the ProbSpace method and then applying the RKHS method to the filtered points to calculate P(Z|Y).
As seen, the curve resembles a tanh() curve, which is exactly what Z looks like once the influence of X has been removed.
Similarly, in this curve we are calculating P(Z|X, Y=0), i.e., we filter based on the value of Y and then calculate P(Z|X). The obtained curve resembles a sin() curve, which is nothing but Z with the influence of Y removed.
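To make the filter-then-RKHS pipeline concrete, here is a minimal sketch. The ground-truth model Z = tanh(Y) + sin(X) + noise is my own assumption, inferred from the curve shapes described above, and the Gaussian smoother again stands in for our actual RKHS routine:

```python
import numpy as np

def nw_estimate(t0, T, Z, sigma=0.2):
    # Gaussian-kernel weighted average of Z around T = t0 (RKHS stand-in).
    w = np.exp(-0.5 * ((T - t0) / sigma) ** 2)
    return np.sum(w * Z) / np.sum(w)

rng = np.random.default_rng(1)
N = 20000
X = rng.normal(0.0, 1.0, N)
Y = rng.normal(0.0, 1.0, N)
Z = np.tanh(Y) + np.sin(X) + rng.normal(0.0, 0.1, N)  # assumed ground truth

# Step 1 (filter): keep points with X near 0, i.e. condition on X = 0.
tol = 0.1
mask = np.abs(X) < tol
Yf, Zf = Y[mask], Z[mask]
print(f"{mask.sum()} of {N} points survive the X = 0 filter")

# Step 2 (RKHS): estimate E[Z | Y] on the filtered subset.
grid = np.linspace(-2.0, 2.0, 9)
est = np.array([nw_estimate(y0, Yf, Zf) for y0 in grid])
print(np.round(est - np.tanh(grid), 3))  # residual vs. the expected tanh curve
```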
These results are excellent. However, the obtained curves are only this accurate because of the high availability of data points: since X = 0 and Y = 0 lie near the mean of the dataset, we were able to filter a relatively large number of data points in the close vicinity of 0. As we condition on points farther from the mean, say X = 5 or Y = 3, fewer data points are available even when the σ value is increased, and the results suffer accordingly. The graphs and the Average Error table are shown below:
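As a rough back-of-the-envelope check of this sparsity point (assuming, hypothetically, that X is standard normal and the dataset has 20,000 points, neither of which is confirmed above), one can count how many points are expected to survive the filter window at different conditioning values:

```python
import math

N, tol = 20000, 0.1

def norm_cdf(t):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

for c in [0.0, 3.0, 5.0]:
    frac = norm_cdf(c + tol) - norm_cdf(c - tol)  # P(|X - c| < tol) for X ~ N(0, 1)
    print(f"conditioning at X = {c}: ~{N * frac:.2f} points expected in the filter window")
```

Under these assumptions, roughly 1,600 points fall within ±0.1 of X = 0, about 18 within ±0.1 of X = 3, and essentially none near X = 5, which matches the degradation observed above.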