Week 4: Two Step Estimator

- June 23, 2021

This week I implemented the 2-Step estimator from the paper Nonparametric Conditional Density Estimation [1] and compared its results with the 1-Step estimator from the same paper, which I had implemented the previous week. I also found a shortcut for calculating E(Y|X=x) that is both more accurate and much faster than the existing ProbSpace method.

Monday:

Studied the 2-Step estimator method from the Nonparametric Conditional Density Estimation paper [1]. Had a discussion with Roger sir regarding future goals and objectives. Apart from optimizing the RKHS method further using the Dual-Tree method, we have to integrate the RKHS method into the ProbSpace module.

Implemented an optimization in the RKHS calculation phase that skips calculations with very little effect on the answer: a data point is left out when it lies more than a fixed bound away from the query point. This reduced the calculation time significantly:

Method	Average Error	Max Error	Time taken
“Or Optimization”, bound = 3*sigma	1.3968811031293926	5.431449902740937	153.31915044784546
“And Optimization”, bound = 3*sigma	1.561447953266133	5.426098406537719	38.010313987731934
No Optimization (for reference)	1.3966743172847627	5.431449933619589	719.6718242168427
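To make the difference between the two filters concrete, here is a minimal sketch of how such a bound check might sit inside a ratio-of-kernel-sums estimate of P(Y=y|X=x). It is not the actual ProbSpace/RKHS code: the function names, the Gaussian kernel, and the use of the same sigma in both dimensions are all assumptions made for illustration.

```python
import math

def gaussian(u, sigma):
    """Gaussian kernel with standard deviation sigma, evaluated at offset u."""
    return math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def keep_point(xi, yi, x, y, x_bound, y_bound, mode):
    """Decide whether a sample contributes to the estimate at (x, y)."""
    near_x = abs(xi - x) <= x_bound
    near_y = abs(yi - y) <= y_bound
    if mode == "and":
        return near_x and near_y   # must be close in both dimensions
    if mode == "or":
        return near_x or near_y    # close in at least one dimension
    return True                    # mode == "none": no filtering

def one_step_density(xs, ys, x, y, sigma, mode="none"):
    """Ratio-of-kernel-sums estimate of P(Y=y | X=x) over the kept samples."""
    bound = 3 * sigma  # hypothetical helper uses the 3*sigma bound quoted above
    num = den = 0.0
    for xi, yi in zip(xs, ys):
        if not keep_point(xi, yi, x, y, bound, bound, mode):
            continue
        kx = gaussian(xi - x, sigma)
        num += kx * gaussian(yi - y, sigma)
        den += kx
    return num / den if den > 0 else 0.0
```

In this sketch the "and" filter also drops samples that are close in x but far in y, so the denominator loses weight it would otherwise have and the estimate shifts. Whether the real implementation behaves the same way depends on where its filter is applied, which the sketch does not claim to reproduce.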

Tuesday & Wednesday:

Implemented the 2-Step estimator. Here is a comparison of the 2-Step and 1-Step methods. (Note: we are using the “and optimization” for filtering, i.e. the condition abs(r1.X[i] - x) <= r1bound and abs(r2.X[i] - y) <= r2bound.)

Method	Average Error	Max Error	Time taken
1 Step Estimator	1.561447953266133	5.426098406537719	27.177362203598022
2 Step Estimator	1.4623417529653346	6.610000000001165	133.49220895767212

As seen, the 2-Step estimator does produce a lower average error, but at the cost of time and max error. At this point I tried to compute the same graphs with the “or optimization” (the same condition, but with or in place of and), but that takes too much time in the case of the 2-Step estimator. So I settled on using the “or optimization” for 1-Step and the “and optimization” for 2-Step.

Method	Average Error	Max Error	Time taken
1-Step (OR)	1.3968811031293926	5.431449902740937	113.14020037651062
2-Step (AND)	1.4623417529653346	6.610000000001165	131.32312035560608

As seen, 1-Step with the OR optimization beats 2-Step in every regard while also taking less time to calculate. The saving grace for the 2-Step method seems to be that it has the potential to reach a higher level of accuracy with some modifications to the optimization condition; however, the time taken makes it difficult to consider.
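For readers unfamiliar with the two-step idea, the general structure as I understand it from [1] is: first estimate the conditional mean m(x) = E(Y|X=x), then estimate the conditional density of the residuals e = Y - m(X) and evaluate it at y - m(x). The following is only a rough sketch of that structure. The function names, the Gaussian kernels, and the two bandwidths sigma_x and sigma_e are assumptions, and it applies no bound filtering.

```python
import math

def gaussian(u, sigma):
    """Gaussian kernel with standard deviation sigma, evaluated at offset u."""
    return math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def cond_mean(xs, ys, x, sigma):
    """Step 1: kernel-weighted average of Y around X = x."""
    weights = [gaussian(xi - x, sigma) for xi in xs]
    total = sum(weights)
    if total <= 0:
        return 0.0
    return sum(w * yi for w, yi in zip(weights, ys)) / total

def two_step_density(xs, ys, x, y, sigma_x, sigma_e):
    """Step 2: ratio-of-kernel-sums estimate on residuals e_i = y_i - m(x_i)."""
    # One conditional-mean estimate per sample: this is the expensive part.
    residuals = [yi - cond_mean(xs, ys, xi, sigma_x) for xi, yi in zip(xs, ys)]
    target = y - cond_mean(xs, ys, x, sigma_x)
    num = den = 0.0
    for xi, ei in zip(xs, residuals):
        kx = gaussian(xi - x, sigma_x)
        num += kx * gaussian(ei - target, sigma_e)
        den += kx
    return num / den if den > 0 else 0.0
```

Written this way, every query recomputes a conditional mean for every sample, so the sketch is quadratic in the sample size per query unless the residuals are cached. That extra pass is one plausible reason a two-step estimate costs more time than a one-step one.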

Thursday:

Following a suggestion from Roger sir, we discovered a shortcut (it was part of the 2-Step estimator process) for calculating E(Y|X=x) without going through the full procedure of calculating P(Y=y|X=x) as we were doing before. This yielded amazing results: the highest accuracy we’ve seen so far, and it’s even faster than the ProbSpace method!

Method	Average Error	Max Error	Time taken
ProbSpace	10.97397258530662	74.3476459734108	7.661393880844116
RKHS Sigma = 0.24	1.2084530592994442	19.74293374241948	2.8335468769073486
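The post does not spell out the shortcut itself. Since it comes from the first stage of the 2-Step process, it is presumably a direct kernel-weighted average of the Y values (the Nadaraya-Watson form), which needs a single pass over the data instead of a density estimate for every candidate y. A minimal sketch under that assumption, with a hypothetical function name and a Gaussian kernel:

```python
import math

def expected_y_given_x(xs, ys, x, sigma=0.24):
    """Kernel-weighted average of ys, weighted by closeness of xs to x.

    The default sigma only echoes the value in the table above; the real
    implementation may choose its bandwidth differently.
    """
    num = den = 0.0
    for xi, yi in zip(xs, ys):
        w = math.exp(-0.5 * ((xi - x) / sigma) ** 2)  # unnormalized Gaussian weight
        num += w * yi
        den += w
    # The kernel's normalizing constant cancels in the ratio, so it is omitted.
    return num / den if den > 0 else float("nan")
```

For example, with samples of y = x^2 on a fine grid, expected_y_given_x(xs, ys, 1.0) comes out close to 1, slightly above it because the kernel averages over neighbouring x values.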

Friday:

Started studying the Fast Nonparametric Conditional Density Estimation paper [2] in order to implement the Dual-Tree optimization technique.

Upcoming Week Plans:

Explore the Dual-Tree approximation from the Fast Nonparametric Conditional Density Estimation paper.

References:

[1] Hansen, Bruce E. (2004) "Nonparametric Conditional Density Estimation", University of Wisconsin Department of Economics

[2] Holmes, Michael P.; Gray, Alexander G.; Isbell, Charles Lee Jr. (2007) "Fast Nonparametric Conditional Density Estimation", URL: https://arxiv.org/ftp/arxiv/papers/1206/1206.5278.pdf
