Week 4: Two-Step Estimator
This week I implemented the 2-Step estimator from the paper Nonparametric Conditional Density Estimation [1] and compared it with the 1-Step estimator from the same paper, which I implemented last week. I also found a shortcut for calculating E(Y|X=x) that is both more accurate and much faster than the existing ProbSpace method.
Monday:
Studied the 2-Step estimator from the Nonparametric Conditional Density Estimation paper [1]. Had a discussion with Roger sir about future goals and objectives. Apart from optimizing the RKHS method further using the Dual-Tree method, we have to integrate the RKHS method into the ProbSpace module.
Implemented an optimization in the RKHS calculation phase that skips calculations with very little effect on the answer. This reduced the calculation time significantly:
“Or Optimization”, bound = 3*sigma
Average Error: 1.3968811031293926
Max error: 5.431449902740937
Time: 153.31915044784546
“And Optimization”, bound=3*sigma
Average Error: 1.561447953266133
Max error: 5.426098406537719
Time: 38.010313987731934
For reference:
No Optimization
Average Error: 1.3966743172847627
Max error: 5.431449933619589
Time: 719.6718242168427
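The filtering idea behind these numbers can be sketched roughly as follows. This is a minimal illustration, not the actual RKHS/ProbSpace code: the function names, the Gaussian kernel, and the k parameter are assumptions made for the example. Only the filter conditions (a sample is kept when it lies within 3*sigma of the query point on both axes for “and”, or on at least one axis for “or”) come from the experiment itself.

```python
import math

def gaussian_kernel(u, sigma):
    """Gaussian density with standard deviation sigma, evaluated at u."""
    return math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def joint_kernel_sum(xs, ys, x, y, sigma_x, sigma_y, mode="none", k=3.0):
    """Sum K(x - X_i) * K(y - Y_i) over the samples that pass the filter.

    mode="and":  keep a sample only if it is within the bound on both axes.
    mode="or":   keep a sample if it is within the bound on at least one axis.
    mode="none": keep every sample (no optimization).
    Returns (kernel sum, number of samples actually evaluated).
    All names here are hypothetical; only the filter conditions follow the post.
    """
    bound_x, bound_y = k * sigma_x, k * sigma_y
    total, used = 0.0, 0
    for xi, yi in zip(xs, ys):
        near_x = abs(xi - x) <= bound_x
        near_y = abs(yi - y) <= bound_y
        if mode == "and" and not (near_x and near_y):
            continue  # far away on at least one axis: product of kernels is tiny
        if mode == "or" and not (near_x or near_y):
            continue  # far away on both axes
        total += gaussian_kernel(x - xi, sigma_x) * gaussian_kernel(y - yi, sigma_y)
        used += 1
    return total, used
```

The “and” filter keeps the fewest samples, which fits the timings above: it is the fastest variant, but also the only one whose average error moves noticeably away from the unoptimized run.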
Tuesday & Wednesday:
Implemented the 2-Step estimator and compared it with the 1-Step method. (Note: this comparison uses the “and optimization”, i.e. a point is kept only if abs(r1.X[i] - x) <= r1bound and abs(r2.X[i] - y) <= r2bound.) In that comparison the 2-Step estimator does produce a lower average error, but at the cost of time and max error. I then tried to produce the same graphs with the “or optimization” (the same condition with or in place of and), but that takes too much time in the case of the 2-Step estimator. So I settled on using the “or optimization” for 1-Step and the “and optimization” for 2-Step.
With that setup, 1-Step with the “or optimization” beats 2-Step in all regards while also taking less time to calculate. The saving grace for the 2-Step method seems to be that it has the potential to reach higher accuracy with some modifications to the optimization metric, but the time taken makes it difficult to consider.
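For readers who have not seen the two estimators side by side, here is a rough sketch of the difference. It assumes, as I read [1], that the 2-Step estimator first estimates the conditional mean by kernel regression and then estimates the conditional density of the residuals. The function names, Gaussian kernels, and bandwidth parameters are all hypothetical and this is not the code used for the results above.

```python
import math

def gauss(u, h):
    """Gaussian kernel with bandwidth h."""
    return math.exp(-0.5 * (u / h) ** 2) / (h * math.sqrt(2.0 * math.pi))

def cond_mean(xs, ys, x, h):
    """Kernel-weighted average of ys around x (assumed form of the mean step)."""
    w = [gauss(x - xi, h) for xi in xs]
    return sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)

def one_step_density(xs, ys, x, y, hx, hy):
    """1-Step: weight each sample by its closeness to x, smooth directly in y."""
    w = [gauss(x - xi, hx) for xi in xs]
    return sum(wi * gauss(y - yi, hy) for wi, yi in zip(w, ys)) / sum(w)

def two_step_density(xs, ys, x, y, h_mean, hx, he):
    """2-Step (sketch): remove the estimated mean, then estimate the residual density."""
    # Step 1: residuals e_i = Y_i - m(X_i); one mean estimate per sample, so O(n^2) work.
    resid = [yi - cond_mean(xs, ys, xi, h_mean) for xi, yi in zip(xs, ys)]
    # Step 2: conditional density of the residuals, evaluated at y - m(x).
    e = y - cond_mean(xs, ys, x, h_mean)
    return one_step_density(xs, resid, x, e, hx, he)
```

In this sketch the residual step needs a mean estimate at every sample point, which is one plausible reason the 2-Step method costs more time than the 1-Step method.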
Thursday:
Following a suggestion from Roger sir, we found a shortcut (it was already part of the 2-Step estimator process) for calculating E(Y|X=x) without going through the full procedure of calculating P(Y=y|X=x) as we did before. This yielded excellent results: the highest accuracy we’ve seen so far, and it is even faster than the ProbSpace method!
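To show why such a shortcut can work, here is a small sketch that compares the two routes. It assumes the shortcut is a kernel-weighted average of the observed Y values and that the long route integrates y * P(Y=y|X=x) over a grid of y values. Both functions, their names, and the Gaussian kernel are illustrative assumptions, not the actual implementation.

```python
import math

def gauss(u, h):
    """Gaussian kernel with bandwidth h."""
    return math.exp(-0.5 * (u / h) ** 2) / (h * math.sqrt(2.0 * math.pi))

def expectation_via_density(xs, ys, x, hx, hy, y_lo, y_hi, steps=2000):
    """Long route: evaluate p(y | x) on a grid, then integrate y * p(y | x)."""
    w = [gauss(x - xi, hx) for xi in xs]
    total_w = sum(w)
    dy = (y_hi - y_lo) / steps
    acc = 0.0
    for j in range(steps):
        y = y_lo + (j + 0.5) * dy  # midpoint rule
        p = sum(wi * gauss(y - yi, hy) for wi, yi in zip(w, ys)) / total_w
        acc += y * p * dy
    return acc

def expectation_shortcut(xs, ys, x, hx):
    """Short route: each symmetric kernel in y has mean Y_i, so the integral
    collapses to a weighted average of the Y_i with the same x-weights."""
    w = [gauss(x - xi, hx) for xi in xs]
    return sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)
```

With n samples, the long route costs about n kernel evaluations per grid point, while the short route needs only n in total and has no grid or y-bandwidth to tune. That would be consistent with the speed-up we saw.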
Friday:
Started studying the Fast Nonparametric Conditional Density Estimation paper [2] to implement the dual-tree optimization technique.
Upcoming Week Plans:
Explore the Dual-Tree approximation from the Fast Nonparametric Conditional Density Estimation paper.
References:
[1] Hansen, Bruce E. (2004) "Nonparametric Conditional Density Estimation", University of Wisconsin, Department of Economics
[2] Holmes, Michael P.; Gray, Alexander G.; Isbell, Charles Lee Jr. (2007) "Fast Nonparametric Conditional Density Estimation", URL: https://arxiv.org/ftp/arxiv/papers/1206/1206.5278.pdf