Seminar Announcement: Professor Heng Lian, City University of Hong Kong

Published: 2024-05-10

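Title: Kernel-based Decentralized Policy Evaluation for Reinforcement Learning

Speaker: Professor Heng Lian, City University of Hong Kong

Time: Tuesday, May 14, 2024, 10:00 a.m.

Venue: Conference Room 2-1, Mathematics Building, Xingqing Campus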

Abstract: We investigate the decentralized nonparametric policy evaluation problem in reinforcement learning, focusing on settings in which multiple agents collaborate to learn the state-value function from sampled state transitions and privately observed rewards. Our approach is a regression-based multi-stage iteration that performs infinite-dimensional gradient descent in a reproducing kernel Hilbert space (RKHS). To make computation and communication feasible, we use the Nyström approximation to project this space onto a finite-dimensional one. We establish statistical error bounds that describe the convergence of the value function estimate, the first such analysis in a fully decentralized nonparametric framework. In numerical studies, we compare the regression-based method with the kernel temporal difference (TD) method.
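For readers unfamiliar with the ingredients named in the abstract, the following is a minimal illustrative sketch, not the speaker's algorithm. It assumes a single agent (no decentralized communication), a Gaussian kernel, and a fixed set of landmark states for the Nyström feature map. All function names, parameters, and default values are hypothetical choices made for this illustration.

```python
import numpy as np


def rbf_kernel(X, Y, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of X and the rows of Y."""
    sq = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))


def nystrom_features(X, landmarks, bandwidth=1.0, floor=1e-10):
    """Finite-dimensional features phi(x) = K_mm^{-1/2} k(landmarks, x).

    The inner product phi(x) . phi(y) approximates k(x, y) and is exact
    when x and y are both landmarks.
    """
    K_mm = rbf_kernel(landmarks, landmarks, bandwidth)
    eigval, eigvec = np.linalg.eigh(K_mm)
    eigval = np.maximum(eigval, floor)  # guard against tiny eigenvalues
    inv_sqrt = (eigvec * eigval**-0.5) @ eigvec.T
    return rbf_kernel(X, landmarks, bandwidth) @ inv_sqrt


def fitted_policy_evaluation(S, R, S_next, landmarks, gamma=0.9,
                             stages=50, gd_steps=200, lr=0.5, bandwidth=0.3):
    """Regression-based multi-stage policy evaluation (single-agent sketch).

    Each stage freezes the regression target R + gamma * V(S_next) and runs
    gradient descent on the squared loss in the Nystrom feature space.
    """
    Phi = nystrom_features(S, landmarks, bandwidth)
    Phi_next = nystrom_features(S_next, landmarks, bandwidth)
    n, m = Phi.shape
    w = np.zeros(m)
    for _ in range(stages):
        target = R + gamma * (Phi_next @ w)  # held fixed within the stage
        for _ in range(gd_steps):
            grad = Phi.T @ (Phi @ w - target) / n
            w = w - lr * grad
    return w
```

The estimated value at a new state `s` would then be `nystrom_features(s, landmarks, bandwidth) @ w`. In a decentralized variant, each agent would hold its own rewards and exchange only the finite-dimensional weight vector with its neighbours, which is what the finite-dimensional projection makes practical; that communication step is omitted here.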

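Biography: Heng Lian is a Professor in the Department of Mathematics at City University of Hong Kong. He received a bachelor's degree in mathematics and computer science from the University of Science and Technology of China in 2000, and in 2007 received a master's degree in computer science, a master's degree in economics, and a Ph.D. in applied mathematics from Brown University in the United States. He has worked at Nanyang Technological University in Singapore, the University of New South Wales in Australia, and City University of Hong Kong. He has published more than 30 papers in leading international journals, including Annals of Statistics, Journal of the Royal Statistical Society, Series B, Journal of the American Statistical Association, Journal of Machine Learning Research, and IEEE Transactions on Pattern Analysis and Machine Intelligence. His research interests include high-dimensional data analysis, functional data analysis, and machine learning.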
