{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## 01. Basic Tutorial" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this tutorial, you will learn how to:\n", "\n", "* Define a search space\n", "* Optimize an objective function\n", "\n", "You do not need to understand the mathematics behind the algorithms implemented in UltraOpt to use it for hyperparameter optimization." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# import fmin interface from UltraOpt\n", "from ultraopt import fmin\n", "# hdl2cs can convert HDL(Hyperparams Describe Language) to CS(Config Space)\n", "from ultraopt.hdl import hdl2cs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, declare the evaluation function to optimize. In this tutorial we optimize a simple quadratic function named `evaluate`. Note that, by our convention, the evaluation function takes a **config** argument and returns a **loss**; a smaller loss means a better configuration." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$ y = (x-3)^2 + 2 $$" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "def evaluate(config: dict):\n", "    x = config[\"x\"]\n", "    return (x-3)**2 + 2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, let us visualize this objective function." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np\n", "\n", "x = np.linspace(-10, 10, 100)\n", "y = [evaluate({\"x\": xi}) for xi in x]\n", "\n", "fig = plt.figure()\n", "plt.plot(x, y)\n", "plt.xlabel(\"x\")\n", "plt.ylabel(\"evaluate\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "我们试图通过改变超参数$x$来优化目标函数。这就是为什么我们要声明一个$x$的搜索空间。与搜索空间相关的函数在`ultraopt.hdl.hp_def`中实现. 列举如下。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* `{\"_type\": \"choice\", \"_value\": options}`\n", "* `{\"_type\": \"ordinal\", \"_value\": sequence}`\n", "* `{\"_type\": \"uniform\", \"_value\": [low, high]}`\n", "* `{\"_type\": \"quniform\", \"_value\": [low, high, q]}`\n", "* `{\"_type\": \"loguniform\", \"_value\": [low, high]}`\n", "* `{\"_type\": \"qloguniform\", \"_value\": [low, high, q]}`\n", "* `{\"_type\": \"int_uniform\", \"_value\": [low, high]}`\n", "* `{\"_type\": \"int_quniform\", \"_value\": [low, high, q]}`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "HDL(超参描述语言)是一种参考[nni](https://github.com/microsoft/nni)[[1]](#refer-anchor-1)的[搜索空间](https://nni.readthedocs.io/en/stable/Tutorial/SearchSpaceSpec.html?highlight=space)[[2]](#refer-anchor-2)而实现的一种超参数描述方法,UltraOpt通过`ultraopt.hdl.hdl2cs`函数将**HDL**转换成**配置空间**([ConfigSpace](https://github.com/automl/ConfigSpace)[[3]](#refer-anchor-3), 一种在[AutoSklearn](https://github.com/automl/auto-sklearn)[[4]](#refer-anchor-4), [HpBandSter](https://github.com/automl/HpBandSter)[[5]](#refer-anchor-5)等库中大量采用的基础库)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "HDL的编写方法为`{\"变量名\": 超参范围描述, ...}`,本例的HDL如下:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "HDL = { \n", " \"x\":{ # 变量名为 x\n", " \"_type\": \"uniform\", # 变量类型为 uniform \n", " \"_value\": [-10, 10] # 变量取值范围为 low = -10, hight = 10\n", " }\n", "}" ] }, { "cell_type": 
"markdown", "metadata": {}, "source": [ "通过 `ultraopt.hdl.hdl2cs` 函数将HDL转换为CS" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Configuration space object:\n", " Hyperparameters:\n", " x, Type: UniformFloat, Range: [-10.0, 10.0], Default: 0.0" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "CS = hdl2cs(HDL)\n", "CS" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "配置空间CS是具有采样功能的,我们从中随机采5个样本,并将其转换为dict类型" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'x': 2.99612580326402},\n", " {'x': 0.028681287154995516},\n", " {'x': 0.34249434531803047},\n", " {'x': -3.5600965460908807},\n", " {'x': -2.049214213069341}]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "configs = [config.get_dictionary() for config in CS.sample_configuration(5)]\n", "configs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "对每个`configs`都调用`objective`函数评估一次,获取其目标值,对于目标值最小的,就是我们想要的最佳配置。\n", "\n", "以上步骤其实就是一次最简单的黑箱优化流程。" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "optimal config: {'x': 2.99612580326402}, \n", "optimal loss: 2.0000150094003493\n" ] } ], "source": [ "import numpy as np\n", "losses = [evaluate(config) for config in configs]\n", "best_ix = np.argmin(losses)\n", "print(f\"optimal config: {configs[best_ix]}, \\noptimal loss: {losses[best_ix]}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "在学习了**超参空间定义**、**采样**、**评估**等`黑箱优化流程`后,我们希望能够用一个工具将这些步骤串起来,并希望使用启发式的优化算法而不是随机搜索。此时我们可以采用UltraOpt的`fmin`函数,这个函数需要定义4个重要的参数:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "|参数名|描述|\n", "|-----|---|\n", "|eval_func|评价函数,接受config参数(`dict`类型),返回loss。我们希望最好的`config`(配置)具有最小的`loss`|\n", "|config_space | 
Configuration space. Accepts either an HDL (a `dict`) or a CS (a `ConfigSpace`[[3]](#refer-anchor-3) object).|\n", "|optimizer| Optimizer. When using an optimizer's default parameters, you only need to specify its name; the available optimizers are listed below.|\n", "|n_iterations| Number of iterations. Without multi-fidelity optimization, this can be regarded as the number of times the evaluation function is executed. |" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "|Optimizer|Description|\n", "|-----|---|\n", "|ETPE| Embedding-Tree-Parzen-Estimator, an optimization algorithm devised by the UltraOpt author. It builds on the TPE algorithm[[9]](#refer-anchor-9), uses an embedding to reduce categorical variables to low-dimensional continuous variables, and makes several other improvements. In some scenarios ETPE performs better than HyperOpt's TPE algorithm. |\n", "|Forest | Bayesian optimization based on random forests. The probabilistic model uses the `skopt.learning.forest` model[[7]](#refer-anchor-7) from the `scikit-optimize`[[6]](#refer-anchor-6) package and borrows the local search method from `SMAC3`[[8]](#refer-anchor-8). |\n", "|GBRT| Bayesian optimization based on Gradient Boosting Regression Trees. The probabilistic model uses the `skopt.learning.gbrt` model from the `scikit-optimize` package. |\n", "|Random| Random search. |" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "100%|██████████| 100/100 [00:03<00:00, 26.12trial/s, best loss: 2.000]\n" ] } ], "source": [ "result = fmin(\n", "    eval_func=evaluate,  # evaluation function\n", "    config_space=HDL,  # configuration space\n", "    optimizer=\"Forest\",  # optimizer\n", "    n_iterations=100  # number of iterations\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `result` returned by the `ultraopt.fmin` function includes a summary table of the optimization results." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "+---------------------------------+\n", "| HyperParameters | Optimal Value |\n", "+-----------------+---------------+\n", "| x | 3.0014 |\n", "+-----------------+---------------+\n", "| Optimal Loss | 2.0000 |\n", "+-----------------+---------------+\n", "| Num Configs | 100 |\n", "+-----------------+---------------+" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "result" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "View the convergence curve of the optimization process:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "result.plot_convergence();" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**参考文献**\n", "\n", "
\n", "\n", "- [1] https://github.com/microsoft/nni\n", "\n", "
\n", "\n", "- [2] https://nni.readthedocs.io/en/stable/Tutorial/SearchSpaceSpec.html?highlight=space\n", "\n", "
\n", "\n", "- [3] https://github.com/automl/ConfigSpace\n", "\n", "\n", "
\n", "\n", "- [4] [Feurer M., Klein A., Eggensperger K., Springenberg J.T., Blum M., Hutter F. (2019) Auto-sklearn: Efficient and Robust Automated Machine Learning. In: Hutter F., Kotthoff L., Vanschoren J. (eds) Automated Machine Learning. The Springer Series on Challenges in Machine Learning. Springer, Cham. ](https://link.springer.com/chapter/10.1007/978-3-030-05318-5_6)\n", "\n", "\n", "
\n", "\n", "- [5] [Falkner, Stefan et al. “BOHB: Robust and Efficient Hyperparameter Optimization at Scale.” ICML (2018).](https://arxiv.org/abs/1807.01774)\n", "\n", "
\n", "\n", "- [6] https://github.com/scikit-optimize/scikit-optimize\n", "\n", "
\n", "\n", "- [7] [Hutter, F. et al. “Algorithm runtime prediction: Methods & evaluation.” Artif. Intell. 206 (2014): 79-111.](https://arxiv.org/abs/1211.0906)\n", "\n", "
\n", "\n", "- [8] [Hutter F., Hoos H.H., Leyton-Brown K. (2011) Sequential Model-Based Optimization for General Algorithm Configuration. In: Coello C.A.C. (eds) Learning and Intelligent Optimization. LION 2011. Lecture Notes in Computer Science, vol 6683. Springer, Berlin, Heidelberg.](https://link.springer.com/chapter/10.1007/978-3-642-25566-3_40)\n", "\n", "
\n", "\n", "- [9] [James Bergstra, Rémi Bardenet, Yoshua Bengio, and Balázs Kégl. 2011. Algorithms for hyper-parameter optimization. In Proceedings of the 24th International Conference on Neural Information Processing Systems (NIPS'11). Curran Associates Inc., Red Hook, NY, USA, 2546–2554.](https://dl.acm.org/doi/10.5555/2986459.2986743)\n", "\n", "\n", "\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.9" } }, "nbformat": 4, "nbformat_minor": 4 }