ultraopt.async_comm package

Submodules

ultraopt.async_comm.dispatcher module

class ultraopt.async_comm.dispatcher.Dispatcher(new_result_callback, run_id='0', ping_interval=10, nameserver='localhost', nameserver_port=None, host=None, queue_callback=None)[source]

Bases: object

The dispatcher is responsible for assigning tasks to free workers, reporting results back to the master, and communicating with the nameserver.

Parameters
  • new_result_callback (function) – function that will be called with a Job instance as argument. From the Job the result can be read and e.g. logged.

  • run_id (str) – unique run_id associated with the HPB run

  • ping_interval (int) – how often to ping for workers (in seconds)

  • nameserver (str) – address of the Pyro4 nameserver

  • nameserver_port (int) – port of Pyro4 nameserver

  • host (str) – IP address (or a name that resolves to it) of the network interface to use

  • queue_callback (function) – gets called with the number of workers in the pool on every update-cycle
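
As a rough illustration of the two callback parameters, the sketch below defines functions with the shapes described above and calls them by hand. `FakeJob` is a hypothetical stand-in for the library's Job class, and its `id` and `result` attributes are assumptions made for this example only.

```python
import logging
from dataclasses import dataclass, field

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("dispatcher_demo")


@dataclass
class FakeJob:
    """Hypothetical stand-in for a Job; the attribute names are assumptions."""
    id: tuple
    result: dict = field(default_factory=dict)


seen_results = []
pool_sizes = []


def new_result_callback(job):
    # Called once per finished job: read the result and log it.
    seen_results.append((job.id, job.result))
    logger.info("job %s finished with result %s", job.id, job.result)


def queue_callback(number_of_workers):
    # Called on every update cycle with the current size of the worker pool.
    pool_sizes.append(number_of_workers)


# Simulate by hand what a dispatcher would do with these callbacks.
new_result_callback(FakeJob(id=(0, 0, 0), result={"loss": 0.25}))
queue_callback(3)
```

In real use, functions of this shape would be passed as `new_result_callback` and `queue_callback` when constructing the Dispatcher.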

discover_workers()[source]
job_runner()[source]
number_of_workers()[source]
register_result(id=None, result=None)[source]
run()[source]
shutdown(shutdown_workers=False)[source]
shutdown_all_workers(rediscover=False)[source]
submit_job(id, **kwargs)[source]
trigger_discover_worker()[source]
class ultraopt.async_comm.dispatcher.Worker(name, uri)[source]

Bases: object

is_alive()[source]
is_busy()[source]
shutdown()[source]

ultraopt.async_comm.master module

class ultraopt.async_comm.master.Master(run_id, optimizer: ultraopt.optimizer.base_opt.BaseOptimizer, iter_generator: ultraopt.multi_fidelity.iter_gen.base_gen.BaseIterGenerator, progress_callback=<function no_progress_callback>, checkpoint_file=None, checkpoint_freq=10, working_directory='.', ping_interval=60, time_left_for_this_task=inf, nameserver='127.0.0.1', nameserver_port=None, host=None, shutdown_workers=True, job_queue_sizes=(-1, 0), dynamic_queue_size=True, result_logger=None, previous_result=None, incumbents: Dict[float, dict] = None, incumbent_performances: Dict[float, float] = None)[source]

Bases: object

The Master class is responsible for the bookkeeping and for deciding what to run next. Optimizers are instantiations of Master that handle the important steps of deciding which configurations to run on which budget, and when.

Parameters
  • run_id (string) – A unique identifier of that Hyperband run. Use, for example, the cluster’s JobID when running multiple concurrent runs to separate them

  • optimizer (ultraopt.optimizer.base_opt.BaseOptimizer object) – An object that can generate new configurations and registers results of executed runs

  • working_directory (string) – The top-level working directory, accessible to all compute nodes (shared filesystem).

  • eta (float) – In each iteration, a complete run of sequential halving is executed. After evaluating each configuration on the same subset size, only a fraction of 1/eta of them ‘advances’ to the next round. Must be greater than or equal to 2.

  • min_budget (float) – The smallest budget to consider. Needs to be positive!

  • max_budget (float) – the largest budget to consider. Needs to be larger than min_budget! The budgets will be geometrically distributed \(\sim \eta^k\) for \(k\in [0, 1, ... , num\_subsets - 1]\).

  • ping_interval (int) – number of seconds between pings to discover new nodes. Default is 60 seconds.

  • nameserver (str) – address of the Pyro4 nameserver

  • nameserver_port (int) – port of Pyro4 nameserver

  • host (str) – IP address (or a name that resolves to it) of the network interface to use

  • shutdown_workers (bool) – flag to control whether the workers are shut down after the computation is done

  • job_queue_sizes (tuple of ints) – min and max size of the job queue. During the run, when the number of jobs in the queue reaches the min value, it will be filled up to the max size. Default: (-1, 0)

  • dynamic_queue_size (bool) – Whether or not to change the queue size based on the number of workers available. If true (default), the job_queue_sizes are relative to the current number of workers.

  • logger (logging.logger like object) – the logger to output some (more or less meaningful) information

  • result_logger – a result logger that writes live results to disk

  • previous_result – previous run to warmstart the run
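
To make the eta, min_budget and max_budget descriptions concrete, here is a minimal sketch of a geometric budget ladder and of how many configurations advance per round. Both helper functions are hypothetical and written for illustration only; they are not taken from the library, and the rounding rule for survivors is an assumption.

```python
def geometric_budgets(min_budget, max_budget, eta):
    """Return budgets spaced by a factor of eta, ending at max_budget.

    The ladder is built downwards from max_budget, so the smallest entry
    can be larger than min_budget when max_budget / min_budget is not an
    exact power of eta.
    """
    if min_budget <= 0:
        raise ValueError("min_budget needs to be positive")
    if max_budget <= min_budget:
        raise ValueError("max_budget needs to be larger than min_budget")
    if eta < 2:
        raise ValueError("eta must be greater than or equal to 2")
    budgets = [float(max_budget)]
    while budgets[-1] / eta >= min_budget:
        budgets.append(budgets[-1] / eta)
    return budgets[::-1]


def survivors_per_stage(n_configs, eta, n_stages):
    """Number of configurations evaluated at each stage (assumed rounding: floor)."""
    counts = [n_configs]
    for _ in range(n_stages - 1):
        # Only a fraction of 1/eta advances to the next round.
        counts.append(max(1, int(counts[-1] / eta)))
    return counts


budgets = geometric_budgets(min_budget=1, max_budget=81, eta=3)
counts = survivors_per_stage(n_configs=81, eta=3, n_stages=len(budgets))
print(budgets)  # [1.0, 3.0, 9.0, 27.0, 81.0]
print(counts)   # [81, 27, 9, 3, 1]
```

With eta = 3, each round keeps a third of the configurations and triples the budget, so the most promising configuration is the only one evaluated on the full budget.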

active_iterations()[source]

finds the active iterations, i.e. those not marked as finished

Returns

all active iteration objects (empty if there are none)

Return type

list

adjust_queue_size(number_of_workers=None)[source]
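
One way to read "relative to the current number of workers" in the dynamic_queue_size description is as an offset added to the worker count. That reading is an assumption, not something this page confirms, and the helper below is a hypothetical standalone function rather than the method itself.

```python
def relative_queue_sizes(user_sizes, number_of_workers, dynamic=True):
    """Return the (min, max) queue sizes to use for a given worker count.

    Assumption: with dynamic sizing, the user-supplied sizes are offsets
    that are added to the current number of workers.
    """
    low, high = user_sizes
    if dynamic:
        return (low + number_of_workers, high + number_of_workers)
    return (low, high)


print(relative_queue_sizes((-1, 0), number_of_workers=4))                # (3, 4)
print(relative_queue_sizes((0, 1), number_of_workers=4, dynamic=False))  # (0, 1)
```

Under this reading, the default (-1, 0) would refill the queue to one job per worker as soon as a single slot frees up.
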
get_next_iteration(iteration, iteration_kwargs)[source]

instantiates the next iteration

Override this to change the iterations for different optimizers

Parameters
  • iteration (int) – the index of the iteration to be instantiated

  • iteration_kwargs (dict) – additional kwargs for the iteration class

Returns

a valid HB iteration object

Return type

HB_iteration

job_callback(job)[source]

method to be called when a job has finished

this will do some bookkeeping and call the user-defined new_result_callback if one was specified

run(n_iterations=1, min_n_workers=1, iteration_kwargs={})[source]

run n_iterations of RankReductionIteration

Parameters
  • n_iterations (int) – number of iterations to be performed in this run

  • min_n_workers (int) – minimum number of workers before starting the run

shutdown(shutdown_workers=False)[source]
wait_for_workers(min_n_workers=1)[source]

helper function to hold execution until some workers are active

Parameters

min_n_workers (int) – minimum number of workers present before the run starts
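
Holding execution until enough workers are present is a standard condition-variable pattern. The sketch below shows that pattern with the standard library; `WorkerPool` and its `timeout` argument are hypothetical and are not part of the Master class.

```python
import threading


class WorkerPool:
    """Hypothetical pool that tracks how many workers have registered."""

    def __init__(self):
        self._cond = threading.Condition()
        self._n_workers = 0

    def add_worker(self):
        # Register one worker and wake up anyone waiting for workers.
        with self._cond:
            self._n_workers += 1
            self._cond.notify_all()

    def wait_for_workers(self, min_n_workers=1, timeout=None):
        # Block until at least min_n_workers are present.
        # Returns False if the optional timeout expires first.
        with self._cond:
            return self._cond.wait_for(
                lambda: self._n_workers >= min_n_workers, timeout=timeout
            )


pool = WorkerPool()
threading.Timer(0.05, pool.add_worker).start()  # a worker shows up a bit later
ready = pool.wait_for_workers(min_n_workers=1, timeout=2.0)
still_waiting = not pool.wait_for_workers(min_n_workers=5, timeout=0.05)
```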

ultraopt.async_comm.nameserver module

class ultraopt.async_comm.nameserver.NameServer(run_id, working_directory=None, host=None, port=0, nic_name=None)[source]

Bases: object

The nameserver serves as a phonebook-like lookup table for your workers. Unique names are created so the workers can work in parallel and register their results without creating race conditions. The implementation uses Pyro4 as a backend, and this class is essentially a wrapper around it.

Parameters
  • run_id (str) – unique run_id associated with the HPB run

  • working_directory (str) – path to the working directory of the HPB run in which to store the nameserver's credentials. If None, no config file will be written.

  • host (str) – the hostname to use for the nameserver

  • port (int) – the port to be used. Default (=0) means a random port

  • nic_name (str) – name of the network interface to use (only used if host is not given)

shutdown()[source]

cleanly shuts down the nameserver and cleans up the config file (if written)

start()[source]

starts a Pyro4 nameserver in a separate thread

Returns

the host name and the used port

Return type

tuple (str, int)
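
The "port 0 means a random port" convention and the (host, port) return value can be illustrated with a plain socket: binding to port 0 lets the operating system pick a free port. This is only an analogy using the standard library; `pick_free_port` is a hypothetical helper and says nothing about how the NameServer class is implemented.

```python
import socket


def pick_free_port(host="127.0.0.1"):
    """Bind to port 0 and return the (host, port) pair chosen by the OS."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))       # port 0: let the OS choose a free port
        return sock.getsockname()  # (host, port), analogous to start()'s return value


host, port = pick_free_port()
print(host, port)
```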

ultraopt.async_comm.nameserver.nic_name_to_host(nic_name)[source]

helper function to translate the name of a network card into a valid host name

ultraopt.async_comm.worker module

class ultraopt.async_comm.worker.Worker(run_id, nameserver=None, nameserver_port=None, host=None, worker_id=None, timeout=None, debug=False)[source]

Bases: object

The worker is responsible for evaluating a single configuration on a single budget at a time. Communication with the individual workers goes via the nameserver, management of the worker pool and job scheduling is done by the Dispatcher, and jobs are determined by the Master. In distributed systems, each cluster node runs a Worker instance. To implement your own worker, override the __init__ and compute methods. The former lets you perform initial computations, e.g. loading the dataset, when the worker is started, while the latter is called repeatedly during the optimization and evaluates a given configuration, yielding the associated loss.

Parameters
  • run_id (anything with a __str__ method) – unique id to identify individual HpBandSter run

  • nameserver (str) – hostname or IP of the nameserver

  • nameserver_port (int) – port of the nameserver

  • logger (logging.logger instance) – logger used for debugging output

  • host (str) – hostname for this worker process

  • worker_id (anything with a __str__ method) – if multiple workers are started in the same process, you MUST provide a unique id for each one of them using the worker_id argument.

  • timeout (int or float or None) – specifies how long a worker will wait for a new job after finishing a computation before shutting down. Towards the end of a long run with multiple workers, this helps to shut down idling workers. We recommend a timeout that is roughly half the time it would take for the second largest budget to finish. The default (None) means that the worker will wait indefinitely and never shut down on its own.
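
The timeout behaviour described above can be sketched with a standard-library queue: the loop waits for the next job and stops once nothing arrives within the timeout. `worker_loop` is a hypothetical function for illustration; squaring the job value merely stands in for a real computation.

```python
import queue


def worker_loop(jobs, timeout=None):
    """Process jobs until none arrives within `timeout` seconds.

    With timeout=None the call to get() blocks indefinitely, which matches
    a worker that never shuts down on its own.
    """
    processed = []
    while True:
        try:
            job = jobs.get(timeout=timeout)
        except queue.Empty:
            break  # idle for too long: shut down
        processed.append(job * job)  # placeholder for the real computation
    return processed


jobs = queue.Queue()
for value in (1, 2, 3):
    jobs.put(value)

results = worker_loop(jobs, timeout=0.05)
print(results)  # [1, 4, 9]
```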

compute(config_id, config, config_info, budget, working_directory)[source]

The function you have to override to implement your computation.

Parameters
  • config_id (tuple) – a triplet of ints that uniquely identifies a configuration. The convention is id = (iteration, budget index, running index), with the following meaning:
      - iteration: the iteration of the optimization algorithm. E.g., for Hyperband, that is one round of Successive Halving.
      - budget index: the budget (of the current iteration) for which this configuration was sampled by the optimizer. This is only nonzero if the majority of the runs fail and Hyperband resamples to fill empty slots, or if you use a more ‘advanced’ optimizer.
      - running index: this is simply an int >= 0 that sorts the configs into the order in which they were sampled, i.e. (x,x,0) was sampled before (x,x,1).

  • config (dict) – the actual configuration to be evaluated.

  • budget (float) – the budget for the evaluation

  • working_directory (str) – a name of a directory that is unique to this configuration. Use this to store intermediate results on lower budgets that can be reused later for a larger budget (for iterative algorithms, for example).

Returns

a dictionary with two mandatory entries:
  • 'loss': a numerical value that is MINIMIZED

  • 'info': This can be pretty much any built-in Python type, e.g. a dict with lists as values. Because Pyro4 handles the remote function calls, third-party types such as numpy arrays are not supported!

Return type

dict
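
A minimal sketch of a compute method with the documented signature and return shape follows. `SketchWorker` deliberately does not inherit from anything so the example runs on its own; a real worker would subclass ultraopt.async_comm.worker.Worker. The config key "x" and the toy objective are assumptions made for illustration.

```python
import json


class SketchWorker:
    """Hypothetical worker used only to show the shape of compute()."""

    def compute(self, config_id, config, config_info, budget, working_directory):
        # config_id follows the convention (iteration, budget index, running index).
        iteration, budget_index, running_index = config_id
        x = float(config["x"])           # "x" is an assumed hyperparameter name
        loss = (x - 1.0) ** 2 / budget   # toy objective, not a real model
        return {
            "loss": float(loss),         # the value that is minimized
            "info": {                    # built-in types only (no numpy arrays)
                "iteration": int(iteration),
                "running_index": int(running_index),
                "budget": float(budget),
                "history": [x, float(loss)],
            },
        }


result = SketchWorker().compute((0, 0, 1), {"x": 3.0}, {}, 4.0, ".")
serialized = json.dumps(result)  # quick check that only built-in types are used
print(result["loss"])  # 1.0
```

Converting values with float() and int() before returning is a cheap way to avoid leaking third-party numeric types into the result.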

initialize(eval_func)[source]
is_busy()[source]
load_nameserver_credentials(working_directory, num_tries=60, interval=1)[source]

loads the nameserver credentials in cases where master and workers share a filesystem

Parameters
  • working_directory (str) – the working directory for the HPB run (see master)

  • num_tries (int) – number of attempts to find the file (default 60)

  • interval (float) – waiting period between the attempts
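
The num_tries and interval parameters describe a simple polling loop: look for the file, sleep, and try again. The sketch below shows that pattern in isolation. `wait_for_file`, the file name and the file contents are all hypothetical; the actual credentials file name and format are not stated on this page.

```python
import os
import tempfile
import time


def wait_for_file(path, num_tries=60, interval=1.0):
    """Poll for `path` up to num_tries times, sleeping `interval` seconds between tries."""
    for attempt in range(num_tries):
        if os.path.isfile(path):
            return path
        if attempt < num_tries - 1:
            time.sleep(interval)
    raise RuntimeError(f"could not find {path!r} after {num_tries} tries")


with tempfile.TemporaryDirectory() as workdir:
    # Hypothetical file name and content, standing in for the real credentials file.
    credentials = os.path.join(workdir, "nameserver_credentials.txt")
    with open(credentials, "w") as fh:
        fh.write("127.0.0.1:9090")
    found = wait_for_file(credentials, num_tries=3, interval=0.01)
```

With the defaults (60 tries, 1 second apart), a worker started shortly before the master would wait about a minute for the file to appear.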

run(background=False, concurrent_type='thread')[source]

Method to start the worker.

Parameters

background (bool) – If False (default), the worker is executed in the current thread. If True, a new daemon thread is created that runs the worker. This is useful in a single-worker scenario or when the compute function only simulates work.
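
The foreground/background distinction can be sketched with the standard threading module. `run_worker` is a hypothetical helper, not the method above, and it ignores the concurrent_type argument, which is not described on this page.

```python
import threading


def run_worker(loop, background=False):
    """Run `loop` in the current thread, or in a daemon thread if background=True."""
    if background:
        # A daemon thread does not keep the process alive once the main thread exits.
        thread = threading.Thread(target=loop, daemon=True)
        thread.start()
        return thread
    loop()  # blocks until the loop returns
    return None


done = threading.Event()
thread = run_worker(done.set, background=True)
thread.join(timeout=2.0)
print(thread.daemon, done.is_set())
```

Running in the background leaves the calling thread free, which is why it suits a script that starts a single worker and then launches the optimization itself.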

shutdown()[source]
start_computation(callback, config_id, *args, **kwargs)[source]

Module contents