How to sample parameters without duplicates in Optuna?

I am using Optuna for parameter optimisation of my custom models.

Is there any way to keep sampling parameters until the current parameter set has not been tested before? That is, if some past trial already used the same set of parameters, sample another set instead.

In some cases this is impossible, for example when there is a categorical distribution and n_trials is greater than the number of possible unique parameter combinations.

What I want: a config option like num_attempts, so that parameters are resampled in a loop up to num_attempts times until a set appears that was not tested before; otherwise, the trial runs on the last sampled set.

Why I need this: it is simply too expensive to run heavy models several times on the same parameters.

What I do now: I implement this "for-loop" thing by hand, but it's messy.

If there is a smarter way to do this, I would be very grateful for the information.

Thanks!



Solution 1:[1]

To the best of my knowledge, there is no direct way to handle your case for now. As a workaround, you can check for parameter duplication and skip the evaluation as follows:

import optuna

def objective(trial: optuna.Trial):
    # Sample parameters.
    x = trial.suggest_int('x', 0, 10)
    y = trial.suggest_categorical('y', [-10, -5, 0, 5, 10])

    # Check duplication and skip if it's detected.
    for t in trial.study.trials:
        if t.state != optuna.trial.TrialState.COMPLETE:
            continue

        if t.params == trial.params:
            return t.value  # Return the previous value without re-evaluating it.

            # # Note that if duplicate parameter sets are suggested too frequently,
            # # you can use the pruning mechanism of Optuna to mitigate the problem.
            # # By raising `TrialPruned` instead of just returning the previous value,
            # # the sampler is more likely to avoid sampling the parameters in the succeeding trials.
            #
            # raise optuna.TrialPruned('Duplicate parameter set')

    # Evaluate parameters.
    return x + y

# Start study.
study = optuna.create_study()

unique_trials = 20
while unique_trials > len(set(str(t.params) for t in study.trials)):
    study.optimize(objective, n_trials=1)

Solution 2:[2]

To second @sile's code comment, you may write a pruner such as:

import numpy as np

from optuna.pruners import BasePruner

class RepeatPruner(BasePruner):
    def prune(self, study, trial):
        # type: (Study, FrozenTrial) -> bool

        trials = study.get_trials(deepcopy=False)

        numbers = np.array([t.number for t in trials])
        bool_params = np.array([trial.params == t.params for t in trials]).astype(bool)
        # Don't evaluate the objective if a trial with the same params
        # has been / is being evaluated before this one.
        if np.sum(bool_params) > 1:
            if trial.number > np.min(numbers[bool_params]):
                return True

        return False

then call the pruner as:

study = optuna.create_study(study_name=study_name, storage=storage, load_if_exists=True, pruner=RepeatPruner())

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution 1:
Solution 2: Community