.. note::
    :class: sphx-glr-download-link-note

    Click :ref:`here ` to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_beginner_hyperparameter_tuning_tutorial.py:


Hyperparameter tuning with Ray Tune
===================================

Hyperparameter tuning can make the difference between an average model and a highly
accurate one. Often simple things like choosing a different learning rate or changing
a network layer size can have a dramatic impact on your model performance.

Fortunately, there are tools that help with finding the best combination of parameters.
`Ray Tune `_ is an industry standard tool for distributed hyperparameter tuning.
Ray Tune includes the latest hyperparameter search algorithms, integrates with
TensorBoard and other analysis libraries, and natively supports distributed training
through `Ray's distributed machine learning engine `_.

In this tutorial, we will show you how to integrate Ray Tune into your PyTorch
training workflow. We will extend `this tutorial from the PyTorch documentation `_
for training a CIFAR10 image classifier.

As you will see, we only need to make a few small modifications. In particular, we
need to

1. wrap data loading and training in functions,
2. make some network parameters configurable,
3. add checkpointing (optional),
4. and define the search space for the model tuning

|

To run this tutorial, please make sure the following packages are installed:

-  ``ray[tune]``: Distributed hyperparameter tuning library
-  ``torchvision``: For the dataset and the data transforms

Setup / Imports
---------------
Let's start with the imports:

.. code-block:: default

    from functools import partial
    import numpy as np
    import os
    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    import torch.optim as optim
    from torch.utils.data import random_split
    import torchvision
    import torchvision.transforms as transforms
    from ray import tune
    from ray.tune import CLIReporter
    from ray.tune.schedulers import ASHAScheduler

Most of the imports are needed for building the PyTorch model. Only the last three
imports are for Ray Tune.

Data loaders
------------
We wrap the data loaders in their own function and pass a global data directory.
This way we can share a data directory between different trials.

.. code-block:: default

    def load_data(data_dir="./data"):
        transform = transforms.Compose([
            transforms.ToTensor(),
            transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
        ])

        trainset = torchvision.datasets.CIFAR10(
            root=data_dir, train=True, download=True, transform=transform)

        testset = torchvision.datasets.CIFAR10(
            root=data_dir, train=False, download=True, transform=transform)

        return trainset, testset

Configurable neural network
---------------------------
We can only tune those parameters that are configurable. In this example, we can specify
the layer sizes of the fully connected layers:

.. code-block:: default

    class Net(nn.Module):
        def __init__(self, l1=120, l2=84):
            super(Net, self).__init__()
            self.conv1 = nn.Conv2d(3, 6, 5)
            self.pool = nn.MaxPool2d(2, 2)
            self.conv2 = nn.Conv2d(6, 16, 5)
            self.fc1 = nn.Linear(16 * 5 * 5, l1)
            self.fc2 = nn.Linear(l1, l2)
            self.fc3 = nn.Linear(l2, 10)

        def forward(self, x):
            x = self.pool(F.relu(self.conv1(x)))
            x = self.pool(F.relu(self.conv2(x)))
            x = x.view(-1, 16 * 5 * 5)
            x = F.relu(self.fc1(x))
            x = F.relu(self.fc2(x))
            x = self.fc3(x)
            return x

The train function
------------------
Now it gets interesting, because we introduce some changes to the example
`from the PyTorch documentation `_.

We wrap the training script in a function
``train_cifar(config, checkpoint_dir=None, data_dir=None)``. As you can guess, the
``config`` parameter will receive the hyperparameters we would like to train with.
The ``checkpoint_dir`` parameter is used to restore checkpoints. The ``data_dir``
parameter specifies the directory where we load and store the data, so multiple runs
can share the same data source.

.. code-block:: python

    net = Net(config["l1"], config["l2"])

    if checkpoint_dir:
        model_state, optimizer_state = torch.load(
            os.path.join(checkpoint_dir, "checkpoint"))
        net.load_state_dict(model_state)
        optimizer.load_state_dict(optimizer_state)

The learning rate of the optimizer is made configurable, too:

.. code-block:: python

    optimizer = optim.SGD(net.parameters(), lr=config["lr"], momentum=0.9)

We also split the training data into a training and a validation subset. We thus train
on 80% of the data and calculate the validation loss on the remaining 20%. The batch
size with which we iterate through the training and validation subsets is configurable
as well.

Adding (multi) GPU support with DataParallel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Image classification benefits greatly from GPUs. Luckily, we can continue to use
PyTorch's abstractions in Ray Tune. Thus, we can wrap our model in ``nn.DataParallel``
to support data parallel training on multiple GPUs:

.. code-block:: python

    device = "cpu"
    if torch.cuda.is_available():
        device = "cuda:0"
        if torch.cuda.device_count() > 1:
            net = nn.DataParallel(net)
    net.to(device)

By using a ``device`` variable we make sure that training also works when we have
no GPUs available. PyTorch requires us to send our data to the GPU memory explicitly,
like this:

.. code-block:: python

    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        inputs, labels = inputs.to(device), labels.to(device)

The code now supports training on CPUs, on a single GPU, and on multiple GPUs.
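
The device handling described above can be tried in isolation. The following is a minimal sketch, not part of the tutorial's code: the helper name ``place_model`` and the tiny ``nn.Linear`` stand-in model are assumptions made purely for illustration. It picks a device, wraps the model in ``nn.DataParallel`` only when more than one GPU is visible, and runs one batch that has been moved to the same device.

```python
import torch
import torch.nn as nn


def place_model(net):
    """Hypothetical helper: move ``net`` to a GPU if one exists, else keep it on the CPU."""
    device = "cuda:0" if torch.cuda.is_available() else "cpu"
    # Wrap for data parallel training only when several GPUs are visible.
    if torch.cuda.device_count() > 1:
        net = nn.DataParallel(net)
    net.to(device)
    return net, device


# Stand-in model for illustration; the tutorial itself uses ``Net``.
model, device = place_model(nn.Linear(4, 2))

# Inputs must live on the same device as the model.
batch = torch.randn(8, 4).to(device)
output = model(batch)
```

On a machine without GPUs, ``device`` is ``"cpu"`` and the model is left unwrapped, so the same code path works everywhere.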
Notably, Ray also supports `fractional GPUs `_
so we can share GPUs among trials, as long as the model still fits in the GPU memory.
We'll come back to that later.

Communicating with Ray Tune
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The most interesting part is the communication with Ray Tune:

.. code-block:: python

    with tune.checkpoint_dir(epoch) as checkpoint_dir:
        path = os.path.join(checkpoint_dir, "checkpoint")
        torch.save((net.state_dict(), optimizer.state_dict()), path)

    tune.report(loss=(val_loss / val_steps), accuracy=correct / total)

Here we first save a checkpoint and then report some metrics back to Ray Tune.
Specifically, we send the validation loss and accuracy back to Ray Tune. Ray Tune can
then use these metrics to decide which hyperparameter configuration leads to the best
results. These metrics can also be used to stop poorly performing trials early in order
to avoid wasting resources on those trials.

Saving the checkpoint is optional. However, it is necessary if we want to use advanced
schedulers like `Population Based Training `_.
Also, by saving the checkpoint we can later load the trained models and validate them
on a test set.

Full training function
~~~~~~~~~~~~~~~~~~~~~~

The full code example looks like this:

.. code-block:: default

    def train_cifar(config, checkpoint_dir=None, data_dir=None):
        net = Net(config["l1"], config["l2"])

        device = "cpu"
        if torch.cuda.is_available():
            device = "cuda:0"
            if torch.cuda.device_count() > 1:
                net = nn.DataParallel(net)
        net.to(device)

        criterion = nn.CrossEntropyLoss()
        optimizer = optim.SGD(net.parameters(), lr=config["lr"], momentum=0.9)

        if checkpoint_dir:
            model_state, optimizer_state = torch.load(
                os.path.join(checkpoint_dir, "checkpoint"))
            net.load_state_dict(model_state)
            optimizer.load_state_dict(optimizer_state)

        trainset, testset = load_data(data_dir)

        test_abs = int(len(trainset) * 0.8)
        train_subset, val_subset = random_split(
            trainset, [test_abs, len(trainset) - test_abs])

        trainloader = torch.utils.data.DataLoader(
            train_subset,
            batch_size=int(config["batch_size"]),
            shuffle=True,
            num_workers=8)
        valloader = torch.utils.data.DataLoader(
            val_subset,
            batch_size=int(config["batch_size"]),
            shuffle=True,
            num_workers=8)

        for epoch in range(10):  # loop over the dataset multiple times
            running_loss = 0.0
            epoch_steps = 0
            for i, data in enumerate(trainloader, 0):
                # get the inputs; data is a list of [inputs, labels]
                inputs, labels = data
                inputs, labels = inputs.to(device), labels.to(device)

                # zero the parameter gradients
                optimizer.zero_grad()

                # forward + backward + optimize
                outputs = net(inputs)
                loss = criterion(outputs, labels)
                loss.backward()
                optimizer.step()

                # print statistics
                running_loss += loss.item()
                epoch_steps += 1
                if i % 2000 == 1999:  # print every 2000 mini-batches
                    print("[%d, %5d] loss: %.3f" % (epoch + 1, i + 1,
                                                    running_loss / epoch_steps))
                    running_loss = 0.0

            # Validation loss
            val_loss = 0.0
            val_steps = 0
            total = 0
            correct = 0
            for i, data in enumerate(valloader, 0):
                with torch.no_grad():
                    inputs, labels = data
                    inputs, labels = inputs.to(device), labels.to(device)

                    outputs = net(inputs)
                    _, predicted = torch.max(outputs.data, 1)
                    total += labels.size(0)
                    correct += (predicted == labels).sum().item()

                    loss = criterion(outputs, labels)
                    val_loss += loss.cpu().numpy()
                    val_steps += 1

            with tune.checkpoint_dir(epoch) as checkpoint_dir:
                path = os.path.join(checkpoint_dir, "checkpoint")
                torch.save((net.state_dict(), optimizer.state_dict()), path)

            tune.report(loss=(val_loss / val_steps), accuracy=correct / total)
        print("Finished Training")

As you can see, most of the code is adapted directly from the original example.

Test set accuracy
-----------------
Commonly the performance of a machine learning model is tested on a hold-out test
set with data that has not been used for training the model. We also wrap this in a
function:

.. code-block:: default

    def test_accuracy(net, device="cpu"):
        trainset, testset = load_data()

        testloader = torch.utils.data.DataLoader(
            testset, batch_size=4, shuffle=False, num_workers=2)

        correct = 0
        total = 0
        with torch.no_grad():
            for data in testloader:
                images, labels = data
                images, labels = images.to(device), labels.to(device)
                outputs = net(images)
                _, predicted = torch.max(outputs.data, 1)
                total += labels.size(0)
                correct += (predicted == labels).sum().item()

        return correct / total

The function also expects a ``device`` parameter, so we can do the test set
validation on a GPU.

Configuring the search space
----------------------------
Lastly, we need to define Ray Tune's search space. Here is an example:

.. code-block:: python

    config = {
        "l1": tune.sample_from(lambda _: 2**np.random.randint(2, 9)),
        "l2": tune.sample_from(lambda _: 2**np.random.randint(2, 9)),
        "lr": tune.loguniform(1e-4, 1e-1),
        "batch_size": tune.choice([2, 4, 8, 16])
    }

The ``tune.sample_from()`` function makes it possible to define your own sample
methods to obtain hyperparameters. In this example, the ``l1`` and ``l2`` parameters
should be powers of 2 between 4 and 256, so either 4, 8, 16, 32, 64, 128, or 256.
The ``lr`` (learning rate) should be sampled log-uniformly between 0.0001 and 0.1,
so that each order of magnitude in that range is equally likely.
Lastly, the batch size is a choice between 2, 4, 8, and 16.
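
To get a feel for what this search space produces, the sketch below imitates it with the Python standard library only. It does not call Ray Tune: ``sample_config`` is a hypothetical helper written for this illustration, and the log-uniform draw is implemented by sampling the exponent uniformly and exponentiating, which is the usual definition and is assumed here to match what ``tune.loguniform`` does.

```python
import math
import random


def sample_config(rng):
    """Hypothetical stand-in: draw one configuration resembling the search space above."""
    return {
        # Power of two with an exponent from 2 to 8, i.e. 4, 8, ..., 256.
        "l1": 2 ** rng.randrange(2, 9),
        "l2": 2 ** rng.randrange(2, 9),
        # Log-uniform draw: uniform in log space, then exponentiate.
        "lr": math.exp(rng.uniform(math.log(1e-4), math.log(1e-1))),
        "batch_size": rng.choice([2, 4, 8, 16]),
    }


rng = random.Random(0)  # fixed seed so the draws are repeatable
samples = [sample_config(rng) for _ in range(1000)]

# Share of learning rates that fall in the lowest decade [1e-4, 1e-3).
low_decade_share = sum(s["lr"] < 1e-3 for s in samples) / len(samples)
```

Because the exponent is drawn uniformly, about a third of the learning rates land in each of the three decades between 0.0001 and 0.1. A plain uniform draw over the same range would put almost all samples above 0.001.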

At each trial, Ray Tune will now randomly sample a combination of parameters from these
search spaces. It will then train a number of models in parallel and find the best
performing one among these. We also use the ``ASHAScheduler``, which terminates
poorly performing trials early.

We wrap the ``train_cifar`` function with ``functools.partial`` to set the constant
``data_dir`` parameter. We can also tell Ray Tune what resources should be
available for each trial:

.. code-block:: python

    gpus_per_trial = 2
    # ...
    result = tune.run(
        partial(train_cifar, data_dir=data_dir),
        resources_per_trial={"cpu": 8, "gpu": gpus_per_trial},
        config=config,
        num_samples=num_samples,
        scheduler=scheduler,
        progress_reporter=reporter,
        checkpoint_at_end=True)

You can specify the number of CPUs, which are then available, for example, to increase
the ``num_workers`` of the PyTorch ``DataLoader`` instances. The selected number of GPUs
is made visible to PyTorch in each trial. Trials do not have access to GPUs that
haven't been requested for them, so you don't have to worry about two trials using
the same set of resources.

Here we can also specify fractional GPUs, so something like ``gpus_per_trial=0.5`` is
completely valid. The trials will then share GPUs among each other.
You just have to make sure that the models still fit in the GPU memory.

After training the models, we will find the best performing one and load the trained
network from the checkpoint file. We then obtain the test set accuracy and report
everything by printing.

The full main function looks like this:

.. code-block:: default

    def main(num_samples=10, max_num_epochs=10, gpus_per_trial=2):
        data_dir = os.path.abspath("./data")
        load_data(data_dir)
        config = {
            "l1": tune.sample_from(lambda _: 2 ** np.random.randint(2, 9)),
            "l2": tune.sample_from(lambda _: 2 ** np.random.randint(2, 9)),
            "lr": tune.loguniform(1e-4, 1e-1),
            "batch_size": tune.choice([2, 4, 8, 16])
        }
        scheduler = ASHAScheduler(
            metric="loss",
            mode="min",
            max_t=max_num_epochs,
            grace_period=1,
            reduction_factor=2)
        reporter = CLIReporter(
            # parameter_columns=["l1", "l2", "lr", "batch_size"],
            metric_columns=["loss", "accuracy", "training_iteration"])
        result = tune.run(
            partial(train_cifar, data_dir=data_dir),
            resources_per_trial={"cpu": 2, "gpu": gpus_per_trial},
            config=config,
            num_samples=num_samples,
            scheduler=scheduler,
            progress_reporter=reporter)

        best_trial = result.get_best_trial("loss", "min", "last")
        print("Best trial config: {}".format(best_trial.config))
        print("Best trial final validation loss: {}".format(
            best_trial.last_result["loss"]))
        print("Best trial final validation accuracy: {}".format(
            best_trial.last_result["accuracy"]))

        best_trained_model = Net(best_trial.config["l1"], best_trial.config["l2"])
        device = "cpu"
        if torch.cuda.is_available():
            device = "cuda:0"
            if gpus_per_trial > 1:
                best_trained_model = nn.DataParallel(best_trained_model)
        best_trained_model.to(device)

        best_checkpoint_dir = best_trial.checkpoint.value
        model_state, optimizer_state = torch.load(os.path.join(
            best_checkpoint_dir, "checkpoint"))
        best_trained_model.load_state_dict(model_state)

        test_acc = test_accuracy(best_trained_model, device)
        print("Best trial test set accuracy: {}".format(test_acc))


    if __name__ == "__main__":
        # You can change the number of GPUs per trial here:
        main(num_samples=10, max_num_epochs=10, gpus_per_trial=0)

.. rst-class:: sphx-glr-script-out

 Out:

 .. 
code-block:: none Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to /var/lib/jenkins/workspace/beginner_source/data/cifar-10-python.tar.gz Extracting /var/lib/jenkins/workspace/beginner_source/data/cifar-10-python.tar.gz to /var/lib/jenkins/workspace/beginner_source/data Files already downloaded and verified == Status == Current time: 2022-05-22 20:00:57 (running for 00:00:00.21) Memory usage on this node: 1.4/14.7 GiB Using AsyncHyperBand: num_stopped=0 Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: None | Iter 1.000: None Resources requested: 2.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (9 PENDING, 1 RUNNING) +-------------------------+----------+-----------------+--------------+------+------+-------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | |-------------------------+----------+-----------------+--------------+------+------+-------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | | train_cifar_e4fb9_00001 | PENDING | | 4 | 32 | 16 | 0.00483071 | | train_cifar_e4fb9_00002 | PENDING | | 2 | 8 | 16 | 0.00929677 | | train_cifar_e4fb9_00003 | PENDING | | 16 | 64 | 64 | 0.00198823 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | +-------------------------+----------+-----------------+--------------+------+------+-------------+ (func pid=1574) Files already downloaded and verified (func pid=1574) Files already downloaded and verified (func pid=1608) Files already downloaded and 
verified == Status == Current time: 2022-05-22 20:01:05 (running for 00:00:07.78) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=0 Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: None | Iter 1.000: None Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (8 PENDING, 2 RUNNING) +-------------------------+----------+-----------------+--------------+------+------+-------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | |-------------------------+----------+-----------------+--------------+------+------+-------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | | train_cifar_e4fb9_00001 | RUNNING | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | | train_cifar_e4fb9_00002 | PENDING | | 2 | 8 | 16 | 0.00929677 | | train_cifar_e4fb9_00003 | PENDING | | 16 | 64 | 64 | 0.00198823 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | +-------------------------+----------+-----------------+--------------+------+------+-------------+ (func pid=1608) Files already downloaded and verified == Status == Current time: 2022-05-22 20:01:10 (running for 00:00:12.79) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=0 Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: None | Iter 1.000: None Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: 
/var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (8 PENDING, 2 RUNNING) +-------------------------+----------+-----------------+--------------+------+------+-------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | |-------------------------+----------+-----------------+--------------+------+------+-------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | | train_cifar_e4fb9_00001 | RUNNING | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | | train_cifar_e4fb9_00002 | PENDING | | 2 | 8 | 16 | 0.00929677 | | train_cifar_e4fb9_00003 | PENDING | | 16 | 64 | 64 | 0.00198823 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | +-------------------------+----------+-----------------+--------------+------+------+-------------+ (func pid=1574) [1, 2000] loss: 2.263 == Status == Current time: 2022-05-22 20:01:15 (running for 00:00:17.80) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=0 Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: None | Iter 1.000: None Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (8 PENDING, 2 RUNNING) +-------------------------+----------+-----------------+--------------+------+------+-------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | |-------------------------+----------+-----------------+--------------+------+------+-------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 
256 | 0.00027798 | | train_cifar_e4fb9_00001 | RUNNING | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | | train_cifar_e4fb9_00002 | PENDING | | 2 | 8 | 16 | 0.00929677 | | train_cifar_e4fb9_00003 | PENDING | | 16 | 64 | 64 | 0.00198823 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | +-------------------------+----------+-----------------+--------------+------+------+-------------+ (func pid=1608) [1, 2000] loss: 2.033 == Status == Current time: 2022-05-22 20:01:20 (running for 00:00:22.82) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=0 Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: None | Iter 1.000: None Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (8 PENDING, 2 RUNNING) +-------------------------+----------+-----------------+--------------+------+------+-------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | |-------------------------+----------+-----------------+--------------+------+------+-------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | | train_cifar_e4fb9_00001 | RUNNING | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | | train_cifar_e4fb9_00002 | PENDING | | 2 | 8 | 16 | 0.00929677 | | train_cifar_e4fb9_00003 | PENDING | | 16 | 64 | 64 | 0.00198823 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 
0.00231239 | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | +-------------------------+----------+-----------------+--------------+------+------+-------------+ == Status == Current time: 2022-05-22 20:01:25 (running for 00:00:27.84) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=0 Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: None | Iter 1.000: None Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (8 PENDING, 2 RUNNING) +-------------------------+----------+-----------------+--------------+------+------+-------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | |-------------------------+----------+-----------------+--------------+------+------+-------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | | train_cifar_e4fb9_00001 | RUNNING | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | | train_cifar_e4fb9_00002 | PENDING | | 2 | 8 | 16 | 0.00929677 | | train_cifar_e4fb9_00003 | PENDING | | 16 | 64 | 64 | 0.00198823 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | +-------------------------+----------+-----------------+--------------+------+------+-------------+ (func pid=1608) [1, 4000] loss: 0.899 (func pid=1574) [1, 4000] loss: 0.986 == Status == Current time: 2022-05-22 20:01:30 (running for 00:00:32.86) 
Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=0 Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: None | Iter 1.000: None Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (8 PENDING, 2 RUNNING) +-------------------------+----------+-----------------+--------------+------+------+-------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | |-------------------------+----------+-----------------+--------------+------+------+-------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | | train_cifar_e4fb9_00001 | RUNNING | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | | train_cifar_e4fb9_00002 | PENDING | | 2 | 8 | 16 | 0.00929677 | | train_cifar_e4fb9_00003 | PENDING | | 16 | 64 | 64 | 0.00198823 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | +-------------------------+----------+-----------------+--------------+------+------+-------------+ == Status == Current time: 2022-05-22 20:01:35 (running for 00:00:37.87) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=0 Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: None | Iter 1.000: None Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (8 PENDING, 2 RUNNING) 
+-------------------------+----------+-----------------+--------------+------+------+-------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | |-------------------------+----------+-----------------+--------------+------+------+-------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | | train_cifar_e4fb9_00001 | RUNNING | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | | train_cifar_e4fb9_00002 | PENDING | | 2 | 8 | 16 | 0.00929677 | | train_cifar_e4fb9_00003 | PENDING | | 16 | 64 | 64 | 0.00198823 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | +-------------------------+----------+-----------------+--------------+------+------+-------------+ (func pid=1608) [1, 6000] loss: 0.581 Result for train_cifar_e4fb9_00000: accuracy: 0.3169 date: 2022-05-22_20-01-36 done: false experiment_id: 76614011963f4fd2bcc9e5e590a2ec47 hostname: 5c55bb57cfdf iterations_since_restore: 1 loss: 1.8282848861694336 node_ip: 172.17.0.2 pid: 1574 should_checkpoint: true time_since_restore: 36.486894845962524 time_this_iter_s: 36.486894845962524 time_total_s: 36.486894845962524 timestamp: 1653249696 timesteps_since_restore: 0 training_iteration: 1 trial_id: e4fb9_00000 warmup_time: 0.003105640411376953 == Status == Current time: 2022-05-22 20:01:41 (running for 00:00:44.28) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=0 Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: None | Iter 1.000: -1.8282848861694336 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: 
/var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (8 PENDING, 2 RUNNING) +-------------------------+----------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+----------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.82828 | 0.3169 | 1 | | train_cifar_e4fb9_00001 | RUNNING | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | | | | | train_cifar_e4fb9_00002 | PENDING | | 2 | 8 | 16 | 0.00929677 | | | | | train_cifar_e4fb9_00003 | PENDING | | 16 | 64 | 64 | 0.00198823 | | | | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | +-------------------------+----------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=1608) [1, 8000] loss: 0.433 == Status == Current time: 2022-05-22 20:01:46 (running for 00:00:49.29) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=0 Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: None | Iter 1.000: -1.8282848861694336 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (8 PENDING, 2 RUNNING) 
+-------------------------+----------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
| Trial name              | status   | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
|-------------------------+----------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
| train_cifar_e4fb9_00000 | RUNNING  | 172.17.0.2:1574 |            8 |    8 |  256 | 0.00027798  | 1.82828 |     0.3169 |                    1 |
| train_cifar_e4fb9_00001 | RUNNING  | 172.17.0.2:1608 |            4 |   32 |   16 | 0.00483071  |         |            |                      |
| train_cifar_e4fb9_00002 | PENDING  |                 |            2 |    8 |   16 | 0.00929677  |         |            |                      |
| train_cifar_e4fb9_00003 | PENDING  |                 |           16 |   64 |   64 | 0.00198823  |         |            |                      |
| train_cifar_e4fb9_00004 | PENDING  |                 |           16 |  128 |  256 | 0.0122764   |         |            |                      |
| train_cifar_e4fb9_00005 | PENDING  |                 |            4 |   32 |  128 | 0.00374699  |         |            |                      |
| train_cifar_e4fb9_00006 | PENDING  |                 |            8 |   32 |    8 | 0.00231239  |         |            |                      |
| train_cifar_e4fb9_00007 | PENDING  |                 |            8 |  256 |  256 | 0.0173587   |         |            |                      |
| train_cifar_e4fb9_00008 | PENDING  |                 |            2 |    4 |    4 | 0.00107032  |         |            |                      |
| train_cifar_e4fb9_00009 | PENDING  |                 |            2 |  256 |   32 | 0.000121737 |         |            |                      |
+-------------------------+----------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+

(func pid=1574) [2,  2000] loss: 1.741

== Status ==
Current time: 2022-05-22 20:01:51 (running for 00:00:54.31)
Memory usage on this node: 2.4/14.7 GiB
Using AsyncHyperBand: num_stopped=0
Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: None | Iter 1.000: -1.8282848861694336
Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4)
Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57
Number of trials: 10/10 (8 PENDING, 2 RUNNING)
+-------------------------+----------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
| Trial name              | status   | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
|-------------------------+----------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
| train_cifar_e4fb9_00000 | RUNNING  | 172.17.0.2:1574 |            8 |    8 |  256 | 0.00027798  | 1.82828 |     0.3169 |                    1 |
| train_cifar_e4fb9_00001 | RUNNING  | 172.17.0.2:1608 |            4 |   32 |   16 | 0.00483071  |         |            |                      |
| train_cifar_e4fb9_00002 | PENDING  |                 |            2 |    8 |   16 | 0.00929677  |         |            |                      |
| train_cifar_e4fb9_00003 | PENDING  |                 |           16 |   64 |   64 | 0.00198823  |         |            |                      |
| train_cifar_e4fb9_00004 | PENDING  |                 |           16 |  128 |  256 | 0.0122764   |         |            |                      |
| train_cifar_e4fb9_00005 | PENDING  |                 |            4 |   32 |  128 | 0.00374699  |         |            |                      |
| train_cifar_e4fb9_00006 | PENDING  |                 |            8 |   32 |    8 | 0.00231239  |         |            |                      |
| train_cifar_e4fb9_00007 | PENDING  |                 |            8 |  256 |  256 | 0.0173587   |         |            |                      |
| train_cifar_e4fb9_00008 | PENDING  |                 |            2 |    4 |    4 | 0.00107032  |         |            |                      |
| train_cifar_e4fb9_00009 | PENDING  |                 |            2 |  256 |   32 | 0.000121737 |         |            |                      |
+-------------------------+----------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+

(func pid=1608) [1, 10000] loss: 0.344

== Status ==
Current time: 2022-05-22 20:01:56 (running for 00:00:59.33)
Memory usage on this node: 2.3/14.7 GiB
Using AsyncHyperBand: num_stopped=0
Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: None | Iter 1.000: -1.8282848861694336
Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4)
Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57
Number of trials: 10/10 (8 PENDING, 2 RUNNING)
+-------------------------+----------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
| Trial name              | status   | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
|-------------------------+----------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
| train_cifar_e4fb9_00000 | RUNNING  | 172.17.0.2:1574 |            8 |    8 |  256 | 0.00027798  | 1.82828 |     0.3169 |                    1 |
| train_cifar_e4fb9_00001 | RUNNING  | 172.17.0.2:1608 |            4 |   32 |   16 | 0.00483071  |         |            |                      |
| train_cifar_e4fb9_00002 | PENDING  |                 |            2 |    8 |   16 | 0.00929677  |         |            |                      |
| train_cifar_e4fb9_00003 | PENDING  |                 |           16 |   64 |   64 | 0.00198823  |         |            |                      |
| train_cifar_e4fb9_00004 | PENDING  |                 |           16 |  128 |  256 | 0.0122764   |         |            |                      |
| train_cifar_e4fb9_00005 | PENDING  |                 |            4 |   32 |  128 | 0.00374699  |         |            |                      |
| train_cifar_e4fb9_00006 | PENDING  |                 |            8 |   32 |    8 | 0.00231239  |         |            |                      |
| train_cifar_e4fb9_00007 | PENDING  |                 |            8 |  256 |  256 | 0.0173587   |         |            |                      |
| train_cifar_e4fb9_00008 | PENDING  |                 |            2 |    4 |    4 | 0.00107032  |         |            |                      |
| train_cifar_e4fb9_00009 | PENDING  |                 |            2 |  256 |   32 | 0.000121737 |         |            |                      |
+-------------------------+----------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+

(func pid=1574) [2,  4000] loss: 0.841

== Status ==
Current time: 2022-05-22 20:02:01 (running for 00:01:04.36)
Memory usage on this node: 2.3/14.7 GiB
Using AsyncHyperBand: num_stopped=0
Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: None | Iter 1.000: -1.8282848861694336
Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4)
Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57
Number of trials: 10/10 (8 PENDING, 2 RUNNING)
+-------------------------+----------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
| Trial name              | status   | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
|-------------------------+----------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
| train_cifar_e4fb9_00000 | RUNNING  | 172.17.0.2:1574 |            8 |    8 |  256 | 0.00027798  | 1.82828 |     0.3169 |                    1 |
| train_cifar_e4fb9_00001 | RUNNING  | 172.17.0.2:1608 |            4 |   32 |   16 | 0.00483071  |         |            |                      |
| train_cifar_e4fb9_00002 | PENDING  |                 |            2 |    8 |   16 | 0.00929677  |         |            |                      |
| train_cifar_e4fb9_00003 | PENDING  |                 |           16 |   64 |   64 | 0.00198823  |         |            |                      |
| train_cifar_e4fb9_00004 | PENDING  |                 |           16 |  128 |  256 | 0.0122764   |         |            |                      |
| train_cifar_e4fb9_00005 | PENDING  |                 |            4 |   32 |  128 | 0.00374699  |         |            |                      |
| train_cifar_e4fb9_00006 | PENDING  |                 |            8 |   32 |    8 | 0.00231239  |         |            |                      |
| train_cifar_e4fb9_00007 | PENDING  |                 |            8 |  256 |  256 | 0.0173587   |         |            |                      |
| train_cifar_e4fb9_00008 | PENDING  |                 |            2 |    4 |    4 | 0.00107032  |         |            |                      |
| train_cifar_e4fb9_00009 | PENDING  |                 |            2 |  256 |   32 | 0.000121737 |         |            |                      |
+-------------------------+----------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+

Result for train_cifar_e4fb9_00001:
  accuracy: 0.3407
  date: 2022-05-22_20-02-04
  done: true
  experiment_id: bebbd20c7bc2469ea2fb0837736224bf
  hostname: 5c55bb57cfdf
  iterations_since_restore: 1
  loss: 1.9652087768375874
  node_ip: 172.17.0.2
  pid: 1608
  should_checkpoint: true
  time_since_restore: 61.96333050727844
  time_this_iter_s: 61.96333050727844
  time_total_s: 61.96333050727844
  timestamp: 1653249724
  timesteps_since_restore: 0
  training_iteration: 1
  trial_id: e4fb9_00001
  warmup_time: 0.003160715103149414

(func pid=1951) Files already downloaded and verified

== Status ==
Current time: 2022-05-22 20:02:10 (running for 00:01:12.54)
Memory usage on this node: 2.0/14.7 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: None | Iter 1.000: -1.8967468315035105
Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4)
Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57
Number of trials: 10/10 (7 PENDING, 2 RUNNING, 1 TERMINATED)
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
| Trial name              | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
|-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
| train_cifar_e4fb9_00000 | RUNNING    | 172.17.0.2:1574 |            8 |    8 |  256 | 0.00027798  | 1.82828 |     0.3169 |                    1 |
| train_cifar_e4fb9_00002 | RUNNING    | 172.17.0.2:1951 |            2 |    8 |   16 | 0.00929677  |         |            |                      |
| train_cifar_e4fb9_00003 | PENDING    |                 |           16 |   64 |   64 | 0.00198823  |         |            |                      |
| train_cifar_e4fb9_00004 | PENDING    |                 |           16 |  128 |  256 | 0.0122764   |         |            |                      |
| train_cifar_e4fb9_00005 | PENDING    |                 |            4 |   32 |  128 | 0.00374699  |         |            |                      |
| train_cifar_e4fb9_00006 | PENDING    |                 |            8 |   32 |    8 | 0.00231239  |         |            |                      |
| train_cifar_e4fb9_00007 | PENDING    |                 |            8 |  256 |  256 | 0.0173587   |         |            |                      |
| train_cifar_e4fb9_00008 | PENDING    |                 |            2 |    4 |    4 | 0.00107032  |         |            |                      |
| train_cifar_e4fb9_00009 | PENDING    |                 |            2 |  256 |   32 | 0.000121737 |         |            |                      |
| train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 |            4 |   32 |   16 | 0.00483071  | 1.96521 |     0.3407 |                    1 |
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+

Result for train_cifar_e4fb9_00000:
  accuracy: 0.3752
  date: 2022-05-22_20-02-10
  done: false
  experiment_id: 76614011963f4fd2bcc9e5e590a2ec47
  hostname: 5c55bb57cfdf
  iterations_since_restore: 2
  loss: 1.6399788443565368
  node_ip: 172.17.0.2
  pid: 1574
  should_checkpoint: true
  time_since_restore: 70.46458387374878
  time_this_iter_s: 33.977689027786255
  time_total_s: 70.46458387374878
  timestamp: 1653249730
  timesteps_since_restore: 0
  training_iteration: 2
  trial_id: e4fb9_00000
  warmup_time: 0.003105640411376953

(func pid=1951) Files already downloaded and verified

== Status ==
Current time: 2022-05-22 20:02:15 (running for 00:01:18.26)
Memory usage on this node: 2.3/14.7 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: -1.6399788443565368 | Iter 1.000: -1.8967468315035105
Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4)
Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57
Number of trials: 10/10 (7 PENDING, 2 RUNNING, 1 TERMINATED)
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
| Trial name              | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
|-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
| train_cifar_e4fb9_00000 | RUNNING    | 172.17.0.2:1574 |            8 |    8 |  256 | 0.00027798  | 1.63998 |     0.3752 |                    2 |
| train_cifar_e4fb9_00002 | RUNNING    | 172.17.0.2:1951 |            2 |    8 |   16 | 0.00929677  |         |            |                      |
| train_cifar_e4fb9_00003 | PENDING    |                 |           16 |   64 |   64 | 0.00198823  |         |            |                      |
| train_cifar_e4fb9_00004 | PENDING    |                 |           16 |  128 |  256 | 0.0122764   |         |            |                      |
| train_cifar_e4fb9_00005 | PENDING    |                 |            4 |   32 |  128 | 0.00374699  |         |            |                      |
| train_cifar_e4fb9_00006 | PENDING    |                 |            8 |   32 |    8 | 0.00231239  |         |            |                      |
| train_cifar_e4fb9_00007 | PENDING    |                 |            8 |  256 |  256 | 0.0173587   |         |            |                      |
| train_cifar_e4fb9_00008 | PENDING    |                 |            2 |    4 |    4 | 0.00107032  |         |            |                      |
| train_cifar_e4fb9_00009 | PENDING    |                 |            2 |  256 |   32 | 0.000121737 |         |            |                      |
| train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 |            4 |   32 |   16 | 0.00483071  | 1.96521 |     0.3407 |                    1 |
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+

(func pid=1951) [1,  2000] loss: 2.251

== Status ==
Current time: 2022-05-22 20:02:20 (running for 00:01:23.28)
Memory usage on this node: 2.4/14.7 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: -1.6399788443565368 | Iter 1.000: -1.8967468315035105
Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4)
Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57
Number of trials: 10/10 (7 PENDING, 2 RUNNING, 1 TERMINATED)
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
| Trial name              | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
|-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
| train_cifar_e4fb9_00000 | RUNNING    | 172.17.0.2:1574 |            8 |    8 |  256 | 0.00027798  | 1.63998 |     0.3752 |                    2 |
| train_cifar_e4fb9_00002 | RUNNING    | 172.17.0.2:1951 |            2 |    8 |   16 | 0.00929677  |         |            |                      |
| train_cifar_e4fb9_00003 | PENDING    |                 |           16 |   64 |   64 | 0.00198823  |         |            |                      |
| train_cifar_e4fb9_00004 | PENDING    |                 |           16 |  128 |  256 | 0.0122764   |         |            |                      |
| train_cifar_e4fb9_00005 | PENDING    |                 |            4 |   32 |  128 | 0.00374699  |         |            |                      |
| train_cifar_e4fb9_00006 | PENDING    |                 |            8 |   32 |    8 | 0.00231239  |         |            |                      |
| train_cifar_e4fb9_00007 | PENDING    |                 |            8 |  256 |  256 | 0.0173587   |         |            |                      |
| train_cifar_e4fb9_00008 | PENDING    |                 |            2 |    4 |    4 | 0.00107032  |         |            |                      |
| train_cifar_e4fb9_00009 | PENDING    |                 |            2 |  256 |   32 | 0.000121737 |         |            |                      |
| train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 |            4 |   32 |   16 | 0.00483071  | 1.96521 |     0.3407 |                    1 |
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+

(func pid=1574) [3,  2000] loss: 1.580

== Status ==
Current time: 2022-05-22 20:02:25 (running for 00:01:28.30)
Memory usage on this node: 2.4/14.7 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: -1.6399788443565368 | Iter 1.000: -1.8967468315035105
Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4)
Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57
Number of trials: 10/10 (7 PENDING, 2 RUNNING, 1 TERMINATED)
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
| Trial name              | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
|-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
| train_cifar_e4fb9_00000 | RUNNING    | 172.17.0.2:1574 |            8 |    8 |  256 | 0.00027798  | 1.63998 |     0.3752 |                    2 |
| train_cifar_e4fb9_00002 | RUNNING    | 172.17.0.2:1951 |            2 |    8 |   16 | 0.00929677  |         |            |                      |
| train_cifar_e4fb9_00003 | PENDING    |                 |           16 |   64 |   64 | 0.00198823  |         |            |                      |
| train_cifar_e4fb9_00004 | PENDING    |                 |           16 |  128 |  256 | 0.0122764   |         |            |                      |
| train_cifar_e4fb9_00005 | PENDING    |                 |            4 |   32 |  128 | 0.00374699  |         |            |                      |
| train_cifar_e4fb9_00006 | PENDING    |                 |            8 |   32 |    8 | 0.00231239  |         |            |                      |
| train_cifar_e4fb9_00007 | PENDING    |                 |            8 |  256 |  256 | 0.0173587   |         |            |                      |
| train_cifar_e4fb9_00008 | PENDING    |                 |            2 |    4 |    4 | 0.00107032  |         |            |                      |
| train_cifar_e4fb9_00009 | PENDING    |                 |            2 |  256 |   32 | 0.000121737 |         |            |                      |
| train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 |            4 |   32 |   16 | 0.00483071  | 1.96521 |     0.3407 |                    1 |
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+

(func pid=1951) [1,  4000] loss: 1.105

== Status ==
Current time: 2022-05-22 20:02:30 (running for 00:01:33.32)
Memory usage on this node: 2.4/14.7 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: -1.6399788443565368 | Iter 1.000: -1.8967468315035105
Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4)
Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57
Number of trials: 10/10 (7 PENDING, 2 RUNNING, 1 TERMINATED)
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
| Trial name              | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
|-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
| train_cifar_e4fb9_00000 | RUNNING    | 172.17.0.2:1574 |            8 |    8 |  256 | 0.00027798  | 1.63998 |     0.3752 |                    2 |
| train_cifar_e4fb9_00002 | RUNNING    | 172.17.0.2:1951 |            2 |    8 |   16 | 0.00929677  |         |            |                      |
| train_cifar_e4fb9_00003 | PENDING    |                 |           16 |   64 |   64 | 0.00198823  |         |            |                      |
| train_cifar_e4fb9_00004 | PENDING    |                 |           16 |  128 |  256 | 0.0122764   |         |            |                      |
| train_cifar_e4fb9_00005 | PENDING    |                 |            4 |   32 |  128 | 0.00374699  |         |            |                      |
| train_cifar_e4fb9_00006 | PENDING    |                 |            8 |   32 |    8 | 0.00231239  |         |            |                      |
| train_cifar_e4fb9_00007 | PENDING    |                 |            8 |  256 |  256 | 0.0173587   |         |            |                      |
| train_cifar_e4fb9_00008 | PENDING    |                 |            2 |    4 |    4 | 0.00107032  |         |            |                      |
| train_cifar_e4fb9_00009 | PENDING    |                 |            2 |  256 |   32 | 0.000121737 |         |            |                      |
| train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 |            4 |   32 |   16 | 0.00483071  | 1.96521 |     0.3407 |                    1 |
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+

(func pid=1574) [3,  4000] loss: 0.768

== Status ==
Current time: 2022-05-22 20:02:35 (running for 00:01:38.34)
Memory usage on this node: 2.4/14.7 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: -1.6399788443565368 | Iter 1.000: -1.8967468315035105
Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4)
Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57
Number of trials: 10/10 (7 PENDING, 2 RUNNING, 1 TERMINATED)
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
| Trial name              | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
|-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
| train_cifar_e4fb9_00000 | RUNNING    | 172.17.0.2:1574 |            8 |    8 |  256 | 0.00027798  | 1.63998 |     0.3752 |                    2 |
| train_cifar_e4fb9_00002 | RUNNING    | 172.17.0.2:1951 |            2 |    8 |   16 | 0.00929677  |         |            |                      |
| train_cifar_e4fb9_00003 | PENDING    |                 |           16 |   64 |   64 | 0.00198823  |         |            |                      |
| train_cifar_e4fb9_00004 | PENDING    |                 |           16 |  128 |  256 | 0.0122764   |         |            |                      |
| train_cifar_e4fb9_00005 | PENDING    |                 |            4 |   32 |  128 | 0.00374699  |         |            |                      |
| train_cifar_e4fb9_00006 | PENDING    |                 |            8 |   32 |    8 | 0.00231239  |         |            |                      |
| train_cifar_e4fb9_00007 | PENDING    |                 |            8 |  256 |  256 | 0.0173587   |         |            |                      |
| train_cifar_e4fb9_00008 | PENDING    |                 |            2 |    4 |    4 | 0.00107032  |         |            |                      |
| train_cifar_e4fb9_00009 | PENDING    |                 |            2 |  256 |   32 | 0.000121737 |         |            |                      |
| train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 |            4 |   32 |   16 | 0.00483071  | 1.96521 |     0.3407 |                    1 |
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+

(func pid=1951) [1,  6000] loss: 0.758

== Status ==
Current time: 2022-05-22 20:02:40 (running for 00:01:43.36)
Memory usage on this node: 2.3/14.7 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: -1.6399788443565368 | Iter 1.000: -1.8967468315035105
Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4)
Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57
Number of trials: 10/10 (7 PENDING, 2 RUNNING, 1 TERMINATED)
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
| Trial name              | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
|-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
| train_cifar_e4fb9_00000 | RUNNING    | 172.17.0.2:1574 |            8 |    8 |  256 | 0.00027798  | 1.63998 |     0.3752 |                    2 |
| train_cifar_e4fb9_00002 | RUNNING    | 172.17.0.2:1951 |            2 |    8 |   16 | 0.00929677  |         |            |                      |
| train_cifar_e4fb9_00003 | PENDING    |                 |           16 |   64 |   64 | 0.00198823  |         |            |                      |
| train_cifar_e4fb9_00004 | PENDING    |                 |           16 |  128 |  256 | 0.0122764   |         |            |                      |
| train_cifar_e4fb9_00005 | PENDING    |                 |            4 |   32 |  128 | 0.00374699  |         |            |                      |
| train_cifar_e4fb9_00006 | PENDING    |                 |            8 |   32 |    8 | 0.00231239  |         |            |                      |
| train_cifar_e4fb9_00007 | PENDING    |                 |            8 |  256 |  256 | 0.0173587   |         |            |                      |
| train_cifar_e4fb9_00008 | PENDING    |                 |            2 |    4 |    4 | 0.00107032  |         |            |                      |
| train_cifar_e4fb9_00009 | PENDING    |                 |            2 |  256 |   32 | 0.000121737 |         |            |                      |
| train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 |            4 |   32 |   16 | 0.00483071  | 1.96521 |     0.3407 |                    1 |
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+

Result for train_cifar_e4fb9_00000:
  accuracy: 0.4391
  date: 2022-05-22_20-02-44
  done: false
  experiment_id: 76614011963f4fd2bcc9e5e590a2ec47
  hostname: 5c55bb57cfdf
  iterations_since_restore: 3
  loss: 1.5362158026218413
  node_ip: 172.17.0.2
  pid: 1574
  should_checkpoint: true
  time_since_restore: 104.1298143863678
  time_this_iter_s: 33.66523051261902
  time_total_s: 104.1298143863678
  timestamp: 1653249764
  timesteps_since_restore: 0
  training_iteration: 3
  trial_id: e4fb9_00000
  warmup_time: 0.003105640411376953

(func pid=1951) [1,  8000] loss: 0.573

== Status ==
Current time: 2022-05-22 20:02:49 (running for 00:01:51.92)
Memory usage on this node: 2.4/14.7 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: -1.6399788443565368 | Iter 1.000: -1.8967468315035105
Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4)
Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57
Number of trials: 10/10 (7 PENDING, 2 RUNNING, 1 TERMINATED)
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
| Trial name              | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
|-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
| train_cifar_e4fb9_00000 | RUNNING    | 172.17.0.2:1574 |            8 |    8 |  256 | 0.00027798  | 1.53622 |     0.4391 |                    3 |
| train_cifar_e4fb9_00002 | RUNNING    | 172.17.0.2:1951 |            2 |    8 |   16 | 0.00929677  |         |            |                      |
| train_cifar_e4fb9_00003 | PENDING    |                 |           16 |   64 |   64 | 0.00198823  |         |            |                      |
| train_cifar_e4fb9_00004 | PENDING    |                 |           16 |  128 |  256 | 0.0122764   |         |            |                      |
| train_cifar_e4fb9_00005 | PENDING    |                 |            4 |   32 |  128 | 0.00374699  |         |            |                      |
| train_cifar_e4fb9_00006 | PENDING    |                 |            8 |   32 |    8 | 0.00231239  |         |            |                      |
| train_cifar_e4fb9_00007 | PENDING    |                 |            8 |  256 |  256 | 0.0173587   |         |            |                      |
| train_cifar_e4fb9_00008 | PENDING    |                 |            2 |    4 |    4 | 0.00107032  |         |            |                      |
| train_cifar_e4fb9_00009 | PENDING    |                 |            2 |  256 |   32 | 0.000121737 |         |            |                      |
| train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 |            4 |   32 |   16 | 0.00483071  | 1.96521 |     0.3407 |                    1 |
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+

== Status ==
Current time: 2022-05-22 20:02:54 (running for 00:01:56.93)
Memory usage on this node: 2.4/14.7 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: -1.6399788443565368 | Iter 1.000: -1.8967468315035105
Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4)
Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57
Number of trials: 10/10 (7 PENDING, 2 RUNNING, 1 TERMINATED)
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
| Trial name              | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
|-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
| train_cifar_e4fb9_00000 | RUNNING    | 172.17.0.2:1574 |            8 |    8 |  256 | 0.00027798  | 1.53622 |     0.4391 |                    3 |
| train_cifar_e4fb9_00002 | RUNNING    | 172.17.0.2:1951 |            2 |    8 |   16 | 0.00929677  |         |            |                      |
| train_cifar_e4fb9_00003 | PENDING    |                 |           16 |   64 |   64 | 0.00198823  |         |            |                      |
| train_cifar_e4fb9_00004 | PENDING    |                 |           16 |  128 |  256 | 0.0122764   |         |            |                      |
| train_cifar_e4fb9_00005 | PENDING    |                 |            4 |   32 |  128 | 0.00374699  |         |            |                      |
| train_cifar_e4fb9_00006 | PENDING    |                 |            8 |   32 |    8 | 0.00231239  |         |            |                      |
| train_cifar_e4fb9_00007 | PENDING    |                 |            8 |  256 |  256 | 0.0173587   |         |            |                      |
| train_cifar_e4fb9_00008 | PENDING    |                 |            2 |    4 |    4 | 0.00107032  |         |            |                      |
| train_cifar_e4fb9_00009 | PENDING    |                 |            2 |  256 |   32 | 0.000121737 |         |            |                      |
| train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 |            4 |   32 |   16 | 0.00483071  | 1.96521 |     0.3407 |                    1 |
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+

(func pid=1574) [4,  2000] loss: 1.463
(func pid=1951) [1, 10000] loss: 0.463

== Status ==
Current time: 2022-05-22 20:02:59 (running for 00:02:01.95)
Memory usage on this node: 2.4/14.7 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: -1.6399788443565368 | Iter 1.000: -1.8967468315035105
Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4)
Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57
Number of trials: 10/10 (7 PENDING, 2 RUNNING, 1 TERMINATED)
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
| Trial name              | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
|-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
| train_cifar_e4fb9_00000 | RUNNING    | 172.17.0.2:1574 |            8 |    8 |  256 | 0.00027798  | 1.53622 |     0.4391 |                    3 |
| train_cifar_e4fb9_00002 | RUNNING    | 172.17.0.2:1951 |            2 |    8 |   16 | 0.00929677  |         |            |                      |
| train_cifar_e4fb9_00003 | PENDING    |                 |           16 |   64 |   64 | 0.00198823  |         |            |                      |
| train_cifar_e4fb9_00004 | PENDING    |                 |           16 |  128 |  256 | 0.0122764   |         |            |                      |
| train_cifar_e4fb9_00005 | PENDING    |                 |            4 |   32 |  128 | 0.00374699  |         |            |                      |
| train_cifar_e4fb9_00006 | PENDING    |                 |            8 |   32 |    8 | 0.00231239  |         |            |                      |
| train_cifar_e4fb9_00007 | PENDING    |                 |            8 |  256 |  256 | 0.0173587   |         |            |                      |
| train_cifar_e4fb9_00008 | PENDING    |                 |            2 |    4 |    4 | 0.00107032  |         |            |                      |
| train_cifar_e4fb9_00009 | PENDING    |                 |            2 |  256 |   32 | 0.000121737 |         |            |                      |
| train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 |            4 |   32 |   16 | 0.00483071  | 1.96521 |     0.3407 |                    1 |
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+

== Status ==
Current time: 2022-05-22 20:03:04 (running for 00:02:06.97)
Memory usage on this node: 2.4/14.7 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: -1.6399788443565368 | Iter 1.000: -1.8967468315035105
Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4)
Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57
Number of trials: 10/10 (7 PENDING, 2 RUNNING, 1 TERMINATED)
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
| Trial name              | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
|-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
| train_cifar_e4fb9_00000 | RUNNING    | 172.17.0.2:1574 |            8 |    8 |  256 | 0.00027798  | 1.53622 |     0.4391 |                    3 |
| train_cifar_e4fb9_00002 | RUNNING    | 172.17.0.2:1951 |            2 |    8 |   16 | 0.00929677  |         |            |                      |
| train_cifar_e4fb9_00003 | PENDING    |                 |           16 |   64 |   64 | 0.00198823  |         |            |                      |
| train_cifar_e4fb9_00004 | PENDING    |                 |           16 |  128 |  256 | 0.0122764   |         |            |                      |
| train_cifar_e4fb9_00005 | PENDING    |                 |            4 |   32 |  128 | 0.00374699  |         |            |                      |
| train_cifar_e4fb9_00006 | PENDING    |                 |            8 |   32 |    8 | 0.00231239  |         |            |                      |
| train_cifar_e4fb9_00007 | PENDING    |                 |            8 |  256 |  256 | 0.0173587   |         |            |                      |
| train_cifar_e4fb9_00008 | PENDING    |                 |            2 |    4 |    4 | 0.00107032  |         |            |                      |
| train_cifar_e4fb9_00009 | PENDING    |                 |            2 |  256 |   32 | 0.000121737 |         |            |                      |
| train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 |            4 |   32 |   16 | 0.00483071  | 1.96521 |     0.3407 |                    1 |
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+

(func pid=1951) [1, 12000] loss: 0.386
(func pid=1574) [4,  4000] loss: 0.724

== Status ==
Current time: 2022-05-22 20:03:09 (running for 00:02:11.99)
Memory usage on this node: 2.4/14.7 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: -1.6399788443565368 | Iter 1.000: -1.8967468315035105
Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4)
Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57
Number of trials: 10/10 (7 PENDING, 2 RUNNING, 1 TERMINATED)
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
| Trial name              | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
|-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
| train_cifar_e4fb9_00000 | RUNNING    | 172.17.0.2:1574 |            8 |    8 |  256 | 0.00027798  | 1.53622 |     0.4391 |                    3 |
| train_cifar_e4fb9_00002 | RUNNING    | 172.17.0.2:1951 |            2 |    8 |   16 | 0.00929677  |         |            |                      |
| train_cifar_e4fb9_00003 | PENDING    |                 |           16 |   64 |   64 | 0.00198823  |         |            |                      |
| train_cifar_e4fb9_00004 | PENDING    |                 |           16 |  128 |  256 | 0.0122764   |         |            |                      |
| train_cifar_e4fb9_00005 | PENDING    |                 |            4 |   32 |  128 | 0.00374699  |         |            |                      |
| train_cifar_e4fb9_00006 | PENDING    |                 |            8 |   32 |    8 | 0.00231239  |         |            |                      |
| train_cifar_e4fb9_00007 | PENDING    |                 |            8 |  256 |  256 | 0.0173587   |         |            |                      |
| train_cifar_e4fb9_00008 | PENDING    |                 |            2 |    4 |    4 | 0.00107032  |         |            |                      |
| train_cifar_e4fb9_00009 | PENDING    |                 |            2 |  256 |   32 | 0.000121737 |         |            |                      |
| train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 |            4 |   32 |   16 | 0.00483071  | 1.96521 |     0.3407 |                    1 |
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+

== Status ==
Current time: 2022-05-22 20:03:14 (running for 00:02:17.00)
Memory usage on this node: 2.3/14.7 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: -1.6399788443565368 | Iter 1.000: -1.8967468315035105
Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4)
Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57
Number of trials: 10/10 (7 PENDING, 2 RUNNING, 1 TERMINATED)
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
| Trial name              | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
|-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
| train_cifar_e4fb9_00000 | RUNNING    | 172.17.0.2:1574 |            8 |    8 |  256 | 0.00027798  | 1.53622 |     0.4391 |                    3 |
| train_cifar_e4fb9_00002 | RUNNING    | 172.17.0.2:1951 |            2 |    8 |   16 | 0.00929677  |         |            |                      |
| train_cifar_e4fb9_00003 | PENDING    |                 |           16 |   64 |   64 | 0.00198823  |         |            |                      |
| train_cifar_e4fb9_00004 | PENDING    |                 |           16 |  128 |  256 | 0.0122764   |         |            |                      |
| train_cifar_e4fb9_00005 | PENDING    |                 |            4 |   32 |  128 | 0.00374699  |         |            |                      |
| train_cifar_e4fb9_00006 | PENDING    |                 |            8 |   32 |    8 | 0.00231239  |         |            |                      |
| train_cifar_e4fb9_00007 | PENDING    |                 |            8 |  256 |  256 | 0.0173587   |         |            |                      |
| train_cifar_e4fb9_00008 | PENDING    |                 |            2 |    4 |    4 | 0.00107032  |         |            |                      |
| train_cifar_e4fb9_00009 | PENDING    |                 |            2 |  256 |   32 | 0.000121737 |         |            |                      |
| train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 |            4 |   32 |   16 | 0.00483071  | 1.96521 |     0.3407 |                    1 |
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+

(func pid=1951) [1, 14000] loss: 0.330

Result for train_cifar_e4fb9_00000:
  accuracy: 0.4854
  date: 2022-05-22_20-03-17
  done: false
  experiment_id: 76614011963f4fd2bcc9e5e590a2ec47
  hostname: 5c55bb57cfdf
  iterations_since_restore: 4
  loss: 1.4141437401771546
  node_ip: 172.17.0.2
  pid: 1574
  should_checkpoint: true
  time_since_restore: 137.1664378643036
  time_this_iter_s: 33.03662347793579
  time_total_s: 137.1664378643036
  timestamp: 1653249797
  timesteps_since_restore: 0
  training_iteration: 4
  trial_id: e4fb9_00000
  warmup_time: 0.003105640411376953

== Status ==
Current time: 2022-05-22 20:03:22 (running for 00:02:24.96)
Memory usage on this node: 2.4/14.7 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 8.000: None | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.6399788443565368 | Iter 1.000: -1.8967468315035105
Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4)
Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57
Number of trials: 10/10 (7 PENDING, 2 RUNNING, 1 TERMINATED)
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
| Trial name              | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
|-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
| train_cifar_e4fb9_00000 | RUNNING    | 172.17.0.2:1574 |            8 |    8 |  256 | 0.00027798  | 1.41414 |     0.4854 |                    4 |
| train_cifar_e4fb9_00002 | RUNNING    | 172.17.0.2:1951 |            2 |    8 |   16 | 0.00929677  |         |            |                      |
| train_cifar_e4fb9_00003 | PENDING    |                 |           16 |   64 |   64 | 0.00198823  |         |            |                      |
| train_cifar_e4fb9_00004 | PENDING    |                 |           16 |  128 |  256 | 0.0122764   |         |            |                      |
| train_cifar_e4fb9_00005 | PENDING    |                 |            4 |   32 |  128 | 0.00374699  |         |            |                      |
| train_cifar_e4fb9_00006 | PENDING    |                 |            8 |   32 |    8 | 0.00231239  |         |            |                      |
| train_cifar_e4fb9_00007 | PENDING    |                 |            8 |  256 |  256 | 0.0173587   |         |            |                      |
| train_cifar_e4fb9_00008 | PENDING    |                 |            2 |    4 |    4 | 0.00107032  |         |            |                      |
| train_cifar_e4fb9_00009 | PENDING    |                 |            2 |  256 |   32 | 0.000121737 |         |            |                      |
| train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 |            4 |   32 |   16 | 0.00483071  | 1.96521 |     0.3407 |                    1 |
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+

(func pid=1951) [1, 16000] loss: 0.289

== Status ==
Current time: 2022-05-22 20:03:27 (running for 00:02:29.97)
Memory usage on this node: 2.4/14.7 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 8.000: None | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.6399788443565368 | Iter 1.000: -1.8967468315035105
Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4)
Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57
Number of trials: 10/10 (7 PENDING, 2 RUNNING, 1 TERMINATED)
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
| Trial name              | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
|-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
| train_cifar_e4fb9_00000 | RUNNING    | 172.17.0.2:1574 |            8 |    8 |  256 | 0.00027798  | 1.41414 |     0.4854 |                    4 |
| train_cifar_e4fb9_00002 | RUNNING    | 172.17.0.2:1951 |            2 |    8 |   16 | 0.00929677  |         |            |                      |
| train_cifar_e4fb9_00003 | PENDING    |                 |           16 |   64 |   64 | 0.00198823  |         |            |                      |
| train_cifar_e4fb9_00004 | PENDING    |                 |           16 |  128 |  256 | 0.0122764   |         |            |                      |
| train_cifar_e4fb9_00005 | PENDING    |                 |            4 |   32 |  128 | 0.00374699  |         |            |                      |
| train_cifar_e4fb9_00006 | PENDING    |                 |            8 |   32 |    8 | 0.00231239  |         |            |                      |
| train_cifar_e4fb9_00007 | PENDING    |                 |            8 |  256 |  256 | 0.0173587   |         |            |                      |
| train_cifar_e4fb9_00008 | PENDING    |                 |            2 |    4 |    4 | 0.00107032  |         |            |                      |
| train_cifar_e4fb9_00009 | PENDING    |                 |            2 |  256 |   32 | 0.000121737 |         |            |                      |
| train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 |            4 |   32 |   16 | 0.00483071  | 1.96521 |     0.3407 |                    1 |
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+

(func pid=1574) [5,  2000] loss: 1.389

== Status ==
Current time: 2022-05-22 20:03:32 (running for 00:02:34.99)
Memory usage on this node: 2.4/14.7 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 8.000: None | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.6399788443565368 | Iter 1.000: -1.8967468315035105
Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4)
Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57
Number of trials: 10/10 (7 PENDING, 2 RUNNING, 1 TERMINATED)
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
| Trial name              | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
|-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
| train_cifar_e4fb9_00000 | RUNNING    | 172.17.0.2:1574 |            8 |    8 |  256 | 0.00027798  | 1.41414 |     0.4854 |                    4 |
| train_cifar_e4fb9_00002 | RUNNING    | 172.17.0.2:1951 |            2 |    8 |   16 | 0.00929677  |         |            |                      |
| train_cifar_e4fb9_00003 | PENDING    |                 |           16 |   64 |   64 | 0.00198823  |         |            |                      |
| train_cifar_e4fb9_00004 | PENDING    |                 |           16 |  128 |  256 | 0.0122764   |         |            |                      |
| train_cifar_e4fb9_00005 | PENDING    |                 |            4 |   32 |  128 | 0.00374699  |         |            |                      |
| train_cifar_e4fb9_00006 | PENDING    |                 |            8 |   32 |    8 | 0.00231239  |         |            |                      |
| train_cifar_e4fb9_00007 | PENDING    |                 |            8 |  256 |  256 | 0.0173587   |         |            |                      |
| train_cifar_e4fb9_00008 | PENDING    |                 |            2 |    4 |    4 | 0.00107032  |         |            |                      |
| train_cifar_e4fb9_00009 | PENDING    |                 |            2 |  256 |   32 | 0.000121737 |         |            |                      |
| train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 |            4 |   32 |   16 | 0.00483071  | 1.96521 |     0.3407 |                    1 |
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+

(func pid=1951) [1, 18000] loss: 0.257

== Status ==
Current time: 2022-05-22 20:03:37 (running for 00:02:40.01)
Memory usage on this node: 2.4/14.7 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 8.000: None | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.6399788443565368 | Iter 1.000: -1.8967468315035105
Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4)
Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57
Number of trials: 10/10 (7 PENDING, 2 RUNNING, 1 TERMINATED)
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
| Trial name              | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
|-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
| train_cifar_e4fb9_00000 | RUNNING    | 172.17.0.2:1574 |            8 |    8 |  256 | 0.00027798  | 1.41414 |     0.4854 |                    4 |
| train_cifar_e4fb9_00002 | RUNNING    | 172.17.0.2:1951 |            2 |    8 |   16 | 0.00929677  |         |            |                      |
| train_cifar_e4fb9_00003 | PENDING    |                 |           16 |   64 |   64 | 0.00198823  |         |            |                      |
| train_cifar_e4fb9_00004 | PENDING    |                 |           16 |  128 |  256 | 0.0122764   |         |            |                      |
| train_cifar_e4fb9_00005 | PENDING    |                 |            4 |   32 |  128 | 0.00374699  |         |            |                      |
| train_cifar_e4fb9_00006 | PENDING    |                 |            8 |   32 |    8 | 0.00231239  |         |            |                      |
| train_cifar_e4fb9_00007 | PENDING    |                 |            8 |  256 |  256 | 0.0173587   |         |            |                      |
| train_cifar_e4fb9_00008 | PENDING    |                 |            2 |    4 |    4 | 0.00107032  |         |            |                      |
| train_cifar_e4fb9_00009 | PENDING    |                 |            2 |  256 |   32 | 0.000121737 |         |            |                      |
| train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 |            4 |   32 |   16 | 0.00483071  | 1.96521 |     0.3407 |                    1 |
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+

(func pid=1574) [5,  4000] loss: 0.676

== Status ==
Current time: 2022-05-22 20:03:42 (running for 00:02:45.02)
Memory usage on this node: 2.4/14.7 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 8.000: None | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.6399788443565368 | Iter 1.000: -1.8967468315035105
Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4)
Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (7 PENDING, 2 RUNNING, 1 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.41414 | 0.4854 | 4 | | train_cifar_e4fb9_00002 | RUNNING | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | | | | | train_cifar_e4fb9_00003 | PENDING | | 16 | 64 | 64 | 0.00198823 | | | | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=1951) [1, 20000] loss: 0.231 == Status == Current time: 2022-05-22 20:03:47 (running for 00:02:50.03) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=1 Bracket: Iter 8.000: None | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.6399788443565368 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: 
/var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (7 PENDING, 2 RUNNING, 1 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.41414 | 0.4854 | 4 | | train_cifar_e4fb9_00002 | RUNNING | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | | | | | train_cifar_e4fb9_00003 | PENDING | | 16 | 64 | 64 | 0.00198823 | | | | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ Result for train_cifar_e4fb9_00000: accuracy: 0.4998 date: 2022-05-22_20-03-51 done: false experiment_id: 76614011963f4fd2bcc9e5e590a2ec47 hostname: 5c55bb57cfdf iterations_since_restore: 5 loss: 1.3751472884893416 node_ip: 172.17.0.2 pid: 1574 should_checkpoint: true time_since_restore: 171.00617098808289 time_this_iter_s: 33.8397331237793 time_total_s: 171.00617098808289 timestamp: 1653249831 timesteps_since_restore: 0 training_iteration: 5 trial_id: e4fb9_00000 warmup_time: 0.003105640411376953 == Status == 
Current time: 2022-05-22 20:03:56 (running for 00:02:58.80) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=1 Bracket: Iter 8.000: None | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.6399788443565368 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (7 PENDING, 2 RUNNING, 1 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.37515 | 0.4998 | 5 | | train_cifar_e4fb9_00002 | RUNNING | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | | | | | train_cifar_e4fb9_00003 | PENDING | | 16 | 64 | 64 | 0.00198823 | | | | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ Result for train_cifar_e4fb9_00002: accuracy: 0.1 date: 2022-05-22_20-03-59 done: true experiment_id: 
e59b460843574eef96a494e1107ad3af hostname: 5c55bb57cfdf iterations_since_restore: 1 loss: 2.3127114450454713 node_ip: 172.17.0.2 pid: 1951 should_checkpoint: true time_since_restore: 111.20791006088257 time_this_iter_s: 111.20791006088257 time_total_s: 111.20791006088257 timestamp: 1653249839 timesteps_since_restore: 0 training_iteration: 1 trial_id: e4fb9_00002 warmup_time: 0.0038728713989257812 (func pid=1574) [6, 2000] loss: 1.324 (func pid=2587) Files already downloaded and verified == Status == Current time: 2022-05-22 20:04:05 (running for 00:03:07.59) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: None | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.6399788443565368 | Iter 1.000: -1.9652087768375874 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.37515 | 0.4998 | 5 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | | | | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 
| PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=2587) Files already downloaded and verified == Status == Current time: 2022-05-22 20:04:10 (running for 00:03:12.61) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: None | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.6399788443565368 | Iter 1.000: -1.9652087768375874 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.37515 | 0.4998 | 5 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | | | | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 
| 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=1574) [6, 4000] loss: 0.647 == Status == Current time: 2022-05-22 20:04:15 (running for 00:03:17.63) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: None | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.6399788443565368 | Iter 1.000: -1.9652087768375874 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.37515 | 0.4998 | 5 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | | | | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | 
train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ == Status == Current time: 2022-05-22 20:04:20 (running for 00:03:22.65) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: None | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.6399788443565368 | Iter 1.000: -1.9652087768375874 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.37515 | 0.4998 | 5 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | | | | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 
0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=2587) [1, 2000] loss: 1.986 == Status == Current time: 2022-05-22 20:04:25 (running for 00:03:27.67) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: None | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.6399788443565368 | Iter 1.000: -1.9652087768375874 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.37515 | 0.4998 | 5 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | | | | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | 
train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ Result for train_cifar_e4fb9_00000: accuracy: 0.5192 date: 2022-05-22_20-04-26 done: false experiment_id: 76614011963f4fd2bcc9e5e590a2ec47 hostname: 5c55bb57cfdf iterations_since_restore: 6 loss: 1.3341178626060486 node_ip: 172.17.0.2 pid: 1574 should_checkpoint: true time_since_restore: 205.86415553092957 time_this_iter_s: 34.85798454284668 time_total_s: 205.86415553092957 timestamp: 1653249866 timesteps_since_restore: 0 training_iteration: 6 trial_id: e4fb9_00000 warmup_time: 0.003105640411376953 Result for train_cifar_e4fb9_00003: accuracy: 0.4174 date: 2022-05-22_20-04-30 done: false experiment_id: 88bc59797ca740278250b83ef8eb1a08 hostname: 5c55bb57cfdf iterations_since_restore: 1 loss: 1.6141361724853516 node_ip: 172.17.0.2 pid: 2587 should_checkpoint: true time_since_restore: 26.71685481071472 time_this_iter_s: 26.71685481071472 time_total_s: 26.71685481071472 timestamp: 1653249870 timesteps_since_restore: 0 training_iteration: 1 trial_id: e4fb9_00003 warmup_time: 0.004075050354003906 == Status == Current time: 2022-05-22 20:04:35 (running for 00:03:37.55) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: None | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.6399788443565368 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | 
accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.33412 | 0.5192 | 6 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.61414 | 0.4174 | 1 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=1574) [7, 2000] loss: 1.251 == Status == Current time: 2022-05-22 20:04:40 (running for 00:03:42.56) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: None | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.6399788443565368 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | 
training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.33412 | 0.5192 | 6 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.61414 | 0.4174 | 1 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ == Status == Current time: 2022-05-22 20:04:45 (running for 00:03:47.58) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: None | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.6399788443565368 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | 
|-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.33412 | 0.5192 | 6 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.61414 | 0.4174 | 1 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=2587) [2, 2000] loss: 1.506 == Status == Current time: 2022-05-22 20:04:50 (running for 00:03:52.60) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: None | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.6399788443565368 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | 
|-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.33412 | 0.5192 | 6 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.61414 | 0.4174 | 1 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=1574) [7, 4000] loss: 0.632 Result for train_cifar_e4fb9_00003: accuracy: 0.5011 date: 2022-05-22_20-04-53 done: false experiment_id: 88bc59797ca740278250b83ef8eb1a08 hostname: 5c55bb57cfdf iterations_since_restore: 2 loss: 1.3845552258491516 node_ip: 172.17.0.2 pid: 2587 should_checkpoint: true time_since_restore: 50.12973976135254 time_this_iter_s: 23.412884950637817 time_total_s: 50.12973976135254 timestamp: 1653249893 timesteps_since_restore: 0 training_iteration: 2 trial_id: e4fb9_00003 warmup_time: 0.004075050354003906 == Status == Current time: 2022-05-22 20:04:58 (running for 00:04:00.95) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: None | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.5122670351028442 | Iter 1.000: -1.8967468315035105 Resources 
requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.33412 | 0.5192 | 6 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.38456 | 0.5011 | 2 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ == Status == Current time: 2022-05-22 20:05:03 (running for 00:04:05.96) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: None | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.5122670351028442 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 
GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.33412 | 0.5192 | 6 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.38456 | 0.5011 | 2 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ Result for train_cifar_e4fb9_00000: accuracy: 0.5406 date: 2022-05-22_20-05-04 done: false experiment_id: 76614011963f4fd2bcc9e5e590a2ec47 hostname: 5c55bb57cfdf iterations_since_restore: 7 loss: 1.2897682260513306 node_ip: 172.17.0.2 pid: 1574 should_checkpoint: true time_since_restore: 244.41224312782288 time_this_iter_s: 38.54808759689331 time_total_s: 244.41224312782288 timestamp: 1653249904 
timesteps_since_restore: 0 training_iteration: 7 trial_id: e4fb9_00000 warmup_time: 0.003105640411376953 == Status == Current time: 2022-05-22 20:05:09 (running for 00:04:12.22) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: None | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.5122670351028442 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.28977 | 0.5406 | 7 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.38456 | 0.5011 | 2 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | 
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=2587) [3, 2000] loss: 1.342 == Status == Current time: 2022-05-22 20:05:14 (running for 00:04:17.24) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: None | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.5122670351028442 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.28977 | 0.5406 | 7 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.38456 | 0.5011 | 2 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | 
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ Result for train_cifar_e4fb9_00003: accuracy: 0.4861 date: 2022-05-22_20-05-17 done: false experiment_id: 88bc59797ca740278250b83ef8eb1a08 hostname: 5c55bb57cfdf iterations_since_restore: 3 loss: 1.3970972430229187 node_ip: 172.17.0.2 pid: 2587 should_checkpoint: true time_since_restore: 74.06872272491455 time_this_iter_s: 23.93898296356201 time_total_s: 74.06872272491455 timestamp: 1653249917 timesteps_since_restore: 0 training_iteration: 3 trial_id: e4fb9_00003 warmup_time: 0.004075050354003906 (func pid=1574) [8, 2000] loss: 1.229 == Status == Current time: 2022-05-22 20:05:22 (running for 00:04:24.88) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: None | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.5122670351028442 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.28977 | 0.5406 | 7 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.3971 | 0.4861 | 3 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | 
PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ == Status == Current time: 2022-05-22 20:05:27 (running for 00:04:29.92) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: None | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.5122670351028442 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.28977 | 0.5406 | 7 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.3971 | 0.4861 | 3 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | 
train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=1574) [8, 4000] loss: 0.601 == Status == Current time: 2022-05-22 20:05:32 (running for 00:04:34.93) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: None | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.5122670351028442 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.28977 | 0.5406 | 7 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.3971 | 0.4861 | 3 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | 
PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=2587) [4, 2000] loss: 1.247 == Status == Current time: 2022-05-22 20:05:37 (running for 00:04:39.96) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: None | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.5122670351028442 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.28977 | 0.5406 | 7 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.3971 | 0.4861 | 3 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 
| 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ Result for train_cifar_e4fb9_00003: accuracy: 0.5639 date: 2022-05-22_20-05-41 done: false experiment_id: 88bc59797ca740278250b83ef8eb1a08 hostname: 5c55bb57cfdf iterations_since_restore: 4 loss: 1.2204006052017211 node_ip: 172.17.0.2 pid: 2587 should_checkpoint: true time_since_restore: 98.22129130363464 time_this_iter_s: 24.152568578720093 time_total_s: 98.22129130363464 timestamp: 1653249941 timesteps_since_restore: 0 training_iteration: 4 trial_id: e4fb9_00003 warmup_time: 0.004075050354003906 Result for train_cifar_e4fb9_00000: accuracy: 0.5502 date: 2022-05-22_20-05-43 done: false experiment_id: 76614011963f4fd2bcc9e5e590a2ec47 hostname: 5c55bb57cfdf iterations_since_restore: 8 loss: 1.2576976752519609 node_ip: 172.17.0.2 pid: 1574 should_checkpoint: true time_since_restore: 283.00318574905396 time_this_iter_s: 38.59094262123108 time_total_s: 283.00318574905396 timestamp: 1653249943 timesteps_since_restore: 0 training_iteration: 8 trial_id: e4fb9_00000 warmup_time: 0.003105640411376953 == Status == Current time: 2022-05-22 20:05:43 (running for 00:04:45.79) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: -1.2576976752519609 | Iter 4.000: -1.3172721726894379 | Iter 2.000: -1.5122670351028442 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: 
/var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.2577 | 0.5502 | 8 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.2204 | 0.5639 | 4 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ == Status == Current time: 2022-05-22 20:05:48 (running for 00:04:50.81) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: -1.2576976752519609 | Iter 4.000: -1.3172721726894379 | Iter 2.000: -1.5122670351028442 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: 
/var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.2577 | 0.5502 | 8 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.2204 | 0.5639 | 4 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ == Status == Current time: 2022-05-22 20:05:53 (running for 00:04:55.84) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: -1.2576976752519609 | Iter 4.000: -1.3172721726894379 | Iter 2.000: -1.5122670351028442 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: 
/var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.2577 | 0.5502 | 8 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.2204 | 0.5639 | 4 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=1574) [9, 2000] loss: 1.186 (func pid=2587) [5, 2000] loss: 1.158 == Status == Current time: 2022-05-22 20:05:58 (running for 00:05:00.86) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: -1.2576976752519609 | Iter 4.000: -1.3172721726894379 | Iter 2.000: -1.5122670351028442 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 
accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.2577 | 0.5502 | 8 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.2204 | 0.5639 | 4 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ == Status == Current time: 2022-05-22 20:06:03 (running for 00:05:05.90) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: -1.2576976752519609 | Iter 4.000: -1.3172721726894379 | Iter 2.000: -1.5122670351028442 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: 
/var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.2577 | 0.5502 | 8 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.2204 | 0.5639 | 4 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ Result for train_cifar_e4fb9_00003: accuracy: 0.5755 date: 2022-05-22_20-06-05 done: false experiment_id: 88bc59797ca740278250b83ef8eb1a08 hostname: 5c55bb57cfdf iterations_since_restore: 5 loss: 1.202942657327652 node_ip: 172.17.0.2 pid: 2587 should_checkpoint: true time_since_restore: 122.0786521434784 time_this_iter_s: 23.85736083984375 time_total_s: 122.0786521434784 timestamp: 1653249965 timesteps_since_restore: 0 training_iteration: 5 trial_id: e4fb9_00003 
warmup_time: 0.004075050354003906 (func pid=1574) [9, 4000] loss: 0.592 == Status == Current time: 2022-05-22 20:06:10 (running for 00:05:12.88) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: -1.2576976752519609 | Iter 4.000: -1.3172721726894379 | Iter 2.000: -1.5122670351028442 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.2577 | 0.5502 | 8 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.20294 | 0.5755 | 5 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | 
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ == Status == Current time: 2022-05-22 20:06:15 (running for 00:05:17.90) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: -1.2576976752519609 | Iter 4.000: -1.3172721726894379 | Iter 2.000: -1.5122670351028442 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.2577 | 0.5502 | 8 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.20294 | 0.5755 | 5 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | 
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ == Status == Current time: 2022-05-22 20:06:20 (running for 00:05:22.93) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: -1.2576976752519609 | Iter 4.000: -1.3172721726894379 | Iter 2.000: -1.5122670351028442 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.2577 | 0.5502 | 8 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.20294 | 0.5755 | 5 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | 
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ Result for train_cifar_e4fb9_00000: accuracy: 0.5509 date: 2022-05-22_20-06-21 done: false experiment_id: 76614011963f4fd2bcc9e5e590a2ec47 hostname: 5c55bb57cfdf iterations_since_restore: 9 loss: 1.2393650981664657 node_ip: 172.17.0.2 pid: 1574 should_checkpoint: true time_since_restore: 320.79671597480774 time_this_iter_s: 37.793530225753784 time_total_s: 320.79671597480774 timestamp: 1653249981 timesteps_since_restore: 0 training_iteration: 9 trial_id: e4fb9_00000 warmup_time: 0.003105640411376953 (func pid=2587) [6, 2000] loss: 1.106 == Status == Current time: 2022-05-22 20:06:26 (running for 00:05:28.59) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: -1.2576976752519609 | Iter 4.000: -1.3172721726894379 | Iter 2.000: -1.5122670351028442 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.23937 | 0.5509 | 9 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.20294 | 0.5755 | 5 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | 
train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ Result for train_cifar_e4fb9_00003: accuracy: 0.6026 date: 2022-05-22_20-06-29 done: false experiment_id: 88bc59797ca740278250b83ef8eb1a08 hostname: 5c55bb57cfdf iterations_since_restore: 6 loss: 1.1285030343055724 node_ip: 172.17.0.2 pid: 2587 should_checkpoint: true time_since_restore: 146.44588947296143 time_this_iter_s: 24.367237329483032 time_total_s: 146.44588947296143 timestamp: 1653249989 timesteps_since_restore: 0 training_iteration: 6 trial_id: e4fb9_00003 warmup_time: 0.004075050354003906 (func pid=1574) [10, 2000] loss: 1.143 == Status == Current time: 2022-05-22 20:06:34 (running for 00:05:37.25) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: -1.2576976752519609 | Iter 4.000: -1.3172721726894379 | Iter 2.000: -1.5122670351028442 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | 
|-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.23937 | 0.5509 | 9 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.1285 | 0.6026 | 6 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ == Status == Current time: 2022-05-22 20:06:39 (running for 00:05:42.29) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: -1.2576976752519609 | Iter 4.000: -1.3172721726894379 | Iter 2.000: -1.5122670351028442 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | 
|-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.23937 | 0.5509 | 9 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.1285 | 0.6026 | 6 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ == Status == Current time: 2022-05-22 20:06:44 (running for 00:05:47.31) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: -1.2576976752519609 | Iter 4.000: -1.3172721726894379 | Iter 2.000: -1.5122670351028442 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | 
|-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.23937 | 0.5509 | 9 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.1285 | 0.6026 | 6 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=2587) [7, 2000] loss: 1.055 (func pid=1574) [10, 4000] loss: 0.581 == Status == Current time: 2022-05-22 20:06:49 (running for 00:05:52.34) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: -1.2576976752519609 | Iter 4.000: -1.3172721726894379 | Iter 2.000: -1.5122670351028442 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | 
accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.23937 | 0.5509 | 9 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.1285 | 0.6026 | 6 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ Result for train_cifar_e4fb9_00003: accuracy: 0.6128 date: 2022-05-22_20-06-53 done: false experiment_id: 88bc59797ca740278250b83ef8eb1a08 hostname: 5c55bb57cfdf iterations_since_restore: 7 loss: 1.1295763695716858 node_ip: 172.17.0.2 pid: 2587 should_checkpoint: true time_since_restore: 170.1988387107849 time_this_iter_s: 23.752949237823486 time_total_s: 170.1988387107849 timestamp: 1653250013 timesteps_since_restore: 0 training_iteration: 7 trial_id: e4fb9_00003 warmup_time: 0.004075050354003906 == Status == Current time: 2022-05-22 20:06:58 (running for 00:06:01.01) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=2 Bracket: Iter 8.000: -1.2576976752519609 | Iter 4.000: -1.3172721726894379 | Iter 2.000: -1.5122670351028442 | Iter 1.000: -1.8967468315035105 Resources 
requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (6 PENDING, 2 RUNNING, 2 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00000 | RUNNING | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.23937 | 0.5509 | 9 | | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.12958 | 0.6128 | 7 | | train_cifar_e4fb9_00004 | PENDING | | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ Result for train_cifar_e4fb9_00000: accuracy: 0.5774 date: 2022-05-22_20-06-59 done: true experiment_id: 76614011963f4fd2bcc9e5e590a2ec47 hostname: 5c55bb57cfdf iterations_since_restore: 10 loss: 1.1799667859077454 node_ip: 172.17.0.2 pid: 1574 should_checkpoint: true time_since_restore: 359.2820427417755 time_this_iter_s: 38.48532676696777 
time_total_s: 359.2820427417755 timestamp: 1653250019 timesteps_since_restore: 0 training_iteration: 10 trial_id: e4fb9_00000 warmup_time: 0.003105640411376953 (func pid=4258) Files already downloaded and verified == Status == Current time: 2022-05-22 20:07:05 (running for 00:06:07.69) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=3 Bracket: Iter 8.000: -1.2576976752519609 | Iter 4.000: -1.3172721726894379 | Iter 2.000: -1.5122670351028442 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (5 PENDING, 2 RUNNING, 3 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.12958 | 0.6128 | 7 | | train_cifar_e4fb9_00004 | RUNNING | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 
16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=4258) Files already downloaded and verified (func pid=2587) [8, 2000] loss: 1.013 == Status == Current time: 2022-05-22 20:07:10 (running for 00:06:12.72) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=3 Bracket: Iter 8.000: -1.2576976752519609 | Iter 4.000: -1.3172721726894379 | Iter 2.000: -1.5122670351028442 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (5 PENDING, 2 RUNNING, 3 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.12958 | 0.6128 | 7 | | train_cifar_e4fb9_00004 | RUNNING | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | 
train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ == Status == Current time: 2022-05-22 20:07:15 (running for 00:06:17.75) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=3 Bracket: Iter 8.000: -1.2576976752519609 | Iter 4.000: -1.3172721726894379 | Iter 2.000: -1.5122670351028442 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (5 PENDING, 2 RUNNING, 3 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.12958 | 0.6128 | 7 | | train_cifar_e4fb9_00004 | RUNNING | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | 
TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ Result for train_cifar_e4fb9_00003: accuracy: 0.6055 date: 2022-05-22_20-07-17 done: false experiment_id: 88bc59797ca740278250b83ef8eb1a08 hostname: 5c55bb57cfdf iterations_since_restore: 8 loss: 1.117674529981613 node_ip: 172.17.0.2 pid: 2587 should_checkpoint: true time_since_restore: 193.93619537353516 time_this_iter_s: 23.737356662750244 time_total_s: 193.93619537353516 timestamp: 1653250037 timesteps_since_restore: 0 training_iteration: 8 trial_id: e4fb9_00003 warmup_time: 0.004075050354003906 == Status == Current time: 2022-05-22 20:07:22 (running for 00:06:24.77) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=3 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.3172721726894379 | Iter 2.000: -1.5122670351028442 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (5 PENDING, 2 RUNNING, 3 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.11767 | 0.6055 | 8 | | train_cifar_e4fb9_00004 | RUNNING | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 
| | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=4258) [1, 2000] loss: 1.744 == Status == Current time: 2022-05-22 20:07:27 (running for 00:06:29.79) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=3 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.3172721726894379 | Iter 2.000: -1.5122670351028442 | Iter 1.000: -1.8967468315035105 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (5 PENDING, 2 RUNNING, 3 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.11767 | 0.6055 | 8 | | train_cifar_e4fb9_00004 | RUNNING | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | | | | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 
0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ Result for train_cifar_e4fb9_00004: accuracy: 0.4428 date: 2022-05-22_20-07-32 done: false experiment_id: 0af75391efac4a3e8c96b9916946f6fe hostname: 5c55bb57cfdf iterations_since_restore: 1 loss: 1.5042116988182068 node_ip: 172.17.0.2 pid: 4258 should_checkpoint: true time_since_restore: 28.74411392211914 time_this_iter_s: 28.74411392211914 time_total_s: 28.74411392211914 timestamp: 1653250052 timesteps_since_restore: 0 training_iteration: 1 trial_id: e4fb9_00004 warmup_time: 0.003892183303833008 (func pid=2587) [9, 2000] loss: 0.977 == Status == Current time: 2022-05-22 20:07:37 (running for 00:06:39.56) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=3 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.3172721726894379 | Iter 2.000: -1.5122670351028442 | Iter 1.000: -1.8282848861694336 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (5 PENDING, 2 RUNNING, 3 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | 
loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.11767 | 0.6055 | 8 | | train_cifar_e4fb9_00004 | RUNNING | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.50421 | 0.4428 | 1 | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ == Status == Current time: 2022-05-22 20:07:42 (running for 00:06:44.57) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=3 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.3172721726894379 | Iter 2.000: -1.5122670351028442 | Iter 1.000: -1.8282848861694336 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (5 PENDING, 2 RUNNING, 3 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | 
accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.11767 | 0.6055 | 8 | | train_cifar_e4fb9_00004 | RUNNING | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.50421 | 0.4428 | 1 | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ Result for train_cifar_e4fb9_00003: accuracy: 0.619 date: 2022-05-22_20-07-42 done: false experiment_id: 88bc59797ca740278250b83ef8eb1a08 hostname: 5c55bb57cfdf iterations_since_restore: 9 loss: 1.090410901594162 node_ip: 172.17.0.2 pid: 2587 should_checkpoint: true time_since_restore: 219.1736764907837 time_this_iter_s: 25.237481117248535 time_total_s: 219.1736764907837 timestamp: 1653250062 timesteps_since_restore: 0 training_iteration: 9 trial_id: e4fb9_00003 warmup_time: 0.004075050354003906 == Status == Current time: 2022-05-22 20:07:47 (running for 00:06:50.00) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=3 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.3172721726894379 | Iter 2.000: -1.5122670351028442 | Iter 
1.000: -1.8282848861694336 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (5 PENDING, 2 RUNNING, 3 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.09041 | 0.619 | 9 | | train_cifar_e4fb9_00004 | RUNNING | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.50421 | 0.4428 | 1 | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=4258) [2, 2000] loss: 1.428 == Status == Current time: 2022-05-22 20:07:52 (running for 00:06:55.02) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=3 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.3172721726894379 | Iter 2.000: 
-1.5122670351028442 | Iter 1.000: -1.8282848861694336 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (5 PENDING, 2 RUNNING, 3 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.09041 | 0.619 | 9 | | train_cifar_e4fb9_00004 | RUNNING | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.50421 | 0.4428 | 1 | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ == Status == Current time: 2022-05-22 20:07:57 (running for 00:07:00.06) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=3 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.3172721726894379 | Iter 2.000: 
-1.5122670351028442 | Iter 1.000: -1.8282848861694336 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (5 PENDING, 2 RUNNING, 3 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.09041 | 0.619 | 9 | | train_cifar_e4fb9_00004 | RUNNING | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.50421 | 0.4428 | 1 | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ Result for train_cifar_e4fb9_00004: accuracy: 0.5082 date: 2022-05-22_20-07-57 done: false experiment_id: 0af75391efac4a3e8c96b9916946f6fe hostname: 5c55bb57cfdf iterations_since_restore: 2 loss: 1.399407047843933 node_ip: 172.17.0.2 pid: 4258 
should_checkpoint: true time_since_restore: 54.50170087814331 time_this_iter_s: 25.75758695602417 time_total_s: 54.50170087814331 timestamp: 1653250077 timesteps_since_restore: 0 training_iteration: 2 trial_id: e4fb9_00004 warmup_time: 0.003892183303833008 (func pid=2587) [10, 2000] loss: 0.939 == Status == Current time: 2022-05-22 20:08:02 (running for 00:07:05.30) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=3 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.3172721726894379 | Iter 2.000: -1.399407047843933 | Iter 1.000: -1.8282848861694336 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (5 PENDING, 2 RUNNING, 3 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.09041 | 0.619 | 9 | | train_cifar_e4fb9_00004 | RUNNING | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.39941 | 0.5082 | 2 | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 
0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ == Status == Current time: 2022-05-22 20:08:07 (running for 00:07:10.32) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=3 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.3172721726894379 | Iter 2.000: -1.399407047843933 | Iter 1.000: -1.8282848861694336 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (5 PENDING, 2 RUNNING, 3 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00003 | RUNNING | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.09041 | 0.619 | 9 | | train_cifar_e4fb9_00004 | RUNNING | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.39941 | 0.5082 | 2 | | train_cifar_e4fb9_00005 | PENDING | | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 
1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ Result for train_cifar_e4fb9_00003: accuracy: 0.6237 date: 2022-05-22_20-08-07 done: true experiment_id: 88bc59797ca740278250b83ef8eb1a08 hostname: 5c55bb57cfdf iterations_since_restore: 10 loss: 1.090300385904312 node_ip: 172.17.0.2 pid: 2587 should_checkpoint: true time_since_restore: 244.6415901184082 time_this_iter_s: 25.46791362762451 time_total_s: 244.6415901184082 timestamp: 1653250087 timesteps_since_restore: 0 training_iteration: 10 trial_id: e4fb9_00003 warmup_time: 0.004075050354003906 (func pid=5167) Files already downloaded and verified == Status == Current time: 2022-05-22 20:08:13 (running for 00:07:15.72) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=4 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.3172721726894379 | Iter 2.000: -1.399407047843933 | Iter 1.000: -1.8282848861694336 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (4 PENDING, 2 RUNNING, 4 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00004 | RUNNING | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.39941 | 0.5082 | 2 | | train_cifar_e4fb9_00005 | RUNNING | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING 
| | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=5167) Files already downloaded and verified (func pid=4258) [3, 2000] loss: 1.345 == Status == Current time: 2022-05-22 20:08:18 (running for 00:07:20.74) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=4 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.3172721726894379 | Iter 2.000: -1.399407047843933 | Iter 1.000: -1.8282848861694336 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (4 PENDING, 2 RUNNING, 4 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00004 | RUNNING | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.39941 | 0.5082 | 2 | | train_cifar_e4fb9_00005 | 
RUNNING | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ Result for train_cifar_e4fb9_00004: accuracy: 0.5182 date: 2022-05-22_20-08-21 done: false experiment_id: 0af75391efac4a3e8c96b9916946f6fe hostname: 5c55bb57cfdf iterations_since_restore: 3 loss: 1.3889137803077698 node_ip: 172.17.0.2 pid: 4258 should_checkpoint: true time_since_restore: 78.26122426986694 time_this_iter_s: 23.759523391723633 time_total_s: 78.26122426986694 timestamp: 1653250101 timesteps_since_restore: 0 training_iteration: 3 trial_id: e4fb9_00004 warmup_time: 0.003892183303833008 (func pid=5167) [1, 2000] loss: 2.025 == Status == Current time: 2022-05-22 20:08:26 (running for 00:07:29.05) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=4 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.3172721726894379 | Iter 2.000: -1.399407047843933 | Iter 1.000: -1.8282848861694336 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (4 PENDING, 2 
RUNNING, 4 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00004 | RUNNING | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.38891 | 0.5182 | 3 | | train_cifar_e4fb9_00005 | RUNNING | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ == Status == Current time: 2022-05-22 20:08:31 (running for 00:07:34.08) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=4 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.3172721726894379 | Iter 2.000: -1.399407047843933 | Iter 1.000: -1.8282848861694336 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (4 
PENDING, 2 RUNNING, 4 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00004 | RUNNING | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.38891 | 0.5182 | 3 | | train_cifar_e4fb9_00005 | RUNNING | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=5167) [1, 4000] loss: 0.882 == Status == Current time: 2022-05-22 20:08:36 (running for 00:07:39.10) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=4 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.3172721726894379 | Iter 2.000: -1.399407047843933 | Iter 1.000: -1.8282848861694336 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: 
/var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (4 PENDING, 2 RUNNING, 4 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00004 | RUNNING | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.38891 | 0.5182 | 3 | | train_cifar_e4fb9_00005 | RUNNING | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=4258) [4, 2000] loss: 1.304 == Status == Current time: 2022-05-22 20:08:41 (running for 00:07:44.12) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=4 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.3172721726894379 | Iter 2.000: -1.399407047843933 | Iter 1.000: -1.8282848861694336 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects 
(0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (4 PENDING, 2 RUNNING, 4 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00004 | RUNNING | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.38891 | 0.5182 | 3 | | train_cifar_e4fb9_00005 | RUNNING | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | PENDING | | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ Result for train_cifar_e4fb9_00004: accuracy: 0.5029 date: 2022-05-22_20-08-44 done: true experiment_id: 0af75391efac4a3e8c96b9916946f6fe hostname: 5c55bb57cfdf iterations_since_restore: 4 loss: 1.4319864619255065 node_ip: 172.17.0.2 pid: 4258 should_checkpoint: true time_since_restore: 101.32692980766296 time_this_iter_s: 23.06570553779602 time_total_s: 
101.32692980766296 timestamp: 1653250124 timesteps_since_restore: 0 training_iteration: 4 trial_id: e4fb9_00004 warmup_time: 0.003892183303833008 (func pid=5167) [1, 6000] loss: 0.553 (func pid=5438) Files already downloaded and verified == Status == Current time: 2022-05-22 20:08:50 (running for 00:07:52.72) Memory usage on this node: 2.2/14.7 GiB Using AsyncHyperBand: num_stopped=5 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.399407047843933 | Iter 1.000: -1.8282848861694336 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (3 PENDING, 2 RUNNING, 5 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00005 | RUNNING | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | RUNNING | 172.17.0.2:5438 | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 
10 | | train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.43199 | 0.5029 | 4 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=5438) Files already downloaded and verified (func pid=5167) [1, 8000] loss: 0.411 == Status == Current time: 2022-05-22 20:08:55 (running for 00:07:57.76) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=5 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.399407047843933 | Iter 1.000: -1.8282848861694336 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (3 PENDING, 2 RUNNING, 5 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00005 | RUNNING | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | RUNNING | 172.17.0.2:5438 | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 
| 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | | train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.43199 | 0.5029 | 4 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ == Status == Current time: 2022-05-22 20:09:00 (running for 00:08:02.77) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=5 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.399407047843933 | Iter 1.000: -1.8282848861694336 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (3 PENDING, 2 RUNNING, 5 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00005 | RUNNING | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | RUNNING | 172.17.0.2:5438 | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 
16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | | train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.43199 | 0.5029 | 4 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=5438) [1, 2000] loss: 2.102 (func pid=5167) [1, 10000] loss: 0.318 == Status == Current time: 2022-05-22 20:09:05 (running for 00:08:07.79) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=5 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.399407047843933 | Iter 1.000: -1.8282848861694336 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (3 PENDING, 2 RUNNING, 5 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00005 | RUNNING | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | RUNNING | 172.17.0.2:5438 | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 
0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | | train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.43199 | 0.5029 | 4 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ == Status == Current time: 2022-05-22 20:09:10 (running for 00:08:12.81) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=5 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.399407047843933 | Iter 1.000: -1.8282848861694336 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (3 PENDING, 2 RUNNING, 5 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00005 | RUNNING | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | | | | | train_cifar_e4fb9_00006 | RUNNING | 172.17.0.2:5438 | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 
172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | | train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.43199 | 0.5029 | 4 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ Result for train_cifar_e4fb9_00005: accuracy: 0.4005 date: 2022-05-22_20-09-13 done: false experiment_id: 44215176aa334905baa0823c61df6279 hostname: 5c55bb57cfdf iterations_since_restore: 1 loss: 1.6506168123602867 node_ip: 172.17.0.2 pid: 5167 should_checkpoint: true time_since_restore: 62.163301944732666 time_this_iter_s: 62.163301944732666 time_total_s: 62.163301944732666 timestamp: 1653250153 timesteps_since_restore: 0 training_iteration: 1 trial_id: e4fb9_00005 warmup_time: 0.003893136978149414 (func pid=5438) [1, 4000] loss: 0.863 == Status == Current time: 2022-05-22 20:09:18 (running for 00:08:21.06) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=5 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.399407047843933 | Iter 1.000: -1.7394508492648602 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (3 PENDING, 2 RUNNING, 5 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | 
|-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00005 | RUNNING | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | 1.65062 | 0.4005 | 1 | | train_cifar_e4fb9_00006 | RUNNING | 172.17.0.2:5438 | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | | train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.43199 | 0.5029 | 4 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ == Status == Current time: 2022-05-22 20:09:23 (running for 00:08:26.07) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=5 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.399407047843933 | Iter 1.000: -1.7394508492648602 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (3 PENDING, 2 RUNNING, 5 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | 
lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00005 | RUNNING | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | 1.65062 | 0.4005 | 1 | | train_cifar_e4fb9_00006 | RUNNING | 172.17.0.2:5438 | 8 | 32 | 8 | 0.00231239 | | | | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | | train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.43199 | 0.5029 | 4 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=5167) [2, 2000] loss: 1.565 Result for train_cifar_e4fb9_00006: accuracy: 0.3886 date: 2022-05-22_20-09-25 done: false experiment_id: 2e26f94e77794edebe2390d2a1a9b3bf hostname: 5c55bb57cfdf iterations_since_restore: 1 loss: 1.6104391280174255 node_ip: 172.17.0.2 pid: 5438 should_checkpoint: true time_since_restore: 36.60224509239197 time_this_iter_s: 36.60224509239197 time_total_s: 36.60224509239197 timestamp: 1653250165 timesteps_since_restore: 0 training_iteration: 1 trial_id: e4fb9_00006 warmup_time: 0.003960609436035156 == Status == Current time: 2022-05-22 20:09:30 (running for 00:08:32.63) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=5 Bracket: 
Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.399407047843933 | Iter 1.000: -1.6506168123602867 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (3 PENDING, 2 RUNNING, 5 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00005 | RUNNING | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | 1.65062 | 0.4005 | 1 | | train_cifar_e4fb9_00006 | RUNNING | 172.17.0.2:5438 | 8 | 32 | 8 | 0.00231239 | 1.61044 | 0.3886 | 1 | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | | train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.43199 | 0.5029 | 4 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=5167) [2, 4000] loss: 0.778 == Status == Current time: 2022-05-22 20:09:35 (running for 
00:08:37.65) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=5 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.399407047843933 | Iter 1.000: -1.6506168123602867 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (3 PENDING, 2 RUNNING, 5 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00005 | RUNNING | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | 1.65062 | 0.4005 | 1 | | train_cifar_e4fb9_00006 | RUNNING | 172.17.0.2:5438 | 8 | 32 | 8 | 0.00231239 | 1.61044 | 0.3886 | 1 | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | | train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.43199 | 0.5029 | 4 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ 
(func pid=5438) [2, 2000] loss: 1.553 == Status == Current time: 2022-05-22 20:09:40 (running for 00:08:42.66) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=5 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.399407047843933 | Iter 1.000: -1.6506168123602867 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (3 PENDING, 2 RUNNING, 5 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00005 | RUNNING | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | 1.65062 | 0.4005 | 1 | | train_cifar_e4fb9_00006 | RUNNING | 172.17.0.2:5438 | 8 | 32 | 8 | 0.00231239 | 1.61044 | 0.3886 | 1 | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | | train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.43199 | 0.5029 | 4 | 
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=5167) [2, 6000] loss: 0.517 == Status == Current time: 2022-05-22 20:09:45 (running for 00:08:47.69) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=5 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.399407047843933 | Iter 1.000: -1.6506168123602867 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (3 PENDING, 2 RUNNING, 5 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00005 | RUNNING | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | 1.65062 | 0.4005 | 1 | | train_cifar_e4fb9_00006 | RUNNING | 172.17.0.2:5438 | 8 | 32 | 8 | 0.00231239 | 1.61044 | 0.3886 | 1 | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | | 
train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.43199 | 0.5029 | 4 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=5438) [2, 4000] loss: 0.742 == Status == Current time: 2022-05-22 20:09:50 (running for 00:08:52.70) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=5 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.399407047843933 | Iter 1.000: -1.6506168123602867 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (3 PENDING, 2 RUNNING, 5 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00005 | RUNNING | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | 1.65062 | 0.4005 | 1 | | train_cifar_e4fb9_00006 | RUNNING | 172.17.0.2:5438 | 8 | 32 | 8 | 0.00231239 | 1.61044 | 0.3886 | 1 | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | 
train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | | train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.43199 | 0.5029 | 4 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=5167) [2, 8000] loss: 0.387 == Status == Current time: 2022-05-22 20:09:55 (running for 00:08:57.73) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=5 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.399407047843933 | Iter 1.000: -1.6506168123602867 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (3 PENDING, 2 RUNNING, 5 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00005 | RUNNING | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | 1.65062 | 0.4005 | 1 | | train_cifar_e4fb9_00006 | RUNNING | 172.17.0.2:5438 | 8 | 32 | 8 | 0.00231239 | 1.61044 | 0.3886 | 1 | | train_cifar_e4fb9_00007 | PENDING | | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | 
train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | | train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.43199 | 0.5029 | 4 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ Result for train_cifar_e4fb9_00006: accuracy: 0.4666 date: 2022-05-22_20-09-58 done: true experiment_id: 2e26f94e77794edebe2390d2a1a9b3bf hostname: 5c55bb57cfdf iterations_since_restore: 2 loss: 1.4636820405244828 node_ip: 172.17.0.2 pid: 5438 should_checkpoint: true time_since_restore: 70.33554148674011 time_this_iter_s: 33.733296394348145 time_total_s: 70.33554148674011 timestamp: 1653250198 timesteps_since_restore: 0 training_iteration: 2 trial_id: e4fb9_00006 warmup_time: 0.003960609436035156 (func pid=5167) [2, 10000] loss: 0.310 (func pid=5838) Files already downloaded and verified == Status == Current time: 2022-05-22 20:10:04 (running for 00:09:06.77) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=6 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.431544544184208 | Iter 1.000: -1.6506168123602867 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (2 PENDING, 2 RUNNING, 6 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | 
|-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00005 | RUNNING | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | 1.65062 | 0.4005 | 1 | | train_cifar_e4fb9_00007 | RUNNING | 172.17.0.2:5838 | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | | train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.43199 | 0.5029 | 4 | | train_cifar_e4fb9_00006 | TERMINATED | 172.17.0.2:5438 | 8 | 32 | 8 | 0.00231239 | 1.46368 | 0.4666 | 2 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=5838) Files already downloaded and verified == Status == Current time: 2022-05-22 20:10:09 (running for 00:09:11.79) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=6 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.431544544184208 | Iter 1.000: -1.6506168123602867 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (2 PENDING, 2 RUNNING, 6 TERMINATED) 
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00005 | RUNNING | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | 1.65062 | 0.4005 | 1 | | train_cifar_e4fb9_00007 | RUNNING | 172.17.0.2:5838 | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | PENDING | | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | | train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.43199 | 0.5029 | 4 | | train_cifar_e4fb9_00006 | TERMINATED | 172.17.0.2:5438 | 8 | 32 | 8 | 0.00231239 | 1.46368 | 0.4666 | 2 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ Result for train_cifar_e4fb9_00005: accuracy: 0.449 date: 2022-05-22_20-10-12 done: true experiment_id: 44215176aa334905baa0823c61df6279 hostname: 5c55bb57cfdf iterations_since_restore: 2 loss: 1.5401566292405129 node_ip: 172.17.0.2 pid: 5167 should_checkpoint: true time_since_restore: 120.6176917552948 time_this_iter_s: 58.454389810562134 time_total_s: 120.6176917552948 timestamp: 1653250212 timesteps_since_restore: 0 training_iteration: 2 trial_id: 
e4fb9_00005 warmup_time: 0.003893136978149414 (func pid=5989) Files already downloaded and verified == Status == Current time: 2022-05-22 20:10:17 (running for 00:09:19.77) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=7 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.4636820405244828 | Iter 1.000: -1.6506168123602867 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (1 PENDING, 2 RUNNING, 7 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00007 | RUNNING | 172.17.0.2:5838 | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | RUNNING | 172.17.0.2:5989 | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | | train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.43199 | 0.5029 | 4 | | train_cifar_e4fb9_00005 | TERMINATED | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | 1.54016 | 0.449 | 2 | | train_cifar_e4fb9_00006 | TERMINATED | 172.17.0.2:5438 | 
8 | 32 | 8 | 0.00231239 | 1.46368 | 0.4666 | 2 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=5838) [1, 2000] loss: 1.974 (func pid=5989) Files already downloaded and verified == Status == Current time: 2022-05-22 20:10:22 (running for 00:09:24.79) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=7 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.4636820405244828 | Iter 1.000: -1.6506168123602867 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (1 PENDING, 2 RUNNING, 7 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00007 | RUNNING | 172.17.0.2:5838 | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | RUNNING | 172.17.0.2:5989 | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | | train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 
1.43199 | 0.5029 | 4 | | train_cifar_e4fb9_00005 | TERMINATED | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | 1.54016 | 0.449 | 2 | | train_cifar_e4fb9_00006 | TERMINATED | 172.17.0.2:5438 | 8 | 32 | 8 | 0.00231239 | 1.46368 | 0.4666 | 2 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=5989) [1, 2000] loss: 2.294 == Status == Current time: 2022-05-22 20:10:27 (running for 00:09:29.81) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=7 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.4636820405244828 | Iter 1.000: -1.6506168123602867 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (1 PENDING, 2 RUNNING, 7 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00007 | RUNNING | 172.17.0.2:5838 | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | RUNNING | 172.17.0.2:5989 | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 
| 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | | train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.43199 | 0.5029 | 4 | | train_cifar_e4fb9_00005 | TERMINATED | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | 1.54016 | 0.449 | 2 | | train_cifar_e4fb9_00006 | TERMINATED | 172.17.0.2:5438 | 8 | 32 | 8 | 0.00231239 | 1.46368 | 0.4666 | 2 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=5838) [1, 4000] loss: 0.937 == Status == Current time: 2022-05-22 20:10:32 (running for 00:09:34.83) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=7 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.4636820405244828 | Iter 1.000: -1.6506168123602867 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (1 PENDING, 2 RUNNING, 7 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00007 | RUNNING | 172.17.0.2:5838 | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | RUNNING | 172.17.0.2:5989 | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | 
TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | | train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.43199 | 0.5029 | 4 | | train_cifar_e4fb9_00005 | TERMINATED | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | 1.54016 | 0.449 | 2 | | train_cifar_e4fb9_00006 | TERMINATED | 172.17.0.2:5438 | 8 | 32 | 8 | 0.00231239 | 1.46368 | 0.4666 | 2 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=5989) [1, 4000] loss: 1.061 == Status == Current time: 2022-05-22 20:10:37 (running for 00:09:39.84) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=7 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.4636820405244828 | Iter 1.000: -1.6506168123602867 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (1 PENDING, 2 RUNNING, 7 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00007 | RUNNING | 172.17.0.2:5838 | 8 | 256 | 256 | 0.0173587 | | | | | train_cifar_e4fb9_00008 | RUNNING | 172.17.0.2:5989 | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | PENDING | | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | 
train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | | train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.43199 | 0.5029 | 4 | | train_cifar_e4fb9_00005 | TERMINATED | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | 1.54016 | 0.449 | 2 | | train_cifar_e4fb9_00006 | TERMINATED | 172.17.0.2:5438 | 8 | 32 | 8 | 0.00231239 | 1.46368 | 0.4666 | 2 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ Result for train_cifar_e4fb9_00007: accuracy: 0.2854 date: 2022-05-22_20-10-41 done: true experiment_id: 8bdb07339a0f4f17b200087e8c387635 hostname: 5c55bb57cfdf iterations_since_restore: 1 loss: 1.9402952749729157 node_ip: 172.17.0.2 pid: 5838 should_checkpoint: true time_since_restore: 38.77672457695007 time_this_iter_s: 38.77672457695007 time_total_s: 38.77672457695007 timestamp: 1653250241 timesteps_since_restore: 0 training_iteration: 1 trial_id: e4fb9_00007 warmup_time: 0.003671884536743164 (func pid=5989) [1, 6000] loss: 0.673 (func pid=6149) Files already downloaded and verified == Status == Current time: 2022-05-22 20:10:47 (running for 00:09:49.76) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=8 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.4636820405244828 | Iter 1.000: -1.7394508492648602 Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (2 RUNNING, 8 TERMINATED) 
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00008 | RUNNING | 172.17.0.2:5989 | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | RUNNING | 172.17.0.2:6149 | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | | train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.43199 | 0.5029 | 4 | | train_cifar_e4fb9_00005 | TERMINATED | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | 1.54016 | 0.449 | 2 | | train_cifar_e4fb9_00006 | TERMINATED | 172.17.0.2:5438 | 8 | 32 | 8 | 0.00231239 | 1.46368 | 0.4666 | 2 | | train_cifar_e4fb9_00007 | TERMINATED | 172.17.0.2:5838 | 8 | 256 | 256 | 0.0173587 | 1.9403 | 0.2854 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=6149) Files already downloaded and verified (func pid=5989) [1, 8000] loss: 0.487 == Status == Current time: 2022-05-22 20:10:52 (running for 00:09:54.79) Memory usage on this node: 2.3/14.7 GiB Using AsyncHyperBand: num_stopped=8 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.4636820405244828 | Iter 1.000: -1.7394508492648602 Resources requested: 
4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (2 RUNNING, 8 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00008 | RUNNING | 172.17.0.2:5989 | 2 | 4 | 4 | 0.00107032 | | | | | train_cifar_e4fb9_00009 | RUNNING | 172.17.0.2:6149 | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | | train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.43199 | 0.5029 | 4 | | train_cifar_e4fb9_00005 | TERMINATED | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | 1.54016 | 0.449 | 2 | | train_cifar_e4fb9_00006 | TERMINATED | 172.17.0.2:5438 | 8 | 32 | 8 | 0.00231239 | 1.46368 | 0.4666 | 2 | | train_cifar_e4fb9_00007 | TERMINATED | 172.17.0.2:5838 | 8 | 256 | 256 | 0.0173587 | 1.9403 | 0.2854 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=6149) [1, 2000] loss: 2.303 == Status == Current time: 2022-05-22 20:10:57 (running for 00:09:59.80) Memory usage on this node: 2.4/14.7 GiB Using AsyncHyperBand: num_stopped=8 
Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.4636820405244828 | Iter 1.000: -1.7394508492648602
Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4)
Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57
Number of trials: 10/10 (2 RUNNING, 8 TERMINATED)
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
| Trial name              | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
|-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
| train_cifar_e4fb9_00008 | RUNNING    | 172.17.0.2:5989 |            2 |    4 |    4 | 0.00107032  |         |            |                      |
| train_cifar_e4fb9_00009 | RUNNING    | 172.17.0.2:6149 |            2 |  256 |   32 | 0.000121737 |         |            |                      |
| train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 |            8 |    8 |  256 | 0.00027798  | 1.17997 |     0.5774 |                   10 |
| train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 |            4 |   32 |   16 | 0.00483071  | 1.96521 |     0.3407 |                    1 |
| train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 |            2 |    8 |   16 | 0.00929677  | 2.31271 |     0.1    |                    1 |
| train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 |           16 |   64 |   64 | 0.00198823  | 1.0903  |     0.6237 |                   10 |
| train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 |           16 |  128 |  256 | 0.0122764   | 1.43199 |     0.5029 |                    4 |
| train_cifar_e4fb9_00005 | TERMINATED | 172.17.0.2:5167 |            4 |   32 |  128 | 0.00374699  | 1.54016 |     0.449  |                    2 |
| train_cifar_e4fb9_00006 | TERMINATED | 172.17.0.2:5438 |            8 |   32 |    8 | 0.00231239  | 1.46368 |     0.4666 |                    2 |
| train_cifar_e4fb9_00007 | TERMINATED | 172.17.0.2:5838 |            8 |  256 |  256 | 0.0173587   | 1.9403  |     0.2854 |                    1 |
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
(func pid=5989) [1, 10000] loss: 0.381
(func pid=6149) [1, 4000] loss: 1.151
(func pid=5989) [1, 12000] loss: 0.310
(func pid=6149) [1, 6000] loss: 0.763
(func pid=5989) [1, 14000] loss: 0.264
(func pid=6149) [1, 8000] loss: 0.567
(func pid=5989) [1, 16000] loss: 0.226
(func pid=5989) [1, 18000] loss: 0.201
(func pid=6149) [1, 10000] loss: 0.439
(func pid=5989) [1, 20000] loss: 0.178
(func pid=6149) [1, 12000] loss: 0.351
(func pid=6149) [1, 14000] loss: 0.290
Result for train_cifar_e4fb9_00008:
  accuracy: 0.3279
  date: 2022-05-22_20-11-56
  done: false
  experiment_id: 05d5818bd61349988f1db0835becd9a1
  hostname: 5c55bb57cfdf
  iterations_since_restore: 1
  loss: 1.7209034914076329
  node_ip: 172.17.0.2
  pid: 5989
  should_checkpoint: true
  time_since_restore: 100.75639367103577
  time_this_iter_s: 100.75639367103577
  time_total_s: 100.75639367103577
  timestamp: 1653250316
  timesteps_since_restore: 0
  training_iteration: 1
  trial_id: e4fb9_00008
  warmup_time: 0.0032830238342285156
== Status ==
Current time: 2022-05-22 20:12:01 (running for 00:11:03.55)
Memory usage on this node: 2.4/14.7 GiB
Using AsyncHyperBand: num_stopped=8
Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.4636820405244828 | Iter 1.000: -1.7209034914076329
Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4)
Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57
Number of trials: 10/10 (2 RUNNING, 8 TERMINATED)
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
| Trial name              | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
|-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
| train_cifar_e4fb9_00008 | RUNNING    | 172.17.0.2:5989 |            2 |    4 |    4 | 0.00107032  | 1.7209  |     0.3279 |                    1 |
| train_cifar_e4fb9_00009 | RUNNING    | 172.17.0.2:6149 |            2 |  256 |   32 | 0.000121737 |         |            |                      |
| train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 |            8 |    8 |  256 | 0.00027798  | 1.17997 |     0.5774 |                   10 |
| train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 |            4 |   32 |   16 | 0.00483071  | 1.96521 |     0.3407 |                    1 |
| train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 |            2 |    8 |   16 | 0.00929677  | 2.31271 |     0.1    |                    1 |
| train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 |           16 |   64 |   64 | 0.00198823  | 1.0903  |     0.6237 |                   10 |
| train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 |           16 |  128 |  256 | 0.0122764   | 1.43199 |     0.5029 |                    4 |
| train_cifar_e4fb9_00005 | TERMINATED | 172.17.0.2:5167 |            4 |   32 |  128 | 0.00374699  | 1.54016 |     0.449  |                    2 |
| train_cifar_e4fb9_00006 | TERMINATED | 172.17.0.2:5438 |            8 |   32 |    8 | 0.00231239  | 1.46368 |     0.4666 |                    2 |
| train_cifar_e4fb9_00007 | TERMINATED | 172.17.0.2:5838 |            8 |  256 |  256 | 0.0173587   | 1.9403  |     0.2854 |                    1 |
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
(func pid=6149) [1, 16000] loss: 0.249
(func pid=5989) [2, 2000] loss: 1.778
(func pid=6149) [1, 18000] loss: 0.214
(func pid=5989) [2, 4000] loss: 0.869
(func pid=6149) [1, 20000] loss: 0.190
(func pid=5989) [2, 6000] loss: 0.575
(func pid=5989) [2, 8000] loss: 0.426
== Status ==
Current time: 2022-05-22 20:12:31 (running for 00:11:33.65)
Memory usage on this node: 2.3/14.7 GiB
Using AsyncHyperBand: num_stopped=8
Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.4636820405244828 | Iter 1.000: -1.7209034914076329
Resources requested: 4.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap,
0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (2 RUNNING, 8 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00008 | RUNNING | 172.17.0.2:5989 | 2 | 4 | 4 | 0.00107032 | 1.7209 | 0.3279 | 1 | | train_cifar_e4fb9_00009 | RUNNING | 172.17.0.2:6149 | 2 | 256 | 32 | 0.000121737 | | | | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | | train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.43199 | 0.5029 | 4 | | train_cifar_e4fb9_00005 | TERMINATED | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | 1.54016 | 0.449 | 2 | | train_cifar_e4fb9_00006 | TERMINATED | 172.17.0.2:5438 | 8 | 32 | 8 | 0.00231239 | 1.46368 | 0.4666 | 2 | | train_cifar_e4fb9_00007 | TERMINATED | 172.17.0.2:5838 | 8 | 256 | 256 | 0.0173587 | 1.9403 | 0.2854 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ Result for train_cifar_e4fb9_00009: accuracy: 0.3276 date: 2022-05-22_20-12-35 done: true experiment_id: 5981902f3ecd4485b1ec1ee63db7d61f hostname: 5c55bb57cfdf iterations_since_restore: 1 loss: 1.8553714662730694 
node_ip: 172.17.0.2 pid: 6149 should_checkpoint: true time_since_restore: 109.93525862693787 time_this_iter_s: 109.93525862693787 time_total_s: 109.93525862693787 timestamp: 1653250355 timesteps_since_restore: 0 training_iteration: 1 trial_id: e4fb9_00009 warmup_time: 0.0035321712493896484 (func pid=5989) [2, 10000] loss: 0.336 == Status == Current time: 2022-05-22 20:12:40 (running for 00:11:42.79) Memory usage on this node: 1.8/14.7 GiB Using AsyncHyperBand: num_stopped=9 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.4636820405244828 | Iter 1.000: -1.7745941887885333 Resources requested: 2.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (1 RUNNING, 9 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00008 | RUNNING | 172.17.0.2:5989 | 2 | 4 | 4 | 0.00107032 | 1.7209 | 0.3279 | 1 | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | | train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.43199 | 0.5029 | 4 | | train_cifar_e4fb9_00005 | TERMINATED | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | 1.54016 | 
0.449 | 2 | | train_cifar_e4fb9_00006 | TERMINATED | 172.17.0.2:5438 | 8 | 32 | 8 | 0.00231239 | 1.46368 | 0.4666 | 2 | | train_cifar_e4fb9_00007 | TERMINATED | 172.17.0.2:5838 | 8 | 256 | 256 | 0.0173587 | 1.9403 | 0.2854 | 1 | | train_cifar_e4fb9_00009 | TERMINATED | 172.17.0.2:6149 | 2 | 256 | 32 | 0.000121737 | 1.85537 | 0.3276 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=5989) [2, 12000] loss: 0.277 == Status == Current time: 2022-05-22 20:12:45 (running for 00:11:47.80) Memory usage on this node: 1.8/14.7 GiB Using AsyncHyperBand: num_stopped=9 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.4636820405244828 | Iter 1.000: -1.7745941887885333 Resources requested: 2.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (1 RUNNING, 9 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00008 | RUNNING | 172.17.0.2:5989 | 2 | 4 | 4 | 0.00107032 | 1.7209 | 0.3279 | 1 | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | | 
train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.43199 | 0.5029 | 4 | | train_cifar_e4fb9_00005 | TERMINATED | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | 1.54016 | 0.449 | 2 | | train_cifar_e4fb9_00006 | TERMINATED | 172.17.0.2:5438 | 8 | 32 | 8 | 0.00231239 | 1.46368 | 0.4666 | 2 | | train_cifar_e4fb9_00007 | TERMINATED | 172.17.0.2:5838 | 8 | 256 | 256 | 0.0173587 | 1.9403 | 0.2854 | 1 | | train_cifar_e4fb9_00009 | TERMINATED | 172.17.0.2:6149 | 2 | 256 | 32 | 0.000121737 | 1.85537 | 0.3276 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ == Status == Current time: 2022-05-22 20:12:50 (running for 00:11:52.81) Memory usage on this node: 1.8/14.7 GiB Using AsyncHyperBand: num_stopped=9 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.4636820405244828 | Iter 1.000: -1.7745941887885333 Resources requested: 2.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (1 RUNNING, 9 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00008 | RUNNING | 172.17.0.2:5989 | 2 | 4 | 4 | 0.00107032 | 1.7209 | 0.3279 | 1 | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 
2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | | train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.43199 | 0.5029 | 4 | | train_cifar_e4fb9_00005 | TERMINATED | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | 1.54016 | 0.449 | 2 | | train_cifar_e4fb9_00006 | TERMINATED | 172.17.0.2:5438 | 8 | 32 | 8 | 0.00231239 | 1.46368 | 0.4666 | 2 | | train_cifar_e4fb9_00007 | TERMINATED | 172.17.0.2:5838 | 8 | 256 | 256 | 0.0173587 | 1.9403 | 0.2854 | 1 | | train_cifar_e4fb9_00009 | TERMINATED | 172.17.0.2:6149 | 2 | 256 | 32 | 0.000121737 | 1.85537 | 0.3276 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=5989) [2, 14000] loss: 0.235 == Status == Current time: 2022-05-22 20:12:55 (running for 00:11:57.82) Memory usage on this node: 1.8/14.7 GiB Using AsyncHyperBand: num_stopped=9 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.4636820405244828 | Iter 1.000: -1.7745941887885333 Resources requested: 2.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (1 RUNNING, 9 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00008 | RUNNING | 172.17.0.2:5989 | 2 | 4 | 4 | 0.00107032 | 1.7209 | 0.3279 | 1 | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 
0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | | train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.43199 | 0.5029 | 4 | | train_cifar_e4fb9_00005 | TERMINATED | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | 1.54016 | 0.449 | 2 | | train_cifar_e4fb9_00006 | TERMINATED | 172.17.0.2:5438 | 8 | 32 | 8 | 0.00231239 | 1.46368 | 0.4666 | 2 | | train_cifar_e4fb9_00007 | TERMINATED | 172.17.0.2:5838 | 8 | 256 | 256 | 0.0173587 | 1.9403 | 0.2854 | 1 | | train_cifar_e4fb9_00009 | TERMINATED | 172.17.0.2:6149 | 2 | 256 | 32 | 0.000121737 | 1.85537 | 0.3276 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=5989) [2, 16000] loss: 0.205 == Status == Current time: 2022-05-22 20:13:00 (running for 00:12:02.84) Memory usage on this node: 1.8/14.7 GiB Using AsyncHyperBand: num_stopped=9 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.4636820405244828 | Iter 1.000: -1.7745941887885333 Resources requested: 2.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (1 RUNNING, 9 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | 
|-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00008 | RUNNING | 172.17.0.2:5989 | 2 | 4 | 4 | 0.00107032 | 1.7209 | 0.3279 | 1 | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | | train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.43199 | 0.5029 | 4 | | train_cifar_e4fb9_00005 | TERMINATED | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | 1.54016 | 0.449 | 2 | | train_cifar_e4fb9_00006 | TERMINATED | 172.17.0.2:5438 | 8 | 32 | 8 | 0.00231239 | 1.46368 | 0.4666 | 2 | | train_cifar_e4fb9_00007 | TERMINATED | 172.17.0.2:5838 | 8 | 256 | 256 | 0.0173587 | 1.9403 | 0.2854 | 1 | | train_cifar_e4fb9_00009 | TERMINATED | 172.17.0.2:6149 | 2 | 256 | 32 | 0.000121737 | 1.85537 | 0.3276 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=5989) [2, 18000] loss: 0.183 == Status == Current time: 2022-05-22 20:13:05 (running for 00:12:07.85) Memory usage on this node: 1.8/14.7 GiB Using AsyncHyperBand: num_stopped=9 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.4636820405244828 | Iter 1.000: -1.7745941887885333 Resources requested: 2.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (1 RUNNING, 9 TERMINATED) 
+-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00008 | RUNNING | 172.17.0.2:5989 | 2 | 4 | 4 | 0.00107032 | 1.7209 | 0.3279 | 1 | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | | train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.43199 | 0.5029 | 4 | | train_cifar_e4fb9_00005 | TERMINATED | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | 1.54016 | 0.449 | 2 | | train_cifar_e4fb9_00006 | TERMINATED | 172.17.0.2:5438 | 8 | 32 | 8 | 0.00231239 | 1.46368 | 0.4666 | 2 | | train_cifar_e4fb9_00007 | TERMINATED | 172.17.0.2:5838 | 8 | 256 | 256 | 0.0173587 | 1.9403 | 0.2854 | 1 | | train_cifar_e4fb9_00009 | TERMINATED | 172.17.0.2:6149 | 2 | 256 | 32 | 0.000121737 | 1.85537 | 0.3276 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ (func pid=5989) [2, 20000] loss: 0.165 == Status == Current time: 2022-05-22 20:13:10 (running for 00:12:12.86) Memory usage on this node: 1.8/14.7 GiB Using AsyncHyperBand: num_stopped=9 Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.4636820405244828 | Iter 1.000: -1.7745941887885333 Resources requested: 2.0/4 CPUs, 0/1 GPUs, 
0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (1 RUNNING, 9 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00008 | RUNNING | 172.17.0.2:5989 | 2 | 4 | 4 | 0.00107032 | 1.7209 | 0.3279 | 1 | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | | train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.43199 | 0.5029 | 4 | | train_cifar_e4fb9_00005 | TERMINATED | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | 1.54016 | 0.449 | 2 | | train_cifar_e4fb9_00006 | TERMINATED | 172.17.0.2:5438 | 8 | 32 | 8 | 0.00231239 | 1.46368 | 0.4666 | 2 | | train_cifar_e4fb9_00007 | TERMINATED | 172.17.0.2:5838 | 8 | 256 | 256 | 0.0173587 | 1.9403 | 0.2854 | 1 | | train_cifar_e4fb9_00009 | TERMINATED | 172.17.0.2:6149 | 2 | 256 | 32 | 0.000121737 | 1.85537 | 0.3276 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ == Status == Current time: 2022-05-22 20:13:15 (running for 00:12:17.87) Memory usage on this node: 1.8/14.7 GiB Using AsyncHyperBand: num_stopped=9 Bracket: Iter 8.000: 
-1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.4636820405244828 | Iter 1.000: -1.7745941887885333 Resources requested: 2.0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4) Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57 Number of trials: 10/10 (1 RUNNING, 9 TERMINATED) +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ | Trial name | status | loc | batch_size | l1 | l2 | lr | loss | accuracy | training_iteration | |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------| | train_cifar_e4fb9_00008 | RUNNING | 172.17.0.2:5989 | 2 | 4 | 4 | 0.00107032 | 1.7209 | 0.3279 | 1 | | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 | 8 | 8 | 256 | 0.00027798 | 1.17997 | 0.5774 | 10 | | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 | 4 | 32 | 16 | 0.00483071 | 1.96521 | 0.3407 | 1 | | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 | 2 | 8 | 16 | 0.00929677 | 2.31271 | 0.1 | 1 | | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 | 16 | 64 | 64 | 0.00198823 | 1.0903 | 0.6237 | 10 | | train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 | 16 | 128 | 256 | 0.0122764 | 1.43199 | 0.5029 | 4 | | train_cifar_e4fb9_00005 | TERMINATED | 172.17.0.2:5167 | 4 | 32 | 128 | 0.00374699 | 1.54016 | 0.449 | 2 | | train_cifar_e4fb9_00006 | TERMINATED | 172.17.0.2:5438 | 8 | 32 | 8 | 0.00231239 | 1.46368 | 0.4666 | 2 | | train_cifar_e4fb9_00007 | TERMINATED | 172.17.0.2:5838 | 8 | 256 | 256 | 0.0173587 | 1.9403 | 0.2854 | 1 | | train_cifar_e4fb9_00009 | TERMINATED | 172.17.0.2:6149 | 2 | 256 | 32 | 0.000121737 | 1.85537 | 0.3276 | 1 | +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+ Result for 
    train_cifar_e4fb9_00008:
      accuracy: 0.3399
      date: 2022-05-22_20-13-19
      done: true
      experiment_id: 05d5818bd61349988f1db0835becd9a1
      hostname: 5c55bb57cfdf
      iterations_since_restore: 2
      loss: 1.6608823190510273
      node_ip: 172.17.0.2
      pid: 5989
      should_checkpoint: true
      time_since_restore: 184.00248169898987
      time_this_iter_s: 83.2460880279541
      time_total_s: 184.00248169898987
      timestamp: 1653250399
      timesteps_since_restore: 0
      training_iteration: 2
      trial_id: e4fb9_00008
      warmup_time: 0.0032830238342285156

    == Status ==
    Current time: 2022-05-22 20:13:19 (running for 00:12:21.80)
    Memory usage on this node: 1.5/14.7 GiB
    Using AsyncHyperBand: num_stopped=10
    Bracket: Iter 8.000: -1.1876861026167869 | Iter 4.000: -1.4141437401771546 | Iter 2.000: -1.5019193348824977 | Iter 1.000: -1.7745941887885333
    Resources requested: 0/4 CPUs, 0/1 GPUs, 0.0/8.33 GiB heap, 0.0/4.17 GiB objects (0.0/1.0 accelerator_type:P4)
    Result logdir: /var/lib/jenkins/ray_results/train_cifar_2022-05-22_20-00-57
    Number of trials: 10/10 (10 TERMINATED)
    +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name              | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | train_cifar_e4fb9_00000 | TERMINATED | 172.17.0.2:1574 |            8 |    8 |  256 | 0.00027798  | 1.17997 |     0.5774 |                   10 |
    | train_cifar_e4fb9_00001 | TERMINATED | 172.17.0.2:1608 |            4 |   32 |   16 | 0.00483071  | 1.96521 |     0.3407 |                    1 |
    | train_cifar_e4fb9_00002 | TERMINATED | 172.17.0.2:1951 |            2 |    8 |   16 | 0.00929677  | 2.31271 |     0.1    |                    1 |
    | train_cifar_e4fb9_00003 | TERMINATED | 172.17.0.2:2587 |           16 |   64 |   64 | 0.00198823  | 1.0903  |     0.6237 |                   10 |
    | train_cifar_e4fb9_00004 | TERMINATED | 172.17.0.2:4258 |           16 |  128 |  256 | 0.0122764   | 1.43199 |     0.5029 |                    4 |
    | train_cifar_e4fb9_00005 | TERMINATED | 172.17.0.2:5167 |            4 |   32 |  128 | 0.00374699  | 1.54016 |     0.449  |                    2 |
    | train_cifar_e4fb9_00006 | TERMINATED | 172.17.0.2:5438 |            8 |   32 |    8 | 0.00231239  | 1.46368 |     0.4666 |                    2 |
    | train_cifar_e4fb9_00007 | TERMINATED | 172.17.0.2:5838 |            8 |  256 |  256 | 0.0173587   | 1.9403  |     0.2854 |                    1 |
    | train_cifar_e4fb9_00008 | TERMINATED | 172.17.0.2:5989 |            2 |    4 |    4 | 0.00107032  | 1.66088 |     0.3399 |                    2 |
    | train_cifar_e4fb9_00009 | TERMINATED | 172.17.0.2:6149 |            2 |  256 |   32 | 0.000121737 | 1.85537 |     0.3276 |                    1 |
    +-------------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+

    Best trial config: {'l1': 64, 'l2': 64, 'lr': 0.001988228642901915, 'batch_size': 16}
    Best trial final validation loss: 1.090300385904312
    Best trial final validation accuracy: 0.6237
    Files already downloaded and verified
    Files already downloaded and verified
    Best trial test set accuracy: 0.6189

If you run the code, an example output could look like this:

::

    Number of trials: 10 (10 TERMINATED)
    +-----+------+------+-------------+--------------+---------+------------+--------------------+
    | ... |   l1 |   l2 |          lr |   batch_size |    loss |   accuracy | training_iteration |
    |-----+------+------+-------------+--------------+---------+------------+--------------------|
    | ... |   64 |    4 | 0.00011629  |            2 | 1.87273 |     0.244  |                  2 |
    | ... |   32 |   64 | 0.000339763 |            8 | 1.23603 |     0.567  |                  8 |
    | ... |    8 |   16 | 0.00276249  |           16 | 1.1815  |     0.5836 |                 10 |
    | ... |    4 |   64 | 0.000648721 |            4 | 1.31131 |     0.5224 |                  8 |
    | ... |   32 |   16 | 0.000340753 |            8 | 1.26454 |     0.5444 |                  8 |
    | ... |    8 |    4 | 0.000699775 |            8 | 1.99594 |     0.1983 |                  2 |
    | ... |  256 |    8 | 0.0839654   |           16 | 2.3119  |     0.0993 |                  1 |
    | ... |   16 |  128 | 0.0758154   |           16 | 2.33575 |     0.1327 |                  1 |
    | ... |   16 |    8 | 0.0763312   |           16 | 2.31129 |     0.1042 |                  4 |
    | ... |  128 |   16 | 0.000124903 |            4 | 2.26917 |     0.1945 |                  1 |
    +-----+------+------+-------------+--------------+---------+------------+--------------------+

    Best trial config: {'l1': 8, 'l2': 16, 'lr': 0.00276249, 'batch_size': 16, 'data_dir': '...'}
    Best trial final validation loss: 1.181501
    Best trial final validation accuracy: 0.5836
    Best trial test set accuracy: 0.5806

Most trials were stopped early to avoid wasting resources. In this example
output, the best-performing trial achieved a validation accuracy of about 58%,
which was confirmed on the test set.

So that's it! You can now tune the parameters of your PyTorch models.


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 12 minutes 48.008 seconds)


.. _sphx_glr_download_beginner_hyperparameter_tuning_tutorial.py:


.. only :: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example


  .. container:: sphx-glr-download

     :download:`Download Python source code: hyperparameter_tuning_tutorial.py `


  .. container:: sphx-glr-download

     :download:`Download Jupyter notebook: hyperparameter_tuning_tutorial.ipynb `


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_
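The "best trial" reported in the example output above is the trial with the lowest final validation loss. The following is a minimal sketch of that selection in plain Python, assuming the rows of the example table have been copied by hand into dictionaries. The ``trials`` list and the ``best_trial`` helper are illustrative names, not part of Ray Tune; in a real run you would query the object returned by Ray Tune instead.

```python
# Rows copied by hand from the example output table above (illustrative only;
# a real run would read these values from Ray Tune's results).
trials = [
    {"l1": 64, "l2": 4, "lr": 0.00011629, "batch_size": 2, "loss": 1.87273, "accuracy": 0.244},
    {"l1": 32, "l2": 64, "lr": 0.000339763, "batch_size": 8, "loss": 1.23603, "accuracy": 0.567},
    {"l1": 8, "l2": 16, "lr": 0.00276249, "batch_size": 16, "loss": 1.1815, "accuracy": 0.5836},
    {"l1": 4, "l2": 64, "lr": 0.000648721, "batch_size": 4, "loss": 1.31131, "accuracy": 0.5224},
    {"l1": 32, "l2": 16, "lr": 0.000340753, "batch_size": 8, "loss": 1.26454, "accuracy": 0.5444},
    {"l1": 8, "l2": 4, "lr": 0.000699775, "batch_size": 8, "loss": 1.99594, "accuracy": 0.1983},
    {"l1": 256, "l2": 8, "lr": 0.0839654, "batch_size": 16, "loss": 2.3119, "accuracy": 0.0993},
    {"l1": 16, "l2": 128, "lr": 0.0758154, "batch_size": 16, "loss": 2.33575, "accuracy": 0.1327},
    {"l1": 16, "l2": 8, "lr": 0.0763312, "batch_size": 16, "loss": 2.31129, "accuracy": 0.1042},
    {"l1": 128, "l2": 16, "lr": 0.000124903, "batch_size": 4, "loss": 2.26917, "accuracy": 0.1945},
]


def best_trial(trials, metric="loss", mode="min"):
    """Return the trial dict with the best final value of `metric`.

    Hypothetical helper: `mode` is "min" for metrics such as loss and
    "max" for metrics such as accuracy.
    """
    if mode not in ("min", "max"):
        raise ValueError("mode must be 'min' or 'max'")
    pick = min if mode == "min" else max
    return pick(trials, key=lambda trial: trial[metric])


best = best_trial(trials, metric="loss", mode="min")
hyperparameters = {k: best[k] for k in ("l1", "l2", "lr", "batch_size")}
print("Best trial config:", hyperparameters)
print("Best trial final validation loss:", best["loss"])
print("Best trial final validation accuracy:", best["accuracy"])
```

Selecting by lowest loss and selecting by highest accuracy happen to agree for this table, but they need not agree in general, so it is worth deciding up front which metric defines "best".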