How to do parallel processing in PyTorch


I am working on a deep learning problem and solving it with PyTorch. I have two GPUs on the same machine (16273 MiB and 12193 MiB), and I want to use both of them for training on a video dataset.



I get a warning:



There is an imbalance between your GPUs. You may want to exclude GPU 1 which
has less than 75% of the memory or cores of GPU 0. You can do so by setting
the device_ids argument to DataParallel, or by setting the CUDA_VISIBLE_DEVICES
environment variable.
warnings.warn(imbalance_warn.format(device_ids[min_pos], device_ids[max_pos]))



I also get an error:



raise TypeError('Broadcast function not implemented for CPU tensors')
TypeError: Broadcast function not implemented for CPU tensors



if __name__ == '__main__':

    opt.scales = [opt.initial_scale]
    for i in range(1, opt.n_scales):
        opt.scales.append(opt.scales[-1] * opt.scale_step)
    opt.arch = '{}-{}'.format(opt.model, opt.model_depth)
    opt.mean = get_mean(opt.norm_value)
    opt.std = get_std(opt.norm_value)
    print("opt", opt)
    with open(os.path.join(opt.result_path, 'opts.json'), 'w') as opt_file:
        json.dump(vars(opt), opt_file)

    torch.manual_seed(opt.manual_seed)

    model, parameters = generate_model(opt)
    #print(model)

    pytorch_total_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
    print("Total number of trainable parameters: ", pytorch_total_params)

    # Define Class weights
    if opt.weighted:
        print("Weighted Loss is created")
        if opt.n_finetune_classes == 2:
            weight = torch.tensor([1.0, 3.0])
        else:
            weight = torch.ones(opt.n_finetune_classes)
    else:
        weight = None

    criterion = nn.CrossEntropyLoss()
    if not opt.no_cuda:
        criterion = nn.DataParallel(criterion.cuda())

    if opt.no_mean_norm and not opt.std_norm:
        norm_method = Normalize([0, 0, 0], [1, 1, 1])
    elif not opt.std_norm:
        norm_method = Normalize(opt.mean, [1, 1, 1])
    else:
        norm_method = Normalize(opt.mean, opt.std)

    train_loader = torch.utils.data.DataLoader(
        training_data,
        batch_size=opt.batch_size,
        shuffle=True,
        num_workers=opt.n_threads,
        pin_memory=True)
    train_logger = Logger(
        os.path.join(opt.result_path, 'train.log'),
        ['epoch', 'loss', 'acc', 'precision', 'recall', 'lr'])
    train_batch_logger = Logger(
        os.path.join(opt.result_path, 'train_batch.log'),
        ['epoch', 'batch', 'iter', 'loss', 'acc', 'precision', 'recall', 'lr'])

    if opt.nesterov:
        dampening = 0
    else:
        dampening = opt.dampening
    optimizer = optim.SGD(
        parameters,
        lr=opt.learning_rate,
        momentum=opt.momentum,
        dampening=dampening,
        weight_decay=opt.weight_decay,
        nesterov=opt.nesterov)
    # scheduler = lr_scheduler.ReduceLROnPlateau(
    #     optimizer, 'min', patience=opt.lr_patience)
    if not opt.no_val:
        spatial_transform = Compose([
            Scale(opt.sample_size),
            CenterCrop(opt.sample_size),
            ToTensor(opt.norm_value), norm_method
        ])

    print('run')
    for i in range(opt.begin_epoch, opt.n_epochs + 1):
        if not opt.no_train:
            adjust_learning_rate(optimizer, i, opt.lr_steps)
            train_epoch(i, train_loader, model, criterion, optimizer, opt,
                        train_logger, train_batch_logger)




I have also made changes in my train file:



model = nn.DataParallel(model(), device_ids=[0, 1]).cuda()
outputs = model(inputs)

It does not seem to work properly and gives an error. Please advise; I am new to PyTorch.



Thanks










parallel-processing pytorch torch gpu-programming torchvision

asked Mar 25 at 17:58
user10050371






















1 Answer














As mentioned in the GitHub issue linked below, you have to call model.cuda() before passing the model to nn.DataParallel:

net = nn.DataParallel(model.cuda(), device_ids=[0,1])

https://github.com/pytorch/pytorch/issues/17065
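
To make the ordering concrete, here is a minimal sketch of moving a model to the GPU before wrapping it in nn.DataParallel. The helper name wrap_for_multi_gpu and the small toy_model are hypothetical stand-ins, not part of the question's code (there the model comes from generate_model(opt)). The sketch falls back to the CPU when CUDA is not available, and it leaves the loss module unwrapped; that last choice is an assumption of this example rather than something the linked issue prescribes.

```python
import torch
import torch.nn as nn


def wrap_for_multi_gpu(model: nn.Module) -> nn.Module:
    """Move `model` to the GPU first, then wrap it in nn.DataParallel.

    Hypothetical helper: returns the plain CPU model when CUDA is unavailable.
    """
    if not torch.cuda.is_available():
        return model  # nothing to parallelise on a CPU-only machine

    device_ids = list(range(torch.cuda.device_count()))
    # Parameters and buffers should live on device_ids[0] before wrapping.
    model = model.cuda(device_ids[0])
    if len(device_ids) > 1:
        model = nn.DataParallel(model, device_ids=device_ids)
    return model


# Toy network standing in for the model returned by generate_model(opt).
toy_model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
net = wrap_for_multi_gpu(toy_model)

# The loss has no parameters to replicate, so it is used directly here
# (an assumption of this sketch).
criterion = nn.CrossEntropyLoss()

# Put the batch on the same device as the model's parameters.
device = next(net.parameters()).device
inputs = torch.randn(4, 8, device=device)
targets = torch.randint(0, 2, (4,), device=device)

outputs = net(inputs)               # DataParallel gathers outputs on device_ids[0]
loss = criterion(outputs, targets)
loss.backward()
print(outputs.shape, loss.item())
```

If the imbalance warning is a concern, passing a shorter device_ids list or setting CUDA_VISIBLE_DEVICES, as the warning text itself suggests, restricts training to the chosen GPU.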
















answered Mar 27 at 9:57
Manoj Mohan

















