How to do parallel processing in PyTorch

I am working on a deep learning problem in PyTorch. I have two GPUs on the same machine (16273 MiB and 12193 MiB), and I want to use both of them to train on a video dataset.

I get a warning:

There is an imbalance between your GPUs. You may want to exclude GPU 1 which
has less than 75% of the memory or cores of GPU 0. You can do so by setting
the device_ids argument to DataParallel, or by setting the CUDA_VISIBLE_DEVICES
environment variable.
warnings.warn(imbalance_warn.format(device_ids[min_pos], device_ids[max_pos]))
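
For reference, the two remedies named in the warning could look roughly like the sketch below. The helper `balanced_device_ids` is a hypothetical name, not part of PyTorch; it simply applies the 75% memory threshold quoted in the warning to the two memory sizes given above. With 16273 MiB and 12193 MiB the ratio is just under 0.75, which would explain why GPU 1 is flagged. The warning by itself does not stop training.

```python
import os


def balanced_device_ids(memory_mib, threshold=0.75):
    """Return indices of GPUs whose memory is at least `threshold` times the largest.

    Hypothetical helper for illustration; it only mirrors the 75% rule
    mentioned in the warning text.
    """
    largest = max(memory_mib)
    return [i for i, mem in enumerate(memory_mib) if mem / largest >= threshold]


# Memory sizes reported in the question (MiB).
device_ids = balanced_device_ids([16273, 12193])

# Option 1: hide the other GPUs from the process. This has to happen
# before CUDA is initialised, so it is usually set before importing torch.
os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in device_ids)

# Option 2 (sketch, not executed here): pass the list explicitly.
# model = torch.nn.DataParallel(model.cuda(), device_ids=device_ids)
```

Use one option or the other: once `CUDA_VISIBLE_DEVICES` hides a GPU, the remaining devices are renumbered from 0 inside the process.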

I also get an error:

raise TypeError('Broadcast function not implemented for CPU tensors')
TypeError: Broadcast function not implemented for CPU tensors

    if __name__ == '__main__':

        opt.scales = [opt.initial_scale]
        for i in range(1, opt.n_scales):
            opt.scales.append(opt.scales[-1] * opt.scale_step)
        opt.arch = '{}-{}'.format(opt.model, opt.model_depth)
        opt.mean = get_mean(opt.norm_value)
        opt.std = get_std(opt.norm_value)
        print("opt", opt)
        with open(os.path.join(opt.result_path, 'opts.json'), 'w') as opt_file:
            json.dump(vars(opt), opt_file)

        torch.manual_seed(opt.manual_seed)

        model, parameters = generate_model(opt)
        #print(model)

        pytorch_total_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
        print("Total number of trainable parameters: ", pytorch_total_params)

        # Define Class weights
        if opt.weighted:
            print("Weighted Loss is created")
            if opt.n_finetune_classes == 2:
                weight = torch.tensor([1.0, 3.0])
            else:
                weight = torch.ones(opt.n_finetune_classes)
        else:
            weight = None

        criterion = nn.CrossEntropyLoss()
        if not opt.no_cuda:
            criterion = nn.DataParallel(criterion.cuda())

        if opt.no_mean_norm and not opt.std_norm:
            norm_method = Normalize([0, 0, 0], [1, 1, 1])
        elif not opt.std_norm:
            norm_method = Normalize(opt.mean, [1, 1, 1])
        else:
            norm_method = Normalize(opt.mean, opt.std)

        train_loader = torch.utils.data.DataLoader(
            training_data,
            batch_size=opt.batch_size,
            shuffle=True,
            num_workers=opt.n_threads,
            pin_memory=True)
        train_logger = Logger(
            os.path.join(opt.result_path, 'train.log'),
            ['epoch', 'loss', 'acc', 'precision', 'recall', 'lr'])
        train_batch_logger = Logger(
            os.path.join(opt.result_path, 'train_batch.log'),
            ['epoch', 'batch', 'iter', 'loss', 'acc', 'precision', 'recall', 'lr'])

        if opt.nesterov:
            dampening = 0
        else:
            dampening = opt.dampening
        optimizer = optim.SGD(
            parameters,
            lr=opt.learning_rate,
            momentum=opt.momentum,
            dampening=dampening,
            weight_decay=opt.weight_decay,
            nesterov=opt.nesterov)
        # scheduler = lr_scheduler.ReduceLROnPlateau(
        #     optimizer, 'min', patience=opt.lr_patience)
        if not opt.no_val:
            spatial_transform = Compose([
                Scale(opt.sample_size),
                CenterCrop(opt.sample_size),
                ToTensor(opt.norm_value), norm_method
            ])

        print('run')
        for i in range(opt.begin_epoch, opt.n_epochs + 1):
            if not opt.no_train:
                adjust_learning_rate(optimizer, i, opt.lr_steps)
                train_epoch(i, train_loader, model, criterion, optimizer, opt,
                            train_logger, train_batch_logger)

I have also changed my training file:

    model = nn.DataParallel(model(), device_ids=[0, 1]).cuda()
    outputs = model(inputs)

It does not work properly and gives an error. Please advise; I am new to PyTorch.

Thanks

asked Mar 25 at 17:58

user10050371

175 bronze badges

parallel-processing pytorch torch gpu-programming torchvision

1 Answer

As mentioned in the GitHub issue linked below, you have to call model.cuda() before passing the model to nn.DataParallel:

    net = nn.DataParallel(model.cuda(), device_ids=[0, 1])

https://github.com/pytorch/pytorch/issues/17065
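
A minimal sketch of that ordering is shown below. It assumes a small stand-in `nn.Sequential` model in place of the one returned by `generate_model(opt)`, and a dummy batch instead of the video data; both are placeholders for illustration. It falls back to the CPU when no GPU is present, so the wrapping step only runs on a multi-GPU machine.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the model returned by generate_model(opt).
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
criterion = nn.CrossEntropyLoss()  # the loss is left unwrapped in this sketch

use_cuda = torch.cuda.is_available()
device = torch.device("cuda:0" if use_cuda else "cpu")

# Move the parameters to the GPU first, then wrap the model.
model = model.to(device)
if use_cuda and torch.cuda.device_count() > 1:
    model = nn.DataParallel(model, device_ids=list(range(torch.cuda.device_count())))

# Dummy batch: 4 samples, 8 features, 2 classes.
inputs = torch.randn(4, 8, device=device)
targets = torch.randint(0, 2, (4,), device=device)

outputs = model(inputs)  # DataParallel splits the batch along dim 0
loss = criterion(outputs, targets)
loss.backward()
```

The inputs and targets are placed on the same device as the wrapped model's first GPU, and the outputs are gathered there before the loss is computed.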

answered Mar 27 at 9:57

Manoj Mohan

2,091 · 5 silver badges · 12 bronze badges
