
Why is the PyTorch implementation so inefficient?


I have implemented a CNN architecture from a paper in both Keras and PyTorch, but the Keras implementation is much more efficient: it takes about 4 GB of GPU memory for training with 50,000 training samples and 10,000 validation samples, while the PyTorch one takes all 12 GB of the GPU and I can't even use a validation set!
The optimizer for both of them is SGD with momentum, with the same settings for both.
More info about the paper: [architecture]: https://github.com/Moeinh77/Lightweight-Deep-Convolutional-Network-for-Tiny-Object-Recognition/edit/master/train.py



PyTorch code:



import torch
import torch.nn.functional as F

class SimpleCNN(torch.nn.Module):

    def __init__(self):
        super(SimpleCNN, self).__init__()

        self.conv2d_11 = torch.nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1)
        self.conv2d_12 = torch.nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1)

        self.conv2d_21 = torch.nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1)
        self.conv2d_22 = torch.nn.Conv2d(128, 128, kernel_size=3, stride=1, padding=1)

        self.conv2d_31 = torch.nn.Conv2d(128, 256, kernel_size=3, stride=1, padding=1)
        self.conv2d_32 = torch.nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1)
        self.conv2d_33 = torch.nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1)

        self.conv2d_41 = torch.nn.Conv2d(256, 512, kernel_size=3, stride=1, padding=1)
        self.conv2d_42 = torch.nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1)

        self.conv2d_51 = torch.nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1)

        self.Batchnorm_1 = torch.nn.BatchNorm2d(64)
        self.Batchnorm_2 = torch.nn.BatchNorm2d(128)
        self.Batchnorm_3 = torch.nn.BatchNorm2d(256)
        self.Batchnorm_4 = torch.nn.BatchNorm2d(512)

        self.dropout2d_1 = torch.nn.Dropout2d(p=0.3)
        self.dropout2d_2 = torch.nn.Dropout2d(p=0.4)
        self.dropout2d_3 = torch.nn.Dropout2d(p=0.5)

        self.dropout1d = torch.nn.Dropout(p=0.5)

        self.maxpool2d = torch.nn.MaxPool2d(kernel_size=2, stride=2, padding=0)

        self.avgpool2d = torch.nn.AvgPool2d(kernel_size=2, stride=2, padding=0)

        self.fc = torch.nn.Linear(512, 10)

    def forward(self, x):

        ############################# Phase 1
        x = F.relu(self.conv2d_11(x))
        x = self.dropout2d_1(x)  # rate = 0.3
        x = self.Batchnorm_1(x)  # 64 channels

        x = F.relu(self.conv2d_12(x))
        x = self.dropout2d_1(x)  # rate = 0.3
        x = self.Batchnorm_1(x)  # 64 channels

        x = self.maxpool2d(x)

        ############################# Phase 2
        x = F.relu(self.conv2d_21(x))
        x = self.dropout2d_1(x)  # rate = 0.3
        x = self.Batchnorm_2(x)  # 128 channels

        x = F.relu(self.conv2d_22(x))
        x = self.dropout2d_1(x)  # rate = 0.3
        x = self.Batchnorm_2(x)  # 128 channels

        x = self.maxpool2d(x)

        ############################# Phase 3
        x = F.relu(self.conv2d_31(x))
        x = self.dropout2d_2(x)  # rate = 0.4
        x = self.Batchnorm_3(x)  # 256 channels

        x = F.relu(self.conv2d_32(x))
        x = self.dropout2d_2(x)  # rate = 0.4
        x = self.Batchnorm_3(x)  # 256 channels

        x = F.relu(self.conv2d_33(x))
        x = self.dropout2d_2(x)  # rate = 0.4
        x = self.Batchnorm_3(x)  # 256 channels

        x = self.maxpool2d(x)

        ############################# Phase 4
        x = F.relu(self.conv2d_41(x))
        x = self.dropout2d_2(x)
        x = self.Batchnorm_4(x)

        x = F.relu(self.conv2d_42(x))
        x = self.dropout2d_2(x)
        x = self.Batchnorm_4(x)

        x = self.maxpool2d(x)

        ############################# Phase 5
        x = F.relu(self.conv2d_51(x))
        x = self.dropout2d_3(x)
        x = self.Batchnorm_4(x)

        x = self.avgpool2d(x)
        x = x.view(x.size(0), -1)
        x = self.dropout1d(x)
        x = F.relu(self.fc(x))
        x = self.dropout1d(x)
        x = F.softmax(x, dim=1)
        ###############################

        return x


import time
from torch.autograd import Variable
from torch.optim.lr_scheduler import ReduceLROnPlateau

def trainNet(model, batch_size, n_epochs, learning_rate):

    lr = learning_rate

    # Print all of the hyperparameters of the training iteration:
    print("======= HYPERPARAMETERS =======")
    print("Batch size=", batch_size)
    print("Epochs=", n_epochs)
    print("Base learning_rate=", learning_rate)
    print("=" * 30)

    # Get training data
    n_batches = len(train_loader)

    # Time for printing
    training_start_time = time.time()

    # Loss function
    loss = torch.nn.CrossEntropyLoss()
    optimizer = createOptimizer(model, lr)

    scheduler = ReduceLROnPlateau(optimizer, 'min',
                                  patience=3, factor=0.9817,
                                  verbose=True)

    # Loop over n_epochs
    for epoch in range(n_epochs):

        # Save the weights every 10 epochs
        if epoch % 10 == 0:
            torch.save(model.state_dict(), 'model.ckpt')

        running_loss = 0.0
        print_every = n_batches // 10
        start_time = time.time()
        total_train_loss = 0
        total_train_acc = 0
        epoch_time = 0

        for i, data in enumerate(train_loader, 0):

            # free up the cuda memory
            inputs = None
            labels = None

            inputs, labels = data

            inputs, labels = Variable(inputs.to(device)), Variable(labels.to(device))

            optimizer.zero_grad()

            outputs = model(inputs)

            score, predictions = torch.max(outputs.data, 1)
            acc = (labels == predictions).sum()
            total_train_acc += acc

            loss_size = loss(outputs, labels)
            loss_size.backward()
            optimizer.step()

            running_loss += loss_size.item()
            total_train_loss += loss_size.item()

            # Print every 10th batch of an epoch
            if (i + 1) % (print_every + 1) == 0:
                print("Epoch {}, {:d}% \t | train_loss: {:.3f} | train_acc: {}% | took: {:.2f}s".format(
                    epoch + 1, int(100 * (i + 1) / n_batches), running_loss / print_every,
                    int(acc), time.time() - start_time))

                epoch_time += (time.time() - start_time)

                # Reset running loss and time
                running_loss = 0.0
                start_time = time.time()

        scheduler.step(total_train_loss)
        torch.cuda.empty_cache()

        # At the end of the epoch, do a pass on the validation set
        total_val_loss = 0

        for inputs, labels in val_loader:

            # Wrap tensors in Variables
            inputs, labels = Variable(inputs.to(device)), Variable(labels.to(device))

            # Forward pass
            val_outputs = model(inputs)
            val_loss_size = loss(val_outputs, labels)
            total_val_loss += val_loss_size.item()

        print("-" * 30)
        print("Train loss = {:.2f} | Train acc = {:.1f}% | Val loss = {:.2f} | took: {:.2f}s".format(
            total_train_loss / len(train_loader), total_train_acc / len(train_loader),
            total_val_loss / len(val_loader), epoch_time))
        print("=" * 60)

    print("Training finished, took {:.2f}s".format(time.time() - training_start_time))

CNN = SimpleCNN().to(device)
CNN.eval()

trainNet(CNN, batch_size=64, n_epochs=250, learning_rate=0.1)



Keras:



from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten,Activation
from tensorflow.keras.layers import Conv2D, MaxPool2D,BatchNormalization,GlobalAveragePooling2D

model = Sequential()
#####################################################
# Phase 1
model.add(Conv2D(64,(3,3),input_shape=(32,32,3),padding='same'))
model.add(Activation('relu'))
model.add(Dropout(rate=0.3))
model.add(BatchNormalization())

#(32,32,3)

model.add(Conv2D(64,(3,3),padding='same'))
model.add(Activation('relu'))
model.add(Dropout(rate=0.3))
model.add(BatchNormalization())
#(32,32,3)


model.add(MaxPool2D((2,2)))
#(16,16,3)

#####################################################
#Phase 2
model.add(Conv2D(128, (3,3),padding='same'))
model.add(Activation('relu'))
model.add(Dropout(rate=0.3))
model.add(BatchNormalization())
#(16,16,3)

model.add(Conv2D(128, (3,3),padding='same'))
model.add(Activation('relu'))
model.add(Dropout(rate=0.3))
model.add(BatchNormalization())
#(16,16,3)

model.add(MaxPool2D((2,2),padding='same'))
#(8,8,3)

#####################################################
#Phase 3
model.add(Conv2D(256, (3,3),padding='same'))
model.add(Activation('relu'))
model.add(Dropout(rate=0.4))
model.add(BatchNormalization())
#(8,8,3)


model.add(Conv2D(256, (3,3),padding='same'))
model.add(Activation('relu'))
model.add(Dropout(rate=0.4))
model.add(BatchNormalization())
#(8,8,3)

model.add(Conv2D(256, (3,3),padding='same'))
model.add(Activation('relu'))
model.add(Dropout(rate=0.4))
model.add(BatchNormalization())
#(8,8,3)

model.add(MaxPool2D((2,2)))
#(4,4,3)

#####################################################
#Phase 4
model.add(Conv2D(512, (3,3),padding='same'))
model.add(Activation('relu'))
model.add(Dropout(rate=0.4))
model.add(BatchNormalization())
#(4,4,3)

model.add(Conv2D(512, (3,3),padding='same'))
model.add(Activation('relu'))
model.add(Dropout(rate=0.4))
model.add(BatchNormalization())
#(4,4,3)

model.add(MaxPool2D((2,2)))
#(2,2,3)

#####################################################
#Phase 5
model.add(Conv2D(512, (3,3),padding='same'))
model.add(Activation('relu'))
model.add(Dropout(rate=0.5))
model.add(BatchNormalization())
#(2,2,3)

model.add(GlobalAveragePooling2D(data_format='channels_last'))
model.add(Flatten())
model.add(Dropout(rate=0.5))

model.add(Dense(10,activation='relu'))
model.add(Dropout(rate=0.5))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer=sgd_optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
history = model.fit(x=x_train, y=y_train, batch_size=64,
                    epochs=250, verbose=1, callbacks=[checkpoint],
                    validation_data=(x_test, y_test))
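
Note that sgd_optimizer and checkpoint are used above but never defined in the post; the question only says the optimizer is SGD with momentum with the same settings as the PyTorch run. A minimal sketch of what those definitions might look like (the values and file name are assumptions, not taken from the original post):

    # Hypothetical definitions -- not from the original post.
    from tensorflow.keras.optimizers import SGD
    from tensorflow.keras.callbacks import ModelCheckpoint

    sgd_optimizer = SGD(lr=0.1, momentum=0.9)        # assumed: same base lr as the PyTorch run
    checkpoint = ModelCheckpoint('keras_model.h5',   # assumed file name
                                 monitor='val_loss',
                                 save_best_only=True)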









      keras deep-learning pytorch






asked Mar 26 at 17:11 by Moeinh77

          1 Answer
Edit: on a closer look, acc doesn't seem to require gradient, so this paragraph probably doesn't apply.
It looks like the most significant issue is that total_train_acc accumulates history across the training loop (see https://pytorch.org/docs/stable/notes/faq.html for details).
Changing total_train_acc += acc to total_train_acc += acc.item() should fix this.
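
Applied to the loop in the question, that change would look like this (a sketch, not the answerer's exact code):

    score, predictions = torch.max(outputs.data, 1)
    acc = (labels == predictions).sum()
    total_train_acc += acc.item()  # .item() converts the one-element tensor to a plain Python number,
                                   # so no tensors are kept alive across iterations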



Another thing: you should use with torch.no_grad() for the validation loop.
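
For the validation pass in trainNet, that would look roughly like this sketch (assuming the same loss, model, device and val_loader as in the question):

    with torch.no_grad():  # no autograd graph is built, so activations are not kept for backward
        for inputs, labels in val_loader:
            inputs, labels = inputs.to(device), labels.to(device)
            val_outputs = model(inputs)
            val_loss_size = loss(val_outputs, labels)
            total_val_loss += val_loss_size.item()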



Not really about speed, but model.train() and model.eval() should be used for training and evaluation respectively, so that the batchnorm and dropout layers work in the correct mode.
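
A sketch of where those calls could go inside trainNet (the exact placement is assumed, not spelled out in the answer):

    for epoch in range(n_epochs):
        model.train()              # dropout active, batchnorm uses per-batch statistics
        for i, data in enumerate(train_loader, 0):
            ...                    # training step as in the question

        model.eval()               # dropout off, batchnorm uses running statistics
        with torch.no_grad():
            for inputs, labels in val_loader:
                ...                # validation step as in the question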






edited Mar 27 at 4:17
answered Mar 27 at 4:07 by Sergey Dymchenko
