Why is the PyTorch implementation so inefficient?
I have implemented a CNN architecture from a paper in both Keras and PyTorch, but the Keras implementation is much more efficient: it takes about 4 GB of GPU memory for training with 50,000 training samples and 10,000 validation samples, while the PyTorch one uses all 12 GB of the GPU and I can't even use a validation set!
The optimizer for both is SGD with momentum, with the same settings in both frameworks.
More info about the architecture: https://github.com/Moeinh77/Lightweight-Deep-Convolutional-Network-for-Tiny-Object-Recognition/edit/master/train.py
PyTorch code:
import torch
import torch.nn.functional as F

class SimpleCNN(torch.nn.Module):
    def __init__(self):
        super(SimpleCNN, self).__init__()
        self.conv2d_11 = torch.nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1)
        self.conv2d_12 = torch.nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1)
        self.conv2d_21 = torch.nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1)
        self.conv2d_22 = torch.nn.Conv2d(128, 128, kernel_size=3, stride=1, padding=1)
        self.conv2d_31 = torch.nn.Conv2d(128, 256, kernel_size=3, stride=1, padding=1)
        self.conv2d_32 = torch.nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1)
        self.conv2d_33 = torch.nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1)
        self.conv2d_41 = torch.nn.Conv2d(256, 512, kernel_size=3, stride=1, padding=1)
        self.conv2d_42 = torch.nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1)
        self.conv2d_51 = torch.nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1)

        self.Batchnorm_1 = torch.nn.BatchNorm2d(64)
        self.Batchnorm_2 = torch.nn.BatchNorm2d(128)
        self.Batchnorm_3 = torch.nn.BatchNorm2d(256)
        self.Batchnorm_4 = torch.nn.BatchNorm2d(512)

        self.dropout2d_1 = torch.nn.Dropout2d(p=0.3)
        self.dropout2d_2 = torch.nn.Dropout2d(p=0.4)
        self.dropout2d_3 = torch.nn.Dropout2d(p=0.5)
        self.dropout1d = torch.nn.Dropout(p=0.5)

        self.maxpool2d = torch.nn.MaxPool2d(kernel_size=2, stride=2, padding=0)
        self.avgpool2d = torch.nn.AvgPool2d(kernel_size=2, stride=2, padding=0)
        self.fc = torch.nn.Linear(512, 10)

    def forward(self, x):
        ############################# Phase 1
        x = F.relu(self.conv2d_11(x))
        x = self.dropout2d_1(x)   # rate = 0.3
        x = self.Batchnorm_1(x)   # 64 channels
        x = F.relu(self.conv2d_12(x))
        x = self.dropout2d_1(x)
        x = self.Batchnorm_1(x)
        x = self.maxpool2d(x)
        ############################# Phase 2
        x = F.relu(self.conv2d_21(x))
        x = self.dropout2d_1(x)   # rate = 0.3
        x = self.Batchnorm_2(x)   # 128 channels
        x = F.relu(self.conv2d_22(x))
        x = self.dropout2d_1(x)
        x = self.Batchnorm_2(x)
        x = self.maxpool2d(x)
        ############################# Phase 3
        x = F.relu(self.conv2d_31(x))
        x = self.dropout2d_2(x)   # rate = 0.4
        x = self.Batchnorm_3(x)   # 256 channels
        x = F.relu(self.conv2d_32(x))
        x = self.dropout2d_2(x)
        x = self.Batchnorm_3(x)
        x = F.relu(self.conv2d_33(x))
        x = self.dropout2d_2(x)
        x = self.Batchnorm_3(x)
        x = self.maxpool2d(x)
        ############################# Phase 4
        x = F.relu(self.conv2d_41(x))
        x = self.dropout2d_2(x)   # rate = 0.4
        x = self.Batchnorm_4(x)   # 512 channels
        x = F.relu(self.conv2d_42(x))
        x = self.dropout2d_2(x)
        x = self.Batchnorm_4(x)
        x = self.maxpool2d(x)
        ############################# Phase 5
        x = F.relu(self.conv2d_51(x))
        x = self.dropout2d_3(x)   # rate = 0.5
        x = self.Batchnorm_4(x)
        x = self.avgpool2d(x)
        x = x.view(x.size(0), -1)
        x = self.dropout1d(x)
        x = F.relu(self.fc(x))
        x = self.dropout1d(x)
        x = F.softmax(x, dim=1)
        ###############################
        return x
import time
from torch.autograd import Variable
from torch.optim.lr_scheduler import ReduceLROnPlateau

# train_loader, val_loader, device and createOptimizer are defined elsewhere (not shown in the question)

def trainNet(model, batch_size, n_epochs, learning_rate):
    lr = learning_rate

    # Print all of the hyperparameters of the training run
    print("======= HYPERPARAMETERS =======")
    print("Batch size =", batch_size)
    print("Epochs =", n_epochs)
    print("Base learning_rate =", learning_rate)
    print("=" * 30)

    n_batches = len(train_loader)
    training_start_time = time.time()

    # Loss function, optimizer and LR scheduler
    loss = torch.nn.CrossEntropyLoss()
    optimizer = createOptimizer(model, lr)
    scheduler = ReduceLROnPlateau(optimizer, 'min', patience=3, factor=0.9817, verbose=True)

    for epoch in range(n_epochs):
        # save the weights every 10 epochs
        if epoch % 10 == 0:
            torch.save(model.state_dict(), 'model.ckpt')

        running_loss = 0.0
        print_every = n_batches // 10
        start_time = time.time()
        total_train_loss = 0
        total_train_acc = 0
        epoch_time = 0

        for i, data in enumerate(train_loader, 0):
            # free up the cuda memory
            inputs = None
            labels = None
            inputs, labels = data
            inputs, labels = Variable(inputs.to(device)), Variable(labels.to(device))

            optimizer.zero_grad()
            outputs = model(inputs)

            score, predictions = torch.max(outputs.data, 1)
            acc = (labels == predictions).sum()
            total_train_acc += acc

            loss_size = loss(outputs, labels)
            loss_size.backward()
            optimizer.step()

            running_loss += loss_size.item()
            total_train_loss += loss_size.item()

            # Print every 10th batch of an epoch
            if (i + 1) % (print_every + 1) == 0:
                print("Epoch {}, {:d}% \t | train_loss: {:.3f} | train_acc: {}% | took: {:.2f}s".format(
                    epoch + 1, int(100 * (i + 1) / n_batches),
                    running_loss / print_every, int(acc), time.time() - start_time))
                epoch_time += (time.time() - start_time)
                # Reset running loss and time
                running_loss = 0.0
                start_time = time.time()

        scheduler.step(total_train_loss)
        torch.cuda.empty_cache()

        # At the end of the epoch, do a pass on the validation set
        total_val_loss = 0
        for inputs, labels in val_loader:
            inputs, labels = Variable(inputs.to(device)), Variable(labels.to(device))
            val_outputs = model(inputs)
            val_loss_size = loss(val_outputs, labels)
            total_val_loss += val_loss_size.item()

        print("-" * 30)
        print("Train loss = {:.2f} | Train acc = {:.1f}% | Val loss = {:.2f} | took: {:.2f}s".format(
            total_train_loss / len(train_loader), total_train_acc / len(train_loader),
            total_val_loss / len(val_loader), epoch_time))
        print("=" * 60)

    print("Training finished, took {:.2f}s".format(time.time() - training_start_time))

CNN = SimpleCNN().to(device)
CNN.eval()
trainNet(CNN, batch_size=64, n_epochs=250, learning_rate=0.1)
Keras:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten, Activation
from tensorflow.keras.layers import Conv2D, MaxPool2D, BatchNormalization, GlobalAveragePooling2D

model = Sequential()
#####################################################
# Phase 1
model.add(Conv2D(64, (3, 3), input_shape=(32, 32, 3), padding='same'))
model.add(Activation('relu'))
model.add(Dropout(rate=0.3))
model.add(BatchNormalization())
# (32, 32, 64)
model.add(Conv2D(64, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Dropout(rate=0.3))
model.add(BatchNormalization())
# (32, 32, 64)
model.add(MaxPool2D((2, 2)))
# (16, 16, 64)
#####################################################
# Phase 2
model.add(Conv2D(128, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Dropout(rate=0.3))
model.add(BatchNormalization())
# (16, 16, 128)
model.add(Conv2D(128, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Dropout(rate=0.3))
model.add(BatchNormalization())
# (16, 16, 128)
model.add(MaxPool2D((2, 2), padding='same'))
# (8, 8, 128)
#####################################################
# Phase 3
model.add(Conv2D(256, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Dropout(rate=0.4))
model.add(BatchNormalization())
# (8, 8, 256)
model.add(Conv2D(256, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Dropout(rate=0.4))
model.add(BatchNormalization())
# (8, 8, 256)
model.add(Conv2D(256, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Dropout(rate=0.4))
model.add(BatchNormalization())
# (8, 8, 256)
model.add(MaxPool2D((2, 2)))
# (4, 4, 256)
#####################################################
# Phase 4
model.add(Conv2D(512, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Dropout(rate=0.4))
model.add(BatchNormalization())
# (4, 4, 512)
model.add(Conv2D(512, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Dropout(rate=0.4))
model.add(BatchNormalization())
# (4, 4, 512)
model.add(MaxPool2D((2, 2)))
# (2, 2, 512)
#####################################################
# Phase 5
model.add(Conv2D(512, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Dropout(rate=0.5))
model.add(BatchNormalization())
# (2, 2, 512)
model.add(GlobalAveragePooling2D(data_format='channels_last'))
# (512,)
model.add(Flatten())
model.add(Dropout(rate=0.5))
model.add(Dense(10, activation='relu'))
model.add(Dropout(rate=0.5))
model.add(Dense(10, activation='softmax'))

# sgd_optimizer and checkpoint are defined elsewhere (not shown in the question)
model.compile(optimizer=sgd_optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
history = model.fit(x=x_train, y=y_train, batch_size=64,
                    epochs=250, verbose=1, callbacks=[checkpoint],
                    validation_data=(x_test, y_test))
keras deep-learning pytorch
asked Mar 26 at 17:11 by Moeinh77
1 Answer
Edit: on a closer look, acc doesn't seem to require gradients, so the paragraph below probably doesn't apply.
It looks like the most significant issue is that total_train_acc accumulates history across the training loop (see https://pytorch.org/docs/stable/notes/faq.html for details). Changing total_train_acc += acc to total_train_acc += acc.item() should fix this.
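A minimal sketch of that change inside the question's training loop (only the accumulation lines are shown; everything else stays as in the original code):
    acc = (labels == predictions).sum()
    total_train_acc += acc.item()        # .item() converts the tensor to a plain Python number
    running_loss += loss_size.item()     # the losses already use .item(), which is why they don't hold onto memory
    total_train_loss += loss_size.item()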
Another thing: you should use with torch.no_grad() for the validation loop, so that no autograd graph is built during evaluation.
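For example, the validation pass at the end of each epoch could look like this (a sketch based on the question's val_loader, device and loss; the Variable wrapper is not needed in recent PyTorch versions):
    total_val_loss = 0
    with torch.no_grad():                  # no graph is recorded, so activations are freed right away
        for inputs, labels in val_loader:
            inputs, labels = inputs.to(device), labels.to(device)
            val_outputs = model(inputs)
            total_val_loss += loss(val_outputs, labels).item()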
Not really about speed, but model.train() and model.eval() should be used for training and evaluation respectively, so that the batchnorm and dropout layers work in the correct mode.
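A rough outline of where the mode switches could go in trainNet (sketch only; note that the question's script calls CNN.eval() right before training, so the model would otherwise stay in eval mode for the whole run):
    for epoch in range(n_epochs):
        model.train()                 # dropout active, batchnorm uses per-batch statistics
        for i, data in enumerate(train_loader, 0):
            ...                       # training step as in the question

        model.eval()                  # dropout disabled, batchnorm uses running statistics
        with torch.no_grad():
            for inputs, labels in val_loader:
                ...                   # validation step as in the question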
answered Mar 27 at 4:07 (edited Mar 27 at 4:17) by Sergey Dymchenko