Why does this TensorFlow example not have a summation before the activation function?
I'm trying to understand a TensorFlow code snippet. What I've been taught is that we sum all the incoming inputs and then pass them to an activation function. Shown in the picture below is a single neuron. Notice that we compute a weighted sum of the inputs and THEN compute the activation.
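To make that concrete, this is roughly the computation I have in mind for a single neuron (a minimal numpy sketch; the values are made up for illustration):

import numpy as np

inputs  = np.array([2.0, 3.0, 1.0])   # incoming values x_i
weights = np.array([3.0, 1.0, 2.0])   # one weight w_i per input
bias    = 0.5

z = np.sum(inputs * weights) + bias   # the explicit summation step
a = np.maximum(z, 0.0)                # then the activation (ReLU here)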
In most examples of the multi-layer perceptron, they don't include the summation step. I find this very confusing.
Here is an example of one of those snippets:
weights = {
    'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])),
    'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_hidden_2, n_classes]))
}
biases = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'b2': tf.Variable(tf.random_normal([n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_classes]))
}

# Create model
def multilayer_perceptron(x):
    # Hidden fully connected layer with 256 neurons
    layer_1 = tf.nn.relu(tf.add(tf.matmul(x, weights['h1']), biases['b1']))
    # Hidden fully connected layer with 256 neurons
    layer_2 = tf.nn.relu(tf.add(tf.matmul(layer_1, weights['h2']), biases['b2']))
    # Output fully connected layer with a neuron for each class
    out_layer = tf.nn.relu(tf.matmul(layer_2, weights['out']) + biases['out'])
    return out_layer
In each layer, we first multiply the inputs by the weights. Afterwards, we add the bias term. Then we pass the result to tf.nn.relu. Where does the summation happen? It looks like we've skipped it!
Any help would be really great!
python tensorflow machine-learning
It's done by softmax as far as I understand; it's the equivalent of softmax = tf.exp(logits) / tf.reduce_sum(tf.exp(logits), axis)
– EdChum
Mar 28 at 14:27
Okay -- the softmax layer does it. But the other nodes don't do it?
– echo
Mar 28 at 14:46
No, I don't think so, as that wouldn't make sense; if you sum or perform any kind of aggregation, it stops being a layer, so you can't feed it into another layer
– EdChum
Mar 28 at 15:01
It does remain a layer. Each individual neuron in a layer takes input and each neuron has to produce a single scalar value.
– echo
Mar 28 at 15:05
2 Answers
The tf.matmul operator performs a matrix multiplication, which means that each element in the resulting matrix is a sum of products (which corresponds exactly to what you describe).
Take a simple example with a row-vector and a column-vector, as would be the case if you had exactly one neuron and an input vector (as per the graphic you shared above):
x = [2,3,1]
y = [3,
1,
2]
Then the result would be:
tf.matmul(x, y) = 2*3 + 3*1 + 1*2 = 11
There you can see the weighted sum.
P.S.: tf.multiply performs element-wise multiplication, which is not what we want here.
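For instance, you can verify this directly (a minimal sketch assuming TensorFlow 2.x with eager execution; tf.matmul expects 2-D tensors, so the vectors are written as a 1x3 and a 3x1 matrix):

import tensorflow as tf

x = tf.constant([[2., 3., 1.]])      # row vector, shape (1, 3)
y = tf.constant([[3.], [1.], [2.]])  # column vector, shape (3, 1)

print(tf.matmul(x, y))               # [[11.]] = 2*3 + 3*1 + 1*2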
The last layer of your model, out_layer, outputs the probability of each class, Prob(y=yi|X), and has shape [batch_size, n_classes]. To calculate these probabilities, the softmax function is applied. For each single input data point x that your model receives, it outputs a vector of probabilities y whose size is the number of classes. You then pick the class with the highest probability by applying argmax to the output vector, class = argmax(P(y|x)), which can be written in TensorFlow as y_pred = tf.argmax(out_layer, 1).
Consider a network with a single layer. You have an input matrix X of shape [n_samples, x_dimension] and you multiply it by some matrix W of shape [x_dimension, model_output]. The summation you're talking about is the dot product between a row of the matrix X and a column of the matrix W. The output will then have shape [n_samples, model_output]. To this output you apply the activation function (if it is the final layer, you probably want softmax). Perhaps the picture you've shown is a bit misleading.
Mathematically, the layer without the bias can be described as $XW$. Suppose that the first row of the matrix $X$ (the first row is a single input data point) is $x = (x_1, x_2, \dots, x_d)$ and the first column of $W$ is $w = (w_1, w_2, \dots, w_d)^T$. The result of this dot product is given by $x \cdot w = x_1 w_1 + x_2 w_2 + \dots + x_d w_d$, which is your summation. You repeat this for each column of the matrix $W$, and the result is a vector of size model_output (which corresponds to the number of columns of $W$). To this vector you add the bias (if needed) and then apply the activation.
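A small numpy sketch of the same idea (the sizes are arbitrary, chosen only to show the shapes):

import numpy as np

n_samples, x_dimension, model_output = 4, 3, 2
X = np.random.randn(n_samples, x_dimension)
W = np.random.randn(x_dimension, model_output)
b = np.random.randn(model_output)

out = X @ W + b                        # shape (n_samples, model_output)

# Each output entry is the dot product of a row of X with a column of W,
# plus the corresponding bias:
manual = np.dot(X[0], W[:, 0]) + b[0]
assert np.isclose(out[0, 0], manual)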
I updated the question to use the relu activation function at the end of the network. I don't think it should matter what the activation function is.
– echo
Mar 28 at 15:04
@echo I've updated my answer
– Vlad
Mar 28 at 15:13