What's the input of each LSTM layer in a stacked LSTM network?
I'm having some difficulty understanding the input-output flow of layers in stacked LSTM networks. Let's say I have created a stacked LSTM network like the one below:
from keras.models import Sequential
from keras.layers import LSTM
# parameters
time_steps = 10
features = 2
input_shape = [time_steps, features]
batch_size = 32
# model
model = Sequential()
model.add(LSTM(64, input_shape=input_shape, return_sequences=True))
model.add(LSTM(32, input_shape=input_shape))
where our stacked LSTM network consists of 2 LSTM layers with 64 and 32 hidden units respectively. In this scenario, we expect that at each time step the 1st LSTM layer, LSTM(64), will pass as input to the 2nd LSTM layer, LSTM(32), a vector of size [batch_size, time-step, hidden_unit_length], which would represent the hidden state of the 1st LSTM layer at the current time step. What confuses me is:
- Does the 2nd LSTM layer, LSTM(32), receive as X(t) (its input) the hidden state of the 1st layer, LSTM(64), which has size [batch_size, time-step, hidden_unit_length], and pass it through its own hidden network (in this case consisting of 32 nodes)?
- If the first is true, why is the input_shape of the 1st layer, LSTM(64), and the 2nd layer, LSTM(32), the same, when the 2nd only processes the hidden state of the 1st layer? Shouldn't input_shape in our case be set to [32, 10, 64]?
I found the LSTM visualization below very helpful (found here), but it doesn't expand on stacked LSTM networks:
Any help would be highly appreciated.
Thanks!
deep-learning lstm recurrent-neural-network stacked
asked Mar 27 at 20:27
Lio Chon
1 Answer
The input_shape is only required for the first layer. Subsequent layers take the output of the previous layer as their input, so their input_shape argument is ignored.
The model below
model = Sequential()
model.add(LSTM(64, return_sequences=True, input_shape=(5, 2)))
model.add(LSTM(32))
represents the following architecture, which you can verify with model.summary():
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
lstm_26 (LSTM)               (None, 5, 64)             17152
_________________________________________________________________
lstm_27 (LSTM)               (None, 32)                12416
=================================================================
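The parameter counts in this summary also indicate what each layer receives as input. As a rough check, the sketch below uses the standard parameter count for an LSTM layer with biases (four gates, each with an input kernel, a recurrent kernel, and a bias). The helper name lstm_param_count is hypothetical and not part of Keras.

```python
def lstm_param_count(input_dim: int, units: int) -> int:
    """Trainable parameters of a standard LSTM layer with biases.

    Each of the 4 gates has an input kernel (input_dim x units),
    a recurrent kernel (units x units), and a bias vector (units).
    """
    return 4 * (input_dim * units + units * units + units)


# First layer: sees the 2 raw features at each time step.
print(lstm_param_count(input_dim=2, units=64))   # 17152

# Second layer: sees the 64-dimensional hidden state of the first layer.
print(lstm_param_count(input_dim=64, units=32))  # 12416

# If the second layer were fed the 2 raw features instead, the count
# would not match the summary.
print(lstm_param_count(input_dim=2, units=32))   # 4480
```

The count of 12416 for the second layer only works out with an input dimension of 64, which is consistent with it consuming the first layer's hidden state at each time step rather than the original features.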
Replacing the line
model.add(LSTM(32))
with
model.add(LSTM(32, input_shape=(1000000, 200000)))
will still give you the same architecture (verify with model.summary()), because the input_shape is ignored: the layer takes the output tensor of the previous layer as its input.
If you need a sequence-to-sequence architecture, where the last layer also returns an output at every time step, use:
model = Sequential()
model.add(LSTM(64, return_sequences=True, input_shape=(5, 2)))
model.add(LSTM(32, return_sequences=True))
which should produce this model summary:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
lstm_32 (LSTM)               (None, 5, 64)             17152
_________________________________________________________________
lstm_33 (LSTM)               (None, 5, 32)             12416
=================================================================
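To make the shape flow in both summaries explicit, here is a minimal pure-Python sketch that traces output shapes through a stack of LSTM layers. It only models the shape rules (3D input required, return_sequences keeps the time axis); it is not Keras code, and the names lstm_output_shape and trace_stack are hypothetical.

```python
from typing import List, Optional, Tuple

Shape = Tuple[Optional[int], ...]


def lstm_output_shape(input_shape: Shape, units: int,
                      return_sequences: bool) -> Shape:
    """Output shape of one LSTM layer for a (batch, time_steps, features) input."""
    if len(input_shape) != 3:
        raise ValueError(f"LSTM expects a 3D input, got shape {input_shape}")
    batch, time_steps, _features = input_shape
    if return_sequences:
        # One hidden state per time step is passed on.
        return (batch, time_steps, units)
    # Only the hidden state of the last time step is passed on.
    return (batch, units)


def trace_stack(input_shape: Shape,
                layers: List[Tuple[int, bool]]) -> List[Shape]:
    """Feed each layer's output shape into the next layer, as Sequential does."""
    shapes = []
    current = input_shape
    for units, return_sequences in layers:
        current = lstm_output_shape(current, units, return_sequences)
        shapes.append(current)
    return shapes


# Many-to-one stack from the first example.
print(trace_stack((None, 5, 2), [(64, True), (32, False)]))
# [(None, 5, 64), (None, 32)]

# Sequence-to-sequence stack from the second example.
print(trace_stack((None, 5, 2), [(64, True), (32, True)]))
# [(None, 5, 64), (None, 5, 32)]
```

Note that the second layer never consults the original (5, 2) input shape. Under the same rules, dropping return_sequences=True from the first layer would hand the second layer a 2D tensor, which the sketch rejects with a ValueError.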
answered Mar 27 at 21:14
mujjiga