What's the input of each LSTM layer in a stacked LSTM network?

I'm having some difficulty understanding the input-output flow of layers in stacked LSTM networks. Let's say I have created a stacked LSTM network like the one below:

from keras.models import Sequential
from keras.layers import LSTM

# parameters
time_steps = 10
features = 2
input_shape = [time_steps, features]
batch_size = 32

# model
model = Sequential()
model.add(LSTM(64, input_shape=input_shape, return_sequences=True))
model.add(LSTM(32, input_shape=input_shape))

Here the stacked LSTM network consists of two LSTM layers with 64 and 32 hidden units, respectively. In this scenario, I expect the 1st LSTM layer, LSTM(64), to pass its hidden state at each time step as input to the 2nd LSTM layer, LSTM(32), as a tensor of size [batch_size, time_steps, hidden_unit_length]. What confuses me is:

Does the 2nd LSTM layer, LSTM(32), receive as its input X(t) the hidden state of the 1st layer, LSTM(64), which has size [batch_size, time_steps, hidden_unit_length], and pass it through its own hidden network (in this case consisting of 32 nodes)?

If so, why is the input_shape of the 1st layer, LSTM(64), the same as that of the 2nd layer, LSTM(32), when the 2nd layer only processes the hidden state of the 1st? Shouldn't the input_shape of the 2nd layer be [32, 10, 64] in our case?

I found the LSTM visualization below very helpful, but it doesn't cover stacked LSTM networks:

[Image: LSTM workings]

Any help would be highly appreciated.
Thanks!

asked Mar 27 at 20:27 by Lio Chon

Tags: deep-learning, lstm, recurrent-neural-network, stacked


1 Answer

The input_shape argument is only required for the first layer. Each subsequent layer takes the output of the previous layer as its input, so any input_shape value passed to it is ignored.

The model below

model = Sequential()
model.add(LSTM(64, return_sequences=True, input_shape=(5, 2)))
model.add(LSTM(32))

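represents the architecture below, in which the first layer returns its full output sequence and the second layer returns only its last output:

[Image: architecture diagram]

You can verify this with model.summary():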

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
lstm_26 (LSTM)               (None, 5, 64)             17152
_________________________________________________________________
lstm_27 (LSTM)               (None, 32)                12416
=================================================================
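
The Param # column can be reproduced by hand. The sketch below assumes the standard LSTM parameterization (four gates, each with an input kernel, a recurrent kernel, and a bias vector); the helper name lstm_param_count is hypothetical and not part of Keras. Note that the second layer's count only works out if its input dimension is 64, the number of units in the first layer.

```python
def lstm_param_count(input_dim, units):
    """Trainable parameters of one LSTM layer (assumed: bias enabled, no peepholes).

    Each of the 4 gates has:
      - an input kernel of shape (input_dim, units)
      - a recurrent kernel of shape (units, units)
      - a bias of shape (units,)
    """
    return 4 * (units * (input_dim + units) + units)


# First layer: 2 input features, 64 units.
print(lstm_param_count(2, 64))   # 17152
# Second layer: its input is the 64-dimensional output of the first layer.
print(lstm_param_count(64, 32))  # 12416
```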

Replacing the line

model.add(LSTM(32))

with

model.add(LSTM(32, input_shape=(1000000, 200000)))

will still give you the same architecture (verify using model.summary()), because that input_shape is ignored: the layer takes the output tensor of the previous layer as its input.
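
To make the data flow concrete, here is a minimal NumPy sketch of a stacked LSTM forward pass. It is not Keras's implementation: the weights are random, the batch size of 32 is borrowed from the question, and the helper names (init_lstm, lstm_forward) are made up for illustration. What it shows is that the second layer's input at each time step is the first layer's hidden state at that step, so its input dimension is 64 regardless of any input_shape you might write down.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_lstm(input_dim, units, rng):
    """Random weights for one LSTM layer; the 4 gates are stacked on the last axis."""
    return {
        "W": rng.normal(scale=0.1, size=(input_dim, 4 * units)),  # input kernel
        "U": rng.normal(scale=0.1, size=(units, 4 * units)),      # recurrent kernel
        "b": np.zeros(4 * units),                                 # bias
    }


def lstm_forward(x, params, return_sequences):
    """x has shape (batch, time_steps, input_dim).

    Returns (batch, time_steps, units) if return_sequences is True,
    otherwise only the last hidden state with shape (batch, units).
    """
    batch, time_steps, _ = x.shape
    units = params["U"].shape[0]
    h = np.zeros((batch, units))  # hidden state
    c = np.zeros((batch, units))  # cell state
    outputs = []
    for t in range(time_steps):
        z = x[:, t, :] @ params["W"] + h @ params["U"] + params["b"]
        i, f, g, o = np.split(z, 4, axis=1)  # input, forget, candidate, output
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outputs.append(h)
    return np.stack(outputs, axis=1) if return_sequences else h


rng = np.random.default_rng(0)
x = rng.normal(size=(32, 5, 2))   # (batch, time_steps, features)

layer1 = init_lstm(2, 64, rng)    # first layer sees the 2 raw features
layer2 = init_lstm(64, 32, rng)   # second layer sees the 64 hidden units of layer 1

h1 = lstm_forward(x, layer1, return_sequences=True)
h2 = lstm_forward(h1, layer2, return_sequences=False)

print(h1.shape)  # (32, 5, 64)
print(h2.shape)  # (32, 32)
```

The leading None in the Keras summary corresponds to the batch dimension, which is 32 in this sketch.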

If you need a sequence-to-sequence architecture like the one below, in which the second layer also returns an output at every time step:

[Image: sequence-to-sequence architecture diagram]

use the following code:

model = Sequential()
model.add(LSTM(64, return_sequences=True, input_shape=(5, 2)))
model.add(LSTM(32, return_sequences=True))

which should produce the following model summary:

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
lstm_32 (LSTM)               (None, 5, 64)             17152
_________________________________________________________________
lstm_33 (LSTM)               (None, 5, 32)             12416
=================================================================
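
The effect of return_sequences on the Output Shape column can be traced with a few lines of plain Python. This is only a shape-bookkeeping sketch, not Keras code; lstm_output_shape and stack_shapes are hypothetical helpers. It also illustrates why every LSTM layer that feeds another LSTM layer needs return_sequences=True: an LSTM expects a 3-D input of (batch, time_steps, features).

```python
def lstm_output_shape(input_shape, units, return_sequences=False):
    """Output shape of an LSTM layer given a (batch, time_steps, features) input."""
    batch, time_steps, _features = input_shape
    if return_sequences:
        return (batch, time_steps, units)  # one output per time step
    return (batch, units)                  # only the last time step


def stack_shapes(input_shape, layers):
    """Propagate a shape through a list of (units, return_sequences) pairs."""
    shapes = []
    shape = input_shape
    for units, return_sequences in layers:
        if len(shape) != 3:
            raise ValueError(
                "LSTM expects a 3-D input; the previous layer needs "
                "return_sequences=True"
            )
        shape = lstm_output_shape(shape, units, return_sequences)
        shapes.append(shape)
    return shapes


# Many-to-one: the second layer returns only its last output.
print(stack_shapes((None, 5, 2), [(64, True), (32, False)]))
# [(None, 5, 64), (None, 32)]

# Sequence-to-sequence: both layers return the full sequence.
print(stack_shapes((None, 5, 2), [(64, True), (32, True)]))
# [(None, 5, 64), (None, 5, 32)]
```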

answered Mar 27 at 21:14 by mujjiga
