What's the input of each LSTM layer in a stacked LSTM network?


I'm having some difficulty understanding the input-output flow of layers in stacked LSTM networks. Let's say I have created a stacked LSTM network like the one below:



# imports (standalone Keras assumed)
from keras.models import Sequential
from keras.layers import LSTM

# parameters
time_steps = 10
features = 2
input_shape = [time_steps, features]
batch_size = 32

# model
model = Sequential()
model.add(LSTM(64, input_shape=input_shape, return_sequences=True))
model.add(LSTM(32, input_shape=input_shape))


Here, the stacked LSTM network consists of two LSTM layers with 64 and 32 hidden units, respectively. In this scenario, I expect that at each time step the 1st LSTM layer, LSTM(64), passes to the 2nd LSTM layer, LSTM(32), a tensor of size [batch_size, time_steps, hidden_unit_length], which represents the hidden state of the 1st LSTM layer at the current time step. What confuses me is:



  1. Does the 2nd LSTM layer, LSTM(32), receive as its input X(t) the hidden state of the 1st layer, LSTM(64), which has size [batch_size, time_steps, hidden_unit_length], and pass it through its own hidden network, in this case consisting of 32 nodes?

  2. If so, why is the input_shape of the 1st layer, LSTM(64), the same as that of the 2nd layer, LSTM(32), when the 2nd only processes the hidden state of the 1st layer? Shouldn't the 2nd layer's input_shape in our case be [32, 10, 64]?

I found the LSTM visualization below very helpful (found here), but it doesn't cover stacked LSTM networks:

[Image: LSTM workings]



Any help would be highly appreciated.
Thanks!










deep-learning lstm recurrent-neural-network stacked

asked Mar 27 at 20:27 by Lio Chon

























1 Answer















The input_shape is only required for the first layer. Each subsequent layer takes the output of the previous layer as its input, so its input_shape argument is ignored.

The model below

model = Sequential()
model.add(LSTM(64, return_sequences=True, input_shape=(5, 2)))
model.add(LSTM(32))


represents the architecture below:

[Image]

You can verify this with model.summary():



_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
lstm_26 (LSTM)               (None, 5, 64)             17152
_________________________________________________________________
lstm_27 (LSTM)               (None, 32)                12416
=================================================================
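The parameter counts in this summary are themselves a check on what each layer receives. As a rough sketch, assuming a standard LSTM cell with four gates and one bias vector per gate (the helper name `lstm_param_count` is made up for illustration), the counts can be reproduced by hand:

```python
def lstm_param_count(input_dim, units):
    """Trainable parameters of a standard LSTM layer.

    Four gates (input, forget, cell candidate, output), each with
    input weights, recurrent weights, and a bias vector.
    """
    return 4 * (units * (input_dim + units) + units)


# First layer sees the 2 raw features per time step.
first = lstm_param_count(input_dim=2, units=64)
# Second layer sees the 64-dimensional hidden state of the first layer.
second = lstm_param_count(input_dim=64, units=32)

print(first, second)  # 17152 12416
```

The second count only matches the summary when the input dimension is 64, which is consistent with the second layer consuming the first layer's hidden state rather than the original 2 features.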


Replacing the line

model.add(LSTM(32))

with

model.add(LSTM(32, input_shape=(1000000, 200000)))

will still give you the same architecture (verify with model.summary()), because the layer's input_shape is ignored: it takes the output tensor of the previous layer as its input.
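To make the shape flow concrete for the model in the question (time_steps = 10, features = 2), here is a small shape-bookkeeping sketch. It is not Keras code; the helpers `lstm_output_shape` and `stacked_shapes` are hypothetical names that only model the behaviour described above, where each layer's input shape is derived from the previous layer's output.

```python
def lstm_output_shape(input_shape, units, return_sequences=False):
    """Output shape of an LSTM layer given a (batch, time_steps, features) input."""
    batch, time_steps, _features = input_shape
    if return_sequences:
        return (batch, time_steps, units)  # one hidden state per time step
    return (batch, units)                  # only the last hidden state


def stacked_shapes(first_input_shape, layers):
    """Propagate shapes through a stack of (units, return_sequences) layers.

    Only the first layer uses a declared input shape; every later layer
    receives whatever shape the previous layer produced.
    """
    current = (None,) + tuple(first_input_shape)  # None = unspecified batch size
    shapes = []
    for units, return_sequences in layers:
        output = lstm_output_shape(current, units, return_sequences)
        shapes.append((current, output))
        current = output
    return shapes


# The question's model: LSTM(64, return_sequences=True) followed by LSTM(32).
shapes = stacked_shapes((10, 2), [(64, True), (32, False)])
for layer_input, layer_output in shapes:
    print(layer_input, "->", layer_output)
# (None, 10, 2) -> (None, 10, 64)
# (None, 10, 64) -> (None, 32)
```

Under these assumptions the second layer's effective input is (None, 10, 64), that is, the batch dimension followed by [time_steps, 64], regardless of any input_shape passed to it.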



If you need a sequence-to-sequence architecture like the one below

[Image]

you should use the following code:

model = Sequential()
model.add(LSTM(64, return_sequences=True, input_shape=(5, 2)))
model.add(LSTM(32, return_sequences=True))


which should produce the following model:

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
lstm_32 (LSTM)               (None, 5, 64)             17152
_________________________________________________________________
lstm_33 (LSTM)               (None, 5, 32)             12416
=================================================================
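The two variants above differ only in what the last layer returns. The sketch below illustrates this with a toy, pure-Python LSTM forward pass for a single sample (no batch dimension). It assumes a textbook LSTM cell with random weights and is not how Keras implements the layer; `make_lstm` and `lstm_forward` are hypothetical helpers introduced for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_lstm(input_dim, units, rng):
    """Random weights: 4 gates x units rows, each over [input, hidden, bias]."""
    row_len = input_dim + units + 1
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(row_len)]
               for _ in range(4 * units)]
    return {"units": units, "weights": weights}


def lstm_forward(layer, sequence, return_sequences=False):
    """Run one sample (a list of per-step feature lists) through the layer."""
    units = layer["units"]
    h = [0.0] * units  # hidden state
    c = [0.0] * units  # cell state
    outputs = []
    for x_t in sequence:
        z = list(x_t) + h + [1.0]  # current input, previous hidden state, bias term
        pre = [sum(w * v for w, v in zip(row, z)) for row in layer["weights"]]
        i = [sigmoid(p) for p in pre[0:units]]                # input gate
        f = [sigmoid(p) for p in pre[units:2 * units]]        # forget gate
        g = [math.tanh(p) for p in pre[2 * units:3 * units]]  # cell candidate
        o = [sigmoid(p) for p in pre[3 * units:4 * units]]    # output gate
        c = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c, i, g)]
        h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]
        outputs.append(h)
    return outputs if return_sequences else h


rng = random.Random(0)
sample = [[rng.random(), rng.random()] for _ in range(5)]  # 5 time steps, 2 features

layer1 = make_lstm(input_dim=2, units=64, rng=rng)
layer2 = make_lstm(input_dim=64, units=32, rng=rng)

# Layer 1 returns its hidden state at every time step: 5 vectors of length 64.
h1 = lstm_forward(layer1, sample, return_sequences=True)
# Layer 2 consumes h1 step by step and returns either the last state or all states.
last = lstm_forward(layer2, h1, return_sequences=False)
seq = lstm_forward(layer2, h1, return_sequences=True)

print(len(h1), len(h1[0]))    # 5 64
print(len(last))              # 32
print(len(seq), len(seq[0]))  # 5 32
```

At each time step the second layer's input x_t is the first layer's 64-dimensional hidden state for that step, and with return_sequences=False the result is simply the final element of the full sequence.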





answered Mar 27 at 21:14 by mujjiga


















