How to cache the left most table in memory for a left outer join in hive


I have a large table (1 TB of data) that needs to be joined with a smaller table (100k records):



SELECT st.id
FROM small_table st
LEFT JOIN large_table lt
ON st.id = lt.id


In the above scenario, I am not able to control which table is cached in memory. I have tried the MAPJOIN and STREAMTABLE hints, and also parameters such as the conditional task size and the small table size. Since the small table is on the left-most side of the join, it is not being cached in memory.



Is there a way to control which table is cached?



Note: I cannot alter the table positions or the code, neither here [image not available] nor here [image not available].



Parameters used:



set hive.execution.engine=tez;
set hive.tez.container.size=4096;
set hive.merge.mapredfiles=true;
set tez.shuffle-vertex-manager.min-src-fraction=0.25;
set tez.shuffle-vertex-manager.max-src-fraction=0.75;
set hive.exec.dynamic.partition.mode=nonstrict;
set tez.am.resource.memory.mb=3200;
set tez.am.java.opts=-server -Xmx3200m -Djava.net.preferIPv4Stack=true -XX:+UseNUMA -XX:+UseParallelGC -XX:+UseConcMarkSweepGC;
set hive.auto.convert.join=true;
set hive.auto.convert.join.noconditionaltask.size=288435456;









  • If I change the positions of the tables, then a map join is enabled: SELECT st.id FROM large_table lt LEFT JOIN small_table st ON st.id = lt.id

    – PK25
    Mar 22 at 23:16












  • Please also add the parameters used.

    – leftjoin
    Mar 23 at 8:03











  • set hive.execution.engine=tez; set hive.tez.container.size=4096; set hive.merge.mapredfiles=true; set tez.shuffle-vertex-manager.min-src-fraction=0.25; set tez.shuffle-vertex-manager.max-src-fraction=0.75; set hive.exec.dynamic.partition.mode=nonstrict; set tez.am.resource.memory.mb=3200 ; set tez.am.java.opts=-server -Xmx3200m -Djava.net.preferIPv4Stack=true -XX:+UseNUMA -XX:+UseParallelGC -XX:+UseConcMarkSweepGC ; SET hive.auto.convert.join=true; set hive.auto.convert.join.noconditionaltask.size=288435456;

    – PK25
    Mar 24 at 2:38
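As background to the observation that the map join only kicks in when the small table is on the right: as far as I understand it, a map join streams one table and loads the other into memory in every map task, and for a LEFT JOIN the streamed table has to be the left (preserved) one. The toy simulation below is only a sketch of that reasoning, not Hive's implementation; the function names and the tiny id lists are made up for illustration. It shows that if the left table were the in-memory side, each map task would see only its own split of the right table and could not tell whether a left row is truly unmatched.

```python
from collections import Counter

def reference_left_join(left_ids, right_ids):
    """Plain LEFT JOIN on id: every left row survives, with None when unmatched."""
    out = []
    for l in left_ids:
        matches = [r for r in right_ids if r == l]
        if matches:
            out.extend((l, r) for r in matches)
        else:
            out.append((l, None))
    return out

def map_join_left_in_memory(left_ids, right_splits):
    """Hypothetical plan: load the LEFT table in every map task, stream right splits."""
    out = []
    for split in right_splits:            # each iteration is one independent map task
        matched = set()
        for r in split:
            if r in left_ids:
                out.append((r, r))
                matched.add(r)
        # The task only knows its own split, so "unmatched" is a local guess.
        out.extend((l, None) for l in left_ids if l not in matched)
    return out

def map_join_right_in_memory(left_splits, right_ids):
    """Valid plan: stream LEFT splits, load the whole RIGHT table in every map task."""
    out = []
    for split in left_splits:
        out.extend(reference_left_join(split, right_ids))
    return out

small = [1, 2, 3]                  # stands in for small_table (left side)
large = [1, 1, 2, 9]               # stands in for large_table (right side)

expected = reference_left_join(small, large)
broken = map_join_left_in_memory(small, [[1, 1], [2, 9]])
ok = map_join_right_in_memory([[1, 2], [3]], large)

print(Counter(broken) == Counter(expected))   # spurious and duplicated NULL rows
print(Counter(ok) == Counter(expected))
```

In this sketch the only plan that stays correct keeps the whole right table in memory, which is not feasible when the right table is the 1 TB one. That would be consistent with the join falling back to a shuffle join unless the table order is swapped.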

















performance hive mapjoin






edited Mar 24 at 8:00 by leftjoin

asked Mar 22 at 23:10 by PK25











