
Adding Extra HASH partitions to already HASH partitioned table


Hi, I currently have a table with 100 HASH partitions. I have decided to increase this to 1000 partitions to allow for future scaling.

Do I need to remove the partitioning from the table and then add the 1000 partitions afterwards, or is there a way to add the extra 900 partitions to the already partitioned table?

I partitioned the table using the code below.

ALTER TABLE t1
PARTITION BY HASH(venue_id)
PARTITIONS 100;

Is there also a way to estimate how long it will take to repartition my table into 1000 partitions? I will be using one of Percona's tools, pt-online-schema-change, which should prevent the table from locking: https://www.percona.com/doc/percona-toolkit/LATEST/pt-online-schema-change.html

asked Mar 25 at 11:01

Lukerayner

439 bronze badges

I won't be surprised if you gain nothing by the change. Possibly, your SELECTs will slow down. Keep us posted. And provide SHOW CREATE TABLE and the main SELECT.

– Rick James
Apr 17 at 3:21

mysql partitioning

2 Answers
You don't need to remove partitioning to repartition. It's going to insert the rows to a new table anyway, so you might as well do this in one step.

Just ALTER TABLE and define the new partitioning scheme:

ALTER TABLE t1
PARTITION BY HASH(venue_id)
PARTITIONS 1000;

Or with pt-online-schema-change:

pt-online-schema-change h=myhost,D=mydatabase,t=t1 \
 --alter "PARTITION BY HASH(venue_id) PARTITIONS 1000" \
 --execute

(I put line breaks in there to avoid line-wrapping, but that's one command.)
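
As a follow-up check, a query along these lines should confirm the new partition count once the ALTER or the Percona run has finished. It is only a sketch: it reads information_schema.PARTITIONS and assumes the schema is called mydatabase, the placeholder name used in the command above.

```sql
-- Count the partitions of t1 after repartitioning.
-- Assumption: the schema name is mydatabase (placeholder from the command above).
SELECT COUNT(*) AS partition_count
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = 'mydatabase'
  AND TABLE_NAME = 't1'
  AND PARTITION_NAME IS NOT NULL;  -- unpartitioned tables have a NULL partition name
```

If the repartitioning succeeded, partition_count should be 1000.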

I forgot to comment on your other question, about predicting the ETA for completion.

One advantage of the Percona script is that it reports progress, and you can estimate the completion time from that. In our environment, though, we find that the estimate is not very accurate. It can sometimes report that it's 99% complete for hours.

Also keep in mind that the Percona script is not 100% lock-free. It needs an exclusive metadata lock briefly at the start and end of its run, because it has to create triggers at the start, then rename the tables and drop the triggers at the end. Any query still running against the table, even a read-only SELECT, will block the request for that metadata lock. If the script has trouble completing, make sure any queries and transactions you run against your table finish quickly, or kill them if they don't.
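
To see what might be holding up the metadata lock, something like the following can be run before starting the script or while it appears stuck. This is a sketch rather than a complete check: it uses the standard information_schema.INNODB_TRX and information_schema.PROCESSLIST tables, and the 60-second threshold and the thread id in the final comment are arbitrary assumptions.

```sql
-- Open InnoDB transactions, oldest first. A transaction that has touched t1
-- keeps its metadata lock until it commits or rolls back, even if the session
-- is currently idle.
SELECT trx_mysql_thread_id AS thread_id,
       trx_started,
       trx_state,
       trx_query
FROM information_schema.INNODB_TRX
ORDER BY trx_started;

-- Statements that have been running for more than 60 seconds (arbitrary threshold).
SELECT ID, USER, DB, TIME, STATE, INFO
FROM information_schema.PROCESSLIST
WHERE COMMAND <> 'Sleep'
  AND TIME > 60
ORDER BY TIME DESC;

-- If a session has to be terminated, use the id reported above, for example:
-- KILL 12345;  -- hypothetical thread id
```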

edited Mar 25 at 14:03

answered Mar 25 at 13:16

Bill Karwin

393k, 67 gold badges, 531 silver badges, 685 bronze badges

Thank you very much for this answer. I am currently running the percona command and once it has finished and worked I will mark this as the correct answer. :)

– Lukerayner
Mar 25 at 13:57

That worked perfectly and thank you for the extra info in your update. I just thought it would be worth asking if once I have added the 1000 partitions should the performance be the same or a bit slower? I don't need 1000 just yet but in a year or 2 I will so I just thought it was best to do it now before I had loads of data making the alter take hours/days.

– Lukerayner
Mar 25 at 14:30

I frequently say that tables don't have performance — queries have performance. There are certainly queries that will not perform well no matter how many partitions you have, and the greater number of partitions may cause them to be slower.

– Bill Karwin
Mar 25 at 14:34

That is a very true comment ;) Ok thank you for the advice, I am forcing the query to look at a certain partition using PARTITION(p46) for example. I am hoping because I do this the total number of partitions shouldn't have an impact.

– Lukerayner
Mar 25 at 14:37

Yes, if you limit the partitions explicitly, or else if the optimizer does that for you by partition pruning, then dividing your table into smaller partitions should help it scan fewer rows, which will reduce the overall query time.

– Bill Karwin
Mar 25 at 14:39
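
A short sketch of the two approaches mentioned in these comments. It assumes MySQL's documented rule that HASH partitioning stores a row in partition number MOD(expr, number_of_partitions), and that the partitions keep their default names p0, p1, and so on. The venue_id value 146 is made up for illustration.

```sql
-- Which partition does a venue_id land in?
SELECT MOD(46, 100)   AS venue_46_with_100_partitions,    -- 46
       MOD(146, 100)  AS venue_146_with_100_partitions,   -- 46
       MOD(146, 1000) AS venue_146_with_1000_partitions;  -- 146

-- Let the optimizer prune: in MySQL 5.7 and later the "partitions" column of
-- EXPLAIN lists the partitions that will be read (older versions need
-- EXPLAIN PARTITIONS instead).
EXPLAIN SELECT * FROM t1 WHERE venue_id = 146;

-- Explicit partition selection, as described in the comment above.
SELECT COUNT(*) FROM t1 PARTITION (p146) WHERE venue_id = 146;
```

One consequence worth checking: a hard-coded name such as p46 covers a different set of venue_id values after the change from 100 to 1000 partitions, so explicit PARTITION (...) clauses would need to be recomputed, whereas pruning on WHERE venue_id = ... adapts automatically.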


PARTITION BY HASH is virtually useless. I don't expect it to help you with 100 partitions, nor with 1000.

You get more bang for your buck by arranging to have venue_id as the first column in the PRIMARY KEY.

Does the query always have a single venue_id? (If not the options get messier.) For now, I will assume you always have WHERE venue_id = constant.

You have a multi-dimensional indexing problem. INDEXes are only one dimension, so things get tricky. However, partitioning can be used to sort of get a two-dimensional index.

Let's pick day_epoch as the partition key and use PARTITION BY RANGE(day_epoch). (If you change that from a 4-byte INT to a 3-byte DATE, then use PARTITION BY RANGE(TO_DAYS(day_epoch))).

Then let's decide on the PRIMARY KEY. Note: When adding or removing partitioning, the PK should be re-thought. Keep in mind that a PK is a unique index. And the data is clustered on the PK. (However, uniqueness is not guaranteed across partitions.)

So...

PARTITION BY RANGE(day_epoch)

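PRIMARY KEY(venue_id, zone_id, id, day_epoch) -- in this order; day_epoch is appended because MySQL requires every column of the partitioning expression to be part of the PK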

Without partitioning, I recommend

PRIMARY KEY(venue_id, zone_id, day_epoch, id)

In general, any index (including the PK) should start with any column(s) that are tested with =. Then IN, then at most one 'range'.

For the sake of the uniqueness requirement of the PK, I put the id last.
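
The ordering rule above can be sketched for the unpartitioned case as follows. The table name t1_flat, the column types, and the literal values are assumptions made for illustration; only the column names and the key order come from the answer.

```sql
-- Hypothetical unpartitioned variant of the table.
CREATE TABLE t1_flat (
  id        BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  venue_id  INT UNSIGNED    NOT NULL,
  zone_id   INT UNSIGNED    NOT NULL,
  day_epoch INT UNSIGNED    NOT NULL,
  -- "=" column first, then the IN column, then the range column,
  -- and id last to make the key unique.
  PRIMARY KEY (venue_id, zone_id, day_epoch, id),
  KEY (id)  -- InnoDB needs the AUTO_INCREMENT column to lead some index
) ENGINE=InnoDB;

-- A query shaped to match that key order.
SELECT *
FROM t1_flat
WHERE venue_id = 46                                -- tested with =
  AND zone_id IN (3, 7)                            -- tested with IN
  AND day_epoch BETWEEN 1554076800 AND 1556668799; -- the single range
```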

So, the query performs something like this:

"Partition pruning" -- probably down to a single partition, based on the date.

Drill down the PK directly to the consecutive rows for the one venue_id in question.

Hopscotch across the data based on the zone_ids. (In some situations, this may be a range scan instead of the jumping around. This depends on the version, number of ids, values of the ids, and perhaps the phase of the moon.)

(If it makes it this far) Then get the desired date.

When fetching lots of rows from a huge table, the most important thing is to minimize disk hits. What I just described probably does the job better than other situations. Partitioning on venue_id helps only with that one column, but fails to help with the rest.
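
The RANGE-partitioned layout described in this answer might look roughly like the sketch below. Everything not named in the answer is an assumption: the table name t1_by_day, the column types, the interpretation of day_epoch as Unix seconds at midnight UTC, and the monthly partition boundaries. day_epoch is appended to the primary key here because MySQL requires every unique key, including the PK, to contain all columns of the partitioning expression.

```sql
-- Hypothetical table; names, types, and partition boundaries are assumptions.
CREATE TABLE t1_by_day (
  id        BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  venue_id  INT UNSIGNED    NOT NULL,
  zone_id   INT UNSIGNED    NOT NULL,
  day_epoch INT UNSIGNED    NOT NULL,  -- assumed: Unix seconds at 00:00 UTC
  PRIMARY KEY (venue_id, zone_id, id, day_epoch),
  KEY (id)  -- the AUTO_INCREMENT column must lead some index
) ENGINE=InnoDB
PARTITION BY RANGE (day_epoch) (
  PARTITION p2019_03 VALUES LESS THAN (1554076800),  -- before 2019-04-01
  PARTITION p2019_04 VALUES LESS THAN (1556668800),  -- before 2019-05-01
  PARTITION p_future VALUES LESS THAN MAXVALUE
);

-- The date range selects the partition; the PK then narrows the scan to one
-- venue_id and the listed zone_ids inside it.
EXPLAIN
SELECT *
FROM t1_by_day
WHERE venue_id = 46
  AND zone_id IN (3, 7)
  AND day_epoch >= 1554076800
  AND day_epoch <  1556668800;
```

With these boundaries, the partitions column of the EXPLAIN output should list only p2019_04, which is the pruning step described above.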

answered Apr 17 at 15:30

Rick James

75k, 5 gold badges, 68 silver badges, 110 bronze badges
