Is there a faster way to insert records to postgresql database while iterating over very large ndarray?
I am trying to loop over an ndarray and record its index and value to PostgreSQL. Here is my code:

for idx, val in enumerate(data):
    cur.execute("INSERT INTO public.spams(review_id, label, confidence_level, aoc, created_at) VALUES (%s, %s, %s, %s, %s)",
                (idx + 1, spamlabel, 0, 0, dt.now()))

The ndarray has about 762k elements, and inserting them this way took more than 8 hours. Is there a more efficient way to do this?

python-3.x postgresql iteration numpy-ndarray
That doesn't have anything to do with NumPy, only with the strategy you use with the database library. Which library are you using here? Almost any up-to-date library supports batched INSERTs, which is the way to go.
– Ancoron
Mar 24 at 11:27

I am using psycopg2 for PostgreSQL. How can I do batched INSERTs with it for my ndarray?
– Mert Koç
Mar 24 at 11:39
asked Mar 24 at 11:02
Mert Koç
1 Answer
Use psycopg2's execute_values helper method, and also provide constants in the template to limit the data that has to be transferred, e.g.:

from psycopg2 import extras

extras.execute_values(
    cur,
    # a single "VALUES %s" placeholder; execute_values expands it into multi-row VALUES lists
    "INSERT INTO public.spams(review_id, label, confidence_level, aoc, created_at) VALUES %s",
    enumerate(data),  # yields (idx, val) pairs
    template="(%s + 1, %s, 0, 0, CURRENT_TIMESTAMP)")  # constants and the timestamp live in the template instead of being passed per row from Python

You can also experiment with the page_size parameter for further throughput tuning.
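For example, a minimal sketch of passing page_size, reusing cur, data, and the extras import from above (the value 1000 is only an illustrative assumption; psycopg2's default is 100, and the best value depends on row size and network latency):

extras.execute_values(
    cur,
    "INSERT INTO public.spams(review_id, label, confidence_level, aoc, created_at) VALUES %s",
    enumerate(data),
    template="(%s + 1, %s, 0, 0, CURRENT_TIMESTAMP)",
    page_size=1000)  # number of rows packed into each generated INSERT statement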
I have tried what you suggest, but extras.execute_values accepts only one %s placeholder in the query. How can I pass %s + 1 as review_id and %s as the val from data?
– Mert Koç
Mar 24 at 23:01

Sorry, I was using execute_batch first. Updated for the VALUES template.
– Ancoron
Mar 25 at 5:44

Thanks for the answer; I think this will work and I will try it soon, but I have another problem. During execution, if the server closes the connection or any error occurs, the inserts won't be committed and the data will be lost. Do you have any suggestions on how to commit records part by part (thousands of rows at a time)?
– Mert Koç
Mar 25 at 15:09

Sorry for the late reply. In that case, you have to split up the data into chunks beforehand and iterate over them. Also, you'd have to make sure that your Python app can "remember" where it left off (which chunk has been committed already).
– Ancoron
Mar 28 at 21:59
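A minimal sketch of that chunk-and-commit approach (the chunk size, function name, and resume bookkeeping are illustrative assumptions, not part of the original answer):

from psycopg2 import extras

CHUNK_SIZE = 10_000  # rows committed per transaction (assumed tuning value)

def insert_in_chunks(conn, data, start_row=0):
    # Insert `data` in chunks, committing after each one so a dropped
    # connection loses at most the current chunk. `start_row` lets the
    # caller resume from the first row that has not been committed yet.
    with conn.cursor() as cur:
        for offset in range(start_row, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            extras.execute_values(
                cur,
                "INSERT INTO public.spams(review_id, label, confidence_level, aoc, created_at) VALUES %s",
                [(offset + i, val) for i, val in enumerate(chunk)],
                template="(%s + 1, %s, 0, 0, CURRENT_TIMESTAMP)")
            conn.commit()  # persist this chunk before moving on
            # a real application would also record offset + len(chunk) somewhere
            # durable (a file or a progress table) so it can "remember" where it left off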
edited Mar 25 at 5:43
answered Mar 24 at 12:13
Ancoron