OpenCL Pipeline failed to allocate buffer with cl_mem_object_allocation_failure


I have an OpenCL pipeline that processes images and video, and it can be greedy with memory at times. It crashes on a cl::Buffer() allocation like this:

cl_int err = CL_SUCCESS;
cl::Buffer tmp = cl::Buffer(m_context, CL_MEM_READ_WRITE, sizeData, NULL, &err);

with error -4 (CL_MEM_OBJECT_ALLOCATION_FAILURE).

This occurs at a fixed point in my pipeline when I use very large images. If I downscale the image slightly, it passes through this very memory-intensive part of the pipeline.

I have access to an Nvidia card with 4 GB that fails at a certain point, and I also tried an AMD GPU with 2 GB, which fails earlier.

According to this thread, there is no need to know the current allocation because of swapping with VRAM, but it seems that my pipeline exhausts the memory of my device.

So here are my questions:

1) Is there any setting on my computer or in my pipeline that would allow more VRAM to be used?

2) Is it okay to use CL_DEVICE_GLOBAL_MEM_SIZE as the reference for the maximum size to allocate, or do I need to use CL_DEVICE_GLOBAL_MEM_SIZE - (local memory + private), or something like that?

According to my own memory profiler, 92% of CL_DEVICE_GLOBAL_MEM_SIZE is allocated at the time of the crash. After resizing the image slightly, the pipeline reports 89% used and it passes, so I assume my large image is right on the edge.

edited Mar 27 at 18:55

asked Mar 27 at 17:14

Vuwox


You can use host memory instead, i.e. clCreateBuffer( ... | CL_MEM_USE_HOST_PTR , ..., size, host_ptr, ...);

– Victor Gubin
Mar 27 at 17:31

@VictorGubin To use host memory I need to provide a pointer on the host. But what should I do if I want to allocate only on the device, because I never need the data on the host (i.e. host_ptr == NULL)? I would like to avoid transfers as much as possible.

– Vuwox
Mar 27 at 17:49

A pointer is a memory address, i.e. an integer the size of size_t (unsigned long or unsigned long long depending on the CPU architecture). When you create a cl buffer using a host pointer, the GPU uses the existing memory (i.e. RAM) at that address, and nothing is copied from RAM to VRAM. Otherwise clCreateBuffer allocates a memory block in VRAM and returns a handle to the allocated block.

– Victor Gubin
Mar 28 at 11:38

Yes, I understand all of this. But to allocate with CL_MEM_USE_HOST_PTR, I first need to allocate something on the CPU to point to, and when using the buffer I have to wait for that memory to be transferred to the GPU device. Allocating without the flag and with a NULL pointer allocates directly on the device without any transfer, which is very fast, especially when the memory needs to reside on the device and is never read on the host. I am just wondering whether there is a way to tell when to stop allocating like that and maybe switch to CL_MEM_USE_HOST_PTR.

– Vuwox
Mar 28 at 13:17

c++ opencl


1 Answer

Some of your device's VRAM may be used for the pixel buffer, constant memory, or other purposes, so not all of CL_DEVICE_GLOBAL_MEM_SIZE is necessarily available to your buffers. For AMD cards, you can set the environment variables GPU_MAX_HEAP_SIZE and GPU_MAX_ALLOC_PERCENT to use a larger part of the VRAM, though this may have unintended side effects. Both are expressed as percentages of the physical memory available on the card.

Additionally, there is a limit on the size of each individual memory allocation. You can get the maximum size of a single allocation by querying CL_DEVICE_MAX_MEM_ALLOC_SIZE, which may be less than CL_DEVICE_GLOBAL_MEM_SIZE. For AMD cards, this size can be controlled with GPU_SINGLE_ALLOC_PERCENT.

This requires no changes to your code; simply set the variables in the environment before you launch your executable:

# export so the variables are visible to the child process
export GPU_MAX_ALLOC_PERCENT=100
export GPU_MAX_HEAP_SIZE=100
export GPU_SINGLE_ALLOC_PERCENT=100
./your_program
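
On the application side, the two limits mentioned above can be combined into a simple budget check before each allocation. The following is only a minimal sketch under stated assumptions, not an established OpenCL pattern: DeviceBudget, Placement and usablePercent are hypothetical names, and the 85% margin used in the comment is an arbitrary choice below the 89% that reportedly still worked in the question. The two size fields are meant to be filled once from CL_DEVICE_GLOBAL_MEM_SIZE and CL_DEVICE_MAX_MEM_ALLOC_SIZE (for example via clGetDeviceInfo), and the allocated field is assumed to be maintained by your own memory profiler.

```cpp
#include <cstdint>

// Hypothetical helper (not part of the OpenCL API): decides where the next
// buffer should live, given limits queried once from the device.
enum class Placement {
    Device,  // plain device allocation (host_ptr == NULL)
    Host,    // fall back to host memory, e.g. CL_MEM_USE_HOST_PTR
    Split    // larger than a single allocation may be: tile or downscale
};

struct DeviceBudget {
    std::uint64_t globalMemSize;  // value of CL_DEVICE_GLOBAL_MEM_SIZE
    std::uint64_t maxAllocSize;   // value of CL_DEVICE_MAX_MEM_ALLOC_SIZE
    std::uint64_t usablePercent;  // assumed safety margin, e.g. 85
    std::uint64_t allocated;      // bytes currently held in device buffers

    constexpr Placement choose(std::uint64_t bytes) const {
        // A single buffer can never exceed the per-allocation limit.
        if (bytes > maxAllocSize) return Placement::Split;
        // Integer arithmetic; divide first so huge sizes cannot overflow.
        const std::uint64_t limit = globalMemSize / 100 * usablePercent;
        // Over budget: keep the data in host memory instead.
        if (allocated > limit || bytes > limit - allocated) return Placement::Host;
        return Placement::Device;
    }
};
```

With a 4 GiB card, a 1 GiB per-allocation limit and 3 GiB already allocated, this sketch would place a 256 MiB request on the device, send a 512 MiB request to host memory, and ask for a 2 GiB request to be split. The margin is a tuning knob; the driver remains the final authority, so the error code returned by the allocation still has to be checked.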

answered Mar 31 at 16:56

Jan-Gerd
