ctypes allocate more memory to the stack

I have a C DLL that I call from Python. The output from the DLL is quite large, and I suspect this causes the following error:

OSError: exception: stack overflow

I'm pretty sure the problem is the size of the output (roughly 4x25x720 doubles). Reducing the size of the output (which I do not want to do) makes the error go away.

In C# I can get around this problem by allocating more memory to the calling thread, i.e.,

thread = new Thread(() => calculate(ptr_in, ptr_out), 20000000); 

Is it possible to do something similar with ctypes?

This is NOT the problem posted here Ctypes: OSError: exception: stack overflow.


EDIT

Reflecting on the problem, I don't think the cause is the size of the output, but rather the stack space needed inside the DLL itself, i.e. the local variable c_out inner_out defined in ctypes_test.c. Regardless, the question is still the same.


In C I define a test dll dll_ctypes_test

ctypes_testT.h

#pragma once
#define N_ELEMENTS 1000
#define N_ARRAYS 50

typedef struct
{
    double var_0;
    double var_1;
    double var_2;
    double var_3;
    double var_4;
    double var_5;
    double var_6;
    double var_7;
    double var_8;
    double var_9;
} element;

typedef struct
{
    int n_elements;
    element elements[N_ELEMENTS];
} arr;

typedef struct
{
    int n_arrays;
    arr arrays[N_ARRAYS];
} c_out;

ctypes_test.c

#include "ctypes_testT.h"

__declspec(dllexport) void _stdcall dll_ctypes_test(double in, c_out *out)
{
    c_out inner_out;

    //Some calculations on inner_out
    //Wrap values of inner arr to out
}

And the Python code

import ctypes

N_ELEMENTS = 1000
N_ARRAYS = 50

class element(ctypes.Structure):
    _fields_ = [('var_0', ctypes.c_double),
                ('var_1', ctypes.c_double),
                ('var_2', ctypes.c_double),
                ('var_3', ctypes.c_double),
                ('var_4', ctypes.c_double),
                ('var_5', ctypes.c_double),
                ('var_6', ctypes.c_double),
                ('var_7', ctypes.c_double),
                ('var_8', ctypes.c_double),
                ('var_9', ctypes.c_double)] 

class arr(ctypes.Structure):
    _fields_ = [('n_elements', ctypes.c_int),
                ('elements', element * N_ELEMENTS)] 

class c_out(ctypes.Structure):
    _fields_ = [('n_arrays', ctypes.c_int),
                ('arrays', arr * N_ARRAYS)]     

dll = ctypes.WinDLL(r'C:\repos\ctypes_test\x64\Debug\ctypes_test.dll')

dll.dll_ctypes_test.argtypes = [ctypes.c_double, ctypes.POINTER(c_out)]  
dll.dll_ctypes_test.restype = None

dll.dll_ctypes_test(5, ctypes.byref(c_out()))

Calling the Python code produces

Traceback (most recent call last):

  File "<ipython-input-15-7c8b287888d0>", line 1, in <module>
   dll.dll_ctypes_test(5, c_out())

OSError: exception: access violation writing 0x00000062BA400000

If I change N_ARRAYS from 50 to, say, 10, the error goes away.



Solution 1:[1]

Listing [Python.Docs]: ctypes - A foreign function library for Python.

I must say that I wasn't able to reproduce the behavior (even without fixing the errors below), using either "regular" Python or IPython. Maybe there's more to the dll_ctypes_test implementation than meets the eye.

Current issues:

  1. dll_ctypes_test expects a c_out pointer, but you're passing a plain c_out instance. You should use ctypes.byref (or ctypes.pointer). I don't know why ctypes doesn't complain about it
  2. C and Python structure definitions don't match. One example is arr, which contains an element array in C, and an element pointer array (ctypes.POINTER) in Python. This is Undefined Behavior; the two must be in sync
  3. You marked the exported function as __stdcall, but you're loading the .dll with CDLL. You should use WinDLL. Since you're on 64-bit (based on your paths), though, this doesn't make much of a difference
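
On the first point, one possible reason ctypes stays silent: according to the ctypes documentation, when argtypes declares a parameter as POINTER(T), an instance of T may be passed directly and ctypes applies byref() itself. Below is a minimal sketch that checks this through from_param, without loading any DLL. The Pair structure is hypothetical and only stands in for c_out.

```python
import ctypes as ct


class Pair(ct.Structure):
    # Hypothetical stand-in for the much larger c_out structure.
    _fields_ = [("a", ct.c_int), ("b", ct.c_double)]


PairPtr = ct.POINTER(Pair)

# from_param is the hook ctypes calls for every argument listed in argtypes.
# A Pair instance is accepted where POINTER(Pair) is declared.
accepted = PairPtr.from_param(Pair(1, 2.0))

# An unrelated Python object (here a float) is rejected with a TypeError.
try:
    PairPtr.from_param(1.0)
    rejected = False
except TypeError:
    rejected = True

print("instance accepted:", accepted is not None)
print("wrong type rejected:", rejected)
```

This only applies when argtypes is set. Without argtypes, ctypes has no declared type to convert to, so passing a structure instance is a different situation.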

Below is an example (a modified version of your code).

dll00.h:

#pragma once

#if defined(_WIN32)
#  if defined DLL00_EXPORTS
#    define DLL00_EXPORT_API __declspec(dllexport)
#  else
#    define DLL00_EXPORT_API __declspec(dllimport)
#  endif
#else
#  define DLL00_EXPORT_API
#endif

#define ELEMENT_COUNT 1000
#define ARRAY_COUNT 50


typedef struct {
    double var0, var1, var2, var3, var4,
         var5, var6, var7, var8, var9;
} Element;


typedef struct {
    int size;
    Element data[ELEMENT_COUNT];
} Array1D;


typedef struct {
    int size;
    Array1D data[ARRAY_COUNT];
} Array2D;


#if defined(__cplusplus)
extern "C" {
#endif

DLL00_EXPORT_API void __stdcall dll00Func00(double in, Array2D *pOut);

#if defined(__cplusplus)
}
#endif

dll00.c:

#define DLL00_EXPORTS
#include "dll00.h"

#include <stdio.h>


void dll00Func00(double in, Array2D *pOut) {
    if (pOut == NULL) {
        printf("From C - NULL array passed\n");
        return;
    }
    Array2D arr2d;
    printf("From C - Outer array size: %d\n", pOut->size);
}

code00.py:

#!/usr/bin/env python

import ctypes as ct
import sys


ELEMENT_COUNT = 1000
ARRAY_COUNT = 50


class Element(ct.Structure):
    _fields_ = list(("var{0:d}".format(i), ct.c_double) for i in range(10))


class Array1D(ct.Structure):
    _fields_ = [
        ("size", ct.c_int),
        ("data", Element * ELEMENT_COUNT),
    ]


class Array2D(ct.Structure):
    _fields_ = [
        ("size", ct.c_int),
        ("data", Array1D * ARRAY_COUNT),
    ]


DLL_NAME = "./dll00.{:s}".format("dll" if sys.platform[:3].lower() == "win" else "so")


def main(*argv):
    dll00 = ct.WinDLL(DLL_NAME)
    dll00Func00 = dll00.dll00Func00
    dll00Func00.argtypes = (ct.c_double, ct.POINTER(Array2D))
    #dll00Func00.argtypes = (ct.c_double, Array2D)  # !!! Defining the 2nd argument without POINTER, triggers the error !!!
    

    mat = Array2D()
    mat.size = 7
    #print(ct.sizeof(Element), Element._fields_, dir(Element))
    print("Array sizeof: {0:d} (0x{1:08X})".format(ct.sizeof(mat), ct.sizeof(mat)))

    dll00Func00(5, ct.byref(mat))
    #dll00Func00(5, mat)


if __name__ == "__main__":
    print("Python {:s} {:03d}bit on {:s}\n".format(" ".join(elem.strip() for elem in sys.version.split("\n")),
                                                   64 if sys.maxsize > 0x100000000 else 32, sys.platform))
    rc = main(*sys.argv[1:])
    print("\nDone.")
    sys.exit(rc)

Output:

e:\Work\Dev\StackOverflow\q060297181>sopr.bat
### Set shorter prompt to better fit when pasted in StackOverflow (or other) pages ###

[prompt]> "c:\Install\pc032\Microsoft\VisualStudioCommunity\2017\VC\Auxiliary\Build\vcvarsall.bat" x64 > nul

[prompt]> dir /b
code00.py
dll00.h
dll00.c

[prompt]> cl /nologo /MD /DDLL dll00.c  /link /NOLOGO /DLL /OUT:dll00.dll
dll00.c
   Creating library dll00.lib and object dll00.exp

[prompt]> dir /b
code00.py
dll00.h
dll00.c
dll00.dll
dll00.exp
dll00.lib
dll00.obj

[prompt]>
[prompt]> "e:\Work\Dev\VEnvs\py_pc064_03.07_test0\Scripts\python.exe" code00.py
Python 3.7.6 (tags/v3.7.6:43364a7ae0, Dec 19 2019, 00:42:30) [MSC v.1916 64 bit (AMD64)] 064bit on win32

Array sizeof: 4000408 (0x003D0A98)
From C - Outer array size: 7

Done.

When calling a function or method (there are some exceptions, but those are irrelevant here), the stack (a special memory area) is used for storage. Common stuff being stored:

  • Arguments (and return value), which represent data exchanged between the caller and the callee
  • Local variables (in the callee), which are neither static, nor explicitly allocated on the heap (via malloc, new, ...)
  • Other data (invisible to the programmer, like Instruction Pointer, ...)

As expected, the stack is limited, so it can only store a maximum amount of data. When the amount of data that needs to be stored exceeds that maximum, Stack Overflow occurs (the most common scenario is deep recursion, where too many nested calls each need to store their own data).

Maximum stack size is determined by each application's build options, and defaults vary depending on compiler, OS, etc. From [MS.Docs]: /STACK (Stack Allocations) (emphasis is mine):

The reserve value specifies the total stack allocation in virtual memory. For ARM, x86 and x64 machines, the default stack size is 1 MB.

Same info can be found at [MS.Docs]: /F (Set Stack Size).

As seen, Array2D takes nearly 4 MiB (so it wouldn't fit if it were ever placed on the stack).
As I specified in the comment (from code00.py), defining dll00Func00's 2nd argtype without ct.POINTER triggers the error. Maybe there's such a typo in the code you're actually running?
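
The size comparison can also be checked from Python alone, without the DLL. This is a small sketch that rebuilds the structures from code00.py and compares ctypes.sizeof against the 1 MB default quoted above. DEFAULT_STACK_BYTES is a name introduced here for illustration, and the exact byte count depends on the platform's alignment rules (on a 64-bit build it should match the 4000408 printed in the output above).

```python
import ctypes as ct

ELEMENT_COUNT = 1000
ARRAY_COUNT = 50
DEFAULT_STACK_BYTES = 1024 * 1024  # 1 MB default quoted from the /STACK docs


class Element(ct.Structure):
    _fields_ = [("var{:d}".format(i), ct.c_double) for i in range(10)]


class Array1D(ct.Structure):
    _fields_ = [("size", ct.c_int), ("data", Element * ELEMENT_COUNT)]


class Array2D(ct.Structure):
    _fields_ = [("size", ct.c_int), ("data", Array1D * ARRAY_COUNT)]


size = ct.sizeof(Array2D)
print("Element: {:d} bytes".format(ct.sizeof(Element)))
print("Array2D: {:d} bytes ({:.2f} MiB)".format(size, size / 2**20))
print("Fits in a 1 MB stack:", size < DEFAULT_STACK_BYTES)
```

With N_ARRAYS reduced to 10, as in the question, the same calculation gives roughly 0.8 MB, which is consistent with the error disappearing at that size.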

Anyway, some general guidelines to avoid this error:

  • Avoid passing (as arguments / return type) large amounts of data by value (the last 2 subpoints also apply to locally defined variables):
    • Use pointers (or references in C++)
    • Allocate them dynamically (on the heap)
    • Make them static (less desirable as static data segment is also limited)
  • Make sure recursion doesn't go too deep (where applicable)
  • Increase ("manually") the stack size when building an application

As a side remark, I don't think it's possible to increase (modify) the current process stack size in C under Win (Visual C). It is possible however in C# and also on Nix (setrlimit).
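
For the original question (something similar to the C# Thread constructor that takes a stack size), one option not covered above is Python's threading.stack_size(), which sets the stack size for threads created afterwards. It does not enlarge the current thread's stack; the call has to be moved to a new thread. Below is a minimal sketch, assuming the DLL call can be wrapped in a Python callable. The helper name call_with_big_stack and the 16 MiB figure are illustrative choices, and the lambda stands in for the real dll.dll_ctypes_test call.

```python
import threading


def call_with_big_stack(func, *args, stack_bytes=16 * 1024 * 1024):
    """Run func(*args) on a new thread created with a stack of stack_bytes."""
    outcome = {}

    def worker():
        try:
            outcome["value"] = func(*args)
        except BaseException as exc:  # hand any failure back to the caller
            outcome["error"] = exc

    previous = threading.stack_size()   # 0 means "platform default"
    threading.stack_size(stack_bytes)   # applies to threads created from now on
    try:
        thread = threading.Thread(target=worker)
        thread.start()
    finally:
        threading.stack_size(previous)  # restore the setting for later threads
    thread.join()

    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]


# Stand-in for: call_with_big_stack(dll.dll_ctypes_test, 5, ctypes.byref(out))
result = call_with_big_stack(lambda x: x * 2, 21)
print(result)
```

The stack size has to be large enough for the biggest local variable in the DLL plus everything else on the call path, so the value needs tuning for the real function. Allocating inner_out on the heap inside the DLL, as suggested in the guidelines above, avoids the issue at its source.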

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1