0
typedef struct node node;
struct node
{
  unsigned long long  distance;
  int edges;
  int* edgesL;
  node **next;
  node *qnext;
  int limit;
};

So guys I have this struct over here, Its a node of a graph and I am trying to do things in just C for now. So what I am trying to do is create a graph using 100 000 nodes problem is, it simply does not work when there are that many nodes. It works fine for something between 10^4 and 10^5 nodes but when it gets close to 10^5 the program just crashes. Well i thought the obvious of course "Am I using too much memory?" so I did some work on reducing waste of memory but in the end I realized that by simply creating those 10^5 pointers the program would crash anyway. What is hard to explain is that for example, if I want to access 99194 node of 100000 graph it could work or it could not, when rolling through all the inputs, checking one of these inputs the program will crash (always a high number near 10^5 but no at the first one that appears)

Here is the function that creates the nodes: (N is a node that it is connected to all other nodes)

  typedef node *position;

        void create_node(position N, int i)
      {
        N->next[i] = (node*) malloc(sizeof(node));
        N->next[i]->edgesL = (int*) malloc(4*sizeof(int));///This contain information of the distance of the edges
        N->next[i]->next = (position*) malloc(4*sizeof(position));///Vesctor of pointers, it indicates to which nodes this node is connected
        N->next[i]->limit = 10;
        N->next[i]->edges = 0;
        N->next[i]->distance = MAX;
      }

And here is part of the main that i think is important:(limit break just reallocates memory to increase the number of edges, it is rarely called, and the program crashes even with small numbers of edges per node, it has nothing to do with the crashes)

       N->next = (position*) malloc((n+1)*sizeof(position));///Here I create this node that points to all the other nodes.

    for (i = 0; i <= n; i++)
    {
       N->next[i] = NULL;
    }
    printf("oi");
    for(i = 1; i <= n-1; i++)
    {
        scanf("%lu %llu",&v2,&weight);
        getchar();
        if (N->next[v2] == NULL)create_node(N,v2);
        if (N->next[i]  == NULL)create_node(N,i);

            ne1 = N->next[i]->edges;
            ne2 = N->next[v2]->edges;
            l1 = N->next[i]->limit;
            l2 = N->next[v2]->limit;

            printf("limit = %lu\n",l2);
            printf("v2 n edges = %d\n",ne2);

            limit_break(N->next[i]);
            limit_break(N->next[v2]);

            N->next[i]->edgesL[ne1] = weight;
            printf("a\n");
            N->next[v2]->edgesL[ne2] = weight;
            printf("b\n");
            N->next[i]->edges++;
            printf("c\n");
            N->next[v2]->edges++;
            printf("d\n");
            N->next[i]->next[ne1] = N->next[v2];
            printf("e\n");
            N->next[v2]->next[ne2] = N->next[i];

       }

Those printfs is me trying to figure out what is wrong, they only confused me more because sometimes it crashes after an a, or b or c or d, depending in the test case I use.

So the question is: why with "small" number of nodes ( < 10^4 ) the program runs just fine and why with number larger than this many nodes ( > 10^4) the behavior is so random? Hope I am not being too much of a idiot and tell me if you guys need more information. I thank you in advance, if there is any doubt ask me if I can clear it...

5
  • 2
    For starters, always check the return value of malloc. Especially in this case where there is possibility of allocation failures. Commented Jul 20, 2015 at 6:06
  • Your create_node function assumes that N already points to a valid node. Does it? You have not shown where N itself is initialised. Commented Jul 20, 2015 at 6:11
  • Estimate size of a single node, multiply by expected number of nodes, compare with available memory, check with memory reporting tools (Task manager on Windows) Commented Jul 20, 2015 at 6:29
  • The smaller the amount of nodes the smaller the memory is that you are taking up on your machine. If you are using 10^4 nodes fine but 10^5 Nodes does not return as expected then this would imply that you are running out of memory - both for creating the nodes and memory held for static variables. Also as @AlanAu mentioned, it would be helpful instead of making printf statements to check the data -> you might want to see if malloc was successful in creating your nodes (i.e. if (!N->next[i]) { // memory check here} Commented Jul 20, 2015 at 6:42
  • Well, I did put those checks, but they dont show me that malloc failed...but the code is failing after some number of inputs in acessing N->next[v2]->edges, like it does not exist but I put an if(N->next[v2] == NULL) printf ("end of mem") and nothing was printed before. Commented Jul 20, 2015 at 7:51

1 Answer 1

1

You need to know that malloc() uses some memory for its internal bookkeeping, up to several tenth of words per individual call to malloc(). If you allocate many small objects, this overhead can have a dramatic cost. The other bad effect of allocating many small objects is memory fragmentation.

A lot of memory can be saved by packing your graph structure into continuous arrays and use integer indices instead of pointers. You can find some examples of such data structures used for storing sparse matrices (see for instance the Compressed Row Storage format in https://en.wikipedia.org/wiki/Sparse_matrix). If you need to be able to dynamically update your graph, then its more difficult, but it is still doable.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.