
Noob pointer question

Started by April 24, 2016 06:01 PM
10 comments, last by owiley 8 years, 7 months ago

Hi guys,

I've been working in JavaScript for a very long time, and I'm now back to C++ thanks to the excitement around Vulkan. Of course, my hazy memory of pointers has already gotten me into lots of pointer errors. So here's a quick one...

Why is this working:


int* a = new int(5);
int* pA = a;
delete pA;

But this one isn't:


int a = 5;
int* pA = &a;
delete pA;

I may be wrong, but is it because I'm trying to delete a pointer (is it even a pointer I'm deleting?) allocated in reserved memory or something?

Thanks!

The short answer is: Only 'delete' what you 'new'.

The longer answer is:

You have to think about pointers separately from the memory they point to.

'delete' does not delete the pointer, it's deleting the memory the pointer points to. This also means that even if you have two pointers pointing to the same memory, you're only allowed to delete once.

Your second case allocates 'a' on the stack. pA points to that stack memory. You never need to 'delete' something on the stack (and you must not), because stack memory gets cleaned up automatically when the enclosing scope ends, for example when the function returns.
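To make the two cases concrete, here is a minimal sketch. The function names readThenFree and readFromStack are made up for illustration; the bodies mirror the two snippets from the question.

```cpp
#include <cassert>

// Heap case: one allocation, two pointers, exactly one delete.
// (Hypothetical helper name, not from the thread.)
int readThenFree() {
    int* a = new int(5);  // one heap allocation
    int* pA = a;          // second pointer to the same int; nothing new is allocated
    int value = *pA;      // read while the int is still alive
    delete pA;            // frees the int; both a and pA now dangle
    // delete a;          // would be a double delete: undefined behavior
    return value;
}

// Stack case: pointing at a local is fine, deleting it is not.
int readFromStack() {
    int a = 5;            // automatic (stack) storage
    int* pA = &a;         // legal: pA holds the address of a
    // delete pA;         // undefined behavior: this memory never came from new
    return *pA;
}                         // a is released automatically here
```

Both functions return 5. The commented-out lines are the ones that break the "only delete what you new" rule.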
Yes. It fails because there was not an allocation made for the address of a in the heap.


There are two general memory allocation strategies for objects depending on the lifetime of the objects: Stack and heap.

Stack memory is super quick as it only increments a stack pointer register, but binds the lifetime of the object to the scope it was created in; FILO ordering.

Heap memory is slower in that a registry of allocations must be traversed to find space for the object. In contrast to stack memory, heap objects can be deallocated at any time; their lifetime is not bound to the scope they were created in. A contrived example would be a particle explosion effect, where the particles fizz out at random times.
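A rough sketch of that particle idea, under my own assumptions: the Particle struct, its framesLeft field, and both function names are invented for this example. Each particle is allocated with new and freed whenever its own lifetime runs out, independent of any scope.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical particle: only tracks how many frames it has left to live.
struct Particle { int framesLeft; };

// Ages every particle by one frame and frees the ones that expired.
// Returns how many particles are still alive.
std::size_t stepParticles(std::vector<Particle*>& particles) {
    for (std::size_t i = 0; i < particles.size(); ) {
        if (--particles[i]->framesLeft <= 0) {
            Particle* dead = particles[i];
            particles[i] = particles.back();  // fill the hole with the last pointer
            particles.pop_back();
            delete dead;                      // this particle dies now, not at scope exit
        } else {
            ++i;
        }
    }
    return particles.size();
}

// Demo: three particles with lifetimes of 1, 3 and 2 frames.
std::size_t aliveAfter(int frames) {
    std::vector<Particle*> ps{new Particle{1}, new Particle{3}, new Particle{2}};
    std::size_t alive = ps.size();
    for (int f = 0; f < frames; ++f) alive = stepParticles(ps);
    for (Particle* p : ps) delete p;          // free any survivors
    return alive;
}
```

Stack allocation could not express this, because the particles do not die in the reverse order of their creation.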

There are other allocation strategies as well, such as pool allocations.


Whenever you use a reference or pointer, you must be mindful of the lifetime of the object it refers to. For instance, while your first example is valid, you must be careful not to use a after deleting pA, as the integer it once pointed to no longer exists.



// simple demo of scopes and object lifetimes

struct Foobar { int x; };

int main() {
    Foobar f1;
    Foobar* pf1;
    Foobar* pf2;
    Foobar* pf3;

    /* 1: */ {
        Foobar f2;
        Foobar f3;

        /* 2: */ {
            Foobar f4;

            pf1 = &f4;
            pf2 = new Foobar;
            pf3 = pf1;
        } // f4 destroyed, pf1 and pf3 are invalid!

        // pf3->x = 5;  // Undefined behavior-- you’d be lucky to crash here
    } // f3, f2 destroyed

    delete pf2;  // foobar located at pf2 destroyed
} // f1 destroyed

There are some things you will have to look up and take your time to understand well. I'll try to give a short version in this post. It's by no means complete, but I hope it's enough to send you off in the right direction.

In C++, new and delete are ways of allocating and deallocating memory. It's true that new returns a pointer and that delete takes a pointer as input, but there is no such thing as deleting a pointer.

What new does is basically request some memory (the size of the given type) and return something that lets you access that part of memory: a pointer. A pointer is nothing more than a value indicating a location in memory. If you view your memory as a giant array of bytes, a pointer would just be an index. Once this memory is allocated, it stays allocated until you instruct the computer to deallocate it. Unlike in JavaScript, you have to manage this memory manually. You deallocate memory by calling delete and passing it the pointer you got from new.

So why doesn't it work in your second case? Well, there are two kinds of memory allocation here: automatic (often loosely called static) and dynamic. The stack is used for automatic allocation and the heap is used for dynamic allocation.

The stack is basically some memory allocated beforehand, which requires no work on your side, and it's used for things like local variables and function arguments. It's called the stack because that's literally how it works: things are put on top in some order and then taken off in the reverse order. In code that means that at the beginning of a scope (e.g. a function) the necessary space is reserved, and at the end of the scope (e.g. exiting a function) that reservation is lifted. Everything about this is automatic, but the stack has a limited size and one other limitation: what if you want to retain the memory outside of the scope? The stack can't manage that automatically. It only knows how to put stuff on and then take it off again, so it has no way to know when that part of memory will be free for use again.

This is where heap memory comes in. It's memory that you manage manually, and it's limited only by the memory available to your program, which is typically far more than the stack. This manual management is done with new and delete.

In your second example, you are trying to delete memory from the stack, but delete is meant for heap allocations.
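As a small illustration of "retaining memory outside of the scope", here is a sketch with made-up helper names (makeOnHeap, makeOnStack, useBoth). The heap int survives the function that created it, so the caller becomes responsible for deleting it. The stack int does not survive, so it is returned by copy.

```cpp
#include <cassert>

// Heap: the int outlives this function; whoever receives the pointer must delete it.
int* makeOnHeap(int value) {
    return new int(value);
}

// Stack: 'local' is gone when the function returns, so returning &local
// would hand back a dangling pointer. Returning a copy is safe.
int makeOnStack(int value) {
    int local = value;
    return local;
}

int useBoth() {
    int* p = makeOnHeap(40);        // still valid here, long after makeOnHeap returned
    int sum = *p + makeOnStack(2);
    delete p;                       // manual cleanup: matches the new in makeOnHeap
    return sum;
}
```

Calling useBoth() yields 42 and leaves nothing allocated.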

Oh gosh, thanks guys!

Just want to re-confirm to myself (please correct me if I conclude this wrong):

If I "delete" pA, it's as if I'm "deleting" a, cause I'm deleting what they are pointing at. This means deleting the memory they are pointing to, NOT the pointer itself (no such thing!).

Since the 2nd case's pA is pointing to stack memory, deleting it will definitely not work.

The 1st case works because I'm allocating on the heap (with new), so I'm able to delete manually, and of course there's no error in doing so. I also can't use a anymore because I've deleted the memory that both pA and a pointed to.

@fastcall22

Yeah, I attempted a pool allocation before, but that was already 2 years back, when I had to leave the country for a while. Now everything has gone quite blank. :(

Okay, now I remember this stack and heap thing. I will re-read these two. :)

Ah, I forgot, just one more. Based on the 1st case, do 'a' and 'pA' themselves live in stack memory, still storing their values after deletion? I mean, after deleting the memory they are pointing to, they themselves are still there... pointing to zero or whatever value, aren't they?



The pointers themselves are on the stack. They're basically just integers (of whatever size the system uses for pointers) that hold the address. Once you call delete, the memory they're pointing to gets deallocated, but each pointer stays on the stack with whatever value it had before. It's up to you to set them to NULL (nullptr in modern C++).
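A minimal sketch of that last point. The helper names destroy and demoReset are hypothetical; the only language facts relied on are that delete leaves the pointer variable alone and that deleting a null pointer does nothing.

```cpp
#include <cassert>

// Frees the int that p points to, then clears p itself.
void destroy(int*& p) {
    delete p;      // releases the heap memory; delete does not modify p
    p = nullptr;   // our own choice: mark p as "points at nothing"
}

bool demoReset() {
    int* a = new int(5);
    int* pA = a;
    destroy(pA);   // pA is now nullptr
    // 'a' was not reset by any of this: it still holds the old address
    // and must not be dereferenced or deleted.
    a = nullptr;   // resetting it is a separate, manual step
    delete pA;     // deleting nullptr is a harmless no-op
    return pA == nullptr && a == nullptr;
}
```

Note that resetting one pointer does nothing for other pointers holding the same address, which is why 'a' needs its own assignment here.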

Based on the 1st case, do 'a' and 'pA' themselves live in stack memory, still storing their values after deletion? I mean, after deleting the memory they are pointing to, they themselves are still there... pointing to zero or whatever value, aren't they?


Yes, a variable's value is stored somewhere as well (for a pointer, that value is the integer representing the memory address).

Variable storage depends on how it's declared:

Local variables and function parameters: Stored on the stack and/or in registers. The compiler decides based on how you use the variable and what optimizations it can do.

struct/class member fields: Stored as part of the struct/class (i.e. on the stack if it's a local variable or on the heap if you 'new' it).

Global/static variables: Stored in a non-stack, non-heap region of memory that is cleaned up when the program exits. You aren't allowed to 'delete' memory in this region.

Thread static variables: Stored in thread-local-storage. You aren't allowed to 'delete' memory in this region, either.
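The categories above, side by side in one sketch. All names here (g_counter, t_counter, Pair, sumAllStorage) are invented for the example. Only the object created with new needs a delete.

```cpp
#include <cassert>

int g_counter = 1;                  // global: lives until the program exits
thread_local int t_counter = 2;     // thread-local: one copy per thread

struct Pair { int first; int second; };  // members live wherever the Pair lives

int sumAllStorage() {
    int local = 3;                  // local variable: stack or register
    Pair onStack{4, 5};             // members stored on the stack
    Pair* onHeap = new Pair{6, 7};  // members stored on the heap; the pointer itself is a local

    int sum = g_counter + t_counter + local
            + onStack.first + onStack.second
            + onHeap->first + onHeap->second;

    delete onHeap;                  // the only object here that came from new
    // delete &g_counter;           // not allowed: never allocated with new
    // delete &onStack;             // not allowed either
    return sum;
}
```

The function adds 1 through 7, so it returns 28.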

An analogy:

The stack is a literal stack of small boxes on your desk. Each box represents a function, with the box on the bottom being main(). Each box contains some pieces of paper that represent the local variables of the function. The values of those variables are written on the papers. When main() calls a function the box for that function gets placed (stacked) on top of main() and its variables are placed in it. When that function exits its box is removed and its variables go along with it, so that the box on top of the stack is always the currently executing function. This is effectively how the stack works.

Now the heap is a filing cabinet in the desk drawer. There are no boxes here. Instead there is a long, ordered stack of re-usable papers. Each one is numbered in order. You cannot remove these papers, but you can look them up by number and read or write any of them. In order to avoid accidentally writing over something that you're storing in there, you have a set of paper clips. When you want to use a paper in the drawer, you put a paper clip on it and then write down its number on a note and put that note in the top box. This is the allocation process. In fact, you have a secretary called 'new' that does this for you. You tell the secretary how many pages you need and the secretary finds an appropriate set of pages, paperclips them and tells you the number of the first one. The paper that you're writing the number on is a pointer. The number itself is an address. When you're done using heap memory you tell the number you wrote down to the secretary (using 'delete') and they remove the clip from that allocation so that it can be used again later.

The important thing to remember is that the pointer is a paper in the box and the address is the number that refers to a page in the drawer.

Where it gets a little confusing is that memory isn't really separated this way. The stack is actually just a data structure that's extending into memory. Pointers are just a special numeric type. It's perfectly legal to have a pointer to a stack variable. It's perfectly legal to have a pointer with any value you want. It's just a piece of paper in the box with a number written on it.
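In code, the "paper with a number on it" idea might look like the sketch below. The function name is hypothetical. The pointer p is an ordinary local variable with an address of its own, so another pointer can point at it.

```cpp
#include <cassert>

// Hypothetical demo: p is an ordinary local variable whose value is an address.
bool pointerIsJustAVariable() {
    int x = 5;        // a paper in the box (stack)
    int* p = &x;      // another paper in the box, holding x's address
    int** pp = &p;    // p has an address of its own, so it can be pointed at too
    **pp = 7;         // follow pp to p, then p to x, and write there
    return x == 7 && *pp == &x
        && static_cast<void*>(pp) != static_cast<void*>(&x);
}
```

Nothing here touches the heap, and nothing needs a delete.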

Consider the following:


int* allocation = new int[3]; //reserve enough memory for 3 ints and store the address in 'allocation'
int* ptr = allocation; //copy the address in 'allocation' into 'ptr'
*ptr = 1; //set the value of the memory at 'ptr' to 1
ptr++; //increase the value of ptr - note that this is making ptr point to a different address, not changing the '1' that we just wrote
*ptr = 2; //set the value of the memory at 'ptr' to 2
ptr++; //increase the value of ptr
*ptr = 3; //set the value of the memory at 'ptr' to 3
delete[] allocation; //release the array pointed to by 'allocation' (memory from new[] must be released with delete[])
//note that if we tried to delete[] ptr there would be an error because it no longer points to the base of the allocation

If you freeze the program at this point then the memory there should still contain 1,2,3 in order (because nothing has written over it yet).

'allocation' will still point to the '1' and 'ptr' will still point to the '3'.

The memory is no longer reserved. If the program allocates more memory later there's a decent chance that it will re-use this area of memory and change the values there. Since the reservation is gone the memory in question is 'fair game' to be reallocated.

Since the memory is no longer reserved, calling delete on 'allocation' again is undefined behavior. It will usually cause an error because there's no reservation recorded for that address. However, it's possible to get some 'fun' if something else asks for memory and a new allocation starts at that same address. In that case an extra delete will destroy the new allocation, so don't rely on getting explicit errors (you almost always will in practice, though).

It is legal to delete zero, or nullptr, even though they will never be the base of an allocation. Relying on this for general use is a bad plan. There are some cases where it's useful, but avoid the bad habit of always setting pointers to null after releasing their allocations. If you can't resist setting the pointer then set them to 0x0BAD0BAD or something similar that will provide a recognizable error if you try to release them again. Error messages are good. They tell you something is wrong, and that's something that you want to know about, so in general try to avoid behaviors that obfuscate errors.
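One detail worth spelling out for array allocations like the one above: memory obtained with new[] has to be released with delete[], not plain delete. The sketch below shows that pairing, plus std::vector as an alternative that releases its memory on its own. The function names are made up for illustration.

```cpp
#include <cassert>
#include <vector>

// Raw array: new[] paired with delete[] on the original base pointer.
int sumWithRawArray() {
    int* allocation = new int[3];
    int* ptr = allocation;      // walking pointer; 'allocation' keeps the base address
    *ptr++ = 1;
    *ptr++ = 2;
    *ptr = 3;
    int sum = allocation[0] + allocation[1] + allocation[2];
    delete[] allocation;        // array form, applied to the base address
    return sum;
}

// std::vector: same three ints, but the memory is freed automatically
// when 'values' goes out of scope.
int sumWithVector() {
    std::vector<int> values{1, 2, 3};
    int sum = 0;
    for (int v : values) sum += v;
    return sum;
}
```

Both functions return 6. The vector version has no delete to forget or to call twice.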


Thanks for the detailed answers (and the 0BAD0BAD, lol, amazing tips), guys! it's clear now. :D

This topic is closed to new replies.
