
Pointers and Arrays in C

22 replies to this topic

#1
fayyazlodhi

fayyazlodhi

    Programming Expert

  • Members
  • PipPipPipPipPipPip
  • 403 posts
Pointers are often seen as the 'make or break' topic for a programmer, though they are quite intuitive. I would mainly like to point out some basics and uses, along with arrays, since the two have much in common.

Pointers:

A pointer is basically the address of a memory location. More precisely, it is a variable that contains the address of another variable.


 int var = 10; // Creates a variable by reserving memory.

 int * p;       // Creates a pointer to int. It can only hold the address

                  // of an int variable - it does not contain a valid address yet.

 p = &var;     // Now p is assigned address of var.


Now we can reference the value 10 in two ways - Using var directly or through p indirectly:


  printf("Through Variable var: %d  Through Pointer p %d", var, *p);


The * operator tells the compiler to "go to the address stored in the pointer variable and read the value there". The & operator means "take the address of the variable that follows".

We can also indirectly modify the value i.e.


  *p = 5;

  printf("%d %d", var, *p); // both would print 5

  // Note: %p is used for a pointer, and p is passed without *. This prints the address

  // of the memory, usually in hexadecimal, something like 0xabbde21f.

  printf("Address contained in pointer is %p\n", (void *)p); // %p expects a void *


The reason is that there is only a single piece of memory, which both var and the pointer p refer to. That memory can contain either 5 or 10, not both. If we change it through the pointer, the variable also reads the new value.
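As a minimal sketch of that point (the function name write_through_pointer is made up for illustration, it is not from the post), the following writes through the pointer and then reads the variable directly:

```c
#include <stdio.h>

/* Stores new_value through a pointer, then returns what the variable
   itself holds, read directly and not through the pointer. */
int write_through_pointer(int new_value)
{
    int var = 10;      /* one int-sized piece of memory */
    int *p = &var;     /* p holds the address of that same memory */

    *p = new_value;    /* store through the pointer */
    printf("var = %d, *p = %d\n", var, *p);
    return var;        /* the variable sees the change too */
}
```

Calling write_through_pointer(5) from main should report the same value for var and *p, since both name the same storage.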
We can use p++ or p-- to move around in memory. But to grasp that, let's begin with arrays:

Arrays
Arrays are a way of grouping many variables of the same type.


  #define SIZE 10 

  int array[SIZE] = {0}; // SIZE is replaced with 10 by the preprocessor; = {0} sets every element to 0 so the values printed below are defined.


It says create 10 variables of integer type which are placed consecutively in memory and can be addressed using array[0], array[1] and so on.

So we can write a loop to go from array[0] to array[9] (First element's index is 0 and last is SIZE-1).

Now print 10 elements of array:


  for(int cnt=0; cnt < SIZE; ++cnt)

    printf("Element Num %d is %d \n", cnt, array[cnt]);


Instead of this we could use pointers too:


  int * p = array; // The array name converts to the address of its first element, so it does not need the & operator


  for(int cnt=0; cnt <SIZE; ++cnt) {

    printf("Elmt Num %d is %d \n",cnt, *p);

    p++;

  }


The statement p++ moves the pointer along, advancing the address by one integer element at a time. So one increment takes it to the second array element, and so on.

A pointer's size depends on the machine: typically 4 bytes on a 32-bit machine and 8 bytes on a 64-bit machine. If all pointers are addresses, why do they need a type? That is, char * p, int * p and float * p are all same-sized variables containing addresses, so what is the significance of the type?

It is the type that tells the compiler how many bytes p++ actually skips. The above is an integer array, so we declare an int pointer. Therefore p++ skips 4 bytes (assuming int is 4 bytes here). Had it been a char array with a char * p pointer, p++ would skip only one byte.
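One way to see the step size for yourself is to measure the distance between p and p + 1 in bytes. This is only a sketch; the helper names int_pointer_step and char_pointer_step are invented for the example:

```c
#include <stddef.h>

/* Number of bytes between p and p + 1 when p is an int pointer. */
size_t int_pointer_step(void)
{
    int values[2] = {0, 0};
    int *p = values;
    /* Casting to char * lets us count the distance in bytes. */
    return (size_t)((char *)(p + 1) - (char *)p);
}

/* Number of bytes between p and p + 1 when p is a char pointer. */
size_t char_pointer_step(void)
{
    char letters[2] = {'a', 'b'};
    char *p = letters;
    return (size_t)((p + 1) - p);
}
```

The first function returns sizeof(int), whatever that is on the machine at hand (often 4), and the second returns 1.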

Pointer notation of arrays
When we write array[0] or array[1], the compiler translates it into addresses, i.e. pointer arithmetic. We can do that translation ourselves too, thereby reducing the work done by the compiler.
The print statement above could also be written as:


  printf("Elmt Num %d is %d \n",cnt, *(p + cnt));


So we no longer need to increment p each time. This is actually a translation of the earlier statement. p is a pointer to the array (pointing to the array's first element). Each time we add cnt's current value, the address moves 4 bytes per unit of cnt (assuming a 4-byte int), i.e. 4 bytes for cnt 1, 8 for cnt 2, 12 for cnt 3, 16 for cnt 4 and so on.

The above code is faster than the earlier version, though here the advantage is slight: we save the compiler one translation by doing it ourselves.
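To check that subscript notation and pointer notation read the same elements, here is a small sketch. The function notations_agree is a hypothetical helper, not part of the tutorial:

```c
#define SIZE 10

/* Returns 1 if array[cnt], *(p + cnt) and *(array + cnt) all read
   the same value for every index, 0 otherwise. */
int notations_agree(void)
{
    int array[SIZE];
    int *p = array;
    int ok = 1;

    /* Give every element a distinct, known value first. */
    for (int cnt = 0; cnt < SIZE; ++cnt)
        array[cnt] = cnt * cnt;

    for (int cnt = 0; cnt < SIZE; ++cnt) {
        if (array[cnt] != *(p + cnt) || array[cnt] != *(array + cnt))
            ok = 0;
    }
    return ok;
}
```

The C language defines array[cnt] as *(array + cnt), so the function returns 1 with any conforming compiler.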

This relation extends to double or even triple pointers. 2D or 3D arrays are conceptually translated into those, but they should be addressed separately.

Edited by fayyazlodhi, 16 May 2011 - 01:37 PM.
replacing constant with Macro


#2
Alexander

Alexander

    It's Science!

  • Moderators
  • 4,118 posts
  • Location:Vancouver, Eh! Cleverness: 200
This is a very clear tutorial. I've not seen many that teach pointer arithmetic, and as such people often do not know or understand it and have trouble later on with arrays and subscripts.

I would give one suggestion though: you appear to be using < 10 instead of < SIZE in your example int array loops. It would be a bit clearer to use SIZE if you are just iterating through each element and no more.
Be sure to read the updated FAQ! || Health is achieved through the same 10,000 steps.
If a suggested code/method fails, informing us is less important than telling us why or what errors occurred.

#3
fayyazlodhi

fayyazlodhi

    Programming Expert

  • Members
  • PipPipPipPipPipPip
  • 403 posts
Thank you very much Alexander. Yes, you are absolutely right about SIZE, and that was my intention when I wrote the #define, but I somehow missed it in the loop.

Can you please help with editing it? I looked for the Edit button like the one on posts but couldn't find it. I guess it works differently for tutorials.

Edit: Never mind - found it and edited. Sorry for the bother; probably my browser had something up and was not showing the link properly.

#4
gdjs

gdjs

    Newbie

  • Members
  • PipPip
  • 10 posts
I think you're confused about certain aspects:


The size of a pointer object is not directly related to the type of object it points to. A pointer to int doesn't need to be as big as an int.

Arrays are never pointers. Arrays' values are pointers to their first elements (an array's value's type is "pointer to [whatever the type of that array's elements are]").

Using different notation doesn't reduce work done by the compiler. These days, any half decent compiler will optimize simple operations like dereferencing in loops. Even if optimization did not deal with this, the time you'd save would be compilation (not execution) time.
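The second point can be seen with sizeof, which reports different sizes for an array and for a pointer initialized from it. This is just a sketch, and the two function names are made up for the example:

```c
#include <stddef.h>

/* Size of the array object itself: all ten elements. */
size_t array_object_size(void)
{
    int array[10];
    return sizeof array;
}

/* Size of a pointer that was initialized from the array's value. */
size_t decayed_pointer_size(void)
{
    int array[10];
    int *p = array;   /* array's value: pointer to its first element */
    return sizeof p;
}
```

The first returns 10 * sizeof(int), the second returns sizeof(int *), which would not differ if an array really were a pointer.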

I've written about pointers and arrays today, right here.

#5
fayyazlodhi

fayyazlodhi

    Programming Expert

  • Members
  • PipPipPipPipPipPip
  • 403 posts

gdjs said:

I think you're confused about certain aspects:


The size of a pointer object is not directly related to the type of object it points to. A pointer to int doesn't need to be as big as an int.

Arrays are never pointers. Arrays' values are pointers to their first elements (an array's value's type is "pointer to [whatever the type of that array's elements are]").

Using different notation doesn't reduce work done by the compiler. These days, any half decent compiler will optimize simple operations like dereferencing in loops. Even if optimization did not deal with this, the time you'd save would be compilation (not execution) time.

I've written about pointers and arrays today, right here.

I believe Alexander has pretty much explained each of these aspects and my intentions. If you still have doubts about any point, please do ask. I have been working in C for a long time now, so I think I am pretty clear about what I write. That does not mean I can't be wrong, but I didn't claim any of the things mentioned above. So it's better if we discuss and clear up the concepts.

One aspect which might be missing is,

Though I never said "arrays are just pointers", since it has come up I would say yes, "arrays are constant pointers". The explanation is the same, i.e. the NAME of the array (not the value) is the starting address of the array and points to the first element.

However, it is a constant pointer that is why you can't do <ArrayName>++ or <ArrayName> = <another ptr>. But you can easily use

<ptr of array element type> = <ArrayName>

and then use that ptr to work with the same array any way you like.
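A short sketch of what is and is not allowed here (sum_with_pointer is an illustrative name only). The two lines that the compiler would reject are left as comments:

```c
/* Sums the array by advancing a separate pointer; the array name
   itself is never modified. */
int sum_with_pointer(void)
{
    int numbers[4] = {1, 2, 3, 4};
    int *ptr = numbers;   /* allowed: ptr receives the first element's address */
    int total = 0;

    for (int i = 0; i < 4; ++i) {
        total += *ptr;
        ptr++;            /* allowed: ptr is an ordinary, modifiable pointer */
    }

    /* numbers++;         rejected: an array is not a modifiable lvalue */
    /* numbers = ptr;     rejected for the same reason */
    return total;
}
```

Whether one calls the array name a "constant pointer" or not, the practical rule is the same: copy its value into a real pointer if you want something you can move.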

This term (constant pointer) is used in pretty much every authoritative book, such as K&R, the ANSI standard, Stroustrup, etc.

When you write code that the compiler is not forced to optimize on its own, there are advantages, e.g. sometimes the compiler makes decisions that were not exactly intended by your code. The volatile keyword was specially designed to prevent these optimization side effects.

Also, like you said, it will improve compilation time (not execution or run time). How significant can that impact be?

For a 1D or 2D array it is insignificant. But for a 5D array with each dimension of considerable size, the translation time varies greatly. The concept is that RAM is linear, so every multi-dimensional array is ultimately MAPPED into 1-D. That requires formulas and computation.
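For the 2D case, the mapping mentioned above is the usual row-major formula, row * COLS + col. The sketch below compares that formula with where the compiler actually places each element. ROWS, COLS, flat_index and layout_matches_formula are all names chosen for this example:

```c
#include <stddef.h>

#define ROWS 3
#define COLS 4

/* Offset, in elements, of grid[row][col] from the start of the array,
   using the row-major formula. */
size_t flat_index(size_t row, size_t col)
{
    return row * COLS + col;
}

/* Returns 1 if the formula matches the real address of every element. */
int layout_matches_formula(void)
{
    int grid[ROWS][COLS];

    for (size_t r = 0; r < ROWS; ++r) {
        for (size_t c = 0; c < COLS; ++c) {
            /* Byte distance from the start of the whole array object. */
            ptrdiff_t bytes = (char *)&grid[r][c] - (char *)grid;
            if ((size_t)bytes != flat_index(r, c) * sizeof(int))
                return 0;
        }
    }
    return 1;
}
```

Higher dimensions follow the same pattern, with one more multiplication per extra dimension.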

If you have written the code using pointers and indirection yourself, the compiler won't have to do any of the mappings, and trust me, it really becomes a research problem when, e.g., an image processing project takes say 8-10 hours to compile and you reduce that to 3 hours.

#6
fayyazlodhi

fayyazlodhi

    Programming Expert

  • Members
  • PipPipPipPipPipPip
  • 403 posts
And yes - a pointer's size doesn't need to be the size of the type it points to - though the size of a pointer on any machine is always equivalent to the size of an integer on that machine.


struct e

{

   int v;

   char s[10];

};


// All of these pointers are the size of int on this machine (4 bytes on a 32-bit machine and 8 bytes on a 64-bit one).

char * a;

int * b;

double * c;

struct e * obj;          // C needs the struct keyword here; plain "e * obj" only works in C++


I don't think I ever wrote that the above is not the case.
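The sizes are easy to print rather than assume. The sketch below uses the same struct e; the function name pointer_matches_int_size is invented for the example. Note that on many common 64-bit platforms int stays at 4 bytes while pointers are 8, so the two sizes need not match:

```c
#include <stdio.h>

struct e {
    int v;
    char s[10];
};

/* Prints the sizes on this platform and returns 1 if a pointer
   happens to be the same size as int here, 0 otherwise. */
int pointer_matches_int_size(void)
{
    printf("int: %zu\n", sizeof(int));
    printf("char *: %zu\n", sizeof(char *));
    printf("int *: %zu\n", sizeof(int *));
    printf("double *: %zu\n", sizeof(double *));
    printf("struct e *: %zu\n", sizeof(struct e *));
    return sizeof(int *) == sizeof(int);
}
```

No expected output is given on purpose: the numbers depend on the compiler and target.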

#7
fayyazlodhi

fayyazlodhi

    Programming Expert

  • Members
  • PipPipPipPipPipPip
  • 403 posts
I went through the post linked above and agree with what is written about the type of an array vs. int *.

Only I would go on to say that the difference is still the 'constant pointer', which, if you evaluate it further, means a pointer that cannot be changed and hence points to an array of fixed size.

#8
gdjs

gdjs

    Newbie

  • Members
  • PipPip
  • 10 posts
I might have misinterpreted what you meant by "Pointer notation of arrays". I simply believe the confusion between arrays and pointers is because people think they are sometimes "the same", which is not the case.

No. An array is never a pointer. It is never a "constant pointer". Like my post clarifies, an array's lvalue might be evaluated (as an expression), in which case the yielded value is of type "pointer to [element of array's type]".

You're actually mistaken to think that 'arr[i]' is equivalent to '*(arr + i * sizeof (int))' (given that arr is an array of int). 'arr[i]' is actually the same as '*(arr + i)'.

The following code exemplifies what I mean:


#include <stdlib.h>

#include <stdio.h>


int main(void)

{

    int arr[] = {1, 2, 3, 4, 5, 6, 7}, i = 0;

    

    for (i = 0; i < 7; i++)

        printf("%d ", *(arr + i)); // not *(arr + i * sizeof (int))

    

    return 0;

}


So, as you can see, the compiler must actually be aware and compute the size of *arr, always. An object of type (char *) can actually be used to walk the entire object, byte by byte, but the gain in compile time is actually irrelevant compared to the loss of all the extra dereferencing at runtime - not to mention the cryptic code this might generate.

One final thing: the value of an array is not a pointer to that array. It is a pointer into that array. Consider the difference between the following:


int array[10]; 

int *p = array; // p is a pointer to this array's first element, or &array[0] (a pointer "into" the array, but not a pointer to the array)

int (*pa)[10] = &array; // pa is a pointer to the array of 10 ints.
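The difference between the two pointer types shows up as soon as you add 1 to each. A small sketch, with helper names made up for illustration:

```c
#include <stddef.h>

/* Bytes skipped by p + 1 when p points to one int inside the array. */
size_t step_of_element_pointer(void)
{
    int array[10];
    int *p = array;
    return (size_t)((char *)(p + 1) - (char *)p);
}

/* Bytes skipped by pa + 1 when pa points to the whole array of 10 ints. */
size_t step_of_array_pointer(void)
{
    int array[10];
    int (*pa)[10] = &array;
    return (size_t)((char *)(pa + 1) - (char *)pa);
}
```

p + 1 moves by sizeof(int), while pa + 1 moves by 10 * sizeof(int), i.e. past the entire array.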


I'm sorry if I sound pedantic, but this type of linguistic confusion is what breaks people's brains. We must be clear and unambiguous.

#9
fayyazlodhi

fayyazlodhi

    Programming Expert

  • Members
  • PipPipPipPipPipPip
  • 403 posts

gdjs said:

I might have misinterpreted what you meant by "Pointer notation of arrays". I simply believe the confusion between arrays and pointers is because people think they are sometimes "the same", which is not the case.

Sure - my intention was to elaborate on the common syntax and mainly pointer arithmetic, not to prove they are the same, which I guess most of your discussion is focused upon.

gdjs said:

No. An array is never a pointer. It is never a "constant pointer". Like my post clarifies, an array's lvalue might be evaluated (as an expression), in which case the yielded value is of type "pointer to [element of array's type]".
I beg to differ a bit. At the implementation level, yes, an array is never a pointer. But your own statement gives the rationale: "array's lvalue is evaluated as pointer to element of array's type". If the language implements it to be retrieved as a constant pointer, I guess it is only fair to call it that, rather than saying "No, it does evaluate to that, but internally it is not." It is much simpler to view things in the light of usage until we need the internals, and even that need arises from usage.

gdjs said:

You're actually mistaken to think that 'arr[i]' is equivalent to '*(arr + i * sizeof (int))' (given that arr is an array of int). 'arr[i]' is actually the same as '*(arr + i)'.

Agreed - but that was a genuine mistake. Thanks for pointing it out (fixed now). I thought I was using a char pointer, which would increment a single byte, so sizeof needed to be added. And by the way, the code fragment was emphasizing how many bytes a pointer would skip based upon its type, so it is only fair to view it in that context. I would have chosen a clearer example had 'array is not a pointer' been my focus.

gdjs said:

So, as you can see, the compiler must actually be aware and compute the size of *arr, always. An object of type (char *) can actually be used to walk the entire object, byte by byte,

I never denied the above, and it is pretty trivial. If you are leading to this because of sizeof(int) being used, I guess I cleared that up already.

gdjs said:

but the gain in compile time is actually irrelevant compared to the loss of all the extra dereferencing at runtime - not to mention the cryptic code this might generate.

Really? Then all the standard libraries that extensively use pointer notation should have been written using array notation. But I guess that is not the case. There is no extra dereferencing. The same computation is done when you write arr[index] or index[arr], or rather one step less.

gdjs said:


int array[10]; 

int *p = array; // p is a pointer to this array's first element, or &array[0] (a pointer "into" the array, but not a pointer to the array)

int (*pa)[10] = &array; // pa is a pointer to the array of 10 ints.


If you are trying to make 'into' and 'to' two different things, please suit yourself. It might be my perception of English, but to me 'pointer to' does not translate into a double pointer, since you used a pointer to an array of size 10 and assigned it &array. I guess it is pretty common usage: when I say pointer to an array, everybody would perceive 'pointer to the first element', since that is the way arrays are created. If I ever meant otherwise, I would prefer the exact term double pointer rather than 'pointer to' and 'pointer into'.

#10
Ahmos

Ahmos

    Newbie

  • Members
  • PipPip
  • 12 posts
Nice tutorial

#11
RHochstenbach

RHochstenbach

    Learning Programmer

  • Members
  • PipPipPip
  • 56 posts
I have a question concerning the pointers. In your tutorial you assign 'var' the value 10, then create the pointer 'p' and then assign it the memory address of 'var'. Why can't you just do it like this?

int *p;

*p = 10;



#12
fayyazlodhi

fayyazlodhi

    Programming Expert

  • Members
  • PipPipPipPipPipPip
  • 403 posts
Sure. It is because a pointer must contain a valid address before it can be used. Unless you have separate memory properly allocated, to which the pointer points, it is incorrect.

If you directly try to do

*p = 10;


Then we have no memory reserved for the value 10. The compiler cannot do that implicitly, and the memory for p can only contain an address. Since we haven't provided that address, it won't work.
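If you do want a pointer without a named variable behind it, one option is to reserve the memory yourself with malloc. This is a minimal sketch; store_in_allocated_int is a hypothetical name, and returning -1 on failure is just a choice made for the example:

```c
#include <stdlib.h>

/* Allocates room for one int, stores value in it through the pointer,
   reads it back, and releases the memory again.
   Returns -1 if the allocation fails. */
int store_in_allocated_int(int value)
{
    int *p = malloc(sizeof *p);   /* reserve memory for one int */
    if (p == NULL)
        return -1;

    *p = value;                   /* valid now: p points at real storage */
    int result = *p;
    free(p);                      /* give the memory back */
    return result;
}
```

Here *p = 10 is fine because p was given a valid address first, which is exactly what the two-line version in the question is missing.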
Today is the first day of the rest of my life



