
How many bytes of memory does a String occupy in Java?

6 replies to this topic

#1 Alejandro

Alejandro

    CC Lurker

  • New Member
  • Pip
  • 9 posts

Posted 28 November 2014 - 03:28 AM

How many bytes of memory does a String occupy in Java?
Suppose I write:
String name = "Bill Gates";
How much memory does it require? Can the string store a Unicode character? How is it implemented internally: with a 1-byte char array or a 2-byte Unicode char array? I read somewhere that Java's char data type is 16 bits wide.

Thanks!!!



#2 BlackRabbit

BlackRabbit

    CodeCall Legend

  • Expert Member
  • PipPipPipPipPipPipPipPip
  • 3871 posts

Posted 28 November 2014 - 10:17 AM

It's 2 bytes per char. Java's char is a 16-bit UTF-16 code unit, and a String holds its text as an array of chars, so each one takes 2 bytes even where UTF-8 would need only 1. Characters outside the Basic Multilingual Plane need two chars (a surrogate pair). Code pages are not involved: they belong to legacy single-byte encodings, not to UTF-8 or UTF-16.

 

Java primitive Data Types
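
To make the 2-bytes-per-char point concrete, here is a minimal sketch that counts the chars in the string from the question and the bytes its text needs in UTF-16 and UTF-8. The class and method names are made up for illustration. It measures only the character data: the String object and its backing array add header overhead that depends on the JVM, and JVMs from Java 9 onward may store Latin-1-only strings more compactly, so treat the numbers as payload sizes, not as exact heap usage.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the JDK.
final class StringSizeDemo {
    // Number of 16-bit char values (UTF-16 code units) in the string.
    static int charCount(String s) {
        return s.length();
    }

    // Bytes of text at 2 bytes per char (UTF-16BE writes no byte-order mark).
    static int utf16Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    // Bytes the same text needs when encoded as UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String name = "Bill Gates";
        System.out.println(charCount(name));   // 10
        System.out.println(utf16Bytes(name));  // 20
        System.out.println(utf8Bytes(name));   // 10

        // U+1F600 lies outside the Basic Multilingual Plane,
        // so it is stored as a surrogate pair of two chars.
        String emoji = "\uD83D\uDE00";
        System.out.println(charCount(emoji));                        // 2
        System.out.println(emoji.codePointCount(0, emoji.length())); // 1
        System.out.println(utf16Bytes(emoji));                       // 4
    }
}
```

So a String can hold any Unicode character, but length() counts chars, not characters.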



#3 Chall

Chall

    CC Addict

  • Senior Member
  • PipPipPipPipPip
  • 349 posts

Posted 28 November 2014 - 01:44 PM

Expanding on what BlackRabbit said: if you don't want (or can't afford) 2 bytes per character, you can store the data in an array of bytes, with 1 byte representing 1 character. If you do this, however, you are limited to 256 different characters. For example:

//to convert a character array to a byte array of the same length:

public static byte[] characterArrayToByteArray(char[] charArray) {
	byte[] byteArray = new byte[charArray.length];
	for (int i = 0; i < charArray.length; i++) {
		byteArray[i] = (byte) (charArray[i]);
	}
	return byteArray;
}

//and to convert it back into character array.

public static char[] byteArrayToCharacterArray(byte[] byteArray) {
	char[] charArray = new char[byteArray.length];
	for (int i = 0; i < byteArray.length; i++) {
		charArray[i] = (char) (Byte.toUnsignedInt(byteArray[i]));
	}
	return charArray;
}

Both examples were tested with JDK 1.8.0_25. Byte.toUnsignedInt was added in Java 8, so earlier versions won't have it. To treat a byte as unsigned before Java 8, use the expression (myByte & 0xFF), which already evaluates to an int.
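
As a small sketch of why the mask matters (the class and method names here are hypothetical): Java bytes are signed, so casting a byte with its high bit set straight to char sign-extends it first, while masking with 0xFF keeps the value in the range 0 to 255.

```java
// Hypothetical helper class, not part of the JDK.
final class UnsignedByteDemo {
    // Works on any Java version: mask to 0..255, then narrow to char.
    static char toChar(byte b) {
        return (char) (b & 0xFF);
    }

    // The naive cast: the byte is sign-extended to int before narrowing.
    static char toCharSignExtended(byte b) {
        return (char) b;
    }

    public static void main(String[] args) {
        byte b = (byte) 0xE9; // stored as -23
        System.out.println(Integer.toHexString(toChar(b)));             // e9
        System.out.println(Integer.toHexString(toCharSignExtended(b))); // ffe9
    }
}
```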


Speaks fluent Java

#4 0xDEADBEEF

0xDEADBEEF

    CC Devotee

  • Senior Member
  • PipPipPipPipPipPip
  • 790 posts

Posted 28 November 2014 - 01:56 PM

If you want to use byte[] rather than char[], you should encode/decode with the correct charset. Doing byte b = (byte) c; isn't great.

That is, unless you know that you only ever build ASCII strings and want to convert between them. But then you might as well just use a byte array.
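
A minimal sketch of what encoding and decoding with an explicit charset could look like, next to the plain cast for comparison. The class and method names are invented for this example, and UTF-8 is just one reasonable choice of charset.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the JDK.
final class EncodingRoundTrip {
    static byte[] encodeUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // The cast-based conversion: keeps only the low 8 bits of each char.
    static String castRoundTrip(String s) {
        char[] in = s.toCharArray();
        char[] out = new char[in.length];
        for (int i = 0; i < in.length; i++) {
            byte b = (byte) in[i];       // high byte is discarded here
            out[i] = (char) (b & 0xFF);
        }
        return new String(out);
    }

    public static void main(String[] args) {
        String text = "Zo\u00EB \u20AC5"; // contains U+00EB and the euro sign U+20AC
        System.out.println(decodeUtf8(encodeUtf8(text)).equals(text)); // true
        System.out.println(castRoundTrip(text).equals(text));          // false
        System.out.println(encodeUtf8(text).length);                   // 9
    }
}
```

The cast survives for chars up to U+00FF, but the euro sign (U+20AC) comes back as U+00AC, which is the kind of silent corruption an explicit charset avoids.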

 

I don't believe anyone really uses char[]s heavily in Java anyway.


Creating SEGFAULTs since 1995.


#5 janissantony

janissantony

    CC Lurker

  • New Member
  • Pip
  • 9 posts

Posted 01 December 2014 - 10:40 PM

I think Strings are Unicode, so 16 bits per char.



#6 Sinipull

Sinipull

    CC Addict

  • Validating
  • PipPipPipPipPip
  • 384 posts

Posted 04 December 2014 - 02:41 PM

You should also be aware that Java optimizes String storage internally: when you create many instances of "Bill Gates" they may share the same memory / instance (string literals are interned), but this is not guaranteed in every case. The same may be true for other immutable objects as well.
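
A short sketch of that sharing, using == to compare references (the class and method names are hypothetical). Equal string literals refer to one interned instance, while new String(...) always creates a separate object unless you call intern() on it.

```java
// Hypothetical demo class, not part of the JDK.
final class InternDemo {
    // True only when both references point at the very same object.
    static boolean sameInstance(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String a = "Bill Gates";
        String b = "Bill Gates";             // same literal, same interned instance
        String c = new String("Bill Gates"); // explicitly a new object

        System.out.println(sameInstance(a, b));          // true
        System.out.println(sameInstance(a, c));          // false
        System.out.println(sameInstance(a, c.intern())); // true
        System.out.println(a.equals(c));                 // true
    }
}
```

This is also why strings should be compared with equals() rather than ==.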



#7 janissantony

janissantony

    CC Lurker

  • New Member
  • Pip
  • 9 posts

Posted 04 December 2014 - 03:56 PM

Yes, because it is immutable it can be shared as a flyweight.





