Can I even read a file without knowing how it is encoded?
As I understand it, a text file is just filled with numbers, and one has to know how it is encoded to read it properly.
How does one find out how it is encoded? And how does one do so using C?
Thanks in advance.
C, text file and its encoding
Started by denarced, Jul 25 2008 12:24 AM
11 replies to this topic
#1
Posted 25 July 2008 - 12:24 AM
#2
Posted 25 July 2008 - 03:15 AM
Lol no... If you try to read a text file, you are just going to read whatever text is stored in that memory space. So if it is plain text, you'll read the plain text, and if it is encrypted, you'll read the encrypted text.
To read and write files, look at CreateFile(), ReadFile(), and WriteFile().
#3
Posted 25 July 2008 - 05:36 AM
Well, the actual problem is this: when reading the text, I'm trying to find certain strings in it, and those strings, like the whole text, include certain letters such as ä and ö. This is where I ran into a problem. I don't know how to search for strings with these letters. If I write
char line[] = "näkyvä";
and try to search for that, it won't find it.
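One likely reason the search fails is that the literal "näkyvä" is stored in whatever encoding the source file was saved in, while the text file may use a different one. The sketch below spells the word out byte by byte and searches for both variants. It assumes the file is either ISO-8859-1 (Latin-1) or UTF-8; `contains_nakyva` is a hypothetical helper name, not something from this thread.

```c
#include <stddef.h>
#include <string.h>

/* "näkyvä" written as explicit bytes, so the result does not depend on
   how this source file happens to be saved. */
static const char NAKYVA_LATIN1[] = "n\xE4kyv\xE4";          /* Latin-1: ä = E4    */
static const char NAKYVA_UTF8[]   = "n\xC3\xA4kyv\xC3\xA4";  /* UTF-8:   ä = C3 A4 */

/* Hypothetical helper: returns 1 if text contains the word in either
   of the two assumed encodings, 0 otherwise. */
int contains_nakyva(const char *text)
{
    return strstr(text, NAKYVA_LATIN1) != NULL
        || strstr(text, NAKYVA_UTF8) != NULL;
}
```

If the file turns out to use some other encoding (UTF-16, for instance), neither byte pattern will match and strstr() is the wrong tool anyway, because UTF-16 text contains zero bytes.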
#4
Posted 25 July 2008 - 05:40 AM
I don't have the time right now to look at the ASCII chart to see if they are all valid characters. But if they are on the ASCII chart, then they are valid and there is no reason that you wouldn't be able to find them in a given text.
There is probably just an error in your code if that is the case.
#5
Posted 25 July 2008 - 12:42 PM
Generally, you will read a text file in text mode. The environment will handle the basics of the extended ASCII encoding. If you are working with something that is NOT plain-text, you have an issue :)
#6
Posted 25 July 2008 - 07:52 PM
MeTh0Dz|Reb0rn said:
Lol no... If you try to read a text file, you are just going to read whatever text is stored in that memory space. So if it is plain text, you'll read the plain text, and if it is encrypted, you'll read the encrypted text.
To read and write files, look at CreateFile(), ReadFile(), and WriteFile().
Uh, there are no functions by those names in C. It's fopen() to open files (whether they previously exist or not), and then fprintf() or fwrite() to write to them, and fscanf() or fread() to read. I don't know where you got that idea. Probably from C# or something. Remember to fclose() files when done!
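For reference, a minimal sketch of the standard-library route: read a whole file into memory with fopen(), fread() and fclose(). The name `read_file` is made up for this example, and the file is opened in binary mode ("rb") on the assumption that you want the raw bytes exactly as stored.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads the whole file into a malloc'd buffer.
   Returns NULL on failure; on success stores the byte count in *len.
   The caller must free() the result. */
unsigned char *read_file(const char *path, size_t *len)
{
    FILE *fp = fopen(path, "rb");    /* "rb": no newline translation */
    if (fp == NULL)
        return NULL;

    size_t cap = 4096, used = 0;
    unsigned char *buf = malloc(cap);
    while (buf != NULL) {
        used += fread(buf + used, 1, cap - used, fp);
        if (used < cap)
            break;                   /* short read: end of file or error */
        cap *= 2;                    /* buffer full: grow and keep reading */
        unsigned char *bigger = realloc(buf, cap);
        if (bigger == NULL)
            free(buf);
        buf = bigger;
    }
    if (buf != NULL && ferror(fp)) { /* distinguish error from end of file */
        free(buf);
        buf = NULL;
    }
    fclose(fp);
    if (buf != NULL)
        *len = used;
    return buf;
}
```

The buffer is not null-terminated, so treat it as bytes plus a length rather than as a C string.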
#7
Posted 25 July 2008 - 08:24 PM
What? Yeah there is, do you want me to show you the documentation?
There is more than one way to read and write from a file.
Here is the page for ReadFile; I didn't feel like linking all of them.
ReadFile Function (Windows)
#8
Posted 26 July 2008 - 06:31 AM
Ah, but that's nonstandard, locks your code into Windows, and given that it's Microsoft we're talking about, probably slower. And given that I use Linux, my statement was perfectly true on my machine. I prefer to use the standard input/output functions, which work on all operating systems.
#9
Posted 26 July 2008 - 08:17 AM
Well that's the difference, I code pretty much strictly for Windows so I prefer to just use WinAPIs.
#10
Posted 29 July 2008 - 07:05 AM
Short Answer:
Get a hex editor, find out the byte values of those non-standard characters, and put them into the string as integer values.
Long Answer:
The file could even be in a different encoding, one where reading it as 8-bit characters makes no sense. It could be 7-bit, 9-bit, or some other crap just to purposely screw you up. So find out the encoding; you will probably need binary I/O to do that.
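If you don't have a hex editor at hand, the same check can be sketched in a few lines of C. `format_hex` is a hypothetical helper that turns raw bytes (read in binary mode) into space-separated hex pairs you can inspect.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: writes the bytes as space-separated hex pairs into out.
   Returns the number of bytes formatted (fewer than len if out is too small). */
size_t format_hex(const unsigned char *data, size_t len, char *out, size_t outsize)
{
    size_t i, pos = 0;
    if (outsize > 0)
        out[0] = '\0';
    for (i = 0; i < len; i++) {
        /* each byte needs up to 3 characters plus the terminator */
        if (pos + 4 > outsize)
            break;
        pos += (size_t)snprintf(out + pos, outsize - pos, "%s%02X",
                                i == 0 ? "" : " ", (unsigned)data[i]);
    }
    return i;
}
```

For the word "näkyvä", a Latin-1 file gives 6E E4 6B 79 76 E4, while a UTF-8 file gives 6E C3 A4 6B 79 76 C3 A4. Once you know the values, you can write the search string as explicit bytes, e.g. "n\xE4kyv\xE4".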
#11
Posted 29 July 2008 - 07:03 PM
Well, if you're running Windows, you can test to see if a file is encrypted by doing the following test:
#include <windows.h>
//other code
DWORD dwFileAttribs = GetFileAttributes(szMyFileNameString);
if(dwFileAttribs == INVALID_FILE_ATTRIBUTES)
    //the call failed (e.g. the file doesn't exist)
else if(dwFileAttribs & FILE_ATTRIBUTE_ENCRYPTED)
    //file is encrypted
else
    //file isn't encrypted
A cheap way to test whether you're dealing with plain text or Unicode, UTF-16, etc., is to check for control characters (chars with ASCII value 0-31) other than the usual tab, line feed and carriage return. If there are any, zero bytes in particular, and you know it's a text file, then it's not plain ASCII text.
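A rough sketch of that kind of check in portable C follows. It is a heuristic, not a reliable detector. It looks for a byte order mark, then for zero bytes, then for bytes above 127, and finally checks whether those high bytes form structurally valid UTF-8. All names here (`guess_encoding`, the `GUESS_*` values) are invented for this example.

```c
#include <stddef.h>

enum text_guess {
    GUESS_ASCII,           /* only bytes below 0x80, no zero bytes        */
    GUESS_UTF8,            /* UTF-8 BOM, or high bytes forming valid UTF-8 */
    GUESS_UTF16,           /* starts with FF FE or FE FF                  */
    GUESS_WIDE_OR_BINARY,  /* contains zero bytes but no known BOM        */
    GUESS_8BIT             /* some single-byte encoding, e.g. Latin-1     */
};

/* Returns 1 if every lead byte is followed by the right number of
   10xxxxxx continuation bytes. Simplified: overlong forms and
   surrogates are not rejected. */
static int looks_like_utf8(const unsigned char *p, size_t n)
{
    size_t i = 0;
    while (i < n) {
        size_t extra;
        if (p[i] < 0x80)                extra = 0;
        else if ((p[i] & 0xE0) == 0xC0) extra = 1;
        else if ((p[i] & 0xF0) == 0xE0) extra = 2;
        else if ((p[i] & 0xF8) == 0xF0) extra = 3;
        else return 0;                  /* stray continuation or invalid byte */
        if (extra > n - i - 1)
            return 0;                   /* sequence cut off at end of buffer */
        for (size_t k = 1; k <= extra; k++)
            if ((p[i + k] & 0xC0) != 0x80)
                return 0;
        i += extra + 1;
    }
    return 1;
}

enum text_guess guess_encoding(const unsigned char *p, size_t n)
{
    int high = 0;
    if (n >= 3 && p[0] == 0xEF && p[1] == 0xBB && p[2] == 0xBF)
        return GUESS_UTF8;
    if (n >= 2 && ((p[0] == 0xFF && p[1] == 0xFE) ||
                   (p[0] == 0xFE && p[1] == 0xFF)))
        return GUESS_UTF16;
    for (size_t i = 0; i < n; i++) {
        if (p[i] == 0)
            return GUESS_WIDE_OR_BINARY;
        if (p[i] >= 0x80)
            high = 1;
    }
    if (!high)
        return GUESS_ASCII;
    return looks_like_utf8(p, n) ? GUESS_UTF8 : GUESS_8BIT;
}
```

A result of GUESS_8BIT only tells you the file is not ASCII and not UTF-8. Which single-byte code page it actually uses (Latin-1, Windows-1252, and so on) cannot be determined from the bytes alone.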
#12
Posted 31 July 2008 - 10:32 AM
lots of responses
:)
thanks for all

