I'm writing a function that goes through all files in a directory and reads all files that are text. I ran into a problem in that it begins to read binary files as well. How can I determine whether a file is binary or text in C#?
Well, perhaps you could filter by extension, so it only scans files with extensions such as .txt, .text, .log, and so on. (Leave .doc off the list: Word documents are a binary format.)
I'd assume you would have to check extensions unless you are on a Unix system. I'm not quite sure how you would do it there.
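A minimal sketch of the extension approach in C#, in case it helps. The class name, method names, and the extension list are made up for illustration; adjust the list to whatever file types you actually expect.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class ExtensionFilter
{
    // Hypothetical whitelist of extensions treated as text (assumption, not exhaustive).
    private static readonly HashSet<string> TextExtensions =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { ".txt", ".text", ".log" };

    // True if the file name ends in one of the whitelisted extensions (case-insensitive).
    public static bool HasTextExtension(string path)
    {
        return TextExtensions.Contains(Path.GetExtension(path));
    }

    // Lazily lists the files directly inside 'directory' that pass the extension check.
    public static IEnumerable<string> EnumerateTextFiles(string directory)
    {
        return Directory.EnumerateFiles(directory).Where(HasTextExtension);
    }
}
```

This never opens the files, so it is cheap, but it trusts the file name: a binary file renamed to .txt would still get through.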
Void
Since text files are ultimately stored as binary data too, it's not an easy question. Windows apps usually rely on the extension to know what type a file is. Alternatively, you could sniff the file contents: if every byte is in the 7-bit range (below 128), I think it'd be safe to assume it's ASCII text. Unicode would be a bit harder, though you could perhaps check for a BOM (Byte Order Mark) at the beginning.
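The sniffing idea could be sketched roughly like this. It is a heuristic, not a definitive test, and the names (TextSniffer, HasUnicodeBom, LooksLikeAscii, LooksLikeText) are invented for the example. Rejecting control characters other than tab, CR, and LF is an extra assumption on top of the "below 128" rule.

```csharp
using System;

public static class TextSniffer
{
    // Checks for the common Unicode byte order marks at the start of the data.
    public static bool HasUnicodeBom(byte[] data, int count)
    {
        if (count >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF) return true; // UTF-8
        if (count >= 2 && data[0] == 0xFF && data[1] == 0xFE) return true; // UTF-16 little endian
        if (count >= 2 && data[0] == 0xFE && data[1] == 0xFF) return true; // UTF-16 big endian
        return false;
    }

    // True if the first 'count' bytes are all 7-bit and contain no control
    // characters other than tab, line feed, and carriage return.
    public static bool LooksLikeAscii(byte[] data, int count)
    {
        for (int i = 0; i < count; i++)
        {
            byte b = data[i];
            if (b >= 0x7F) return false; // DEL or high bit set
            if (b < 0x20 && b != 0x09 && b != 0x0A && b != 0x0D) return false;
        }
        return true;
    }

    // Guess: text if it starts with a BOM or looks like plain ASCII.
    public static bool LooksLikeText(byte[] data, int count)
    {
        return HasUnicodeBom(data, count) || LooksLikeAscii(data, count);
    }
}
```

Note that UTF-8 text without a BOM that contains non-ASCII characters would be classified as binary by this sketch, so it errs on the side of rejecting files.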
So I would have to read every file to determine whether it is binary or ASCII? That seems like a lot of processing. I'll have to think of something else.
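You don't have to read the whole file. If you just read the first line (or the first few hundred bytes), you should be able to analyze that. You could also download the Windows version of the Unix command line tools and run grep " " *. For binary files that match, GNU grep prints a "Binary file (name) matches" notice instead of the matching lines, which tells you which files are binary.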
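To keep the cost down, something like the following reads only a small sample from the start of each file. It uses a common heuristic (a NUL byte in the sample means "probably binary"); that choice, the 512-byte default, and the names PrefixSniffer, ContainsNul, and LooksBinary are assumptions for this sketch.

```csharp
using System;
using System.IO;

public static class PrefixSniffer
{
    // True if any of the first 'count' bytes of 'buffer' is a NUL (0x00).
    public static bool ContainsNul(byte[] buffer, int count)
    {
        return Array.IndexOf(buffer, (byte)0, 0, count) >= 0;
    }

    // Reads at most 'sampleSize' bytes from the start of the file and
    // reports whether that sample contains a NUL byte.
    public static bool LooksBinary(string path, int sampleSize = 512)
    {
        byte[] buffer = new byte[sampleSize];
        using (FileStream stream = File.OpenRead(path))
        {
            // Read may return fewer bytes than requested; only inspect what was read.
            int read = stream.Read(buffer, 0, buffer.Length);
            return ContainsNul(buffer, read);
        }
    }
}
```

Only one small read is done per file, so the cost stays low even for large directories. The trade-off is that UTF-16 text usually contains NUL bytes and would be reported as binary, so you may want to check for a BOM first if you expect such files.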