I wasn't confusing output modes and representation. I was trying to point out that the article itself never claims that outputting in binary will give you a smaller file than the corresponding text file, which is what you were asking about. It only says the result will be smaller than what you get when you output the same data as fixed-length records. It's true that data written in binary can be smaller than the same data written as text, but that won't always be the case.
I can't be specific about what's going on in the binary files, since I can't see them and don't know how VB implements the output. I can point out a few things:
“
Data = "."
Text file size = 5 bytes
Binary file size = 5 bytes
Data = "A"
Text file size = 5 bytes
Binary file size = 5 bytes
Data = "AA"
Text file size = 6 bytes
Binary file size = 6 bytes
”
Any ASCII character is going to be a single byte, regardless of how you output it. It can't be any less: plain ASCII defines only 128 characters, but the 8-bit character sets built on it have 256, so each character is stored in a full 8-bit byte. You can see this when you add the second "A": the file size goes up by one byte.
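To illustrate the point, here's a minimal sketch in Python (not VB, so it says nothing about what VB itself adds to the file). It only shows that the same two characters come out as the same two bytes whether you encode them as text or pack them as a raw binary field:

```python
import struct

# The same two characters, once encoded as text and once packed as a
# raw 2-byte binary field. No line terminator or header is included.
text_bytes = "AA".encode("ascii")
binary_bytes = struct.pack("2s", b"AA")

print(len(text_bytes), len(binary_bytes))
print(text_bytes == binary_bytes)
```

Any size difference between the two files for character data therefore has to come from whatever else gets written around the characters.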
“
Data = "5"
Text file size = 3 bytes
Binary file size = 6 bytes
Data = "50"
Text file size = 4 bytes
Binary file size = 6 bytes
Data = "550"
Text file size = 5 bytes
Binary file size = 6 bytes
”
Notice that the binary file size stays at 6 bytes in all three cases. That's because all three integers are stored in the same fixed number of bytes (I'm guessing 2 bytes, but it could be 4; try writing a number above 2<sup>16</sup> if you want to find out).
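Here's a rough sketch of that effect in Python rather than VB. It assumes the text file holds the decimal digits followed by a CR LF line ending, and that the binary file holds a signed 16-bit integer; both are my guesses about the layout, and the helper names `text_size` and `binary_size` are made up for this example:

```python
import struct

def text_size(n: int) -> int:
    # Decimal digits plus an assumed CR LF line terminator.
    return len(f"{n}\r\n".encode("ascii"))

def binary_size(n: int) -> int:
    # Assumed layout: signed 16-bit little-endian integer, no header.
    return len(struct.pack("<h", n))

sizes = {n: (text_size(n), binary_size(n)) for n in (5, 50, 550)}

for n, (as_text, as_binary) in sizes.items():
    print(f"{n}: text = {as_text} bytes, binary = {as_binary} bytes")
```

The text size grows with the number of digits while the packed size stays put, which matches the pattern in the figures quoted above, apart from whatever constant overhead the real binary files carry.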
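My guess is that some fixed overhead is taking up the extra bytes in the binary files. The character results fit that: 5 bytes for one character and 6 for two suggests about 4 bytes of overhead, which would leave 2 bytes for each integer. I'd also predict you'll see the text files gaining size faster than the binary files when you move to larger files.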
As for your original issue: perhaps you had a lot of negative integers in the text file that were read into the program and stored in 32-bit variables? The text file would have a "-1" (2 bytes) where the binary file would have "11111111 11111111 11111111 11111111" (4 bytes). There are other places where the representations differ like this, in both directions, such as floats and doubles. There's no general rule for which one will be smaller.
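A quick sketch of how it can go either way, again in Python and not VB. The 32-bit integer and 64-bit double widths are assumptions about how the values are stored, and the text sizes count only the characters themselves, with no line endings:

```python
import math
import struct

# Small negative integer: text wins.
minus_one_text = "-1".encode("ascii")
minus_one_bin = struct.pack("<i", -1)   # assumed 32-bit signed integer

# Large negative integer: binary wins.
big_text = str(-2147483648).encode("ascii")
big_bin = struct.pack("<i", -2147483648)

# Double written out with full precision: binary wins.
pi_text = repr(math.pi).encode("ascii")
pi_bin = struct.pack("<d", math.pi)     # assumed 64-bit double

for label, t, b in [("-1", minus_one_text, minus_one_bin),
                    ("-2147483648", big_text, big_bin),
                    ("pi", pi_text, pi_bin)]:
    print(f"{label}: text = {len(t)} bytes, binary = {len(b)} bytes")
```

Which format ends up smaller depends on the mix of values in the file, not on the output mode alone.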