I often find myself explaining the same things in real life and online, so I recently started writing technical blog posts.
This one is about why it was a mistake to call 1024 bytes a kilobyte. It’s about a 20-minute read, so thank you very much in advance if you find the time to read it.
Feedback is very much welcome. Thank you.
Short answer: It’s because of binary.
Computers are very good at calculating with powers of two, and because of that a lot of computer concepts use powers of two to make calculations easier.
Edit: Oops… It’s 2^10, not 2^7
Sorry y’all… 😅
FYFY
FYFY FTFY FTFY
Yeah, I deserve that. I’m just gonna leave my typo. Thanks for the laugh!
So the problem is that our decimal number system just sucks. Should have gone with hexadecimal 😎
/Joking, if it isn’t obvious. Thank you for the explanation.
Or seximal!
Not that 1024 would be any better, as it’s 4424 in base 6.
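If anyone wants to check the base-6 claim, here is a rough sketch of a base converter. The `to_base` helper is made up for illustration and isn't from the article.

```python
def to_base(n: int, base: int) -> str:
    """Return the digits of a non-negative integer n in the given base (2..36)."""
    digits = "0123456789abcdefghijklmnopqrstuvwxyz"
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, rem = divmod(n, base)  # peel off the lowest digit
        out.append(digits[rem])
    return "".join(reversed(out))

# 1024 is only a "round" number in bases that are powers of two.
for base in (2, 6, 10, 16):
    print(base, to_base(1024, base))
```

In base 2 and base 16 the result is a 1 or a 4 followed by zeros; in base 6 and base 10 it is not round at all.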
I’m confused, why this quotation? 1024 is 2^10, not 2^7
Long answer
Just to add, I would argue that by definition of prefixes it is 1000.
However, there are other terms to use, in this case kibibyte (kilo binary byte, KiB instead of just KB). That way you are clear about what you actually mean (a particularly big difference with modern storage/file sizes).
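To put numbers on how much the two readings drift apart as sizes grow, here is a small sketch. The function name `binary_excess_percent` is my own; the only inputs are the 1000 and 1024 definitions discussed above.

```python
# Prefix pairs in increasing order: decimal name / binary name.
PREFIXES = ["kilo/kibi", "mega/mebi", "giga/gibi", "tera/tebi"]

def binary_excess_percent(power: int) -> float:
    """How much larger 1024**power is than 1000**power, in percent."""
    decimal = 1000 ** power
    binary = 1024 ** power
    return (binary - decimal) / decimal * 100

for power, name in enumerate(PREFIXES, start=1):
    print(f"{name}: binary unit is {binary_excess_percent(power):.1f}% larger")
```

The gap is about 2.4% at the kilo level and close to 10% at the tera level, which is why the ambiguity is more noticeable on modern drives.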
EDIT: Of course the link in the post goes over this, I admit my brain initially glossed over that and I thought it was a question thread
I believe it’s because you always use bytes in pairs in a computer. If you always pair the pairs, you eventually get the number 1024, which is the closest such number to 1000.
The logic is like this:
2+2 = 4
4+4 = 8
8+8 = 16
16+16 = 32
32+32 = 64
64+64 = 128
128+128 = 256
256+256 = 512
512+512 = 1024
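The doubling above can be sketched in a few lines. `doublings_until` is a hypothetical helper, just to show that 1024 is the doubled value that lands nearest to 1000.

```python
def doublings_until(target: int) -> list[int]:
    """Double starting from 2 until the value reaches or passes target."""
    values = [2]
    while values[-1] < target:
        values.append(values[-1] * 2)
    return values

steps = doublings_until(1000)
print(steps)

# Pick whichever doubled value is closest to 1000.
nearest = min(steps, key=lambda v: abs(v - 1000))
print(nearest)
```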
not exactly because of pairs unless you’re talking about 1 and 0 being a pair… it’s because the maximum number you can count in binary doubles with each additional bit you add:
with 1 bit, you can either have 0 or 1… which is, unsurprisingly perhaps, 0 and 1 respectively - 2 numbers
with 2 bits you can have 00, 01, 10, 11… which is 0, 1, 2, 3 - 4 numbers
with 3 bits you can have 000, 001, 010, 011, 100, 101, 110, 111… which is 0 to 7 - 8 numbers
so you see the pattern: add a bit, double the number you can count to… this is the “2 to the power of” that you might see: with 8 bits (a byte) you can count from 0 to 255 - that’s 2 (because binary has 2 possible states per digit) to the power of 8 (because 8 digits); 2^8 = 256 numbers
the same is true of decimal, but instead of 2 to the power, it’s 10 to the power: with each additional digit, you can count 10 x as many numbers - 0-9 for 1 digit, 00-99 for 2 digits, 000-999 for 3 digits - 10^1, 10^2, 10^3 respectively
and that’s the reason we use hexadecimal sometimes too! we group bits into groups of 8 and call it a byte… hexadecimal is base 16, so nicely lets us represent a byte with just 2 characters - 16^2 = 256 = 2^8
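A quick sketch of the pattern described above, for anyone who wants to poke at it. `value_count` is a made-up name; `format` is the Python built-in.

```python
def value_count(digits: int, base: int = 2) -> int:
    """Number of distinct values representable with `digits` digits in `base`."""
    return base ** digits

# Each extra bit doubles the count; the largest value is always count - 1.
for bits in (1, 2, 3, 8, 10):
    count = value_count(bits)
    print(f"{bits} bits: {count} values, 0 to {count - 1}")

# Two hex digits cover exactly the same range as eight bits.
assert value_count(2, base=16) == value_count(8) == 256

# The largest byte value written in binary and in hexadecimal.
print(format(255, "08b"), format(255, "02x"))
```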
Harvard’s CS50 has a great explanation on it. Makes a ton of sense. In fact CS50 should be required for high school, people would have a much better understanding of how software works in general.
Understanding that has very little advantage for the average person.
So teaching it alongside things like the quadratic equation makes perfect sense then.
Would be better to not teach either.
Exploring concepts that aren’t familiar to you can help you with other issues in your daily life. It helps you problem solve from a new perspective.
Because computers work in bits. Each bit represents 2 values (0 or 1). It’s much easier to work with 1024, which is 2^10 (exactly the number of values 10 bits can distinguish), than with 1000, which isn’t a power of 2.
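One way to see the point: addressing 1000 locations already costs 10 bits, and those same 10 bits can address 1024. A minimal sketch, where `address_bits` is a hypothetical helper built on Python's `int.bit_length`:

```python
def address_bits(locations: int) -> int:
    """Bits needed to give each of `locations` cells a distinct 0-based address."""
    # The highest address is locations - 1; bit_length() is the bits it needs.
    return max(1, (locations - 1).bit_length())

for n in (1000, 1024, 1025):
    bits = address_bits(n)
    print(f"{n} locations need {bits} bits, which can address {2 ** bits}")
```

So stopping at 1000 would leave 24 addressable locations unused, while going to 1025 would cost an extra bit.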
“Kilo” means 1000 under the official International System of Units.
With some computer hardware, it’s more convenient to use 1024 for a kilobyte and in the early days nobody really cared that it was slightly wrong. It has to do with the way memory is physically laid out in a memory chip.
These days, people do care and the correct term for 1024 is “Kibi” (kilo-binary). For example Kibibyte. There’s also Gibi, Tebi, Exbi, etc.
It’s mostly CPUs that use 1024 - and also RAM because it’s tightly coupled to the CPU. The internet, hard drives, etc, usually use 1000 because they don’t have any reason to use a weird numbering system.
As the article mentions, Windows also uses KB/MB/GB to refer to powers of 2 when calculating disk space. AFAIK Linux sometimes does too, although the article says otherwise. Apparently OSX uses the KB=1000 definition.
It may be outdated, but it’s still incredibly common for people to use KB/MB/GB to refer to powers of 2 in computing. Best not to assume KB is always 1000.
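To show how the same byte count reads under each convention, here is a rough sketch. `format_size` is a hypothetical helper and does not mirror what any particular OS does; the 500 GB drive is just an example figure.

```python
def format_size(num_bytes: int, binary: bool = False) -> str:
    """Format a byte count with decimal (kB, MB, ...) or binary (KiB, MiB, ...) units."""
    step = 1024 if binary else 1000
    units = ["B", "KiB", "MiB", "GiB", "TiB"] if binary else ["B", "kB", "MB", "GB", "TB"]
    value = float(num_bytes)
    for unit in units:
        # Stop at the first unit that keeps the number below one step,
        # or at the largest unit we know about.
        if value < step or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= step

size = 500_000_000_000  # example: a drive sold as "500 GB"
print(format_size(size))               # decimal reading
print(format_size(size, binary=True))  # binary reading of the same bytes
```

A tool that divides by 1024 but prints "GB" would show roughly 465 for this drive, which is where the "missing space" confusion comes from.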
Windows and Mac both use KB = 1000. With Linux I think it depends on the distro.
You’re thinking of very old versions of Windows… old versions of MacOS were also 1024.
It’s honestly irrelevant anyway - if you want to actually know how much space a file is using on disk, you should look up how many pages / sectors are being used.
A page (on an SSD) or sector (on a HDD) is 32768 bits on most modern drives. They can’t store a file smaller than that and all of your files take up a multiple of that. A lot of modern filesystems quietly use zip compression though. Also they have snapshots, files that exist in multiple locations, and other shit going on which will really mess with your actual usage.
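The rounding described above looks roughly like this. It is a sketch that assumes the 32768-bit figure quoted above and ignores compression, snapshots and deduplication; `allocated_bytes` is a made-up name.

```python
PAGE_BITS = 32768            # page/sector size quoted above
PAGE_BYTES = PAGE_BITS // 8  # 4096 bytes

def allocated_bytes(file_size: int, page: int = PAGE_BYTES) -> int:
    """Round file_size up to a whole number of pages."""
    pages = -(-file_size // page)  # ceiling division without floats
    return pages * page

for size in (1, 4096, 4097, 10_000):
    print(f"{size} byte file occupies {allocated_bytes(size)} bytes")
```

Under these assumptions a 1-byte file still occupies a full 4096-byte page, and a 4097-byte file occupies two.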
I’m not going to run
du -h /
on my laptop, because it’d take forever, but I’m pretty sure it would be a number significantly larger than my actual disk. Wouldn’t surprise me if it’s 10x the size of my disk. Macs do some particularly interesting stuff in the filesystem layer - to the point where it’s hard to even figure out how much free space you have… my Home directory has 50 GB of available space on my laptop. Open the Desktop directory (which is in the Home directory…) and the file browser shows 1.9 TB of available space.
Weird numbering system? Things are still stored in blocks of 8 bits in the end, it doesn’t matter.
When it gets down to what matters on hard drives, every byte still uses 8 bits, and all the other numbers that matter to people actually working in computer science are multiples of 8, not 10.
And because all internal systems work in 8-bit bytes rather than base 10, base 10 is “slower” (not that it matters any longer).