Why doesn't Linux need defragmenting?

It's a question that crops up with depressing regularity: Why don't Linux filesystems need to be defragmented? Here's my attempt at giving a simple, non-technical answer as to why some filesystems suffer more from fragmentation than others.

Rather than simply stumble through lots of dry technical explanations, I'm working on the principle that an ASCII picture is worth a thousand words. Here, therefore, is the picture I shall be using to explain the whole thing:

   a b c d e f g h i j k l m n o p q r s t u v w x y z

a  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
b  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
c  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
d  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
e  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
f  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
g  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
h  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
i  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
j  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
k  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
l  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
m  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
n  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
o  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
p  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
q  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
r  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
s  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
t  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
u  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
v  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
w  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
x  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
y  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
z  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

This is a representation of a (very small) hard drive, as yet completely empty - hence all the zeros. The a-z's along the top and down the left side of the grid are used to locate each individual byte of data: the top left is aa, the top right is za, and the bottom left is az. You get the idea, I'm sure...

We shall begin with a simple filesystem of a sort that most users are familiar with: One that will need defragmenting occasionally. Since both Windows and Linux users make use of FAT filesystems, if only for USB flash drives, this is an important filesystem - unfortunately, it suffers badly from fragmentation.

We add a file to our filesystem, and our hard drive now looks like this:

   a b c d e f g h i j k l m n o p q r s t u v w x y z

a  T O C h e l l o . t x t a e l e 0 0 0 0 0 0 0 0 0 0
b  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
c  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
d  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 T O C
e  H e l l o , _ w o r l d 0 0 0 0 0 0 0 0 0 0 0 0 0 0
f  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

(Empty rows g-z omitted for clarity)

To explain what you see: The first four rows of the disk are given over to a "Table of contents", or TOC. This TOC stores the location of every file on the filesystem. In the above example, the TOC contains one file, named "hello.txt", and says that the contents of this file are to be found between ae and le. We look at these locations, and see that the file contents are "Hello, world".
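
If you prefer code to grids, here is the same toy filesystem as a few lines of Python. To be clear, this is just a model of the picture above - the grid, the TOC dictionary and the start/end pairs mirror my diagram, not the on-disk structures of any real FAT driver:

SIZE = 26   # a 26x26 grid; "ae" means column a, row e

def addr(col, row):
    # Turn a column-row pair like ('a', 'e') into a flat byte offset.
    return (ord(row) - ord('a')) * SIZE + (ord(col) - ord('a'))

disk = ['0'] * (SIZE * SIZE)   # the empty drive: all zeros
toc = {}                       # filename -> (start, end) byte offsets

def write_file(name, data, start):
    # Put `data` on the disk at `start` and record its location in the TOC.
    disk[start:start + len(data)] = list(data)
    toc[name] = (start, start + len(data) - 1)

write_file("hello.txt", "Hello, world", addr('a', 'e'))
start, end = toc["hello.txt"]
print(''.join(disk[start:end + 1]))   # -> Hello, world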

So far so good? Now let's add another file:

   a b c d e f g h i j k l m n o p q r s t u v w x y z

a  T O C h e l l o . t x t a e l e b y e . t x t m e z
b  e 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
c  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
d  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 T O C
e  H e l l o , _ w o r l d G o o d b y e , _ w o r l d
f  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

As you can see, the second file has been added immediately after the first one. The idea here is that if all your files are kept together, then accessing them will be quicker and easier: The slowest part of the hard drive is the stylus (the read/write head); the less it has to move, the quicker your read/write times will be.

The problem this causes can be seen when we decide to edit our first file. Let's say we want to add some exclamation marks so our "Hello" seems more enthusiastic. We now have a problem: There's no room for these exclamation marks on our filesystem, because the "bye.txt" file is in the way. That leaves us with only two options, neither of them ideal:

  1. We delete the file from its original position, and tack the new, bigger file on to the end of the second file.
  2. We fragment the file, so that it exists in two places but there are no empty spaces.

To illustrate, here is approach one:

   a b c d e f g h i j k l m n o p q r s t u v w x y z

a  T O C h e l l o . t x t a f n f b y e . t x t m e z
b  e 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
c  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
d  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 T O C
e  0 0 0 0 0 0 0 0 0 0 0 0 G o o d b y e , _ w o r l d
f  H e l l o , _ w o r l d ! ! 0 0 0 0 0 0 0 0 0 0 0 0

And here is approach two:

   a b c d e f g h i j k l m n o p q r s t u v w x y z

a  T O C h e l l o . t x t a e l e a f b f b y e . t x
b  t m e z e 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
c  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
d  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 T O C
e  H e l l o , _ w o r l d G o o d b y e , _ w o r l d
f  ! ! 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

This is why FAT filesystems need defragging regularly. All files are placed right next to each other, so any time a file is enlarged, it fragments. And if a file is reduced, it leaves a gap. Soon the hard drive becomes a mass of fragments and gaps, and performance starts to suffer.
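
For the programmers in the audience, the same failure mode can be shown as a toy simulation. This is my own sketch of a packed, end-to-end allocator (not the real FAT code): because every file is placed right after the previous one, the only way to grow a file is to split it.

disk = []     # list of (filename, length) extents, packed end to end
files = {}    # filename -> list of extent indexes into `disk`

def create(name, length):
    files[name] = [len(disk)]
    disk.append((name, length))

def grow(name, extra):
    # Approach two from above: keep the old extent where it is and
    # add a new fragment at the end of the disk for the extra bytes.
    files[name].append(len(disk))
    disk.append((name, extra))

create("hello.txt", 12)   # "Hello, world"
create("bye.txt", 14)     # "Goodbye, world"
grow("hello.txt", 2)      # add "!!" - bye.txt is in the way

for name, extents in files.items():
    state = "fragmented" if len(extents) > 1 else "contiguous"
    print(f"{name}: {len(extents)} extent(s), {state}")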

And then there is Linux, which has a different philosophy. Windows filesystems are ideal if you have a single user, accessing files in more-or-less the order they were created in, one after the other. Linux, however, was always intended as a multi-user system: It was guaranteed that you would have more than one user trying to access more than one file at the same time. So a different approach was used. When we create "hello.txt" on a Linux filesystem, it looks like this:

   a b c d e f g h i j k l m n o p q r s t u v w x y z

a  T O C h e l l o . t x t h n s n 0 0 0 0 0 0 0 0 0 0
b  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
c  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
d  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 T O C
e  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
f  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
g  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
h  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
i  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
j  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
k  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
l  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
m  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
n  0 0 0 0 0 0 0 H e l l o , _ w o r l d 0 0 0 0 0 0 0
o  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
p  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
q  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
r  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
s  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
t  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
u  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
v  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
w  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
x  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
y  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
z  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

And then when another file is added:

   a b c d e f g h i j k l m n o p q r s t u v w x y z

a  T O C h e l l o . t x t h n s n b y e . t x t d u q
b  u 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
c  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
d  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 T O C
e  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
f  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
g  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
h  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
i  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
j  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
k  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
l  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
m  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
n  0 0 0 0 0 0 0 H e l l o , _ w o r l d 0 0 0 0 0 0 0
o  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
p  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
q  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
r  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
s  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
t  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
u  0 0 0 G o o d b y e , _ w o r l d 0 0 0 0 0 0 0 0 0
v  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
w  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
x  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
y  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
z  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

The cleverness of this approach is that the disk's stylus can sit in the middle, and most files, on average, will be fairly nearby: That's how averages work, after all.
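
If you'd rather not take that claim on faith, a little arithmetic on the one-dimensional toy disk bears it out: against uniformly random file positions, a head parked in the middle travels, on average, half as far as a head parked at the start.

N = 26 * 26   # bytes on the toy disk

for label, head in (("start", 0), ("middle", N // 2)):
    avg = sum(abs(p - head) for p in range(N)) / N
    print(f"head at {label}: average travel {avg:.1f} bytes")

# head at start: average travel 337.5 bytes   (about N/2)
# head at middle: average travel 169.0 bytes  (about N/4)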

Plus when we add our exclamation marks to this filesystem, observe how much trouble it causes:

   a b c d e f g h i j k l m n o p q r s t u v w x y z

a  T O C h e l l o . t x t h n u n b y e . t x t d u q
b  u 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
c  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
d  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 T O C
e  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
f  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
g  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
h  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
i  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
j  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
k  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
l  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
m  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
n  0 0 0 0 0 0 0 H e l l o , _ w o r l d ! ! 0 0 0 0 0
o  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
p  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
q  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
r  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
s  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
t  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
u  0 0 0 G o o d b y e , _ w o r l d 0 0 0 0 0 0 0 0 0
v  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
w  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
x  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
y  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
z  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

That's right: Absolutely none.

Windows tries to put all files as close to the start of the hard drive as it can; as a result, it constantly fragments files as they grow and run out of adjacent free space.

Linux scatters files all over the disk so there's plenty of free space if the file's size changes. It also re-arranges files on-the-fly, since it has plenty of empty space to shuffle around. Defragging a Windows filesystem is a more intensive process and not really practical to run during normal use.

Fragmentation thus only becomes an issue on Linux when a disk is so full that there just aren't any gaps a large file can be put into without splitting it up. So long as the disk is less than about 80% full, this is unlikely to happen.
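
To make the scattering idea concrete, here is a toy allocator that simply drops each new file into the middle of the largest free gap. This is a stand-in of my own devising - real Linux filesystems such as ext2/ext3 use block groups and cleverer heuristics - but it shows how every file keeps breathing room on both sides until the disk gets close to full:

import re

disk = ['0'] * (26 * 26)

def largest_gap():
    # Find (start, length) of the longest run of free '0' bytes.
    runs = re.finditer(r'0+', ''.join(disk))
    best = max(runs, key=lambda m: len(m.group()))
    return best.start(), len(best.group())

def scatter_write(data):
    start, length = largest_gap()
    at = start + (length - len(data)) // 2   # centre of the gap
    disk[at:at + len(data)] = list(data)
    return at

print(scatter_write("Hello, world"))     # 332 - the middle of the disk
print(scatter_write("Goodbye, world"))   # 159 - the centre of the biggest gap left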

It is also worth knowing that even when an OS says a drive is completely defragmented, due to the nature of hard drive geometry, fragmentation may still be present: A typical hard drive actually has multiple disks, AKA platters, inside it.

Let's say that our example hard drive actually spans two platters, with aa to zm being the first and an to zz the second:

   a b c d e f g h i j k l m n o p q r s t u v w x y z

a  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
b  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
c  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
d  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
e  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
f  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
g  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
h  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
i  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
j  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
k  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
l  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
m  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

   a b c d e f g h i j k l m n o p q r s t u v w x y z

n  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
o  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
p  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
q  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
r  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
s  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
t  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
u  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
v  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
w  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
x  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
y  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
z  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

The following file would be considered non-fragmented, because it goes from row m to row n, but this ignores the fact that the stylus will have to move from the very end of the platter to the very beginning in order to read this file.

   a b c d e f g h i j k l m n o p q r s t u v w x y z

a  T O C h e l l o . t x t r m e n 0 0 0 0 0 0 0 0 0 0
b  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
c  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
d  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
e  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
f  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
g  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
h  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
i  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
j  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
k  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
l  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
m  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 H e l l o , _ w o

   a b c d e f g h i j k l m n o p q r s t u v w x y z

n  r l d ! ! 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
o  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
p  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
q  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
r  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
s  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
t  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
u  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
v  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
w  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
x  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
y  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
z  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
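
Incidentally, in the toy geometry, spotting this kind of hidden split is easy. A sketch using our platter split (rows a-m on the first platter, n-z on the second - nothing like real drive geometry, of course):

SIZE = 26

def offset(col, row):
    return (ord(row) - ord('a')) * SIZE + (ord(col) - ord('a'))

def platter(off):
    # rows a-m sit on platter 1, rows n-z on platter 2
    return 1 if off // SIZE <= ord('m') - ord('a') else 2

start, end = offset('r', 'm'), offset('e', 'n')   # hello.txt above: rm to en
print(platter(start), platter(end))   # 1 2 - "contiguous", yet it spans platters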

I hope this has helped you to understand why no defragging software came with your Linux installation. If not, I'm always open to suggestions :)

122 comments • Categories: Omni, FOSS, Technology  

Comments, Pingbacks:

Comment from:
giz404 [Visitor]
· http://giz404.freecontrib.org/
Your explanation is clear, but I have one more question: What about NTFS? Does it handle fragmentation better than FAT?
Permalink 18/08/06 @ 05:44
Comment from:
Cameron [Visitor]

Excellent explaination! I have wondered this for years.
Permalink 18/08/06 @ 14:41
Comment from:
gab [Visitor]

So this proves that Linux does need defrag when the hard drive does not have enough gaps... So where are the defrag utils for linux?
Permalink 18/08/06 @ 14:49
Comment from:
oneandoneis2 [Member]

Permalink 18/08/06 @ 15:07
Comment from:
RMX [Visitor]
· http://cbbrowne.com/info/defrag.html
Another good writeup explaining how fragmentation is actually a good thing in a well designed filesystem can be found here: http://cbbrowne.com/info/defrag.html

And yes, there are defragmentation utils for some linux filesystems (ext2, for example), and they're useful when, for example, you want to shrink a partition. For performance purposes, though, they're totally useless and arguably harmful.
Permalink 18/08/06 @ 15:17
Comment from:
Scott Howard [Visitor]
· http://www.dipnoi.org
Very good way of explaining the difference.
Permalink 18/08/06 @ 15:22
Comment from:
Esben Pedersen [Visitor]

An inode in an ext2 filesystem refers to a number of pages on the disk. These pages need not be placed sequentially, though it is faster when they are.

So even if disk usage is higher than 80% and there is no room on the disk for a large file to have all its pages stored next to each other, it will only mean a small performance degradation.

The small files on the disk will be easy to place with their pages next to each other.
Permalink 18/08/06 @ 15:22
Comment from:
Rob [Visitor]
· http://www.goldcs.co.uk
Very nice! I've wondered why that was ever since some linux person said "defrag? what!?". Obviously linux users would still need to defrag, but not nearly as much as windows users. One question though - how much does this approach affect performance, seeing as the stylus has to move more?
Permalink 18/08/06 @ 15:31
Comment from:
your last example seems a little off.. [Visitor]

Since how many files follow that last example? 1 in 10,000? I could be wrong though, I really don't know what I'm talking about.
Permalink 18/08/06 @ 15:35
Comment from:
Matt [Visitor]

There is extra cleverness in the Linux filesystems that means the system does not suffer any noticeable effects of fragmentation until it is more than 95% full. Once a disk is this full there's not enough space left in order to be able to defrag it in any meaningful amount of time (try defrag'ing a 95% full FAT disk sometime to get an idea of what I mean)

A default ext2/ext3 linux filesystem actually reserves (IIRC) 5% of the disk for system use in order to avoid this issue (and for other purposes), so the issue of wanting to actually defrag a disk nearly never occurs in practice.

There did used to be tools to perform defrag, but no-one ever really used them, and since they could trash the disk on power failure they were considered unsafe.
Permalink 18/08/06 @ 15:36
Comment from:
joe [Visitor]

##########################################
##########################################
#################/- --################
###############- ## ##############
############## .####. -############
############# .#####. -###########
############ .######- -##########
########### .###### /#########
##########. -###### -## #########
########## -###### :###@ -########
#########- ###### /#####/ ########
######### ###### /#######/ ########
######### X#. .###############:, -#######
########. #### .###############- #######
######## -####@. ########.-#####: #######
######## .###### #######. -##### #######
########. #####/#########. -####.#######
########: ###############. ### .#######
######### .###############. .#######
######### -####### -###### ########
#########- :##### X##### ########
########## /### .####### /########
##########/ // .####### #########
########### ######- ##########
############ ######. ###########
############# /#### /###########
##############. ###. #############
################ ... .##############
##################... ..-#################
##########################################
##########################################
Permalink 18/08/06 @ 15:45
Comment from:
Pandemic [Visitor]
· Http://impulse100.net/
Google is a wonderful thing

^ should be a widely known acronym :P


Great! GREAT! explanation. I knew how it worked already, but this is an amazing way to show it. Hit it all right on the spot.
Permalink 18/08/06 @ 16:00
Comment from:
J Carter [Visitor]

Drivel. You clearly have no idea how either Windows or Linux filesystems work.
Permalink 18/08/06 @ 16:01
Comment from:
Juan Diego [Visitor]
· http://www.misgangas.com
maybe you want to mention the journaling...

this is a little bit complex

cheers

Permalink 18/08/06 @ 16:35
Comment from:
Mythos [Visitor]
· http://scripters.altervista.org
You talk like Windows has only FAT32! NTFS has the MFT to keep all the small files so they don't get scattered on the HD, forcing bigger files to be split and creating fragmentation.

Also, whether fragmentation is created depends more on the OS than on the FS (for example, the OS could save files in different places instead of putting them all adjacently). In fact the .NET 2.0 CreateFile function asks for the "buffer size" (practically the size of the file) so Windows can start writing it in a place where it wouldn't get fragmented.

And finally Windows Vista should have defrag scheduled so fragmentation will finally be a problem of the past.
Permalink 18/08/06 @ 16:52
Comment from:
Anal Avenger [Visitor]

Wonderful write up! Kept it simple and manageable.

Next on the agenda:

Anti-Gravity for Dummies


can't wait to read that one.
Permalink 18/08/06 @ 17:26
Comment from:
mt [Visitor]

This all is giant pile of [CENSORED]

If you have files closer to each other then you have less moves of that head over the disk. The problem with adding content to files is easily solved with clusters...

So with clean install of linuxes you would have potential overhead connected with disk head movements, but after a half year of normal use both systems will need to be defragmented.

Difference come with other more sophisticated algorithms like smart positioning of files and constant deffraging.

Also all actions involving looking for right place to write are some type of constant defragging so you then have there lower speeds of writing...

This is the worst kind of ignorance from linux users, thanks for such unscientific explanation looser...


Permalink 18/08/06 @ 17:41
Comment from:
nico [Visitor]

On very busy filesystems, fragmentation IS a problem on Linux: ext2, ext3, reiserfs... they all get fragmented after a while. Performance becomes so bad that the only way you defragment is to tar/untar the whole partition to clean things up.
I think reiserfs4 has a way of defragmenting when the system is not too busy though.

Oh, and yeah FAT/FAT32 sucks, but we knew that.
Permalink 18/08/06 @ 17:52
Comment from:
JT [Visitor]

You Windows fanboys are pretty funny making silly allegations that eventually linux filesystems'll suffer from fragmentation too.

Unless you have some quite odd usage patterns (create a zillion small files, and delete a random subset, and then create a few extremely large ones) fragmentation does not become a problem on most linux filesystems (don't know about reiser).

With ext2, a disk that's never totally full asymptotically approaches a degree of fragmentation that has a minimal impact on performance. After that point, further updates/deletes/creates have the effect of removing fragmentation just as much as they create it.

That's not to say there's zero effect - I've worked with a HDTV video streaming system that used raw-access to the disk with no filesystem to reduce fragmentation to a near theoretical minimum (only reason to seek was a bad block on disk) - but it's simply false to say that eventually you'll need to defrag a linux disk.
Permalink 18/08/06 @ 18:13
Comment from:
sb [Visitor]

@mt
This article gives a simplistic overview of how filesystems such as EXT write files, and judging from how it looks, it may be very specific to EXT since ReiserFS has a different organizational structure, although some of the rules presented here may still apply.

However, this is an OVERVIEW. If you look at the header of the demonstrations, you'll see TOC areas which map out the locations and sizes of files on the physical hard disk via 8/16/32 bit addresses based on whatever version of FAT you were using. FAT and arguably any file system operates on the same principle, since without a table of contents it would be quite a task to find each file. That being said-- your argument about lower write speeds is a lousy and uneducated assumption. Yes, it takes a little more computation to find an ideal location to write the files, but this task is a simple one since all it has to do is query the TOC. And, just like linux file systems, FAT and NTFS access their own TOCs (MFT). It's just that many linux file systems try to place files in ideal locations that will prevent fragmentation.

Constant defragging causes severe performance costs depending on the condition of the drive. And what happens if you have one file that constantly resizes despite being packed into the rest of the "heap"? It will always fragment, and because the defragger is always trying to defragment it, it will always be moved.

Your argument about sophisticated algorithms is sophomoric-- files are not just tossed randomly on the hard drive as you think. And even then-- there are many, MANY different algorithms used by each of the different file systems. ReiserFS 3.6 is vastly different from ReiserFS 4, both of which are vastly different from EXT2. Many are arguably better than NTFS performance- and integrity-wise in different fields, since there is no catch-all file system.

You sound like an uneducated FUD spreader. Had you paid any attention at all, you'd see this article points out the flaws in the FAT filesystem, which has been in the linux kernel for a VERY long time. This is NOT an "operating system xxx is better than others because the file system roxors their boxors" argument. It just points out that FAT was designed with one user in mind, which (somewhat) makes sense for sequential data. Up until Windows 2000, the average consumer (READ: not businesses) used FAT/FAT16/FAT32 as their primary file system, which is why references were made to windows.


Next time read up on EXT2, ZFS, XFS, JFS, and the many, MANY other file systems out there before making any comments like that.
Permalink 18/08/06 @ 18:14
Comment from:
derean [Visitor]

Windows fanboys? sounds more like microsoft shills roaming the net trying hard to discredit other OSes
Permalink 18/08/06 @ 18:37
Comment from:
Dave Nicoll [Visitor]
· http://www.davenicoll.com
YAAAAAAAAAAWN. So, freakin, what.
Permalink 18/08/06 @ 18:43
Comment from:
nic stage [Visitor]

great explanation!
Permalink 18/08/06 @ 19:07
Comment from:
MS [Visitor]

The second part about the platters being written to one after the other is plain wrong. The surfaces are interleaved. If you have 4 surfaces (2 platters):
  • sector 0 = surface 0, sector 0
  • sector 1 = surface 1, sector 0
  • sector 2 = surface 2, sector 0
  • sector 3 = surface 3, sector 0
  • sector 4 = surface 0, sector 1
  • sector 5 = surface 1, sector 1
  • sector 6 = surface 2, sector 1
  • sector 7 = surface 3, sector 1
  • sector 8 = surface 0, sector 2
I'm shocked by this basic ignorance about how hard disks work, so I (having little knowledge about modern file systems) very much doubt the rest of your article is correct.
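
(For the curious, the interleaving listed above works out to simple modular arithmetic. A sketch in Python, assuming the 4 surfaces of the example:)

SURFACES = 4   # 2 platters with 2 surfaces each, as in the example above

def locate(logical_sector):
    # Logical sectors interleave across surfaces instead of filling
    # one surface end to end.
    return logical_sector % SURFACES, logical_sector // SURFACES

for lba in range(9):
    surface, sector = locate(lba)
    print(f"sector {lba} = surface {surface}, sector {sector}")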
Permalink 18/08/06 @ 19:43
Comment from:
gothicknight [Visitor]
· http://xfs_fsr
The XFS filesystem from SGI has an online defragmenter (xfs_fsr), dunno why, because it uses delayed allocation to pin-point the best location for files in the buffer.
But yes, we (the evil GNU/Linux community) also have a defragmenter. HURRAY!!
Permalink 18/08/06 @ 19:46
Comment from:
Thomas Scholz [Visitor]

@MS:

I'm shocked about your ignorance of the fact that this is an article that tries to explain something in a SIMPLE way. How a harddrive really works is way out of scope, and does not really change anything. The only reason multiple platters are mentioned is to explain that it is not always possible to make a perfect (loose) fit, because it might have to be split over different platters/sectors/whatever.

The article did a great job of explaining the concept of a simple filesystem (like FAT), and the different ways EXT and FAT allocate disk space. Would have been nice with some more on other filesystems like NTFS, reiser 3/4 and others.
Permalink 18/08/06 @ 20:03
Comment from:
Peter Braam [Visitor]
· http://lustre.org
In fact another situation can lead to ext3 fragmentation, namely when many threads do concurrent IO and older versions of ext3 are used. The allocations get mixed easily.

The Lustre project is building an online ext3 defragmenter which will defragment free space and files.
Permalink 18/08/06 @ 20:29
Comment from:
MS [Visitor]

@Thomas Scholz: The article aims to make the point that moving from the end of one surface to the beginning of the next one
  • is required when reading linear data
  • therefore requires the head to move from one end to the other.
And this is wrong, so the whole last section is wrong. If you have a 1 GB hard disk with one big file filling the whole disk, the head will read from the very outside to the very inside, reading all surfaces at a time - never having to go back to start at sector 0 of the next surface.
Permalink 18/08/06 @ 20:38
Comment from:
Bryan [Visitor]
· http://adminfoo.net
@oneandoneis2, author of this article:

I'm pretty curious to know what references you called upon in writing this?

Hmm, actually, same goes for the rest of the, err, expert-sounding comments made here. I've done a lot of searching the net, but found very few authoritative articles sandwiched in among the many 'how I sort of guess I think it probably must work' theories ...
Permalink 18/08/06 @ 20:59
Comment from:
Simon [Visitor]

FAT does NOT specify an allocation policy. It's up to the operating system to find a good spot for a file. That means that the allocation policy is not a part of FAT itself but of the FAT filesystem driver. You can place files anywhere you want on a FAT volume (check Alexei Frounze's FAT driver for an example).

Windows NT is not an operating system designed with single-user stuff in mind. Its design is "inspired" by VMS. And NTFS is "inspired" by HPFS (OS/2's native filesystem). The goal of NTFS was to create a modern filesystem. Its performance is at least on par with ext3. ReiserFS is faster for small files.

NTFS has more sophisticated (I don't know if that helps) allocation algorithms than ext3 afaik.
Permalink 18/08/06 @ 21:08
Comment from:
Michael Skelton [Visitor]
· http://blog.codingo.net
Well written article - Doesn't fully clarify everything but it's definitely a good introduction. Dugg.
Permalink 18/08/06 @ 21:09
Comment from:
Mic [Visitor]
· http://greatcube.com
A really cool explanation!
I translated it into a Chinese version:
http://greatcube.com/why_doesn_t_linux_need_defragmenting

If you don't want me to do this, please comment so that I'll take it off.

Thanks for sharing!
Permalink 18/08/06 @ 22:39
Comment from:
Brainiac [Visitor]
· http://yourbrainisnot@mensa.net
The problem with performance of drives is the movement of the single stylus.

If the stylus was stationary and extended across all the tracks on the platters, then a microprocessor could manage the incoming/outgoing data and read/write to many tracks at one time.

Imagine filling a 500GB hard drive in a matter of seconds.

I can.
Permalink 18/08/06 @ 22:46
Comment from:
ddaley [Visitor]

I am not a hardware guru by any means, but I do understand well what the article is getting at, and it is an adequate explanation (if a basic one) of how fragmentation works.

What is 110% clear to many of us out here who do support for computers is that NTFS gets UNGODLY fragmented, and performs ungodly bad, even without the drive getting full.

I am not here to bash NTFS, I think it's a fine Filesystem, but it really truly does get fragmented worse than any filesystem I've ever worked with. Strangely enough, defragmenting seems to bring it right back up to snuff, so that's fine by me. I personally run a full defragment (Norton Speeddisk LOL.. if any of you remember that old beast) on my W2K box at least twice a week, and I can tell the difference.
Permalink 18/08/06 @ 23:01
Comment from:
Rob [Visitor]
· http://ru-linux-geek.livejournal.com
Leaving space to move things around actually comes up in other places in computer science too. This is a tangentially related idea in which insertion sort, the typical implementation of which is O(n^2), is made O(n log n) by keeping empty spaces in the array that it is sorting.

See http://en.wikipedia.org/wiki/Library_sort
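
(A toy sketch of the gapped-array idea - nowhere near the full library sort, and with hand-picked gaps - showing how an insertion only shifts elements until the first gap absorbs it:)

def insert_gapped(arr, x):
    # Insert x in sorted order; shifting stops at the first gap (None).
    last_smaller = -1
    for i, v in enumerate(arr):
        if v is not None and v < x:
            last_smaller = i
    carry, i = x, last_smaller + 1
    while carry is not None:
        arr[i], carry = carry, arr[i]
        i += 1

arr = [1, None, 4, None, 9, None]
insert_gapped(arr, 3)   # lands in the gap next to 1: nothing else moves
print(arr)              # [1, 3, 4, None, 9, None]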
Permalink 18/08/06 @ 23:57
Comment from:
NoahFexPayton [Visitor]
· http://payton.broke-off.com
All of you are freaking dorks.
Permalink 19/08/06 @ 00:08
Comment from:
Raseel [Visitor]
· http://osd.byethost8.com
A good and simple explanation.
The best part, I thought, about this explanation was that it raised a hundred more questions in my mind :-)
Permalink 19/08/06 @ 00:12
Comment from:
ccjx [Visitor]

From the way this article was written, it seems like the author just attended a Basic File Systems 101 class and lifted this section from his lecture notes. I find this misleading, overly simplified and partial. Just my two cents. Take this article with a pinch of salt.
Permalink 19/08/06 @ 00:24
Comment from:
Anon [Visitor]

ddaley, you run a full defrag at least twice a week?! I guess you have a lot of idle time for your machines to play with. This is something that just wouldn't be realistic for machines in an enterprise with 24/7 demand; it's also a great way to expose your systems to unnecessary risk by increasing wear on discs and increasing write activity, and therefore the likelihood of data loss and/or FS corruption in the event of a power failure.

To the rest of you: it's a nice article, written by design to be simplistic and easily understood. It's not supposed to be a full and in-depth explanation of EXT2/3 and every other FS in existence.
Permalink 19/08/06 @ 00:47
Comment from:
Lubos Lunak [Visitor]

Sorry, but claiming that Linux doesn't need any defrag tool is just nonsense in practice: http://www.kdedevelopers.org/node/2270
Permalink 19/08/06 @ 01:42
Comment from:
mt [Visitor]

[CENSORED] linux lovers, hax0r wannabes, hipi-like rebels get the [CENSORED] outta here... The whole article is giant [CENSORED]

What I mentioned in my previous post is that this explanation is all wrong. The allocation as someone mentioned above is connected with each implementation (fs driver), NTFS and ext3 are just models...
However even if linuxes had better algorithms, that would mean they don't write contigously and thus we would have small delays...

IT IS NOT EVEN OVERSIMPLIFIED, IT IS ALL WRONG.

And why the .. do you think linuxes have so many file systems? Windows has only one, but for linuxes one is newer perfect, so they have 10 FS... consider this
Permalink 19/08/06 @ 02:19
Comment from:
David Scott [Visitor]

The article may be flawed, but at least it's a good attempt at explaining. What is much more important is all of the comments after. To me, no one has provided an overall explanation (and everyone keeps bringing up Windows).

Can someone please have a go at doing a better article rather than trashing the original.

And can we have some references please

http://www.kdedevelopers.org/node/2270

was excellent and certainly a step forward in this discussion.
Permalink 19/08/06 @ 02:23
Comment from:
mt [Visitor]

it is not simplified explanation, the whole concept grew in his head.

Probably he don't understand that here are advanced algorithms involved, not just primary school arithmetics...
Permalink 19/08/06 @ 02:31
Comment from:
mt [Visitor]

And if you had windows, you would then see in disk defrager that data is spread all over the disk, just like the author says is the key advantage of ext3 and other linux fs....
Permalink 19/08/06 @ 02:34
Comment from:
solus [Visitor]

MT - I am worried that you seem to be so enraged about this!

Are you going to eventually snap and turn up at a Linux expo and spray all the nerds with hot lead?

You may think he's wrong, but do you really have to HATE him so much?
Permalink 19/08/06 @ 03:54
Comment from:
mt [Visitor]

I don't like because he is such a moron. I have nothing against linux users (if they don't spread false informations about how linux is 'holier than thou').
Just for information - I also have linux on my pc (besides xp)
Permalink 19/08/06 @ 04:11
Comment from:
Code Guy [Visitor]

FYI: I noticed years ago that Exchange 5 laid out message stores on NTFS the way Linux does; so they did at the application level in one of their products what Linux does at the driver level. I believe the NT API call is named "FileScatterGather".

I think that NTFS's way is better if your seek times are poor; seek times used to be atrocious, but speeds have increased, and as they have, the Linux way has become the better way to do it. Microsoft should have adjusted the driver to support both formats (for downlevel compatibility reasons...). Not supporting the format for faster drives may have been a political thing (pissing off their aftermarket vendors who would have to do more to make things work) or a laziness thing ("that's good enough...")
Permalink 19/08/06 @ 04:13
Comment from:
SG [Visitor]

@MT: No, you don't understand the concept of file fragmentation. I've just read your comments and they do not hold a point. First of all, Windows has more than one file system (FAT, FAT16, FAT32 and NTFS). Your last comment says data is spread all over the disk; we are talking about file fragmentation, not a whole file being at one end of the disk.
Permalink 19/08/06 @ 04:16
Comment from:
mt [Visitor]

SG: You moron, the only FS for the latest windows (XP and Vista) is NTFS.
Yeah it supports FAT (for compatibility reasons), and with custom drivers it can support other FS (like ext3), but NTFS is PRIMARY.
We should not asociate FAT systems anymore with windows because FAT is OBSOLETE!! It is similar as asociating linux with his FS in 1990...
And for your information windows is writing files contigously. I mean where did you get the idea of jumpig all over the disk for each sector???
Permalink 19/08/06 @ 05:12
Comment from:
Erik [Visitor]

In response to Matt, who claimed that ext2/3 reserves 5% of the disk space to avoid fragmentation: while you're not right, you're not wrong either.

It's true that some amount of disk space is reserved, but it's not to prevent fragmentation; rather, it is to prevent normal users from filling up the disk and thus preventing the system from operating normally. These last few % of the disk can be used only by root.
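
(Incidentally, that reserve is visible from userspace: os.statvfs reports free space twice, once counting all free blocks (f_bfree) and once counting only what non-root users may take (f_bavail). The difference is the reserved portion - a quick check on any Linux box:)

import os

st = os.statvfs("/")
reserved = st.f_bfree - st.f_bavail   # blocks free for root only
print(f"reserved for root: {reserved * st.f_frsize / 1e6:.0f} MB "
      f"({100 * reserved / st.f_blocks:.1f}% of the filesystem)")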
Permalink 19/08/06 @ 05:19
Comment from:
MDF [Visitor]

Whilst I can't vouch for the correctness of any part of the article, it was very well written. Well done for taking the time to do it.

I find it a shame that a fair number of the comments here bear derogatory and even hostile tones, which is unnecessary and unproductive. Positive criticism goes a long way.
Permalink 19/08/06 @ 05:39
Comment from:
budh [Visitor]
· http://none
"And why the .. do you think linuxes have so many file systems? Windows has only one, but for linuxes one is newer perfect, so they have 10 FS... consider this"

Well, guess it means that this is MS' best shot at making a good, versatile filesystem. A bit like having a swiss knife instead of a toolbox, I think. It may be adequate or even perfect for some tasks, but lacking for others.
Having a choice leaves it up to the user... but that's not MS' philosophy - fine by me, but don't expect everyone to agree with that decision.
Permalink 19/08/06 @ 05:56
Comment from:
Bas [Visitor]

MT: So you have NTFS on Windows and it performs OK. I've been using PC's from the early 80's and most of that time was spent using MS operating systems. I've never had the idea that the filesystem itself was a performance bottleneck until disks started getting really big, hitting the boundaries of the original FAT FS design. FAT32 was a hack, NTFS actually fixed the problems experienced when using large drives and more. NTFS is actually one of the best things MS has ever developed.
For all intents and purposes, NTFS performs OK on my desktops. I don't feel any speed difference compared to my Linux desktop. That means the FS is not the bottleneck in the desktop area and any halfway decent FS is good enough for current desktop use. I don't feel any performance difference between the many different filesystems on a Linux desktop either. So really, that's not the use case in which you'll find proof for "which FS is better" at all.
I do know that ever since I started using Windows NT on my home desktop (around 1997 I believe), the amount of apparent fragmentation amazed me. Whenever I would run a defrag utility (3rd party at the time), it would usually show huge amounts of fragmentation after only a few weeks of use. The same with Windows 2000's own defragger. So whatever NTFS does, it gets fragmented as hell over time and that's a fact. This doesn't prove anything about the impact on performance though.
Performance differences start to come up when you're doing very intensive I/O over long periods of time. Things like very busy mail servers for example. That's where filesystems like ReiserFS really shine compared to other systems. NTFS attempts to be a 'one size fits all' solution. It performs OK at this for desktops and many typical mixed server scenarios, but it'll never beat a system that was practically tailor made for a specific purpose. I for one am very happy with the fact that Linux gives me a choice in filesystems. I can choose to store caches of a zillion small files on ReiserFS. Windows doesn't give me that choice.
My vote goes to Linux based on the freedom of choice that system gives me to configure my machine for exactly the purpose it's supposed to fill.
Permalink 19/08/06 @ 05:59
Comment from:
mt [Visitor]

Fragmentation prevention costs some delays + constant defraging have some impact on performance.

I NEVER said that NTFS is faster or vice versa (anyway in my opinion ReiserFS is the fastest), but the concept which was presented by author is misleading and as not great explanation of FS background.

Every disk with every FS after is filled and then half emptied NEEDS defragmentation. How that is done is not the subject (it is based on driver implementation). NTFS is biased to one time defragmentation, other FS might defragment all the time during idle times and have some sort of fragmentation prevention system which would use some statistical information and placing rules (and that also has some inpact on performance).

Anyway linux FS might be faster, but authors explanation is wrong! (and that is the only point in my posts).
Permalink 19/08/06 @ 06:28
Comment from:
Yigster [Visitor]

Microsoft Bill must have quite a boiler room of hacks out there just waiting to smear posts like this.

Very good article. It gives a fantastic overview of the astounding difference between the mess that is called Windows and a real OS, Linux.
Permalink 19/08/06 @ 06:37
Comment from:
mt [Visitor]

Are people you stupid or what????? Please wait for a moment and think about it. File systems are not primary school arithemtics!!!!!!!!!!!!!!!

And this article is not overview, oversimplified explanation or anything else... it is just construct in authors head about how things work.
And IT IS NOT WORKING LIKE THIS.
Permalink 19/08/06 @ 07:01
Comment from:
matt [Visitor]
· http://fourthstringthirdfret.blogspot.com
MT: Vitriolic slandering and bad grammar do nothing but hurt your case.

Anyone who has used windows for long enough knows that NTFS fragmenting is getting worse, not better (XPs NTFS may have better performance, but Win2ks was MUCH better when it came to fragmentation). Anyone who knows anything about linux knows that fragmentation is simply not an issue in normal circumstances.

As this is common knowledge, his explination seems plausible. You're screaming "IT IS NOT WORKING LIKE THIS" with excessive punctuation and gradeschool grammar does nothing to alter that, even if you are right. What you are saying about advanced algorithms being the only difference is pure bull, what you are saying about one FS vs many being proof is also bull. The reason that there is only NTFS for windows is that it is the only choice they offer, nothing more, nothing less. On linux, you have ext3 for general purpose filesystems, reiser4 is geared towards desktops, XFS is geared towards servers. On the windows side, you only have NTFS, and that is geared towards performance and stability. NTFS is a very good FS, but that doesnt change the fact that it fragments at a redicules rate.
Permalink 19/08/06 @ 07:33
Comment from:
Anon [Visitor]

More flaming please.


@ author:
Interesting stuff. While, as the comments seem to indicate, it's not perfect, I don't think it was meant to be either. However, it sparked enough interest to make me read up on this on my own, thanks :)
Permalink 19/08/06 @ 07:40
Comment from:
k7k0 [Visitor]

@mt: Why there's only one FS in windows? Because you can't make one. Mono=1. Monopoly moron.
Permalink 19/08/06 @ 07:46
Comment from:
mt [Visitor]

About Monopoly: I know that there is XP driver for ext3..
MS does not have to implement driver for every FS (it is his choice) - in essence it is community's job to write drivers if they want them to have on windows platform, because ext3, ReiserFS are all developed by open-source community. So why would MS need to implement it for them??
Permalink 19/08/06 @ 08:02
Comment from:
mt [Visitor]

I don't know why am I rewriting my posts over and over again. Probably you people don't read or you are just stupid.


The author's explanation is not correct. It is oversimplified to such degree that there is no truth in it anymore.

1. Key point
What he says is that linuxes do not need to defragment their partitions.
Consider filling whole system with 1-3 MB mp3 files. Then removing 75% of them. Now you want to write some movies. You'll see that your HD is fregmented as it would be under NTFS, and you need to defragment it!!
2. Key point
Author says that windows is writing at the begging of the disk. This is incorrect as it rather spreads the files all over the disk and file writes normally are contigous.
3. Key point
Difference is in writing policy and other more advanced algorithms. Many of features are also written in drivers and as such not a subject of debate (and writing difference as described by author is implemented in driver, as is grouping smaller files and other details).
Permalink 19/08/06 @ 08:24
Comment from:
Sriram [Visitor]
· http://unixdesk.blogspot.com
Very Nice Explanation
Permalink 19/08/06 @ 08:35
Comment from:
neuwi [Visitor]

@matt: It's "explanation" and "ridiculous" :) - But I entirely agree with you. As it is the case in many other forums and blogs, I despise the comments here which are filled with personal hatred or contempt when it comes to comparisons between Windows and Linux - and the tendency shows that words like "moron, sucker, ..." are more often used by the windows fan community unfortunately.
Personally I have used both OSs for a long time now. Both have their weak and their strong points. I prefer Linux because of the filesystems - compiling hundreds of small java files is just way faster on Reiser than it is on NTFS - by factors. But if anyone else comes along and says "Linux is the worse OS for his usage or in general" - that's fine by me. For me it's not, and there's absolutely no reason for me to shout at someone else because of that. People doing this only prove that they just learned that computers can be used for things other than just gaming and that they are looking for new orientation in this area, or simply try to get attention. Well, may they live in peace... but we'd all do better if we'd just ignore them until they learned how to behave. Because neither their comments nor our reactions will do any good to the topic.

@author:
A really good introduction to the topic. As stated above, the approach with the platters is really handled differently. But I'd still look forward to a similar high-level comparison from your side between ext2/3, reiser, zfs, xfs, jfs.
Permalink 19/08/06 @ 09:16
Comment from:
myself [Visitor]

Permalink 19/08/06 @ 09:23
Comment from:
kimo [Visitor]

MT: It is obvious from the way you talk that YOU know nothing... all you can do is flame someone else. Learn proper english and get a life!
Permalink 19/08/06 @ 09:34
Comment from:
mt [Visitor]

Ok I will rephrase so it will be shorter (and you will maybe understand)

ReiserFS is [CENSORED] faster than current NTFS because of things other than explained by author.

And by current NTFS I mean current implementation of NTFS driver under XP.
Permalink 19/08/06 @ 09:57
Comment from:
PB [Visitor]

@MT:

Did you not read this from the article?

"Fragmentation thus only becomes an issue on Linux when a disk is so full that there just aren't any gaps a large file can be put into without splitting it up. So long as the disk is less than about 80% full, this is unlikely to happen."

If you didn't bother to read the article before spouting your crap, you're a retard. If you did, then you're just trolling. Either way, you make me laugh. Keep it up :)
Permalink 19/08/06 @ 10:38
Comment from:
jayKayEss [Visitor]
· http://www.jaykayess.com
I second the opinion of the commenter who'd like to see more hard references. I've long noticed that my Linux desktop slows down over time after a fresh install; hd performance could definitely be a factor. But that's true of Windows too.

And honestly, there's nothing more moronic than calling someone else "retarded." I think someone needs a time out.

@mt: You've obviously never used the ext3 driver for XP, or you'd know that it's a total PITA. Your ext3 drives don't even show up in My Computer. I can imagine that enterprise users would appreciate the ability to mount non-native filesystems transparently, but this is not to my knowledge possible.

Also, Linux has 10 filesystem drivers so that it can play nicely with everyone else. In actual practice, only two are widely used, ext3 or reiserfs.
Permalink 19/08/06 @ 11:16
Comment from:
Blissex [Visitor]
· http://www.sabi.co.uk/Notes/linuxFS.html#fsNotes
As other people have remarked, Linux filesystems suffer from considerable "scattering of blocks" issues, in different ways and for different reasons.

As Braam says, concurrent allocations are a particular problem in 'ext3' and that is somewhat mitigated in recent versions (2.6.11 and later).

I have done extensive measurements, and some informal but illustrative over time tests. Consider reading:

http://www.sabi.co.uk/Notes/linuxFS.html#fsNotes
http://www.sabi.co.uk/Notes/anno06-2nd.html#060416
http://www.sabi.co.uk/Notes/anno05-4th.html#051010
http://www.sabi.co.uk/Notes/anno05-3rd.html#050913

I think that a slowdown of seven times over some months is pretty noticeable.
Permalink 19/08/06 @ 12:28
Comment from:
Dud3 [Visitor]

mt, you seem to know alot about FS. Please wy don't you write an article that is simple so everyone can understand? Please not to techincal. I see some other people also seem to know more than the author. So I hope every body who knows some thing will pitch in to make it a good article.
Maybe a Wiki will be cool.

Thanks in advance.

!WARNING!:
My english is bad I know. I am working on it.
Permalink 19/08/06 @ 12:30
Comment from:
Keith [Visitor]
· http://keith.hostmatrix.org
It's a nice interpretation of what's going on beneath the Linux file system. What do you reckon about the difference between ext2 and ext3, and perhaps JFS? Do they work the same way as what you mentioned under Linux?

Also, I have wondered about why NTFS needs less defragging?
Permalink 19/08/06 @ 13:29
Comment from:
:( [Visitor]

"MT: Vitriolic slandering and bad grammar do nothing but hurt your case."

Damn! Beat me to the punch!

It helps to spell loser correctly when you want to talk down to people.

I enjoyed how simple it was, I don't have the time to get into the nitty gritty, I just wanted a glossed over answer and that's what I got. There were plenty of more detailed links posted by other users. You should have done the same instead of insulting everyone's intelligence, loser.
Permalink 19/08/06 @ 14:16
Comment from:
mt [Visitor]

"There were plenty of more detailed links posted by other users. You should have done the same instead of insulting everyone's intelligence"
And nobody tried to explain anything...

It's not that I want attack linux community, but posts like 'Excellent explaination! I have wondered this for years.' really bothers me, because the whole explanation is RIDICOLOUS!

I mean I can't bealive that people bealive this [CENSORED] Do you really think that entire MS is so stupid to make flaws described by author???
Permalink 19/08/06 @ 14:34
Comment from:
:( [Visitor]

If you care so damn much, then how's it really work then, teacher? All I've seen from you is half-assed lambasting about how it's incorrect, with no real explanation on your behalf. So far, the author is at 1 and you're 0 until you can express yourself better.

"Are people you stupid or what????? Please wait for a moment and think about it. File systems are not primary school arithemtics!!!!!!!!!!!!!!!"

Please make a blog, and explain it instead of being the dirty diapers on this one.
Permalink 19/08/06 @ 14:44
Comment from:
mt [Visitor]

I don't know details but I know enough to know that the explanation is ridicolous.

If you are interested in FS you can read from other more professional sources.

You know file systems is science and not something (in authors opinion) you make to read from disk!!
Permalink 19/08/06 @ 15:00
Comment from:
k [Visitor]
· http://www.krkosska.com/
Great article. Time to relax and move on.
Permalink 19/08/06 @ 15:13
Comment from:
RTRA [Visitor]
· http://www.slackware.org
I've been running Linux for 6 years and I'm much happier than I was with Windows. My filesystems survive for 2+ years, and I don't really feel any difference, which supports the asymptotic fragmentation commented on above.
I'm always working with files of all sizes, and it's not unusual for me to fill 9x% of my /home partition.

I cannot imagine having 2+-year-old filesystems with Windows, at least back in Win2k's time.
My eXPerience amounts to a total of about 30 minutes, so I can't comment on the present state of affairs.
But Win2k was _still_ so much of a PITA compared with GNU/Linux, in so many respects.

I don't think I'll ever have to pay for Microsoft software [CENSORED] in my life. I'm happy with GNU/Linux.
And as a CS freshman, I won't take a job where I have to run Windows on my work computer.

So long, M$(hit) suckers!
Permalink 19/08/06 @ 17:42
Comment from:
unknown [Visitor]

@mt:
"I don't know details but I know enough to know that the explanation is ridicolous."

Well, mt, then tell us what you know, so that we too can see why this explanation is ridiculous.

"If you are interested in FS you can read from other more professional sources."

Then please give us references.

As I see it, the author of this blog just wanted to give the simplest notes on what fragmentation is and to simplify the differences concerning fragmentation in the two filesystems mentioned. I believe he is aware that things are different from how they are described here. Perhaps he wanted to give a simple explanation to people who are not familiar with FS theory, algorithms and everything else involved in this.

And why, why are you calling people names?
This is not the way the discussion should go.
This is not the way to reach any results or conclusions.
People, are any of you actually familiar with filesystems?
You often write that you do not know the details, and yet you have strong arguments.
And a good point: where are your references?

And mt, you are so convinced, but where are your references? What are your arguments? Scientific ones, with explanations and proofs. You keep saying: "File systems are not primary school arithmetic." So give us scientific references.

Many of you would say that I should give references too. I would like you to notice that I didn't try to give more explanations, but tried to clarify the point of this article. Perhaps it is a good starting point for someone who doesn't know filesystems, because it brings up many questions that lead to exploration.

I too noticed that "the tendency shows that words like 'moron', 'sucker', ... are more often used by the Windows fan community, unfortunately".

Perhaps the reason for this is an understanding of freedom?

Permalink 19/08/06 @ 18:11
Comment from:
unknown [Visitor]

I'm afraid this has become a kind of battlefield on which people argue about which OS they like more.

Going by the article, this should be a discussion about filesystems, without ugly names.

These ugly names just prove that you are not as educated in computers as you claim to be. Perhaps you should do more research before attacking others so hard.
Permalink 19/08/06 @ 18:24
Comment from:
Sanity [Visitor]

Windows and Linux are not the same; they are different animals with different uses.

Try spending 20+ years in the IT field.
Once you do, you will see that all your comments will be useless in 2 years.
Just make the best of the tools you have, because tomorrow it all changes.
Permalink 19/08/06 @ 19:15
Comment from:
AW [Visitor]

I think many of you have missed the point.
The original article was entirely correct as far as it went.
Even Aunt Daisy could understand it and glean as much information as she probably wished to know.
Perhaps the comparison should have been between FAT/ext filesystems rather than Windows/Linux, as both are supported by Linux.
This is one advantage of Linux - one can use virtually any filesystem ever invented, to suit requirements. It comes back to horses for courses.
If one saves many small files to FAT and none of them exceeds the cluster size, there will never be any fragmentation. Archived data (never altered) probably won't fragment on any filesystem either.
The FAT filesystem has served us well and is still useful, especially for small, simple storage requirements - most of us realised its weakness when drive sizes passed 500MB and we upgraded the drive to 1GB, only to gain little free space due to the huge clusters. FAT32 didn't solve these problems, merely postponed them.
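To put rough numbers on those huge clusters, here is a small Python sketch (assuming the 32KB clusters FAT16 uses on volumes between 1GB and 2GB; the figures are only illustrative):

CLUSTER = 32 * 1024   # bytes per cluster on a large FAT16 volume

def on_disk_size(file_size):
    # Every file occupies a whole number of clusters, at least one.
    clusters = max(1, -(-file_size // CLUSTER))   # ceiling division
    return clusters * CLUSTER

for size in (1000, 10000, 40000):
    print(size, "bytes occupies", on_disk_size(size), "bytes on disk")

A 1000-byte file occupies 32768 bytes on disk - over 96% of its cluster is slack, which is exactly why upgrading to a bigger drive yielded so little usable space.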
My own experience over the years, using an average range of file sizes in normal use, is that the ext2/3 system has never let me down or shown signs of degradation due to fragmentation, and that both FAT and NTFS always end up in a fragmented mess over time.
What annoys me is the inefficient way that Windows defragmentation moves everything to the beginning of the drive, which virtually guarantees that the first time a file is changed it will become fragmented again.
In my opinion, defragging by optimising free space would be preferable. The single extra seek required to access a file at the drive's 'far end', so to speak, is preferable to many seeks jumping all over just to reassemble one file, which must also be more stressful on the drive.
Of the drive failures that I encounter, they always seem to be on Windows systems, generally with errors reported around the FAT table, or around registry entries on Win XP, where the stress is highest. The fact that they are Windows machines may be entirely due to there being more of them than the Linux boxes I am involved with, but I do wonder!

Permalink 19/08/06 @ 21:14
Comment from:
Bob [Visitor]

Great article that I can link to so I don't have to explain fragmentation to the non-technicals who just want a picture of what's going on.

Is it dead-on accurate? Of course not! No analogy ever is. That's the strength of using them - explaining a new concept through an already-understood paradigm, in an introductory fashion, lets the student envision an idea in the framework of something they already understand.

It's unfortunate the comments haven't been more of a technical discussion surrounding the file system features/limitations or around how to improve this analogy.

Keep writing - as a trainer I'll be borrowing your analogy for my classes.

Thanks!
Permalink 19/08/06 @ 22:10
Comment from:
Bob [Visitor]

Post-Script:

It probably would've been better to use a different title for the article. The current one begs for the us-versus-them of the Linux v. Windows crowd.

Maybe "Defragging a Hard Drive: Why A Drive Fragments"
Permalink 19/08/06 @ 22:13
Comment from:
Sean [Visitor]

@MT: Condescension does not strengthen your case. In fact, your juvenile attitude severely erodes your credibility. If your knowledge is so utterly comprehensive, go ahead and put up a page so that we too may be enlightened. On the other hand, if all you have to contribute is diatribe, then let me be the first to recommend that you suck it. Suck it long, and hard.
Permalink 19/08/06 @ 23:49
Comment from:
Jesgue [Visitor]

Hi, I think I can throw a bit of light on some things, so I'll try.

First, this is not only a filesystem issue. The algorithm used by the OS kernel (it does not matter whether you use this or that OS) to carry out operations on the hard disk can also be an issue.

In Linux it is called the "disk operations scheduler", or just the "elevator", and it can be changed at boot time by passing the kernel the parameter elevator=...

No multi-process-capable OS (at least I hope so - please, someone with a deeper understanding of Windows, comment on how Windows does this) is so foolish as to just process all the requests sequentially as they come. Instead, the OS waits for some operations to accumulate in a queue, then reorders them according to the geometry of the drive, in a way that reduces the seek time to the minimum possible. The same goes for multiple files, or for different fragments of the same big file. Still, I am not sure how Windows deals with this, and no one can comment for sure, because of the closed nature of that OS. We can only make unfounded statements about it, or model an approximation of the algorithm by observation, but there would still be no real proof.
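To make the elevator idea concrete, here is a minimal sketch in Python - a simplification for illustration only, with made-up names; the real kernel schedulers (and whatever Windows does internally) are far more involved:

def elevator_order(pending_sectors, head_position):
    # Serve queued requests in one sweep across the disk instead of
    # in arrival order, so the head never doubles back mid-sweep.
    ahead = sorted(s for s in pending_sectors if s >= head_position)
    behind = sorted((s for s in pending_sectors if s < head_position), reverse=True)
    return ahead + behind   # sweep outward, then pick up the rest on the way back

queue = [90, 10, 95, 12, 50]                      # arrival (FIFO) order
print(elevator_order(queue, head_position=40))    # [50, 90, 95, 12, 10]

Served in FIFO order, the head would criss-cross the platter five times; the sorted sweep needs just one pass out and one pass back.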

Of course, you can choose not to use a scheduler in Linux; servers of big files and machines used for streaming usually do not need such a thing. And those that use raw devices (no filesystem) do not need one either.

What I said above also makes a noticeable difference. I mean: the elevator algorithm can be used to order the operations when writing, but also during reads. That is why, even if there is a high level of fragmentation, the impact on performance is not as big as it was, for example, in MS-DOS. The operations are ordered so that, even with fragmentation, all the reads are done in the best possible order given the geometry of the drive. This is a good thing, since it saves a lot of head activity and thus makes the life of the drive longer. That is not to say that fragmentation is not a problem, but, as you might deduce from what I explained, the use of an elevator algorithm makes its impact on overall performance much smaller than if you don't use one.

As you might suspect, the scheduler is a separate part and at first glance has nothing to do with the filesystem driver. Though, of course, things will be a lot better if they both cooperate :P

Under normal circumstances, fragmentation is a minor problem under Linux; most of the time it is a neutral thing.

By the way, the arithmetic in a filesystem is not that complex; in fact, the most complex thing would be the statistical analysis that some filesystems can do. Still, it is not something I would call complex maths. Integrals and derivatives in a computer science degree ARE complex; discrete maths, or whatever it is called in English, is not. Just because there are massive numbers of operations, it does not mean they are complex ones. :P

I will comment on ext2/3, which I know better.

e2fs likes to divide the filesystem into blocks. Then come the groups, which are contiguous regions of blocks. Each group contains a given number of blocks and inodes. When an inode is created, Linux chooses the group with the largest number of inodes available. When it needs to write, Linux will preferentially use blocks in the same group where the inode is located; if it needs to allocate more blocks, it will always try to keep them all together.

The result of this policy is that fragmentation is generally a few blocks at most, if any. And in any case, it is almost always in the same direction, avoiding the front-to-back seeking that happens a lot in FAT and its derivatives.
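As a rough sketch of that allocation policy (a simplification under stated assumptions - the data structures are invented for the example, and the real e2fs code does much more):

def pick_group(groups):
    # New inodes go to the block group with the most free inodes.
    return max(groups, key=lambda g: g["free_inodes"])

def allocate_blocks(groups, inode_group, nblocks):
    # Prefer the group holding the file's inode; spill into other
    # groups only when that one runs out of free blocks.
    allocated = []
    for group in [inode_group] + [g for g in groups if g is not inode_group]:
        take = min(nblocks - len(allocated), group["free_blocks"])
        allocated += [(group["id"], n) for n in range(take)]
        group["free_blocks"] -= take
        if len(allocated) == nblocks:
            break
    return allocated

groups = [{"id": 0, "free_inodes": 12, "free_blocks": 100},
          {"id": 1, "free_inodes": 80, "free_blocks": 300}]
g = pick_group(groups)                 # group 1: most free inodes
print(allocate_blocks(groups, g, 5))   # all five blocks land in group 1

The point is simply that a file's blocks gravitate towards one region of the disk instead of being handed out first-come, first-served.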

Oh! Also by the way, a filesystem is a way to access files - a specification, not only a concrete driver. So, in my opinion, FAT, FAT32 and FAT16 are the same filesystem, not three separate ones as someone said, since they all share the same base design and are just patched versions to support bigger drives and longer filenames.

And yes, it is correct that on an almost-full drive performance can degrade substantially, but storage space is cheap these days, isn't it? Even if the 5% reserved-space rule was not intended to keep fragmentation low, it is certainly a side effect that is relevant to the discussion. So I don't understand why some people want to take that out of the conversation.

OS/2's HPFS is similar in this regard; it just calls the groups bands, or stripes, I think. For techies: it uses a kind of pseudo-B-tree for its directories, which periodically needs to be rebalanced. When the filesystem is almost full, that rebalancing can cause a severe slowdown if it affects a lot of different groups.

e2fs uses an array for this, so general performance is a bit worse when it comes to pure lookup speed, but in other respects it is much faster than HPFS. Still, you can use -O dir_index when formatting an ext3 filesystem to improve that; it makes ext3 use hashed B-trees to speed up lookups in large directory trees.
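For the curious, the win from dir_index is the usual array-versus-index trade-off. A purely illustrative Python sketch (real htree directories are on-disk B-trees keyed by a hash of the name, not an in-memory dict):

entries = [("file%05d" % i, 1000 + i) for i in range(100000)]   # (name, inode)

def lookup_linear(name):
    # Classic ext2 directory: scan the entries one by one - O(n).
    for entry_name, inode in entries:
        if entry_name == name:
            return inode

indexed = dict(entries)   # stands in for the hashed index

def lookup_indexed(name):
    # With an index, the lookup goes straight to the entry.
    return indexed.get(name)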

So there are two separate issues, and fragmentation is just one of them. The real problem with Windows is the scheduler, the elevator. That is the reason why, in Linux, fragmentation is not a big problem even if there are a lot of fragmented files. The other problem is the fragmentation itself, and the one to blame for that is, indeed, the filesystem driver.

I hope this helps someone a bit to better understand the issue.

Best regards :)
Permalink 20/08/06 @ 00:52
Comment from:
neuwi [Visitor]

@Jesgue: Great additional information, thanks!

@Keith: AFAIK, ext3 is the same as ext2 but with added journaling, and probably added indexes. I have installed an ext2 driver for Windows on my machine and use it to access my ext3 partition - which is also used under Linux. It works perfectly stably; it's just that under Windows, via ext2, I have no journaling - but under Linux I do.
BTW: This IS the alternative to FAT (which has a 4GB limit on file size) if you use a dual-boot machine. Highly recommended with "Paragon Mount Everything"!

@Sanity: I tend to agree with you about IT development in general, but I think with filesystems it's a bit different. Just look at the age of those beasts. FAT is 20+ years old and still in use, NTFS certainly 10+ years, and ext2 most certainly also has more than one decade on its shoulders.
Permalink 20/08/06 @ 05:12
Comment from:
gvy [Visitor]
· http://www.linux.kiev.ua
@mt

Shut up, you dummy.

There's no such thing as one-size-fits-all, and filesystems are no exception.

ext2/3 is rather low-performance but quite durable.

reiser3 is fast, especially with big dirs/small files, and quite reset-immune. Although when it goes down, it *goes down* (one might still hope for the namesys guys' help; they actually do get data back, I've seen examples).

reiser4 is a beta to me, so I don't use it.

xfs is a very robust filesystem under heavy I/O; it's just very prone to damage from power failures and kernel crashes -- thus it's a "server" filesystem, which is not surprising.

jfs... I haven't tried it.

I use at least ext3, reiser3 and xfs; quite often these are combined within a single box.

> Probably he don't understand that here are advanced
> algorithms involved
Probably you have never implemented any algorithm on a computer yourself. I'm writing this as a developer-back-then who liked non-trivial loop invariants instead of recursion. ;-)

> It is similar as asociating linux with his FS in 1990...
It's you, MT, and only you, who's the brilliant moron here. You didn't even educate yourself enough to know that there was no such thing as a "Linux filesystem" in 1990, since there was no Linux. In 1991, it started with the Minix FS.

> Every disk with every FS after is filled and then half
> emptied NEEDS defragmentation. How that is done is not
> the subject (it is based on driver implementation)
No. You apparently cannot even understand the word "average", specifically mentioned in the explanation of why a defragmented FS in a multithreaded I/O environment is a loser (just like you, same ol' block).

> About Monopoly: I know
You Know Nothing. Go educate yourself, silently.

> I don't know details but I know enough
Repeat after me: "I Know Nothing". Then, see above.

> Difference is in writing policy
The author did enough legwork to explain the difference in the drivers (search for "USB" in the article, it's around there); but even the best file layout can't fix the limitations of something ancient that is not even a File System, but rather a File Allocation Table - such as metadata kept strictly at the beginning of the disk (hence additional seeks). Are you stupid or what? Hallo?

> Do you really think that entire MS is so stupid
> to make flaws described by author???
Yep.

BTW, all of this is written for the curious users, because you, dumb "mt", are only able to spew brutal words, which just shows how much of an animal you are. Well, grow up, become a human. AND STOP WHINING LIKE AN AMERICAN.

@AW

> Perhaps the comparison should have been between Fat /
> Ext systems rather than Windows/Linux as both are Linux
> supported.
Hey, but it already is! :)

@Jesgue
> So, in my opinion, FAT, FAT32 and FAT16 are the same
> filesystem, not three separated ones as someone said,
> since they all share the same base code, and are just
> patched versions to support bigger drives and filenames.
It's a bit different. FAT12, FAT16 and FAT32 (which would more correctly be called FAT28) are about sizing limits. VFAT is about long names. It's orthogonal, TTBOMK.

> e2fs uses an array for this, so, the general performance
> is a bit worse
Actually, it is awful when there's active I/O -- together with the missing delayed allocation. At least on 2.4.x, which I still use for most servers, XFS could save the system from LA > 20 under a bunch of simultaneous reads, with writes a couple of orders of magnitude faster. Directory handling has improved in the 2.6 series, but delayed allocation is still not there, AFAIK.
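For readers wondering what delayed allocation buys, a toy sketch of the idea in Python (the allocator interface is invented for the example; this is nothing like XFS's real code):

class DelayedFile:
    # Buffer appended data in memory and only reserve on-disk blocks at
    # flush time, when the final size is known, so that one contiguous
    # extent can be requested instead of many scattered single blocks.
    def __init__(self, allocator, block_size=4096):
        self.allocator = allocator    # hypothetical block allocator
        self.block_size = block_size
        self.buffer = bytearray()

    def write(self, data):
        self.buffer += data           # no blocks allocated yet

    def flush(self):
        nblocks = -(-len(self.buffer) // self.block_size)   # ceiling division
        # One request for the whole file: with several writers appending
        # in parallel, this avoids interleaving their blocks on disk.
        return self.allocator.allocate_contiguous(nblocks)

Allocate on every write() instead, and two processes appending in parallel end up with their blocks interleaved on disk - which is exactly the fragmentation pattern complained about above.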

> the real problem with Windows is the scheduler, the elevator.
They say that *seemingly* there was some sort of elevator in Windows 95. I don't know how close to reality that is.

--
Michael Shigorin
Permalink 20/08/06 @ 05:56
Comment from:
mt [Visitor]

"AND STOP WHINING LIKE AN AMERICAN."
die [CENSORED] commie
Permalink 20/08/06 @ 06:04
Comment from:
mt [Visitor]

OK, this is probably one of my last posts.

I never said that NTFS is better; moreover, I agree that Linux filesystems are better.

But I disagree with the author's oversimplified explanation. NTFS suffers from fragmentation because it (or maybe the driver) writes files all over the disk without advanced algorithms...

Anyway, I am performing some tests, and if I fail I'll capitulate and end this game :P
Permalink 20/08/06 @ 06:33
Comment from:
niyue [Visitor]
· http://www.niyue.com
The best thing about the Linux filesystem situation, in my humble opinion, is that it supports many more different filesystems than Windows does.
Permalink 20/08/06 @ 08:05
Comment from:
Shergar [Visitor]

I agree with the comments about the title - perhaps a vfat and ext2 comparison would have kept trolls of all denominations from stoking the fire. Not too late to change it, methinks.

Remember before you post:
The aim of the article is to keep the explanation fairly simple - all you disk-platter people can get back to the lab, or create your own article on your own lab website - hey, then you can be as technical as you want in _your_ article.

Mr ".NET fixes this" - is obviously a fan of the layered approach - I need another layer coz the layer I am stuck with is mandatory but started looking dated around the turn of the century.
Permalink 20/08/06 @ 08:45
Comment from:
me [Visitor]

(I have not read all the comments above, but...)
I wonder what file-placement method would be best for my file-sharing disk?

As we all know, there are only [new disks] and [full disks]. So my "usage pattern" is that I constantly have to free up x GB to make room for the movies and TV series etc. that I am downloading right now, by deleting various older movies, series and mp3s, and sometimes directories with 10,000 fonts.
(I guess the total space for each major type of file (2-8MB mp3s, 100-300MB series, 700MB movies) is normally roughly equal, and other stuff like text files, pictures, comics and software eats up less space.)

Which method would be best in this case?

(I am not asking which filesystem; I guess you could use any of them - at least FAT doesn't force you to do it a specific way, except for the cluster size. And I'm ignoring that my other OS will of course use a different method.)

BTW, is it possible to set different methods for different disks with Linux?
Permalink 20/08/06 @ 09:14
Comment from:
Jake3988 [Visitor]

Very good job on the article. A nice, simple way of showing how hard drives work.

I've had the pleasure of running FreeBSD for about 18 months now. When the kernel boots, it tells me how fragmented my disks are. And since I got it, that figure has yet to change from 0.4%. Now I know a simple reason why.
Permalink 20/08/06 @ 09:48
Comment from:
mt [Visitor]

OK, here are the results of my tests. The intention of these tests is not a comparison of Linux and Windows filesystems, but rather (only) to disprove the author's explanation of Microsoft filesystems' bias towards fragmentation. The tests are also not intended to deny that those filesystems fragment.

The tests are fairly simple, but they should show some of the background. I used the Windows Explorer file browser to copy files across drives and Disk Defragmenter to inspect the disk layout. We assume that the defragmenter is not lying about the disk structure, that its graphical presentation is accurate enough to represent the actual structure, and that we can draw conclusions from it.

In his explanation, the author generally uses the word FAT to represent all Windows filesystems (in contrast to Linux filesystems), so NTFS, as the newest representative of Windows filesystems, is also a subject of his article.
I did not run FAT-based filesystem tests because FAT is rather obsolete and is only kept around for compatibility and portability reasons.

The tests were run on a freshly formatted 2.25 GB NTFS partition with 512-byte sectors. All tests were also performed while listening to music stored on a partition on the same disk.

1. Step
I copied 2 movies (700 MB) simultaneously onto the disk. Disk Defragmenter (DD) reported no fragmentation in either file. The files were also not placed at the beginning, as the author's model predicts.


2. Step
I deleted both films and wrote 1.15 GB of mp3 files with an average size of 5 MB. DD showed that only one file was fragmented, into two parts. All files were also written contiguously.


3. Step
I deleted more than half of the mp3 files, in random order. See the result.


4. Step
I copied one film. As DD showed, the movie wasn't written at the beginning, in between the mp3 files, but was instead written to the empty part of the disk. (This also contradicts the author's predictions.)
The movie file was not fragmented.


5. Step
I copied another movie to the disk. This file was slightly fragmented, as there was no other free contiguous space. The algorithm behind this maybe has flaws, or makes some compromises, because this time it didn't take the largest contiguous free space. The movie file was written in 9 fragments.

(Red sections indicate parts that are fragmented)

6. Step
I added one last file, and that one was really fragmented, as Windows didn't have any choice in selecting space. The movie file was fragmented into 32 parts.



The only conclusion we can draw is that the author's explanation is ridiculous and in some ways silly, because it in no way represents how modern filesystems work. I'll repeat once again that NTFS is biased towards fragmentation, but not for the reasons explained by the author. (And thus the article is misleading.)
I would also add that this bias is the result of the less advanced implementation of the NTFS driver and write scheduler in Windows, as the writing decisions are implemented in software, not in the NTFS specification.
Permalink 20/08/06 @ 09:59
Comment from:
SlurpFest [Visitor]

@mt: why do you care so much? Don't you feel silly getting so upset over a set of comments on someone's blog? If you're reading this message and feeling the heat to flame back, then you're checking this site too often - there are billions of websites out there, but you have to win a flame war at this one, or what? You'll feel defeated? Inadequate?

Honestly, flame wars are the unfortunate sludge of the Internet, the bastard child of the productive, academic discussions it was designed for. You worked so hard on your last post, with all the illustrations; why not post your own website about disk fragmentation and call it a day? Here, you went to all this effort just so you could reach the banal conclusion that the author was "ridiculous".

The next time you feel compelled to surf to a site just to throw flames around, as if it's a game you are compelled to win, remember that any such "victory" means nothing in the long run, and all you're doing is wasting your life away. Go ride a bike, or lift weights, or meet a girl or something.
Permalink 20/08/06 @ 10:32
Comment from:
catlett [Visitor]

Thank you to the author.
This is the first time I have seen a simple explanation of defragmentation. There is a lot of crap going on in the responses, but I just wanted to say thanks.
For the highly educated like mt this is a horrible explanation, but as a non-educated computer novice, I appreciate the simple analogy.
It is a shame that you took the time to try to educate, while others who "say" they know more have not taken the time to rebut you properly - they just use profanity.

Just a note: I have been a Windows user since 98 and a Linux user since last year. I do not see how this became an OS battle, but since it has become one, I will make just one comment. In Windows I had to run a defragmenter, a registry cleaner and virus/spyware scanners, and reboot after any modification to my system. With Linux I do none of those things. My last installation of Ubuntu Linux has been running for 1 year. I have no firewall, no antivirus application, no defragmenter, I have not "had to reboot" the entire time, and my system is as responsive, fast and virus-free as on the first day of the install.

P.S. Even if Linux weren't better, I would still stay in the Linux world. Windows is full of mt's: ugly, ignorant white trash with a bit of knowledge but no common courtesy towards others.
"Vulgarity is a sign that the speaker doesn't have anything intelligent to say."
Permalink 20/08/06 @ 10:39
Comment from:
oneandoneis2 [Member]

Evening mt, thanks for another of your highly entertaining responses - you've brightened up my weekend considerably with your hilarious comments.

Step 2's accompanying picture is a lovely example of exactly the issue I highlighted in the article - in excess of two hundred files, all crammed together into one big (and one small) lump at the "start" of the disk.

Steps 3-6 baffle me with their irrelevance, however - none of them seems to address fragmentation due to changing file sizes, which is what the original post talked about.

They do show beautifully why even a well spread-out filesystem such as the one you created in step 3 will suffer from fragmentation when it gets too full, as I mentioned happens to Linux filesystems in the post.

So overall: Well done on your demonstration as to the accuracy of my post. Pity about the excess irrelevant images, but I'm sure practice will help you on that score.

Lastly, whilst this blog isn't really intended to be classed as a "child-friendly" site, I take a dim view of excessive swearing, so kindly moderate your post language, or I will do it for you.
Permalink 20/08/06 @ 10:51
Comment from:
mt [Visitor]

1. I hate hippie-like rebel Linux zealots for their very existence. And I can swear if they don't listen. If you think this is an indication of my IQ, that is your problem.

I also hate the average Windows users who fall for their propaganda.

I know these posts are not worth the time; I just have too much time.

I have nothing against Linux (even if I don't like the GNU licenses). I just don't like people with the author's attitude, and if he has the right to be extreme, then I will be just like him. The problem is that people believe Linux supporters even if they are [CENSORED]

I mean, talking about how your platform is superior doesn't bother me; just don't attack other platforms with untrue facts. I can't stand that attitude.
Permalink 20/08/06 @ 11:04
Comment from:
mt [Visitor]

oneandoneis2: If you have time, replace the bad language with *... (I believe I don't have access, because I am not registered).

Also, most bigger files are expected not to be resized. Nearly all file types reserve the space required and then fill it.

And mp3 files certainly fall into that group. The point of my post was that Windows is not stupid in allocating space. Also, what would happen if Linux spread mp3 files all over the disk? How would you then write movies without heavy fragmentation? There are always compromises. How those algorithms are implemented is a science.
Permalink 20/08/06 @ 11:18
Comment from:
mt [Visitor]

About fragmentation problems on Windows: I have Windows installed, and I don't think my disks are too fragmented.

You just need some common sense - use separate partitions for the system, for user files (and maybe one for frequently changing content like p2p), and for swap.
Permalink 20/08/06 @ 11:25
Comment from:
unknown [Visitor]

@mt: please educate yourself a little before defending NTFS and FAT so blindly. No one says that there is no fragmentation in ext2. It's just that there is less fragmentation in ext2 than in FAT.

Some references about NTFS, FAT, ReiserFS, fragmentation, etc.:

http://www.namesys.com/

http://forums.gentoo.org/viewtopic-p-3081971.html

http://www.pcguide.com/ref/hdd/file/ntfs/relFrag-c.html

http://www.digit-life.com/articles/ntfs/

http://forums.whirlpool.net.au/forum-replies-archive.cfm/389441.html

http://www.biznix.org/whylinux/windows/fragment.html

http://cbbrowne.com/info/defrag.html

And mt, don't be too smart!
Permalink 20/08/06 @ 12:10
Comment from:
unknown [Visitor]

"Inside the Windows NT File System" the book is written by Helen Custer, NTFS is architected by Tom Miller with contributions by Gary Kimura, Brian Andrew, and David Goebel, Microsoft Press, 1994, an easy to read little book, they fundamentally disagree with me on adding serialization of I/O not requested by the application programmer, and I note that the performance penalty they pay for their decision is high, especially compared with ext2fs. Their FS design is perhaps optimal for floppies and other hardware eject media beyond OS control. A less serialized higher performance log structured architecture is described in [Rosenblum and Ousterhout]. That said, Microsoft is to be commended for recognizing the importance of attempting to optimize for small files, and leading the OS designer effort to integrate small objects into the file name space. This book is notable for not referencing the work of persons not working for Microsoft, or providing any form of proper attribution to previous authors such as [Rosenblum and Ousterhout]. Though perhaps they really didn't read any of the literature and it explains why theirs is the worst performing filesystem in the industry...."
Permalink 20/08/06 @ 12:16
Comment from:
Sebastian [Visitor]

For your info:
I work as a consultant for a software firm. The software does spam analysis, and we are having trouble on Windows because of disk fragmentation; the system writes about 2 million files of around 10 KB per day.
If you look at the Fragmentation Details, it's horrible: a 10 KB file is fragmented into 6 parts, and a 2 MB file into 1500 parts.
The partition is 80 GB, with 20 GB free.
Opening a TXT file with "hello world" inside takes about 15 seconds.

We also have a Linux version of the software, and it doesn't have this problem.

I don't favour one system over the other.
Both have their pros.
Permalink 20/08/06 @ 12:21
Comment from:
mt [Visitor]

2 million 10 KB files is a special case... and I don't deny that for such a situation the best choice is probably ReiserFS.

Anyway, I see that posting here is useless, since nobody really reads my posts, so I'll have to quit posting...
Permalink 20/08/06 @ 13:08
Comment from:
Lphant [Visitor]
· http://www.lphantes.com/
LOL, is there any Spanish translation?
Permalink 20/08/06 @ 13:38
Comment from:
Tristan [Visitor]
· http://lijie.org
Good job!
Permalink 20/08/06 @ 18:36
Comment from:
Jason [Visitor]

This was quite an interesting read including all the comments ;) Thanks for the post oneandoneis2!
Permalink 20/08/06 @ 19:52
Comment from:
Ding [Visitor]

Good Stuff! :)
Permalink 20/08/06 @ 20:44
Comment from:
ext3 [Visitor]

hey mt, do an experiment with a Linux filesystem too, and post it, since you have the time to experiment
Permalink 20/08/06 @ 23:04
Comment from:
Jade [Visitor]
· http://linux.coconia.net/
Hey, some of you dudes seem to know what you are talking about.

Care to cast an eye over some of the articles at

http://linux.coconia.net/

and make a few comments on the bulletin board there? I would really appreciate it.
Permalink 21/08/06 @ 00:18
Comment from:
escape [Visitor]

Did everyone write in while they were wasted? I'm thoroughly inebriated; however, I can still manage to spell properly, so perhaps I'm not intoxicated enough. I have to get some of whatever certain folks who posted here seem to be on.

Anyway, my opinion is just this: if you've never experienced or worked with heavily used servers, you've never truly experienced disk corruption and fragmentation first-hand, so maybe you shouldn't be allowed an opinion. Whether or not this article is one hundred percent correct, fragmentation is a problem on every OS I have had the displeasure of fixing, and it appears to be a more rampant problem on non-UNIX/UNIX-like systems. I've observed that fragmentation gets bad enough on some heavily "used" servers blessed with NTFS-formatted drives that it inexplicably leads to file corruption. Annoying? Oh, very much so.

If anything, this article and the resulting comments expose that there is something fundamentally wrong with every filesystem, no matter how much the fanboys protest. Kudos to that. Enough with the biased malarkey; there's no need to brag about things related to this topic - that's beside the point.

Missed the point? Enough of the pissing contest. Let's fix what's wrong.
Permalink 21/08/06 @ 01:21
Comment from:
kia [Visitor]

@mt - the test you performed kinda missed one of the main points of the article. If you had copied the two videos you mentioned, they would be placed sequentially on the drive; if you then made the first file bigger (i.e. added more content), that file would fragment. With the Linux-style filesystems described, this is less likely to happen, as space is left between files - I think this is one of the main points the author was attempting to demonstrate.
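To see that in miniature, here is a toy model in Python, in the spirit of the article's ASCII disk (the packing rule is invented for illustration and is not any real filesystem's policy):

disk = {}   # block number -> (file name, chunk number)

def write(name, start, nchunks):
    placed, block = [], start
    for chunk in range(nchunks):
        while block in disk:          # skip blocks already in use
            block += 1
        disk[block] = (name, chunk)
        placed.append(block)
        block += 1
    return placed

print(write("video1", 0, 4))   # [0, 1, 2, 3]
print(write("video2", 4, 4))   # [4, 5, 6, 7] - packed right behind it
print(write("video1", 4, 2))   # [8, 9] - the appended data cannot stay
                               #          contiguous: video1 is now in two pieces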
Permalink 21/08/06 @ 01:24
Comment from:
-sianz [Visitor]

I'm amused by how many people have "approved" of such a grossly wrong article.


This is the sort of article I would use to "explain" things to a computer-illiterate person just to kill their curiosity (just like saying the PC got broken, instead of saying the OS is infected with worms/viruses).

Technically, the article is wrong... people supporting the article are clueless about how a basic FS works.

NTFS, ext2, ext3, ufs and jfs are all FS standards, and since they are standards, it's up to the OS implementer to determine what the file read/write behaviour is like.

And instead of asking someone to "write up" a new article... get off your asses and google for those FS implementation articles... the internet community is not here to spoon-feed you info. The info is out there; go look for it.

Sheesh. Bunch of lazy morons.
Permalink 21/08/06 @ 01:55
Comment from:
GunstarCowboy [Visitor]
· http://www.eyequake-studios.com
A brilliant explanation, well presented. What about writing some more?

/dev/hda1...2...etc


VFS?
Permalink 21/08/06 @ 03:17
Comment from:
mt [Visitor]

-sianz [Visitor] - that's what I have been saying all along. (They just don't listen.)

Kia - the presentation is accurate enough. It shows that if I copy two large files simultaneously (and while listening to music), the files will not get fragmented.
Yes, I suppose I could close and open a file handle for every sector; I could play with preallocated/incrementally allocated space, sparse files and all that stuff, but normal file copying (as presented) is probably the most frequent file operation (for the average user).
Also, in Step 4 you can see that the OS has some intelligence in choosing space.
Permalink 21/08/06 @ 04:41
Comment from:
rr [Visitor]

Very nicely put. Though most of us had an idea already, it's more firmly embedded in our brains now, thanks to your pictures!

As for all the idiots with lame and negative comments: either they are MS shills, like another reader said, or over-their-heads sysadmins who think they know too damn much. Obviously this is a for-dummies explanation... and go get a shave already!

Permalink 21/08/06 @ 07:36
Comment from:
NoSalt [Visitor]

Awesome read ... extremely clear and easy to understand. I was already familiar with why this is so, but I loved your little hard drive and examples.

If you don't teach, you should!!!
Permalink 21/08/06 @ 12:08
Comment from:
Jake [Visitor]
· http://ns.tan-com.com/
Great explanation!
Permalink 21/08/06 @ 16:22
Comment from:
Jesgue [Visitor]

@gvy
>@Jesgue
>> So, in my opinion, FAT, FAT32 and FAT16 are the same
>> filesystem, not three separated ones as someone said,
>> since they all share the same base code, and are just
>> patched versions to support bigger drives and filenames.
>A bit other way. FAT12, FAT16 and FAT32 (which would be more correct to be called FAT28) are about sizing limits. VFAT is about long names. It's orthogonal TTBOMK.

About vfat you are right: all the vfat implementations that I know of can handle any variant of FAT; it does not matter if it's 12, 16 or 32.

There are really a few differences between 16 and 32; I shortened the story a bit. For example, FAT32 can use smaller clusters (a side effect of the support for a wider range of possible indirections). It also supposedly keeps some kind of backup of the critical structures of the boot sector... though I have never seen any stability improvement in practice :P It also removed that weird limit on the number of files in the root of the drive. But I would not consider that a new feature, rather a bug fix, hehehe.

Besides that, all the FAT variants are called FAT for a reason. FAT is just a file allocation table, and any FAT filesystem, regardless of the size of the registers it uses for indirection (sorry if that is not the correct term in English; I'm not a native speaker), is the same filesystem: it works the same and can do the same things. It does not matter that a better implementation could be made (in fact, vfat is far superior); the filesystem will still be far inferior to all the other filesystems in use nowadays that I know of :P

Thanks for the info on the 28 thingie, hehe. I am not a fan of FAT and really don't know too much about its internals. ;)


Besides that, maybe some people could revisit what I wrote above about the elevator algorithm and stop fighting about how cool this or that filesystem is. A good part of the performance is determined by the elevator, or I/O scheduler, which is what sorts the I/O operations according to the disk layout so that seeking is reduced to the minimum. Fragmentation has an effect, but in most cases it is not the key factor, unless it is extreme and the filesystem is reiser 3.x :P
Permalink 21/08/06 @ 18:59
Comment from:
Jade [Visitor]

Care to cast an eye over some of the articles at

http://linux.coconia.net/

Thanks to the dude who helped out.
Permalink 21/08/06 @ 20:18