Discussion: worse than expected compression ratios with -o compress
Jim Faulkner
2010-01-16 16:16:50 UTC
I have a mysql database which consists of hundreds of millions, if not
billions, of Usenet newsgroup headers. This data should be highly
compressible, so I put the mysql data directory on a btrfs filesystem
mounted with the compress option:
/dev/sdi on /var/news/mysql type btrfs (rw,noatime,compress,noacl)

However, I'm not seeing the kind of compression ratios that I would expect
with this type of data. FYI, all my tests are using Linux 2.6.32.3.
Here's my current disk usage:
Filesystem Size Used Avail Use% Mounted on
/dev/sdi 302G 122G 181G 41% /var/news/mysql

and here's the actual size of all files:
delta-9 mysql # pwd
/var/news/mysql
delta-9 mysql # du -h --max-depth=1
747K ./mysql
0 ./test
125G ./urd
125G .
delta-9 mysql #

As you can see, I am only shaving off 3 gigs out of 125 gigs worth of what
should be very compressible data. The compressed data ends up being
around 98% of the size of the original data.

To contrast, rzip can compress a database dump of this data to around 7%
of its original size. This is an older database dump, which is why it is
smaller. Before:
-rw------- 1 root root 69G 2010-01-15 14:55 mysqlurdbackup.2010-01-15
and after:
-rw------- 1 root root 5.2G 2010-01-16 05:34 mysqlurdbackup.2010-01-15.rz

Of course it took 15 hours to compress the data, and btrfs wouldn't be
able to use rzip for compression anyway.

However, I still would expect to see better compression ratios than 98% on
such data. Are there plans to implement a better compression algorithm?
Alternatively, is there a way to tune btrfs compression to achieve better
ratios?

thanks,
Jim Faulkner
Please CC my e-mail address on any replies.
Sander
2010-01-17 14:34:22 UTC
Hello Jim,
Post by Jim Faulkner
To contrast, rzip can compress a database dump of this data to
around 7% of its original size. This is an older database dump,
-rw------- 1 root root 69G 2010-01-15 14:55 mysqlurdbackup.2010-01-15
-rw------- 1 root root 5.2G 2010-01-16 05:34 mysqlurdbackup.2010-01-15.rz
Of course it took 15 hours to compress the data, and btrfs wouldn't
be able to use rzip for compression anyway.
The difference between a live MySQL database and a dump of that database
is that the dump is text, while the database files are binary.

A fair comparison would be to compress the actual database files.

With kind regards, Sander
--
Humilis IT Services and Solutions
http://www.humilis.net
Jim Faulkner
2010-01-18 14:46:42 UTC
Post by Sander
A fair comparison would be to compress the actual database files.
You are absolutely right. I've run some more tests, this time against the
database files themselves.

This time there are 73 GB worth of database files:
delta-9 mysql # du -h
747K ./mysql
0 ./test
73G ./urd
73G .

btrfs compresses this to 70 GB:
Filesystem Size Used Avail Use% Mounted on
/dev/sdi 187G 70G 117G 38% /var/news/mysql

which is a 96% compression ratio. I then tried compressing /var/news/mysql
with some popular compressors.

zip -5, zip -9, and gzip all ended up producing archives that are roughly
11 GB:
delta-9 btrfs-mysql-test-jim # ls -lh btrfs-mysql-test.gz
btrfs-mysql-test-9.zip btrfs-mysql-test-5.zip
-rw-r--r-- 1 jim jim 11G 2010-01-18 01:50 btrfs-mysql-test-5.zip
-rw-r--r-- 1 jim jim 11G 2010-01-18 06:56 btrfs-mysql-test-9.zip
-rw-r--r-- 1 jim jim 11G 2010-01-18 02:17 btrfs-mysql-test.gz
delta-9 btrfs-mysql-test-jim #

This is a 15% compression ratio.

bzip2 produced an 8 GB archive, which is an 11% compression ratio:
-rw-r--r-- 1 jim jim 8.0G 2010-01-18 09:08 btrfs-mysql-test.bz2

7z produced a 6.1 GB archive, which is an 8% compression ratio:
-rw-r--r-- 1 jim jim 6.1G 2010-01-18 07:36 btrfs-mysql-test.7z

Finally, all of these are just command-line compressors, so I wanted to get
a test in with actual disk compression software. I haven't had a DOS box
running DoubleSpace since I was rather young, so I plugged an extra drive
into a Windows Vista machine, formatted it with NTFS, and enabled
compression via the drive properties menu. I then copied the mysql data
directory onto the compressed NTFS drive:
[screenshot]

The end result was 72.4 GB of data using 29.5 GB of disk space:
[screenshot]

This is a 41% compression ratio.

So, in summary, the compression ratios are:
btrfs: 96%
zip/gzip: 15%
bzip2: 11%
7z: 8%
NTFS: 41%

I think most would agree that btrfs is doing a rather poor job of
compressing my data, even compared to gzip and NTFS compression.
Thoughts?
Jim Faulkner
2010-01-18 16:06:23 UTC
Post by Jim Faulkner
btrfs: 96%
zip/gzip: 15%
bzip2: 11%
7z: 8%
NTFS: 41%
One minor follow-up. I used "mysql < mysqldump.sql" to initially populate
the database on the btrfs filesystem. I thought that this might affect the
compression ratio, since mysql probably keeps changing the same blocks as
the sql import progresses. Compare this to the NTFS test, in which I
simply copied the populated database files onto the filesystem.

So, I created a new btrfs filesystem, mounted it with the compress option,
and simply copied my database files onto the filesystem, just like I did
in the NTFS test. This did make a small difference:
delta-9 mysql # du -h
0 ./btrfs-mysql-test/test
73G ./btrfs-mysql-test/urd
743K ./btrfs-mysql-test/mysql
73G ./btrfs-mysql-test
73G .

Filesystem Size Used Avail Use% Mounted on
/dev/sdi 187G 67G 120G 36% /var/news/mysql

This shaved another 3 GB off of the disk usage, so btrfs has now achieved
a 91% compression ratio.

This is still rather poor compared to the 41% compression ratio achieved
by NTFS. Surely btrfs should be better at compressing this data.
Josef Bacik
2010-01-18 14:12:40 UTC
Post by Jim Faulkner
I have a mysql database which consists of hundreds of millions, if not
billions of Usenet newsgroup headers. This data should be highly
compressable, so I put the mysql data directory on a btrfs filesystem
/dev/sdi on /var/news/mysql type btrfs (rw,noatime,compress,noacl)
However, I'm not seeing the kind of compression ratios that I would
expect with this type of data. FYI, all my tests are using Linux
Filesystem Size Used Avail Use% Mounted on
/dev/sdi 302G 122G 181G 41% /var/news/mysql
delta-9 mysql # pwd
/var/news/mysql
delta-9 mysql # du -h --max-depth=1
747K ./mysql
0 ./test
125G ./urd
125G .
delta-9 mysql #
As you can see, I am only shaving off 3 gigs out of 125 gigs worth of
what should be very compressable data. The compressed data ends up being
around 98% the size of the original data.
To contrast, rzip can compress a database dump of this data to around 7%
of its original size. This is an older database dump, which is why it is
-rw------- 1 root root 69G 2010-01-15 14:55 mysqlurdbackup.2010-01-15
-rw------- 1 root root 5.2G 2010-01-16 05:34 mysqlurdbackup.2010-01-15.rz
Of course it took 15 hours to compress the data, and btrfs wouldn't be
able to use rzip for compression anyway.
However, I still would expect to see better compression ratios than 98%
on such data. Are there plans to implement a better compression
algorithm? Alternatively, is there a way to tune btrfs compression to
achieve better ratios?
Currently the only compression algorithm we support is gzip, so try gzipping
your database to get a better comparison. The plan is to eventually support
other compression algorithms, but currently we do not. Thanks,

Josef
Chris Mason
2010-01-18 21:29:51 UTC
Post by Josef Bacik
Post by Jim Faulkner
I have a mysql database which consists of hundreds of millions, if not
billions of Usenet newsgroup headers. This data should be highly
compressable, so I put the mysql data directory on a btrfs filesystem
/dev/sdi on /var/news/mysql type btrfs (rw,noatime,compress,noacl)
However, I'm not seeing the kind of compression ratios that I would
expect with this type of data. FYI, all my tests are using Linux
Filesystem Size Used Avail Use% Mounted on
/dev/sdi 302G 122G 181G 41% /var/news/mysql
delta-9 mysql # pwd
/var/news/mysql
delta-9 mysql # du -h --max-depth=1
747K ./mysql
0 ./test
125G ./urd
125G .
delta-9 mysql #
As you can see, I am only shaving off 3 gigs out of 125 gigs worth of
what should be very compressable data. The compressed data ends up being
around 98% the size of the original data.
To contrast, rzip can compress a database dump of this data to around 7%
of its original size. This is an older database dump, which is why it is
-rw------- 1 root root 69G 2010-01-15 14:55 mysqlurdbackup.2010-01-15
-rw------- 1 root root 5.2G 2010-01-16 05:34 mysqlurdbackup.2010-01-15.rz
Of course it took 15 hours to compress the data, and btrfs wouldn't be
able to use rzip for compression anyway.
However, I still would expect to see better compression ratios than 98%
on such data. Are there plans to implement a better compression
algorithm? Alternatively, is there a way to tune btrfs compression to
achieve better ratios?
Currently the only compression algorithm we support is gzip, so try gzipp'ing
your database to get a better comparison. The plan is to eventually support
other compression algorithms, but currently we do not. Thanks,
The compression code backs off compression pretty quickly if parts of
the file do not compress well. This is another way of saying it favors
CPU time over the best possible compression. If gzip ends up better
than what you're getting from btrfs, I can give you a patch to force
compression all the time.
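
To make the back-off concrete, here is a minimal standalone sketch of
the idea (this is not the btrfs code, and every identifier in it is
made up for illustration):

/* Sketch of the "back off and flag the inode" heuristic described above.
 * Not the btrfs implementation; all names are invented. */
#include <stdbool.h>
#include <stdio.h>

#define INODE_NOCOMPRESS 0x1   /* stand-in for BTRFS_INODE_NOCOMPRESS */

struct fake_inode { unsigned int flags; };

/* Stand-in for the real compressor (zlib over extents in btrfs):
 * pretend the chunk shrank by the given ratio. */
static size_t pretend_compress(size_t len, double ratio)
{
	return (size_t)(len * ratio);
}

static void write_chunk(struct fake_inode *inode, size_t len,
			double ratio, bool force)
{
	/* A flagged inode is never compressed again; forcing only prevents
	 * the flag from being set in the first place. */
	if (inode->flags & INODE_NOCOMPRESS) {
		printf("%zu bytes written uncompressed (inode flagged)\n", len);
		return;
	}

	size_t out = pretend_compress(len, ratio);
	if (out >= len) {
		/* The chunk did not shrink: store it as-is and, unless we
		 * are forcing, flag the inode so later writes do not try. */
		if (!force)
			inode->flags |= INODE_NOCOMPRESS;
		printf("%zu bytes stored uncompressed\n", len);
	} else {
		printf("%zu bytes stored as %zu\n", len, out);
	}
}

int main(void)
{
	struct fake_inode plain = { 0 }, forced = { 0 };

	/* Plain -o compress: one incompressible chunk poisons the file. */
	write_chunk(&plain, 131072, 1.0, false);
	write_chunk(&plain, 131072, 0.2, false);  /* skipped, flag already set */

	/* With forced compression: the bad chunk is stored as-is, but the
	 * next chunk is still compressed. */
	write_chunk(&forced, 131072, 1.0, true);
	write_chunk(&forced, 131072, 0.2, true);
	return 0;
}

The point is that under plain -o compress a single chunk that fails to
shrink switches the whole file to uncompressed writes from then on,
which is what the database files here appear to be hitting.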

-chris

Jim Faulkner
2010-01-18 22:11:53 UTC
Post by Chris Mason
Post by Josef Bacik
Currently the only compression algorithm we support is gzip, so try gzipp'ing
your database to get a better comparison. The plan is to eventually support
other compression algorithms, but currently we do not. Thanks,
The compression code backs off compression pretty quickly if parts of
the file do not compress well. This is another way of saying it favors
CPU time over the best possible compression. If gzip ends up better
than what you're getting from btrfs, I can give you a patch to force
compression all the time.
Yes, gzip compresses much better than btrfs. I'd greatly appreciate a
patch to force compression all the time.

It would be nice if such an ability were merged in the mainline. Perhaps
there could be a mount option or tunable parameter to force compression?
Chris Mason
2010-01-20 16:30:11 UTC
Post by Jim Faulkner
Post by Chris Mason
Post by Josef Bacik
Currently the only compression algorithm we support is gzip, so try gzipp'ing
your database to get a better comparison. The plan is to eventually support
other compression algorithms, but currently we do not. Thanks,
The compression code backs off compression pretty quickly if parts of
the file do not compress well. This is another way of saying it favors
CPU time over the best possible compression. If gzip ends up better
than what you're getting from btrfs, I can give you a patch to force
compression all the time.
Yes, gzip compresses much better than btrfs. I'd greatly appreciate
a patch to force compression all the time.
It would be nice if such an ability were merged in the mainline.
Perhaps there could be a mount option or tunable parameter to force
compression?
Let's start by making sure that this patch works for you. Just apply it
(2.6.32 or 2.6.33-rc) and then mount -o compress-force. Normally, mount
-o compress will set a flag on a file after it fails to get good
compression for that file.

With this patch, mount -o compress-force will still honor that flag,
but it will skip setting it during new writes. This is a long way of
saying you'll have to copy your data files to new files for the new
mount option to do anything.

Please let me know if this improves your ratios

-chris

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 9f806dd..2aa8ec6 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1161,6 +1161,7 @@ struct btrfs_root {
 #define BTRFS_MOUNT_SSD_SPREAD		(1 << 8)
 #define BTRFS_MOUNT_NOSSD		(1 << 9)
 #define BTRFS_MOUNT_DISCARD		(1 << 10)
+#define BTRFS_MOUNT_FORCE_COMPRESS	(1 << 11)
 
 #define btrfs_clear_opt(o, opt)	((o) &= ~BTRFS_MOUNT_##opt)
 #define btrfs_set_opt(o, opt)	((o) |= BTRFS_MOUNT_##opt)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index b330e27..f46c572 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -483,7 +483,8 @@ again:
 		nr_pages_ret = 0;
 
 		/* flag the file so we don't compress in the future */
-		BTRFS_I(inode)->flags |= BTRFS_INODE_NOCOMPRESS;
+		if (!btrfs_test_opt(root, FORCE_COMPRESS))
+			BTRFS_I(inode)->flags |= BTRFS_INODE_NOCOMPRESS;
 	}
 	if (will_compress) {
 		*num_added += 1;
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 3f9b457..8a1ea6e 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -66,7 +66,8 @@ enum {
 	Opt_degraded, Opt_subvol, Opt_device, Opt_nodatasum, Opt_nodatacow,
 	Opt_max_extent, Opt_max_inline, Opt_alloc_start, Opt_nobarrier,
 	Opt_ssd, Opt_nossd, Opt_ssd_spread, Opt_thread_pool, Opt_noacl,
-	Opt_compress, Opt_notreelog, Opt_ratio, Opt_flushoncommit,
+	Opt_compress, Opt_compress_force, Opt_notreelog, Opt_ratio,
+	Opt_flushoncommit,
 	Opt_discard, Opt_err,
 };
 
@@ -82,6 +83,7 @@ static match_table_t tokens = {
 	{Opt_alloc_start, "alloc_start=%s"},
 	{Opt_thread_pool, "thread_pool=%d"},
 	{Opt_compress, "compress"},
+	{Opt_compress_force, "compress-force"},
 	{Opt_ssd, "ssd"},
 	{Opt_ssd_spread, "ssd_spread"},
 	{Opt_nossd, "nossd"},
@@ -173,6 +175,11 @@ int btrfs_parse_options(struct btrfs_root *root, char *options)
 			printk(KERN_INFO "btrfs: use compression\n");
 			btrfs_set_opt(info->mount_opt, COMPRESS);
 			break;
+		case Opt_compress_force:
+			printk(KERN_INFO "btrfs: forcing compression\n");
+			btrfs_set_opt(info->mount_opt, FORCE_COMPRESS);
+			btrfs_set_opt(info->mount_opt, COMPRESS);
+			break;
 		case Opt_ssd:
 			printk(KERN_INFO "btrfs: use ssd allocation scheme\n");
 			btrfs_set_opt(info->mount_opt, SSD);
Jim Faulkner
2010-01-21 18:16:31 UTC
Post by Chris Mason
Please let me know if this improves your ratios
It most certainly does! It also greatly reduced the time required to copy
the data to my (not very fast) disk. All my testing was done on 2.6.32.4.
The line numbers in your patch were a little off for 2.6.32.4, but I did
manage to apply it cleanly. Here are the results of my testing:

First let's see the results with plain old mount -o compress:
delta-9 ~ # mkfs.btrfs /dev/sdi

WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

fs created label (null) on /dev/sdi
nodesize 4096 leafsize 4096 sectorsize 4096 size 186.31GB
Btrfs Btrfs v0.19
delta-9 ~ # mount -o noacl,compress,noatime /dev/sdi /var/news/mysql
delta-9 ~ # time cp -a /nfs/media/tmp/btrfs-mysql-test /var/news/mysql

real 57m6.983s
user 0m0.807s
sys 1m28.494s
delta-9 ~ # cd /var/news/mysql
delta-9 mysql # du -h --max-depth=1
73G ./btrfs-mysql-test
73G .
delta-9 mysql # df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sdi 187G 67G 120G 36% /var/news/mysql

So, with plain mount -o compress, it took 57 minutes to copy the data to
my btrfs disk, and it achieved a 92% compression ratio.

Now let's test with mount -o compress-force. First I'll create a new
btrfs filesystem so we're getting a fresh start:

delta-9 ~ # mkbtrfs /dev/sdi

WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

fs created label (null) on /dev/sdi
nodesize 4096 leafsize 4096 sectorsize 4096 size 186.31GB
Btrfs Btrfs v0.19
delta-9 ~ # mount -o noatime,noacl,compress-force /dev/sdi /var/news/mysql
delta-9 ~ # time cp -a /nfs/media/tmp/btrfs-mysql-test /var/news/mysql/

real 14m45.742s
user 0m0.547s
sys 1m30.551s
delta-9 ~ # df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sdi 187G 14G 173G 8% /var/news/mysql
delta-9 ~ # cd /var/news/mysql
delta-9 mysql # du -h --max-depth=1
73G ./btrfs-mysql-test
73G .
delta-9 mysql #

Wow! So not only did mount -o compress-force achieve a 19% compression
ratio, using 53 GB less disk space than mount -o compress, it managed to
copy the data in only 15 minutes, compared to 57 minutes with mount -o
compress.

The disk in question is an old IDE disk in a cheap external USB 2.0
enclosure, which is probably not exactly the type of storage that btrfs is
being developed for. Nevertheless, it is nice to see such a huge
improvement in the time required to copy the data around.

I'd be very happy to see the -o compress-force option in the mainline
kernel someday!
Gregory Maxwell
2010-01-21 20:04:56 UTC
Post by Jim Faulkner
Post by Chris Mason
Please let me know if this improves your ratios
It most certainly does! It also greatly reduced the time required to copy
the data to my (not very fast) disk. All my testing was done on 2.6.32.4.
The line numbers in your patch were a little off for 2.6.32.4, but I did
[snip]
I'd be very happy to see the -o compress-force option in the mainline
kernel someday!
Sweet. But I think a force mount option is an unreasonably blunt tool.

(1) Fix the compression decision, I think this example suggests that
something is broken. (I'd noticed poorer than expected compression on
my laptop, but I'd chalked it up to the 64k blocks… now I'm not so
confident)

(2) An IOCTL for compression control. Userspace knows best, some
files ought to have a different compression policy.
Chris Mason
2010-01-21 20:07:50 UTC
Post by Gregory Maxwell
Post by Chris Mason
Please let me know if this improves your ratios
It most certainly does! It also greatly reduced the time required to copy
the data to my (not very fast) disk. All my testing was done on 2.6.32.4.
The line numbers in your patch were a little off for 2.6.32.4, but I did
[snip]
I'd be very happy to see the -o compress-force option in the mainline
kernel someday!
Sweet. But I think a force mount option is an unreasonably blunt tool.
(1) Fix the compression decision, I think this example suggests that
something is broken. (I'd noticed poorer than expected compression on
my laptop, but I'd chalked it up to the 64k blocks… now I'm not so
confident)
The current code assumes that files have consistent data in them. This
is very true for the average data set, but it'll be horribly wrong for
something like a database file.
Post by Gregory Maxwell
(2) An IOCTL for compression control. Userspace knows best, some
files ought to have a different compression policy.
Yes, we'll get #2 added.
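
As a rough sketch of what per-file control from userspace could look
like, here is a small program that asks for compression on one file
through the generic FS_IOC_GETFLAGS/FS_IOC_SETFLAGS ioctls. It assumes
a kernel and filesystem that honor the FS_COMPR_FL and FS_NOCOMP_FL
inode flags (btrfs grew support for these later); it is not necessarily
the interface being promised above.

/* Hedged sketch: request compression on a single file via inode flags. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>

int main(int argc, char **argv)
{
	int fd, flags;

	if (argc != 2) {
		fprintf(stderr, "usage: %s <file>\n", argv[0]);
		return 1;
	}

	fd = open(argv[1], O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	if (ioctl(fd, FS_IOC_GETFLAGS, &flags) < 0) {
		perror("FS_IOC_GETFLAGS");
		close(fd);
		return 1;
	}

	/* Ask for compression on this one file and clear any "don't
	 * compress" mark, leaving the rest of the filesystem alone. */
	flags |= FS_COMPR_FL;
	flags &= ~FS_NOCOMP_FL;

	if (ioctl(fd, FS_IOC_SETFLAGS, &flags) < 0)
		perror("FS_IOC_SETFLAGS");

	close(fd);
	return 0;
}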

-chris

Chris Mason
2010-01-21 20:05:41 UTC
Post by Jim Faulkner
delta-9 ~ # mount -o noatime,noacl,compress-force /dev/sdi /var/news/mysql
delta-9 ~ # time cp -a /nfs/media/tmp/btrfs-mysql-test /var/news/mysql/
real 14m45.742s
user 0m0.547s
sys 1m30.551s
delta-9 ~ # df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sdi 187G 14G 173G 8% /var/news/mysql
delta-9 ~ # cd /var/news/mysql
delta-9 mysql # du -h --max-depth=1
73G ./btrfs-mysql-test
73G .
delta-9 mysql #
Wow! So not only did mount -o compress-force achieve a 19%
compression ratio, using 53 GB less disk space than mount -o
compress, it managed to copy the data in only 15 minutes, compared
to 57 minutes with mount -o compress.
The disk in question is an old IDE disk in a cheap external USB 2.0
enclosure, which is probably not exactly the type of storage that
btrfs is being developed for. Nevertheless, it is nice to see such
a huge improvement in the time required to copy the data around.
I'd be very happy to see the -o compress-force option in the
mainline kernel someday!
Great, it is working here so I'll queue it up as well.

The performance improvement just comes from writing less to the disk.
Your first run copied 67GB in 57m6s, which comes out to about 20MB/s
write throughput.

Your compress-force run copied 14GB in 14m45s, which gives us
around 16 or 17MB/s (depending on df rounding). So, we either lost a
little drive throughput due to overhead in the compression code or the
run was totally CPU bound.
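
For the record, the arithmetic: 67 GB is roughly 68,600 MB, and over
57m6s = 3,426 s that comes to about 20 MB/s; 14 GB is roughly
14,300 MB, and over 14m46s = 886 s that comes to about 16 MB/s.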

If you look at the output from 'time', the two runs appear to be using
the same amount of system time (CPU time in the kernel). But most of
the compression is done by helper threads, so the CPU time spent
compressing data doesn't show up.

Just a quick clarification that we didn't make your drive faster, we
just traded CPU for disk. Most of the time that's a safe trade,
especially when you put usb and ide into the same sentence ;)

Either way, thanks for testing this out. Do you happen to remember how
small the file becomes when you just plain gzip it?

-chris

Jim Faulkner
2010-01-21 22:38:57 UTC
Post by Chris Mason
Either way, thanks for testing this out. Do you happen to remember how
small the file becomes when you just plain gzip it?
Gzipping it, I end up with an 11 GB file.