Reclaiming disk space


Julien Sobrier

Oct 11, 2014, 1:04:47 AM
to weed-fil...@googlegroups.com
Hello,
I'm trying to reclaim the unused disk space. On one volume server, I see this for ID 169:

"Volumes": [
    {
      "Id": 169,
      "Size": 31452054103,
      "ReplicaPlacement": {
        "SameRackCount": 0,
        "DiffRackCount": 0,
        "DiffDataCenterCount": 0
      },
      "Collection": "",
      "Version": 2,
      "FileCount": 243374,
      "DeleteCount": 38677,
      "DeletedByteCount": 4125152793,
      "ReadOnly": false
    },
[...]

on the file system:

-rw-r--r--  1 screenshot screenshot   30G Oct 11 06:57 169.dat
-rw-r--r--  1 screenshot screenshot  4.1M Oct 11 06:57 169.idx

It seems I have 4GB of unused disk space for this particular ID. But a vacuum call does not do anything:
curl "http://localhost:9333/vol/vacuum"

Since I'm running out of disk space on this volume server, and a lot of Volume IDs have quite a lot of DeletedByteCount, how can I reclaim the unused disk space? Should I specify a smaller garbageThreshold value?

Thank you
Julien Sobrier

Chris Lu

Oct 11, 2014, 3:18:25 AM
to weed-fil...@googlegroups.com
Yes, a smaller garbageThreshold would work. The default value is 0.3.
30GB x 0.3 = 9GB, so it allows up to 9GB of garbage, which is larger than your 4GB. Setting garbageThreshold to 0.1 should trigger the disk reclaiming.
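As a rough sketch of that check (the ratio math is from the numbers above; the flag name is an assumption to verify against your version):

```shell
# Sketch of the vacuum trigger: a volume is compacted only when
# DeletedByteCount / Size exceeds garbageThreshold (default 0.3).
needs_vacuum() {
  # args: size deleted_bytes threshold -> prints yes/no
  awk -v s="$1" -v d="$2" -v t="$3" 'BEGIN { if (d / s > t) print "yes"; else print "no" }'
}

# Volume 169 above: 4125152793 / 31452054103 is roughly 0.13
needs_vacuum 31452054103 4125152793 0.3   # prints "no"  (0.13 < 0.3)
needs_vacuum 31452054103 4125152793 0.1   # prints "yes" (0.13 > 0.1)

# To lower the threshold, restart the master with the flag, e.g.:
#   weed master -garbageThreshold=0.1
```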

Also, earlier versions may cause a long wait during garbage collection. Using the latest 0.64 version should help if that happens.

Chris

--
You received this message because you are subscribed to the Google Groups "Seaweed File System" group.
To unsubscribe from this group and stop receiving emails from it, send an email to weed-file-syst...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Julien Sobrier

Oct 12, 2014, 2:10:45 PM
to weed-fil...@googlegroups.com
Thank you. With garbageThreshold=0.1, the call to "http://localhost:9333/vol/vacuum" has been running for 15 hours and counting, and the disk space hasn't changed. Is this normal?

Chris Lu

Oct 12, 2014, 11:03:23 PM
to weed-fil...@googlegroups.com
Maybe it's stuck somewhere. How many volumes do you have? Is there any replication? How much empty disk space is there to work with?

What the vacuum does is serially iterate over all volumes, copying the not-deleted files from each old volume into a new volume and then deleting the old volume.

http://localhost:9333/vol/vacuum can also take a volume id directly, so you can run the vacuum for a specific volume. You can try that to see whether it works fine.
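For example (the volumeId query-parameter name here is my assumption; check the handler in your version):

```shell
# Sketch: ask the master to vacuum a single volume instead of all of them.
# MASTER and the parameter name are assumptions -- adjust for your setup.
MASTER="${MASTER:-localhost:9333}"
vacuum_one_volume() {
  curl -s "http://$MASTER/vol/vacuum?volumeId=$1"
}

# usage (against a live master): vacuum_one_volume 169
```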

Chris

Julien Sobrier

Oct 13, 2014, 12:04:33 AM
to weed-fil...@googlegroups.com
Hello,
I have 3 volume servers, no replication. I have about 5TB of data, mostly on one of the volume servers (130GB free disk space).

What is the option to vacuum a specific volume server?

Chris Lu

Oct 13, 2014, 1:24:48 AM
to weed-fil...@googlegroups.com
You cannot vacuum a specific volume server. But for each volume, you need to send two HTTP calls: one to start the vacuum, and one to commit the vacuumed volume.

http://volume_server:port/admin/vacuum_volume_compact?volumeId=xxx
http://volume_server:port/admin/vacuum_volume_commit?volumeId=xxx

Chris

Julien Sobrier

Oct 13, 2014, 1:36:51 AM
to weed-fil...@googlegroups.com
Thank you for the help.

It looks like the vacuum is getting slower and slower for each file id on one volume.

Julien Sobrier

Oct 13, 2014, 1:57:03 AM
to weed-fil...@googlegroups.com
I got this error for volumeId=231:
{"error":"Volume Id  is not a valid unsigned integer!"}

Chris Lu

Oct 13, 2014, 2:33:15 AM
to weed-fil...@googlegroups.com
Sorry, the parameter should be "volume" instead of "volumeId"

http://volume_server:port/admin/vacuum_volume_compact?volume=xxx
http://volume_server:port/admin/vacuum_volume_commit?volume=xxx
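With many volumes, the two calls can be scripted; this is a sketch (host and ids are placeholders, and nothing should be writing to these volumes while it runs):

```shell
# Sketch: compact then commit each volume on a volume server, one at a time.
# VOLUME_SERVER is a placeholder -- point it at your volume server.
VOLUME_SERVER="${VOLUME_SERVER:-localhost:8080}"

vacuum_volume() {
  # commit only if the compact call succeeded
  curl -sf "http://$VOLUME_SERVER/admin/vacuum_volume_compact?volume=$1" &&
    curl -sf "http://$VOLUME_SERVER/admin/vacuum_volume_commit?volume=$1"
}

# usage (against a live volume server):
#   for id in 169 184 231; do vacuum_volume "$id"; done
```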

Chris Lu

Oct 13, 2014, 3:22:43 PM
to weed-fil...@googlegroups.com
Julien, what's the progress? Does it get slower for each file when processing one volume? How long did it take for one volume? Are you using hard disk drives or SSDs?

Chris

Julien Sobrier

Oct 14, 2014, 8:35:22 PM
to weed-fil...@googlegroups.com
Thank you.

A couple of observations:
* I see that the volume server sometimes stops in the middle of creating the .cpx file. I guess this may happen when the volume is being modified.
* Performance when creating new .cpx files seems to vary a lot on the same volume server.

Chris Lu

Oct 14, 2014, 9:06:12 PM
to weed-fil...@googlegroups.com
I forgot to mention that if you use these APIs directly, you have to stop the system first. These APIs were intended to be used internally by the master server, which knows to stop sending writes to these volume servers.

http://volume_server:port/admin/vacuum_volume_compact?volume=xxx
http://volume_server:port/admin/vacuum_volume_commit?volume=xxx

Chris

Chris Lu

Oct 16, 2014, 2:34:45 AM
to weed-fil...@googlegroups.com
Since you are already familiar with this, you can try the "weed compact" command. It has several limitations:
1. It is meant to be used offline.
2. It only generates the .cpd and .cpx files; you need to manually rename them to .dat and .idx respectively. (You could probably tweak the code to do the rename automatically.)

I noticed that when using "weed compact", the compaction speed is basically constant.
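A sketch of the offline flow, including the manual rename (the directory and volume id are just examples):

```shell
# Sketch: offline compaction with "weed compact", then swapping the
# compacted files in. Stop the volume server before running this.
compact_and_swap() {
  dir="$1"; vid="$2"
  weed compact -dir="$dir" -volumeId="$vid" || return 1
  # weed compact writes $vid.cpd / $vid.cpx next to the originals;
  # rename them over the old .dat / .idx to finish the swap.
  mv "$dir/$vid.cpd" "$dir/$vid.dat" &&
    mv "$dir/$vid.cpx" "$dir/$vid.idx"
}

# usage: compact_and_swap /data/weedfs_volume 184
```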

Chris

Julien Sobrier

Nov 21, 2014, 11:44:32 PM
to weed-fil...@googlegroups.com
I tried with weed-fs 0.63 and 0.65 on several volumes; I always get this error:

$ ../weedfs/weed compact -dir=/data/weedfs_volume -volumeId=184
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x0 pc=0x4c31c0]

goroutine 16 [running]:
runtime.panic(0x9ca580, 0xd7c6d3)
        /home/chris/apps/go/src/pkg/runtime/panic.c:279 +0xf5
github.com/chrislusf/weed-fs/go/storage.(*Volume).readSuperBlock(0xc208028230, 0x7fd7efc6c1b0, 0xc2080380d0)
        /home/chris/dev/workspace/home/gopath/src/github.com/chrislusf/weed-fs/go/storage/volume_super_block.go:60 +0x90
github.com/chrislusf/weed-fs/go/storage.(*Volume).load(0xc208028230, 0xc208020101, 0x0, 0x0)
        /home/chris/dev/workspace/home/gopath/src/github.com/chrislusf/weed-fs/go/storage/volume.go:81 +0x41b
github.com/chrislusf/weed-fs/go/storage.NewVolume(0x7fffbe817761, 0x13, 0xa413b0, 0x0, 0xc2000000b8, 0x0, 0x0, 0xc208028230, 0x0, 0x0)
        /home/chris/dev/workspace/home/gopath/src/github.com/chrislusf/weed-fs/go/storage/volume.go:33 +0xeb
main.runCompact(0xd7a7e0, 0xc20800e040, 0x0, 0x0, 0x0)
        /home/chris/dev/workspace/home/gopath/src/github.com/chrislusf/weed-fs/go/weed/compact.go:36 +0xa5
main.main()
        /home/chris/dev/workspace/home/gopath/src/github.com/chrislusf/weed-fs/go/weed/weed.go:77 +0x612

goroutine 19 [finalizer wait]:
runtime.park(0x428bd0, 0xd81240, 0xd7f449)
        /home/chris/apps/go/src/pkg/runtime/proc.c:1369 +0x89
runtime.parkunlock(0xd81240, 0xd7f449)
        /home/chris/apps/go/src/pkg/runtime/proc.c:1385 +0x3b
runfinq()
        /home/chris/apps/go/src/pkg/runtime/mgc0.c:2644 +0xcf
runtime.goexit()
        /home/chris/apps/go/src/pkg/runtime/proc.c:1445

goroutine 20 [syscall]:
os/signal.loop()
        /home/chris/apps/go/src/pkg/os/signal/signal_unix.go:21 +0x1e
created by os/signal.init·1
        /home/chris/apps/go/src/pkg/os/signal/signal_unix.go:27 +0x32

goroutine 21 [chan receive]:
github.com/chrislusf/weed-fs/go/glog.(*loggingT).flushDaemon(0xd84140)
        /home/chris/dev/workspace/home/gopath/src/github.com/chrislusf/weed-fs/go/glog/glog.go:833 +0x75
created by github.com/chrislusf/weed-fs/go/glog.init·1
        /home/chris/dev/workspace/home/gopath/src/github.com/chrislusf/weed-fs/go/glog/glog.go:402 +0x2b2

goroutine 17 [syscall]:
runtime.goexit()
        /home/chris/apps/go/src/pkg/runtime/proc.c:1445

goroutine 23 [runnable]:
github.com/chrislusf/weed-fs/go/stats.(*ServerStats).Start(0xc20800f180)
        /home/chris/dev/workspace/home/gopath/src/github.com/chrislusf/weed-fs/go/stats/stats.go:90
created by github.com/chrislusf/weed-fs/go/weed/weed_server.init·1
        /home/chris/dev/workspace/home/gopath/src/github.com/chrislusf/weed-fs/go/weed/weed_server/common.go:24 +0x43

Chris Lu

Nov 21, 2014, 11:57:04 PM
to weed-fil...@googlegroups.com
The error shows it could not find the file /data/weedfs_volume/184.dat

Is the file in place? Readable?
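A quick way to check both at once (path taken from the command above):

```shell
# Sketch: verify the volume data file exists and is readable
# before trying to compact it.
check_volume_file() {
  if test -r "$1"; then echo "readable"; else echo "missing or unreadable"; fi
}

# usage: check_volume_file /data/weedfs_volume/184.dat
```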

Chris

Ravi N

Jul 24, 2017, 11:28:58 AM
to Seaweed File System, weed-fil...@googlegroups.com
Chris,

I'm trying to reclaim the disk space after deleting files (around 600 GB). I am using v70 and the command below for master, volume, and filer:

(Win Server) weed.exe server -dir="./data" -filer=true -volume.max=50

When I call the vacuum command, the overall size of the data directory increases. Can you please suggest how to free up the deleted space?

vacuum URL i have used: http://localhost:9333/vol/vacuum

Chris Lu

Jul 24, 2017, 12:18:08 PM
to Seaweed File System, weed-fil...@googlegroups.com
Vacuum copies the volume while skipping deleted files, so the size will go up first and then go down.

Chris


Ravi N

Jul 24, 2017, 12:20:39 PM
to Seaweed File System, weed-fil...@googlegroups.com
I have tried calling the vacuum URL from a browser (multiple times); every time the space increases but never decreases. Please suggest if I am missing something here.

Chris Lu

Jul 24, 2017, 12:21:33 PM
to Seaweed File System, weed-fil...@googlegroups.com
I suggest you wait.


Ravi N

Jul 24, 2017, 12:42:38 PM
to Seaweed File System, weed-fil...@googlegroups.com
Thanks Chris, I will wait for some time and see if the deleted space is reclaimed.

To be very clear, I tried calling the vacuum URL a week ago and the space was still not released. There was a scheduled reboot of that server over the weekend; I verified the space after the reboot and it had not been reduced.

Ravi N

Aug 1, 2017, 3:54:11 PM
to Seaweed File System, weed-fil...@googlegroups.com
The vacuum command (curl "http://localhost:9333/vol/vacuum") executed successfully (on July 25th), but the disk space from the deleted files was not freed up. Can you please suggest another option to reclaim the disk space?

Chris Lu

Aug 1, 2017, 3:55:38 PM
to Seaweed File System, weed-fil...@googlegroups.com
Delete any .cpd and .cpx files
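For example, something along these lines (the directory path is a placeholder):

```shell
# Sketch: remove leftover compaction artifacts (.cpd/.cpx) from a volume
# directory; the live data stays in the .dat/.idx files.
clean_compaction_files() {
  find "$1" -maxdepth 1 -type f \( -name '*.cpd' -o -name '*.cpx' \) -print -delete
}

# usage: clean_compaction_files /data/weedfs_volume
```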

Ravi N

Aug 2, 2017, 10:25:50 AM
to Seaweed File System, weed-fil...@googlegroups.com
Thanks Chris. 

I have deleted the existing .cpd and .cpx files, around 100GB worth. I called the vacuum method once again; the .dat file sizes were not reduced, but .cpd and .cpx files were created for a few volumes. Are these .cpd and .cpx files generated from the remaining (undeleted) files in each volume? If so, can I replace the .dat files with the .cpd files?