ceph raw used is more than sum of used in all pools (ceph df detail)

In my Ceph cluster, when I run the ceph df detail command I get the following result:

RAW STORAGE:
    CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED 
    hdd        62 TiB      52 TiB      10 TiB       10 TiB         16.47 
    ssd       8.7 TiB     8.4 TiB     370 GiB      377 GiB          4.22 
    TOTAL      71 TiB      60 TiB      11 TiB       11 TiB         14.96 
 
POOLS:
    POOL                ID     STORED      OBJECTS     USED        %USED     MAX AVAIL     QUOTA OBJECTS     QUOTA BYTES     DIRTY       USED COMPR     UNDER COMPR 
    rbd-kubernetes      36     288 GiB      71.56k     865 GiB      1.73        16 TiB     N/A               N/A              71.56k            0 B             0 B 
    rbd-cache           41     2.4 GiB     208.09k     7.2 GiB      0.09       2.6 TiB     N/A               N/A             205.39k            0 B             0 B 
    cephfs-metadata     51     529 MiB         221     1.6 GiB         0        16 TiB     N/A               N/A                 221            0 B             0 B 
    cephfs-data         52     1.0 GiB         424     3.1 GiB         0        16 TiB     N/A               N/A                 424            0 B             0 B 

So I have a question about this result. As you can see, the sum of the USED column across all pools is less than 1 TiB (roughly 865 GiB + 7.2 GiB + 1.6 GiB + 3.1 GiB ≈ 877 GiB), but in the RAW STORAGE section the USED value for the HDD class is 10 TiB, and it is growing every day. I think this is unusual and something is wrong with this Ceph cluster.
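
If it helps to compare exact byte counts rather than the rounded TiB values, the same numbers can be pulled from the JSON output. This is only a sketch: the field names below (bytes_used, total_used_raw_bytes) are what a Nautilus-era ceph df reports and may differ in other releases.

    # Sum of per-pool USED (replicated bytes) vs. the cluster-wide raw used figure.
    # Assumption: Nautilus-era JSON field names; adjust if your release differs.
    ceph df --format json | jq '{
        pool_used_sum: ([.pools[].stats.bytes_used] | add),
        raw_used:      .stats.total_used_raw_bytes
    }'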

For reference, the output of ceph osd dump | grep replicated is:

pool 36 'rbd-kubernetes' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 244 pg_num_target 64 pgp_num_target 64 last_change 1376476 lfor 2193/2193/2193 flags hashpspool,selfmanaged_snaps,creating tiers 41 read_tier 41 write_tier 41 stripe_width 0 application rbd
pool 41 'rbd-cache' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 64 pgp_num 64 autoscale_mode on last_change 1376476 lfor 2193/2193/2193 flags hashpspool,incomplete_clones,selfmanaged_snaps,creating tier_of 36 cache_mode writeback target_bytes 1000000000000 hit_set bloom{false_positive_probability: 0.05, target_size: 0, seed: 0} 3600s x1 decay_rate 0 search_last_n 0 min_read_recency_for_promote 1 min_write_recency_for_promote 1 stripe_width 0
pool 51 'cephfs-metadata' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 31675 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs
pool 52 'cephfs-data' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 742334 flags hashpspool,selfmanaged_snaps stripe_width 0 application cephfs
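
As a sanity check on the per-pool numbers: with replicated size 3, the USED column in ceph df detail already includes replication, e.g. for rbd-kubernetes 288 GiB stored × 3 is roughly the 865 GiB shown as USED. A minimal way to confirm the replication factor per pool (assuming the pool names above):

    # Print the replication size of each pool; all four report size 3 in the dump above.
    for pool in rbd-kubernetes rbd-cache cephfs-metadata cephfs-data; do
        echo -n "$pool: "
        ceph osd pool get "$pool" size
    done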

Ceph version (ceph -v):

ceph version 14.2.10 (b340acf629a010a74d90da5782a2c5fe0b54ac20) nautilus (stable)

Ceph OSD versions: ceph tell osd.* version returns the following for every OSD:

osd.0: {
    "version": "ceph version 14.2.10 (b340acf629a010a74d90da5782a2c5fe0b54ac20) nautilus (stable)"
}

Ceph status (ceph -s):

  cluster:
    id:     6a86aee0-3171-4824-98f3-2b5761b09feb
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum ceph-sn-03,ceph-sn-02,ceph-sn-01 (age 37h)
    mgr: ceph-sn-01(active, since 4d), standbys: ceph-sn-03, ceph-sn-02
    mds: cephfs-shared:1 {0=ceph-sn-02=up:active} 2 up:standby
    osd: 63 osds: 63 up (since 41h), 63 in (since 41h)
 
  task status:
    scrub status:
        mds.ceph-sn-02: idle
 
  data:
    pools:   4 pools, 384 pgs
    objects: 280.29k objects, 293 GiB
    usage:   11 TiB used, 60 TiB / 71 TiB avail
    pgs:     384 active+clean
 


Solution 1 [1]:

Based on the data you provided, you should evaluate the following scenarios:

  1. The replication size is inclusive: once min_size replicas of a write have been committed, the operation is acknowledged as complete. That means you should expect raw consumption of at least min_size and at most the full replication size times the stored data (here, between 2x and 3x, since all pools use size 3 / min_size 2).

  2. Ceph stores metadata and logs for housekeeping purposes, which also consumes storage (see the sketch after this list for a per-OSD breakdown).

  3. If you have run benchmarks via "rados bench" or a similar tool with the --no-cleanup parameter, the benchmark objects remain permanently stored in the cluster and keep consuming storage (see the sketch after this list).
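
A minimal sketch of how to check the last two points, assuming the pool names from the question; rados bench objects are normally created with a benchmark_data name prefix, and ceph osd df breaks per-OSD usage into DATA, OMAP and META columns:

    # Per-OSD breakdown: the OMAP and META columns show how much space goes to
    # metadata and housekeeping rather than object data (point 2).
    ceph osd df tree

    # Count leftover "rados bench --no-cleanup" objects per pool (point 3).
    # Assumption: default benchmark_data object prefix; pool names as in the question.
    for pool in rbd-kubernetes rbd-cache cephfs-metadata cephfs-data; do
        echo -n "$pool: "
        rados -p "$pool" ls | grep -c '^benchmark_data'
    done

    # Leftover benchmark objects can then be removed with, for example:
    # rados -p rbd-kubernetes cleanup --prefix benchmark_data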

These scenarios are only some of the possibilities.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source

[1] Solution 1: Fatemeh khodaparast