backport patches from community
(cherry picked from commit cc60985a665f17f23031c03f1021e886d63b990f)
This commit is contained in:
parent
0b6d55387d
commit
7fbd36818a
42
0012-xfs-sb-verifier-doesn-t-handle-uncached-sb-buffer.patch
Normal file
42
0012-xfs-sb-verifier-doesn-t-handle-uncached-sb-buffer.patch
Normal file
@ -0,0 +1,42 @@
|
|||||||
|
From b3749469112306a80925420b48a6e756b2beeed9 Mon Sep 17 00:00:00 2001
|
||||||
|
From: Dave Chinner <dchinner@redhat.com>
|
||||||
|
Date: Mon, 31 Jan 2022 15:25:48 -0500
|
||||||
|
Subject: [PATCH] xfs: sb verifier doesn't handle uncached sb buffer
|
||||||
|
|
||||||
|
Source kernel commit: 8cf07f3dd56195316be97758cb8b4e1d7183ea84
|
||||||
|
|
||||||
|
The verifier checks explicitly for bp->b_bn == XFS_SB_DADDR to match
|
||||||
|
the primary superblock buffer, but the primary superblock is an
|
||||||
|
uncached buffer and so bp->b_bn is always -1ULL. Hence this never
|
||||||
|
matches and the CRC error reporting is wholly dependent on the
|
||||||
|
mount superblock already being populated so CRC feature checks pass
|
||||||
|
and allow CRC errors to be reported.
|
||||||
|
|
||||||
|
Fix this so that the primary superblock CRC error reporting is not
|
||||||
|
dependent on already having read the superblock into memory.
|
||||||
|
|
||||||
|
Signed-off-by: Dave Chinner <dchinner@redhat.com>
|
||||||
|
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
||||||
|
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
|
||||||
|
---
|
||||||
|
libxfs/xfs_sb.c | 2 +-
|
||||||
|
1 file changed, 1 insertion(+), 1 deletion(-)
|
||||||
|
|
||||||
|
diff --git a/libxfs/xfs_sb.c b/libxfs/xfs_sb.c
|
||||||
|
index b2e214e..f29a59a 100644
|
||||||
|
--- a/libxfs/xfs_sb.c
|
||||||
|
+++ b/libxfs/xfs_sb.c
|
||||||
|
@@ -634,7 +634,7 @@ xfs_sb_read_verify(
|
||||||
|
|
||||||
|
if (!xfs_buf_verify_cksum(bp, XFS_SB_CRC_OFF)) {
|
||||||
|
/* Only fail bad secondaries on a known V5 filesystem */
|
||||||
|
- if (bp->b_bn == XFS_SB_DADDR ||
|
||||||
|
+ if (bp->b_maps[0].bm_bn == XFS_SB_DADDR ||
|
||||||
|
xfs_sb_version_hascrc(&mp->m_sb)) {
|
||||||
|
error = -EFSBADCRC;
|
||||||
|
goto out_error;
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
49
0013-libxfs-always-initialize-internal-buffer-map.patch
Normal file
49
0013-libxfs-always-initialize-internal-buffer-map.patch
Normal file
@ -0,0 +1,49 @@
|
|||||||
|
From f043c63e38c9582deac85053a6c8a737482983b1 Mon Sep 17 00:00:00 2001
|
||||||
|
From: "Darrick J. Wong" <djwong@kernel.org>
|
||||||
|
Date: Mon, 31 Jan 2022 17:46:05 -0500
|
||||||
|
Subject: [PATCH] libxfs: always initialize internal buffer map
|
||||||
|
|
||||||
|
The __initbuf function is responsible for initializing the fields of an
|
||||||
|
xfs_buf. Buffers are always required to have a mapping, though in the
|
||||||
|
typical case there's only one mapping, so we can use the internal one.
|
||||||
|
|
||||||
|
The single-mapping b_maps init code at the end of the function doesn't
|
||||||
|
quite get this right though -- if a single-mapping buffer in the cache
|
||||||
|
was allowed to expire and now is being repurposed, it'll come out with
|
||||||
|
b_maps == &__b_map, in which case we incorrectly skip initializing the
|
||||||
|
map. This has gone unnoticed until now because (AFAICT) the code paths
|
||||||
|
that use b_maps are the same ones that are called with multi-mapping
|
||||||
|
buffers, which are initialized correctly.
|
||||||
|
|
||||||
|
Anyway, the improperly initialized single-mappings will cause problems
|
||||||
|
in upcoming patches where we turn b_bn into the cache key and require
|
||||||
|
the use of b_maps[0].bm_bn for the buffer LBA. Fix this.
|
||||||
|
|
||||||
|
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
|
||||||
|
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
|
||||||
|
---
|
||||||
|
libxfs/rdwr.c | 6 ++++--
|
||||||
|
1 file changed, 4 insertions(+), 2 deletions(-)
|
||||||
|
|
||||||
|
diff --git a/libxfs/rdwr.c b/libxfs/rdwr.c
|
||||||
|
index 5086bdb..a55e3a7 100644
|
||||||
|
--- a/libxfs/rdwr.c
|
||||||
|
+++ b/libxfs/rdwr.c
|
||||||
|
@@ -251,9 +251,11 @@ __initbuf(struct xfs_buf *bp, struct xfs_buftarg *btp, xfs_daddr_t bno,
|
||||||
|
bp->b_ops = NULL;
|
||||||
|
INIT_LIST_HEAD(&bp->b_li_list);
|
||||||
|
|
||||||
|
- if (!bp->b_maps) {
|
||||||
|
- bp->b_nmaps = 1;
|
||||||
|
+ if (!bp->b_maps)
|
||||||
|
bp->b_maps = &bp->__b_map;
|
||||||
|
+
|
||||||
|
+ if (bp->b_maps == &bp->__b_map) {
|
||||||
|
+ bp->b_nmaps = 1;
|
||||||
|
bp->b_maps[0].bm_bn = bp->b_bn;
|
||||||
|
bp->b_maps[0].bm_len = bp->b_length;
|
||||||
|
}
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
@ -0,0 +1,70 @@
|
|||||||
|
From f98b7a261130726c33accd295ec0d2a22f270cde Mon Sep 17 00:00:00 2001
|
||||||
|
From: "Darrick J. Wong" <djwong@kernel.org>
|
||||||
|
Date: Fri, 25 Feb 2022 17:32:48 -0500
|
||||||
|
Subject: [PATCH] libxfs: shut down filesystem if we xfs_trans_cancel with
|
||||||
|
deferred work items
|
||||||
|
|
||||||
|
While debugging some very strange rmap corruption reports in connection
|
||||||
|
with the online directory repair code. I root-caused the error to the
|
||||||
|
following incorrect sequence:
|
||||||
|
|
||||||
|
<start repair transaction>
|
||||||
|
<expand directory, causing a deferred rmap to be queued>
|
||||||
|
<roll transaction>
|
||||||
|
<cancel transaction>
|
||||||
|
|
||||||
|
Obviously, we should have committed the transaction instead of
|
||||||
|
cancelling it. Thinking more broadly, however, xfs_trans_cancel should
|
||||||
|
have warned us that we were throwing away work item that we already
|
||||||
|
committed to performing. This is not correct, and we need to shut down
|
||||||
|
the filesystem.
|
||||||
|
|
||||||
|
Change xfs_trans_cancel to complain in the loudest manner if we're
|
||||||
|
cancelling any transaction with deferred work items attached.
|
||||||
|
|
||||||
|
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
|
||||||
|
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
|
||||||
|
---
|
||||||
|
libxfs/trans.c | 19 ++++++++++++++++++-
|
||||||
|
1 file changed, 18 insertions(+), 1 deletion(-)
|
||||||
|
|
||||||
|
diff --git a/libxfs/trans.c b/libxfs/trans.c
|
||||||
|
index fd2e6f9..8c16cb8 100644
|
||||||
|
--- a/libxfs/trans.c
|
||||||
|
+++ b/libxfs/trans.c
|
||||||
|
@@ -318,13 +318,30 @@ void
|
||||||
|
libxfs_trans_cancel(
|
||||||
|
struct xfs_trans *tp)
|
||||||
|
{
|
||||||
|
+ bool dirty;
|
||||||
|
+
|
||||||
|
trace_xfs_trans_cancel(tp, _RET_IP_);
|
||||||
|
|
||||||
|
if (tp == NULL)
|
||||||
|
return;
|
||||||
|
+ dirty = (tp->t_flags & XFS_TRANS_DIRTY);
|
||||||
|
|
||||||
|
- if (tp->t_flags & XFS_TRANS_PERM_LOG_RES)
|
||||||
|
+ /*
|
||||||
|
+ * It's never valid to cancel a transaction with deferred ops attached,
|
||||||
|
+ * because the transaction is effectively dirty. Complain about this
|
||||||
|
+ * loudly before freeing the in-memory defer items.
|
||||||
|
+ */
|
||||||
|
+ if (!list_empty(&tp->t_dfops)) {
|
||||||
|
+ ASSERT(list_empty(&tp->t_dfops));
|
||||||
|
+ ASSERT(tp->t_flags & XFS_TRANS_PERM_LOG_RES);
|
||||||
|
+ dirty = true;
|
||||||
|
xfs_defer_cancel(tp);
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ if (dirty) {
|
||||||
|
+ fprintf(stderr, _("Cancelling dirty transaction!\n"));
|
||||||
|
+ abort();
|
||||||
|
+ }
|
||||||
|
|
||||||
|
xfs_trans_free_items(tp);
|
||||||
|
xfs_trans_free(tp);
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
47
0015-xfs_db-fix-nbits-parameter-in-fa_ino-48-functions.patch
Normal file
47
0015-xfs_db-fix-nbits-parameter-in-fa_ino-48-functions.patch
Normal file
@ -0,0 +1,47 @@
|
|||||||
|
From e9ff33f6e604ece202373be3ac176064083d913e Mon Sep 17 00:00:00 2001
|
||||||
|
From: "Darrick J. Wong" <djwong@kernel.org>
|
||||||
|
Date: Fri, 25 Feb 2022 17:42:16 -0500
|
||||||
|
Subject: [PATCH] xfs_db: fix nbits parameter in fa_ino[48] functions
|
||||||
|
|
||||||
|
Use the proper macro to convert ino4 and ino8 field byte sizes to a bit
|
||||||
|
count in the functions that navigate shortform directories. This just
|
||||||
|
happens to work correctly for ino4 entries, but omits the upper 4 bytes
|
||||||
|
of an ino8 entry. Note that the entries display correctly; it's just
|
||||||
|
the command "addr u3.sfdir3.list[X].inumber.i8" that won't.
|
||||||
|
|
||||||
|
Found by running smatch.
|
||||||
|
|
||||||
|
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
|
||||||
|
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
|
||||||
|
---
|
||||||
|
db/faddr.c | 6 ++++--
|
||||||
|
1 file changed, 4 insertions(+), 2 deletions(-)
|
||||||
|
|
||||||
|
diff --git a/db/faddr.c b/db/faddr.c
|
||||||
|
index 81d69c9..0127c5d 100644
|
||||||
|
--- a/db/faddr.c
|
||||||
|
+++ b/db/faddr.c
|
||||||
|
@@ -353,7 +353,8 @@ fa_ino4(
|
||||||
|
xfs_ino_t ino;
|
||||||
|
|
||||||
|
ASSERT(next == TYP_INODE);
|
||||||
|
- ino = (xfs_ino_t)getbitval(obj, bit, bitsz(XFS_INO32_SIZE), BVUNSIGNED);
|
||||||
|
+ ino = (xfs_ino_t)getbitval(obj, bit, bitize(XFS_INO32_SIZE),
|
||||||
|
+ BVUNSIGNED);
|
||||||
|
if (ino == NULLFSINO) {
|
||||||
|
dbprintf(_("null inode number, cannot set new addr\n"));
|
||||||
|
return;
|
||||||
|
@@ -370,7 +371,8 @@ fa_ino8(
|
||||||
|
xfs_ino_t ino;
|
||||||
|
|
||||||
|
ASSERT(next == TYP_INODE);
|
||||||
|
- ino = (xfs_ino_t)getbitval(obj, bit, bitsz(XFS_INO64_SIZE), BVUNSIGNED);
|
||||||
|
+ ino = (xfs_ino_t)getbitval(obj, bit, bitize(XFS_INO64_SIZE),
|
||||||
|
+ BVUNSIGNED);
|
||||||
|
if (ino == NULLFSINO) {
|
||||||
|
dbprintf(_("null inode number, cannot set new addr\n"));
|
||||||
|
return;
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
100
0016-xfs_repair-update-secondary-superblocks-after-changi.patch
Normal file
100
0016-xfs_repair-update-secondary-superblocks-after-changi.patch
Normal file
@ -0,0 +1,100 @@
|
|||||||
|
From 918e82a4879dccaf3673871be925a87efc2fbabc Mon Sep 17 00:00:00 2001
|
||||||
|
From: "Darrick J. Wong" <djwong@kernel.org>
|
||||||
|
Date: Fri, 25 Feb 2022 17:42:16 -0500
|
||||||
|
Subject: [PATCH] xfs_repair: update secondary superblocks after changing
|
||||||
|
features
|
||||||
|
|
||||||
|
When we add features to an existing filesystem, make sure we update the
|
||||||
|
secondary superblocks to reflect the new geometry so that if we lose the
|
||||||
|
primary super in the future, repair will recover correctly.
|
||||||
|
|
||||||
|
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
|
||||||
|
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
|
||||||
|
---
|
||||||
|
libxfs/libxfs_api_defs.h | 2 ++
|
||||||
|
repair/globals.c | 1 +
|
||||||
|
repair/globals.h | 1 +
|
||||||
|
repair/phase2.c | 2 ++
|
||||||
|
repair/xfs_repair.c | 15 +++++++++++++++
|
||||||
|
5 files changed, 21 insertions(+)
|
||||||
|
|
||||||
|
diff --git a/libxfs/libxfs_api_defs.h b/libxfs/libxfs_api_defs.h
|
||||||
|
index b76e638..63628ae 100644
|
||||||
|
--- a/libxfs/libxfs_api_defs.h
|
||||||
|
+++ b/libxfs/libxfs_api_defs.h
|
||||||
|
@@ -195,6 +195,8 @@
|
||||||
|
#define xfs_trans_roll libxfs_trans_roll
|
||||||
|
#define xfs_trim_extent libxfs_trim_extent
|
||||||
|
|
||||||
|
+#define xfs_update_secondary_sbs libxfs_update_secondary_sbs
|
||||||
|
+
|
||||||
|
#define xfs_validate_stripe_geometry libxfs_validate_stripe_geometry
|
||||||
|
#define xfs_verify_agbno libxfs_verify_agbno
|
||||||
|
#define xfs_verify_agino libxfs_verify_agino
|
||||||
|
diff --git a/repair/globals.c b/repair/globals.c
|
||||||
|
index 506a4e7..f8d4f1e 100644
|
||||||
|
--- a/repair/globals.c
|
||||||
|
+++ b/repair/globals.c
|
||||||
|
@@ -48,6 +48,7 @@ char *rt_name; /* Name of realtime device */
|
||||||
|
int rt_spec; /* Realtime dev specified as option */
|
||||||
|
int convert_lazy_count; /* Convert lazy-count mode on/off */
|
||||||
|
int lazy_count; /* What to set if to if converting */
|
||||||
|
+bool features_changed; /* did we change superblock feature bits? */
|
||||||
|
bool add_inobtcount; /* add inode btree counts to AGI */
|
||||||
|
bool add_bigtime; /* add support for timestamps up to 2486 */
|
||||||
|
|
||||||
|
diff --git a/repair/globals.h b/repair/globals.h
|
||||||
|
index 929b82b..0f98bd2 100644
|
||||||
|
--- a/repair/globals.h
|
||||||
|
+++ b/repair/globals.h
|
||||||
|
@@ -89,6 +89,7 @@ extern char *rt_name; /* Name of realtime device */
|
||||||
|
extern int rt_spec; /* Realtime dev specified as option */
|
||||||
|
extern int convert_lazy_count; /* Convert lazy-count mode on/off */
|
||||||
|
extern int lazy_count; /* What to set if to if converting */
|
||||||
|
+extern bool features_changed; /* did we change superblock feature bits? */
|
||||||
|
extern bool add_inobtcount; /* add inode btree counts to AGI */
|
||||||
|
extern bool add_bigtime; /* add support for timestamps up to 2486 */
|
||||||
|
|
||||||
|
diff --git a/repair/phase2.c b/repair/phase2.c
|
||||||
|
index 32ffe18..ab53ee0 100644
|
||||||
|
--- a/repair/phase2.c
|
||||||
|
+++ b/repair/phase2.c
|
||||||
|
@@ -216,6 +216,8 @@ upgrade_filesystem(
|
||||||
|
}
|
||||||
|
if (bp)
|
||||||
|
libxfs_buf_relse(bp);
|
||||||
|
+
|
||||||
|
+ features_changed = true;
|
||||||
|
}
|
||||||
|
|
||||||
|
/*
|
||||||
|
diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c
|
||||||
|
index 38406ee..e44aa40 100644
|
||||||
|
--- a/repair/xfs_repair.c
|
||||||
|
+++ b/repair/xfs_repair.c
|
||||||
|
@@ -1298,6 +1298,21 @@ _("Note - stripe unit (%d) and width (%d) were copied from a backup superblock.\
|
||||||
|
libxfs_buf_relse(sbp);
|
||||||
|
|
||||||
|
/*
|
||||||
|
+ * If we upgraded V5 filesystem features, we need to update the
|
||||||
|
+ * secondary superblocks to include the new feature bits. Don't set
|
||||||
|
+ * NEEDSREPAIR on the secondaries.
|
||||||
|
+ */
|
||||||
|
+ if (features_changed) {
|
||||||
|
+ mp->m_sb.sb_features_incompat &=
|
||||||
|
+ ~XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR;
|
||||||
|
+ error = -libxfs_update_secondary_sbs(mp);
|
||||||
|
+ if (error)
|
||||||
|
+ do_error(_("upgrading features of secondary supers"));
|
||||||
|
+ mp->m_sb.sb_features_incompat |=
|
||||||
|
+ XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR;
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ /*
|
||||||
|
* Done. Flush all cached buffers and inodes first to ensure all
|
||||||
|
* verifiers are run (where we discover the max metadata LSN), reformat
|
||||||
|
* the log if necessary and unmount.
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
41
0017-xfs_repair-fix-AG-header-btree-level-comparisons.patch
Normal file
41
0017-xfs_repair-fix-AG-header-btree-level-comparisons.patch
Normal file
@ -0,0 +1,41 @@
|
|||||||
|
From 2e9720d51a1e9efa6535b540f3c9ff88e95aabe9 Mon Sep 17 00:00:00 2001
|
||||||
|
From: "Darrick J. Wong" <djwong@kernel.org>
|
||||||
|
Date: Wed, 27 Apr 2022 23:11:09 -0400
|
||||||
|
Subject: [PATCH] xfs_repair: fix AG header btree level comparisons
|
||||||
|
|
||||||
|
It's not an error if repair encounters a btree with the maximal
|
||||||
|
height, so don't print warnings. Also, we don't allow zero-height
|
||||||
|
btrees.
|
||||||
|
|
||||||
|
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
|
||||||
|
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
|
||||||
|
---
|
||||||
|
repair/scan.c | 4 ++--
|
||||||
|
1 file changed, 2 insertions(+), 2 deletions(-)
|
||||||
|
|
||||||
|
diff --git a/repair/scan.c b/repair/scan.c
|
||||||
|
index 909c449..e2d281a 100644
|
||||||
|
--- a/repair/scan.c
|
||||||
|
+++ b/repair/scan.c
|
||||||
|
@@ -2297,7 +2297,7 @@ validate_agf(
|
||||||
|
priv.nr_blocks = 0;
|
||||||
|
|
||||||
|
levels = be32_to_cpu(agf->agf_levels[XFS_BTNUM_RMAP]);
|
||||||
|
- if (levels >= XFS_BTREE_MAXLEVELS) {
|
||||||
|
+ if (levels == 0 || levels > XFS_BTREE_MAXLEVELS) {
|
||||||
|
do_warn(_("bad levels %u for rmapbt root, agno %d\n"),
|
||||||
|
levels, agno);
|
||||||
|
rmap_avoid_check();
|
||||||
|
@@ -2323,7 +2323,7 @@ validate_agf(
|
||||||
|
unsigned int levels;
|
||||||
|
|
||||||
|
levels = be32_to_cpu(agf->agf_refcount_level);
|
||||||
|
- if (levels >= XFS_BTREE_MAXLEVELS) {
|
||||||
|
+ if (levels == 0 || levels > XFS_BTREE_MAXLEVELS) {
|
||||||
|
do_warn(_("bad levels %u for refcountbt root, agno %d\n"),
|
||||||
|
levels, agno);
|
||||||
|
refcount_avoid_check();
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
@ -0,0 +1,47 @@
|
|||||||
|
From 0571c857fe326141e35162f5a05e6b89789840bf Mon Sep 17 00:00:00 2001
|
||||||
|
From: "Darrick J. Wong" <djwong@kernel.org>
|
||||||
|
Date: Thu, 28 Apr 2022 15:39:02 -0400
|
||||||
|
Subject: [PATCH] xfs: fix maxlevels comparisons in the btree staging code
|
||||||
|
|
||||||
|
Source kernel commit: 78e8ec83a404d63dcc86b251f42e4ee8aff27465
|
||||||
|
|
||||||
|
The btree geometry computation function has an off-by-one error in that
|
||||||
|
it does not allow maximally tall btrees (nlevels == XFS_BTREE_MAXLEVELS).
|
||||||
|
This can result in repairs failing unnecessarily on very fragmented
|
||||||
|
filesystems. Subsequent patches to remove MAXLEVELS usage in favor of
|
||||||
|
the per-btree type computations will make this a much more likely
|
||||||
|
occurrence.
|
||||||
|
|
||||||
|
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
|
||||||
|
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
||||||
|
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
|
||||||
|
---
|
||||||
|
libxfs/xfs_btree_staging.c | 4 ++--
|
||||||
|
1 file changed, 2 insertions(+), 2 deletions(-)
|
||||||
|
|
||||||
|
diff --git a/libxfs/xfs_btree_staging.c b/libxfs/xfs_btree_staging.c
|
||||||
|
index 146d247..daf9979 100644
|
||||||
|
--- a/libxfs/xfs_btree_staging.c
|
||||||
|
+++ b/libxfs/xfs_btree_staging.c
|
||||||
|
@@ -662,7 +662,7 @@ xfs_btree_bload_compute_geometry(
|
||||||
|
xfs_btree_bload_ensure_slack(cur, &bbl->node_slack, 1);
|
||||||
|
|
||||||
|
bbl->nr_records = nr_this_level = nr_records;
|
||||||
|
- for (cur->bc_nlevels = 1; cur->bc_nlevels < XFS_BTREE_MAXLEVELS;) {
|
||||||
|
+ for (cur->bc_nlevels = 1; cur->bc_nlevels <= XFS_BTREE_MAXLEVELS;) {
|
||||||
|
uint64_t level_blocks;
|
||||||
|
uint64_t dontcare64;
|
||||||
|
unsigned int level = cur->bc_nlevels - 1;
|
||||||
|
@@ -724,7 +724,7 @@ xfs_btree_bload_compute_geometry(
|
||||||
|
nr_this_level = level_blocks;
|
||||||
|
}
|
||||||
|
|
||||||
|
- if (cur->bc_nlevels == XFS_BTREE_MAXLEVELS)
|
||||||
|
+ if (cur->bc_nlevels > XFS_BTREE_MAXLEVELS)
|
||||||
|
return -EOVERFLOW;
|
||||||
|
|
||||||
|
bbl->btree_height = cur->bc_nlevels;
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
@ -0,0 +1,52 @@
|
|||||||
|
From b79218242b786a2c02bcac9f53fdae45e2e61e90 Mon Sep 17 00:00:00 2001
|
||||||
|
From: Brian Foster <bfoster@redhat.com>
|
||||||
|
Date: Thu, 28 Apr 2022 15:39:02 -0400
|
||||||
|
Subject: [PATCH] xfs: fold perag loop iteration logic into helper function
|
||||||
|
|
||||||
|
Source kernel commit: bf2307b195135ed9c95eebb38920d8bd41843092
|
||||||
|
|
||||||
|
Fold the loop iteration logic into a helper in preparation for
|
||||||
|
further fixups. No functional change in this patch.
|
||||||
|
|
||||||
|
Signed-off-by: Brian Foster <bfoster@redhat.com>
|
||||||
|
Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
||||||
|
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
|
||||||
|
---
|
||||||
|
libxfs/xfs_ag.h | 16 +++++++++++++---
|
||||||
|
1 file changed, 13 insertions(+), 3 deletions(-)
|
||||||
|
|
||||||
|
diff --git a/libxfs/xfs_ag.h b/libxfs/xfs_ag.h
|
||||||
|
index 2522f76..95570df 100644
|
||||||
|
--- a/libxfs/xfs_ag.h
|
||||||
|
+++ b/libxfs/xfs_ag.h
|
||||||
|
@@ -126,12 +126,22 @@ void xfs_perag_put(struct xfs_perag *pag);
|
||||||
|
* for_each_perag_from() because they terminate at sb_agcount where there are
|
||||||
|
* no perag structures in tree beyond end_agno.
|
||||||
|
*/
|
||||||
|
+static inline struct xfs_perag *
|
||||||
|
+xfs_perag_next(
|
||||||
|
+ struct xfs_perag *pag,
|
||||||
|
+ xfs_agnumber_t *next_agno)
|
||||||
|
+{
|
||||||
|
+ struct xfs_mount *mp = pag->pag_mount;
|
||||||
|
+
|
||||||
|
+ *next_agno = pag->pag_agno + 1;
|
||||||
|
+ xfs_perag_put(pag);
|
||||||
|
+ return xfs_perag_get(mp, *next_agno);
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
#define for_each_perag_range(mp, next_agno, end_agno, pag) \
|
||||||
|
for ((pag) = xfs_perag_get((mp), (next_agno)); \
|
||||||
|
(pag) != NULL && (next_agno) <= (end_agno); \
|
||||||
|
- (next_agno) = (pag)->pag_agno + 1, \
|
||||||
|
- xfs_perag_put(pag), \
|
||||||
|
- (pag) = xfs_perag_get((mp), (next_agno)))
|
||||||
|
+ (pag) = xfs_perag_next((pag), &(next_agno)))
|
||||||
|
|
||||||
|
#define for_each_perag_from(mp, next_agno, pag) \
|
||||||
|
for_each_perag_range((mp), (next_agno), (mp)->m_sb.sb_agcount, (pag))
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
58
0020-xfs-rename-the-next_agno-perag-iteration-variable.patch
Normal file
58
0020-xfs-rename-the-next_agno-perag-iteration-variable.patch
Normal file
@ -0,0 +1,58 @@
|
|||||||
|
From 02ff0b2b4c117f33f79500815a9322fe987a4bf5 Mon Sep 17 00:00:00 2001
|
||||||
|
From: Brian Foster <bfoster@redhat.com>
|
||||||
|
Date: Thu, 28 Apr 2022 15:39:02 -0400
|
||||||
|
Subject: [PATCH] xfs: rename the next_agno perag iteration variable
|
||||||
|
|
||||||
|
Source kernel commit: f1788b5e5ee25bedf00bb4d25f82b93820d61189
|
||||||
|
|
||||||
|
Rename the next_agno variable to be consistent across the several
|
||||||
|
iteration macros and shorten line length.
|
||||||
|
|
||||||
|
Signed-off-by: Brian Foster <bfoster@redhat.com>
|
||||||
|
Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
||||||
|
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
|
||||||
|
---
|
||||||
|
libxfs/xfs_ag.h | 18 +++++++++---------
|
||||||
|
1 file changed, 9 insertions(+), 9 deletions(-)
|
||||||
|
|
||||||
|
diff --git a/libxfs/xfs_ag.h b/libxfs/xfs_ag.h
|
||||||
|
index 95570df..9cd0669 100644
|
||||||
|
--- a/libxfs/xfs_ag.h
|
||||||
|
+++ b/libxfs/xfs_ag.h
|
||||||
|
@@ -129,22 +129,22 @@ void xfs_perag_put(struct xfs_perag *pag);
|
||||||
|
static inline struct xfs_perag *
|
||||||
|
xfs_perag_next(
|
||||||
|
struct xfs_perag *pag,
|
||||||
|
- xfs_agnumber_t *next_agno)
|
||||||
|
+ xfs_agnumber_t *agno)
|
||||||
|
{
|
||||||
|
struct xfs_mount *mp = pag->pag_mount;
|
||||||
|
|
||||||
|
- *next_agno = pag->pag_agno + 1;
|
||||||
|
+ *agno = pag->pag_agno + 1;
|
||||||
|
xfs_perag_put(pag);
|
||||||
|
- return xfs_perag_get(mp, *next_agno);
|
||||||
|
+ return xfs_perag_get(mp, *agno);
|
||||||
|
}
|
||||||
|
|
||||||
|
-#define for_each_perag_range(mp, next_agno, end_agno, pag) \
|
||||||
|
- for ((pag) = xfs_perag_get((mp), (next_agno)); \
|
||||||
|
- (pag) != NULL && (next_agno) <= (end_agno); \
|
||||||
|
- (pag) = xfs_perag_next((pag), &(next_agno)))
|
||||||
|
+#define for_each_perag_range(mp, agno, end_agno, pag) \
|
||||||
|
+ for ((pag) = xfs_perag_get((mp), (agno)); \
|
||||||
|
+ (pag) != NULL && (agno) <= (end_agno); \
|
||||||
|
+ (pag) = xfs_perag_next((pag), &(agno)))
|
||||||
|
|
||||||
|
-#define for_each_perag_from(mp, next_agno, pag) \
|
||||||
|
- for_each_perag_range((mp), (next_agno), (mp)->m_sb.sb_agcount, (pag))
|
||||||
|
+#define for_each_perag_from(mp, agno, pag) \
|
||||||
|
+ for_each_perag_range((mp), (agno), (mp)->m_sb.sb_agcount, (pag))
|
||||||
|
|
||||||
|
|
||||||
|
#define for_each_perag(mp, agno, pag) \
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
47
0021-xfs-terminate-perag-iteration-reliably-on-agcount.patch
Normal file
47
0021-xfs-terminate-perag-iteration-reliably-on-agcount.patch
Normal file
@ -0,0 +1,47 @@
|
|||||||
|
From 6c18fde82cd02e550fb0c095bd6c6908dcc77747 Mon Sep 17 00:00:00 2001
|
||||||
|
From: Brian Foster <bfoster@redhat.com>
|
||||||
|
Date: Thu, 28 Apr 2022 15:39:03 -0400
|
||||||
|
Subject: [PATCH] xfs: terminate perag iteration reliably on agcount
|
||||||
|
|
||||||
|
Source kernel commit: 8ed004eb9d07a5d6114db3e97a166707c186262d
|
||||||
|
|
||||||
|
The for_each_perag_from() iteration macro relies on sb_agcount to
|
||||||
|
process every perag currently within EOFS from a given starting
|
||||||
|
point. It's perfectly valid to have perag structures beyond
|
||||||
|
sb_agcount, however, such as if a growfs is in progress. If a perag
|
||||||
|
loop happens to race with growfs in this manner, it will actually
|
||||||
|
attempt to process the post-EOFS perag where ->pag_agno ==
|
||||||
|
sb_agcount. This is reproduced by xfs/104 and manifests as the
|
||||||
|
following assert failure in superblock write verifier context:
|
||||||
|
|
||||||
|
XFS: Assertion failed: agno < mp->m_sb.sb_agcount, file: fs/xfs/libxfs/xfs_types.c, line: 22
|
||||||
|
|
||||||
|
Update the corresponding macro to only process perags that are
|
||||||
|
within the current sb_agcount.
|
||||||
|
|
||||||
|
Fixes: 58d43a7e3263 ("xfs: pass perags around in fsmap data dev functions")
|
||||||
|
Signed-off-by: Brian Foster <bfoster@redhat.com>
|
||||||
|
Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
||||||
|
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
|
||||||
|
---
|
||||||
|
libxfs/xfs_ag.h | 2 +-
|
||||||
|
1 file changed, 1 insertion(+), 1 deletion(-)
|
||||||
|
|
||||||
|
diff --git a/libxfs/xfs_ag.h b/libxfs/xfs_ag.h
|
||||||
|
index 9cd0669..fae2a38 100644
|
||||||
|
--- a/libxfs/xfs_ag.h
|
||||||
|
+++ b/libxfs/xfs_ag.h
|
||||||
|
@@ -144,7 +144,7 @@ xfs_perag_next(
|
||||||
|
(pag) = xfs_perag_next((pag), &(agno)))
|
||||||
|
|
||||||
|
#define for_each_perag_from(mp, agno, pag) \
|
||||||
|
- for_each_perag_range((mp), (agno), (mp)->m_sb.sb_agcount, (pag))
|
||||||
|
+ for_each_perag_range((mp), (agno), (mp)->m_sb.sb_agcount - 1, (pag))
|
||||||
|
|
||||||
|
|
||||||
|
#define for_each_perag(mp, agno, pag) \
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
@ -0,0 +1,88 @@
|
|||||||
|
From 9619d9e715b2eba7c39683bcbc721d3954275eb4 Mon Sep 17 00:00:00 2001
|
||||||
|
From: Brian Foster <bfoster@redhat.com>
|
||||||
|
Date: Thu, 28 Apr 2022 15:39:03 -0400
|
||||||
|
Subject: [PATCH] xfs: fix perag reference leak on iteration race with growfs
|
||||||
|
|
||||||
|
Source kernel commit: 892a666fafa19ab04b5e948f6c92f98f1dafb489
|
||||||
|
|
||||||
|
The for_each_perag*() set of macros are hacky in that some (i.e.
|
||||||
|
those based on sb_agcount) rely on the assumption that perag
|
||||||
|
iteration terminates naturally with a NULL perag at the specified
|
||||||
|
end_agno. Others allow for the final AG to have a valid perag and
|
||||||
|
require the calling function to clean up any potential leftover
|
||||||
|
xfs_perag reference on termination of the loop.
|
||||||
|
|
||||||
|
Aside from providing a subtly inconsistent interface, the former
|
||||||
|
variant is racy with growfs because growfs can create discoverable
|
||||||
|
post-eofs perags before the final superblock update that completes
|
||||||
|
the grow operation and increases sb_agcount. This leads to the
|
||||||
|
following assert failure (reproduced by xfs/104) in the perag free
|
||||||
|
path during unmount:
|
||||||
|
|
||||||
|
XFS: Assertion failed: atomic_read(&pag->pag_ref) == 0, file: fs/xfs/libxfs/xfs_ag.c, line: 195
|
||||||
|
|
||||||
|
This occurs because one of the many for_each_perag() loops in the
|
||||||
|
code that is expected to terminate with a NULL pag (and thus has no
|
||||||
|
post-loop xfs_perag_put() check) raced with a growfs and found a
|
||||||
|
non-NULL post-EOFS perag, but terminated naturally based on the
|
||||||
|
end_agno check without releasing the post-EOFS perag.
|
||||||
|
|
||||||
|
Rework the iteration logic to lift the agno check from the main for
|
||||||
|
loop conditional to the iteration helper function. The for loop now
|
||||||
|
purely terminates on a NULL pag and xfs_perag_next() avoids taking a
|
||||||
|
reference to any perag beyond end_agno in the first place.
|
||||||
|
|
||||||
|
Fixes: f250eedcf762 ("xfs: make for_each_perag... a first class citizen")
|
||||||
|
Signed-off-by: Brian Foster <bfoster@redhat.com>
|
||||||
|
Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
||||||
|
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
|
||||||
|
---
|
||||||
|
libxfs/xfs_ag.h | 16 ++++++----------
|
||||||
|
1 file changed, 6 insertions(+), 10 deletions(-)
|
||||||
|
|
||||||
|
diff --git a/libxfs/xfs_ag.h b/libxfs/xfs_ag.h
|
||||||
|
index fae2a38..e411d51 100644
|
||||||
|
--- a/libxfs/xfs_ag.h
|
||||||
|
+++ b/libxfs/xfs_ag.h
|
||||||
|
@@ -118,30 +118,26 @@ void xfs_perag_put(struct xfs_perag *pag);
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Perag iteration APIs
|
||||||
|
- *
|
||||||
|
- * XXX: for_each_perag_range() usage really needs an iterator to clean up when
|
||||||
|
- * we terminate at end_agno because we may have taken a reference to the perag
|
||||||
|
- * beyond end_agno. Right now callers have to be careful to catch and clean that
|
||||||
|
- * up themselves. This is not necessary for the callers of for_each_perag() and
|
||||||
|
- * for_each_perag_from() because they terminate at sb_agcount where there are
|
||||||
|
- * no perag structures in tree beyond end_agno.
|
||||||
|
*/
|
||||||
|
static inline struct xfs_perag *
|
||||||
|
xfs_perag_next(
|
||||||
|
struct xfs_perag *pag,
|
||||||
|
- xfs_agnumber_t *agno)
|
||||||
|
+ xfs_agnumber_t *agno,
|
||||||
|
+ xfs_agnumber_t end_agno)
|
||||||
|
{
|
||||||
|
struct xfs_mount *mp = pag->pag_mount;
|
||||||
|
|
||||||
|
*agno = pag->pag_agno + 1;
|
||||||
|
xfs_perag_put(pag);
|
||||||
|
+ if (*agno > end_agno)
|
||||||
|
+ return NULL;
|
||||||
|
return xfs_perag_get(mp, *agno);
|
||||||
|
}
|
||||||
|
|
||||||
|
#define for_each_perag_range(mp, agno, end_agno, pag) \
|
||||||
|
for ((pag) = xfs_perag_get((mp), (agno)); \
|
||||||
|
- (pag) != NULL && (agno) <= (end_agno); \
|
||||||
|
- (pag) = xfs_perag_next((pag), &(agno)))
|
||||||
|
+ (pag) != NULL; \
|
||||||
|
+ (pag) = xfs_perag_next((pag), &(agno), (end_agno)))
|
||||||
|
|
||||||
|
#define for_each_perag_from(mp, agno, pag) \
|
||||||
|
for_each_perag_range((mp), (agno), (mp)->m_sb.sb_agcount - 1, (pag))
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
@ -0,0 +1,91 @@
|
|||||||
|
From b6a7b627b1211f87e3bac3dc0111d056e70aa773 Mon Sep 17 00:00:00 2001
|
||||||
|
From: "Darrick J. Wong" <djwong@kernel.org>
|
||||||
|
Date: Tue, 17 May 2022 22:48:12 -0400
|
||||||
|
Subject: [PATCH] mkfs: fix missing validation of -l size against maximum
|
||||||
|
internal log size
|
||||||
|
|
||||||
|
If a sysadmin specifies a log size explicitly, we don't actually check
|
||||||
|
that against the maximum internal log size that we compute for the
|
||||||
|
default log size computation. We're going to add more validation soon,
|
||||||
|
so refactor the max internal log blocks into a common variable and
|
||||||
|
add a check.
|
||||||
|
|
||||||
|
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
||||||
|
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
|
||||||
|
---
|
||||||
|
mkfs/xfs_mkfs.c | 36 ++++++++++++++++++++++--------------
|
||||||
|
1 file changed, 22 insertions(+), 14 deletions(-)
|
||||||
|
|
||||||
|
diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
|
||||||
|
index b7e335f..f7006af 100644
|
||||||
|
--- a/mkfs/xfs_mkfs.c
|
||||||
|
+++ b/mkfs/xfs_mkfs.c
|
||||||
|
@@ -3268,6 +3268,7 @@ calculate_log_size(
|
||||||
|
{
|
||||||
|
struct xfs_sb *sbp = &mp->m_sb;
|
||||||
|
int min_logblocks;
|
||||||
|
+ int max_logblocks; /* absolute max for this AG */
|
||||||
|
struct xfs_mount mount;
|
||||||
|
|
||||||
|
/* we need a temporary mount to calculate the minimum log size. */
|
||||||
|
@@ -3307,6 +3308,18 @@ _("external log device size %lld blocks too small, must be at least %lld blocks\
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
+ /*
|
||||||
|
+ * Make sure the log fits wholly within an AG
|
||||||
|
+ *
|
||||||
|
+ * XXX: If agf->freeblks ends up as 0 because the log uses all
|
||||||
|
+ * the free space, it causes the kernel all sorts of problems
|
||||||
|
+ * with per-ag reservations. Right now just back it off one
|
||||||
|
+ * block, but there's a whole can of worms here that needs to be
|
||||||
|
+ * opened to decide what is the valid maximum size of a log in
|
||||||
|
+ * an AG.
|
||||||
|
+ */
|
||||||
|
+ max_logblocks = libxfs_alloc_ag_max_usable(mp) - 1;
|
||||||
|
+
|
||||||
|
/* internal log - if no size specified, calculate automatically */
|
||||||
|
if (!cfg->logblocks) {
|
||||||
|
if (cfg->dblocks < GIGABYTES(1, cfg->blocklog)) {
|
||||||
|
@@ -3332,21 +3345,9 @@ _("external log device size %lld blocks too small, must be at least %lld blocks\
|
||||||
|
cfg->logblocks = cfg->logblocks >> cfg->blocklog;
|
||||||
|
}
|
||||||
|
|
||||||
|
- /* Ensure the chosen size meets minimum log size requirements */
|
||||||
|
+ /* Ensure the chosen size fits within log size requirements */
|
||||||
|
cfg->logblocks = max(min_logblocks, cfg->logblocks);
|
||||||
|
-
|
||||||
|
- /*
|
||||||
|
- * Make sure the log fits wholly within an AG
|
||||||
|
- *
|
||||||
|
- * XXX: If agf->freeblks ends up as 0 because the log uses all
|
||||||
|
- * the free space, it causes the kernel all sorts of problems
|
||||||
|
- * with per-ag reservations. Right now just back it off one
|
||||||
|
- * block, but there's a whole can of worms here that needs to be
|
||||||
|
- * opened to decide what is the valid maximum size of a log in
|
||||||
|
- * an AG.
|
||||||
|
- */
|
||||||
|
- cfg->logblocks = min(cfg->logblocks,
|
||||||
|
- libxfs_alloc_ag_max_usable(mp) - 1);
|
||||||
|
+ cfg->logblocks = min(cfg->logblocks, max_logblocks);
|
||||||
|
|
||||||
|
/* and now clamp the size to the maximum supported size */
|
||||||
|
cfg->logblocks = min(cfg->logblocks, XFS_MAX_LOG_BLOCKS);
|
||||||
|
@@ -3354,6 +3355,13 @@ _("external log device size %lld blocks too small, must be at least %lld blocks\
|
||||||
|
cfg->logblocks = XFS_MAX_LOG_BYTES >> cfg->blocklog;
|
||||||
|
|
||||||
|
validate_log_size(cfg->logblocks, cfg->blocklog, min_logblocks);
|
||||||
|
+ } else if (cfg->logblocks > max_logblocks) {
|
||||||
|
+ /* check specified log size */
|
||||||
|
+ fprintf(stderr,
|
||||||
|
+_("internal log size %lld too large, must be less than %d\n"),
|
||||||
|
+ (long long)cfg->logblocks,
|
||||||
|
+ max_logblocks);
|
||||||
|
+ usage();
|
||||||
|
}
|
||||||
|
|
||||||
|
if (cfg->logblocks > sbp->sb_agblocks - libxfs_prealloc_blocks(mp)) {
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
108
0024-mkfs-reduce-internal-log-size-when-log-stripe-units-.patch
Normal file
108
0024-mkfs-reduce-internal-log-size-when-log-stripe-units-.patch
Normal file
@ -0,0 +1,108 @@
|
|||||||
|
From 8d1bff2be3360572fbee9ed83e0d1c86af1614c5 Mon Sep 17 00:00:00 2001
|
||||||
|
From: "Darrick J. Wong" <djwong@kernel.org>
|
||||||
|
Date: Tue, 17 May 2022 22:48:12 -0400
|
||||||
|
Subject: [PATCH] mkfs: reduce internal log size when log stripe units are in
|
||||||
|
play
|
||||||
|
|
||||||
|
Currently, one can feed mkfs a combination of options like this:
|
||||||
|
|
||||||
|
$ truncate -s 6366g /tmp/a ; mkfs.xfs -f /tmp/a -d agcount=3200 -d su=256k,sw=4
|
||||||
|
meta-data=/tmp/a isize=512 agcount=3200, agsize=521536 blks
|
||||||
|
= sectsz=512 attr=2, projid32bit=1
|
||||||
|
= crc=1 finobt=1, sparse=1, rmapbt=0
|
||||||
|
= reflink=1 bigtime=0 inobtcount=0
|
||||||
|
data = bsize=4096 blocks=1668808704, imaxpct=5
|
||||||
|
= sunit=64 swidth=256 blks
|
||||||
|
naming =version 2 bsize=4096 ascii-ci=0, ftype=1
|
||||||
|
log =internal log bsize=4096 blocks=521536, version=2
|
||||||
|
= sectsz=512 sunit=64 blks, lazy-count=1
|
||||||
|
realtime =none extsz=4096 blocks=0, rtextents=0
|
||||||
|
Metadata corruption detected at 0x55e88052c6b6, xfs_agf block 0x1/0x200
|
||||||
|
libxfs_bwrite: write verifier failed on xfs_agf bno 0x1/0x1
|
||||||
|
mkfs.xfs: writing AG headers failed, err=117
|
||||||
|
|
||||||
|
The format fails because the internal log size sizing algorithm
|
||||||
|
specifies a log size of 521492 blocks to avoid taking all the space in
|
||||||
|
the AG, but align_log_size sees the stripe unit and rounds that up to
|
||||||
|
the next stripe unit, which is 521536 blocks.
|
||||||
|
|
||||||
|
Fix this problem by rounding the log size down if rounding up would
|
||||||
|
result in a log that consumes more space in the AG than we allow.
|
||||||
|
|
||||||
|
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
||||||
|
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
|
||||||
|
---
|
||||||
|
mkfs/xfs_mkfs.c | 19 +++++++++++--------
|
||||||
|
1 file changed, 11 insertions(+), 8 deletions(-)
|
||||||
|
|
||||||
|
diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
|
||||||
|
index e11b39d..eb4d7fa 100644
|
||||||
|
--- a/mkfs/xfs_mkfs.c
|
||||||
|
+++ b/mkfs/xfs_mkfs.c
|
||||||
|
@@ -3180,9 +3180,10 @@ sb_set_features(
|
||||||
|
static void
|
||||||
|
align_log_size(
|
||||||
|
struct mkfs_params *cfg,
|
||||||
|
- int sunit)
|
||||||
|
+ int sunit,
|
||||||
|
+ int max_logblocks)
|
||||||
|
{
|
||||||
|
- uint64_t tmp_logblocks;
|
||||||
|
+ uint64_t tmp_logblocks;
|
||||||
|
|
||||||
|
/* nothing to do if it's already aligned. */
|
||||||
|
if ((cfg->logblocks % sunit) == 0)
|
||||||
|
@@ -3199,7 +3200,8 @@ _("log size %lld is not a multiple of the log stripe unit %d\n"),
|
||||||
|
|
||||||
|
/* If the log is too large, round down instead of round up */
|
||||||
|
if ((tmp_logblocks > XFS_MAX_LOG_BLOCKS) ||
|
||||||
|
- ((tmp_logblocks << cfg->blocklog) > XFS_MAX_LOG_BYTES)) {
|
||||||
|
+ ((tmp_logblocks << cfg->blocklog) > XFS_MAX_LOG_BYTES) ||
|
||||||
|
+ tmp_logblocks > max_logblocks) {
|
||||||
|
tmp_logblocks = (cfg->logblocks / sunit) * sunit;
|
||||||
|
}
|
||||||
|
cfg->logblocks = tmp_logblocks;
|
||||||
|
@@ -3213,7 +3215,8 @@ static void
|
||||||
|
align_internal_log(
|
||||||
|
struct mkfs_params *cfg,
|
||||||
|
struct xfs_mount *mp,
|
||||||
|
- int sunit)
|
||||||
|
+ int sunit,
|
||||||
|
+ int max_logblocks)
|
||||||
|
{
|
||||||
|
uint64_t logend;
|
||||||
|
|
||||||
|
@@ -3231,7 +3234,7 @@ _("Due to stripe alignment, the internal log start (%lld) cannot be aligned\n"
|
||||||
|
}
|
||||||
|
|
||||||
|
/* round up/down the log size now */
|
||||||
|
- align_log_size(cfg, sunit);
|
||||||
|
+ align_log_size(cfg, sunit, max_logblocks);
|
||||||
|
|
||||||
|
/* check the aligned log still starts and ends in the same AG. */
|
||||||
|
logend = cfg->logstart + cfg->logblocks - 1;
|
||||||
|
@@ -3309,7 +3312,7 @@ _("external log device size %lld blocks too small, must be at least %lld blocks\
|
||||||
|
cfg->logstart = 0;
|
||||||
|
cfg->logagno = 0;
|
||||||
|
if (cfg->lsunit)
|
||||||
|
- align_log_size(cfg, cfg->lsunit);
|
||||||
|
+ align_log_size(cfg, cfg->lsunit, XFS_MAX_LOG_BLOCKS);
|
||||||
|
|
||||||
|
validate_log_size(cfg->logblocks, cfg->blocklog, min_logblocks);
|
||||||
|
return;
|
||||||
|
@@ -3386,9 +3389,9 @@ _("log ag number %lld too large, must be less than %lld\n"),
|
||||||
|
* Align the logstart at stripe unit boundary.
|
||||||
|
*/
|
||||||
|
if (cfg->lsunit) {
|
||||||
|
- align_internal_log(cfg, mp, cfg->lsunit);
|
||||||
|
+ align_internal_log(cfg, mp, cfg->lsunit, max_logblocks);
|
||||||
|
} else if (cfg->dsunit) {
|
||||||
|
- align_internal_log(cfg, mp, cfg->dsunit);
|
||||||
|
+ align_internal_log(cfg, mp, cfg->dsunit, max_logblocks);
|
||||||
|
}
|
||||||
|
validate_log_size(cfg->logblocks, cfg->blocklog, min_logblocks);
|
||||||
|
}
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
105
0025-mkfs-don-t-let-internal-logs-bump-the-root-dir-inode.patch
Normal file
105
0025-mkfs-don-t-let-internal-logs-bump-the-root-dir-inode.patch
Normal file
@ -0,0 +1,105 @@
|
|||||||
|
From 1b580a773a65eb9b2fe7f777dd6900c0d6e9a7b3 Mon Sep 17 00:00:00 2001
|
||||||
|
From: "Darrick J. Wong" <djwong@kernel.org>
|
||||||
|
Date: Tue, 17 May 2022 22:48:13 -0400
|
||||||
|
Subject: [PATCH] mkfs: don't let internal logs bump the root dir inode chunk
|
||||||
|
to AG 1
|
||||||
|
|
||||||
|
Currently, we don't let an internal log consume every last block in an
|
||||||
|
AG. According to the comment, we're doing this to avoid tripping AGF
|
||||||
|
verifiers if freeblks==0, but on a modern filesystem this isn't
|
||||||
|
sufficient to avoid problems because we need to have enough space in the
|
||||||
|
AG to allocate an aligned root inode chunk, if it should be the case
|
||||||
|
that the log also ends up in AG 0:
|
||||||
|
|
||||||
|
$ truncate -s 6366g /tmp/a ; mkfs.xfs -f /tmp/a -d agcount=3200 -l agnum=0
|
||||||
|
meta-data=/tmp/a isize=512 agcount=3200, agsize=521503 blks
|
||||||
|
= sectsz=512 attr=2, projid32bit=1
|
||||||
|
= crc=1 finobt=1, sparse=1, rmapbt=0
|
||||||
|
= reflink=1 bigtime=0 inobtcount=0
|
||||||
|
data = bsize=4096 blocks=1668808704, imaxpct=5
|
||||||
|
= sunit=0 swidth=0 blks
|
||||||
|
naming =version 2 bsize=4096 ascii-ci=0, ftype=1
|
||||||
|
log =internal log bsize=4096 blocks=521492, version=2
|
||||||
|
= sectsz=512 sunit=0 blks, lazy-count=1
|
||||||
|
realtime =none extsz=4096 blocks=0, rtextents=0
|
||||||
|
mkfs.xfs: root inode created in AG 1, not AG 0
|
||||||
|
|
||||||
|
Therefore, modify the maximum internal log size calculation to constrain
|
||||||
|
the maximum internal log size so that the aligned inode chunk allocation
|
||||||
|
will always succeed.
|
||||||
|
|
||||||
|
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
||||||
|
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
|
||||||
|
---
|
||||||
|
mkfs/xfs_mkfs.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++
|
||||||
|
1 file changed, 47 insertions(+)
|
||||||
|
|
||||||
|
diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
|
||||||
|
index eb4d7fa..0b1fb74 100644
|
||||||
|
--- a/mkfs/xfs_mkfs.c
|
||||||
|
+++ b/mkfs/xfs_mkfs.c
|
||||||
|
@@ -3271,6 +3271,49 @@ validate_log_size(uint64_t logblocks, int blocklog, int min_logblocks)
|
||||||
|
}
|
||||||
|
|
||||||
|
static void
|
||||||
|
+adjust_ag0_internal_logblocks(
|
||||||
|
+ struct mkfs_params *cfg,
|
||||||
|
+ struct xfs_mount *mp,
|
||||||
|
+ int min_logblocks,
|
||||||
|
+ int *max_logblocks)
|
||||||
|
+{
|
||||||
|
+ int backoff = 0;
|
||||||
|
+ int ichunk_blocks;
|
||||||
|
+
|
||||||
|
+ /*
|
||||||
|
+ * mkfs will trip over the write verifiers if the log is allocated in
|
||||||
|
+ * AG 0 and consumes enough space that we cannot allocate a non-sparse
|
||||||
|
+ * inode chunk for the root directory. The inode allocator requires
|
||||||
|
+ * that the AG have enough free space for the chunk itself plus enough
|
||||||
|
+ * to fix up the freelist with aligned blocks if we need to fill the
|
||||||
|
+ * allocation from the AGFL.
|
||||||
|
+ */
|
||||||
|
+ ichunk_blocks = XFS_INODES_PER_CHUNK * cfg->inodesize >> cfg->blocklog;
|
||||||
|
+ backoff = ichunk_blocks * 4;
|
||||||
|
+
|
||||||
|
+ /*
|
||||||
|
+ * We try to align inode allocations to the data device stripe unit,
|
||||||
|
+ * so ensure there's enough space to perform an aligned allocation.
|
||||||
|
+ * The inode geometry structure isn't set up yet, so compute this by
|
||||||
|
+ * hand.
|
||||||
|
+ */
|
||||||
|
+ backoff = max(backoff, cfg->dsunit * 2);
|
||||||
|
+
|
||||||
|
+ *max_logblocks -= backoff;
|
||||||
|
+
|
||||||
|
+ /* If the specified log size is too big, complain. */
|
||||||
|
+ if (cli_opt_set(&lopts, L_SIZE) && cfg->logblocks > *max_logblocks) {
|
||||||
|
+ fprintf(stderr,
|
||||||
|
+_("internal log size %lld too large, must be less than %d\n"),
|
||||||
|
+ (long long)cfg->logblocks,
|
||||||
|
+ *max_logblocks);
|
||||||
|
+ usage();
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ cfg->logblocks = min(cfg->logblocks, *max_logblocks);
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
+static void
|
||||||
|
calculate_log_size(
|
||||||
|
struct mkfs_params *cfg,
|
||||||
|
struct cli_params *cli,
|
||||||
|
@@ -3382,6 +3425,10 @@ _("log ag number %lld too large, must be less than %lld\n"),
|
||||||
|
} else
|
||||||
|
cfg->logagno = (xfs_agnumber_t)(sbp->sb_agcount / 2);
|
||||||
|
|
||||||
|
+ if (cfg->logagno == 0)
|
||||||
|
+ adjust_ag0_internal_logblocks(cfg, mp, min_logblocks,
|
||||||
|
+ &max_logblocks);
|
||||||
|
+
|
||||||
|
cfg->logstart = XFS_AGB_TO_FSB(mp, cfg->logagno,
|
||||||
|
libxfs_prealloc_blocks(mp));
|
||||||
|
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
73
0026-mkfs-improve-log-extent-validation.patch
Normal file
73
0026-mkfs-improve-log-extent-validation.patch
Normal file
@ -0,0 +1,73 @@
|
|||||||
|
From 93a199f21dd12fdef4cbcb6821e58e2c301727e2 Mon Sep 17 00:00:00 2001
|
||||||
|
From: "Darrick J. Wong" <djwong@kernel.org>
|
||||||
|
Date: Tue, 17 May 2022 22:48:13 -0400
|
||||||
|
Subject: [PATCH] mkfs: improve log extent validation
|
||||||
|
|
||||||
|
Use the standard libxfs fsblock verifiers to check the start and end of
|
||||||
|
the internal log. The current code does not catch the case of a
|
||||||
|
(segmented) fsblock that is beyond agf_blocks but not so large to change
|
||||||
|
the agno part of the segmented fsblock.
|
||||||
|
|
||||||
|
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
||||||
|
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
|
||||||
|
---
|
||||||
|
libxfs/libxfs_api_defs.h | 1 +
|
||||||
|
mkfs/xfs_mkfs.c | 10 ++++------
|
||||||
|
2 files changed, 5 insertions(+), 6 deletions(-)
|
||||||
|
|
||||||
|
diff --git a/libxfs/libxfs_api_defs.h b/libxfs/libxfs_api_defs.h
|
||||||
|
index 8abbd23..370ad8b 100644
|
||||||
|
--- a/libxfs/libxfs_api_defs.h
|
||||||
|
+++ b/libxfs/libxfs_api_defs.h
|
||||||
|
@@ -208,6 +208,7 @@
|
||||||
|
#define xfs_verify_agino libxfs_verify_agino
|
||||||
|
#define xfs_verify_cksum libxfs_verify_cksum
|
||||||
|
#define xfs_verify_dir_ino libxfs_verify_dir_ino
|
||||||
|
+#define xfs_verify_fsbext libxfs_verify_fsbext
|
||||||
|
#define xfs_verify_fsbno libxfs_verify_fsbno
|
||||||
|
#define xfs_verify_ino libxfs_verify_ino
|
||||||
|
#define xfs_verify_rtbno libxfs_verify_rtbno
|
||||||
|
diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
|
||||||
|
index 0b1fb74..b932aca 100644
|
||||||
|
--- a/mkfs/xfs_mkfs.c
|
||||||
|
+++ b/mkfs/xfs_mkfs.c
|
||||||
|
@@ -3218,15 +3218,13 @@ align_internal_log(
|
||||||
|
int sunit,
|
||||||
|
int max_logblocks)
|
||||||
|
{
|
||||||
|
- uint64_t logend;
|
||||||
|
-
|
||||||
|
/* round up log start if necessary */
|
||||||
|
if ((cfg->logstart % sunit) != 0)
|
||||||
|
cfg->logstart = ((cfg->logstart + (sunit - 1)) / sunit) * sunit;
|
||||||
|
|
||||||
|
/* If our log start overlaps the next AG's metadata, fail. */
|
||||||
|
- if (XFS_FSB_TO_AGBNO(mp, cfg->logstart) <= XFS_AGFL_BLOCK(mp)) {
|
||||||
|
- fprintf(stderr,
|
||||||
|
+ if (!libxfs_verify_fsbno(mp, cfg->logstart)) {
|
||||||
|
+ fprintf(stderr,
|
||||||
|
_("Due to stripe alignment, the internal log start (%lld) cannot be aligned\n"
|
||||||
|
"within an allocation group.\n"),
|
||||||
|
(long long) cfg->logstart);
|
||||||
|
@@ -3237,8 +3235,7 @@ _("Due to stripe alignment, the internal log start (%lld) cannot be aligned\n"
|
||||||
|
align_log_size(cfg, sunit, max_logblocks);
|
||||||
|
|
||||||
|
/* check the aligned log still starts and ends in the same AG. */
|
||||||
|
- logend = cfg->logstart + cfg->logblocks - 1;
|
||||||
|
- if (XFS_FSB_TO_AGNO(mp, cfg->logstart) != XFS_FSB_TO_AGNO(mp, logend)) {
|
||||||
|
+ if (!libxfs_verify_fsbext(mp, cfg->logstart, cfg->logblocks)) {
|
||||||
|
fprintf(stderr,
|
||||||
|
_("Due to stripe alignment, the internal log size (%lld) is too large.\n"
|
||||||
|
"Must fit within an allocation group.\n"),
|
||||||
|
@@ -3465,6 +3462,7 @@ start_superblock_setup(
|
||||||
|
sbp->sb_agblocks = (xfs_agblock_t)cfg->agsize;
|
||||||
|
sbp->sb_agblklog = (uint8_t)log2_roundup(cfg->agsize);
|
||||||
|
sbp->sb_agcount = (xfs_agnumber_t)cfg->agcount;
|
||||||
|
+ sbp->sb_dblocks = (xfs_rfsblock_t)cfg->dblocks;
|
||||||
|
|
||||||
|
sbp->sb_inodesize = (uint16_t)cfg->inodesize;
|
||||||
|
sbp->sb_inodelog = (uint8_t)cfg->inodelog;
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
130
0027-xfs_repair-detect-v5-featureset-mismatches-in-second.patch
Normal file
130
0027-xfs_repair-detect-v5-featureset-mismatches-in-second.patch
Normal file
@ -0,0 +1,130 @@
|
|||||||
|
From 2b7301269e82e86d9601392d289e38f3f66b1467 Mon Sep 17 00:00:00 2001
|
||||||
|
From: "Darrick J. Wong" <djwong@kernel.org>
|
||||||
|
Date: Tue, 17 May 2022 22:48:13 -0400
|
||||||
|
Subject: [PATCH] xfs_repair: detect v5 featureset mismatches in secondary
|
||||||
|
supers
|
||||||
|
|
||||||
|
Make sure we detect and correct mismatches between the V5 features
|
||||||
|
described in the primary and the secondary superblocks.
|
||||||
|
|
||||||
|
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
|
||||||
|
[sandeen: add comment about XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR]
|
||||||
|
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
|
||||||
|
---
|
||||||
|
repair/agheader.c | 92 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
||||||
|
1 file changed, 92 insertions(+)
|
||||||
|
|
||||||
|
diff --git a/repair/agheader.c b/repair/agheader.c
|
||||||
|
index 2af2410..90adc1f 100644
|
||||||
|
--- a/repair/agheader.c
|
||||||
|
+++ b/repair/agheader.c
|
||||||
|
@@ -221,6 +221,96 @@ compare_sb(xfs_mount_t *mp, xfs_sb_t *sb)
|
||||||
|
}
|
||||||
|
|
||||||
|
/*
|
||||||
|
+ * If the fs feature bits on a secondary superblock don't match the
|
||||||
|
+ * primary, we need to update them.
|
||||||
|
+ */
|
||||||
|
+static inline int
|
||||||
|
+check_v5_feature_mismatch(
|
||||||
|
+ struct xfs_mount *mp,
|
||||||
|
+ xfs_agnumber_t agno,
|
||||||
|
+ struct xfs_sb *sb)
|
||||||
|
+{
|
||||||
|
+ bool dirty = false;
|
||||||
|
+
|
||||||
|
+ if (!xfs_sb_version_hascrc(&mp->m_sb) || agno == 0)
|
||||||
|
+ return 0;
|
||||||
|
+
|
||||||
|
+ if (mp->m_sb.sb_features_compat != sb->sb_features_compat) {
|
||||||
|
+ if (no_modify) {
|
||||||
|
+ do_warn(
|
||||||
|
+ _("would fix compat feature mismatch in AG %u super, 0x%x != 0x%x\n"),
|
||||||
|
+ agno, mp->m_sb.sb_features_compat,
|
||||||
|
+ sb->sb_features_compat);
|
||||||
|
+ } else {
|
||||||
|
+ do_warn(
|
||||||
|
+ _("will fix compat feature mismatch in AG %u super, 0x%x != 0x%x\n"),
|
||||||
|
+ agno, mp->m_sb.sb_features_compat,
|
||||||
|
+ sb->sb_features_compat);
|
||||||
|
+ dirty = true;
|
||||||
|
+ }
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ /*
|
||||||
|
+ * Ignore XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR becauses the repair upgrade
|
||||||
|
+ * path sets it only on the primary while upgrading.
|
||||||
|
+ */
|
||||||
|
+ if ((mp->m_sb.sb_features_incompat ^ sb->sb_features_incompat) &
|
||||||
|
+ ~XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR) {
|
||||||
|
+ if (no_modify) {
|
||||||
|
+ do_warn(
|
||||||
|
+ _("would fix incompat feature mismatch in AG %u super, 0x%x != 0x%x\n"),
|
||||||
|
+ agno, mp->m_sb.sb_features_incompat,
|
||||||
|
+ sb->sb_features_incompat);
|
||||||
|
+ } else {
|
||||||
|
+ do_warn(
|
||||||
|
+ _("will fix incompat feature mismatch in AG %u super, 0x%x != 0x%x\n"),
|
||||||
|
+ agno, mp->m_sb.sb_features_incompat,
|
||||||
|
+ sb->sb_features_incompat);
|
||||||
|
+ dirty = true;
|
||||||
|
+ }
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ if (mp->m_sb.sb_features_ro_compat != sb->sb_features_ro_compat) {
|
||||||
|
+ if (no_modify) {
|
||||||
|
+ do_warn(
|
||||||
|
+ _("would fix ro compat feature mismatch in AG %u super, 0x%x != 0x%x\n"),
|
||||||
|
+ agno, mp->m_sb.sb_features_ro_compat,
|
||||||
|
+ sb->sb_features_ro_compat);
|
||||||
|
+ } else {
|
||||||
|
+ do_warn(
|
||||||
|
+ _("will fix ro compat feature mismatch in AG %u super, 0x%x != 0x%x\n"),
|
||||||
|
+ agno, mp->m_sb.sb_features_ro_compat,
|
||||||
|
+ sb->sb_features_ro_compat);
|
||||||
|
+ dirty = true;
|
||||||
|
+ }
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ if (mp->m_sb.sb_features_log_incompat != sb->sb_features_log_incompat) {
|
||||||
|
+ if (no_modify) {
|
||||||
|
+ do_warn(
|
||||||
|
+ _("would fix log incompat feature mismatch in AG %u super, 0x%x != 0x%x\n"),
|
||||||
|
+ agno, mp->m_sb.sb_features_log_incompat,
|
||||||
|
+ sb->sb_features_log_incompat);
|
||||||
|
+ } else {
|
||||||
|
+ do_warn(
|
||||||
|
+ _("will fix log incompat feature mismatch in AG %u super, 0x%x != 0x%x\n"),
|
||||||
|
+ agno, mp->m_sb.sb_features_log_incompat,
|
||||||
|
+ sb->sb_features_log_incompat);
|
||||||
|
+ dirty = true;
|
||||||
|
+ }
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ if (!dirty)
|
||||||
|
+ return 0;
|
||||||
|
+
|
||||||
|
+ sb->sb_features_compat = mp->m_sb.sb_features_compat;
|
||||||
|
+ sb->sb_features_ro_compat = mp->m_sb.sb_features_ro_compat;
|
||||||
|
+ sb->sb_features_incompat = mp->m_sb.sb_features_incompat;
|
||||||
|
+ sb->sb_features_log_incompat = mp->m_sb.sb_features_log_incompat;
|
||||||
|
+ return XR_AG_SB_SEC;
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
+/*
|
||||||
|
* Possible fields that may have been set at mkfs time,
|
||||||
|
* sb_inoalignmt, sb_unit, sb_width and sb_dirblklog.
|
||||||
|
* The quota inode fields in the secondaries should be zero.
|
||||||
|
@@ -452,6 +542,8 @@ secondary_sb_whack(
|
||||||
|
rval |= XR_AG_SB_SEC;
|
||||||
|
}
|
||||||
|
|
||||||
|
+ rval |= check_v5_feature_mismatch(mp, i, sb);
|
||||||
|
+
|
||||||
|
if (xfs_sb_version_needsrepair(sb)) {
|
||||||
|
if (i == 0) {
|
||||||
|
if (!no_modify)
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
134
0028-xfs_repair-check-the-ftype-of-dot-and-dotdot-directo.patch
Normal file
134
0028-xfs_repair-check-the-ftype-of-dot-and-dotdot-directo.patch
Normal file
@ -0,0 +1,134 @@
|
|||||||
|
From 5008cbb4b0eaef22e5a0e13a5a2c17457671e34a Mon Sep 17 00:00:00 2001
|
||||||
|
From: "Darrick J. Wong" <djwong@kernel.org>
|
||||||
|
Date: Tue, 17 May 2022 22:48:13 -0400
|
||||||
|
Subject: [PATCH] xfs_repair: check the ftype of dot and dotdot directory
|
||||||
|
entries
|
||||||
|
|
||||||
|
The long-format directory block checking code skips the filetype check
|
||||||
|
for the '.' and '..' entries, even though they're part of the ondisk
|
||||||
|
format. This leads to repair failing to catch subtle corruption at the
|
||||||
|
start of a directory.
|
||||||
|
|
||||||
|
Found by fuzzing bu[0].filetype = zeroes in xfs/386.
|
||||||
|
|
||||||
|
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
|
||||||
|
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
|
||||||
|
---
|
||||||
|
repair/phase6.c | 79 +++++++++++++++++++++++++++++++++++++++------------------
|
||||||
|
1 file changed, 54 insertions(+), 25 deletions(-)
|
||||||
|
|
||||||
|
diff --git a/repair/phase6.c b/repair/phase6.c
|
||||||
|
index 696a642..06232fb 100644
|
||||||
|
--- a/repair/phase6.c
|
||||||
|
+++ b/repair/phase6.c
|
||||||
|
@@ -1412,6 +1412,48 @@ dir2_kill_block(
|
||||||
|
_("directory shrink failed (%d)\n"), error);
|
||||||
|
}
|
||||||
|
|
||||||
|
+static inline void
|
||||||
|
+check_longform_ftype(
|
||||||
|
+ struct xfs_mount *mp,
|
||||||
|
+ struct xfs_inode *ip,
|
||||||
|
+ xfs_dir2_data_entry_t *dep,
|
||||||
|
+ ino_tree_node_t *irec,
|
||||||
|
+ int ino_offset,
|
||||||
|
+ struct dir_hash_tab *hashtab,
|
||||||
|
+ xfs_dir2_dataptr_t addr,
|
||||||
|
+ struct xfs_da_args *da,
|
||||||
|
+ struct xfs_buf *bp)
|
||||||
|
+{
|
||||||
|
+ xfs_ino_t inum = be64_to_cpu(dep->inumber);
|
||||||
|
+ uint8_t dir_ftype;
|
||||||
|
+ uint8_t ino_ftype;
|
||||||
|
+
|
||||||
|
+ if (!xfs_sb_version_hasftype(&mp->m_sb))
|
||||||
|
+ return;
|
||||||
|
+
|
||||||
|
+ dir_ftype = libxfs_dir2_data_get_ftype(mp, dep);
|
||||||
|
+ ino_ftype = get_inode_ftype(irec, ino_offset);
|
||||||
|
+
|
||||||
|
+ if (dir_ftype == ino_ftype)
|
||||||
|
+ return;
|
||||||
|
+
|
||||||
|
+ if (no_modify) {
|
||||||
|
+ do_warn(
|
||||||
|
+_("would fix ftype mismatch (%d/%d) in directory/child inode %" PRIu64 "/%" PRIu64 "\n"),
|
||||||
|
+ dir_ftype, ino_ftype,
|
||||||
|
+ ip->i_ino, inum);
|
||||||
|
+ return;
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ do_warn(
|
||||||
|
+_("fixing ftype mismatch (%d/%d) in directory/child inode %" PRIu64 "/%" PRIu64 "\n"),
|
||||||
|
+ dir_ftype, ino_ftype,
|
||||||
|
+ ip->i_ino, inum);
|
||||||
|
+ libxfs_dir2_data_put_ftype(mp, dep, ino_ftype);
|
||||||
|
+ libxfs_dir2_data_log_entry(da, bp, dep);
|
||||||
|
+ dir_hash_update_ftype(hashtab, addr, ino_ftype);
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
/*
|
||||||
|
* process a data block, also checks for .. entry
|
||||||
|
* and corrects it to match what we think .. should be
|
||||||
|
@@ -1749,6 +1791,11 @@ longform_dir2_entry_check_data(
|
||||||
|
libxfs_dir2_data_log_entry(&da, bp, dep);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
+
|
||||||
|
+ if (!nbad)
|
||||||
|
+ check_longform_ftype(mp, ip, dep, irec,
|
||||||
|
+ ino_offset, hashtab, addr, &da,
|
||||||
|
+ bp);
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
ASSERT(no_modify || libxfs_verify_dir_ino(mp, inum));
|
||||||
|
@@ -1777,6 +1824,11 @@ longform_dir2_entry_check_data(
|
||||||
|
libxfs_dir2_data_log_entry(&da, bp, dep);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
+
|
||||||
|
+ if (!nbad)
|
||||||
|
+ check_longform_ftype(mp, ip, dep, irec,
|
||||||
|
+ ino_offset, hashtab, addr, &da,
|
||||||
|
+ bp);
|
||||||
|
*need_dot = 0;
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
@@ -1787,31 +1839,8 @@ longform_dir2_entry_check_data(
|
||||||
|
continue;
|
||||||
|
|
||||||
|
/* validate ftype field if supported */
|
||||||
|
- if (xfs_sb_version_hasftype(&mp->m_sb)) {
|
||||||
|
- uint8_t dir_ftype;
|
||||||
|
- uint8_t ino_ftype;
|
||||||
|
-
|
||||||
|
- dir_ftype = libxfs_dir2_data_get_ftype(mp, dep);
|
||||||
|
- ino_ftype = get_inode_ftype(irec, ino_offset);
|
||||||
|
-
|
||||||
|
- if (dir_ftype != ino_ftype) {
|
||||||
|
- if (no_modify) {
|
||||||
|
- do_warn(
|
||||||
|
- _("would fix ftype mismatch (%d/%d) in directory/child inode %" PRIu64 "/%" PRIu64 "\n"),
|
||||||
|
- dir_ftype, ino_ftype,
|
||||||
|
- ip->i_ino, inum);
|
||||||
|
- } else {
|
||||||
|
- do_warn(
|
||||||
|
- _("fixing ftype mismatch (%d/%d) in directory/child inode %" PRIu64 "/%" PRIu64 "\n"),
|
||||||
|
- dir_ftype, ino_ftype,
|
||||||
|
- ip->i_ino, inum);
|
||||||
|
- libxfs_dir2_data_put_ftype(mp, dep, ino_ftype);
|
||||||
|
- libxfs_dir2_data_log_entry(&da, bp, dep);
|
||||||
|
- dir_hash_update_ftype(hashtab, addr,
|
||||||
|
- ino_ftype);
|
||||||
|
- }
|
||||||
|
- }
|
||||||
|
- }
|
||||||
|
+ check_longform_ftype(mp, ip, dep, irec, ino_offset, hashtab,
|
||||||
|
+ addr, &da, bp);
|
||||||
|
|
||||||
|
/*
|
||||||
|
* check easy case first, regular inode, just bump
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
32
0029-mkfs-Fix-memory-leak.patch
Normal file
32
0029-mkfs-Fix-memory-leak.patch
Normal file
@ -0,0 +1,32 @@
|
|||||||
|
From 8b4002e0cd0072dd69d478ed662f7cf546bae33b Mon Sep 17 00:00:00 2001
|
||||||
|
From: Pavel Reichl <preichl@redhat.com>
|
||||||
|
Date: Fri, 27 May 2022 16:36:21 -0400
|
||||||
|
Subject: [PATCH] mkfs: Fix memory leak
|
||||||
|
|
||||||
|
'value' is allocated by strdup() in getstr(). It
|
||||||
|
needs to be freed as we do not keep any permanent
|
||||||
|
reference to it.
|
||||||
|
|
||||||
|
Signed-off-by: Pavel Reichl <preichl@redhat.com>
|
||||||
|
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
|
||||||
|
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
|
||||||
|
---
|
||||||
|
mkfs/xfs_mkfs.c | 1 +
|
||||||
|
1 file changed, 1 insertion(+)
|
||||||
|
|
||||||
|
diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
|
||||||
|
index 01d2e8c..a37d684 100644
|
||||||
|
--- a/mkfs/xfs_mkfs.c
|
||||||
|
+++ b/mkfs/xfs_mkfs.c
|
||||||
|
@@ -1714,6 +1714,7 @@ naming_opts_parser(
|
||||||
|
} else {
|
||||||
|
cli->sb_feat.dir_version = getnum(value, opts, subopt);
|
||||||
|
}
|
||||||
|
+ free((char *)value);
|
||||||
|
break;
|
||||||
|
case N_FTYPE:
|
||||||
|
cli->sb_feat.dirftype = getnum(value, opts, subopt);
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
58
0030-xfs-zero-inode-fork-buffer-at-allocation.patch
Normal file
58
0030-xfs-zero-inode-fork-buffer-at-allocation.patch
Normal file
@ -0,0 +1,58 @@
|
|||||||
|
From 5a282e43fd719e37b866f797c9aacac199d08a19 Mon Sep 17 00:00:00 2001
|
||||||
|
From: Dave Chinner <dchinner@redhat.com>
|
||||||
|
Date: Wed, 22 Jun 2022 14:28:52 -0500
|
||||||
|
Subject: [PATCH] xfs: zero inode fork buffer at allocation
|
||||||
|
|
||||||
|
Source kernel commit: cb512c921639613ce03f87e62c5e93ed9fe8c84d
|
||||||
|
|
||||||
|
When we first allocate or resize an inline inode fork, we round up
|
||||||
|
the allocation to 4 byte alingment to make journal alignment
|
||||||
|
constraints. We don't clear the unused bytes, so we can copy up to
|
||||||
|
three uninitialised bytes into the journal. Zero those bytes so we
|
||||||
|
only ever copy zeros into the journal.
|
||||||
|
|
||||||
|
Signed-off-by: Dave Chinner <dchinner@redhat.com>
|
||||||
|
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
|
||||||
|
Signed-off-by: Dave Chinner <david@fromorbit.com>
|
||||||
|
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
|
||||||
|
---
|
||||||
|
libxfs/xfs_inode_fork.c | 12 +++++++++---
|
||||||
|
1 file changed, 9 insertions(+), 3 deletions(-)
|
||||||
|
|
||||||
|
diff --git a/libxfs/xfs_inode_fork.c b/libxfs/xfs_inode_fork.c
|
||||||
|
index da59232..ac3692b 100644
|
||||||
|
--- a/libxfs/xfs_inode_fork.c
|
||||||
|
+++ b/libxfs/xfs_inode_fork.c
|
||||||
|
@@ -48,8 +48,13 @@ xfs_init_local_fork(
|
||||||
|
mem_size++;
|
||||||
|
|
||||||
|
if (size) {
|
||||||
|
+ /*
|
||||||
|
+ * As we round up the allocation here, we need to ensure the
|
||||||
|
+ * bytes we don't copy data into are zeroed because the log
|
||||||
|
+ * vectors still copy them into the journal.
|
||||||
|
+ */
|
||||||
|
real_size = roundup(mem_size, 4);
|
||||||
|
- ifp->if_u1.if_data = kmem_alloc(real_size, KM_NOFS);
|
||||||
|
+ ifp->if_u1.if_data = kmem_zalloc(real_size, KM_NOFS);
|
||||||
|
memcpy(ifp->if_u1.if_data, data, size);
|
||||||
|
if (zero_terminate)
|
||||||
|
ifp->if_u1.if_data[size] = '\0';
|
||||||
|
@@ -498,10 +503,11 @@ xfs_idata_realloc(
|
||||||
|
/*
|
||||||
|
* For inline data, the underlying buffer must be a multiple of 4 bytes
|
||||||
|
* in size so that it can be logged and stay on word boundaries.
|
||||||
|
- * We enforce that here.
|
||||||
|
+ * We enforce that here, and use __GFP_ZERO to ensure that size
|
||||||
|
+ * extensions always zero the unused roundup area.
|
||||||
|
*/
|
||||||
|
ifp->if_u1.if_data = krealloc(ifp->if_u1.if_data, roundup(new_size, 4),
|
||||||
|
- GFP_NOFS | __GFP_NOFAIL);
|
||||||
|
+ GFP_NOFS | __GFP_NOFAIL | __GFP_ZERO);
|
||||||
|
ifp->if_bytes = new_size;
|
||||||
|
}
|
||||||
|
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
237
0031-xfs-detect-self-referencing-btree-sibling-pointers.patch
Normal file
237
0031-xfs-detect-self-referencing-btree-sibling-pointers.patch
Normal file
@ -0,0 +1,237 @@
|
|||||||
|
From 393859c7a197c8187ffec131ec80cca697f8bf79 Mon Sep 17 00:00:00 2001
|
||||||
|
From: Dave Chinner <dchinner@redhat.com>
|
||||||
|
Date: Wed, 22 Jun 2022 14:28:52 -0500
|
||||||
|
Subject: [PATCH] xfs: detect self referencing btree sibling pointers
|
||||||
|
|
||||||
|
Source kernel commit: dc04db2aa7c9307e740d6d0e173085301c173b1a
|
||||||
|
|
||||||
|
To catch the obvious graph cycle problem and hence potential endless
|
||||||
|
looping.
|
||||||
|
|
||||||
|
Signed-off-by: Dave Chinner <dchinner@redhat.com>
|
||||||
|
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
||||||
|
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Signed-off-by: Dave Chinner <david@fromorbit.com>
|
||||||
|
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
|
||||||
|
---
|
||||||
|
libxfs/xfs_btree.c | 140 +++++++++++++++++++++++++++++++++++++++--------------
|
||||||
|
1 file changed, 105 insertions(+), 35 deletions(-)
|
||||||
|
|
||||||
|
diff --git a/libxfs/xfs_btree.c b/libxfs/xfs_btree.c
|
||||||
|
index 8455f26..d9a82e7 100644
|
||||||
|
--- a/libxfs/xfs_btree.c
|
||||||
|
+++ b/libxfs/xfs_btree.c
|
||||||
|
@@ -48,6 +48,52 @@ xfs_btree_magic(
|
||||||
|
return magic;
|
||||||
|
}
|
||||||
|
|
||||||
|
+static xfs_failaddr_t
|
||||||
|
+xfs_btree_check_lblock_siblings(
|
||||||
|
+ struct xfs_mount *mp,
|
||||||
|
+ struct xfs_btree_cur *cur,
|
||||||
|
+ int level,
|
||||||
|
+ xfs_fsblock_t fsb,
|
||||||
|
+ xfs_fsblock_t sibling)
|
||||||
|
+{
|
||||||
|
+ if (sibling == NULLFSBLOCK)
|
||||||
|
+ return NULL;
|
||||||
|
+ if (sibling == fsb)
|
||||||
|
+ return __this_address;
|
||||||
|
+ if (level >= 0) {
|
||||||
|
+ if (!xfs_btree_check_lptr(cur, sibling, level + 1))
|
||||||
|
+ return __this_address;
|
||||||
|
+ } else {
|
||||||
|
+ if (!xfs_verify_fsbno(mp, sibling))
|
||||||
|
+ return __this_address;
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ return NULL;
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
+static xfs_failaddr_t
|
||||||
|
+xfs_btree_check_sblock_siblings(
|
||||||
|
+ struct xfs_mount *mp,
|
||||||
|
+ struct xfs_btree_cur *cur,
|
||||||
|
+ int level,
|
||||||
|
+ xfs_agnumber_t agno,
|
||||||
|
+ xfs_agblock_t agbno,
|
||||||
|
+ xfs_agblock_t sibling)
|
||||||
|
+{
|
||||||
|
+ if (sibling == NULLAGBLOCK)
|
||||||
|
+ return NULL;
|
||||||
|
+ if (sibling == agbno)
|
||||||
|
+ return __this_address;
|
||||||
|
+ if (level >= 0) {
|
||||||
|
+ if (!xfs_btree_check_sptr(cur, sibling, level + 1))
|
||||||
|
+ return __this_address;
|
||||||
|
+ } else {
|
||||||
|
+ if (!xfs_verify_agbno(mp, agno, sibling))
|
||||||
|
+ return __this_address;
|
||||||
|
+ }
|
||||||
|
+ return NULL;
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
/*
|
||||||
|
* Check a long btree block header. Return the address of the failing check,
|
||||||
|
* or NULL if everything is ok.
|
||||||
|
@@ -62,6 +108,8 @@ __xfs_btree_check_lblock(
|
||||||
|
struct xfs_mount *mp = cur->bc_mp;
|
||||||
|
xfs_btnum_t btnum = cur->bc_btnum;
|
||||||
|
int crc = xfs_sb_version_hascrc(&mp->m_sb);
|
||||||
|
+ xfs_failaddr_t fa;
|
||||||
|
+ xfs_fsblock_t fsb = NULLFSBLOCK;
|
||||||
|
|
||||||
|
if (crc) {
|
||||||
|
if (!uuid_equal(&block->bb_u.l.bb_uuid, &mp->m_sb.sb_meta_uuid))
|
||||||
|
@@ -80,16 +128,16 @@ __xfs_btree_check_lblock(
|
||||||
|
if (be16_to_cpu(block->bb_numrecs) >
|
||||||
|
cur->bc_ops->get_maxrecs(cur, level))
|
||||||
|
return __this_address;
|
||||||
|
- if (block->bb_u.l.bb_leftsib != cpu_to_be64(NULLFSBLOCK) &&
|
||||||
|
- !xfs_btree_check_lptr(cur, be64_to_cpu(block->bb_u.l.bb_leftsib),
|
||||||
|
- level + 1))
|
||||||
|
- return __this_address;
|
||||||
|
- if (block->bb_u.l.bb_rightsib != cpu_to_be64(NULLFSBLOCK) &&
|
||||||
|
- !xfs_btree_check_lptr(cur, be64_to_cpu(block->bb_u.l.bb_rightsib),
|
||||||
|
- level + 1))
|
||||||
|
- return __this_address;
|
||||||
|
|
||||||
|
- return NULL;
|
||||||
|
+ if (bp)
|
||||||
|
+ fsb = XFS_DADDR_TO_FSB(mp, XFS_BUF_ADDR(bp));
|
||||||
|
+
|
||||||
|
+ fa = xfs_btree_check_lblock_siblings(mp, cur, level, fsb,
|
||||||
|
+ be64_to_cpu(block->bb_u.l.bb_leftsib));
|
||||||
|
+ if (!fa)
|
||||||
|
+ fa = xfs_btree_check_lblock_siblings(mp, cur, level, fsb,
|
||||||
|
+ be64_to_cpu(block->bb_u.l.bb_rightsib));
|
||||||
|
+ return fa;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Check a long btree block header. */
|
||||||
|
@@ -127,6 +175,9 @@ __xfs_btree_check_sblock(
|
||||||
|
struct xfs_mount *mp = cur->bc_mp;
|
||||||
|
xfs_btnum_t btnum = cur->bc_btnum;
|
||||||
|
int crc = xfs_sb_version_hascrc(&mp->m_sb);
|
||||||
|
+ xfs_failaddr_t fa;
|
||||||
|
+ xfs_agblock_t agbno = NULLAGBLOCK;
|
||||||
|
+ xfs_agnumber_t agno = NULLAGNUMBER;
|
||||||
|
|
||||||
|
if (crc) {
|
||||||
|
if (!uuid_equal(&block->bb_u.s.bb_uuid, &mp->m_sb.sb_meta_uuid))
|
||||||
|
@@ -143,16 +194,18 @@ __xfs_btree_check_sblock(
|
||||||
|
if (be16_to_cpu(block->bb_numrecs) >
|
||||||
|
cur->bc_ops->get_maxrecs(cur, level))
|
||||||
|
return __this_address;
|
||||||
|
- if (block->bb_u.s.bb_leftsib != cpu_to_be32(NULLAGBLOCK) &&
|
||||||
|
- !xfs_btree_check_sptr(cur, be32_to_cpu(block->bb_u.s.bb_leftsib),
|
||||||
|
- level + 1))
|
||||||
|
- return __this_address;
|
||||||
|
- if (block->bb_u.s.bb_rightsib != cpu_to_be32(NULLAGBLOCK) &&
|
||||||
|
- !xfs_btree_check_sptr(cur, be32_to_cpu(block->bb_u.s.bb_rightsib),
|
||||||
|
- level + 1))
|
||||||
|
- return __this_address;
|
||||||
|
|
||||||
|
- return NULL;
|
||||||
|
+ if (bp) {
|
||||||
|
+ agbno = xfs_daddr_to_agbno(mp, XFS_BUF_ADDR(bp));
|
||||||
|
+ agno = xfs_daddr_to_agno(mp, XFS_BUF_ADDR(bp));
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ fa = xfs_btree_check_sblock_siblings(mp, cur, level, agno, agbno,
|
||||||
|
+ be32_to_cpu(block->bb_u.s.bb_leftsib));
|
||||||
|
+ if (!fa)
|
||||||
|
+ fa = xfs_btree_check_sblock_siblings(mp, cur, level, agno,
|
||||||
|
+ agbno, be32_to_cpu(block->bb_u.s.bb_rightsib));
|
||||||
|
+ return fa;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Check a short btree block header. */
|
||||||
|
@@ -4265,6 +4318,21 @@ xfs_btree_visit_block(
|
||||||
|
if (xfs_btree_ptr_is_null(cur, &rptr))
|
||||||
|
return -ENOENT;
|
||||||
|
|
||||||
|
+ /*
|
||||||
|
+ * We only visit blocks once in this walk, so we have to avoid the
|
||||||
|
+ * internal xfs_btree_lookup_get_block() optimisation where it will
|
||||||
|
+ * return the same block without checking if the right sibling points
|
||||||
|
+ * back to us and creates a cyclic reference in the btree.
|
||||||
|
+ */
|
||||||
|
+ if (cur->bc_flags & XFS_BTREE_LONG_PTRS) {
|
||||||
|
+ if (be64_to_cpu(rptr.l) == XFS_DADDR_TO_FSB(cur->bc_mp,
|
||||||
|
+ XFS_BUF_ADDR(bp)))
|
||||||
|
+ return -EFSCORRUPTED;
|
||||||
|
+ } else {
|
||||||
|
+ if (be32_to_cpu(rptr.s) == xfs_daddr_to_agbno(cur->bc_mp,
|
||||||
|
+ XFS_BUF_ADDR(bp)))
|
||||||
|
+ return -EFSCORRUPTED;
|
||||||
|
+ }
|
||||||
|
return xfs_btree_lookup_get_block(cur, level, &rptr, &block);
|
||||||
|
}
|
||||||
|
|
||||||
|
@@ -4439,20 +4507,21 @@ xfs_btree_lblock_verify(
|
||||||
|
{
|
||||||
|
struct xfs_mount *mp = bp->b_mount;
|
||||||
|
struct xfs_btree_block *block = XFS_BUF_TO_BLOCK(bp);
|
||||||
|
+ xfs_fsblock_t fsb;
|
||||||
|
+ xfs_failaddr_t fa;
|
||||||
|
|
||||||
|
/* numrecs verification */
|
||||||
|
if (be16_to_cpu(block->bb_numrecs) > max_recs)
|
||||||
|
return __this_address;
|
||||||
|
|
||||||
|
/* sibling pointer verification */
|
||||||
|
- if (block->bb_u.l.bb_leftsib != cpu_to_be64(NULLFSBLOCK) &&
|
||||||
|
- !xfs_verify_fsbno(mp, be64_to_cpu(block->bb_u.l.bb_leftsib)))
|
||||||
|
- return __this_address;
|
||||||
|
- if (block->bb_u.l.bb_rightsib != cpu_to_be64(NULLFSBLOCK) &&
|
||||||
|
- !xfs_verify_fsbno(mp, be64_to_cpu(block->bb_u.l.bb_rightsib)))
|
||||||
|
- return __this_address;
|
||||||
|
-
|
||||||
|
- return NULL;
|
||||||
|
+ fsb = XFS_DADDR_TO_FSB(mp, XFS_BUF_ADDR(bp));
|
||||||
|
+ fa = xfs_btree_check_lblock_siblings(mp, NULL, -1, fsb,
|
||||||
|
+ be64_to_cpu(block->bb_u.l.bb_leftsib));
|
||||||
|
+ if (!fa)
|
||||||
|
+ fa = xfs_btree_check_lblock_siblings(mp, NULL, -1, fsb,
|
||||||
|
+ be64_to_cpu(block->bb_u.l.bb_rightsib));
|
||||||
|
+ return fa;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
@@ -4493,7 +4562,9 @@ xfs_btree_sblock_verify(
|
||||||
|
{
|
||||||
|
struct xfs_mount *mp = bp->b_mount;
|
||||||
|
struct xfs_btree_block *block = XFS_BUF_TO_BLOCK(bp);
|
||||||
|
- xfs_agblock_t agno;
|
||||||
|
+ xfs_agnumber_t agno;
|
||||||
|
+ xfs_agblock_t agbno;
|
||||||
|
+ xfs_failaddr_t fa;
|
||||||
|
|
||||||
|
/* numrecs verification */
|
||||||
|
if (be16_to_cpu(block->bb_numrecs) > max_recs)
|
||||||
|
@@ -4501,14 +4572,13 @@ xfs_btree_sblock_verify(
|
||||||
|
|
||||||
|
/* sibling pointer verification */
|
||||||
|
agno = xfs_daddr_to_agno(mp, XFS_BUF_ADDR(bp));
|
||||||
|
- if (block->bb_u.s.bb_leftsib != cpu_to_be32(NULLAGBLOCK) &&
|
||||||
|
- !xfs_verify_agbno(mp, agno, be32_to_cpu(block->bb_u.s.bb_leftsib)))
|
||||||
|
- return __this_address;
|
||||||
|
- if (block->bb_u.s.bb_rightsib != cpu_to_be32(NULLAGBLOCK) &&
|
||||||
|
- !xfs_verify_agbno(mp, agno, be32_to_cpu(block->bb_u.s.bb_rightsib)))
|
||||||
|
- return __this_address;
|
||||||
|
-
|
||||||
|
- return NULL;
|
||||||
|
+ agbno = xfs_daddr_to_agbno(mp, XFS_BUF_ADDR(bp));
|
||||||
|
+ fa = xfs_btree_check_sblock_siblings(mp, NULL, -1, agno, agbno,
|
||||||
|
+ be32_to_cpu(block->bb_u.s.bb_leftsib));
|
||||||
|
+ if (!fa)
|
||||||
|
+ fa = xfs_btree_check_sblock_siblings(mp, NULL, -1, agno, agbno,
|
||||||
|
+ be32_to_cpu(block->bb_u.s.bb_rightsib));
|
||||||
|
+ return fa;
|
||||||
|
}
|
||||||
|
|
||||||
|
/*
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
84
0032-xfs-validate-inode-fork-size-against-fork-format.patch
Normal file
84
0032-xfs-validate-inode-fork-size-against-fork-format.patch
Normal file
@ -0,0 +1,84 @@
|
|||||||
|
From ff6aea290450f00e084cafe5b34901d26abdbc4a Mon Sep 17 00:00:00 2001
|
||||||
|
From: Dave Chinner <dchinner@redhat.com>
|
||||||
|
Date: Wed, 22 Jun 2022 14:28:52 -0500
|
||||||
|
Subject: [PATCH] xfs: validate inode fork size against fork format
|
||||||
|
|
||||||
|
Source kernel commit: 1eb70f54c445fcbb25817841e774adb3d912f3e8
|
||||||
|
|
||||||
|
xfs_repair catches fork size/format mismatches, but the in-kernel
|
||||||
|
verifier doesn't, leading to null pointer failures when attempting
|
||||||
|
to perform operations on the fork. This can occur in the
|
||||||
|
xfs_dir_is_empty() where the in-memory fork format does not match
|
||||||
|
the size and so the fork data pointer is accessed incorrectly.
|
||||||
|
|
||||||
|
Note: this causes new failures in xfs/348 which is testing mode vs
|
||||||
|
ftype mismatches. We now detect a regular file that has been changed
|
||||||
|
to a directory or symlink mode as being corrupt because the data
|
||||||
|
fork is for a symlink or directory should be in local form when
|
||||||
|
there are only 3 bytes of data in the data fork. Hence the inode
|
||||||
|
verify for the regular file now fires w/ -EFSCORRUPTED because
|
||||||
|
the inode fork format does not match the format the corrupted mode
|
||||||
|
says it should be in.
|
||||||
|
|
||||||
|
Signed-off-by: Dave Chinner <dchinner@redhat.com>
|
||||||
|
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
||||||
|
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Signed-off-by: Dave Chinner <david@fromorbit.com>
|
||||||
|
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
|
||||||
|
---
|
||||||
|
libxfs/xfs_inode_buf.c | 35 ++++++++++++++++++++++++++---------
|
||||||
|
1 file changed, 26 insertions(+), 9 deletions(-)
|
||||||
|
|
||||||
|
diff --git a/libxfs/xfs_inode_buf.c b/libxfs/xfs_inode_buf.c
|
||||||
|
index f98f5c4..7ecbfad 100644
|
||||||
|
--- a/libxfs/xfs_inode_buf.c
|
||||||
|
+++ b/libxfs/xfs_inode_buf.c
|
||||||
|
@@ -334,19 +334,36 @@ xfs_dinode_verify_fork(
|
||||||
|
int whichfork)
|
||||||
|
{
|
||||||
|
uint32_t di_nextents = XFS_DFORK_NEXTENTS(dip, whichfork);
|
||||||
|
+ mode_t mode = be16_to_cpu(dip->di_mode);
|
||||||
|
+ uint32_t fork_size = XFS_DFORK_SIZE(dip, mp, whichfork);
|
||||||
|
+ uint32_t fork_format = XFS_DFORK_FORMAT(dip, whichfork);
|
||||||
|
|
||||||
|
- switch (XFS_DFORK_FORMAT(dip, whichfork)) {
|
||||||
|
+ /*
|
||||||
|
+ * For fork types that can contain local data, check that the fork
|
||||||
|
+ * format matches the size of local data contained within the fork.
|
||||||
|
+ *
|
||||||
|
+ * For all types, check that when the size says the should be in extent
|
||||||
|
+ * or btree format, the inode isn't claiming it is in local format.
|
||||||
|
+ */
|
||||||
|
+ if (whichfork == XFS_DATA_FORK) {
|
||||||
|
+ if (S_ISDIR(mode) || S_ISLNK(mode)) {
|
||||||
|
+ if (be64_to_cpu(dip->di_size) <= fork_size &&
|
||||||
|
+ fork_format != XFS_DINODE_FMT_LOCAL)
|
||||||
|
+ return __this_address;
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ if (be64_to_cpu(dip->di_size) > fork_size &&
|
||||||
|
+ fork_format == XFS_DINODE_FMT_LOCAL)
|
||||||
|
+ return __this_address;
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ switch (fork_format) {
|
||||||
|
case XFS_DINODE_FMT_LOCAL:
|
||||||
|
/*
|
||||||
|
- * no local regular files yet
|
||||||
|
+ * No local regular files yet
|
||||||
|
*/
|
||||||
|
- if (whichfork == XFS_DATA_FORK) {
|
||||||
|
- if (S_ISREG(be16_to_cpu(dip->di_mode)))
|
||||||
|
- return __this_address;
|
||||||
|
- if (be64_to_cpu(dip->di_size) >
|
||||||
|
- XFS_DFORK_SIZE(dip, mp, whichfork))
|
||||||
|
- return __this_address;
|
||||||
|
- }
|
||||||
|
+ if (S_ISREG(mode) && whichfork == XFS_DATA_FORK)
|
||||||
|
+ return __this_address;
|
||||||
|
if (di_nextents)
|
||||||
|
return __this_address;
|
||||||
|
break;
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
110
0033-xfs_repair-always-rewrite-secondary-supers-when-need.patch
Normal file
110
0033-xfs_repair-always-rewrite-secondary-supers-when-need.patch
Normal file
@ -0,0 +1,110 @@
|
|||||||
|
From fa0f9232bd89e2955ee54e0be4adb6713a00d8b4 Mon Sep 17 00:00:00 2001
|
||||||
|
From: "Darrick J. Wong" <djwong@kernel.org>
|
||||||
|
Date: Tue, 12 Jul 2022 13:22:33 -0500
|
||||||
|
Subject: [PATCH] xfs_repair: always rewrite secondary supers when needsrepair
|
||||||
|
is set
|
||||||
|
|
||||||
|
Dave Chinner complained about xfs_scrub failures coming from xfs/158.
|
||||||
|
That test induces xfs_repair to fail while upgrading a filesystem to
|
||||||
|
have the inobtcount feature, and then restarts xfs_repair to finish the
|
||||||
|
upgrade. When the second xfs_repair run starts, it will find that the
|
||||||
|
primary super has NEEDSREPAIR set, along with whatever new feature that
|
||||||
|
we were trying to add to the filesystem.
|
||||||
|
|
||||||
|
From there, repair completes the upgrade in much the same manner as the
|
||||||
|
first repair run would have, with one big exception -- it forgets to set
|
||||||
|
features_changed to trigger rewriting of the secondary supers at the end
|
||||||
|
of repair. This results in discrepancies between the supers:
|
||||||
|
|
||||||
|
# XFS_REPAIR_FAIL_AFTER_PHASE=2 xfs_repair -c inobtcount=1 /dev/sdf
|
||||||
|
Phase 1 - find and verify superblock...
|
||||||
|
Phase 2 - using internal log
|
||||||
|
- zero log...
|
||||||
|
- scan filesystem freespace and inode maps...
|
||||||
|
- found root inode chunk
|
||||||
|
Adding inode btree counts to filesystem.
|
||||||
|
Killed
|
||||||
|
# xfs_repair /dev/sdf
|
||||||
|
Phase 1 - find and verify superblock...
|
||||||
|
Phase 2 - using internal log
|
||||||
|
- zero log...
|
||||||
|
- scan filesystem freespace and inode maps...
|
||||||
|
clearing needsrepair flag and regenerating metadata
|
||||||
|
bad inobt block count 0, saw 1
|
||||||
|
bad finobt block count 0, saw 1
|
||||||
|
bad inobt block count 0, saw 1
|
||||||
|
bad finobt block count 0, saw 1
|
||||||
|
bad inobt block count 0, saw 1
|
||||||
|
bad finobt block count 0, saw 1
|
||||||
|
bad inobt block count 0, saw 1
|
||||||
|
bad finobt block count 0, saw 1
|
||||||
|
- found root inode chunk
|
||||||
|
Phase 3 - for each AG...
|
||||||
|
- scan and clear agi unlinked lists...
|
||||||
|
- process known inodes and perform inode discovery...
|
||||||
|
- agno = 0
|
||||||
|
- agno = 1
|
||||||
|
- agno = 2
|
||||||
|
- agno = 3
|
||||||
|
- process newly discovered inodes...
|
||||||
|
Phase 4 - check for duplicate blocks...
|
||||||
|
- setting up duplicate extent list...
|
||||||
|
- check for inodes claiming duplicate blocks...
|
||||||
|
- agno = 1
|
||||||
|
- agno = 2
|
||||||
|
- agno = 0
|
||||||
|
- agno = 3
|
||||||
|
Phase 5 - rebuild AG headers and trees...
|
||||||
|
- reset superblock...
|
||||||
|
Phase 6 - check inode connectivity...
|
||||||
|
- resetting contents of realtime bitmap and summary inodes
|
||||||
|
- traversing filesystem ...
|
||||||
|
- traversal finished ...
|
||||||
|
- moving disconnected inodes to lost+found ...
|
||||||
|
Phase 7 - verify and correct link counts...
|
||||||
|
done
|
||||||
|
# xfs_db -c 'sb 0' -c 'print' -c 'sb 1' -c 'print' /dev/sdf | \
|
||||||
|
egrep '(features_ro_compat|features_incompat)'
|
||||||
|
features_ro_compat = 0xd
|
||||||
|
features_incompat = 0xb
|
||||||
|
features_ro_compat = 0x5
|
||||||
|
features_incompat = 0xb
|
||||||
|
|
||||||
|
Curiously, re-running xfs_repair will not trigger any warnings about the
|
||||||
|
featureset mismatch between the primary and secondary supers. xfs_scrub
|
||||||
|
immediately notices, which is what causes xfs/158 to fail.
|
||||||
|
|
||||||
|
This discrepancy doesn't happen when the upgrade completes successfully
|
||||||
|
in a single repair run, so we need to teach repair to rewrite the
|
||||||
|
secondaries at the end of repair any time needsrepair was set.
|
||||||
|
|
||||||
|
Reported-by: Dave Chinner <david@fromorbit.com>
|
||||||
|
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
||||||
|
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
|
||||||
|
---
|
||||||
|
repair/agheader.c | 8 ++++++++
|
||||||
|
1 file changed, 8 insertions(+)
|
||||||
|
|
||||||
|
diff --git a/repair/agheader.c b/repair/agheader.c
|
||||||
|
index 36da139..e91509d 100644
|
||||||
|
--- a/repair/agheader.c
|
||||||
|
+++ b/repair/agheader.c
|
||||||
|
@@ -552,6 +552,14 @@ secondary_sb_whack(
|
||||||
|
else
|
||||||
|
do_warn(
|
||||||
|
_("would clear needsrepair flag and regenerate metadata\n"));
|
||||||
|
+ /*
|
||||||
|
+ * If needsrepair is set on the primary super, there's
|
||||||
|
+ * a possibility that repair crashed during an upgrade.
|
||||||
|
+ * Set features_changed to ensure that the secondary
|
||||||
|
+ * supers are rewritten with the new feature bits once
|
||||||
|
+ * we've finished the upgrade.
|
||||||
|
+ */
|
||||||
|
+ features_changed = true;
|
||||||
|
} else {
|
||||||
|
/*
|
||||||
|
* Quietly clear needsrepair on the secondary supers as
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
55
0034-xfs_repair-ignore-empty-xattr-leaf-blocks.patch
Normal file
55
0034-xfs_repair-ignore-empty-xattr-leaf-blocks.patch
Normal file
@ -0,0 +1,55 @@
|
|||||||
|
From f50d3462c654acc484ab3ea68e75e8252b77e262 Mon Sep 17 00:00:00 2001
|
||||||
|
From: "Darrick J. Wong" <djwong@kernel.org>
|
||||||
|
Date: Wed, 13 Jul 2022 20:58:25 -0500
|
||||||
|
Subject: [PATCH] xfs_repair: ignore empty xattr leaf blocks
|
||||||
|
|
||||||
|
As detailed in the commit:
|
||||||
|
|
||||||
|
5e572d1a xfs: empty xattr leaf header blocks are not corruption
|
||||||
|
|
||||||
|
empty xattr leaf blocks can be the benign byproduct of the system
|
||||||
|
going down during the multi-step process of adding a large xattr
|
||||||
|
to a file that has no xattrs. If we find one at attr fork offset 0,
|
||||||
|
we should clear it, but this isn't a corruption.
|
||||||
|
|
||||||
|
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
||||||
|
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
|
||||||
|
---
|
||||||
|
repair/attr_repair.c | 20 ++++++++++++++++++++
|
||||||
|
1 file changed, 20 insertions(+)
|
||||||
|
|
||||||
|
diff --git a/repair/attr_repair.c b/repair/attr_repair.c
|
||||||
|
index 2055d96..c3a6d50 100644
|
||||||
|
--- a/repair/attr_repair.c
|
||||||
|
+++ b/repair/attr_repair.c
|
||||||
|
@@ -579,6 +579,26 @@ process_leaf_attr_block(
|
||||||
|
firstb = mp->m_sb.sb_blocksize;
|
||||||
|
stop = xfs_attr3_leaf_hdr_size(leaf);
|
||||||
|
|
||||||
|
+ /*
|
||||||
|
+ * Empty leaf blocks at offset zero can occur as a race between
|
||||||
|
+ * setxattr and the system going down, so we only take action if we're
|
||||||
|
+ * running in modify mode. See xfs_attr3_leaf_verify for details of
|
||||||
|
+ * how we've screwed this up many times.
|
||||||
|
+ */
|
||||||
|
+ if (!leafhdr.count && da_bno == 0) {
|
||||||
|
+ if (no_modify) {
|
||||||
|
+ do_log(
|
||||||
|
+ _("would clear empty leaf attr block 0, inode %" PRIu64 "\n"),
|
||||||
|
+ ino);
|
||||||
|
+ return 0;
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ do_warn(
|
||||||
|
+ _("will clear empty leaf attr block 0, inode %" PRIu64 "\n"),
|
||||||
|
+ ino);
|
||||||
|
+ return 1;
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
/* does the count look sorta valid? */
|
||||||
|
if (!leafhdr.count ||
|
||||||
|
leafhdr.count * sizeof(xfs_attr_leaf_entry_t) + stop >
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
@ -0,0 +1,45 @@
|
|||||||
|
From 91c1d0836aa4a228e76c0b8c5d83903f1f6bfdbb Mon Sep 17 00:00:00 2001
|
||||||
|
From: Chandan Babu R <chandan.babu@oracle.com>
|
||||||
|
Date: Wed, 13 Jul 2022 20:58:27 -0500
|
||||||
|
Subject: [PATCH] xfs_repair: Search for conflicts in inode_tree_ptrs[] when
|
||||||
|
processing uncertain inodes
|
||||||
|
|
||||||
|
When processing an uncertain inode chunk record, if we lose 2 blocks worth of
|
||||||
|
inodes or 25% of the chunk, xfs_repair decides to ignore the chunk. Otherwise,
|
||||||
|
xfs_repair adds a new chunk record to inode_tree_ptrs[agno], marking each
|
||||||
|
inode as either free or used. However, before adding the new chunk record,
|
||||||
|
xfs_repair has to check for the existance of a conflicting record.
|
||||||
|
|
||||||
|
The existing code incorrectly checks for the conflicting record in
|
||||||
|
inode_uncertain_tree_ptrs[agno]. This check will succeed since the inode chunk
|
||||||
|
record being processed was originally obtained from
|
||||||
|
inode_uncertain_tree_ptrs[agno].
|
||||||
|
|
||||||
|
This commit fixes the bug by changing xfs_repair to search
|
||||||
|
inode_tree_ptrs[agno] for conflicts.
|
||||||
|
|
||||||
|
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
|
||||||
|
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
|
||||||
|
---
|
||||||
|
repair/dino_chunks.c | 3 +--
|
||||||
|
1 file changed, 1 insertion(+), 2 deletions(-)
|
||||||
|
|
||||||
|
diff --git a/repair/dino_chunks.c b/repair/dino_chunks.c
|
||||||
|
index 11b0eb5..80c52a4 100644
|
||||||
|
--- a/repair/dino_chunks.c
|
||||||
|
+++ b/repair/dino_chunks.c
|
||||||
|
@@ -229,8 +229,7 @@ verify_inode_chunk(xfs_mount_t *mp,
|
||||||
|
/*
|
||||||
|
* ok, put the record into the tree, if no conflict.
|
||||||
|
*/
|
||||||
|
- if (find_uncertain_inode_rec(agno,
|
||||||
|
- XFS_AGB_TO_AGINO(mp, start_agbno)))
|
||||||
|
+ if (find_inode_rec(mp, agno, XFS_AGB_TO_AGINO(mp, start_agbno)))
|
||||||
|
return(0);
|
||||||
|
|
||||||
|
start_agino = XFS_AGB_TO_AGINO(mp, start_agbno);
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
119
0036-mkfs-terminate-getsubopt-arrays-properly.patch
Normal file
119
0036-mkfs-terminate-getsubopt-arrays-properly.patch
Normal file
@ -0,0 +1,119 @@
|
|||||||
|
From cdf5cfe93ee14942665f3c6ae78a8bf1198e1798 Mon Sep 17 00:00:00 2001
|
||||||
|
From: "Darrick J. Wong" <djwong@kernel.org>
|
||||||
|
Date: Wed, 13 Jul 2022 20:58:28 -0500
|
||||||
|
Subject: [PATCH] mkfs: terminate getsubopt arrays properly
|
||||||
|
|
||||||
|
Having not drank any (or maybe too much) coffee this morning, I typed:
|
||||||
|
|
||||||
|
$ mkfs.xfs -d agcount=3 -d nrext64=0
|
||||||
|
Segmentation fault
|
||||||
|
|
||||||
|
I traced this down to getsubopt walking off the end of the dopts.subopts
|
||||||
|
array. The manpage says you're supposed to terminate the suboptions
|
||||||
|
string array with a NULL entry, but the structure definition uses
|
||||||
|
MAX_SUBOPTS/D_MAX_OPTS directly, which means there is no terminator.
|
||||||
|
|
||||||
|
Explicitly terminate each suboption array with a NULL entry after
|
||||||
|
making room for it.
|
||||||
|
|
||||||
|
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
[sandeen: explicitly add NULL terminators & clarify comment]
|
||||||
|
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
|
||||||
|
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
|
||||||
|
---
|
||||||
|
mkfs/xfs_mkfs.c | 16 ++++++++++++++--
|
||||||
|
1 file changed, 14 insertions(+), 2 deletions(-)
|
||||||
|
|
||||||
|
diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
|
||||||
|
index fdf6d4a..5cd2f81 100644
|
||||||
|
--- a/mkfs/xfs_mkfs.c
|
||||||
|
+++ b/mkfs/xfs_mkfs.c
|
||||||
|
@@ -132,8 +132,11 @@ enum {
|
||||||
|
M_MAX_OPTS,
|
||||||
|
};
|
||||||
|
|
||||||
|
-/* Just define the max options array size manually right now */
|
||||||
|
-#define MAX_SUBOPTS D_MAX_OPTS
|
||||||
|
+/*
|
||||||
|
+ * Just define the max options array size manually to the largest
|
||||||
|
+ * enum right now, leaving room for a NULL terminator at the end
|
||||||
|
+ */
|
||||||
|
+#define MAX_SUBOPTS (D_MAX_OPTS + 1)
|
||||||
|
|
||||||
|
#define SUBOPT_NEEDS_VAL (-1LL)
|
||||||
|
#define MAX_CONFLICTS 8
|
||||||
|
@@ -243,6 +246,7 @@ static struct opt_params bopts = {
|
||||||
|
.ini_section = "block",
|
||||||
|
.subopts = {
|
||||||
|
[B_SIZE] = "size",
|
||||||
|
+ [B_MAX_OPTS] = NULL,
|
||||||
|
},
|
||||||
|
.subopt_params = {
|
||||||
|
{ .index = B_SIZE,
|
||||||
|
@@ -269,6 +273,7 @@ static struct opt_params copts = {
|
||||||
|
.name = 'c',
|
||||||
|
.subopts = {
|
||||||
|
[C_OPTFILE] = "options",
|
||||||
|
+ [C_MAX_OPTS] = NULL,
|
||||||
|
},
|
||||||
|
.subopt_params = {
|
||||||
|
{ .index = C_OPTFILE,
|
||||||
|
@@ -298,6 +303,7 @@ static struct opt_params dopts = {
|
||||||
|
[D_EXTSZINHERIT] = "extszinherit",
|
||||||
|
[D_COWEXTSIZE] = "cowextsize",
|
||||||
|
[D_DAXINHERIT] = "daxinherit",
|
||||||
|
+ [D_MAX_OPTS] = NULL,
|
||||||
|
},
|
||||||
|
.subopt_params = {
|
||||||
|
{ .index = D_AGCOUNT,
|
||||||
|
@@ -434,6 +440,7 @@ static struct opt_params iopts = {
|
||||||
|
[I_ATTR] = "attr",
|
||||||
|
[I_PROJID32BIT] = "projid32bit",
|
||||||
|
[I_SPINODES] = "sparse",
|
||||||
|
+ [I_MAX_OPTS] = NULL,
|
||||||
|
},
|
||||||
|
.subopt_params = {
|
||||||
|
{ .index = I_ALIGN,
|
||||||
|
@@ -500,6 +507,7 @@ static struct opt_params lopts = {
|
||||||
|
[L_FILE] = "file",
|
||||||
|
[L_NAME] = "name",
|
||||||
|
[L_LAZYSBCNTR] = "lazy-count",
|
||||||
|
+ [L_MAX_OPTS] = NULL,
|
||||||
|
},
|
||||||
|
.subopt_params = {
|
||||||
|
{ .index = L_AGNUM,
|
||||||
|
@@ -592,6 +600,7 @@ static struct opt_params nopts = {
|
||||||
|
[N_SIZE] = "size",
|
||||||
|
[N_VERSION] = "version",
|
||||||
|
[N_FTYPE] = "ftype",
|
||||||
|
+ [N_MAX_OPTS] = NULL,
|
||||||
|
},
|
||||||
|
.subopt_params = {
|
||||||
|
{ .index = N_SIZE,
|
||||||
|
@@ -627,6 +636,7 @@ static struct opt_params ropts = {
|
||||||
|
[R_FILE] = "file",
|
||||||
|
[R_NAME] = "name",
|
||||||
|
[R_NOALIGN] = "noalign",
|
||||||
|
+ [R_MAX_OPTS] = NULL,
|
||||||
|
},
|
||||||
|
.subopt_params = {
|
||||||
|
{ .index = R_EXTSIZE,
|
||||||
|
@@ -674,6 +684,7 @@ static struct opt_params sopts = {
|
||||||
|
.subopts = {
|
||||||
|
[S_SIZE] = "size",
|
||||||
|
[S_SECTSIZE] = "sectsize",
|
||||||
|
+ [S_MAX_OPTS] = NULL,
|
||||||
|
},
|
||||||
|
.subopt_params = {
|
||||||
|
{ .index = S_SIZE,
|
||||||
|
@@ -710,6 +721,7 @@ static struct opt_params mopts = {
|
||||||
|
[M_REFLINK] = "reflink",
|
||||||
|
[M_INOBTCNT] = "inobtcount",
|
||||||
|
[M_BIGTIME] = "bigtime",
|
||||||
|
+ [M_MAX_OPTS] = NULL,
|
||||||
|
},
|
||||||
|
.subopt_params = {
|
||||||
|
{ .index = M_CRC,
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
@ -0,0 +1,40 @@
|
|||||||
|
From db5b866537e78669f7b84590345b0c37f841f701 Mon Sep 17 00:00:00 2001
|
||||||
|
From: "Darrick J. Wong" <djwong@kernel.org>
|
||||||
|
Date: Thu, 4 Aug 2022 21:28:23 -0500
|
||||||
|
Subject: [PATCH] mkfs: complain about impossible log size constraints
|
||||||
|
|
||||||
|
xfs/042 trips over an impossible fs geometry when nrext64 is enabled.
|
||||||
|
The minimum log size calculation comes out to 4287 blocks, but the mkfs
|
||||||
|
parameters specify an AG size of 4096 blocks. This eventually causes
|
||||||
|
mkfs to complain that the autoselected log size doesn't meet the minimum
|
||||||
|
size, but we could be a little more explicit in pointing out that the
|
||||||
|
two size constraints make for an impossible geometry.
|
||||||
|
|
||||||
|
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
|
||||||
|
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
|
||||||
|
---
|
||||||
|
mkfs/xfs_mkfs.c | 7 +++++++
|
||||||
|
1 file changed, 7 insertions(+)
|
||||||
|
|
||||||
|
diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
|
||||||
|
index 12994ed..9dd0e79 100644
|
||||||
|
--- a/mkfs/xfs_mkfs.c
|
||||||
|
+++ b/mkfs/xfs_mkfs.c
|
||||||
|
@@ -3490,6 +3490,13 @@ _("external log device size %lld blocks too small, must be at least %lld blocks\
|
||||||
|
* an AG.
|
||||||
|
*/
|
||||||
|
max_logblocks = libxfs_alloc_ag_max_usable(mp) - 1;
|
||||||
|
+ if (max_logblocks < min_logblocks) {
|
||||||
|
+ fprintf(stderr,
|
||||||
|
+_("max log size %d smaller than min log size %d, filesystem is too small\n"),
|
||||||
|
+ max_logblocks,
|
||||||
|
+ min_logblocks);
|
||||||
|
+ usage();
|
||||||
|
+ }
|
||||||
|
|
||||||
|
/* internal log - if no size specified, calculate automatically */
|
||||||
|
if (!cfg->logblocks) {
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
@ -0,0 +1,53 @@
|
|||||||
|
From 04d4c27afa3f2c0088e381102e68cfb6a96b3306 Mon Sep 17 00:00:00 2001
|
||||||
|
From: Shida Zhang <zhangshida@kylinos.cn>
|
||||||
|
Date: Fri, 18 Nov 2022 10:46:33 +0100
|
||||||
|
Subject: [PATCH] xfs: trim the mapp array accordingly in xfs_da_grow_inode_int
|
||||||
|
|
||||||
|
Source kernel commit: 44159659df8ca381b84261e11058b2176fa03ba0
|
||||||
|
|
||||||
|
Take a look at the for-loop in xfs_da_grow_inode_int:
|
||||||
|
======
|
||||||
|
for(){
|
||||||
|
nmap = min(XFS_BMAP_MAX_NMAP, count);
|
||||||
|
...
|
||||||
|
error = xfs_bmapi_write(...,&mapp[mapi], &nmap);//(..., $1, $2)
|
||||||
|
...
|
||||||
|
mapi += nmap;
|
||||||
|
}
|
||||||
|
=====
|
||||||
|
where $1 stands for the start address of the array,
|
||||||
|
while $2 is used to indicate the size of the array.
|
||||||
|
|
||||||
|
The array $1 will advance by $nmap in each iteration after
|
||||||
|
the allocation of extents.
|
||||||
|
But the size $2 still remains unchanged, which is determined by
|
||||||
|
min(XFS_BMAP_MAX_NMAP, count).
|
||||||
|
|
||||||
|
It seems that it has forgotten to trim the mapp array after each
|
||||||
|
iteration, so change it.
|
||||||
|
|
||||||
|
Signed-off-by: Shida Zhang <zhangshida@kylinos.cn>
|
||||||
|
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Signed-off-by: Dave Chinner <david@fromorbit.com>
|
||||||
|
Signed-off-by: Carlos Maiolino <cem@kernel.org>
|
||||||
|
---
|
||||||
|
libxfs/xfs_da_btree.c | 2 +-
|
||||||
|
1 file changed, 1 insertion(+), 1 deletion(-)
|
||||||
|
|
||||||
|
diff --git a/libxfs/xfs_da_btree.c b/libxfs/xfs_da_btree.c
|
||||||
|
index 9dc22f2..a068a01 100644
|
||||||
|
--- a/libxfs/xfs_da_btree.c
|
||||||
|
+++ b/libxfs/xfs_da_btree.c
|
||||||
|
@@ -2188,8 +2188,8 @@ xfs_da_grow_inode_int(
|
||||||
|
*/
|
||||||
|
mapp = kmem_alloc(sizeof(*mapp) * count, 0);
|
||||||
|
for (b = *bno, mapi = 0; b < *bno + count; ) {
|
||||||
|
- nmap = min(XFS_BMAP_MAX_NMAP, count);
|
||||||
|
c = (int)(*bno + count - b);
|
||||||
|
+ nmap = min(XFS_BMAP_MAX_NMAP, c);
|
||||||
|
error = xfs_bmapi_write(tp, dp, b, c,
|
||||||
|
xfs_bmapi_aflag(w)|XFS_BMAPI_METADATA,
|
||||||
|
args->total, &mapp[mapi], &nmap);
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
156
0039-xfs-fix-exception-caused-by-unexpected-illegal-bestc.patch
Normal file
156
0039-xfs-fix-exception-caused-by-unexpected-illegal-bestc.patch
Normal file
@ -0,0 +1,156 @@
|
|||||||
|
From 20798cc06315ec1581b87b3da7f868dff62a6efd Mon Sep 17 00:00:00 2001
|
||||||
|
From: Guo Xuenan <guoxuenan@huawei.com>
|
||||||
|
Date: Fri, 18 Nov 2022 10:48:09 +0100
|
||||||
|
Subject: [PATCH] xfs: fix exception caused by unexpected illegal bestcount in
|
||||||
|
leaf dir
|
||||||
|
|
||||||
|
Source kernel commit: 13cf24e00665c9751951a422756d975812b71173
|
||||||
|
|
||||||
|
For leaf dir, In most cases, there should be as many bestfree slots
|
||||||
|
as the dir data blocks that can fit under i_size (except for [1]).
|
||||||
|
|
||||||
|
Root cause is we don't examin the number bestfree slots, when the slots
|
||||||
|
number less than dir data blocks, if we need to allocate new dir data
|
||||||
|
block and update the bestfree array, we will use the dir block number as
|
||||||
|
index to assign bestfree array, while we did not check the leaf buf
|
||||||
|
boundary which may cause UAF or other memory access problem. This issue
|
||||||
|
can also triggered with test cases xfs/473 from fstests.
|
||||||
|
|
||||||
|
According to Dave Chinner & Darrick's suggestion, adding buffer verifier
|
||||||
|
to detect this abnormal situation in time.
|
||||||
|
Simplify the testcase for fstest xfs/554 [1]
|
||||||
|
|
||||||
|
The error log is shown as follows:
|
||||||
|
==================================================================
|
||||||
|
BUG: KASAN: use-after-free in xfs_dir2_leaf_addname+0x1995/0x1ac0
|
||||||
|
Write of size 2 at addr ffff88810168b000 by task touch/1552
|
||||||
|
CPU: 5 PID: 1552 Comm: touch Not tainted 6.0.0-rc3+ #101
|
||||||
|
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
|
||||||
|
1.13.0-1ubuntu1.1 04/01/2014
|
||||||
|
Call Trace:
|
||||||
|
<TASK>
|
||||||
|
dump_stack_lvl+0x4d/0x66
|
||||||
|
print_report.cold+0xf6/0x691
|
||||||
|
kasan_report+0xa8/0x120
|
||||||
|
xfs_dir2_leaf_addname+0x1995/0x1ac0
|
||||||
|
xfs_dir_createname+0x58c/0x7f0
|
||||||
|
xfs_create+0x7af/0x1010
|
||||||
|
xfs_generic_create+0x270/0x5e0
|
||||||
|
path_openat+0x270b/0x3450
|
||||||
|
do_filp_open+0x1cf/0x2b0
|
||||||
|
do_sys_openat2+0x46b/0x7a0
|
||||||
|
do_sys_open+0xb7/0x130
|
||||||
|
do_syscall_64+0x35/0x80
|
||||||
|
entry_SYSCALL_64_after_hwframe+0x63/0xcd
|
||||||
|
RIP: 0033:0x7fe4d9e9312b
|
||||||
|
Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18 00 00 00 85 c0
|
||||||
|
75 67 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00
|
||||||
|
f0 ff ff 0f 87 91 00 00 00 48 8b 4c 24 28 64 48 33 0c 25
|
||||||
|
RSP: 002b:00007ffda4c16c20 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
|
||||||
|
RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fe4d9e9312b
|
||||||
|
RDX: 0000000000000941 RSI: 00007ffda4c17f33 RDI: 00000000ffffff9c
|
||||||
|
RBP: 00007ffda4c17f33 R08: 0000000000000000 R09: 0000000000000000
|
||||||
|
R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000941
|
||||||
|
R13: 00007fe4d9f631a4 R14: 00007ffda4c17f33 R15: 0000000000000000
|
||||||
|
</TASK>
|
||||||
|
|
||||||
|
The buggy address belongs to the physical page:
|
||||||
|
page:ffffea000405a2c0 refcount:0 mapcount:0 mapping:0000000000000000
|
||||||
|
index:0x0 pfn:0x10168b
|
||||||
|
flags: 0x2fffff80000000(node=0|zone=2|lastcpupid=0x1fffff)
|
||||||
|
raw: 002fffff80000000 ffffea0004057788 ffffea000402dbc8 0000000000000000
|
||||||
|
raw: 0000000000000000 0000000000170000 00000000ffffffff 0000000000000000
|
||||||
|
page dumped because: kasan: bad access detected
|
||||||
|
|
||||||
|
Memory state around the buggy address:
|
||||||
|
ffff88810168af00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
|
||||||
|
ffff88810168af80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
|
||||||
|
>ffff88810168b000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
|
||||||
|
^
|
||||||
|
ffff88810168b080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
|
||||||
|
ffff88810168b100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
|
||||||
|
==================================================================
|
||||||
|
Disabling lock debugging due to kernel taint
|
||||||
|
00000000: 58 44 44 33 5b 53 35 c2 00 00 00 00 00 00 00 78
|
||||||
|
XDD3[S5........x
|
||||||
|
XFS (sdb): Internal error xfs_dir2_data_use_free at line 1200 of file
|
||||||
|
fs/xfs/libxfs/xfs_dir2_data.c. Caller
|
||||||
|
xfs_dir2_data_use_free+0x28a/0xeb0
|
||||||
|
CPU: 5 PID: 1552 Comm: touch Tainted: G B 6.0.0-rc3+
|
||||||
|
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
|
||||||
|
1.13.0-1ubuntu1.1 04/01/2014
|
||||||
|
Call Trace:
|
||||||
|
<TASK>
|
||||||
|
dump_stack_lvl+0x4d/0x66
|
||||||
|
xfs_corruption_error+0x132/0x150
|
||||||
|
xfs_dir2_data_use_free+0x198/0xeb0
|
||||||
|
xfs_dir2_leaf_addname+0xa59/0x1ac0
|
||||||
|
xfs_dir_createname+0x58c/0x7f0
|
||||||
|
xfs_create+0x7af/0x1010
|
||||||
|
xfs_generic_create+0x270/0x5e0
|
||||||
|
path_openat+0x270b/0x3450
|
||||||
|
do_filp_open+0x1cf/0x2b0
|
||||||
|
do_sys_openat2+0x46b/0x7a0
|
||||||
|
do_sys_open+0xb7/0x130
|
||||||
|
do_syscall_64+0x35/0x80
|
||||||
|
entry_SYSCALL_64_after_hwframe+0x63/0xcd
|
||||||
|
RIP: 0033:0x7fe4d9e9312b
|
||||||
|
Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18 00 00 00 85 c0
|
||||||
|
75 67 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00
|
||||||
|
f0 ff ff 0f 87 91 00 00 00 48 8b 4c 24 28 64 48 33 0c 25
|
||||||
|
RSP: 002b:00007ffda4c16c20 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
|
||||||
|
RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fe4d9e9312b
|
||||||
|
RDX: 0000000000000941 RSI: 00007ffda4c17f46 RDI: 00000000ffffff9c
|
||||||
|
RBP: 00007ffda4c17f46 R08: 0000000000000000 R09: 0000000000000001
|
||||||
|
R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000941
|
||||||
|
R13: 00007fe4d9f631a4 R14: 00007ffda4c17f46 R15: 0000000000000000
|
||||||
|
</TASK>
|
||||||
|
XFS (sdb): Corruption detected. Unmount and run xfs_repair
|
||||||
|
|
||||||
|
[1] https://lore.kernel.org/all/20220928095355.2074025-1-guoxuenan@huawei.com/
|
||||||
|
Reviewed-by: Hou Tao <houtao1@huawei.com>
|
||||||
|
Signed-off-by: Guo Xuenan <guoxuenan@huawei.com>
|
||||||
|
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Signed-off-by: Carlos Maiolino <cem@kernel.org>
|
||||||
|
---
|
||||||
|
libxfs/xfs_dir2_leaf.c | 9 +++++++--
|
||||||
|
1 file changed, 7 insertions(+), 2 deletions(-)
|
||||||
|
|
||||||
|
diff --git a/libxfs/xfs_dir2_leaf.c b/libxfs/xfs_dir2_leaf.c
|
||||||
|
index 8827c96..5da6600 100644
|
||||||
|
--- a/libxfs/xfs_dir2_leaf.c
|
||||||
|
+++ b/libxfs/xfs_dir2_leaf.c
|
||||||
|
@@ -144,6 +144,8 @@ xfs_dir3_leaf_check_int(
|
||||||
|
xfs_dir2_leaf_tail_t *ltp;
|
||||||
|
int stale;
|
||||||
|
int i;
|
||||||
|
+ bool isleaf1 = (hdr->magic == XFS_DIR2_LEAF1_MAGIC ||
|
||||||
|
+ hdr->magic == XFS_DIR3_LEAF1_MAGIC);
|
||||||
|
|
||||||
|
ltp = xfs_dir2_leaf_tail_p(geo, leaf);
|
||||||
|
|
||||||
|
@@ -156,8 +158,7 @@ xfs_dir3_leaf_check_int(
|
||||||
|
return __this_address;
|
||||||
|
|
||||||
|
/* Leaves and bests don't overlap in leaf format. */
|
||||||
|
- if ((hdr->magic == XFS_DIR2_LEAF1_MAGIC ||
|
||||||
|
- hdr->magic == XFS_DIR3_LEAF1_MAGIC) &&
|
||||||
|
+ if (isleaf1 &&
|
||||||
|
(char *)&hdr->ents[hdr->count] > (char *)xfs_dir2_leaf_bests_p(ltp))
|
||||||
|
return __this_address;
|
||||||
|
|
||||||
|
@@ -173,6 +174,10 @@ xfs_dir3_leaf_check_int(
|
||||||
|
}
|
||||||
|
if (hdr->ents[i].address == cpu_to_be32(XFS_DIR2_NULL_DATAPTR))
|
||||||
|
stale++;
|
||||||
|
+ if (isleaf1 && xfs_dir2_dataptr_to_db(geo,
|
||||||
|
+ be32_to_cpu(hdr->ents[i].address)) >=
|
||||||
|
+ be32_to_cpu(ltp->bestcount))
|
||||||
|
+ return __this_address;
|
||||||
|
}
|
||||||
|
if (hdr->stale != stale)
|
||||||
|
return __this_address;
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
43
0040-xfs-increase-rename-inode-reservation.patch
Normal file
43
0040-xfs-increase-rename-inode-reservation.patch
Normal file
@ -0,0 +1,43 @@
|
|||||||
|
From 227bc97f12f2df902ab776fe038dc6d065f03c58 Mon Sep 17 00:00:00 2001
|
||||||
|
From: Allison Henderson <allison.henderson@oracle.com>
|
||||||
|
Date: Fri, 18 Nov 2022 10:48:26 +0100
|
||||||
|
Subject: [PATCH] xfs: increase rename inode reservation
|
||||||
|
|
||||||
|
Source kernel commit: e07ee6fe21f47cfd72ae566395c67a80e7c66163
|
||||||
|
|
||||||
|
xfs_rename can update up to 5 inodes: src_dp, target_dp, src_ip, target_ip
|
||||||
|
and wip. So we need to increase the inode reservation to match.
|
||||||
|
|
||||||
|
Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
|
||||||
|
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Signed-off-by: Carlos Maiolino <cem@kernel.org>
|
||||||
|
---
|
||||||
|
libxfs/xfs_trans_resv.c | 4 ++--
|
||||||
|
1 file changed, 2 insertions(+), 2 deletions(-)
|
||||||
|
|
||||||
|
diff --git a/libxfs/xfs_trans_resv.c b/libxfs/xfs_trans_resv.c
|
||||||
|
index 797176d..04c4448 100644
|
||||||
|
--- a/libxfs/xfs_trans_resv.c
|
||||||
|
+++ b/libxfs/xfs_trans_resv.c
|
||||||
|
@@ -421,7 +421,7 @@ xfs_calc_itruncate_reservation_minlogsize(
|
||||||
|
|
||||||
|
/*
|
||||||
|
* In renaming a files we can modify:
|
||||||
|
- * the four inodes involved: 4 * inode size
|
||||||
|
+ * the five inodes involved: 5 * inode size
|
||||||
|
* the two directory btrees: 2 * (max depth + v2) * dir block size
|
||||||
|
* the two directory bmap btrees: 2 * max depth * block size
|
||||||
|
* And the bmap_finish transaction can free dir and bmap blocks (two sets
|
||||||
|
@@ -436,7 +436,7 @@ xfs_calc_rename_reservation(
|
||||||
|
struct xfs_mount *mp)
|
||||||
|
{
|
||||||
|
return XFS_DQUOT_LOGRES(mp) +
|
||||||
|
- max((xfs_calc_inode_res(mp, 4) +
|
||||||
|
+ max((xfs_calc_inode_res(mp, 5) +
|
||||||
|
xfs_calc_buf_res(2 * XFS_DIROP_LOG_COUNT(mp),
|
||||||
|
XFS_FSB_TO_B(mp, 1))),
|
||||||
|
(xfs_calc_buf_res(7, mp->m_sb.sb_sectsize) +
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
118
0041-xfs-fix-sb-write-verify-for-lazysbcount.patch
Normal file
118
0041-xfs-fix-sb-write-verify-for-lazysbcount.patch
Normal file
@ -0,0 +1,118 @@
|
|||||||
|
From 4b593a7b25ce7cd155614006a943ddd53ca47669 Mon Sep 17 00:00:00 2001
|
||||||
|
From: Long Li <leo.lilong@huawei.com>
|
||||||
|
Date: Fri, 18 Nov 2022 12:23:57 +0100
|
||||||
|
Subject: [PATCH] xfs: fix sb write verify for lazysbcount
|
||||||
|
|
||||||
|
Source kernel commit: 7cecd500d90164419add650e26cc1de03a7a66cb
|
||||||
|
|
||||||
|
When lazysbcount is enabled, fsstress and loop mount/unmount test report
|
||||||
|
the following problems:
|
||||||
|
|
||||||
|
XFS (loop0): SB summary counter sanity check failed
|
||||||
|
XFS (loop0): Metadata corruption detected at xfs_sb_write_verify+0x13b/0x460,
|
||||||
|
xfs_sb block 0x0
|
||||||
|
XFS (loop0): Unmount and run xfs_repair
|
||||||
|
XFS (loop0): First 128 bytes of corrupted metadata buffer:
|
||||||
|
00000000: 58 46 53 42 00 00 10 00 00 00 00 00 00 28 00 00 XFSB.........(..
|
||||||
|
00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
|
||||||
|
00000020: 69 fb 7c cd 5f dc 44 af 85 74 e0 cc d4 e3 34 5a i.|._.D..t....4Z
|
||||||
|
00000030: 00 00 00 00 00 20 00 06 00 00 00 00 00 00 00 80 ..... ..........
|
||||||
|
00000040: 00 00 00 00 00 00 00 81 00 00 00 00 00 00 00 82 ................
|
||||||
|
00000050: 00 00 00 01 00 0a 00 00 00 00 00 04 00 00 00 00 ................
|
||||||
|
00000060: 00 00 0a 00 b4 b5 02 00 02 00 00 08 00 00 00 00 ................
|
||||||
|
00000070: 00 00 00 00 00 00 00 00 0c 09 09 03 14 00 00 19 ................
|
||||||
|
XFS (loop0): Corruption of in-memory data (0x8) detected at _xfs_buf_ioapply
|
||||||
|
+0xe1e/0x10e0 (fs/xfs/xfs_buf.c:1580). Shutting down filesystem.
|
||||||
|
XFS (loop0): Please unmount the filesystem and rectify the problem(s)
|
||||||
|
XFS (loop0): log mount/recovery failed: error -117
|
||||||
|
XFS (loop0): log mount failed
|
||||||
|
|
||||||
|
This corruption will shutdown the file system and the file system will
|
||||||
|
no longer be mountable. The following script can reproduce the problem,
|
||||||
|
but it may take a long time.
|
||||||
|
|
||||||
|
#!/bin/bash
|
||||||
|
|
||||||
|
device=/dev/sda
|
||||||
|
testdir=/mnt/test
|
||||||
|
round=0
|
||||||
|
|
||||||
|
function fail()
|
||||||
|
{
|
||||||
|
echo "$*"
|
||||||
|
exit 1
|
||||||
|
}
|
||||||
|
|
||||||
|
mkdir -p $testdir
|
||||||
|
while [ $round -lt 10000 ]
|
||||||
|
do
|
||||||
|
echo "******* round $round ********"
|
||||||
|
mkfs.xfs -f $device
|
||||||
|
mount $device $testdir || fail "mount failed!"
|
||||||
|
fsstress -d $testdir -l 0 -n 10000 -p 4 >/dev/null &
|
||||||
|
sleep 4
|
||||||
|
killall -w fsstress
|
||||||
|
umount $testdir
|
||||||
|
xfs_repair -e $device > /dev/null
|
||||||
|
if [ $? -eq 2 ];then
|
||||||
|
echo "ERR CODE 2: Dirty log exception during repair."
|
||||||
|
exit 1
|
||||||
|
fi
|
||||||
|
round=$(($round+1))
|
||||||
|
done
|
||||||
|
|
||||||
|
With lazysbcount is enabled, There is no additional lock protection for
|
||||||
|
reading m_ifree and m_icount in xfs_log_sb(), if other cpu modifies the
|
||||||
|
m_ifree, this will make the m_ifree greater than m_icount. For example,
|
||||||
|
consider the following sequence and ifreedelta is postive:
|
||||||
|
|
||||||
|
CPU0 CPU1
|
||||||
|
xfs_log_sb xfs_trans_unreserve_and_mod_sb
|
||||||
|
---------- ------------------------------
|
||||||
|
percpu_counter_sum(&mp->m_icount)
|
||||||
|
percpu_counter_add_batch(&mp->m_icount,
|
||||||
|
idelta, XFS_ICOUNT_BATCH)
|
||||||
|
percpu_counter_add(&mp->m_ifree, ifreedelta);
|
||||||
|
percpu_counter_sum(&mp->m_ifree)
|
||||||
|
|
||||||
|
After this, incorrect inode count (sb_ifree > sb_icount) will be writen to
|
||||||
|
the log. In the subsequent writing of sb, incorrect inode count (sb_ifree >
|
||||||
|
sb_icount) will fail to pass the boundary check in xfs_validate_sb_write()
|
||||||
|
that cause the file system shutdown.
|
||||||
|
|
||||||
|
When lazysbcount is enabled, we don't need to guarantee that Lazy sb
|
||||||
|
counters are completely correct, but we do need to guarantee that sb_ifree
|
||||||
|
<= sb_icount. On the other hand, the constraint that m_ifree <= m_icount
|
||||||
|
must be satisfied any time that there /cannot/ be other threads allocating
|
||||||
|
or freeing inode chunks. If the constraint is violated under these
|
||||||
|
circumstances, sb_i{count,free} (the ondisk superblock inode counters)
|
||||||
|
maybe incorrect and need to be marked sick at unmount, the count will
|
||||||
|
be rebuilt on the next mount.
|
||||||
|
|
||||||
|
Fixes: 8756a5af1819 ("libxfs: add more bounds checking to sb sanity checks")
|
||||||
|
Signed-off-by: Long Li <leo.lilong@huawei.com>
|
||||||
|
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Signed-off-by: Carlos Maiolino <cem@kernel.org>
|
||||||
|
---
|
||||||
|
libxfs/xfs_sb.c | 4 +++-
|
||||||
|
1 file changed, 3 insertions(+), 1 deletion(-)
|
||||||
|
|
||||||
|
diff --git a/libxfs/xfs_sb.c b/libxfs/xfs_sb.c
|
||||||
|
index cfa44eb..624bfbf 100644
|
||||||
|
--- a/libxfs/xfs_sb.c
|
||||||
|
+++ b/libxfs/xfs_sb.c
|
||||||
|
@@ -804,7 +804,9 @@ xfs_log_sb(
|
||||||
|
*/
|
||||||
|
if (xfs_sb_version_haslazysbcount(&mp->m_sb)) {
|
||||||
|
mp->m_sb.sb_icount = percpu_counter_sum(&mp->m_icount);
|
||||||
|
- mp->m_sb.sb_ifree = percpu_counter_sum(&mp->m_ifree);
|
||||||
|
+ mp->m_sb.sb_ifree = min_t(uint64_t,
|
||||||
|
+ percpu_counter_sum(&mp->m_ifree),
|
||||||
|
+ mp->m_sb.sb_icount);
|
||||||
|
mp->m_sb.sb_fdblocks = percpu_counter_sum(&mp->m_fdblocks);
|
||||||
|
}
|
||||||
|
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
@ -0,0 +1,58 @@
|
|||||||
|
From 978c3087b6afa56986ac3e5a52131d73d28253ca Mon Sep 17 00:00:00 2001
|
||||||
|
From: "Darrick J. Wong" <djwong@kernel.org>
|
||||||
|
Date: Wed, 23 Nov 2022 09:09:28 -0800
|
||||||
|
Subject: [PATCH] xfs_repair: don't crash on unknown inode parents in dry run
|
||||||
|
mode
|
||||||
|
|
||||||
|
Fuzz testing of directory block headers exposed a debug assertion vector
|
||||||
|
in xfs_repair. In normal (aka fixit) mode, if a single-block directory
|
||||||
|
has a totally trashed block, repair will zap the entire directory.
|
||||||
|
Phase 4 ignores any dirents pointing to the zapped directory, phase 6
|
||||||
|
ignores the freed directory, and everything is good.
|
||||||
|
|
||||||
|
However, in dry run mode, we don't actually free the inode. Phase 4
|
||||||
|
still ignores any dirents pointing to the zapped directory, but phase 6
|
||||||
|
thinks the inode is still live and tries to walk it. xfs_repair doesn't
|
||||||
|
know of any parents for the zapped directory and so trips the assertion.
|
||||||
|
|
||||||
|
The assertion is critical for fixit mode because we need all the parent
|
||||||
|
information to ensure consistency of the directory tree. In dry run
|
||||||
|
mode we don't care, because we only have to print inconsistencies and
|
||||||
|
return 1. Worse yet, (our) customers file bugs when xfs_repair crashes
|
||||||
|
during a -n scan, so this will generate support calls.
|
||||||
|
|
||||||
|
Make everyone's life easier by downgrading the assertion to a warning if
|
||||||
|
we're running in dry run mode.
|
||||||
|
|
||||||
|
Found by fuzzing bhdr.hdr.bno = zeroes in xfs/471.
|
||||||
|
|
||||||
|
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
|
||||||
|
Signed-off-by: Carlos Maiolino <cem@kernel.org>
|
||||||
|
---
|
||||||
|
repair/phase6.c | 9 ++++++++-
|
||||||
|
1 file changed, 8 insertions(+), 1 deletion(-)
|
||||||
|
|
||||||
|
diff --git a/repair/phase6.c b/repair/phase6.c
|
||||||
|
index 1f9f8de..0be2c9c 100644
|
||||||
|
--- a/repair/phase6.c
|
||||||
|
+++ b/repair/phase6.c
|
||||||
|
@@ -1836,7 +1836,14 @@ longform_dir2_entry_check_data(
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
parent = get_inode_parent(irec, ino_offset);
|
||||||
|
- ASSERT(parent != 0);
|
||||||
|
+ if (parent == 0) {
|
||||||
|
+ if (no_modify)
|
||||||
|
+ do_warn(
|
||||||
|
+ _("unknown parent for inode %" PRIu64 "\n"),
|
||||||
|
+ inum);
|
||||||
|
+ else
|
||||||
|
+ ASSERT(parent != 0);
|
||||||
|
+ }
|
||||||
|
junkit = 0;
|
||||||
|
/*
|
||||||
|
* bump up the link counts in parent and child
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
275
0043-xfs_repair-retain-superblock-buffer-to-avoid-write-h.patch
Normal file
275
0043-xfs_repair-retain-superblock-buffer-to-avoid-write-h.patch
Normal file
@ -0,0 +1,275 @@
|
|||||||
|
From a5915eb4be5c2070adce092e6fff1fd9c906dc7e Mon Sep 17 00:00:00 2001
|
||||||
|
From: "Darrick J. Wong" <djwong@kernel.org>
|
||||||
|
Date: Wed, 23 Nov 2022 09:09:33 -0800
|
||||||
|
Subject: [PATCH] xfs_repair: retain superblock buffer to avoid write hook
|
||||||
|
deadlock
|
||||||
|
|
||||||
|
Every now and then I experience the following deadlock in xfs_repair
|
||||||
|
when I'm running the offline repair fuzz tests:
|
||||||
|
|
||||||
|
#0 futex_wait (private=0, expected=2, futex_word=0x55555566df70) at ../sysdeps/nptl/futex-internal.h:146
|
||||||
|
#1 __GI___lll_lock_wait (futex=futex@entry=0x55555566df70, private=0) at ./nptl/lowlevellock.c:49
|
||||||
|
#2 lll_mutex_lock_optimized (mutex=0x55555566df70) at ./nptl/pthread_mutex_lock.c:48
|
||||||
|
#3 ___pthread_mutex_lock (mutex=mutex@entry=0x55555566df70) at ./nptl/pthread_mutex_lock.c:93
|
||||||
|
#4 cache_shake (cache=cache@entry=0x55555566de60, priority=priority@entry=2, purge=purge@entry=false) at cache.c:231
|
||||||
|
#5 cache_node_get (cache=cache@entry=0x55555566de60, key=key@entry=0x7fffe55e01b0, nodep=nodep@entry=0x7fffe55e0168) at cache.c:452
|
||||||
|
#6 __cache_lookup (key=key@entry=0x7fffe55e01b0, flags=0, bpp=bpp@entry=0x7fffe55e0228) at rdwr.c:405
|
||||||
|
#7 libxfs_getbuf_flags (btp=0x55555566de00, blkno=0, len=<optimized out>, flags=<optimized out>, bpp=0x7fffe55e0228) at rdwr.c:457
|
||||||
|
#8 libxfs_buf_read_map (btp=0x55555566de00, map=map@entry=0x7fffe55e0280, nmaps=nmaps@entry=1, flags=flags@entry=0, bpp=bpp@entry=0x7fffe55e0278, ops=0x5555556233e0 <xfs_sb_buf_ops>)
|
||||||
|
at rdwr.c:704
|
||||||
|
#9 libxfs_buf_read (ops=<optimized out>, bpp=0x7fffe55e0278, flags=0, numblks=<optimized out>, blkno=0, target=<optimized out>)
|
||||||
|
at /storage/home/djwong/cdev/work/xfsprogs/build-x86_64/libxfs/libxfs_io.h:195
|
||||||
|
#10 libxfs_getsb (mp=mp@entry=0x7fffffffd690) at rdwr.c:162
|
||||||
|
#11 force_needsrepair (mp=0x7fffffffd690) at xfs_repair.c:924
|
||||||
|
#12 repair_capture_writeback (bp=<optimized out>) at xfs_repair.c:1000
|
||||||
|
#13 libxfs_bwrite (bp=0x7fffe011e530) at rdwr.c:869
|
||||||
|
#14 cache_shake (cache=cache@entry=0x55555566de60, priority=priority@entry=2, purge=purge@entry=false) at cache.c:240
|
||||||
|
#15 cache_node_get (cache=cache@entry=0x55555566de60, key=key@entry=0x7fffe55e0470, nodep=nodep@entry=0x7fffe55e0428) at cache.c:452
|
||||||
|
#16 __cache_lookup (key=key@entry=0x7fffe55e0470, flags=1, bpp=bpp@entry=0x7fffe55e0538) at rdwr.c:405
|
||||||
|
#17 libxfs_getbuf_flags (btp=0x55555566de00, blkno=12736, len=<optimized out>, flags=<optimized out>, bpp=0x7fffe55e0538) at rdwr.c:457
|
||||||
|
#18 __libxfs_buf_get_map (btp=<optimized out>, map=map@entry=0x7fffe55e05b0, nmaps=<optimized out>, flags=flags@entry=1, bpp=bpp@entry=0x7fffe55e0538) at rdwr.c:501
|
||||||
|
#19 libxfs_buf_get_map (btp=<optimized out>, map=map@entry=0x7fffe55e05b0, nmaps=<optimized out>, flags=flags@entry=1, bpp=bpp@entry=0x7fffe55e0538) at rdwr.c:525
|
||||||
|
#20 pf_queue_io (args=args@entry=0x5555556722c0, map=map@entry=0x7fffe55e05b0, nmaps=<optimized out>, flag=flag@entry=11) at prefetch.c:124
|
||||||
|
#21 pf_read_bmbt_reclist (args=0x5555556722c0, rp=<optimized out>, numrecs=78) at prefetch.c:220
|
||||||
|
#22 pf_scan_lbtree (dbno=dbno@entry=1211, level=level@entry=1, isadir=isadir@entry=1, args=args@entry=0x5555556722c0, func=0x55555557f240 <pf_scanfunc_bmap>) at prefetch.c:298
|
||||||
|
#23 pf_read_btinode (isadir=1, dino=<optimized out>, args=0x5555556722c0) at prefetch.c:385
|
||||||
|
#24 pf_read_inode_dirs (args=args@entry=0x5555556722c0, bp=bp@entry=0x7fffdc023790) at prefetch.c:459
|
||||||
|
#25 pf_read_inode_dirs (bp=<optimized out>, args=0x5555556722c0) at prefetch.c:411
|
||||||
|
#26 pf_batch_read (args=args@entry=0x5555556722c0, which=which@entry=PF_PRIMARY, buf=buf@entry=0x7fffd001d000) at prefetch.c:609
|
||||||
|
#27 pf_io_worker (param=0x5555556722c0) at prefetch.c:673
|
||||||
|
#28 start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
|
||||||
|
#29 clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
|
||||||
|
|
||||||
|
>From this stack trace, we see that xfs_repair's prefetch module is
|
||||||
|
getting some xfs_buf objects ahead of initiating a read (#19). The
|
||||||
|
buffer cache has hit its limit, so it calls cache_shake (#14) to free
|
||||||
|
some unused xfs_bufs. The buffer it finds is a dirty buffer, so it
|
||||||
|
calls libxfs_bwrite to flush it out to disk, which in turn invokes the
|
||||||
|
buffer write hook that xfs_repair set up in 3b7667cb to mark the ondisk
|
||||||
|
filesystem's superblock as NEEDSREPAIR until repair actually completes.
|
||||||
|
|
||||||
|
Unfortunately, the NEEDSREPAIR handler itself needs to grab the
|
||||||
|
superblock buffer, so it makes another call into the buffer cache (#9),
|
||||||
|
which sees that the cache is full and tries to shake it(#4). Hence we
|
||||||
|
deadlock on cm_mutex because shaking is not reentrant.
|
||||||
|
|
||||||
|
Fix this by retaining a reference to the superblock buffer when possible
|
||||||
|
so that the writeback hook doesn't have to access the buffer cache to
|
||||||
|
set NEEDSREPAIR.
|
||||||
|
|
||||||
|
Fixes: 3b7667cb ("xfs_repair: set NEEDSREPAIR the first time we write to a filesystem")
|
||||||
|
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
|
||||||
|
Signed-off-by: Carlos Maiolino <cem@kernel.org>
|
||||||
|
---
|
||||||
|
libxfs/libxfs_api_defs.h | 2 ++
|
||||||
|
libxfs/libxfs_io.h | 3 ++
|
||||||
|
libxfs/rdwr.c | 16 +++++++++++
|
||||||
|
repair/phase2.c | 8 ++++++
|
||||||
|
repair/protos.h | 1 +
|
||||||
|
repair/xfs_repair.c | 75 ++++++++++++++++++++++++++++++++++++++++++------
|
||||||
|
6 files changed, 96 insertions(+), 9 deletions(-)
|
||||||
|
|
||||||
|
diff --git a/libxfs/libxfs_api_defs.h b/libxfs/libxfs_api_defs.h
|
||||||
|
index 9aea8c7..4fe4e75 100644
|
||||||
|
--- a/libxfs/libxfs_api_defs.h
|
||||||
|
+++ b/libxfs/libxfs_api_defs.h
|
||||||
|
@@ -49,9 +49,11 @@
|
||||||
|
#define xfs_buf_delwri_submit libxfs_buf_delwri_submit
|
||||||
|
#define xfs_buf_get libxfs_buf_get
|
||||||
|
#define xfs_buf_get_uncached libxfs_buf_get_uncached
|
||||||
|
+#define xfs_buf_lock libxfs_buf_lock
|
||||||
|
#define xfs_buf_read libxfs_buf_read
|
||||||
|
#define xfs_buf_read_uncached libxfs_buf_read_uncached
|
||||||
|
#define xfs_buf_relse libxfs_buf_relse
|
||||||
|
+#define xfs_buf_unlock libxfs_buf_unlock
|
||||||
|
#define xfs_bunmapi libxfs_bunmapi
|
||||||
|
#define xfs_bwrite libxfs_bwrite
|
||||||
|
#define xfs_calc_dquots_per_chunk libxfs_calc_dquots_per_chunk
|
||||||
|
diff --git a/libxfs/libxfs_io.h b/libxfs/libxfs_io.h
|
||||||
|
index 3cc4f4e..0e444e2 100644
|
||||||
|
--- a/libxfs/libxfs_io.h
|
||||||
|
+++ b/libxfs/libxfs_io.h
|
||||||
|
@@ -217,6 +217,9 @@ xfs_buf_hold(struct xfs_buf *bp)
|
||||||
|
bp->b_node.cn_count++;
|
||||||
|
}
|
||||||
|
|
||||||
|
+void xfs_buf_lock(struct xfs_buf *bp);
|
||||||
|
+void xfs_buf_unlock(struct xfs_buf *bp);
|
||||||
|
+
|
||||||
|
int libxfs_buf_get_uncached(struct xfs_buftarg *targ, size_t bblen, int flags,
|
||||||
|
struct xfs_buf **bpp);
|
||||||
|
int libxfs_buf_read_uncached(struct xfs_buftarg *targ, xfs_daddr_t daddr,
|
||||||
|
diff --git a/libxfs/rdwr.c b/libxfs/rdwr.c
|
||||||
|
index 128367e..2c1162b 100644
|
||||||
|
--- a/libxfs/rdwr.c
|
||||||
|
+++ b/libxfs/rdwr.c
|
||||||
|
@@ -376,6 +376,22 @@ libxfs_getbufr_map(struct xfs_buftarg *btp, xfs_daddr_t blkno, int bblen,
|
||||||
|
return bp;
|
||||||
|
}
|
||||||
|
|
||||||
|
+void
|
||||||
|
+xfs_buf_lock(
|
||||||
|
+ struct xfs_buf *bp)
|
||||||
|
+{
|
||||||
|
+ if (use_xfs_buf_lock)
|
||||||
|
+ pthread_mutex_lock(&bp->b_lock);
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
+void
|
||||||
|
+xfs_buf_unlock(
|
||||||
|
+ struct xfs_buf *bp)
|
||||||
|
+{
|
||||||
|
+ if (use_xfs_buf_lock)
|
||||||
|
+ pthread_mutex_unlock(&bp->b_lock);
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
static int
|
||||||
|
__cache_lookup(
|
||||||
|
struct xfs_bufkey *key,
|
||||||
|
diff --git a/repair/phase2.c b/repair/phase2.c
|
||||||
|
index ab53ee0..7441451 100644
|
||||||
|
--- a/repair/phase2.c
|
||||||
|
+++ b/repair/phase2.c
|
||||||
|
@@ -250,6 +250,14 @@ phase2(
|
||||||
|
} else
|
||||||
|
do_log(_("Phase 2 - using internal log\n"));
|
||||||
|
|
||||||
|
+ /*
|
||||||
|
+ * Now that we've set up the buffer cache the way we want it, try to
|
||||||
|
+ * grab our own reference to the primary sb so that the hooks will not
|
||||||
|
+ * have to call out to the buffer cache.
|
||||||
|
+ */
|
||||||
|
+ if (mp->m_buf_writeback_fn)
|
||||||
|
+ retain_primary_sb(mp);
|
||||||
|
+
|
||||||
|
/* Zero log if applicable */
|
||||||
|
do_log(_(" - zero log...\n"));
|
||||||
|
|
||||||
|
diff --git a/repair/protos.h b/repair/protos.h
|
||||||
|
index 83734e8..7cdc3a1 100644
|
||||||
|
--- a/repair/protos.h
|
||||||
|
+++ b/repair/protos.h
|
||||||
|
@@ -16,6 +16,7 @@ int get_sb(xfs_sb_t *sbp,
|
||||||
|
xfs_off_t off,
|
||||||
|
int size,
|
||||||
|
xfs_agnumber_t agno);
|
||||||
|
+int retain_primary_sb(struct xfs_mount *mp);
|
||||||
|
void write_primary_sb(xfs_sb_t *sbp,
|
||||||
|
int size);
|
||||||
|
|
||||||
|
diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c
|
||||||
|
index e44aa40..d043643 100644
|
||||||
|
--- a/repair/xfs_repair.c
|
||||||
|
+++ b/repair/xfs_repair.c
|
||||||
|
@@ -738,6 +738,63 @@ check_fs_vs_host_sectsize(
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
+/*
|
||||||
|
+ * If we set up a writeback function to set NEEDSREPAIR while the filesystem is
|
||||||
|
+ * dirty, there's a chance that calling libxfs_getsb could deadlock the buffer
|
||||||
|
+ * cache while trying to get the primary sb buffer if the first non-sb write to
|
||||||
|
+ * the filesystem is the result of a cache shake. Retain a reference to the
|
||||||
|
+ * primary sb buffer to avoid all that.
|
||||||
|
+ */
|
||||||
|
+static struct xfs_buf *primary_sb_bp; /* buffer for superblock */
|
||||||
|
+
|
||||||
|
+int
|
||||||
|
+retain_primary_sb(
|
||||||
|
+ struct xfs_mount *mp)
|
||||||
|
+{
|
||||||
|
+ int error;
|
||||||
|
+
|
||||||
|
+ error = -libxfs_buf_read(mp->m_ddev_targp, XFS_SB_DADDR,
|
||||||
|
+ XFS_FSS_TO_BB(mp, 1), 0, &primary_sb_bp,
|
||||||
|
+ &xfs_sb_buf_ops);
|
||||||
|
+ if (error)
|
||||||
|
+ return error;
|
||||||
|
+
|
||||||
|
+ libxfs_buf_unlock(primary_sb_bp);
|
||||||
|
+ return 0;
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
+static void
|
||||||
|
+drop_primary_sb(void)
|
||||||
|
+{
|
||||||
|
+ if (!primary_sb_bp)
|
||||||
|
+ return;
|
||||||
|
+
|
||||||
|
+ libxfs_buf_lock(primary_sb_bp);
|
||||||
|
+ libxfs_buf_relse(primary_sb_bp);
|
||||||
|
+ primary_sb_bp = NULL;
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
+static int
|
||||||
|
+get_primary_sb(
|
||||||
|
+ struct xfs_mount *mp,
|
||||||
|
+ struct xfs_buf **bpp)
|
||||||
|
+{
|
||||||
|
+ int error;
|
||||||
|
+
|
||||||
|
+ *bpp = NULL;
|
||||||
|
+
|
||||||
|
+ if (!primary_sb_bp) {
|
||||||
|
+ error = retain_primary_sb(mp);
|
||||||
|
+ if (error)
|
||||||
|
+ return error;
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ libxfs_buf_lock(primary_sb_bp);
|
||||||
|
+ xfs_buf_hold(primary_sb_bp);
|
||||||
|
+ *bpp = primary_sb_bp;
|
||||||
|
+ return 0;
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
/* Clear needsrepair after a successful repair run. */
|
||||||
|
void
|
||||||
|
clear_needsrepair(
|
||||||
|
@@ -758,15 +815,14 @@ clear_needsrepair(
|
||||||
|
do_warn(
|
||||||
|
_("Cannot clear needsrepair due to flush failure, err=%d.\n"),
|
||||||
|
error);
|
||||||
|
- return;
|
||||||
|
+ goto drop;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Clear needsrepair from the superblock. */
|
||||||
|
- bp = libxfs_getsb(mp);
|
||||||
|
- if (!bp || bp->b_error) {
|
||||||
|
+ error = get_primary_sb(mp, &bp);
|
||||||
|
+ if (error) {
|
||||||
|
do_warn(
|
||||||
|
- _("Cannot clear needsrepair from primary super, err=%d.\n"),
|
||||||
|
- bp ? bp->b_error : ENOMEM);
|
||||||
|
+ _("Cannot clear needsrepair from primary super, err=%d.\n"), error);
|
||||||
|
} else {
|
||||||
|
mp->m_sb.sb_features_incompat &=
|
||||||
|
~XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR;
|
||||||
|
@@ -775,6 +831,8 @@ clear_needsrepair(
|
||||||
|
}
|
||||||
|
if (bp)
|
||||||
|
libxfs_buf_relse(bp);
|
||||||
|
+drop:
|
||||||
|
+ drop_primary_sb();
|
||||||
|
}
|
||||||
|
|
||||||
|
static void
|
||||||
|
@@ -797,11 +855,10 @@ force_needsrepair(
|
||||||
|
xfs_sb_version_needsrepair(&mp->m_sb))
|
||||||
|
return;
|
||||||
|
|
||||||
|
- bp = libxfs_getsb(mp);
|
||||||
|
- if (!bp || bp->b_error) {
|
||||||
|
+ error = get_primary_sb(mp, &bp);
|
||||||
|
+ if (error) {
|
||||||
|
do_log(
|
||||||
|
- _("couldn't get superblock to set needsrepair, err=%d\n"),
|
||||||
|
- bp ? bp->b_error : ENOMEM);
|
||||||
|
+ _("couldn't get superblock to set needsrepair, err=%d\n"), error);
|
||||||
|
} else {
|
||||||
|
/*
|
||||||
|
* It's possible that we need to set NEEDSREPAIR before we've
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
@ -0,0 +1,90 @@
|
|||||||
|
From 79ba1e15d80eba3aff4396f44629eb8960722d36 Mon Sep 17 00:00:00 2001
|
||||||
|
From: Srikanth C S <srikanth.c.s@oracle.com>
|
||||||
|
Date: Tue, 13 Dec 2022 22:45:43 +0530
|
||||||
|
Subject: [PATCH] fsck.xfs: mount/umount xfs fs to replay log before running
|
||||||
|
xfs_repair
|
||||||
|
|
||||||
|
After a recent data center crash, we had to recover root filesystems
|
||||||
|
on several thousands of VMs via a boot time fsck. Since these
|
||||||
|
machines are remotely manageable, support can inject the kernel
|
||||||
|
command line with 'fsck.mode=force fsck.repair=yes' to kick off
|
||||||
|
xfs_repair if the machine won't come up or if they suspect there
|
||||||
|
might be deeper issues with latent errors in the fs metadata, which
|
||||||
|
is what they did to try to get everyone running ASAP while
|
||||||
|
anticipating any future problems. But, fsck.xfs does not address the
|
||||||
|
journal replay in case of a crash.
|
||||||
|
|
||||||
|
fsck.xfs does xfs_repair -e if fsck.mode=force is set. It is
|
||||||
|
possible that when the machine crashes, the fs is in inconsistent
|
||||||
|
state with the journal log not yet replayed. This can drop the machine
|
||||||
|
into the rescue shell because xfs_fsck.sh does not know how to clean the
|
||||||
|
log. Since the administrator told us to force repairs, address the
|
||||||
|
deficiency by cleaning the log and rerunning xfs_repair.
|
||||||
|
|
||||||
|
Run xfs_repair -e when fsck.mode=force and repair=auto or yes.
|
||||||
|
Replay the logs only if fsck.mode=force and fsck.repair=yes. For
|
||||||
|
other option -fa and -f drop to the rescue shell if repair detects
|
||||||
|
any corruptions.
|
||||||
|
|
||||||
|
Signed-off-by: Srikanth C S <srikanth.c.s@oracle.com>
|
||||||
|
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
|
||||||
|
Signed-off-by: Carlos Maiolino <cem@kernel.org>
|
||||||
|
---
|
||||||
|
fsck/xfs_fsck.sh | 31 +++++++++++++++++++++++++++++--
|
||||||
|
1 file changed, 29 insertions(+), 2 deletions(-)
|
||||||
|
|
||||||
|
diff --git a/fsck/xfs_fsck.sh b/fsck/xfs_fsck.sh
|
||||||
|
index 6af0f22..62a1e0b 100755
|
||||||
|
--- a/fsck/xfs_fsck.sh
|
||||||
|
+++ b/fsck/xfs_fsck.sh
|
||||||
|
@@ -31,10 +31,12 @@ repair2fsck_code() {
|
||||||
|
|
||||||
|
AUTO=false
|
||||||
|
FORCE=false
|
||||||
|
+REPAIR=false
|
||||||
|
while getopts ":aApyf" c
|
||||||
|
do
|
||||||
|
case $c in
|
||||||
|
- a|A|p|y) AUTO=true;;
|
||||||
|
+ a|A|p) AUTO=true;;
|
||||||
|
+ y) REPAIR=true;;
|
||||||
|
f) FORCE=true;;
|
||||||
|
esac
|
||||||
|
done
|
||||||
|
@@ -64,7 +66,32 @@ fi
|
||||||
|
|
||||||
|
if $FORCE; then
|
||||||
|
xfs_repair -e $DEV
|
||||||
|
- repair2fsck_code $?
|
||||||
|
+ error=$?
|
||||||
|
+ if [ $error -eq 2 ] && [ $REPAIR = true ]; then
|
||||||
|
+ echo "Replaying log for $DEV"
|
||||||
|
+ mkdir -p /tmp/repair_mnt || exit 1
|
||||||
|
+ for x in $(cat /proc/cmdline); do
|
||||||
|
+ case $x in
|
||||||
|
+ root=*)
|
||||||
|
+ ROOT="${x#root=}"
|
||||||
|
+ ;;
|
||||||
|
+ rootflags=*)
|
||||||
|
+ ROOTFLAGS="-o ${x#rootflags=}"
|
||||||
|
+ ;;
|
||||||
|
+ esac
|
||||||
|
+ done
|
||||||
|
+ test -b "$ROOT" || ROOT=$(blkid -t "$ROOT" -o device)
|
||||||
|
+ if [ $(basename $DEV) = $(basename $ROOT) ]; then
|
||||||
|
+ mount $DEV /tmp/repair_mnt $ROOTFLAGS || exit 1
|
||||||
|
+ else
|
||||||
|
+ mount $DEV /tmp/repair_mnt || exit 1
|
||||||
|
+ fi
|
||||||
|
+ umount /tmp/repair_mnt
|
||||||
|
+ xfs_repair -e $DEV
|
||||||
|
+ error=$?
|
||||||
|
+ rm -d /tmp/repair_mnt
|
||||||
|
+ fi
|
||||||
|
+ repair2fsck_code $error
|
||||||
|
exit $?
|
||||||
|
fi
|
||||||
|
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
33
0045-xfs_db-fix-dir3-block-magic-check.patch
Normal file
33
0045-xfs_db-fix-dir3-block-magic-check.patch
Normal file
@ -0,0 +1,33 @@
|
|||||||
|
From 7374f58bfeb38467bab6552a47a5cd6bbe3c2e2e Mon Sep 17 00:00:00 2001
|
||||||
|
From: "Darrick J. Wong" <djwong@kernel.org>
|
||||||
|
Date: Tue, 20 Dec 2022 16:53:34 -0800
|
||||||
|
Subject: [PATCH] xfs_db: fix dir3 block magic check
|
||||||
|
|
||||||
|
Fix this broken check, which (amazingly) went unnoticed until I cranked
|
||||||
|
up the warning level /and/ built the system for s390x.
|
||||||
|
|
||||||
|
Fixes: e96864ff4d4 ("xfs_db: enable blockget for v5 filesystems")
|
||||||
|
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
||||||
|
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
|
||||||
|
Signed-off-by: Carlos Maiolino <cem@kernel.org>
|
||||||
|
---
|
||||||
|
db/check.c | 2 +-
|
||||||
|
1 file changed, 1 insertion(+), 1 deletion(-)
|
||||||
|
|
||||||
|
diff --git a/db/check.c b/db/check.c
|
||||||
|
index bb27ce5..964756d 100644
|
||||||
|
--- a/db/check.c
|
||||||
|
+++ b/db/check.c
|
||||||
|
@@ -2578,7 +2578,7 @@ process_data_dir_v2(
|
||||||
|
error++;
|
||||||
|
}
|
||||||
|
if ((be32_to_cpu(data->magic) == XFS_DIR2_BLOCK_MAGIC ||
|
||||||
|
- be32_to_cpu(data->magic) == XFS_DIR2_BLOCK_MAGIC) &&
|
||||||
|
+ be32_to_cpu(data->magic) == XFS_DIR3_BLOCK_MAGIC) &&
|
||||||
|
stale != be32_to_cpu(btp->stale)) {
|
||||||
|
if (!sflag || v)
|
||||||
|
dbprintf(_("dir %lld block %d bad stale tail count %d\n"),
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
@ -0,0 +1,36 @@
|
|||||||
|
From b7b81f336ac02f4e4f24e0844a7fb3023c489667 Mon Sep 17 00:00:00 2001
|
||||||
|
From: "Darrick J. Wong" <djwong@kernel.org>
|
||||||
|
Date: Tue, 14 Mar 2023 18:01:55 -0700
|
||||||
|
Subject: [PATCH] xfs_repair: fix incorrect dabtree hashval comparison
|
||||||
|
|
||||||
|
If an xattr structure contains enough names with the same hash value to
|
||||||
|
fill multiple xattr leaf blocks with names all hashing to the same
|
||||||
|
value, then the dabtree nodes will contain consecutive entries with the
|
||||||
|
same hash value.
|
||||||
|
|
||||||
|
This causes false corruption reports in xfs_repair because it's not
|
||||||
|
expecting such a huge same-hashing structure. Fix that.
|
||||||
|
|
||||||
|
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
||||||
|
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
|
||||||
|
Signed-off-by: Carlos Maiolino <cem@kernel.org>
|
||||||
|
---
|
||||||
|
repair/da_util.c | 2 +-
|
||||||
|
1 file changed, 1 insertion(+), 1 deletion(-)
|
||||||
|
|
||||||
|
diff --git a/repair/da_util.c b/repair/da_util.c
|
||||||
|
index 7239c2e..b229422 100644
|
||||||
|
--- a/repair/da_util.c
|
||||||
|
+++ b/repair/da_util.c
|
||||||
|
@@ -330,7 +330,7 @@ _("%s block used/count inconsistency - %d/%hu\n"),
|
||||||
|
/*
|
||||||
|
* hash values monotonically increasing ???
|
||||||
|
*/
|
||||||
|
- if (cursor->level[this_level].hashval >=
|
||||||
|
+ if (cursor->level[this_level].hashval >
|
||||||
|
be32_to_cpu(nodehdr.btree[entry].hashval)) {
|
||||||
|
do_warn(
|
||||||
|
_("%s block hashvalue inconsistency, expected > %u / saw %u\n"),
|
||||||
|
--
|
||||||
|
1.8.3.1
|
||||||
|
|
||||||
@ -1,6 +1,6 @@
|
|||||||
Name: xfsprogs
|
Name: xfsprogs
|
||||||
Version: 5.14.1
|
Version: 5.14.1
|
||||||
Release: 14
|
Release: 15
|
||||||
Summary: Administration and debugging tools for the XFS file system
|
Summary: Administration and debugging tools for the XFS file system
|
||||||
License: GPL+ and LGPLv2+
|
License: GPL+ and LGPLv2+
|
||||||
URL: https://xfs.wiki.kernel.org
|
URL: https://xfs.wiki.kernel.org
|
||||||
@ -30,6 +30,42 @@ Patch8: 0008-xfs_repair-fix-the-problem-of-repair-failure-caused-.patch
|
|||||||
Patch9: 0009-mkfs.xfs-fix-segmentation-fault-caused-by-accessing-.patch
|
Patch9: 0009-mkfs.xfs-fix-segmentation-fault-caused-by-accessing-.patch
|
||||||
Patch10: 0010-xfs_repair-fix-warn-in-xfs_buf_find-when-growfs-fails.patch
|
Patch10: 0010-xfs_repair-fix-warn-in-xfs_buf_find-when-growfs-fails.patch
|
||||||
Patch11: 0011-xfs_copy-don-t-use-cached-buffer-reads-until-after-l.patch
|
Patch11: 0011-xfs_copy-don-t-use-cached-buffer-reads-until-after-l.patch
|
||||||
|
Patch12: 0012-xfs-sb-verifier-doesn-t-handle-uncached-sb-buffer.patch
|
||||||
|
Patch13: 0013-libxfs-always-initialize-internal-buffer-map.patch
|
||||||
|
Patch14: 0014-libxfs-shut-down-filesystem-if-we-xfs_trans_cancel-w.patch
|
||||||
|
Patch15: 0015-xfs_db-fix-nbits-parameter-in-fa_ino-48-functions.patch
|
||||||
|
Patch16: 0016-xfs_repair-update-secondary-superblocks-after-changi.patch
|
||||||
|
Patch17: 0017-xfs_repair-fix-AG-header-btree-level-comparisons.patch
|
||||||
|
Patch18: 0018-xfs-fix-maxlevels-comparisons-in-the-btree-staging-c.patch
|
||||||
|
Patch19: 0019-xfs-fold-perag-loop-iteration-logic-into-helper-func.patch
|
||||||
|
Patch20: 0020-xfs-rename-the-next_agno-perag-iteration-variable.patch
|
||||||
|
Patch21: 0021-xfs-terminate-perag-iteration-reliably-on-agcount.patch
|
||||||
|
Patch22: 0022-xfs-fix-perag-reference-leak-on-iteration-race-with-.patch
|
||||||
|
Patch23: 0023-mkfs-fix-missing-validation-of-l-size-against-maximu.patch
|
||||||
|
Patch24: 0024-mkfs-reduce-internal-log-size-when-log-stripe-units-.patch
|
||||||
|
Patch25: 0025-mkfs-don-t-let-internal-logs-bump-the-root-dir-inode.patch
|
||||||
|
Patch26: 0026-mkfs-improve-log-extent-validation.patch
|
||||||
|
Patch27: 0027-xfs_repair-detect-v5-featureset-mismatches-in-second.patch
|
||||||
|
Patch28: 0028-xfs_repair-check-the-ftype-of-dot-and-dotdot-directo.patch
|
||||||
|
Patch29: 0029-mkfs-Fix-memory-leak.patch
|
||||||
|
Patch30: 0030-xfs-zero-inode-fork-buffer-at-allocation.patch
|
||||||
|
Patch31: 0031-xfs-detect-self-referencing-btree-sibling-pointers.patch
|
||||||
|
Patch32: 0032-xfs-validate-inode-fork-size-against-fork-format.patch
|
||||||
|
Patch33: 0033-xfs_repair-always-rewrite-secondary-supers-when-need.patch
|
||||||
|
Patch34: 0034-xfs_repair-ignore-empty-xattr-leaf-blocks.patch
|
||||||
|
Patch35: 0035-xfs_repair-Search-for-conflicts-in-inode_tree_ptrs-w.patch
|
||||||
|
Patch36: 0036-mkfs-terminate-getsubopt-arrays-properly.patch
|
||||||
|
Patch37: 0037-mkfs-complain-about-impossible-log-size-constraints.patch
|
||||||
|
Patch38: 0038-xfs-trim-the-mapp-array-accordingly-in-xfs_da_grow_i.patch
|
||||||
|
Patch39: 0039-xfs-fix-exception-caused-by-unexpected-illegal-bestc.patch
|
||||||
|
Patch40: 0040-xfs-increase-rename-inode-reservation.patch
|
||||||
|
Patch41: 0041-xfs-fix-sb-write-verify-for-lazysbcount.patch
|
||||||
|
Patch42: 0042-xfs_repair-don-t-crash-on-unknown-inode-parents-in-d.patch
|
||||||
|
Patch43: 0043-xfs_repair-retain-superblock-buffer-to-avoid-write-h.patch
|
||||||
|
Patch44: 0044-fsck.xfs-mount-umount-xfs-fs-to-replay-log-before-ru.patch
|
||||||
|
Patch45: 0045-xfs_db-fix-dir3-block-magic-check.patch
|
||||||
|
Patch46: 0046-xfs_repair-fix-incorrect-dabtree-hashval-comparison.patch
|
||||||
|
|
||||||
|
|
||||||
%description
|
%description
|
||||||
xfsprogs are the userspace utilities that manage XFS filesystems.
|
xfsprogs are the userspace utilities that manage XFS filesystems.
|
||||||
@ -113,6 +149,9 @@ rm -rf %{buildroot}%{_datadir}/doc/xfsprogs/
|
|||||||
|
|
||||||
|
|
||||||
%changelog
|
%changelog
|
||||||
|
* Wed Dec 27 2023 wuguanghao <wuguanghao3@huawei.com> - 5.14.1-15
|
||||||
|
- backport patches from community
|
||||||
|
|
||||||
* Fri Nov 3 2023 wuguanghao <wuguanghao3@huawei.com> - 5.14.1-14
|
* Fri Nov 3 2023 wuguanghao <wuguanghao3@huawei.com> - 5.14.1-14
|
||||||
- xfs_copy: don't use cached buffer reads until after libxfs_mount
|
- xfs_copy: don't use cached buffer reads until after libxfs_mount
|
||||||
|
|
||||||
|
|||||||
Loading…
x
Reference in New Issue
Block a user