The 1500-character body floor
Our build-time validator rejects any leaf whose body sections sum to fewer than 1500 characters. We set the floor after observing indexing failures on two surfaces where the average body sat around 800 characters; raising the floor to 1500 lifted indexing coverage from 30 percent to 78 percent over 30 days. The floor is a minimum, not a target: the average on this surface is closer to 2200 characters. Words per leaf matter less than substance per leaf, but the character count is the proxy the validator can measure cheaply.
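The floor check itself is small. The following is a minimal sketch, not the actual validator: it assumes a leaf's body arrives as a list of section strings, and the names `MIN_BODY_CHARS`, `body_length`, and `passes_floor` are hypothetical. It counts characters as Python's `len` does (Unicode code points) and does not strip whitespace or markup, which the real validator may well do.

```python
# Hypothetical sketch of the body-floor check; names are illustrative only.
MIN_BODY_CHARS = 1500  # the floor stated in the text


def body_length(sections):
    """Return the total character count across a leaf's body sections.

    `sections` is assumed to be a list of plain-text strings.
    """
    return sum(len(section) for section in sections)


def passes_floor(sections, floor=MIN_BODY_CHARS):
    """Return True when the summed body length meets or exceeds the floor."""
    return body_length(sections) >= floor
```

A build step could call `passes_floor(leaf_sections)` for each leaf and fail the build on the first `False`.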
The cosine-similarity pre-publish test
Before publishing a new leaf, we run a cosine-similarity test against the existing leaves using simple TF-IDF vectors over the body text. Any pair scoring above 0.72 gets flagged for an editorial rewrite; pairs above 0.85 block the publish. The 0.72 threshold was set empirically: leaves with a peer in the 0.72 to 0.85 band reliably indexed, but their per-page impressions stayed flat, while leaves whose closest peer scored below 0.72 grew their impressions over time. The test is forty lines of Python and runs in under five seconds across this entire surface.
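The test described above can be sketched with the standard library alone. This is an illustrative version under stated assumptions, not the forty-line script itself: the tokenizer, the smoothed IDF formula, and every function name here are assumptions, and only the two thresholds (0.72 and 0.85) come from the text.

```python
import math
import re
from collections import Counter

FLAG_THRESHOLD = 0.72   # above this: flag for editorial rewrite (from the text)
BLOCK_THRESHOLD = 0.85  # above this: block the publish (from the text)


def tokenize(text):
    # Assumed tokenizer: lowercase runs of ASCII letters and digits.
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (dict of token -> weight) per document."""
    token_lists = [tokenize(doc) for doc in docs]
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))
    # Assumed smoothed IDF; the original script may weight terms differently.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return [
        {t: count * idf[t] for t, count in Counter(tokens).items()}
        for tokens in token_lists
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(token, 0.0) for token, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(score):
    """Map a similarity score to 'block', 'flag', or 'ok'."""
    if score > BLOCK_THRESHOLD:
        return "block"
    if score > FLAG_THRESHOLD:
        return "flag"
    return "ok"


def check_new_leaf(new_body, existing):
    """Compare a new leaf body against existing leaves.

    `existing` is assumed to map a leaf name to its body text.
    Returns a list of (name, score, verdict) tuples.
    """
    names = list(existing)
    vectors = tfidf_vectors([new_body] + [existing[name] for name in names])
    new_vec = vectors[0]
    return [
        (name, score, classify(score))
        for name, vec in zip(names, vectors[1:])
        for score in [cosine(new_vec, vec)]
    ]
```

A pre-publish hook could then refuse to ship when any verdict is `"block"` and open a rewrite ticket for each `"flag"`. Note that both comparisons are strict, so a pair scoring exactly 0.72 passes.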
Where the uniqueness has to come from
Numerics, examples, and arguments are the only durable sources of uniqueness. Re-skinning the same paragraph with synonyms does not pass the cosine test, does not pass a quality rater, and will not rank. Numerics that are specific to your shipping work are the cheapest source; this leaf has two: the 1500-character floor and the 0.72 cosine threshold. If you cannot find at least two specific numerics per leaf to anchor the writing, the topic is too thin for pSEO, and editorial is the better play.
