ECHA's 2026 Dose-Setting Update for Reproductive Toxicity Studies: Sharper Definitions, Same Unresolved Argument

Chemtox's opinion of the revised ECHA advice on dose-level selection for OECD TGs 414, 421/422 and 443, benchmarked against the 2022 original.

D. Wilkes, K. Hall, H. Patel, M. Koutsoukalis

6/25/2026

Why this document matters

If you work anywhere near REACH registration, you already know that dose-level selection is the single most consequential — and most argued-over — decision in a developmental and reproductive toxicity (DART) testing programme. Get it wrong and you either miss a hazard entirely (an underpowered, under-dosed study that tells you nothing) or you generate a study so confounded by generalised toxicity that no one can agree what it means.

ECHA's "Advice on dose-level selection for the conduct of reproductive toxicity studies (OECD TGs 414, 421/422 and 443) under REACH" is the Agency's attempt to settle that argument for registrants, contract laboratories, and its own evaluators. The original version, published in January 2022, became one of the more contested pieces of EU regulatory toxicology guidance of the decade. It was contested enough that a coalition of professional and scientific societies (the European Teratology Society, the Society for Birth Defects Research and Prevention, the Society of Toxicology's Reproductive and Developmental Toxicology Specialty Section, ECETOC, NC3Rs, the UK Industrial Reproductive Toxicology Discussion Group, and ESTIV) put their names to a formal rebuttal in Regulatory Toxicology and Pharmacology in 2024 (Beekhuijzen et al.).

ECHA has now published a revised version, dated June 2026. The obvious question — the one this article exists to answer — is whether the revision actually responds to industry's comments, or whether it simply restates the same underlying philosophy in tighter legal language.

Having read both versions side by side, line by line, our answer is: mostly the latter. The 2026 text is a genuine technical improvement in places — it is more precise, better anchored to formal animal-welfare law, and more internally consistent than its predecessor. But on the three substantive points that the DART scientific community actually raised, it does not move, and in at least two respects it arguably hardens ECHA's original position rather than softening it.

What actually changed: a structured walkthrough

1. A new, dedicated section demanding "experimental data" and finer dose-spacing

The single biggest structural addition in 2026 is an entirely new section, absent from 2022, titled in substance "Experimental data is needed to set dose levels." It does two things:

  • It states explicitly that dose selection must be grounded in actual study data — either Test Guideline studies or non-guideline dose-range-finding (DRF) studies — and that the doses used in those supporting studies must correspond closely to those intended for the definitive study.

  • It introduces, for the first time, explicit guidance on dose-spacing near the top of the range. If existing data shows a steep dose-response curve — for example, severe suffering or mortality at one dose, and no, minimal, or mild effects at the dose 2- to 4-fold below it — the document now states that the conventional 2- to 4-fold spacing used in OECD test guidelines is "likely not appropriate," and that additional dose groups, or an entirely new DRF study, should be considered to "identify a dose that induces toxic effects without severe suffering."

There is one welfare-conscious caveat attached: where additional dose groups are added, ECHA suggests this should be done "without increasing the number of animals" — i.e., spread the same total N across more groups rather than adding new animals. It's a real nod to 3Rs thinking, but it does not address the core complaint, which is about the number of studies (and the severity within them) required to triangulate the precise MTD threshold, not solely the per-study animal count.
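
To make the spacing arithmetic concrete, here is a minimal Python sketch. It is an illustration under our own assumptions, not anything the guidance prescribes: the doses (100 and 300 mg/kg bw/day), the pool of 40 animals, and the function names are all hypothetical, and placing the extra groups evenly on a log scale is a common convention that we have assumed rather than a rule taken from the ECHA text.

```python
def intermediate_doses(lower, upper, n_extra):
    """Place n_extra doses between lower and upper, evenly spaced on a log scale.

    Geometric spacing is an assumption made for this sketch.
    """
    if not (0 < lower < upper) or n_extra < 1:
        raise ValueError("need 0 < lower < upper and at least one extra dose")
    # Constant fold-change between neighbouring doses.
    ratio = (upper / lower) ** (1 / (n_extra + 1))
    return [lower * ratio ** i for i in range(1, n_extra + 1)]


def split_animals(total_animals, n_groups):
    """Spread a fixed pool of animals across groups as evenly as possible."""
    base, remainder = divmod(total_animals, n_groups)
    return [base + (1 if i < remainder else 0) for i in range(n_groups)]


# Hypothetical steep dose-response: minimal effects at 100, severe effects at 300.
extra = intermediate_doses(100, 300, n_extra=2)
print([round(d) for d in extra])   # [144, 208]

# Same 40 animals, first across 4 groups, then across 6 groups.
print(split_animals(40, 4))        # [10, 10, 10, 10]
print(split_animals(40, 6))        # [7, 7, 7, 7, 6, 6]
```

In this hypothetical, closing a three-fold gap with two extra groups brings the spacing down to roughly 1.4-fold, while holding the total at 40 animals shrinks each group from 10 to 6 or 7. That smaller group size is the trade-off the "without increasing the number of animals" caveat implies.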

This section did not exist in 2022 in anything like this form, and it is worth dwelling on because it is the clearest evidence of how ECHA is choosing to operationalise its evaluation findings. Recall that ECHA's own internal review, the one that motivated the 2022 guidance in the first place, found that 20% of the extended one-generation reproductive toxicity studies (EOGRTS, OECD TG 443) it evaluated used dose levels too low to identify hazards. The 2026 revision responds to that finding by making the upper end of the dose range a harder, more explicitly interrogated target, not a softer one.

2. Two brand-new worked examples (7 and 8) — and they cut the wrong way for the critics

The 2022 document had six worked case examples for TG 443 dose selection. The 2026 version adds two more, and they are, in our view, the most revealing additions in the entire revision.

Example 7 — "Inappropriate characterization of severe suffering." A TG 422 study shows body-weight reductions of 8%, 5%, and 10% (low, mid, high dose) and reductions in sperm count per cauda epididymis of 8%, 15%, and 33% at the same doses, with no clinical signs. The registrant proposes capping the EOGRTS top dose at the mid-dose level, arguing the high dose shows "excessive toxicity." ECHA's conclusion: rejected. A body-weight decrement under 20% does not, by itself, constitute severe suffering — nor does a 33% reduction in sperm count — so the EOGRTS top dose should remain at the originally tested high dose, not the more conservative mid-dose the registrant proposed.
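
The numeric check at the heart of Example 7 can be sketched in a few lines of Python. This is our simplified reading, not ECHA's method: it assumes the only criterion tested is the 20% body-weight decrement mentioned in the example, and the function and variable names are ours. Real severity assessment also rests on clinical signs and expert judgement that a single numeric comparison cannot capture.

```python
# Threshold cited in Example 7; treating it as the sole criterion is our simplification.
BODY_WEIGHT_LIMIT_PCT = 20

# Percent reductions versus control reported in Example 7 (TG 422 study).
example_7 = {
    "low":  {"body_weight": 8,  "sperm_count": 8},
    "mid":  {"body_weight": 5,  "sperm_count": 15},
    "high": {"body_weight": 10, "sperm_count": 33},
}


def doses_over_body_weight_limit(study, limit_pct=BODY_WEIGHT_LIMIT_PCT):
    """Return the dose groups whose body-weight decrement exceeds the limit."""
    return [
        dose for dose, effects in study.items()
        if effects["body_weight"] > limit_pct
    ]


print(doses_over_body_weight_limit(example_7))  # []
```

No dose group is flagged, which mirrors ECHA's conclusion that the high dose stays in play. Note that the sperm-count figures never enter the check: on this reading, a 33% reduction is a reproductive finding to be investigated, not evidence of severe suffering that would cap the dose.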

Example 8 — "Existing information shows severe suffering (but not death)." Here a TG 414 study shows genuinely severe signs (prostration, incoordination, breathing difficulty, hepatic necrosis) at the top dose, with no signs at all at a dose three-fold lower. ECHA's conclusion: the high dose is correctly excluded as dose-limiting, but because the gap to the next lower tested dose is so wide, additional dose groups or a new dedicated DRF study are required to pin down the precise dose at which severe suffering begins.

Read together, these two examples do real, useful work: they import a specific, externally sourced numeric anchor — the >20% body-weight-loss threshold — that traces directly back to the ECETOC/Lewis et al. dose-setting literature, which is exactly the literature Beekhuijzen et al. cite as the scientifically preferable alternative to ECHA's approach. That is a genuine point of engagement, and credit should be given for it.

But look at how it is used. In Beekhuijzen's and ECETOC's framing, the >20% body-weight threshold functions as a ceiling — a signal that a dose is producing secondary, non-specific systemic stress that will confound interpretation of the chemical's intrinsic reproductive effects, and therefore a reason to stay below it. In Example 7, ECHA uses the exact same numeric threshold as a floor that has not yet been cleared — i.e., evidence that the registrant has not yet justified going lower, even in the presence of a substantial (33%) reduction in sperm count. The threshold is the same; the direction of the inference is the opposite of what its originators intended.

Example 8 then reinforces the new dose-spacing rule with a worked precedent that other registrants and evaluators will now be expected to follow: when there's a wide gap between "clearly fine" and "clearly severe," the answer is more testing to find the exact line, not erring toward the conservative side of an already-wide gap.

3. "Severe suffering" gets a firmer — though still imperfect — legal anchor

This is the area where we think the 2026 revision makes its most legitimate, good-faith improvement. The 2022 text leaned on a single citation: OECD Guidance Document 19 (GD19), on the recognition, assessment, and use of clinical signs as humane endpoints. GD19 dates to 2000 and has not been substantively updated since; it is, by the admission of multiple commentators in this space, looking dated relative to current animal-welfare science and practice.

The 2026 text adds, in multiple places throughout the document (general criteria, classification and labelling, and reporting sections), explicit reference to the severity-classification framework under EU Directive 2010/63/EU — the binding EU legislation governing the protection of animals used for scientific purposes. This gives assessors a second, legally grounded, externally defined yardstick for what counts as "severe suffering," rather than relying solely on a two-decade-old OECD document.

This is a real, citable response to the "poorly defined, subjective" criticism that previous opinions raised. It does not resolve the inherent difficulty of judgement calls across species, studies, and laboratories. But tying the standard to a formal legal severity framework, rather than an ECHA-internal interpretation, is a defensible and useful change.

4. Endocrine disruption (Article 57(f)) language is dropped from the general rationale

The 2022 text explicitly stated that dose-level selection should ensure conclusive data generation for classification and labelling, risk assessment, and whether a substance meets the SVHC criteria for endocrine disruption under Article 57(f) of REACH. The 2026 text simplifies this to classification and labelling and risk assessment only, with the endocrine-disruption reference removed.

This is worth flagging clearly, though we'd caution against over-reading it: the CLP classification framework and underlying REACH obligations around endocrine disruption have not changed, and the studies in question (TG 414, 421/422, 443) remain part of the evidentiary base used in endocrine-disruption assessment regardless of whether this specific paragraph name-checks Article 57(f). It likely reflects that ED assessment has accumulated its own dedicated guidance elsewhere since 2022, making the cross-reference here redundant rather than substantively abandoned. Still, it's a textual change worth registering for anyone tracking how ECHA frames the purpose of these studies over time.

5. The human-relevance sentence disappears entirely

This is, in our interpretation, the most quietly significant deletion in the whole document. The 2022 text contained exactly one sentence anywhere in the document that explicitly invoked human relevance as a factor in dose selection: "Expected human response may indicate the need to use a dose level above 1,000 mg/kg bw/day." It was narrow in scope — framed only as a reason to potentially exceed the standard limit dose — but it was the only place in the entire document where the question "does this matter for humans" appeared as an explicit, named consideration in the dose-setting logic, as opposed to being implicit in the downstream classification consequences.

That sentence is gone in 2026. Nothing has been added in its place anywhere else in the document. There is no new discussion of mode of action, biological plausibility, or kinetic relevance to human exposure routes. If you set out to verify, sentence by sentence, whether ECHA's revision engages with the "limited discussion of human relevance" criticism, the honest answer is that it engages with it by deleting the one sentence that previously gestured toward it. We don't think this was a deliberate philosophical statement by ECHA; it reads more like a side effect of tightening the limit-dose paragraph for clarity. But the practical result is a document that is, if anything, less anchored to human relevance in 2026 than it was in 2022.

6. Toxicokinetics: one new clause, no change in substance

The 2022 document was fairly blunt about toxicokinetics: TK information "may provide reasons to adjust... the dosing route and regime," but "setting the dose level by toxicokinetic considerations only is not allowed under REACH because dose-level selection should be based on toxicity." The 2026 text adds a single new clause upfront ("toxicokinetic information may assist in assessing appropriate dose levels") before repeating the same restrictive conclusion verbatim.

This is the part of the revision we'd characterise as cosmetic rather than substantive. The acknowledgement that TK "may assist" is true, and better than nothing, but it does not constitute engagement with the actual scientific argument being made.

7. A few genuinely useful, narrower improvements

Not everything in the revision is contested philosophy — some of it is good, uncontroversial housekeeping:

  • New cohabitation-phase guidance. For dietary or drinking-water studies, where both sexes share an exposure medium and can't easily be dosed independently, the 2026 text explicitly allows adjusting the highest dose during the mating phase to protect the more sensitive sex, based on expert judgement. This is a sensible, practical, animal-welfare-positive addition that the 2022 text simply didn't address.

  • A tightened TG 414 rabbit example. The worked example for embryo-fetal effects in rabbits previously cited a bare 15% reduction in gravid uterine weight as sufficient grounds to set the top dose. The 2026 version raises this to a 30% example and explicitly requires the effect to be characterised as "treatment-related and biologically relevant". That is a real, if narrow, move toward adversity-based judgement rather than a bare percentage trigger, in exactly the direction Beekhuijzen et al. are arguing for. It's a single example, not a general principle, but it's a genuine win for their position, however small.

  • New weight-of-evidence language for interpreting steep dose-response curves in rabbits. Also a positive, if locally scoped, addition.

  • A Board of Appeal decision is now cited in the TG 443 section (Case A-006-2022, Symrise and Others, 29 August 2023) to support the basis for clear-evidence determinations on sexual function and fertility — a sign that this guidance is increasingly being shaped by, and is expected to withstand, formal legal challenge, not just scientific debate.

8. One change that arguably moves backward relative to offspring protection

The 2022 text explicitly noted, for both TG 421 and TG 422 when used as dose-range finders for TG 443, that "prolongation until weaning is recommended to cover the sensitive life stages of pups from parturition to weaning during lactation." This specific advice — encouraging extension of the screening study to better capture postnatal and lactational windows — has been deleted in the 2026 text, which now simply states the studies "can be used also as a dose-range finder for OECD TG 443" with no further elaboration.

Given that Beekhuijzen et al.'s central concern is precisely that postnatal and offspring-related hazards risk being under-investigated when TG 443 dosing prioritises parental fertility, removing rather than reinforcing language that encouraged earlier, dedicated attention to the lactational window is a step that cuts against their position, not toward it. It may simply be editorial streamlining rather than a deliberate policy signal, but the practical effect is the same either way.

The central, unmoved question: TG 443 and the priority of fertility over offspring

Everything above is detail. Here is the substance.

Beekhuijzen et al.'s first and, in our reading, most important point is this: the EOGRTS (TG 443) is explicitly designed by OECD to evaluate both parental reproductive function and postnatal developmental outcomes in offspring across a sensitive developmental window — it is, structurally, the only standard REACH information requirement that does this. ECHA's 2022 guidance instructed that dose levels for TG 443 "should not be reduced to get enough offspring for the assessment of developmental toxicity," explicitly subordinating offspring survival and developmental assessment to parental fertility dosing.

We checked this language carefully against the 2026 text. It is retained verbatim. The sentence "the dose levels should not be reduced to get enough offspring for the assessment of developmental toxicity" appears unchanged. The instruction that the focus on sexual function and fertility "should be prioritised in the study design" is also retained verbatim. And — most tellingly — Example 4, the worked case where a substance already shows severe, clear post-implantation loss sufficient for self-classification as Repr. 1B H360D, but is still required to be dosed at the limit dose in the EOGRTS "despite the developmental effects observed," because developmental classification "is not a valid adaptation justification" for the TG 443 study requirement, is carried over from 2022 to 2026 essentially unchanged, down to the specific numbers.

This is exactly the scenario Beekhuijzen et al. describe playing out in practice in their paper, citing two unpublished EOGRTS reports where dosing at the maternally-tolerated limit produced offspring scoliosis, severe body-weight retardation, and neonatal lethality severe enough that the F1 generation had to be euthanised at postnatal day 21 — precluding assessment of the very postweaning and second-generation endpoints the study exists to generate. Their point is not hypothetical; it is drawn from real studies conducted under this exact guidance. The 2026 revision does not change the rule that produced those outcomes.

If you are asking "did the 2026 update address the single biggest substantive complaint in the field's formal rebuttal," the answer, on the textual evidence, is no.

Scoring the revision against Beekhuijzen et al. (2024)

Point 1: Segregating fertility testing from offspring viability risks missing postnatal/developmental hazards
Addressed in 2026 revision? No. Core language retained verbatim; TG 421/422 "prolongation to weaning" advice deleted, not reinforced; Example 4 unchanged.
Evidence: Direct textual comparison

Point 2: High-dose recommendations exceed the MTD and confound biological response with intrinsic toxicity
Addressed in 2026 revision? No; arguably hardened. New Example 7 explicitly overrules a more conservative registrant proposal despite a 33% sperm-count reduction, because body-weight loss stayed under 20%. New dose-spacing rules push toward finer resolution at the top of the range.
Evidence: New Example 7; new "Experimental data" section

Point 3: Net effect is more animal use via repeats, extra DRF granularity, and rejection by other regulators
Addressed in 2026 revision? No; arguably hardened. New dose-spacing/DRF granularity requirements and Example 8 both explicitly call for additional dose groups or dedicated new DRF studies near the MTD threshold.
Evidence: New "Experimental data" section; new Example 8

The one place where ECHA does engage substantively and in good faith with the literature Beekhuijzen et al. point to — importing the >20% body-weight-decrement threshold from the ECETOC/Lewis et al. dose-setting framework — it imports the number but inverts its function, using it to justify staying at a higher dose rather than as a signal to step down.

Unanswered questions and what should happen next

Having gone through this in detail, here is where we think the genuinely open scientific and regulatory questions sit.

1. Is "highest possible dose without severe suffering" actually the right target for hazard detection in DART studies? This is the foundational disagreement, and the 2026 revision does not engage with it — it assumes the answer is yes and focuses entirely on defining the boundary more precisely. The competing position, well supported in the maternal-toxicity and dose-selection literature ECETOC and Beekhuijzen et al. cite, is that doses approaching the MTD in pregnant and lactating animals — across an exposure window (10 weeks premating plus gestation plus lactation in TG 443) considerably longer and more physiologically demanding than the screening studies used to inform that dose — risk producing secondary, non-specific effects (stress-mediated litter loss, reduced maternal care, generalized systemic compromise) that are not the chemical's intrinsic property and that actively reduce, rather than improve, the interpretability of the reproductive and developmental endpoints the study exists to measure. This is not a fringe view; it carries the institutional backing of seven professional and scientific societies. ECHA's revision restates its original position with more precision but does not address the substance of this disagreement.

2. How do we reconcile a "push to the ceiling" EU standard with global regulatory acceptance? Beekhuijzen et al. explicitly raise the risk that EOGRTS studies dosed to meet ECHA's high-dose expectations may not be acceptable to other major regulators (they cite US EPA and Japan's MAFF as examples) precisely because the resulting toxicity compromises interpretability by those agencies' own standards — which would mean registrants conducting a single study design that satisfies neither jurisdiction cleanly, and in the worst case repeating studies for different markets. The 2026 text does not address international harmonisation at all. This remains, in our view, one of the most practically important open questions for any company doing global, not just EU, registration.

3. Where is the line between "evidence-based" and "burden of proof you can never fully discharge"? The new requirement that dose-spacing near the top of the range be interrogated more finely sounds reasonable in the abstract, but Examples 7 and 8 together establish a pattern where almost any registrant proposal to set a more conservative top dose can be met with "you haven't yet proven the threshold lies below the level you tested," triggering further studies. At what point does this become an unfalsifiable standard — one where there is always, in principle, a finer-grained DRF study that could narrow the threshold further, and therefore always a basis to reject a registrant's more conservative proposal? The document doesn't engage with this risk, and we would flag it as a real practical concern for anyone designing a dose-range-finding programme under this guidance.

4. What happened to human relevance, and was its removal intentional? Given how central this is to Beekhuijzen et al.'s critique, it would be useful for ECHA to clarify publicly whether the deletion of the one human-relevance sentence in the 2022 text was a deliberate narrowing of scope or simply incidental to tightening the surrounding paragraph. As it stands, the silence is doing a lot of unintended communicative work.

5. Does GD19 need replacing, not just supplementing? ECHA's heavier reliance on OECD GD19 to define severe suffering, while welcome as a clarifying move, runs into the limitation that GD19 itself is around 25 years old and has not kept pace with current welfare science. Citing it more often does not fix the fact that it may need substantive revision. This is arguably a problem for OECD to solve rather than ECHA, but ECHA's increased reliance on it makes the case for an OECD-led update more urgent, not less.

6. Is there a workable middle path? The ECETOC dose-setting guidance (Lewis et al., 2024; Sewell et al., 2022) that Beekhuijzen et al. point to as the preferred alternative is not a rejection of high-dose testing in principle. It is an attempt to operationalise "appropriately high" in a way that explicitly screens out doses likely to produce confounding secondary toxicity, using criteria like the body-weight threshold ECHA itself has now partially adopted (in Example 7) but applied in the opposite direction. A genuinely responsive future revision would, in our view, need to engage with where ECETOC's framework and ECHA's framework actually diverge in outcome, not just borrow a number from one while keeping the philosophy of the other.

Bottom line

The June 2026 revision is a better-drafted, more legally precise, and in places more methodologically rigorous document than its 2022 predecessor. The new anchoring to Directive 2010/63/EU's severity framework, the tightened rabbit developmental-toxicity example, and the new cohabitation-phase guidance are all genuine, defensible improvements that a careful reader should credit.

But on the question that actually generated formal, multi-society pushback from the DART scientific community (whether EOGRTS dosing should continue to prioritise parental sexual function and fertility even at the expense of offspring survival and developmental assessment, and whether "highest possible dose without severe suffering" remains the right operating principle at all), the 2026 text does not move. If anything, the two new worked examples suggest ECHA's working interpretation of "appropriately high" has hardened rather than relaxed since 2022, even as the document around it has become more precise about what "severe suffering" technically means.

The debate in 2022 was about an ambiguous standard. The debate from 2026 onward will be about a far more precisely worded standard that the field's leading professional societies still don't think is the right one.

This analysis is based on direct comparison of ECHA's "Advice on dose-level selection for the conduct of reproductive toxicity studies (OECD TGs 414, 421/422 and 443) under REACH," January 2022 and June 2026 editions, and Beekhuijzen et al. (2024), Regulatory Toxicology and Pharmacology 151:105665. It reflects our toxicologist's reading of the primary texts and is not a substitute for direct consultation of the ECHA guidance or formal regulatory advice.

info@chemtoxcompliance.com