P.S: My paper was rejected because its scope (clinical evidence, epistemology, philosophy and methodology) does not fit any journals, which led me to LessWrong to share ideas as well as seek some advice.
This is a common problem in many fields, arguably getting much worse due to AI but not fundamentally new. Any sufficiently broad systemic analysis of a complex problem is likely to be outside the scope of any of the institutionalized specialties and subspecialties created to study its parts. Re-integrating those insights is hard, and sometimes requires a clear crisis and/or someone with much more than the usual amount of clout and standing to force the issue. I suspect this is part of why science in general has so much incremental/revolutionary/incremental structure to its progress, and why many surprising advances historically get made by polymaths or outsiders (even though most people who claim to be such are cranks).
In principle a single sufficiently advanced AI can solve this problem by combining adequate competence in all the fields at once in a single mind, plus arbitrarily high levels of attention to detail at high speed, plus red-teaming and spot checking and other precautions against (and punishments for) cheating. In practice we don't have anything like that today, and our institutions are not prepared for either what we have today or what may be possible in the future. And epistemically, medicine is often a cursed field, where the reality is extremely complicated and confounded and bound by enforced standards that may or may not be either relevant or useful or sufficient in specific cases.
None of this is my field, but how familiar are you with the people who were behind MetaMed? Their stories might have useful lessons for you. Ditto for a substantial minority of the corpus of SlateStarCodex and AstralCodexTen, especially some of Scott's more medically-focused "Much more than you wanted to know" posts.
You may also want to check out this post, in which an economics professor laments a somewhat similar problem: no journal considered his paper on the economics of AGI to be in scope, despite it being well received in conference presentations.
You may also also want to look at the whole EA community and its adjacent research ecosystem directly. "People willing to look straight at weird-sounding interdisciplinary questions to achieve real world positive impact" is kind of their thing.
Thank you! This is genuinely one of the most useful responses I've received on this topic. It feels a bit like finding the kind of mentor perspective I was hoping to encounter here.
I'll look into MetaMed seriously. My first pass already suggests an interesting parallel: epistemic ambition running ahead of institutional receptivity, which feels very relevant here.
Your framing about polymaths and outsiders also resonates with me. My tentative read is that the bottleneck is not breadth of competence alone, but the way incentive structures reward depth-within-lane, while synthesis across lanes carries reputational risk without a clear “home discipline” to absorb it. The four rejections this paper accumulated were all scope mismatches.
The EA community pointer is also well taken. In retrospect, LessWrong is probably one of the more natural venues for this kind of work, a place willing to seriously engage with the intersection of epistemology, institutional design, and AI-related failure modes.
I'll explore the ACX corpus you mentioned. I'm also curious whether you think an EA framing would strengthen or complicate the reception of a more empirically grounded version of this argument.
This is my very first post, and I would like to share further posts on this topic.
I'm also curious whether you think an EA framing would strengthen or complicate the reception of a more empirically grounded version of this argument.
I honestly don't know. I'm not really in the EA sphere, though I agree with the underlying EA approach. A lot of people do seem to have weird reactions to it for reasons that I freely admit make little sense to me. So I imagine that means it would probably complicate reception? But the reception already seems pretty complicated anyway, and it could still be useful to explore how they operate, what they've learned, see if there is a path there.
I'd be curious to know which tools you're using and what your workflow is.
Generic commercial chat portals from frontier labs are not (primarily) optimized for epistemology. They're optimized for likeability, perceived usefulness, engagement, etc. They are not inherently truth-seeking. They do not stake out and commit to factual positions. You can nudge them in a particular direction and push them pretty reliably into whatever space confirms your biases.
Literature reviews are useful, if you demand links to sources and double-check them. Grammar/spelling checks are great. But using a public chatbot and expecting a rigorous lab partner is not a good idea.
The big labs have all recently announced or released science tools: GPT-Rosalind, Co-Clinician, and Claude for Life Sciences.
None of these are open. GPT-Rosalind is gated trusted-access; Co-Clinician is a research project, not deployable; Claude for Life Sciences is enterprise-targeted via API credits.
LLMs don't necessarily lead to what you're calling "epistemic immunodepression". They can and will if used naively. The same way AI coding tools can easily lead to mountains of technical-debt slop without some basic rigor. The tools need to be optimized for that particular role and they need to be used responsibly. Right now, neither is happening at scale.
Thanks for the question.
My workflow is modest. For daily work, I use the public versions of Claude (cowork), GPT, and Gemini. No GPT-Rosalind, no Claude for Life Sciences, no Co-Clinician. I do not have institutional access to any gated research tool. I work as an independent clinical researcher with no funding, so what I can use is what is on the open web at a paying-user tier. For research, when I need to evaluate the output of LLMs, I use some local models through Ollama and a Google API key obtained through Google AI Studio.
What I try to do is detect disagreement between models rather than agreement. I prompt for the strongest objections to my own claim before I ask for support. I ask three models the same question and compare answers. I make the model name the assumption it is hiding. None of this is sophisticated. It is the minimum hygiene that a non-native English clinician with a full-time hospital job can manage.
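The "ask three models the same question and compare answers" step could be sketched roughly as follows. This is only an illustration, not the workflow actually used above: the function names, the 0.6 threshold, and the use of lexical similarity from Python's `difflib` are all assumptions, and lexical overlap is a crude stand-in for real disagreement. The answers are assumed to have been collected by hand or by any client and passed in as plain strings.

```python
from difflib import SequenceMatcher
from itertools import combinations


def pairwise_agreement(answers: dict[str, str]) -> dict[tuple[str, str], float]:
    """Return a rough lexical similarity in [0, 1] for every pair of models.

    `answers` maps a (hypothetical) model label to the text it returned.
    """
    scores = {}
    for (name_a, text_a), (name_b, text_b) in combinations(sorted(answers.items()), 2):
        # ratio() is 1.0 for identical strings and 0.0 when nothing matches.
        ratio = SequenceMatcher(None, text_a.lower(), text_b.lower()).ratio()
        scores[(name_a, name_b)] = ratio
    return scores


def flag_disagreements(answers: dict[str, str], threshold: float = 0.6) -> list[tuple[str, str]]:
    """List the model pairs whose answers fall below the similarity threshold.

    The threshold is an arbitrary illustrative value, not a validated cutoff.
    """
    return [pair for pair, score in pairwise_agreement(answers).items() if score < threshold]
```

A flagged pair is only a prompt for a human to read both answers side by side. Two models can use different words and agree, or similar words and disagree, so the score cannot replace that reading.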
And yes, I agree with your core point. A generic chatbot used naively will produce confirmation, not correction. That is exactly the failure mode I am trying to diagnose. My paper is not arguing that LLMs are bad. It is arguing that without structured frameworks for use, the default drift is toward what I call "epistemic immunodepression". The tools you listed (Rosalind, Co-Clinician, Life Sciences) are promising because they impose structure. The problem is the gap while we wait for them.
I will share more about the workflow and the empirical study behind this paper in the next post of the sequence. I am building this in public, mostly because I have no other option, and partly because I think the failure modes are easier to see from outside the institutions.
I am writing as a pediatric surgeon and clinical researcher, someone whose work would seem less likely to be affected by the AI explosion. However, the reality has changed completely, and this led me to hypothesize that the self-correcting ability of science (and of medicine, my specific domain) is eroding in the age of AI.
To be honest, there is an irony here: I completed this short paper (preprint https://doi.org/10.31222/osf.io/gqunf_v1) with significant support from AI, for searching original articles, language editing (truly efficient for a non-native speaker), simulated peer review, and even for asking where I should publish my work! That is all for the introduction; now let us follow my humble but fun reasoning process.
Every day, I make decisions about operating on children, based mostly on evidence-based medicine (EBM). The question is: how is this evidence generated? It starts with case reports or case series, moves on to more rigorously designed studies (cohort studies, RCTs), and ends with meta-analyses that support a conclusion or a clinical practice guideline. That chain works not because each link is perfect, but because each link can check the previous one. Previous work lists four conditions for this: independent evaluation (peer review and editorial review), methodological plurality (many types of study design help identify the truth), traceability (the ability to audit how a conclusion was reached), and epistemic friction between authors and critics (the heavy workload and cost of getting from a question to a conclusion).
In the age of AI, all four of these conditions are gradually eroding. Most obviously, friction is substantially reduced: with support from LLMs, researchers can synthesize hundreds of papers in several days rather than several months. Traceability is also challenged, because we cannot say exactly how LLMs "think" or "reason" their way to a conclusion or result. Plurality is decreasing, a phenomenon that recent literature calls "monoculture". More interestingly, AI nowadays helps researchers (like me) do research, which is then also reviewed by AI when reviewers use it.
These claims are not just my personal opinion. Empirical signals are accumulating: 28.6%–91.4% of LLM-generated references in systematic-review assistance are fabricated; only 6% of published AI models in paediatric surgery are both interpretable and externally validated; an audit of 2,271 evidence syntheses (2017–2024) documents automation spreading across search, screening, and extraction.
To make this easier to grasp, I call this syndrome "epistemic immunodepression": a passive weakening through scale, opacity, and the collapse of independence between those who generate research and those who evaluate it. Current governance cannot detect structural failure modes. The fix has to be more verifiable: a research record, an AI logbook, recalibration of the evidence pyramid, and accountability for AI use in peer review.
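As one way to picture what an AI logbook entry might look like, here is a minimal sketch. The post does not specify what the logbook should contain, so every field name below is a hypothetical illustration, as is the idea of fingerprinting each entry with a SHA-256 hash so that later edits to the record become detectable.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class LogbookEntry:
    """One recorded AI interaction in a research project (hypothetical schema)."""

    timestamp: str        # ISO 8601 time of the interaction
    model: str            # model name/version as reported by the tool
    task: str             # e.g. "literature search", "language editing"
    prompt: str           # what was asked
    output: str           # what the model returned
    human_verified: bool  # whether a person checked the output against sources

    def fingerprint(self) -> str:
        """Return a SHA-256 hex digest of the entry's canonical JSON form."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()
```

Publishing the fingerprints alongside a paper, while keeping the full entries available for audit, is one conceivable way to make AI use traceable without exposing every prompt. Whether that is the right trade-off is an open design question.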
I have also pre-registered an empirical study on OSF. If the diagnosis holds empirically, the intervention is urgent, because a journal can retract a paper, but a surgeon cannot reverse a decision already executed on a child.
P.S: My paper was rejected because its scope (clinical evidence, epistemology, philosophy and methodology) does not fit any journals, which led me to LessWrong to share ideas as well as seek some advice.