Ameyer.me

Bridging the Gap: A Review of "Mesoscale Modules as Fundamental Units of Tissue Function"

me@ameyer.me (Aaron Meyer) — Sun, 04 Jan 2026 08:00:00 +0000

In their recent perspective, Chen et al. observe that the leap from single-cell phenotypes to whole-organ function is too vast to be bridged by purely data-driven methods. They propose the existence of “mesoscale modules”—intermediate, functional units of tissue organization that serve as the building blocks for higher-order physiological behaviors. Drawing an analogy to the “network motifs” (like feed-forward loops) that revolutionized our understanding of gene regulatory networks (GRNs), the authors suggest that tissues are composed of recurring architectural patterns.

The authors propose several example modules differentiated by their function and structure:

Sheets: 2D arrays (like epithelia) that form selective barriers and facilitate mechanosensing.
Streets: Conduits formed by ECM or cells that expedite long-range transport.
Proliferative Conveyors: Structures like intestinal crypts that harness cell division to produce a controlled output of differentiated cells.
Repeaters: Arrays of sensory units (e.g., villi or hepatic lobules) that enhance signal detection and range.
Search Parties, Spatial A/D Converters, and Flocks: Modules governing collective migration, decision-making boundaries, and coordinated movement.

First, this is an outstanding, beautiful perspective piece. I love that this is an attempt to bring together ideas about how to integratively model intercellular systems. This perspective, along with recent work such as Johnson et al. (2025), shows that systems biologists have much to contribute to this problem. Where Chen et al. define the structural units (the “nouns” of tissue architecture), Johnson et al. provide the “grammar”—a human-interpretable language for encoding the behavioral rules that drive these agents. Together, these works suggest a future where we can describe a tissue by its modular architecture and simulate it using a standardized grammar of cell behaviors.

It also reflects that conceptual advancements will be necessary to understand multicellular networks. Purely data-driven (i.e., data-hungry) methods won’t go as far at larger scales. We can obtain millions of cells from individuals to build atlases, but we can’t obtain millions of healthy lungs to run perturbations. We need the kind of mechanistic abstraction these papers propose to simulate “virtual tissues” and bridge the gap between static ‘omics data and dynamic function.

After considering the perspective, I think there are some limitations to the mesoscale modules concept. However, I bring these up as opportunities to refine the idea—it seems like the clearest path to me for building models that capture the emergent behaviors of tissues.

Multicellular Architectures Have a More Limited Mapping to Function Compared to GRN Counterparts

In the study of GRNs, a specific network motif (like a coherent feed-forward loop) provides a constrained range of behaviors, but that range is tunable via parameter values. Furthermore, we have excellent theory for exploring behavior across parameter ranges.

In contrast, multicellular architectures, such as Sheets, appear to work only for certain functions, and those functions are not necessarily revealed by the architecture itself. A sheet might be a barrier in the gut or a filtration system in the kidney glomerulus. The architecture alone doesn’t fully define the input-output relationship in the way a GRN motif often does.

Perhaps unbiased ways of looking for cellular features associated with an architecture motif could reveal new biology? For instance, cells in sheets will be enriched for cadherins and tight junction proteins compared to neighboring cells. If we could systematically link these molecular features to the Sheet module, this might help to define the unique niche of the module.

It Is Revealing That Multicellular Architectures Are Mostly Restricted to the Same Features Copied Across Cells

A striking feature of the modules proposed by Chen et al. is that they often involve the same features copied across many cells. For example, all the cells in a Sheet are sheet-like; all units in a Repeater are identical sensory structures.

Perhaps one exception is the Proliferative Conveyor, where cells exist along a differentiation cascade—stem cells at the base, transit-amplifying cells in the middle, and differentiated cells at the exit. This implies a spatial coordination of different cell states.

However, there are almost certainly many motifs in which cells of distinct types coordinate that are missing from this initial list. For instance, with stem cells and germinal centers, you have “feeder-eater” motifs or niche-support motifs. One cell type provides a signal and support for another to regulate the abundance or survival of the latter (e.g., the distal tip cell in the C. elegans gonad conveyor). How could we define the complex cytokine circuits that exist between immune cells? We could use better ways of discovering this coordination across cell types rather than just identifying geometric arrangements of similar cells. We are almost certainly only scratching the surface of module complexity.

GRN Motifs Provide the Necessary Information to Run Useful Simulations—This Is Not Yet True for Multicellular Motifs

If you identify a GRN motif, you often have the topology required to write down differential equations that describe the evolution and response of the system. This is not (yet) the case for multicellular motifs.
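To make the contrast concrete, here is a minimal sketch of how a coherent type-1 feed-forward loop (X activates Y, and X and Y together activate Z) turns into equations once the topology is known. The Hill-function form, the AND logic at Z, the parameter values, and the forward-Euler integration are all assumptions chosen for illustration, not something taken from Chen et al.

```python
def hill(x, k, n=2.0):
    """Activating Hill function with threshold k and cooperativity n."""
    return x**n / (k**n + x**n)


def simulate_c1_ffl(x_input, dt=0.01, beta_y=1.0, beta_z=1.0,
                    alpha_y=1.0, alpha_z=1.0,
                    k_xy=0.5, k_xz=0.5, k_yz=0.5):
    """Forward-Euler integration of a coherent type-1 FFL with AND logic.

    x_input holds the activity of X at each time step.
    Returns two lists (y, z) with one value per time step.
    All parameter values are illustrative assumptions.
    """
    y, z = 0.0, 0.0
    ys, zs = [], []
    for x in x_input:
        # Y is produced when X is active and decays at rate alpha_y.
        dy = beta_y * hill(x, k_xy) - alpha_y * y
        # Z needs both X and Y (AND logic), which delays its rise.
        dz = beta_z * hill(x, k_xz) * hill(y, k_yz) - alpha_z * z
        y += dt * dy
        z += dt * dz
        ys.append(y)
        zs.append(z)
    return ys, zs


def time_to_half_max(zs, dt):
    """Time at which Z first reaches half of its final simulated value."""
    half = 0.5 * zs[-1]
    for i, z in enumerate(zs):
        if z >= half:
            return (i + 1) * dt


# X switches on at t = 0 and stays on for 10 time units.
step_input = [1.0] * 1000
ys, zs = simulate_c1_ffl(step_input)

# Same topology, different Y-to-Z threshold: the delay is tunable.
fast = time_to_half_max(simulate_c1_ffl(step_input, k_yz=0.2)[1], 0.01)
slow = time_to_half_max(simulate_c1_ffl(step_input, k_yz=1.0)[1], 0.01)
```

The topology fixes the form of the equations, and a single parameter (here the hypothetical `k_yz`) moves the response within the motif's range. Nothing comparable falls out of observing that cells form a Sheet.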

So you observe that cells are organized in a sheet. What is it that you would even model? The parameters—permeability, stiffness, transport rates—are not inherent to the Sheet definition. Maybe one needs a more focused question at this level of abstraction. Cells participate in many different functions, and so they are not as simple as a gene.

However, if the motif itself doesn’t tell you something specific about the behavior of the system, are we actually learning something more than what would come from an atlasing effort? Maybe refinements of modules can provide the specificity required for simulation. For example, Streets are described as facilitating transport, but “flow-based streets” (like blood vessels) are modeled very differently from “neuronal streets” (axon bundles) or “fibrous streets” (ECM tracks).

Given this, maybe there is a hierarchy of multicellular architectures, like GO terms, where there are high-level abstractions (e.g., “Transport Module”) and low-level, simulation-ready abstractions (e.g., “Vascular capillary network”).
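As a rough sketch of what such a hierarchy could look like as a data structure, the snippet below stores parent links between module terms and walks them in both directions. The specific tree is hypothetical: the term names come from the examples above, but which term sits under which is my assumption.

```python
# Hypothetical parent links between module terms (None marks the root).
PARENT = {
    "Transport Module": None,
    "Street": "Transport Module",
    "Flow-based street": "Street",
    "Neuronal street": "Street",
    "Fibrous street": "Street",
    "Vascular capillary network": "Flow-based street",
}


def lineage(term):
    """Walk from a term up to its most abstract ancestor."""
    path = []
    while term is not None:
        path.append(term)
        term = PARENT[term]
    return path


def descendants(term):
    """Collect every term that sits below `term` in the hierarchy."""
    result = []
    for child, parent in PARENT.items():
        if parent == term:
            result.append(child)
            result.extend(descendants(child))
    return result


capillary_lineage = lineage("Vascular capillary network")
street_subtypes = descendants("Street")
```

In this arrangement, a simulation framework could attach equations only to the leaves, while annotation and search operate on the abstract terms.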

Conclusion

I feel like it would be helpful to think about how one defines architectures beyond manual curation. This was done successfully with GRNs, where groups have been able to enumerate all possible motifs (e.g., 3-node networks) and categorize them based on function and frequency. To truly democratize virtual cell laboratories—as Johnson et al. aim to do with their grammar—we need a rigorous way to enumerate and define these mesoscale modules so they can become the reliable subroutines of tissue simulation.
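For reference, the GRN-style enumeration is small enough to sketch in a few lines. The code below is a brute-force illustration, assuming directed edges, no self-loops, and no distinction between activation and repression: it lists every weakly connected 3-node wiring pattern up to relabeling of the nodes. The function names are mine.

```python
from itertools import permutations, product

NODES = (0, 1, 2)
# The six possible directed edges between three nodes (no self-loops).
POSSIBLE_EDGES = [(i, j) for i in NODES for j in NODES if i != j]


def canonical(edges):
    """Return the lexicographically smallest relabeling of an edge set."""
    best = None
    for perm in permutations(NODES):
        relabeled = tuple(sorted((perm[i], perm[j]) for i, j in edges))
        if best is None or relabeled < best:
            best = relabeled
    return best


def weakly_connected(edges):
    """True if all three nodes are linked when edge direction is ignored."""
    neighbors = {n: set() for n in NODES}
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)
    seen, stack = {0}, [0]
    while stack:
        for m in neighbors[stack.pop()]:
            if m not in seen:
                seen.add(m)
                stack.append(m)
    return len(seen) == len(NODES)


def enumerate_three_node_motifs():
    """All weakly connected 3-node directed graphs, one per isomorphism class."""
    motifs = set()
    for mask in product((0, 1), repeat=len(POSSIBLE_EDGES)):
        edges = [e for e, keep in zip(POSSIBLE_EDGES, mask) if keep]
        if weakly_connected(edges):
            motifs.add(canonical(edges))
    return sorted(motifs, key=lambda m: (len(m), m))


motifs = enumerate_three_node_motifs()
feed_forward_loop = canonical([(0, 1), (0, 2), (1, 2)])
```

This yields the familiar 13 three-node subgraphs, with the feed-forward loop among them. The open question for mesoscale modules is what plays the role of `POSSIBLE_EDGES`: there is not yet an agreed, finite vocabulary of parts and relations to enumerate over.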

A more formal plea for ambitious AI benchmarks in cancer research

me@ameyer.me (Aaron Meyer) — Wed, 09 Jul 2025 08:00:00 +0000

A close friend inspired me to turn my recent blog post into a response to a recent RFI from the National Cancer Institute. As these responses never become public, and I am interested in others’ thoughts regarding some of these ideas, here is my response.

In response to the Request for Information, I offer the following input on the development of priority artificial intelligence benchmarks for cancer research.

What are AI-relevant use cases or tasks in cancer research and care that could be advanced through the availability of high-quality benchmarks?

We must aim far beyond tasks that are already largely solvable, such as image segmentation or treatment extraction from electronic health records. The true need is for benchmarks that define currently unachievable scientific goals. Here are some high-priority AI-relevant use cases in cancer research and care that would greatly benefit from novel benchmarks:

Predicting multi-cellular tissue response to localized perturbation: This goes beyond single-cell or bulk analysis. For example: “If you perturb gene X in cell population A within a specific tumor microenvironment, what specific and quantifiable changes would you expect to see in the gene expression, signaling pathways, and phenotypic behavior of neighboring cell populations B and C, and how would this collectively impact tumor growth/metastasis in vivo?” This requires understanding complex intercellular communication and emergent properties of tissues.
Predicting cancer incidence from molecular measurements of patient state: Patients present with many wide-spread molecular changes upon diagnosis, including reprogramming of their immune system. This task could aim to predict which patients in high-risk groups will be diagnosed with cancer, and of which type and when, given readily accessible molecular measurements, such as transcriptomics of their peripheral blood. Significant advancement here would aid the development of both predictive diagnostics and potentially preventative therapies.
Identifying optimal multi-modal therapeutic interventions for an individual patient's complex tumor state: This is not just predicting response to a single drug, but identifying the combination and sequence of therapies (e.g., specific chemotherapies, immunotherapies, targeted therapies, radiation, surgery) that will lead to a defined positive outcome (e.g., complete remission, prolonged progression-free survival, minimal side effects) based on their comprehensive molecular, cellular, and clinical profile. This moves beyond broad patient cohorts to truly individualized predictions of therapeutic efficacy and toxicity. This could be evaluated by providing molecular information about the tumor, alongside the sequence of therapies, and predicting the masked long-term survival.
Predicting long-term systemic impact of cancer and its treatment on patient physiology and quality of life: This benchmark would focus on integrated, whole-organism modeling. For example: “Given a patient's initial cancer diagnosis and treatment plan, predict the trajectory of specific organ function (e.g., cardiac, renal, neurological), immune system state, and patient-reported quality of life metrics over 5 years, accounting for potential late effects of treatment and disease progression.” This requires integrating diverse data types and understanding inter-organ dependencies.
Forecasting cancer evolution and emergence of resistance mechanisms under specific treatment regimens: Instead of merely detecting existing resistance, the benchmark would be: "Given a patient's tumor molecular profile at diagnosis and a proposed treatment regimen, predict the specific genetic and phenotypic alterations the tumor will acquire, and estimate the timeframe for this resistance to emerge in vivo." This requires dynamic, predictive models of evolutionary trajectories.

These are use cases where benchmarks are not merely scarce; they are non-existent because they require a level of biological understanding and predictive power we do not currently possess. Focusing on such challenges will ensure that AI development is aimed at generating novel biological capabilities, not just automating what we can already do.

What are the desired characteristics of benchmarks for these use cases, including but not limited to considerations of quality, utility, and availability?

The most critical characteristic of a benchmark should be its ability to define a quantifiable, testable, and currently unsolved scientific problem. Its utility will not be in comparing a dozen similar models on an existing dataset, but in providing a clear "North Star" for the entire field, compelling us to create models with entirely new predictive powers. This framework necessitates the adoption of masked or sequestered testing sets as a standard practice. By keeping evaluation data hidden, we can ensure objective, unbiased assessment of model performance, a crucial guardrail against the self-deception and publication bias that can otherwise hinder true progress. In many cases, the benchmark will define a task for which the necessary training data has not yet been collected, thereby spurring new experimental work as an integral part of the solution.

Along these lines, some specific desired characteristics for benchmarks include:

Defining a currently unachievable task: The benchmark must target problems that are not trivially solved by existing statistical methods and require significant advancements in AI and systems biology.
Clear, quantifiable metrics for success: Analogous to AlphaFold's protein structure prediction, there must be objective, numerical ways to assess model performance. From the examples above, this could involve:
- Quantifiable changes in gene expression and protein levels in specific cell types within a tissue.
- Measurable reduction in tumor volume, number of metastases, or time to recurrence in in vivo models.
- Specific and measurable improvements in organ function or patient-reported outcomes.
- Accuracy in predicting specific resistance mutations or pathways.
Requiring novel, out-of-sample data for validation: This is crucial to combat publication bias. The validation dataset must be kept separate and unknown to the model developers during training and development. This promotes true generalizability.
Masked testing sets: A portion of the evaluation data should be held secret, accessible only for submitting models and objectively assessing performance. This ensures unbiased evaluation.
Biological and clinical relevance: The benchmarks should address questions that, if solved, would genuinely advance our understanding of cancer biology or significantly improve patient care, rather than automating existing clinical or scientific tasks.
Multi-modal and multi-scale data integration: The problems often span genomic, proteomic, imaging, clinical, and physiological data, requiring models to integrate information across different biological scales (molecular to organismal).
Testable against measurements: The benchmark should be designed such that its proposed solution can eventually be verified through experimental or clinical measurements, even if those measurements are currently challenging to obtain. This encourages the collection of new experimental data and mechanistic understanding.
Promoting open exchange of ideas: The benchmark framework should encourage competition and collaboration, fostering an environment where different techniques can be rigorously compared and insights shared. For example, the AI community regularly publishes both approaches that advance performance and approaches that do not work.
Incentivizing major scientific advancement: The challenges should be significant enough to warrant substantial research effort and potentially large prizes, as seen in the protein structure prediction field.

What datasets currently exist that could contribute to or be adapted for benchmarking? Please include information about their size, annotation, availability, as well as AI use cases they could support.

While numerous datasets currently exist, they are insufficient for creating benchmarks precisely because they are already available. These existing resources are useful for training. However, determining whether a model is effective requires out-of-sample validation data that has never been seen by the developers. Furthermore, the very process of tackling a benchmark should involve the generation of new experimental knowledge. The paradigm should be less about fitting models to existing data and more about using models to generate bold, testable hypotheses.

What are the biggest barriers to creating and/or using benchmarks in cancer research and care?

The greatest barrier to creating and using meaningful benchmarks is a fundamental lack of consensus on the long-term goals for AI in biology. Without a shared understanding of what we are trying to achieve, efforts will remain scattered and focused on incremental advances. This is compounded by the immense difficulty and expense of generating the novel experimental data required for true, out-of-sample validation, which incentivizes a culture of self-validation and overly optimistic reporting. Ultimately, the field is hampered by a focus on automating existing data analysis rather than pursuing genuinely new scientific capabilities. Establishing ambitious, common challenges through benchmarks would be an effective way to overcome this inertia, fostering an open exchange of ideas, and creating a framework where funding can be directed toward efforts that demonstrably push the boundaries of science.

I hope that my core message has come across, that benchmarks should focus on tasks that we know are currently not possible without major scientific advancement. This means moving beyond incremental improvements on existing tasks. To achieve this, the NCI could consider:

Convening a multi-disciplinary working group: Bring together leading cancer biologists, clinicians, systems biologists, and data scientists to collectively define these "grand challenge" benchmarks, much as CASP was established for protein structure prediction.
Funding dedicated benchmark development consortia: Support groups specifically tasked with curating existing data, generating new experimental data for masked testing sets, and developing the infrastructure for benchmark competitions.
Establishing "AI X-Prize" style challenges: Offer large prizes for models that meet predefined performance thresholds on these highly ambitious, currently unsolved problems in cancer. This would incentivize innovation and attract talent.
Funding novel modeling approaches: Once a benchmark is established, the NCI should support competitive applications regarding different modeling or experimental solutions to tackle the challenge. If the benchmarks have defined challenges of great significance, then proposals that make even incremental progress are of value.
Prioritizing funding for model validation: Encourage and fund independent validation efforts, especially those involving prospective data collection, to ensure rigor.
Fostering data sharing infrastructure: Invest in secure, federated learning platforms or data enclaves that allow AI models to be trained and tested on real-world cancer data while maintaining patient privacy.
Encouraging mechanistic interpretability in benchmarks: While performance is key, future benchmarks could also incorporate metrics or requirements for models to provide biologically plausible and interpretable insights, not just black-box predictions.

Ultimately, the goal should be to shift the focus from simple automation to using AI to achieve new biological and clinical understanding and capabilities that were previously out of reach. This strategic shift is crucial for AI to truly revolutionize cancer research and care.

I worry that the NCI is going to be convinced that the excitement around AI is a reason for big data collection efforts, again. It is not at all clear to me that we need more data, or what that data would be if so. We are drowning in data. We need to set ambitious goals that can be measured through hard benchmarks that we think are not currently solvable and then provide modeling resources and incentives to understand why they cannot currently be solved. Only then should we go collect more data.

Systems biology and AI in biology need to define their goals through benchmarks

me@ameyer.me (Aaron Meyer) — Wed, 25 Jun 2025 08:00:00 +0000

There’s a great deal of talk these days about building “foundation models” of cells and employing large-scale AI in biology. The idea is that these models could simulate cells, understand their various states, and predict how perturbations might affect cellular responses. However, many of the demonstrated capabilities of current foundation models can already be achieved, or even surpassed, by simple statistical models. If these complex models merely automate tasks that existing data science tools can already handle, what is their true purpose? Unlike software development, where automating repetitive tasks has value, solving a biological question once doesn’t necessitate constant re-automation or rediscovery. We need models with truly novel capabilities, not just more complex ways to do what we can already do.

A significant concern surrounding these new modeling approaches is the issue of validation and the potential for publication bias. Defining truly new capabilities for a model often requires collecting novel, out-of-sample data, which is a resource-intensive undertaking. This limits the number of groups that can both develop and rigorously validate their models, and leaves the very people developing these models as the ones validating them. We’re all familiar with publication bias, and it’s likely that the literature will present the most optimistic picture of how well these models function. Splashy, initial claims of success, which garner significant attention, will overshadow more careful and rigorous validations that inevitably highlight the models’ limitations.

Compounding these issues is a fundamental lack of common definitions and goals within the field. There is no shared understanding of what these large-scale biological modeling efforts should accomplish. Do we need a model that can translate individual cells’ gene expression between species, or one that can identify novel cell populations? Are these tasks on the road to curing a disease? We would greatly benefit from a collective undertaking to define the capabilities we currently lack and that these models should strive to achieve. Without this clarity, efforts risk becoming disparate and unfocused.

These observations collectively point to a critical need for developing goals defined through benchmarks. The field needs to come together and define specific, currently unachievable tasks that computational structures could solve. These benchmarks don’t necessarily require closed datasets; in some cases, solving them might necessitate the collection of new experimental data and knowledge. The protein structure field and AlphaFold serve as an excellent example: the field provided a standardized task, predicting protein structure from sequence, with clear, quantifiable metrics for success. This undertaking remained effectively unsolved for decades, and required both new experimental data and mechanistic understanding of protein folding. We need a similar framework for systems biology and AI models that aim to represent biology computationally.

We need ways to benchmark model performance in a testable manner. For instance, consider these benchmark tasks: if you perturb a gene within one cell population of a tissue, what changes would you expect to see in the neighboring cells? Or, which perturbations, in which cell populations, would you expect to affect overall tissue function? These are specific questions that could be tested against measurements to form the basis of a quantifiable framework.

Furthermore, we should adopt the practice of masked testing sets, a highly successful approach within the AI community. This involves keeping a portion of the evaluation data secret, allowing researchers to submit their models and objectively assess their performance against this hidden data. This ensures a truly unbiased assessment of a model’s capabilities, similar to how protein structure prediction models were evaluated. These benchmarks have proven their value time and again; researchers can even fool themselves about the performance of their models without this objective assessment.
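A masked test set can be a very small piece of infrastructure. The sketch below is a toy illustration, not a description of how CASP or any existing leaderboard is implemented: the class name, the sample IDs, and the choice of mean absolute error as the metric are all assumptions. The point is only that submitters see sample identifiers and a score, never the held-out measurements.

```python
class MaskedBenchmark:
    """Toy evaluator that keeps ground truth private and returns only a score."""

    def __init__(self, hidden_truth):
        # Maps sample ID -> measured value; never exposed to submitters.
        self._truth = dict(hidden_truth)

    def sample_ids(self):
        """The only information about the test set that is made public."""
        return sorted(self._truth)

    def score(self, predictions):
        """Mean absolute error across every hidden sample (assumed metric)."""
        missing = set(self._truth) - set(predictions)
        if missing:
            raise ValueError(f"missing predictions for {len(missing)} sample(s)")
        errors = [abs(predictions[k] - v) for k, v in self._truth.items()]
        return sum(errors) / len(errors)


# Hypothetical held-out measurements, e.g. expression change in neighboring cells.
benchmark = MaskedBenchmark({"sample_a": 1.0, "sample_b": 3.0})
submission = {"sample_a": 1.0, "sample_b": 2.0}
result = benchmark.score(submission)
```

Requiring a prediction for every sample is a deliberate choice in this sketch, since it prevents a submission from scoring well by answering only the easy cases.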

Until we establish common challenges and rigorous benchmarking methods that allow us to assess the success of our modeling endeavors, the field will continue to see an explosion of different techniques, all claiming superiority in various ways, but languish without a common scientific “language” or objective framework for comparison. Forcing the field to test against a common goal would allow it to make progress through the open exchange of ideas. Funders could support competitive efforts to make progress, or establish large prizes to incentivize teams to pass certain thresholds in performance.

Defining a set of benchmarks that evaluate the new and unique capabilities of these models is absolutely crucial for making meaningful progress in large-scale biological modeling. Again, I’m not talking about the sort of benchmarks we already see, which seek to compare existing methods on mostly solved tasks. I mean benchmarks that strike at the heart of what “solving” biology would mean, and tasks that we know are currently not possible without major scientific advancement. By the way, I think that defining these goals will reveal that current approaches are at the wrong level of abstraction. We don’t need a foundation model of cells; we need a foundation model of a person, of an organism, of how cells and tissues are organized across the body. That’s a much bigger undertaking, but by defining what it is we are trying to accomplish, we can focus on what will advance the science.

Caring for science that is on life support

me@ameyer.me (Aaron Meyer) — Tue, 20 May 2025 08:00:00 +0000

While uncertain, all signs point to a prolonged period of significantly reduced funding for science. Even if the current administration were to disappear in four years, funding agencies move slowly in the best of times, and substantial damage has been inflicted on their ability to accomplish even basic functions.

This funding crisis is causing incalculable harm; nothing in this post is going to change that. I have already seen the damage from so many talented trainees having their career plans disrupted or choosing to go to Canada or the U.K. for their graduate training. It’s infuriating to think about the scale of waste in an entire nation abandoning investments in its most talented and dedicated youth who represent its future innovation potential. Longer term, I fear for this country. Scientific advancements are at the core of our economic, political, and military might, and we are abandoning the people and institutions that generate this progress.

This challenging situation necessitates a re-evaluation of our research strategies to consider what can be accomplished within these new constraints and how, both to minimize harm and take advantage of the situation wherever possible. Massive disruptions usually mean that you might need to rethink your overall strategy and plans. This situation is no different. Maintaining scientific progress will be as important as ever so we continue to address critical problems.

This situation has led me to ponder what sort of questions I can still pursue with much less funding; I think that the answer, at least in my case, is quite a lot. Many of the most impactful scientific breakthroughs have come through the cheapest, simplest experiments, and an immense privilege of my current position is that funding almost exclusively supports students and laboratory materials, not my salary. Moreover, we are living with an absolute deluge of data from omics and large-scale studies that few ever seem to have time to revisit, and I think there are many opportunities to use these data for discovery¹. My day-to-day tasks would change quite a lot: I would do all the experiments, data analysis, and paper writing myself. My lab’s work would rely more heavily on reanalysis of existing data, with a key experiment mixed in. However, would our intellectual impact—the novelty of the findings, the influence on subsequent research, the creation of new knowledge—be reduced? I am not sure it would².

I also wonder what has been lost by the chase for funding. As graduate students, my peers and I had only a vague understanding of how our work was funded. This left us with the freedom to consider what might be possible, without putting a dollar amount next to ideas from the start. As I have gotten more senior, I have seen these big ideas get mashed into the specific aims of proposals, and discussions even start from the perspective of what the funder would want. The cost of chasing funding has been so immense, both in the time committed to writing proposal after proposal, and in limiting the bounds of what is possible to what is fundable. As the fraction of funded proposals has gone from 25%, to 10%, to single percentage points, the cost of time per funding dollar has exploded.

This has really got me thinking lately. If there is a version of a lab with the same impact using less funding, why shouldn’t I choose this approach? There are several potential downsides I can imagine:

Under this change, our lab would have far fewer trainees. University research both serves to uncover new knowledge and train students. This may be good or bad³.
I anticipate that the intellectual impact of our work would be the same, but I am not sure this would be the perception of my field. For better or worse, ideas in biomedical research are often judged by simplistic rules which can sometimes overshadow the intellectual rigor or novelty of the work. “Is there an in vivo model?” “This is just a reanalysis of existing data.” These responses do impact the perceived value of our work. At the same time, fighting for funding is often a big part of why we are forced to conform to these rules. Maybe there is value in being freed from these constraints.
There are questions that we currently can ask through larger-scale experiments, that would become inaccessible. However, there are also questions that I can’t pursue right now, as a result of my time and focus being pulled away to fund and support a larger group. We might lose the ability to conduct large-scale screening projects, but gain the capacity for deep, focused theoretical work or the development of novel computational methods.

In the end, we will have to choose how to adapt in this moment, and none of us can predict exactly what the future will bring.

¹ I increasingly feel that integrating data across scales is the key challenge of our time.
² Part of my judgement here is tied to AI and the cost/benefit of graduate training. When it comes to computational work, it has become easier to do more yourself, and I expect that this trend will continue.
³ There are many inconsistent and conflicting roles of graduate education that warrant their own post one day.

A new generation of open tools

me@ameyer.me (Aaron Meyer) — Sun, 16 Feb 2025 08:00:00 +0000

A delightful new ecosystem of open-source productivity tools has taken shape over the last few years. Just in the last year, I’ve found myself using Quarto and Typst extensively for typesetting letters of recommendation, grants, and lecture slides. Separating content from formatting pays dividends in complex documents, and the time to learn these tools was a rewarding investment. I’ve also found aider and Simon Willison’s LLM, with Llama, extremely useful for learning new syntax and tackling more complex tasks. For instance, I recently rebuilt my CV and lab website to pull in the list of trainees, awards, grants, and papers all from shared YAML files. I am mostly unfamiliar with the Jekyll (Ruby) and Typst syntax, but an LLM could get me 99% of the way there. This means that the upfront cost of these more powerful tools is greatly reduced.

Returning to blogging

me@ameyer.me (Aaron Meyer) — Sat, 15 Feb 2025 08:00:00 +0000

It has been four years since I last posted.

It is hard for me to comprehend that it has been this long. I wasn’t a frequent blogger; my pace before was about once or twice per year. However, writing is a part of thinking, and many of the subjects I posted about were ideas I cared about and had thought through deeply.

In thinking about what changed, it is difficult to pinpoint any one factor. One definite possibility is greater work demands, both in my time and mental bandwidth. My lab has gotten bigger, we are participating in more collaborative work, and I have a clearer vision for our impact than ever before. My other roles, such as an educator, have also expanded and become more complicated. I am teaching larger classes that have become a core component of our major, and education has become more challenging due to various factors, including the pandemic, large language model use, and challenges with student mental health.

I also think some contributory factors are personal. We are beginning to learn about the harms of addiction-driven social media, and I do think some of my mental energies were lost to those systems. Twitter’s dismantling, and scattering of users, was a useful reminder that social media is not real life, and that users are the product within these networks. I also think that I have become more conservative about the ideas I put out into the world. I don’t think this is a good thing. If writing is a process of thinking, it has to start in an unrefined state.

Visiting my graduate alma mater this past fall, I was reminded how important blogging used to be to me. Two individuals who I deeply respect separately brought up my writing from over a decade ago. My very last post, on cancer autoantibodies, was the kernel of an idea that has become a critical research area for my lab, funded by my Mark Foundation Emerging Leader Award. The project has incredibly exciting results, and may turn out to be my most impactful scientific work. It is safe to say that I wouldn’t be where I am today without this outlet.

While I want to return to blogging, it is a habit that I will have to build again. I plan to make several changes to facilitate this. I am going to try to embrace shorter posts, with small ideas, posting material I might have tweeted about before. Likewise, I am also going to try link blogging more, as this site can be a repository for items I typically save to Pinboard.

We need a renewed focus on our own cancer antibody responses

me@ameyer.me (Aaron Meyer) — Sun, 14 Feb 2021 08:00:00 +0000

Antibodies are central to the treatment of cancer—we use them to block certain cell surface receptor signaling, to label cancer cells for destruction by the immune system, and to block signals that are preventing immune responses. We use antibodies in these same roles in infectious disease, and so it would be reasonable to expect that lessons from infectious disease can often be applied to cancer treatment.

We now have technologies to study the antibody responses to infectious diseases in detail, and in the last couple of decades we have learned that antibodies are an active hub of communication across the immune system. However, we seldom ask how our endogenously produced anti-tumor antibodies (ATAb) might influence cancer progression. Why is this? The standard tale for our progression in understanding cancer immunity is that, for a long time, we expected the immune system could not mount a productive defense. Checkpoint inhibitors made it clear that an active response is present but halted at its last steps. Given the recent revisions to our mental model of cellular immunity, I have been working through the literature of exactly what we know about our humoral responses. I think the results indicate it is time for a renewed focus on the humoral immune response to cancers.

The immune system naturally develops antibodies against cancer

A humoral immune response is a common feature of cancers. This has most extensively been explored for prototypical surface antigens like mutant EGFR, HER2, and NY-ESO-1. In general, the majority of patients have measurable antibody responses if their tumor is positive for a given marker, and the titers correlate with higher expression of those antigens. HER2 and NY-ESO-1 show that the targets do not have to be mutated but can simply be aberrantly expressed. cDNA expression libraries expanded our ability to identify target antigens in the 1990s, and tumor-binding antibodies could be identified against surface markers unique to individual patients’ tumors. In fact, many groups have proposed to identify cancers through their uniquely high abundance of auto-antibody staining.

Antibodies have the capacity to eliminate tumors

We know this from years of using therapeutic monoclonal antibodies in which immune effector response is a key feature. A study in mice has also shown that responses to checkpoint inhibition increase ATAb titers, and that this convalescent plasma can be protective on passive immunization.

What are the Fc properties of endogenous antibodies?

Here we reach the limits of our knowledge; if we have these antibody responses, why are they ineffective at eliminating tumors? Since some of the studies on the prevalence of ATAbs in the 1990s, we have learned that antibodies are quite complicated. An antibody with the same antigen targeting can be pro- or anti-inflammatory, promote different types of immune responses, and be regulated through sequence and post-translational changes. The Fc signal conveyed to the immune system can be modulated in an antigen-specific manner by signals we do not fully understand. While great progress has been made in our ability to characterize the Fc properties of antibodies, I could not find any systematic characterization of those targeting cancer. Even remarkably basic questions, like the isotype distribution of ATAbs, have barely been addressed.

Why care about endogenously produced antibodies when we can provide passive immunity through monoclonal antibodies?

Given our success in developing monoclonal antibody therapies against cancer, why should we care about ATAbs? We can see answers to this question in lessons from infectious diseases. There may not be shared antigens against every patient’s tumor. Even if we were to develop an enormous panel of monoclonal antibody therapies, tumors are relentlessly adaptable. Endogenous immune responses can mature and spread across antigens and so possibly overcome the development of resistance. Finally, ATAbs likely modulate our response to monoclonal therapies. Our lab has been developing tools to ask how antibodies communicate with the immune system when present in combination, and one prevalent feature is widespread antagonism through both antigen and Fc receptor competition. Therefore, ATAbs might impede responses to therapeutic antibodies.

Having read through the literature more extensively, I am struck by how much it seems we have ignored this arm of the cancer immune response. There are very basic questions that it seems we now have the tools to answer and can change our approach to treatment:

Does the ineffectiveness of antibodies arise from antibody properties, cancer properties, or both?
What are the Fc properties of ATAbs? How do they compare to those in acute and chronic infections?
How do ATAbs influence the response to therapeutic monoclonals?
Can we selectively change the properties of ATAbs in vivo? Are there active inhibitory signals ensuring that ATAbs are not tumor destructive?
Do the Fc properties of ATAbs reflect the status of cancer immune responses? Do changes upon treatment influence outcome?

Certainly, there have been frustrating setbacks on the path to cancer immunotherapy. One of those has been cancer vaccines that hoped to unleash an immune response against the disease. However, our understanding now is that an immune response—both cellular and humoral—exists but is generally halted at its last steps. We have learned so much about how to unleash cellular immunity in the last few years, and it is time to free immunity’s other arm.

Folding knowledge in with data—where systems biology could be headed

me@ameyer.me (Aaron Meyer) — Fri, 24 Jan 2020 08:00:00 +0000

Last year in our machine learning/data analysis class, we had a bit of extra time to take a step back and return to why we were there in the first place. Regardless of the method, machine learning provides us a toolbox of functions with varying amounts of flexibility and rigidity. By applying a method, we get a space of functions:

$$ f(x, \beta) $$

parameterize it with some judge of value:

$$ \arg\max_{\beta} \textrm{Value}(f(x, \beta)) $$

and are left with a function having (hopefully) useful properties:

$$ f(x, \hat{\beta}) $$
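As a minimal sketch of these three steps, assume the function space is a line, with `beta` holding an intercept and a slope, and assume the judge of value is the negative sum of squared errors. Both are illustrative choices for this example rather than anything specific to the class, and the data are made up.

```python
# Illustrative sketch: a function space, a judge of value, and the fitted function.
# The linear form and the squared-error value are assumptions made for this example.


def f(x, beta):
    """One member of the function space: a line with intercept beta[0], slope beta[1]."""
    return beta[0] + beta[1] * x


def value(beta, xs, ys):
    """Judge of value: negative sum of squared errors (higher is better)."""
    return -sum((y - f(x, beta)) ** 2 for x, y in zip(xs, ys))


def fit(xs, ys):
    """Return the beta that maximizes value(), via the closed-form least-squares solution."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return (y_mean - slope * x_mean, slope)


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]  # made-up data, roughly y = 1 + 2x

beta_hat = fit(xs, ys)  # the parameterized function is then f(x, beta_hat)
print(beta_hat, value(beta_hat, xs, ys))
```

Swapping in a different `value` (a likelihood, a penalized loss) or a different `f` changes the method, but the three steps stay the same.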

Modeling provides benefits beyond prediction, including hypothesis testing, communication, interpretation, and visualization. With these competing goals in mind we roughly organized the various methods in our toolbox:

This helped us identify a couple features of the landscape. First, our choices sit along some front of a tradeoff between explainability/interpretability and prediction performance/flexibility. This is not a new idea—much has been written about this tradeoff. Second, explainable/interpretable models make strong assumptions/expectations about the structure of the data.

Before looking ahead, it is helpful to recognize some of the developments that have made it possible to move out in the direction of prediction/flexibility. Namely, the continued march of computational performance has worked alongside computational tools to enable flexibly-defined, high-parameter models. This includes a resurgence in autodifferentiation tools, parallel/vectorized evaluation, and probabilistic languages to coordinate it all.

However, while very high parameter models¹ have given us accurate predictions, especially with ever-growing training data, these models are poor at extrapolation and interpretation. These two properties are especially critical to understanding biological systems; measurements are essentially always data starved, and the complexities of biological systems are such that even our highest-throughput experiments do not comprehensively sample every possible intervention we could make². In other words, we can take lots of pictures of stop signs to teach a model to identify stop signs, but we almost always build models of cells to predict things we can’t or haven’t ever measured yet. We have to be able to see into the uncharted territory. Can your model identify stop signs if it had never seen one before?

So where does extrapolation come from? It comes from the inflexibility built into the model structure we choose. For example, ordinary least squares models are quite inflexible, and their inflexibility forces predictions where each variable extrapolates based on a linear relationship. Even flexible models, while restricted by the data within the training range, rely heavily on model inflexibility when extrapolating to entirely new predictions. However, the rigidity we encode to date has largely been tied to making a problem numerically tractable. For example, linear assumptions are often chosen because they give us models that we can reasonably solve, but we don’t expect much of biology to be linear.
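To make this concrete, here is a small sketch comparing an ordinary least squares line with a nearest-neighbor predictor, which stands in for a very flexible model. The data, the choice of nearest-neighbor as the flexible model, and the query points are all assumptions made for illustration.

```python
# Illustrative sketch: how model structure, not data, determines extrapolation.
# Both models and the data are assumptions chosen for this example.


def fit_line(xs, ys):
    """Ordinary least squares for a single predictor; returns (intercept, slope)."""
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def predict_line(params, x):
    """Rigid model: the linear structure dictates the prediction everywhere."""
    intercept, slope = params
    return intercept + slope * x


def predict_nearest(xs, ys, x):
    """Flexible model: return the response of the closest training point."""
    nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x))
    return ys[nearest]


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.0, 2.0, 4.0, 6.0, 8.0]  # made-up data lying exactly on y = 2x

params = fit_line(xs, ys)
inside, outside = 2.0, 10.0  # one point within the training range, one far beyond it

print(predict_line(params, inside), predict_nearest(xs, ys, inside))
print(predict_line(params, outside), predict_nearest(xs, ys, outside))
```

The two models agree inside the training range. Beyond it, the line keeps following its trend while the nearest-neighbor model stays flat at the last observed response. Nothing in the data supports either answer at the far point; each one comes entirely from the structure of the model.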

However, our new computational tools allow us to perform efficient parameterization with programs of nearly any structure. Through autodifferentiation and probabilistic languages, we can now write semi-flexible functions of essentially any structure and perform reasonably efficient parameterization. The key is flexibility where you don’t have prior knowledge and rigidity where you do.

We did this in a very simple way with our first paper on antibody responses. We expect antibodies to bind following the laws of binding kinetics. Then, without much expectation about how this binding relates to cell response, we allow a flexible statistical relationship. Yuan et al. and Ma et al. also use a similar idea of encoding the structural information we know (the dynamics and hierarchy of biology, respectively) into a deep learning model.
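The general pattern can be sketched in a few lines. To be clear, this is not the model from the paper: the one-to-one binding law, the Gaussian kernel smoother used as the flexible step, the `kd` value, and all of the numbers are hypothetical choices meant only to show a rigid, knowledge-based layer feeding a flexible statistical one.

```python
# Illustrative sketch: rigidity where we have prior knowledge, flexibility where we
# do not. This is NOT the model from the paper; the single-site binding law, the
# kernel smoother, and every number below are assumptions made for this example.
import math


def fraction_bound(conc, kd):
    """Rigid part: equilibrium occupancy for one-to-one binding, conc / (conc + kd)."""
    return conc / (conc + kd)


def kernel_smoother(train_x, train_y, bandwidth):
    """Flexible part: Nadaraya-Watson regression with a Gaussian kernel."""

    def predict(x):
        weights = [math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in train_x]
        return sum(w * yi for w, yi in zip(weights, train_y)) / sum(weights)

    return predict


kd = 10.0  # hypothetical dissociation constant, same units as concentration
concs = [1.0, 3.0, 10.0, 30.0, 100.0]  # hypothetical antibody concentrations
responses = [0.05, 0.15, 0.60, 0.85, 0.95]  # made-up cell response measurements

# Map concentration to occupancy with the binding law, then learn occupancy -> response.
occupancy = [fraction_bound(c, kd) for c in concs]
response_from_occupancy = kernel_smoother(occupancy, responses, bandwidth=0.1)


def predict_response(conc):
    """Compose the two layers: binding law first, flexible relationship second."""
    return response_from_occupancy(fraction_bound(conc, kd))


print(predict_response(20.0))
```

Because occupancy saturates as concentration grows, a query far above the measured concentrations still lands near the highest observed occupancy. In this sketch, the behavior outside the training range is set by the binding law rather than by the smoother.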

If I had to guess where machine learning will take systems biology, it is in this direction, providing a balance of flexibility and rigidity. Neural ODEs will lend themselves nicely to biology as a dynamic system where we have tools to measure dynamic responses and whether components interact. In fact, the authors summarize my point nicely: scientific knowledge [should be] encoded in structure, not data points. We will realize that the tools in our toolbox above are different views of the same solution, and learn how to encode our prior knowledge in model structure as well as parameters. Finally, I think we will re-focus our efforts on how far a model can extrapolate, rather than how accurately it can predict.

Call them deep if you must. ↩︎
Sure, single-cell methods can provide enormous throughput, but we ultimately care about what happens in a whole organism, and often in a whole person. ↩︎

Linus Pauling on involvement in politics

me@ameyer.me (Aaron Meyer) — Tue, 14 Mar 2017 11:00:00 +0000

I have to admit I didn’t know Linus Pauling’s second Nobel was the Peace Prize. In addition to his scientific pursuits, Pauling helped bring about the treaty banning tests of atomic explosives in the atmosphere. Though six decades ago, it seems eerily relevant to discussions about scientists’ role in politics today:

Pauling’s political stand in the last years of the Truman presidency seems mild enough now, a rather flamboyantly idealistic campaign against the cold war, against atomic weapons and the development of the hydrogen bomb. In those days of the rise of Joseph McCarthy and Richard Nixon, when the national anxieties reverberated into hysteria in Southern California, Pauling’s course took courage and principle. His politics had unpleasant consequences at the time. At the end of April 1952, Pauling was supposed to go to London to attend a meeting of the Royal Society on the structure of proteins, but at the last minute was refused a passport. Some claim that had he got to London, he would have seen the newest x-ray-diffraction pictures of DNA from Maurice Wilkins’s and Rosalind Franklin’s laboratories, and from them learned what he needed to solve the structure. When I asked Pauling about the political responsibilities of scientists, he laughed as if at the foolishness of his enemies. “I have contended that scientists—first that they have the responsibilities of ordinary citizens, but then that they also have a responsibility because of their understanding of science, and of those problems of society in which science is involved closely, to help their fellow citizens to understand, by explaining to them what their own understanding of these problems is. And I have contended that they have the duty also to express their own opinions—if they have opinions. Of course, just after 1945, there was a good response from scientists. But then in the United States, McCarthy came along and frightened the majority of scientists out of taking any action. And, of course, there’s been the really great effort made by the administration, by the powers in control, to convince scientists that they should stay out of things, you know.” The brows contracted furiously and the swell of language deepened. 
“Year after year, I’ve been told I might know a lot about chemistry but that doesn’t mean that I know anything about world affairs or economics or social problems, so I should just keep my mouth shut. I have replied—well, first by saying”—he laughed again, enjoying himself—“that I refuse to take your advice, but second by saying that a lot of other people, the lawyers and politicians, are just as ignorant as I am about these matters, and yet no one tells them to keep their mouths shut. After Dean Acheson had been quoted in Harper’s magazine as saying that I might know a lot about biochemistry but I didn’t know anything about world affairs, and should keep my mouth shut—that there was no justification for me to make pronouncements about world affairs—I didn’t bother to say that I wasn’t a biochemist, but I wrote back saying that if Dean Acheson had studied biochemistry as much as I had studied world affairs, then I would say we ought to listen to what he had to say about biochemistry.”

The excerpt is from The Eighth Day of Creation, a comprehensive history of molecular biology’s beginnings.

An Approach (and Template) for Reproducible Proposal Writing

me@ameyer.me (Aaron Meyer) — Mon, 16 Jan 2017 16:00:00 +0000

This past year came with a significant increase in the number and complexity of funding proposals I put together. The process has been a bit of a learning experience, particularly in how important organization can be to larger writing projects. While I think the standard approach is to fight with Word, I’ve taken the opportunity to try to adopt some software tools (including LaTeX and Git) to make the process a bit more reproducible and straightforward. A few lessons I’ve taken away from the year follow below; if you would like to replicate the workflow I apply, please take a look at the template I start from on GitHub.

Codify good practice. When putting together large documents I find it challenging to maintain consistency. For example, I would like acronyms to be introduced once, subsequently used throughout, and included in the glossary. The glossary package can seamlessly handle this for me. I would like all citations to be of the supercite variety and so overload the cite command. Aims and tasks end up subtly edited throughout the process of writing and proofing, and so these phrases are defined in commands. Make the computer work for you.
Ensure final tweaking steps will be seamless. Inevitably, a few tweaks will be necessary to ensure the Specific Aims page is, in fact, a page, or that a figure isn’t hanging off the end of a page split. The main place where I adjust my practice in order to accomplish this is wrapping each figure into a command. That way, when I need to move a figure down, it’s just a single line making the trip.
Use git as your track changes. A versioning system allows you to concentrate only on the document before you. What I’ve found works for me is to make commits roughly once a day, or when I hand the document to anyone else for proofing/approval. That way, I can continue to work on a document while someone else is looking through it, and when they provide edits I can instantly jump back to the point they saw. While folks in biology are most used to a Word document with track changes¹, I tell people they can provide me comments however they find most convenient, and many are most comfortable scribbling on a hard copy.
Separate style and content. A style file seems redundant since it all can just go in the preamble of a document but, at least for me, it helps to enforce that those are not settings I should be tweaking when trying to assemble the content.
Respect your readers by being boring. If the purpose of your document is to convey information as efficiently as possible, eschew complications that do not directly serve that purpose. Those font ligatures and bright colors aren’t going to help your reader if they distract them from learning what you want to tell them.

Obviously, these tips won’t give you high-quality science to talk about, but they have helped me streamline the process of getting that material onto a page without errors.

I also find track changes very distracting, so versioning being handled in a separate tool is helpful. ↩︎