Reading list round up I
In this new series, I am going to capture my notes and reflections on articles I find interesting or valuable, for future reference.
LLMs won't save us
"Safety and predictability often go hand-in-hand, and I fear that in the rush to destaff unfashionable things, we will sacrifice predictability in the expectation of safety, and receive neither."
LLMs won't save us by Niall Murphy presents an interesting look at the role of LLMs in the context of SRE/DevOps, particularly in incident management.
Despite the immense power of LLMs, there are significant challenges to their adoption in the operations space, stemming both from the nature of the technology and from the incentives at play (e.g. reducing headcount). First, LLMs are probabilistic in nature, so their behavior is unpredictable. This is compounded by the significant risk of drift between how models behave during testing/evaluation and in production1 (techniques to keep this in check, such as evals and LLM-as-judge approaches, are still incipient).
Second, humans provide essential adaptive capacity in operational systems. While LLMs may handle the most common and trivial incidents just fine, this comes with important second-order consequences: it may deprive human operators of the chance to build up critical skills, degrading the human expertise crucial for handling severe incidents. The rich body of work on the negative effects of automation2, and on the "bumpy transfer of control" that happens at the worst possible time, substantiates this risk.
The limits of data
"Data is supposed to be consistent and stable across contexts. The methodology of data requires leaving out some of our more sensitive and dynamic ways of understanding the world in order to achieve that stability."
The limits of data by C. Thi Nguyen examines the inherent limitations of quantitative data. While data's power lies in its aggregation and portability – allowing information to be understood across different contexts – this very strength comes at the expense of context. The article highlights several critical considerations when working with large datasets:
First, the availability of easy-to-collect metrics shapes how goals are formulated, but "the map is not the territory", and this may lead to negative outcomes3.
Second, the broader the audience for a dataset, the more context is lost. This may lead to what essentially amounts to pernicious, even if well-intentioned, metrics (e.g. using ticket sales - which everyone understands - as a metric for determining arts funding).
Third, how data is collected and classified may be biased. The idea that quantitative data is an immaculate, objective view of the world is therefore flawed. Working with data at scale requires decisions about relevance and exclusion, choices that become invisible once embedded in taxonomies and methodologies.
Fourth, metrics can become detached from their original purposes, and can be internalized by individuals leading to "value capture" – where the metric itself becomes the goal rather than what it was meant to measure (e.g. citation rates vs actual understanding in academia).
Therefore when working with large quantitative datasets, there are a few things to be mindful of:
- Who collected the data, and how?
- Who created the system of categories into which the data is sorted? What information does that system emphasize, and what does it leave out?
- Whose interests are served by that filtration system?
- Not everything is tractable as quantitative data. The world is messy and context matters. Quantitative data, especially in large datasets, is inherently limited in this regard, so caveat emptor.
“Founder Mode” and the Art of Mythmaking
“Founder Mode” and the Art of Mythmaking by Charity Majors dissects the Founder Mode talk, and does an excellent job of capturing some of the valuable insights that would otherwise be buried under a ton of founder mythologization (there is some frankly cringe-worthy stuff in the original talk by Chesky).
Airbnb, like many other companies, is now reckoning with the consequences of zero interest rate policy era practices. The incentives to massively scale up operations in an environment of quasi-unrestricted resource allocation (AKA throw-money-and-bodies-at-a-problem) led to organizational dysfunctions. In the new economic reality operational efficiency and profitability are becoming the name of the game.
The main takeaway is that running an efficient organization, with the right number of people, is incredibly valuable. A leaner organization reduces the need for alignment and meetings, and leads to flatter organizational structures with less politics and empire building. In addition, having managers who are subject matter experts and manage through the work just makes sense, and I am happy to see this becoming a mainstream opinion.
As for some of the other takes in the original material, let's just say that I am very skeptical about them...
Footnotes
- There is some indication that LLMs exhibit different behavior during training compared to production: Alignment Faking in Large Language Models ↩
- The example of Nike's post-pandemic digital transformation strategy comes to mind. ↩