## Nushell and the art of caging AI

*Submitted by Lennart on Wed, 15 Apr 2026 - 11:57*

![Nushell and the art of caging AI](/sites/default/files/styles/wide/public/2026-04/composite_1.png.webp?itok=6yjfc8Em)

 
I recently wrote about how I automate my blog releases in one command — text in, article with image out. That flow runs in a shell most people have never heard of: Nushell. And it's not a random choice.

When I build what I call *compound AI systems*, meaning systems where, e.g., language models and image models are combined with other components to solve a real task, the most important question isn't "which model are we using?" It's "what do we surround the model with?"

And that's where Nushell comes in.

## The Problem with Only Having AI

A language model is probabilistic. It makes educated guesses. Give it the same prompt twice and you might get two different answers\*. That's not a bug — it's by design. Creativity comes from uncertainty.

That's also why AI alone can never be a production system. A production system needs to be able to count on step 3 actually happening if step 2 succeeded, and on step 4 receiving the data it expects from step 3. That requires determinism in the pipes, even if the content flowing through them is probabilistic.

That pattern has a name in the research literature: *compound AI systems*. These are systems where AI components are surrounded by classic, logical code that controls the flow, validates output, handles errors, and connects the steps. Berkeley's BAIR lab wrote about the pattern as early as 2024 ("The Shift from Model to Compound AI Systems"), and it's really the only way to make AI work reliably in production.

In short: the AI is the brain, but the skeleton must be deterministic.

## What Nushell Is — and Why It Fits

Nushell is a shell, meaning a command-line tool in the same way as bash or PowerShell, but with a crucial difference from bash: it works with *structured data*, not text streams.

In bash, you send lines of text through pipes. Each command must parse the text itself and produce text that can be parsed by the next. It's an endless exercise in `awk`, `sed`, `cut`, and hidden assumptions about whitespace.

In Nushell, you send tables, records, lists, and primitive types. A pipe step *knows* what it's receiving — not just that it's text that looks like something.
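A minimal sketch of what that looks like in practice. The table and its column names are made up for illustration; the point is that each step receives typed values rather than text it has to re-parse.

```nushell
# Hypothetical case table, written as a Nushell table literal.
let cases = [[id status age_days]; [1 pending 12] [2 closed 3] [3 pending 40]]

# `age_days` is an int, so this is a numeric comparison, not string matching.
let old_pending = ($cases | where status == "pending" and age_days > 30)

# Pull out one column as a plain list.
$old_pending | get id
```

There is no `awk` or `cut` step here: `where` and `get` address columns by name.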

And most importantly: Nushell speaks virtually all the data formats you encounter in modern AI work, directly and built-in.

```
open data.json           # parses to records
open report.csv          # parses to a table
open config.toml         # parses to a record
open notes.md            # opens as text and recently even as data
http get https://api.../ # returns parsed JSON directly
```

All of it is the same: structured data that you can filter, transform, join, and pass on with the same syntax. And unlike in bash, you don't need a separate parser: `open` picks one for you based on the file extension.
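The same idea can be tried without any files on disk. In this sketch, inline strings stand in for file contents (the values are invented), and the `from json`, `from toml`, and `from csv` commands do the conversion that `open` would otherwise trigger from the file extension.

```nushell
# One made-up value arriving in three different formats.
let as_json = ('{"id": 7, "status": "pending"}' | from json)
let as_toml = ("id = 7\nstatus = \"pending\"" | from toml)
let as_csv = ("id,status\n7,pending" | from csv)

# After parsing, all three answer the same question the same way.
# CSV yields a table, so the first row is addressed with `.0`.
[$as_json.status $as_toml.status $as_csv.0.status]
```

Once parsed, the original format no longer matters to the rest of the pipeline.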

## Why This Matters for AI Systems

A typical compound AI system consists of three types of steps:

1. **Data retrieval** — get something from an API, a database, a file.
2. **AI call** — send it to a model and get something back.
3. **Post-processing** — validate, route, save, forward.

Each of these steps delivers data in different formats. The API returns JSON. The database returns rows. The model returns JSON or markdown. The destination API wants JSON in its own format.

In bash, this would be a labyrinth of `jq`, temporary files, and handcrafted parsers. In Nushell, it's:

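```
# `llm` is a custom command (not a Nushell built-in), assumed to return a string.
http get $api_url
| where status == "pending"
| each { |row|
    $row | llm "summarize the case in 2 sentences"
    | wrap summary
    | merge { case_id: $row.id }
  }
| to json
| http post --content-type application/json $destination
```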

Notice what happens: an AI model is placed in the middle of a pipeline, where the input is structured rows from an API, and the output is merged back with the original `case_id`. This ensures that the AI's output is *correctly linked* to the data it belongs to — no hallucinated ID, no mixed-up row. The logical glue is deterministic. Only the one specific task — the summarization — is AI.
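To try the linking step without an API or a model, here is a sketch that runs offline. The `llm` command below is a stand-in I made up for illustration (it just echoes the title), and the rows are invented. Because the stub returns a plain string, `wrap summary` turns it into a record before `merge` attaches the ID.

```nushell
# Stand-in for a real model call, so the sketch runs offline.
# Takes a record as pipeline input and returns a string.
def llm [prompt: string] {
  let row = $in
  $"summary of: ($row.title)"
}

# Hypothetical rows, shaped like an API response.
let rows = [[id status title]; [101 pending "Late delivery"] [102 closed "Refund"] [103 pending "Wrong invoice"]]

# The case_id is attached by code, never produced by the model.
let results = (
  $rows
  | where status == "pending"
  | each { |row|
      $row | llm "summarize the case in 2 sentences" | wrap summary | merge { case_id: $row.id }
    }
)

$results | to json
```

Swapping the stub for a real model call changes only the body of `llm`; the filtering and the ID linkage stay deterministic.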

## Control Instances Around the Model

This is the point I want to make: **AI should rarely be the outermost layer.** It should be an inner organ.

The control instances around the model should typically:

- **Validate input** before it reaches the model. Is it actually the type of data we expect? Is it too long? Does it contain sensitive data that needs masking?
- **Structure output** from the model. Does the answer match the schema we expect? If not, try again or fail fast.
- **Route depending on results**. If confidence is low, send to a human. If output is empty, log and skip.
- **Log everything**. Input, prompt, output, and decisions, so you can debug why the system chose as it did.

In Nushell, you write this kind of logic as easily readable pipes, closer to English than to code:

```
use std log

# `llm`, `route-to-human`, and `save-and-notify` are custom commands, not built-ins.
let answer = $input | llm $prompt
if ($answer | str length) < 10 {
  log warning "answer too short, using fallback"
  $fallback
} else if ($answer | from json | get confidence) < 0.7 {
  $answer | route-to-human
} else {
  $answer | save-and-notify
}
```

It's not glamorous. It's precisely what the language tries to be — pragmatic glue between systems.
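The "structure output" control from the list above (check the schema, then retry or fail fast) might be sketched like this. The command names `parse-answer` and `ask-validated`, the required `confidence` field, and the default of three attempts are all assumptions for illustration, not part of any library.

```nushell
# Hypothetical validator: returns the parsed record, or null when the answer
# does not parse to a record or lacks a `confidence` field.
def parse-answer [raw: string] {
  let parsed = try { $raw | from json } catch { null }
  let is_record = ($parsed | describe | str starts-with "record")
  if not $is_record { return null }
  if "confidence" not-in ($parsed | columns) { return null }
  $parsed
}

# Run the `ask` closure up to `max_tries` times; fail fast if nothing validates.
def ask-validated [ask: closure, max_tries: int = 3] {
  for attempt in 1..$max_tries {
    let parsed = (parse-answer (do $ask))
    if $parsed != null { return $parsed }
  }
  error make { msg: $"no valid answer after ($max_tries) attempts" }
}

# Usage with a stub closure standing in for a real model call.
ask-validated {|| '{"label": "urgent", "confidence": 0.91}' }
```

The model call lives entirely inside the closure, so the retry loop and the schema check stay ordinary code that can be tested without a model.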

## The Important Principle: Where Is There AI, and Where Is There Not?

When I help a company design a compound AI system, the first whiteboard exercise is always the same. We map out all the steps. Then I put an 'X' next to each step that *must* be AI, and a different mark next to each step that *can* be deterministic code.

Almost always, there are many more of the latter kind than people thought. Classification based on rules? Code. Check if a field is empty? Code. Reformat the date? Code. Send the result to the correct department based on a lookup table? Code.
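A few of those code-only steps, sketched in Nushell. The keywords, category names, and department table are invented for illustration; a real system would have its own rules.

```nushell
# Hypothetical lookup table: category -> department.
let departments = { billing: "finance", delivery: "logistics" }

# Rule-based classification: keyword matching, no model involved.
def classify [description: string] {
  let text = ($description | str downcase)
  if ($text | str contains "invoice") {
    "billing"
  } else if ($text | str contains "parcel") {
    "delivery"
  } else {
    "unknown"
  }
}

# Route one case using only deterministic rules.
def route [case: record, table: record] {
  # Empty-field check.
  if ($case.description | is-empty) { return "rejected" }
  let category = (classify $case.description)
  # Lookup-table routing, with a human fallback for unknown categories.
  if $category in ($table | columns) {
    $table | get $category
  } else {
    "manual-review"
  }
}

route { id: 1, description: "Wrong amount on invoice" } $departments
```

Every branch here gives the same result for the same input, which is exactly what a model call cannot promise.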

The AI should be used where the task is linguistic, creative, or pattern-based in a way that is too complex for rules. Everything else is more expensive, slower, and less reliable if AI is allowed to do it.

Nushell makes it easy to have that division of labor, because sending data into a model and filtering a list doesn't feel like two different worlds. It's the same pipes, the same data types, the same mental model.

## Why Managers Should Care About This

This is a technical choice on the surface. But it has direct consequences for three things management cares about:

**Reliability.** A system where AI is encapsulated in deterministic code fails predictably. A system where AI is the glue between components fails mysteriously. Reliability in production is always a question of where the determinism sits.

**Cost.** Every AI call costs tokens. A system that calls AI ten times a day is cheap. A system that calls AI ten times per task because it does *everything* with AI — including the deterministic parts — is expensive. The economic difference between these two patterns is often 10-100× in operational costs.

**Auditability.** Can you, after a failure, explain why the system did what it did? If AI sits in only one place in the pipeline, and the rest is code, you can. If AI sits in every corner, you can't.

## What You Should Ask

Next time someone presents an AI system to you — internal development or external vendor — ask this question:

**"Which part of this is AI, and which part is regular code?"**

If the answer is vague, it's because the architecture is vague. If the answer is "everything is AI" or "it's an end-to-end model," then you know that reliability, cost, and auditability are worse than they need to be.

And if the answer is "AI sits here, here, and here, and the rest is deterministic code with validation" — then you are likely facing someone who has built this kind of system before.

The shell they use matters less than the principle. And the principle matters more than the model.

---

\*Under very controlled conditions, language models can be set up so their output becomes more "deterministic" and consistent for identical input, but the process itself still relies on next-token prediction based on probability, not logic. LLMs are inductive, not deductive.

