The Painless Problem: Why Kibana Runtime Fields Needed an AI Skill, Not Better Docs

When I owned the Runtime Fields authoring experience in Kibana, we shipped a feature that was elegant in concept and brutal in practice. The concept: define fields at query time against an existing data view, no reindex required, perfect for the half of your data that didn’t get mapped correctly the first time. (Reindexing in Elasticsearch means rewriting every document into a new index, which is expensive at scale and a non-starter for clusters serving live traffic.) The practice: open the Runtime Fields flyout, hit an empty Painless editor, and discover that most users don’t speak Painless.

We improved the experience over time, but I’ve been thinking lately about Anthropic’s skills system, and where it would actually move the needle on enterprise software. Runtime Fields are a near-perfect fit, and the natural place to put the skill is inside the editor itself.

The editor

The Runtime Fields flyout is a clean piece of UI. You name the field, pick a type from a dropdown, toggle “Set value,” and a Monaco editor drops down with an empty script body and a comment hinting at emit(). A preview pane runs your script against sample documents from the data view.

Conceptually, it’s the right shape. In practice, the empty script body is where most users stop. Painless is Java-flavored, enforces its own allowlist of classes and methods, has seven type-specific emit() signatures that must match your dropdown choice, and expects doc-values access (doc['field'].value) where most newcomers reach for params._source.field instead. That second pattern compiles fine and tanks query latency, because reading _source makes Elasticsearch load and parse the stored JSON for every document the query touches, while doc values come from a columnar structure built for per-document lookups. The user ends up in a different tab reading reference docs, copies a snippet that almost fits, and bounces between the editor and the preview until something compiles. Or they give up and ask the analytics team.

What a skill is, briefly

A skill is a folder of expert knowledge that a model loads before answering a specific class of question. A SKILL.md sits at the root with the high-level guidance, and supporting files hold templates, examples, and validated patterns. The model reads the skill on demand when the task matches, then applies what it learned to the user’s specific situation. Think of it as a condensed senior engineer in a folder.

What the skill would contain

The SKILL.md itself would be short, maybe 300 lines, structured around four things.

The seven emit signatures. The types are keyword, long, double, date, ip, geo_point, and boolean. Each one has a slightly different emit() pattern (a date field emits epoch milliseconds as a long, a geo_point emits a latitude and a longitude as two doubles), and getting it wrong produces compile errors that don’t tell you which signature you needed. (Composite and lookup runtime fields exist too, with different shapes. They’re out of scope for a scalar-field skill and would be their own thing.)
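As a rough sketch of how that section of the skill could be tabulated, here are the seven shapes as a lookup keyed by the dropdown type. It is written in Python only so it can run outside Elasticsearch. The names EMIT_SIGNATURES and emit_hint are hypothetical, not part of Kibana or any skill format.

```python
# Emit shapes for the seven scalar runtime field types.
# EMIT_SIGNATURES and emit_hint are hypothetical names for this sketch.
EMIT_SIGNATURES = {
    "keyword": "emit(String)",
    "long": "emit(long)",
    "double": "emit(double)",
    "date": "emit(long)",  # milliseconds since epoch
    "ip": "emit(String)",
    "geo_point": "emit(double lat, double lon)",
    "boolean": "emit(boolean)",
}


def emit_hint(field_type: str) -> str:
    """Return the emit() shape for a dropdown type, or a scope note."""
    try:
        return EMIT_SIGNATURES[field_type]
    except KeyError:
        # Composite and lookup fields are deliberately not covered here.
        return f"'{field_type}' is outside this scalar-field table"
```

A model answering inside the flyout would already know the dropdown value, so the lookup is a single key access rather than a guess.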

Doc-values access idioms. The doc['field'].size() == 0 guard for missing fields. When to iterate versus take .value. A side-by-side comparison of doc[] and params._source showing the latency gap on a representative dataset, so the user sees the cost of the wrong choice before they make it.
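One way the skill could back those idioms with something checkable is a small lint over the generated script. The sketch below is deliberately naive: it does string matching, not parsing, and the function name lint_doc_access is an assumption made for illustration.

```python
import re


def lint_doc_access(script: str) -> list[str]:
    """Flag two access patterns in a Painless script (naive string checks)."""
    warnings = []
    # Pattern 1: reading from _source instead of doc values.
    if "params._source" in script or "params['_source']" in script:
        warnings.append("reads _source; prefer doc['field'] where doc values exist")
    # Pattern 2: doc['field'].value with no size() guard anywhere in the script.
    for field in sorted(set(re.findall(r"doc\['([^']+)'\]\.value", script))):
        if f"doc['{field}'].size()" not in script:
            warnings.append(f"doc['{field}'].value is read without a size() guard")
    return warnings
```

A real check would need to understand control flow, since a guard that appears after the read still passes this version.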

A recipes folder. Parameterized templates for the things people actually ask for: parse a non-ISO date string out of a log message, regex-extract a status code, derive a session bucket, hash PII, compute geo distance from a fixed point, conditionally categorize based on multiple fields. These are the requests that come up over and over.
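To make "parameterized template" concrete, here is a minimal sketch of one recipe: regex-extract a value into a keyword field, with the missing-value guard baked in. The Painless lives in a string and Python fills in the blanks. REGEX_KEYWORD_RECIPE and render_regex_keyword are hypothetical names, and the input checks are sketch-level only.

```python
from string import Template

# Painless body for a keyword runtime field. $field and $pattern are the
# recipe's two parameters; everything else is fixed.
REGEX_KEYWORD_RECIPE = Template("""\
if (doc.containsKey('$field') && doc['$field'].size() != 0) {
    Matcher m = /$pattern/.matcher(doc['$field'].value);
    if (m.find()) {
        emit(m.group(1));
    }
}
""")


def render_regex_keyword(field: str, pattern: str) -> str:
    """Fill the recipe; reject inputs that would break the Painless literals."""
    if "'" in field:
        raise ValueError("field name must not contain a single quote")
    if "/" in pattern:
        raise ValueError("pattern must not contain '/', the regex delimiter")
    return REGEX_KEYWORD_RECIPE.substitute(field=field, pattern=pattern)
```

Calling render_regex_keyword("message.keyword", r"status=(\d{3})") yields a script that emits the first capture group, so the pattern must contain one.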

An error decoder. Painless compile errors are cryptic. A lookup table of common messages mapped to root causes. For example, dynamic method [...] not found almost always means the user is calling a Java method that isn’t on Painless’s allowlist, and the fix is to substitute the allowlisted equivalent. That single mapping saves an hour the first time someone hits it.
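The decoder could be as simple as an ordered list of patterns. In the sketch below, the first row is the mapping described above. The second row is an assumption added for illustration: it targets the runtime (not compile-time) error Elasticsearch raises when .value is read on a document that lacks the field. ERROR_TABLE and decode are hypothetical names.

```python
import re

# (message pattern, root cause, suggested fix), checked in order.
ERROR_TABLE = [
    (
        re.compile(r"dynamic method \[.*\] not found"),
        "A method that is not on the Painless allowlist was called.",
        "Substitute the allowlisted equivalent.",
    ),
    (
        # Assumed row, not from the original list of examples.
        re.compile(r"doesn't have a value for a field"),
        "doc['field'].value was read on a document where the field is missing.",
        "Guard with doc['field'].size() == 0 before reading .value.",
    ),
]


def decode(message: str):
    """Return the first matching cause and fix, or None if nothing matches."""
    for pattern, cause, fix in ERROR_TABLE:
        if pattern.search(message):
            return {"cause": cause, "fix": fix}
    return None
```

Returning None for unknown messages matters: the model should fall back to reasoning from the script rather than force a wrong row onto a new error.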

The editor with a skill loaded

Here’s the change to the flyout. Above the script editor, a single input with placeholder text like “Describe the field you want, or paste an example value.” That’s it. No new tabs, no new modals, no new mental model for the user to learn.

A user opens the flyout, names the field extracted_ts, picks date from the type dropdown, and types into the assist input:

My logs have timestamps in the format 2024-03-15 14:23:45.123 UTC in a field called log_message. Extract that as a date.

The skill loads in the background. The script editor populates:

// Skip documents where the keyword subfield is unmapped or empty.
if (!doc.containsKey('log_message.keyword') || doc['log_message.keyword'].size() == 0) {
    return;
}
def msg = doc['log_message.keyword'].value;
// Capture the timestamp without the trailing " UTC" label.
def matcher = /(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) UTC/.matcher(msg);
if (matcher.find()) {
    def ts = matcher.group(1);
    java.time.format.DateTimeFormatter formatter = java.time.format.DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");
    java.time.LocalDateTime ldt = java.time.LocalDateTime.parse(ts, formatter);
    // A date runtime field emits epoch milliseconds (a long), not a date object.
    emit(ldt.atZone(java.time.ZoneId.of("UTC")).toInstant().toEpochMilli());
}

A small note appears above the editor: “Assumes a single timestamp format and a .keyword multifield on log_message. If your logs have mixed formats, click here to add a fallback chain.”
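For a sanity check outside the cluster, the same extraction can be mirrored in a few lines of Python. This is a sketch of the logic, not of how Kibana evaluates the field: a date runtime field stores epoch milliseconds, so that is what the helper returns. The function name and the sample log line are hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

# Same shape as the Painless regex: capture the timestamp, require " UTC".
TS_PATTERN = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) UTC")
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def extract_epoch_millis(log_message: str):
    """Return the first matching timestamp as epoch millis, or None."""
    match = TS_PATTERN.search(log_message)
    if match is None:
        return None
    parsed = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
    parsed = parsed.replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating-point rounding.
    return (parsed - EPOCH) // timedelta(milliseconds=1)


# Hypothetical log line built around the example value from the prompt.
print(extract_epoch_millis("2024-03-15 14:23:45.123 UTC GET /health 200"))
# 1710512625123
```

A line with no matching timestamp returns None, which corresponds to the script emitting nothing for that document.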

The preview pane runs automatically against five sample documents and shows parsed dates next to the original log lines. The Save button stays disabled until the preview returns at least one valid value, which catches the silent-failure case where the script compiles but doesn’t actually match anything in the data. That gating is a UI change to the flyout, not something the skill does on its own. The skill recommends the workflow; the editor enforces it.
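The gating rule itself is tiny. A minimal sketch, assuming the preview hands back one value per sample document with None where the script emitted nothing (that representation, and the name can_save, are assumptions):

```python
def can_save(preview_values) -> bool:
    """Enable Save only if at least one sample document produced a value."""
    return any(value is not None for value in preview_values)
```

With five samples, can_save([None, None, 1710512625123, None, None]) is True, while an all-None preview keeps the button disabled. An empty preview is also treated as "nothing matched".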

Compare that to the path without the skill. A Painless reference doc in another tab. Three iterations of compile errors. A working script that uses params._source and tanks query performance the next time the dashboard runs over a real time range. A saved field with no missing-value guard that throws on the first sparse document it hits in production.

The skill bought four things in one interaction: the right emit signature without guessing, the missing-value guard the user wouldn’t have known to add, a flag for the mixed-format edge case before it bites, and a validation step that’s harder to skip than to use.

Why this beats more docs and autocomplete

Documentation is read after frustration. Autocomplete fires inside a token, which is usually too local and too late. Skills get loaded and applied at the moment of need, with the full context of what the user is trying to accomplish.

Most Painless users don’t know they need to read the doc-values section until they’ve already written params._source.field and watched their query latency triple. By then they’ve moved on, the field is in production, and someone else inherits the slow dashboard.

A skill works the other direction. The user describes what they want in plain English, the expert knowledge gets applied before the broken pattern leaves the editor, and the preview pane catches what slips through. That’s a different shape of help, and it’s the right shape for features where the gap between user intent and correct implementation is wide.

The broader pattern

Runtime Fields shipped because power users asked for them, then quietly demanded a fluency the average user doesn’t have. Skills are a way to close that fluency gap inside the editor itself, without retraining a model and without writing yet another doc page that the user won’t find until after the incident.

The other thing worth noticing is that the same skill ships everywhere the editor isn’t. A support engineer triaging a ticket, a partner SA building a POC, an internal bot answering “why is my runtime field returning null,” all of them get the same packaged expertise without anyone rewriting it three times. Build the skill once, deploy it wherever Painless questions show up.

What about ES|QL?

You might reasonably wonder if ES|QL retires the Painless problem. ES|QL is a piped query language that handles transformations natively, no script required, and for ad-hoc investigation it’s a real upgrade. But it doesn’t replace the case Runtime Fields in Data Views were built for: define a field once and have it sit in the field list for everyone else.

The distinction worth pulling out is where the reusability lives. ES|QL queries persist too, in saved searches, dashboard panels, and alerting rules. But their reusability is at the query level: each ES|QL query that needs a derived value redefines it inline. Runtime fields in data views propagate the other direction. Define once, and the field is part of the schema everyone sees when they build against that data view. The analyst who writes the field once does the work for the next fifty users who just want to drag it onto an axis.

That field-level reuse, where the computed field becomes part of the shared schema users browse rather than a transform buried inside one specific query, is what data view runtime fields still own. They’re written in Painless, by the small number of people fluent enough to write them. Closing that fluency gap inside the editor is exactly where a skill should land.

If I were still at Elastic, this is what I’d be prototyping next.