The XML-versus-JSON argument was settled years ago for most web APIs, and JSON won. But "JSON won" is not the same as "XML is dead" — there are whole domains where reaching for JSON would be a mistake. Knowing which format fits which problem is still a useful piece of judgment.

A short history

XML (Extensible Markup Language) was standardized by the W3C in 1998. It came out of SGML, the same lineage as HTML, and it was designed as a general-purpose way to mark up documents — text with structure, where the order and nesting of elements carries meaning. That document heritage explains a lot of XML's design choices that feel heavy when you only think of it as a data-transport format.

JSON (JavaScript Object Notation) was popularized by Douglas Crockford in the early 2000s. It wasn't invented so much as recognized: it's a subset of JavaScript's object literal syntax, formalized into a tiny, language-independent spec. Its entire value proposition was that it maps directly onto the data structures every modern language already has — objects/maps and arrays/lists.

That difference in origin is the root of almost every practical trade-off below. XML models documents. JSON models data structures.

Structure side by side

The same record in both formats:

```xml
<order id="1042" currency="USD">
  <customer>Ada Lovelace</customer>
  <items>
    <item sku="A1" qty="2">Notebook</item>
    <item sku="B7" qty="1">Pen</item>
  </items>
  <total>24.50</total>
</order>
```

```json
{
  "id": 1042,
  "currency": "USD",
  "customer": "Ada Lovelace",
  "items": [
    { "sku": "A1", "qty": 2, "name": "Notebook" },
    { "sku": "B7", "qty": 1, "name": "Pen" }
  ],
  "total": 24.50
}
```

A few things stand out immediately.

XML distinguishes attributes from element content. `id` and `currency` are attributes; `customer` is element text. JSON has no such concept. Everything is a key/value pair, so the designer has to decide whether `id` is a sibling key or something else. This flexibility in XML is also a perennial source of bikeshedding: should a value be an attribute or a child element? There's rarely one right answer.

JSON has types; XML does not. In JSON, `2` is a number and `"A1"` is a string. In XML, attribute values and element content are all text: `qty="2"` is the string "2" until a schema or your code says otherwise. JSON's small type system (string, number, boolean, null, object, array) covers most everyday data without ceremony.
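
The difference shows up as soon as you parse. Here is a minimal sketch using only Python's standard library (`json` and `xml.etree.ElementTree`); the variable names are illustrative, and the item is the first one from the order above.

```python
import json
import xml.etree.ElementTree as ET

# The same item, once as JSON and once as XML.
json_item = json.loads('{"sku": "A1", "qty": 2, "name": "Notebook"}')
xml_item = ET.fromstring('<item sku="A1" qty="2">Notebook</item>')

json_qty = json_item["qty"]     # already an int after parsing
xml_qty = xml_item.get("qty")   # attribute values always come back as str

print(type(json_qty).__name__, type(xml_qty).__name__)

# The XML value only becomes a number when the caller (or a schema-aware
# binding layer) converts it explicitly.
xml_qty_as_int = int(xml_qty)
```

Without a schema, that `int(...)` call is knowledge that lives in your code rather than in the data.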

JSON has native arrays. XML expresses a list as repeated sibling elements, which means there's no syntactic difference between "one item" and "a list with one item." Parsers and binding libraries have to be told which elements are collections. JSON's `[ ]` removes that ambiguity entirely.
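
A small sketch of that ambiguity, again with Python's standard library. The one-item payloads are made up for illustration.

```python
import json
import xml.etree.ElementTree as ET

one_item_xml = ET.fromstring("<items><item>Pen</item></items>")
one_item_json = json.loads('{"items": ["Pen"]}')

# Nothing in the XML says whether <item> is a single value or a list of one.
# The caller picks an interpretation by choosing find() or findall().
single = one_item_xml.find("item").text
as_list = [el.text for el in one_item_xml.findall("item")]

# In JSON the brackets settle the question at parse time.
json_items = one_item_json["items"]

print(single, as_list, json_items)
```

Both readings of the XML are valid, which is why binding libraries usually need a schema or an annotation to decide.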

Verbosity, and why it's not the whole story

The most common complaint about XML is verbosity: every element needs an opening and closing tag, and the closing tag repeats the name. On the wire, JSON is usually smaller, and after gzip the gap narrows but doesn't vanish.
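
If you want to measure this rather than take it on faith, a rough sketch like the following compares raw and gzip-compressed sizes for the order record shown earlier. The `sizes` helper is a hypothetical name, and a payload this small is not representative: gzip's fixed overhead dominates on tiny inputs, so run it against your own data.

```python
import gzip
import json

order = {
    "id": 1042,
    "currency": "USD",
    "customer": "Ada Lovelace",
    "items": [
        {"sku": "A1", "qty": 2, "name": "Notebook"},
        {"sku": "B7", "qty": 1, "name": "Pen"},
    ],
    "total": 24.50,
}

# Compact JSON (no spaces) and the equivalent XML without indentation.
json_bytes = json.dumps(order, separators=(",", ":")).encode("utf-8")
xml_bytes = (
    '<order id="1042" currency="USD"><customer>Ada Lovelace</customer>'
    '<items><item sku="A1" qty="2">Notebook</item>'
    '<item sku="B7" qty="1">Pen</item></items>'
    "<total>24.50</total></order>"
).encode("utf-8")


def sizes(payload: bytes):
    """Return (raw size, gzip-compressed size) in bytes."""
    return len(payload), len(gzip.compress(payload))


for name, payload in (("json", json_bytes), ("xml", xml_bytes)):
    raw, zipped = sizes(payload)
    print(f"{name}: raw={raw} bytes, gzip={zipped} bytes")
```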

For a high-traffic API serving millions of small payloads, that overhead is real and JSON's compactness matters. But verbosity cuts both ways. XML's explicit closing tags make deeply nested, hand-edited documents easier to read and harder to corrupt silently: a missing `</section>` is obvious, whereas a misplaced brace in deep JSON can be maddening to find. When you're staring at a tangled payload, a tool like a JSON formatter or an XML formatter does more for readability than either format's raw syntax ever will.

Schemas: this is where XML still shines

Validation is XML's strongest remaining argument. The ecosystem is mature and genuinely powerful:

  • XSD (XML Schema Definition) lets you specify element order, data types, cardinality ("between 1 and 10 items"), value ranges, and reusable complex types. It's verbose, but it's expressive.
  • DTD is the older, simpler validation mechanism, still seen in document formats.
  • Schematron adds rule-based, business-logic validation on top — "if the country is US, the state field is required" — which is awkward to express in pure XSD.

JSON's answer is JSON Schema. It has matured a great deal and now handles types, required fields, enums, conditional rules, and `$ref` reuse. For most API contract validation it's entirely sufficient, and it's the right tool when you're already in a JSON world. But it's worth being honest: for complex, deeply structured document validation with rich type hierarchies, XSD plus Schematron is still ahead. If your domain has decades of formalized schemas (finance, healthcare, government), those schemas are almost certainly written in XSD, and rebuilding them in JSON Schema would be a multi-year exercise nobody is paying for.

XML also brings XPath and XSLT, a query language and a transformation language built into the ecosystem. XPath in particular is excellent for pulling values out of large documents (`//order[@currency='USD']/total`). JSON has JSONPath and tools like `jq`, but XPath is older (a W3C Recommendation since 1999, whereas JSONPath only became an IETF standard in 2024), more consistently implemented, and far more battle-tested for document querying.
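
As a quick illustration, Python's standard-library `xml.etree.ElementTree` supports a limited XPath subset that is enough for the query above. The `<orders>` wrapper and the extra orders are invented for this sketch, and ElementTree wants a relative path, so the expression starts with `.//` rather than `//`.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: three orders in two currencies.
doc = ET.fromstring("""
<orders>
  <order id="1042" currency="USD"><total>24.50</total></order>
  <order id="1043" currency="EUR"><total>9.90</total></order>
  <order id="1044" currency="USD"><total>5.00</total></order>
</orders>
""")

# Select the <total> of every order whose currency attribute is USD.
usd_totals = [el.text for el in doc.findall(".//order[@currency='USD']/total")]

print(usd_totals)  # ['24.50', '5.00']
```

Full XPath (functions, axes, numeric comparisons) needs a more complete engine than ElementTree; this only shows the flavor of the query.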

Namespaces

XML namespaces let two documents that both define a `<title>` element coexist without collision:

```xml
<doc xmlns:book="http://example.com/book"
     xmlns:html="http://www.w3.org/1999/xhtml">
  <book:title>Refactoring</book:title>
  <html:title>A Page Heading</html:title>
</doc>
```

This matters enormously when you're combining vocabularies from different organizations into one document — exactly the situation enterprise integration and document standards face. JSON has no namespace concept. The community works around it with naming conventions (prefixed keys) or by nesting under a vocabulary key, but there's no built-in mechanism, and there's no equivalent to XML's ability to mix vocabularies inline. For most app payloads you never miss it. For standards-body interchange formats, it's indispensable.
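
On the consuming side, a namespace-aware parser keeps the two `title` elements apart. A minimal sketch with Python's `xml.etree.ElementTree`, using the document above; the `ns` prefix map is local to this code and does not have to match the prefixes in the document, only the URIs.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<doc xmlns:book="http://example.com/book" '
    'xmlns:html="http://www.w3.org/1999/xhtml">'
    "<book:title>Refactoring</book:title>"
    "<html:title>A Page Heading</html:title>"
    "</doc>"
)

# Prefix-to-URI map used only for lookups in this script.
ns = {
    "book": "http://example.com/book",
    "html": "http://www.w3.org/1999/xhtml",
}

book_title = doc.find("book:title", ns).text
html_title = doc.find("html:title", ns).text

# Internally the parser stores the fully qualified name as {uri}localname.
first_tag = doc[0].tag

print(book_title, "|", html_title, "|", first_tag)
```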

Where XML persists — and rightly so

SOAP web services. SOAP is XML to its bones, with WS-Security, WS-ReliableMessaging, and a formal WSDL contract. It's heavy, but in regulated, contract-first enterprise integration — banks, insurers, telecom billing, government systems — it's entrenched and works. New greenfield projects rarely choose SOAP, but the installed base is vast and stable.

Configuration files. Maven's `pom.xml`, Spring's older XML contexts, Android layout and manifest files, and countless enterprise tools use XML config. The verbosity is tolerable for files humans edit occasionally, and the schema-backed validation catches mistakes before runtime.

Documents. This is XML's home turf. Office Open XML (`.docx`, `.xlsx`, `.pptx`) and OpenDocument are ZIP archives full of XML. DocBook and DITA power technical documentation pipelines. When your data is a document (mixed content, text with inline markup, order-sensitive structure), XML models it naturally and JSON does not.
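
To make "mixed content" concrete, here is a small sketch with Python's `xml.etree.ElementTree`. The paragraph is invented, and the `as_runs` list is just one possible way to encode the same thing in JSON-style data, not a standard.

```python
import xml.etree.ElementTree as ET

# Text with inline markup: the emphasis sits in the middle of a sentence.
para = ET.fromstring("<p>Use <em>XML</em> for documents.</p>")
em = para.find("em")

leading = para.text      # text before the <em> element
emphasized = em.text     # text inside <em>
trailing = em.tail       # text after </em>, still inside <p>

# Stripping the markup recovers the plain sentence in document order.
plain = "".join(para.itertext())

# A JSON-style equivalent has to invent its own structure, for example
# a list of runs where strings are plain text and objects carry markup.
as_runs = [leading, {"em": emphasized}, trailing]

print(plain)
print(as_runs)
```

The XML version reads like the sentence it represents; the list of runs is workable but is a convention every consumer has to learn.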

SVG. Scalable Vector Graphics is an XML vocabulary, which is why it composes with HTML and CSS and why you can style and script it in the browser. There's no JSON equivalent, and there doesn't need to be.

RSS, Atom, sitemaps, and other interchange standards are XML because they were standardized when XML was the lingua franca, and the cost of changing them outweighs any benefit.

Where JSON won — and rightly so

REST and web APIs. JSON maps onto the objects your code already uses, parses with one line in most languages, and is trivially consumed by browsers via `JSON.parse`. For request/response payloads it's the obvious default.

Configuration for modern tooling. `package.json`, `tsconfig.json`, and most JavaScript-ecosystem tools use JSON (or its friendlier cousins like YAML and TOML, which exist precisely because JSON lacks comments and trailing-comma tolerance).
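
The strictness is easy to demonstrate. This sketch feeds three made-up config snippets to Python's standard `json` module, which follows the JSON grammar and so rejects both trailing commas and comments; `parses` is a hypothetical helper name.

```python
import json

strict = '{"name": "demo", "private": true}'
with_trailing_comma = '{"name": "demo", "private": true,}'
with_comment = '{"name": "demo" // the package name\n}'


def parses(text: str) -> bool:
    """Return True if text is valid JSON according to the json module."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


for label, text in (
    ("strict", strict),
    ("trailing comma", with_trailing_comma),
    ("comment", with_comment),
):
    print(f"{label}: {'ok' if parses(text) else 'rejected'}")
```

Some tools accept relaxed dialects of JSON for their own config files, but a standards-following parser will refuse them.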

Document databases and logging. MongoDB, Elasticsearch, and structured log pipelines are JSON-native. Append-friendly formats like NDJSON (one JSON object per line) make streaming and grep-ing painless.
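
Reading NDJSON takes only a few lines, which is a large part of its appeal. A minimal sketch, assuming a log stream with hypothetical `level` and `msg` fields; an in-memory `StringIO` stands in for a file or socket.

```python
import io
import json

# Stand-in for a log file: one JSON object per line.
log_stream = io.StringIO(
    '{"level": "info", "msg": "started"}\n'
    '{"level": "error", "msg": "disk full"}\n'
    '{"level": "info", "msg": "stopped"}\n'
)


def read_ndjson(lines):
    """Yield one parsed record per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# Records are handled one at a time, so the whole file never has to fit in memory.
errors = [rec["msg"] for rec in read_ndjson(log_stream) if rec["level"] == "error"]

print(errors)  # ['disk full']
```

Because each line is a complete document, appending a record never requires rewriting what came before it.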

Anything talking to a browser. JavaScript's native handling of JSON makes it the path of least resistance for front-end work, full stop.

How to choose

A practical rule of thumb:

  • Choose JSON when you're moving data structures between programs — APIs, config for modern tools, NoSQL storage, browser communication. It's lighter, simpler, and natively typed.
  • Choose XML when you're representing documents with mixed content, when you need rich schema validation or namespaces, when you need to query or transform with XPath/XSLT, or when you're integrating with an ecosystem (SOAP, Office formats, SVG, government standards) that already speaks XML.
  • Don't fight the ecosystem. The format is usually dictated by what you're integrating with. The skill is recognizing when the default is wrong for your specific case.

Both formats are good at what they were designed for. XML is a document markup language pressed into service as a data format; JSON is a data format that was never asked to be anything else. Most of the time the friction people blame on "XML being bad" is really XML being used for a job JSON suits better — and, less often, JSON being stretched to do document work XML would handle gracefully. Pick the one whose original purpose matches your problem, and most of the pain disappears.