The XML-versus-JSON argument was settled years ago for most web APIs, and JSON won. But "JSON won" is not the same as "XML is dead" — there are whole domains where reaching for JSON would be a mistake. Knowing which format fits which problem is still a useful piece of judgment.
A short history
XML (Extensible Markup Language) was standardized by the W3C in 1998. It came out of SGML, the same lineage as HTML, and it was designed as a general-purpose way to mark up documents — text with structure, where the order and nesting of elements carries meaning. That document heritage explains a lot of XML's design choices that feel heavy when you only think of it as a data-transport format.
JSON (JavaScript Object Notation) was popularized by Douglas Crockford in the early 2000s. It wasn't invented so much as recognized: it's a subset of JavaScript's object literal syntax, formalized into a tiny, language-independent spec. Its entire value proposition was that it maps directly onto the data structures every modern language already has — objects/maps and arrays/lists.
That difference in origin is the root of almost every practical trade-off below. XML models documents. JSON models data structures.
Structure side by side
The same record in both formats:
```xml
<order id="1042" currency="USD">
  <customer>Ada Lovelace</customer>
  <items>
    <item sku="A1" qty="2">Notebook</item>
    <item sku="B7" qty="1">Pen</item>
  </items>
  <total>24.50</total>
</order>
```
```json
{
  "id": 1042,
  "currency": "USD",
  "customer": "Ada Lovelace",
  "items": [
    { "sku": "A1", "qty": 2, "name": "Notebook" },
    { "sku": "B7", "qty": 1, "name": "Pen" }
  ],
  "total": 24.50
}
```
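As a rough illustration of how the two records above arrive in code, the sketch below loads both with Python's standard library (`json` and `xml.etree.ElementTree`). The choice of Python and all variable names are assumptions made for this example; the point is only what each parser hands back.

```python
import json
import xml.etree.ElementTree as ET

XML_ORDER = """\
<order id="1042" currency="USD">
  <customer>Ada Lovelace</customer>
  <items>
    <item sku="A1" qty="2">Notebook</item>
    <item sku="B7" qty="1">Pen</item>
  </items>
  <total>24.50</total>
</order>"""

JSON_ORDER = """\
{
  "id": 1042,
  "currency": "USD",
  "customer": "Ada Lovelace",
  "items": [
    {"sku": "A1", "qty": 2, "name": "Notebook"},
    {"sku": "B7", "qty": 1, "name": "Pen"}
  ],
  "total": 24.50
}"""

# JSON: values arrive already typed, and the list is a list.
order_json = json.loads(JSON_ORDER)
json_qty = order_json["items"][0]["qty"]       # an int
json_items = order_json["items"]               # a list of dicts

# XML: attribute values and element text are always strings.
# The caller converts them, and the caller decides that <item> repeats
# by asking for all matches (findall) instead of the first one (find).
order_xml = ET.fromstring(XML_ORDER)
first_item = order_xml.find("items/item")
xml_qty = first_item.get("qty")                # a str
xml_items = order_xml.findall("items/item")    # a list only because we asked
xml_total = float(order_xml.findtext("total")) # explicit conversion
```

Nothing in the XML itself says that `qty` is a number or that `item` may repeat; that knowledge lives in the calling code or in a schema.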
A few things stand out immediately.
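XML distinguishes attributes from element content. In the XML version, `id` and `currency` are attributes of the order while `customer` is a child element, and the format itself does not say which to use when. JSON has no such split: `id` is simply another key.
JSON has types; XML does not. In JSON, `2` is a number and `"A1"` is a string. In XML, `qty="2"` is text until a schema or the consuming code says otherwise.
JSON has native arrays. XML expresses a list as repeated sibling elements, which means there's no syntactic difference between "one item" and "a list with one item." Parsers and binding libraries have to be told which elements are collections. JSON's `[ ]` makes the distinction explicit in the data itself.
Verbosity, and why it's not the whole story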
The most common complaint about XML is verbosity: every element needs an opening and closing tag, and the closing tag repeats the name. On the wire, JSON is usually smaller, and after gzip the gap narrows but doesn't vanish.
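The size claim is easy to check on your own data. The sketch below is one way to do it, under stated assumptions: the 200 sample rows are invented for this example, the XML encoding uses one child element per field (an attribute-based encoding would be smaller), and the numbers it prints describe this toy payload only, not any general benchmark.

```python
import gzip
import json
import xml.etree.ElementTree as ET

# Hypothetical sample data, made up for this sketch.
rows = [{"sku": f"A{i}", "qty": i % 5 + 1, "name": f"Item {i}"} for i in range(200)]

# Compact JSON: no spaces after separators.
json_bytes = json.dumps({"items": rows}, separators=(",", ":")).encode("utf-8")

# Equivalent XML, one child element per field.
root = ET.Element("items")
for row in rows:
    item = ET.SubElement(root, "item")
    for key, value in row.items():
        ET.SubElement(item, key).text = str(value)
xml_bytes = ET.tostring(root, encoding="utf-8")


def report(label, payload):
    """Print and return the raw and gzip-compressed sizes of a payload."""
    compressed = gzip.compress(payload)
    print(f"{label}: {len(payload)} bytes raw, {len(compressed)} bytes gzipped")
    return len(payload), len(compressed)


json_raw, json_gz = report("JSON", json_bytes)
xml_raw, xml_gz = report("XML", xml_bytes)
```

Run it against a representative sample of real payloads before drawing conclusions; repetitive tag names compress well, so the gzipped sizes are the ones that matter for most HTTP traffic.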
For a high-traffic API serving millions of small payloads, that overhead is real and JSON's compactness matters. But verbosity cuts both ways. XML's explicit closing tags make deeply nested, hand-edited documents easier to read and harder to corrupt silently: a missing `</section>` is a well-formedness error, and because each closing tag names its element, the mismatch points at the element that was left open.
Schemas: this is where XML still shines
Validation is XML's strongest remaining argument. The ecosystem is mature and genuinely powerful:
- XSD (XML Schema Definition) lets you specify element order, data types, cardinality ("between 1 and 10 items"), value ranges, and reusable complex types. It's verbose, but it's expressive.
- DTD is the older, simpler validation mechanism, still seen in document formats.
- Schematron adds rule-based, business-logic validation on top — "if the country is US, the state field is required" — which is awkward to express in pure XSD.
JSON's answer is JSON Schema. It has matured a great deal and now handles types, required fields, enums, conditional rules, and reuse of definitions through `$ref`.
XML also brings XPath and XSLT, a query language and a transformation language built into the ecosystem. XPath in particular is excellent for pulling values out of large documents (`//order[@currency='USD']/total` selects the total of every order priced in USD). On the JSON side, tools such as `jq` cover similar ground.
Namespaces
XML namespaces let two vocabularies that both define a `<title>` element coexist in one document without colliding:
```xml
<doc xmlns:book="http://example.com/book"
     xmlns:html="http://www.w3.org/1999/xhtml">
  <book:title>Refactoring</book:title>
  <html:title>A Page Heading</html:title>
</doc>
```
This matters enormously when you're combining vocabularies from different organizations into one document — exactly the situation enterprise integration and document standards face. JSON has no namespace concept. The community works around it with naming conventions (prefixed keys) or by nesting under a vocabulary key, but there's no built-in mechanism, and there's no equivalent to XML's ability to mix vocabularies inline. For most app payloads you never miss it. For standards-body interchange formats, it's indispensable.
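To make the namespace mechanism concrete, here is a minimal sketch that reads the two `title` elements from the document above using Python's `xml.etree.ElementTree`. The prefix map `NS` and the variable names are assumptions made for this example; the path expressions use ElementTree's limited XPath subset, not full XPath.

```python
import xml.etree.ElementTree as ET

DOC = """\
<doc xmlns:book="http://example.com/book"
     xmlns:html="http://www.w3.org/1999/xhtml">
  <book:title>Refactoring</book:title>
  <html:title>A Page Heading</html:title>
</doc>"""

# Prefixes are local shorthand; the namespace URI is the real identity.
# The prefixes in this map do not have to match those in the document.
NS = {
    "b": "http://example.com/book",
    "h": "http://www.w3.org/1999/xhtml",
}

root = ET.fromstring(DOC)
book_title = root.findtext("b:title", namespaces=NS)
html_title = root.findtext("h:title", namespaces=NS)

# ElementTree stores each tag as "{namespace-uri}localname",
# so the two <title> elements are distinct names internally.
tags = [child.tag for child in root]
```

Because the lookup goes through the URI, a producer can rename the `book:` prefix to anything it likes without breaking this consumer.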
Where XML persists — and rightly so
SOAP web services. SOAP is XML to its bones, with WS-Security, WS-ReliableMessaging, and a formal WSDL contract. It's heavy, but in regulated, contract-first enterprise integration — banks, insurers, telecom billing, government systems — it's entrenched and works. New greenfield projects rarely choose SOAP, but the installed base is vast and stable.
Configuration files. Maven's `pom.xml` is the familiar example of a build configuration written in XML.
Documents. This is XML's home turf. Office Open XML (`.docx`, `.xlsx`, `.pptx`) packages XML parts inside a ZIP archive.
SVG. Scalable Vector Graphics is an XML vocabulary, which is why it composes with HTML and CSS and why you can style and script it in the browser. There's no JSON equivalent, and there doesn't need to be.
RSS, Atom, sitemaps, and other interchange standards are XML because they were standardized when XML was the lingua franca, and the cost of changing them outweighs any benefit.
Where JSON won — and rightly so
REST and web APIs. JSON maps onto the objects your code already uses, parses with one line in nearly every language, and is trivially consumed by browsers via `JSON.parse`.
Configuration for modern tooling. `package.json` and `tsconfig.json` are the everyday examples.
Document databases and logging. MongoDB, Elasticsearch, and structured log pipelines are JSON-native. Append-friendly formats like NDJSON (one JSON object per line) make streaming and grep-ing painless.
Anything talking to a browser. JavaScript's native handling of JSON makes it the path of least resistance for front-end work, full stop.
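The NDJSON point above can be sketched in a few lines. This is a minimal illustration, assuming Python and an in-memory stream in place of a real log file; the log records and the `read_ndjson` helper are hypothetical names invented for this example.

```python
import io
import json

# Hypothetical log records, made up for this sketch.
ndjson_text = (
    '{"level": "info", "msg": "started", "ms": 12}\n'
    '{"level": "error", "msg": "timeout", "ms": 5003}\n'
    '{"level": "info", "msg": "stopped", "ms": 7}\n'
)


def read_ndjson(stream):
    """Yield one parsed object per non-blank line of the stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Streaming filter: each line is parsed on its own, so the whole
# input never has to be held in memory as one document.
errors = [rec for rec in read_ndjson(io.StringIO(ndjson_text)) if rec["level"] == "error"]

# Appending is just writing one more line; nothing already written changes.
out = io.StringIO(ndjson_text)
out.seek(0, io.SEEK_END)
out.write(json.dumps({"level": "info", "msg": "restarted", "ms": 3}) + "\n")
record_count = sum(1 for _ in read_ndjson(io.StringIO(out.getvalue())))
```

A single JSON array would need its closing bracket rewritten on every append, and an XML log would need its root element reopened; line-delimited records avoid both.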
How to choose
A practical rule of thumb:
- Choose JSON when you're moving data structures between programs — APIs, config for modern tools, NoSQL storage, browser communication. It's lighter, simpler, and natively typed.
- Choose XML when you're representing documents with mixed content, when you need rich schema validation or namespaces, when querying with XPath/XSLT, or when you're integrating with an ecosystem (SOAP, Office formats, SVG, government standards) that already speaks XML.
- Don't fight the ecosystem. The format is usually dictated by what you're integrating with. The skill is recognizing when the default is wrong for your specific case.
Both formats are good at what they were designed for. XML is a document markup language pressed into service as a data format; JSON is a data format that was never asked to be anything else. Most of the time the friction people blame on "XML being bad" is really XML being used for a job JSON suits better — and, less often, JSON being stretched to do document work XML would handle gracefully. Pick the one whose original purpose matches your problem, and most of the pain disappears.