Okay, so I don’t actually hate XML.
But recently I have been working on a syndication tool, and I am beginning to agree with the many people who question the use of XML for simple data exchange. XML was originally supposed to be both machine and human readable, and when it is used to create structured documents, like XHTML, it is. It was an offshoot of SGML but had much stricter, and therefore simpler, syntax rules. But then people started trying to use XML for any sort of communication over the network. CSV files got turned into XML (at no real gain other than that it’s XML). XML was used for protocols for method invocation over HTTP (SOAP) and for defining the interfaces for those invocations (WSDL). Now it seems that, for any data exchange out there, a lot of people think you need to do it in XML, and that you should define the XML via an XSD (XML Schema Definition). Now XSDs I hate! When you define the schema of an XML document in XML, you are taking a crude tool for exchanging data and describing it with a terrible tool for defining a schema. XSD is painful unless you have some sort of tool to help you. Don’t believe me? Here is the XSD for syndication. Maybe I am crazy, but I think that a schema definition language should be human readable, and I don’t think XSD is. The arguments for XML are many, but they mostly seem to revolve around it being a standard and there being a lot of tools for it. So XML has evolved from a simplification of SGML for the creation of structured documents into a catch-all hammer in the toolbox of many software designers. Soon people will start suggesting that we just write the programs that process XML files in some sort of XML-based programming language (oh wait, they did that already with XSL and XSLT). There has to be a better way.
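To make the CSV point a little more concrete, here is a minimal sketch in Python that writes the same two made-up rows as CSV and as XML and compares the sizes. The field names (`id`, `city`, `price`) and the `listings`/`listing` element names are my own inventions for illustration, not part of any real syndication format, and character count is only a rough stand-in for "weight".

```python
import csv
import io
import xml.etree.ElementTree as ET

# Made-up sample rows; the field names are illustrative only.
rows = [
    {"id": "1001", "city": "Boise", "price": "250000"},
    {"id": "1002", "city": "Eagle", "price": "410000"},
]


def to_csv(rows):
    """Serialize the rows as CSV with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_xml(rows):
    """Serialize the same rows as one XML element per field."""
    root = ET.Element("listings")  # hypothetical element names
    for row in rows:
        item = ET.SubElement(root, "listing")
        for key, value in row.items():
            ET.SubElement(item, key).text = value
    return ET.tostring(root, encoding="unicode")


csv_text = to_csv(rows)
xml_text = to_xml(rows)
print(len(csv_text), "characters as CSV")
print(len(xml_text), "characters as XML")
```

The XML version repeats every field name twice per row, once to open and once to close, while the CSV names each field once in the header. For flat tabular data like this, that extra markup carries no additional information.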
Lately I have been looking at other data exchange formats, focusing on JSON and YAML. Both are more human readable (YAML even more so than JSON) and carry less weight than XML for data exchange. They are standards with decent library support and can cover any structured data format that XML can. There is even a tool called Kwalify for defining schemas and validating both JSON and YAML documents against them. I am also starting to think that there needs to be a language for defining schemas in a language- and platform-neutral way. This language could be used by tools to generate things like XSD (if you have to use XML), YAML for Kwalify, SQL, etc. It would be a DSL (Domain Specific Language) for defining schemas. I know a lot of people think that creating a parser for a new language is hard, but with tools like ANTLR and yacc it’s fairly easy, and it is a powerful addition to a developer’s toolbox. As Martin Fowler says, don’t be afraid of creating parsers! We need to start thinking about the proper use of XML as a tool. It has its place, but there are better tools out there for many of the things that XML is currently used for. Also, is the obsession with using XML for everything preventing us from creating even better tools? It’s something we need to think about.
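As a rough illustration of the neutral-schema idea, here is a small Python sketch that takes one schema description and generates both a SQL table definition and a Kwalify-style YAML schema from it. To keep it short I describe the schema as a plain Python dictionary rather than parsing a real DSL with ANTLR or yacc. The `SCHEMA` layout, the field names and the type mapping are all assumptions of mine, and the YAML output only approximates Kwalify's format, so check it against the Kwalify documentation before relying on it.

```python
# A tiny, hypothetical neutral schema: each field is (name, type, required).
SCHEMA = {
    "name": "listing",
    "fields": [
        ("id", "int", True),
        ("city", "str", True),
        ("price", "float", False),
    ],
}

# Assumed mapping from the neutral types to SQL column types.
SQL_TYPES = {"int": "INTEGER", "str": "TEXT", "float": "REAL"}


def to_sql(schema):
    """Generate a CREATE TABLE statement from the neutral schema."""
    cols = []
    for name, ftype, required in schema["fields"]:
        null = " NOT NULL" if required else ""
        cols.append(f"  {name} {SQL_TYPES[ftype]}{null}")
    return f"CREATE TABLE {schema['name']} (\n" + ",\n".join(cols) + "\n);"


def to_kwalify_style_yaml(schema):
    """Generate a Kwalify-style YAML schema (approximate, hand-built text)."""
    lines = ["type: map", "mapping:"]
    for name, ftype, required in schema["fields"]:
        lines.append(f"  {name}:")
        lines.append(f"    type: {ftype}")
        if required:
            lines.append("    required: yes")
    return "\n".join(lines)


print(to_sql(SCHEMA))
print(to_kwalify_style_yaml(SCHEMA))
```

The point is that the schema is written once, in a form a person can read, and each target format becomes just another generator function. An XSD generator could be added the same way for the cases where XML is unavoidable.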
PS: Apologies to Max Cannon, and many thanks to folks that helped create Build Your Own Meat!
All the RETS buzz these days seems to be about the new RESO Syndication standard. It promises to make life easier for syndicators (Google, Zillow, Yahoo, Trulia, etc.), aggregators (ThreeWide, Point2), brokers and even MLSs. With a common data format to use, everyone’s workload should go down significantly. But right now there is one actor in that list who will be left with a syndication tool gap: the broker. What is needed is a simple, easy-to-use tool that lets the broker create a syndication file reliably, even if they don’t have their own listing database.



