The problem with HTML and XHTML is obvious and well known to everybody familiar with web application development: neither offers proper semantic markup for website content, nor do they offer an advanced layout language.
For several web projects I looked for a way to use semantic markup for my content, so that it is accessible to others and even to software, without losing the usual possibilities for laying out my website the way the designer requested. So far I have not found a solution for this, but let's go into a bit of detail about what the problems are...
The problem has been described by a lot of people and does not need much explanation, but let me try a short summary.
XML, as an "extensible markup language", defines the markup of some content. Let's ignore that HTML is not XML while XHTML is; the differences are not relevant here.
A markup language combines text and extra information about the text. The extra information, for example about the text's structure or presentation, is expressed using markup, which is intermingled with the primary text. (http://en.wikipedia.org/wiki/Markup_language)
With XHTML the markup still defines both structure and presentation, which is not what you want if you aim for easy access by programs and effective layout for your website. As a matter of fact, XHTML does not solve either of those two problems properly. XHTML may have some more, and more clearly defined, structures than HTML, and it is often used in a better way. But this does not solve the problem; it only reduces it a tiny bit.
There are very basic structures for semantic markup, like lists, which you define using <ul>, or headlines and paragraphs, using <h1> to <h6> and <p>; the same goes for tables, etc. You all know this kind of stuff.
Although some of them are proposed for XHTML 2.0, a lot of things are still missing in the current specification, like <nl> for navigation lists, one of the most common things on every website; the same goes for breadcrumbs, articles, abstracts, etc.
Real markup would start by defining a custom schema fitting your web application, or by reusing a predefined one, like one for weblogs or a project site. Such markup would only care about proper semantics and may also include and reuse existing vocabularies like Dublin Core and RDF.
A very simple example for a blog site could look like this:
<?xml version="1.0" encoding="UTF-8"?>
<blog>
  <title>Some random thoughts</title>
  <!-- ... -->
  <posts>
    <post>
      <title>The long way to a semantic web</title>
      <description>...</description>
      <tags>
        <tag>XML</tag>
        <tag>PHP</tag>
        <!-- ... -->
      </tags>
      <comments>
        <!-- ... -->
      </comments>
    </post>
    <!-- ... -->
  </posts>
</blog>
This just includes the content, without any definition of how to lay it out. You should of course define a schema when using such custom XML, so that other developers are really able to read and reuse your XML.
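To illustrate why such content-only markup is easy for software to consume, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The helper name list_posts and the shortened sample document are assumptions made for illustration; only the element names come from the example above.

```python
import xml.etree.ElementTree as ET

# Shortened version of the blog example (placeholder comments removed).
BLOG_XML = """<?xml version="1.0" encoding="UTF-8"?>
<blog>
  <title>Some random thoughts</title>
  <posts>
    <post>
      <title>The long way to a semantic web</title>
      <description>...</description>
      <tags>
        <tag>XML</tag>
        <tag>PHP</tag>
      </tags>
    </post>
  </posts>
</blog>"""


def list_posts(xml_text):
    """Return (title, [tags]) for every <post> in the blog document."""
    # Parse from bytes so the parser itself applies the declared encoding.
    root = ET.fromstring(xml_text.encode("utf-8"))
    result = []
    for post in root.findall("./posts/post"):
        title = post.findtext("title")
        tags = [tag.text for tag in post.findall("./tags/tag")]
        result.append((title, tags))
    return result


print(list_posts(BLOG_XML))
```

No knowledge about layout is needed to get at the data, which is the whole point of keeping the markup purely semantic.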
As mentioned above, there are a lot of namespaces you may reuse here, because they already define everything you need and other applications may already handle them correctly.
XLink is used to provide links between resources, like <a href=""/> in XHTML. It may be used on the <blog> and <post> elements.
<?xml version="1.0" encoding="UTF-8"?>
<blog xmlns:xlink="http://www.w3.org/1999/xlink">
  <posts>
    <post xlink:href="/blog/the_long_long_way_to_semantic_web.txt"
          xlink:type="simple">
      <!-- ... -->
    </post>
  </posts>
</blog>
As you may guess from xlink:type="simple", there are also more complex link types, but again those do not matter here.
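A program that knows the XLink namespace can find such links on any element, whatever the element is called. The following is only a sketch using Python's standard xml.etree.ElementTree; the function name simple_links is an illustrative assumption, and only simple links are handled.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"

DOC = """<blog xmlns:xlink="http://www.w3.org/1999/xlink">
  <posts>
    <post xlink:href="/blog/the_long_long_way_to_semantic_web.txt"
          xlink:type="simple"/>
  </posts>
</blog>"""


def simple_links(xml_text):
    """Collect the targets of all simple XLinks in the document."""
    root = ET.fromstring(xml_text)
    targets = []
    for element in root.iter():
        # ElementTree exposes namespaced attributes as "{namespace-uri}name".
        if element.get(f"{{{XLINK_NS}}}type") == "simple":
            href = element.get(f"{{{XLINK_NS}}}href")
            if href is not None:
                targets.append(href)
    return targets


print(simple_links(DOC))
```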
Dublin Core is one of the XML vocabularies for metadata. You may use it to declare something as simple as the author and license of the content, but there are also some more advanced features. Integrating it into the blog example, we just stay with the author and license mentioned...
<?xml version="1.0" encoding="UTF-8"?>
<blog xmlns:dc="http://purl.org/dc/elements/1.1/">
  <title>Some random thoughts</title>
  <dc:creator>Kore Nordmann</dc:creator>
  <dc:rights>CC by-sa</dc:rights>
</blog>
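Because the Dublin Core elements live in a well-known namespace, a generic tool can pick out author and license without knowing anything else about the blog vocabulary. A small sketch, again in Python with xml.etree.ElementTree; read_metadata is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI mapping used only for the lookups below.
NAMESPACES = {"dc": "http://purl.org/dc/elements/1.1/"}

DOC = """<blog xmlns:dc="http://purl.org/dc/elements/1.1/">
  <title>Some random thoughts</title>
  <dc:creator>Kore Nordmann</dc:creator>
  <dc:rights>CC by-sa</dc:rights>
</blog>"""


def read_metadata(xml_text):
    """Return title, creator and rights of the blog as a dictionary."""
    root = ET.fromstring(xml_text)
    return {
        "title": root.findtext("title"),
        "creator": root.findtext("dc:creator", namespaces=NAMESPACES),
        "rights": root.findtext("dc:rights", namespaces=NAMESPACES),
    }


print(read_metadata(DOC))
```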
To display such XML structures on the web, you have the following three choices.
You may just format the XML defined above using CSS, by including a stylesheet reference at the top of your document.
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="/blog.css"?>
But this very simple approach has several drawbacks.
The content may not be in the order it should have in the output. Imagine that your blog posts are sorted chronologically, but you only want to show the latest ten posts, with the newest on top...
The capabilities of CSS are quite limited, especially with the low number of elements you normally have in your website's XML description when using proper markup.
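The ordering problem can be made concrete with a short sketch. Selecting "the latest ten" and reversing the document order is a transformation of the data, so it has to happen in a processing step before display. The Python below uses the standard library's ElementTree and assumes a hypothetical date attribute (ISO 8601) on each <post>, which is not part of the blog example; latest_posts is likewise just an illustrative name.

```python
import xml.etree.ElementTree as ET

# The "date" attribute is an assumption made for this sketch.
DOC = """<blog>
  <posts>
    <post date="2007-01-05"><title>First</title></post>
    <post date="2007-02-11"><title>Second</title></post>
    <post date="2007-03-20"><title>Third</title></post>
  </posts>
</blog>"""


def latest_posts(xml_text, limit=10):
    """Return the titles of the newest posts, newest first."""
    root = ET.fromstring(xml_text)
    posts = root.findall("./posts/post")
    # ISO 8601 dates sort correctly as plain strings.
    posts.sort(key=lambda post: post.get("date"), reverse=True)
    return [post.findtext("title") for post in posts[:limit]]


print(latest_posts(DOC, limit=2))
```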
XSLT offers a fairly easy way to transform arbitrary XML into some other language, which may be XML, HTML or something completely different. Transforming all the content to HTML leaves you where HTML websites are today, with all their layout capabilities.
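To show the kind of mapping such a transformation performs, here is a rough sketch. Note that it is written in Python with xml.etree.ElementTree as a stand-in, not in XSLT itself, and that the chosen HTML structure (a div with a headline and a list) and the name to_html are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

DOC = """<blog>
  <title>Some random thoughts</title>
  <posts>
    <post><title>The long way to a semantic web</title></post>
  </posts>
</blog>"""


def to_html(xml_text):
    """Map the semantic blog markup to a small HTML fragment."""
    blog = ET.fromstring(xml_text)
    body = ET.Element("div")
    # The blog title becomes the headline.
    ET.SubElement(body, "h1").text = blog.findtext("title")
    # Every post becomes one list item carrying the post title.
    post_list = ET.SubElement(body, "ul")
    for post in blog.findall("./posts/post"):
        ET.SubElement(post_list, "li").text = post.findtext("title")
    return ET.tostring(body, encoding="unicode")


print(to_html(DOC))
```

The semantic source stays untouched; only the generated output speaks HTML.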
But what do you gain then? On the one hand, you still have the semantic markup in your source view when the browser does the transformation and you don't process the XSLT on the server side (which may have some hidden surprises for you ;). On the other hand, you still have to cope with the pseudo-semantic markup of HTML, where you of course shouldn't use tables for design, etc.
This is nothing I want to do, and I am quite sure that I am not alone with this feeling. Even if the user might not notice that this markup is HTML, it is just the wrong markup language for the job, because it uses a random mix of semantic and structural markup. It is on its way to becoming more and more semantic, which makes it even more useless for this task, even though I support this development.
Sounds useless? Think of some very custom XML markup just for layout. You may think of SVG, but I mean something which can cope with text better than SVG can ;). You could think of something like the Glade files used for GTK, or the GUI structures from some random language poured into XML.
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="/blog.css"?>
<layout>
  <verticalGrid>
    <grid>$title</grid>
    <grid>
      <tree>
        <!-- ... -->
      </tree>
    </grid>
  </verticalGrid>
</layout>
Just a very simple example for you to extend in your mind :). Each user could define their own custom semantic markup and use the layout definition language they like best. This would enable you to create really nice interfaces: you could embed SVG elements and format everything else with CSS.
There are two basic things required for user interaction on websites.
As shown above, the correct way to embed links in your website is to use the XLink namespace, which lets you define links on any element.
If users should be able to submit content to your website, you need forms. As long as XForms is not available, all you get are the good old forms from HTML / XHTML. XHTML is XML, so you may also include them in your website using proper namespace declarations.
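As a sketch of what such an embedded form could look like, the document below puts an XHTML form into the comments section of a post, and a few lines of Python (standard xml.etree.ElementTree) confirm that a namespace-aware parser sees the controls as real XHTML elements. The html prefix, the action URL, the field names and the helper form_fields are all made up for illustration.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical comment form embedded in the custom blog markup.
DOC = """<post xmlns:html="http://www.w3.org/1999/xhtml">
  <title>The long way to a semantic web</title>
  <comments>
    <html:form action="/blog/comment" method="post">
      <html:input type="text" name="author"/>
      <html:textarea name="text"/>
      <html:input type="submit" value="Send"/>
    </html:form>
  </comments>
</post>"""


def form_fields(xml_text):
    """Return the names of all named XHTML form controls in the document."""
    root = ET.fromstring(xml_text)
    controls = (f"{{{XHTML_NS}}}input", f"{{{XHTML_NS}}}textarea")
    names = []
    for element in root.iter():
        # Only elements in the XHTML namespace count as form controls here.
        if element.tag in controls:
            name = element.get("name")
            if name is not None:
                names.append(name)
    return names


print(form_fields(DOC))
```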
So why am I not using this for my website?
So there are four features the main browsers would need to support for this to be usable. Now the sad part starts.
XML + CSS
* The support is limited to a subset of the XLink specification, but at least simple XLinks are supported, which work like the links known from HTML.
Sadly, links using XLink are not clickable in any browser except the ones using the Gecko engine. There is a workaround for Opera, which seems not to work when the href attribute is in a namespace.
For links you could import the <a> element from XHTML, but this again would be an ugly hack.
I did not expect Microsoft Internet Explorer to support any of those standards, but I am really disappointed that Opera and KHTML/WebKit do not support something as simple as XLink for links on elements other than the standard XHTML a element. Not supporting Internet Explorer would not matter for my personal site, but excluding the other two engines would really hurt.
The discussion about XHTML 2.0 and HTML 5.0 could be skipped if the browsers would just support existing standards like XLink. You could define your custom semantic and structural markup, translate between them using XSLT and not have any dependencies on crappy markup languages like HTML or XHTML any more.
Together with CSS 3 and the implementation of new features like calc(), this could be the first time I would be happy with the (semantic) markup in web environments.