and ; Same for
tables, etc. ... you all know this kind of stuff.
Although proposed for `XHTML 2.0`__, a lot of things are still missing from the
current specification, such as navigation lists, one of the most common
things on any website. The same goes for breadcrumbs, articles, abstracts, etc.
__ http://en.wikipedia.org/wiki/XHTML#The_XHTML_2.0_draft_specification
`Dublin Core`__ and `RDF`__ let you include much more metadata in your
document and tag your content in various ways, but you are still bound to
the broken XHTML.
__ http://en.wikipedia.org/wiki/Dublin_Core
__ http://en.wikipedia.org/wiki/Resource_Description_Framework
Real semantic markup
--------------------
Real markup would start by defining a custom schema that fits your web
application, or by reusing a predefined one, such as one for weblogs or for a
project site. Such a schema would only care about proper semantic markup and
may also include and reuse existing namespaces like Dublin Core and RDF.
A very simple example for a blog site could look like: ::
Some random thoughts
The long way to a semantic web
...
XML
PHP
This includes only the content, without any definition of how to lay it out.
You should of course define a schema when using such custom XML, so that
other developers are actually able to read and reuse your XML.
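To illustrate what "read and reuse" could mean for another developer, here is a minimal sketch in Python. The element names (``blog``, ``post``, ``title``, ``content``, ``tag``) are assumptions made up for this illustration, not the markup actually used in the example above; only the text values are taken from it.

```python
import xml.etree.ElementTree as ET

# Hypothetical blog document; every element name here is an assumption.
BLOG_XML = """\
<blog>
  <title>Some random thoughts</title>
  <post>
    <title>The long way to a semantic web</title>
    <content>...</content>
    <tag>XML</tag>
    <tag>PHP</tag>
  </post>
</blog>
"""


def read_blog(xml_text):
    """Return the blog title and a list of posts as plain dictionaries."""
    root = ET.fromstring(xml_text)
    posts = [
        {
            "title": post.findtext("title"),
            "tags": [tag.text for tag in post.findall("tag")],
        }
        for post in root.findall("post")
    ]
    return root.findtext("title"), posts


blog_title, posts = read_blog(BLOG_XML)
print(blog_title, posts)
```

Because the markup carries no layout information, a consumer like this needs nothing but the element names, which is exactly what a published schema would document.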
Reusing other namespaces
------------------------
As mentioned above, there are a lot of namespaces you may reuse here, because
they already define everything you need and other applications may already
handle them correctly.
XLink
^^^^^
`XLink`__ is used to provide links between resources, like in
XHTML. This may be reused in the elements and . ::
As you may guess from type="simple" there are also more complex link types -
but those again do not matter here.
__ http://en.wikipedia.org/wiki/XLink
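As a rough sketch of how an application could pick up such links, the following Python snippet collects all simple XLinks from a document. The namespace URI is the official XLink one; the ``post`` and ``title`` elements and the link target are hypothetical.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"  # official XLink namespace URI

# Hypothetical fragment; <post>, <title> and the target URL are assumptions.
POST_XML = """\
<post xmlns:xlink="http://www.w3.org/1999/xlink">
  <title xlink:type="simple" xlink:href="/blog/semantic_web.html">The long way to a semantic web</title>
</post>
"""


def simple_links(xml_text):
    """Collect (element name, target) pairs for all simple XLinks."""
    root = ET.fromstring(xml_text)
    links = []
    for element in root.iter():
        # ElementTree spells namespaced attributes as "{namespace}name".
        if element.get(f"{{{XLINK_NS}}}type") == "simple":
            links.append((element.tag, element.get(f"{{{XLINK_NS}}}href")))
    return links


links = simple_links(POST_XML)
print(links)
```

The point is that any element can carry the link attributes, so the consumer looks for the attributes and does not care about the element name.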
Dublin Core
^^^^^^^^^^^
Dublin Core is one of the XML vocabularies for metadata. You may use it to
declare something as simple as the author and license of content, but there
are also some more advanced features. Integrating it into the blog example, we
just stay with the mentioned author and license... ::
Some random thoughts
Kore Nordmann
CC by-sa
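A minimal sketch of how such metadata could be read back, assuming the author is stored in ``dc:creator`` and the license in ``dc:rights`` (both are real Dublin Core elements, but whether the example above used exactly these is an assumption, as is the ``blog`` root element):

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set, version 1.1.
NAMESPACES = {"dc": "http://purl.org/dc/elements/1.1/"}

# Hypothetical document; <blog> and the choice of dc:creator / dc:rights
# are assumptions for this sketch.
BLOG_XML = """\
<blog xmlns:dc="http://purl.org/dc/elements/1.1/">
  <title>Some random thoughts</title>
  <dc:creator>Kore Nordmann</dc:creator>
  <dc:rights>CC by-sa</dc:rights>
</blog>
"""


def read_metadata(xml_text):
    """Extract author and license from the Dublin Core elements."""
    root = ET.fromstring(xml_text)
    return {
        "author": root.findtext("dc:creator", namespaces=NAMESPACES),
        "license": root.findtext("dc:rights", namespaces=NAMESPACES),
    }


metadata = read_metadata(BLOG_XML)
print(metadata)
```

Any tool that knows Dublin Core can extract these two values without knowing anything about the custom blog markup around them.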
Generate the output
===================
To display such XML structures on the web, you have the following three
choices.
Formatting with CSS
-------------------
You may just format the XML you defined above using CSS, by including a
stylesheet reference at the head of your document. ::
But this very simple approach has several drawbacks.
1. The content may not have the same order as it should have in the output.
Imagine that your blog posts are sorted chronologically, but you only
want to show the latest ten blog posts with the newest on top...
2. The capabilities of CSS are quite limited.
3. These limitations hurt especially with the low number of elements you
normally have in your website XML description when using proper markup.
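The first drawback is the kind of thing a transformation step handles easily, while CSS alone cannot reorder or truncate content in this way. A small Python sketch of the "latest posts, newest on top" selection, assuming hypothetical ``post`` elements with an ISO formatted ``date`` attribute:

```python
import xml.etree.ElementTree as ET

# Hypothetical archive with posts stored oldest first; all names and dates
# are assumptions for this sketch.
ARCHIVE_XML = """\
<blog>
  <post date="2007-08-01"><title>First</title></post>
  <post date="2007-08-15"><title>Second</title></post>
  <post date="2007-08-30"><title>Third</title></post>
</blog>
"""


def latest_titles(xml_text, limit=10):
    """Return the titles of the newest posts, newest first."""
    root = ET.fromstring(xml_text)
    # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
    posts = sorted(root.findall("post"), key=lambda p: p.get("date"), reverse=True)
    return [post.findtext("title") for post in posts[:limit]]


titles = latest_titles(ARCHIVE_XML, limit=2)
print(titles)
```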
Transform to HTML using XSLT
----------------------------
XSLT offers quite an easy way to transform arbitrary XML into some other
language, which may be XML, HTML or something completely different.
Transforming all the content to HTML leaves you where HTML websites are
today - with all the layout capabilities.
But what did you gain then? On the one hand you still have the semantic
markup in your source view, provided the browser does the transformation and
you don't process the XSLT on the server side (which may have some hidden
surprises for you ;). On the other hand you still have to cope with the pseudo
semantic markup of HTML, where you of course shouldn't use tables for design
etc.
This is not something I want to do, and I am quite sure that I am not alone
with this feeling. Even though the user might not notice that this markup is
HTML, it is just the wrong markup language for this, because it uses some
random mix of semantic and structural markup. It is on its way to becoming
more and more semantic, which makes it more and more useless for this task,
even though I support this development.
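To make the XML to HTML step from this section concrete, here is a rough sketch of the mapping such a stylesheet would perform. Note that this is plain Python with ``xml.etree.ElementTree`` standing in for XSLT, since the Python standard library ships no XSLT processor; the source element names are again assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document; element names are assumptions.
BLOG_XML = """\
<blog>
  <title>Some random thoughts</title>
  <post><title>The long way to a semantic web</title></post>
</blog>
"""


def blog_to_html(xml_text):
    """Map the semantic blog markup to a bare HTML structure."""
    source = ET.fromstring(xml_text)
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    # The blog title becomes the main headline ...
    ET.SubElement(body, "h1").text = source.findtext("title")
    # ... and each post title a second level headline.
    for post in source.findall("post"):
        ET.SubElement(body, "h2").text = post.findtext("title")
    return ET.tostring(html, encoding="unicode")


html_text = blog_to_html(BLOG_XML)
print(html_text)
```

The result shows the problem described above: the semantic ``post`` and ``title`` elements end up as generic headlines, so the output is back to the mix of semantic and structural markup that HTML offers.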
XSLT to transform XML to XML
----------------------------
Sounds useless? Think of some very custom XML markup just for layout. You may
think of SVG, but I mean something which can cope with text better than SVG
can ;). You could think of something like the glade files used for GTK, or
some GUI structures from some random language poured into XML. ::
$title
Just a very simple example to extend in your mind by yourself :). Each user
could define his custom semantic markup and use the layout definition
language he likes best. This would enable you to create really nice
interfaces - you could embed SVG elements and format everything else with CSS.
Browser support
===============
There are two basic things required for user interaction on websites.
- Links
As shown above, the correct way to embed links in your website is to use the
XLink namespace to define links for any of the elements.
- Forms
If users should be able to submit some content to your website you need
forms. As long as `XForms`__ are not available all you may get are the good
old forms from HTML / XHTML. XHTML is XML, so you also may include them in
your website using proper namespace definitions.
__ http://en.wikipedia.org/wiki/XForms
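Including XHTML forms through namespaces could look roughly like the document in the following Python sketch, which also shows that the form elements stay cleanly separable from the custom markup. The XHTML namespace URI is the official one; ``post``, ``comments``, the ``h`` prefix, the form action and the field names are hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"  # official XHTML namespace URI

# Hypothetical custom document; <post> and <comments> are assumptions, the
# elements with the "h" prefix are regular XHTML form elements.
POST_XML = """\
<post xmlns:h="http://www.w3.org/1999/xhtml">
  <title>The long way to a semantic web</title>
  <comments>
    <h:form action="/comment" method="post">
      <h:input type="text" name="author"/>
      <h:textarea name="text"/>
    </h:form>
  </comments>
</post>
"""


def form_fields(xml_text):
    """List the names of all XHTML form controls embedded in the document."""
    root = ET.fromstring(xml_text)
    fields = []
    for form in root.iter(f"{{{XHTML_NS}}}form"):
        for child in form:
            # Only consider direct children from the XHTML namespace.
            if child.tag.startswith(f"{{{XHTML_NS}}}") and child.get("name"):
                fields.append(child.get("name"))
    return fields


fields = form_fields(POST_XML)
print(fields)
```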
So why am I not using this for my website?
Browser support
---------------
So there are four features the main browsers would need to support to make
this usable. Now the sad part starts.
+--------------------+-----------+-----------+-----------+-----------+
|                    | XML + CSS | XSLT      | Links     | Forms     |
+====================+===========+===========+===========+===========+
| `Gecko`__          | YES       | YES       | YES*      | YES       |
+--------------------+-----------+-----------+-----------+-----------+
| `Opera`__          | YES       | YES       | NO        | YES       |
+--------------------+-----------+-----------+-----------+-----------+
| `KHtml`__          | YES       | YES       | NO        | YES       |
+--------------------+-----------+-----------+-----------+-----------+
| Internet Explorer  | YES       | YES       | NO        | NO        |
+--------------------+-----------+-----------+-----------+-----------+
__ http://www.mozilla.org/newlayout/
__ http://www.opera.com/
__ http://www.konqueror.org/features/browser.php
\* The support is limited to a subset of the XLink specification, but at
least simple XLinks are supported, which work like the links known from
HTML.
Sadly, links using XLink are not clickable in any browser except the ones
using the Gecko engine. There is `some workaround with Opera`__, which
does not seem to work when the href attribute is in a namespace.
__ http://www.xml.com/pub/a/2000/04/19/opera/index.html
For links you could import the a element from XHTML, but this again would be
an ugly hack.
I did not expect Microsoft Internet Explorer to support those standards, but
I am really disappointed that Opera and KHtml/Webkit do not support
something as simple as XLink for links in elements other than the
standard XHTML a element. Not supporting Internet Explorer would not
matter for my personal site, but excluding the other two engines would really
hurt.
Conclusion
==========
The discussion about XHTML 2.0 and HTML 5.0 could be skipped if the browsers
would just support existing standards like XLink. You could define your custom
semantic and structural markup, translate between them using XSLT and no
longer have any dependencies on crappy markup languages like HTML or XHTML.
Together with CSS 3, and the implementation of new features like `calc()`__, it
could then be the first time I would be happy with the (semantic) markup in
web environments.
__ http://en.wikipedia.org/wiki/Comparison_of_layout_engines_%28CSS%29#Values_and_units
Comments
========
- Keith Alexander at Thu, 30 Aug 2007 10:18:13 +0200
Hi, RDF is more than a namespace to drop into XML documents - it's a uniform
structure for data (triples). Likewise, the idea behind the semantic web is
more than a web of semantically-marked up documents - it's a web of
interlinked data.
You can describe the semantics of your data in a separate RDF file that your
html can link to. (Have a look at the SIOC [http://sioc-project.org/] RDF
Vocabulary for describing blogs and other types of web sites).
But there are also techniques for marking up (valid) web pages such that
RDF can be extracted from them. GRDDL [http://www.w3.org/TR/grddl/] is a
standard for linking to optional XSL transformations of your custom markup
to RDF. RDFa is a nascent standard for adding attributes to XHTML in such a
way as to express RDF statements within the web page. (However, I'd wait a
little while until the specification is stable).
I use (for example, see my blog's pages) a syntax called eRDF
[http://getsemantic.com/wiki/ERDF], which just uses existing HTML attributes
to express RDF statements (RDFa uses new attributes). I use a @profile
attribute to point to the GRDDL transformation into RDF. So you can extract
the data from my web pages by piping them through a web service like
http://triplr.org/. A benefit of this method (of using existing html
attributes) is that I can hang CSS and javascript on these semantic hooks
and browsers all understand it.
- Kore at Thu, 30 Aug 2007 11:03:51 +0200
@Keith: Thanks for the links.
I have been a bit unclear about Dublin Core and RDF, because they should not
be the main topic of the blog post. Thanks for the clarifications on RDF.
I know what RDF basically does, I will come back to this soon in a
completely different blog post when it comes to topologies and ontologies in
web applications, where RDF and Tagging get relevant.
- Martin Fjordvald at Thu, 30 Aug 2007 14:52:32 +0200
I know this is a bit off-topic but it still touches it somewhat. (sorry :P)
The problem with the ability to manipulate the content to such an extent is
that the average user will be much better equipped to get content exactly as
they see fit, which will naturally mean that advertisement will be left
out.
Basically what this is, is the ideal web for academia, you have content
semantically represented and styles can easily be swapped with minimum user
effort.
Also, it's the exact opposite of what internet entrepreneurs want. A large
part of internet content is funded by advertisements, as you are of course
aware, (not suggesting otherwise) which means that a lot of interesting
content would disappear if removing the ads becomes too easy. Firefox
extensions such as Adblock and, perhaps even worse, Greasemonkey enable the
average user to remove it, semantic markup just makes it even easier.
It's a battle of two opposing sides, perhaps that would even explain why the
internet is as messy as it currently is.