Design
How will we design this CMS to be simple, fast, lightweight, extensible and standards compliant (future proof)?
Design pattern
I'm fond of the MVC design pattern (Model-View-Controller; the View part is called the Viewer on this page), so that's what I'm going to use as a foundation. Here's how to apply it:
The Model is the data, which in our case is stored in XML files.
The Controller is a core component that uses site meta data and session data to decide how to handle requests from the client user (discussed further down).
The Viewer is another core component that carries out what the Controller has decided, including XSLT transformation - and translation of content data if appropriate. I think the Viewer should be both server and client side - the main focus should lie with the server side for the CMS itself, but by default it should be able to serve a client web browser an (X)HTML page (with a 5 Level Compatibility Client Browser GUI which I will explain later).
The advantage of this approach is a truly semantic service, where the client does not have to be a regular desktop web browser but can be any of a whole range of client applications that communicate over HTTP.
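To make the split of responsibilities concrete, here is a minimal sketch of the three roles. It is written in Python only for brevity (the CMS itself would be PHP), and every name in it (MODEL, Decision, controller, viewer, the "feedreader" client label) is an assumption made for illustration, not part of WYMsite. The XSLT step is replaced by a trivial wrapper.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical in-memory stand-in for the XML files that make up the Model.
MODEL = {
    "about": "<page><title>About</title></page>",
}


@dataclass
class Decision:
    """What the Controller tells the Viewer to do."""
    status: int
    content_id: Optional[str]
    output_format: str


def controller(path: str, session: dict) -> Decision:
    """Decide how to handle a request from the path and the session data."""
    content_id = path.strip("/") or "about"
    if content_id not in MODEL:
        return Decision(404, None, "xhtml")
    # Assumption: non-browser clients get the raw XML, browsers get XHTML.
    fmt = "xml" if session.get("client") == "feedreader" else "xhtml"
    return Decision(200, content_id, fmt)


def viewer(decision: Decision) -> str:
    """Carry out the Controller's decision and produce the response body."""
    if decision.status != 200:
        return "<p>Not found</p>"
    source = MODEL[decision.content_id]
    if decision.output_format == "xml":
        return source
    # Placeholder for the XSLT transformation: just wrap the source XML.
    return "<html><body>" + source + "</body></html>"
```

The point of the sketch is that the Viewer never inspects the request itself; it only acts on the Decision handed over by the Controller.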
Architecture
I also like tier-based architectures, and here's a good explanation that I found in 'The PHP Scalability Myth' by Jack Herrington:
There are three basic web architectures in common use today: two-tier, logical three-tier, and physical three-tier. Engineers give them different names and slightly different mechanics, so to be clear about what I mean, I will illustrate the three architectures.
Two-Tier Architecture
A two-tier web application has one tier for the display and application logic, connected to a second tier, which is the database server. Two-tier web applications perform best because they require the least processing for a single request.
Three-Tier Architecture
A three-tier architecture separates the display and application logic into two separate tiers. To the left, a physical three-tier architecture puts the three tiers onto three separate machines or processes. To the right, a logical three-tier architecture has the application and business logic layer together in one process, but still divided by an API layer.
Three-tier web applications trade some performance hit for better code factoring. With the business logic separated from the user interface, it is possible to access the application logic from multiple directions. For example, we could expose the layer through web services.
I find them all rather sensible, so I leave it up to you to choose which one suits you better.
Framework
We've decided on PHP as the platform for this CMS. Version 5.1 seems to be the minimum requirement for the feature set we expect to use. PHP is well supported by web hosting companies, and even if PHP version 4 is much more common at present, it can be assumed that hosts will, when they do migrate in the near future, move to the latest stable version of the 5.x branch. So it should be fairly easy, and not much fuss, to deploy the WYMsite CMS onto a hosting account on a web server supporting PHP 5.1 or later.
The Model Data, which is discussed in the next section, can exist either on the same server machine or on another one - it's independent as long as it's reachable by the Controller and the Viewer. The Controller Component is a collection of PHP scripts that use site meta data (from an RDF-based file in the Model Data Repository) and session data to decide how to handle requests from the client user (also discussed further down). The meta data says which data (content, style sheets etc) to use and combine for the server response to the client. The Viewer carries out what the Controller has decided, including transformation - and translation of data if appropriate. I think the Viewer should be both server and client side - the main focus should lie with the server side for the CMS itself, but by default it should be able to serve a client web browser an (X)HTML page. The advantage of this approach is a truly semantic service, where the client does not have to be a regular desktop web browser but can be any of a whole range of client applications that communicate over HTTP via what I would like to call Interface Gateways.
Data
So what XML vocabulary to use? Is there a suitable one, or is a custom one better? I vote for relying on standards as much as possible and choosing the most suitable vocabulary for each kind of data. The important thing here is how to store the data, not how to distribute it, so the format should be the most reusable one for each kind of data rather than the one that is easiest to serve up.
Configuration files - with the *.prop(erties) file extension - one for the server or servers ( server.prop ) and one for each WYMsite site served on that server ( default.site.prop ). [ HV: *.conf files could confuse the user, as the name suggests plain text configuration files, so it should be something else; *.prop for now ] For the pure configuration data, a simple set of XML elements is a good choice: easy to learn, easy to remember, easy to extend. One file for each site is a good solution, though for complex sites it could be useful to split it into multiple sections - which is why some sort of expanding syntax should be included.
Site Meta Data ( default.site.rdf ) - which is purely references to what content is contained in which files and how they fit together for output - I'm thinking RDF might be well suited, but possibly another format or a custom one is preferable for readability. In the current version, we only use .xml files and project, section, elem and prop elements. A property (prop) can contain a text node or XHTML nodes.
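As an illustration of what "a simple set of XML elements" plus an expanding syntax could look like, here is a small sketch. It is in Python for brevity, and the element names (config, prop, include), the href attribute and the file languages.prop are all assumptions made up for this example; only the file name default.site.prop comes from the text above.

```python
import xml.etree.ElementTree as ET

# Hypothetical file contents, keyed by file name, so the sketch needs no disk access.
FILES = {
    "default.site.prop": """
        <config>
          <prop name="title">My site</prop>
          <include href="languages.prop"/>
        </config>""",
    "languages.prop": """
        <config>
          <prop name="default-language">en</prop>
        </config>""",
}


def load_config(name, files=FILES, _stack=()):
    """Return a flat dict of properties, expanding <include> elements in place."""
    if name in _stack:
        raise ValueError("circular include: " + name)
    props = {}
    for child in ET.fromstring(files[name].strip()):
        if child.tag == "include":
            # Properties from the included file are merged at this position.
            props.update(load_config(child.get("href"), files, _stack + (name,)))
        elif child.tag == "prop":
            props[child.get("name")] = (child.text or "").strip()
    return props
```

With this layout a complex site can move a group of properties into its own file and pull it back in with one include element, while simple sites keep everything in a single file.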
It would be useful to implement a kind of inheritance between sites, and between server.xml and the sites.
Inheritance is very useful also between elements, so an element can inherit properties from another.
This is doable with the help of XSLT (though with some limitations), but it would be preferable and easier to implement inheritance in PHP.
We can also think about overloading properties (and methods?), so an element can overload its parent's properties by simply redefining them.
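Here is a minimal sketch of how inheritance and overloading between elements could be resolved outside XSLT. It is in Python for brevity (the text proposes PHP), it reuses the section, elem and prop element names mentioned above, and the id and extends attributes are assumptions invented for the example. It only reads text content, not XHTML child nodes.

```python
import xml.etree.ElementTree as ET

SITE_XML = """
<section>
  <elem id="page">
    <prop name="layout">two-column</prop>
    <prop name="footer">Default footer</prop>
  </elem>
  <elem id="article" extends="page">
    <prop name="footer">Article footer</prop>
  </elem>
</section>
"""


def resolve_props(root, elem_id):
    """Collect an element's properties, parents first so children override."""
    elems = {e.get("id"): e for e in root.iter("elem")}
    chain = []
    current = elem_id
    while current is not None:
        if current in chain:
            raise ValueError("inheritance cycle at " + current)
        chain.append(current)
        current = elems[current].get("extends")
    props = {}
    # Walk from the most distant ancestor down to the element itself.
    for eid in reversed(chain):
        for prop in elems[eid].findall("prop"):
            props[prop.get("name")] = prop.text or ""
    return props


root = ET.fromstring(SITE_XML.strip())
```

In this example the article element inherits layout from page and overloads footer simply by redefining it.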
[ JFH: I'd prefer to keep xml files simple, and to use the same syntax everywhere. I don't know if RDF and RSS are really mandatory here.
A limited set of element types + strict XHTML might ease XSLT development, too. ]
For Style Data I'm thinking XSLT (Extensible Stylesheet Language Transformations) to get the different XML vocabularies into their presentational form (i.e. HTML/XHTML for web browsers). CSS (Cascading Style Sheets) is also Style Data, though it is more about presentation than structure.
For Content Data I'm thinking several different formats might fit different types of data:
For the first four core features - news, updates, blog and/or about page - my opinion is that they could all be in either the RSS or the Atom vocabulary - possibly a customisation or microformat thereof.
For articles and other more standard types of pages I consider either DocBook or XHTML itself to be alternatives here. DocBook might be more future proof since XHTML is in transition and DocBook comes with many XSLT style sheets for transformation to many other output formats. On the other hand XHTML is much simpler to create, serve up and manage for the time being. Starting out with XHTML and maybe migrating the system to DocBook in the future might be the way to go.
For I18n data - translations of the CMS's standard buttons and other standard textual content (small portions, not articles and such) - I figure we should use XLIFF, which will also make translating the CMS itself easier.
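As a sketch of how small interface strings could be read from an XLIFF file, the following loads trans-unit entries from an XLIFF 1.2 document into a lookup table. It is in Python for brevity; the unit ids ("save", "cancel"), the Swedish sample strings and the function name load_translations are assumptions for the example. Falling back to the source text when a target is missing is a design choice of this sketch, not something the text prescribes.

```python
import xml.etree.ElementTree as ET

# Namespace of XLIFF 1.2 documents.
NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

BUTTONS_XLIFF = """
<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="buttons" source-language="en" target-language="sv" datatype="plaintext">
    <body>
      <trans-unit id="save">
        <source>Save</source>
        <target>Spara</target>
      </trans-unit>
      <trans-unit id="cancel">
        <source>Cancel</source>
      </trans-unit>
    </body>
  </file>
</xliff>
"""


def load_translations(xliff_text):
    """Map each trans-unit id to its target text, or its source if untranslated."""
    root = ET.fromstring(xliff_text.strip())
    table = {}
    for unit in root.iterfind(".//x:trans-unit", NS):
        source = unit.find("x:source", NS)
        target = unit.find("x:target", NS)
        chosen = target if target is not None and target.text else source
        table[unit.get("id")] = chosen.text
    return table
```

A button label would then be looked up by id, so the same template works for every language that has an XLIFF file.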
Controller
Depending on the client's request and authorization, the server-side Controller decides how to handle it. If everything seems to be OK, it tells the Viewer what to serve the client. [ More to come...]
Viewer
Depending on the client, the server-side Viewer produces an output format that the client side can handle (the one it has requested). [ More to come...]
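One way a client can state which format it wants is the HTTP Accept header, so here is a minimal sketch of picking an output format from it. It is in Python for brevity; the list of supported formats, the default and the function names are assumptions for the example, and the parsing is deliberately simplified (it handles q-values and the */* wildcard but not partial wildcards such as text/*).

```python
SUPPORTED = ["application/xhtml+xml", "text/html", "application/atom+xml"]


def parse_accept(header):
    """Return (media_type, q) pairs, highest q first, header order kept on ties."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0]
        if not media_type:
            continue
        q = 1.0
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    # sorted() is stable, so entries with equal q stay in header order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def choose_format(header, supported=SUPPORTED, default="text/html"):
    """Pick the first acceptable media type the Viewer can produce."""
    for media_type, q in parse_accept(header):
        if q <= 0:
            continue
        if media_type == "*/*":
            return default
        if media_type in supported:
            return media_type
    return default
```

A client that sends no usable Accept header simply gets the default, which matches the idea of serving plain (X)HTML unless something else is asked for.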
Features
Core Engine
The core engine has the minimum set of features the CMS needs to have:
- Copying data for a first use installation and for backup (in a simple to use GUI preferably)
- Versioning data, using a version control system
- Editing site configuration with a user-friendly web interface (no alternative external application connections for the time being)
- Parsing and traversing XML data (already included)
- Editing XML (XHTML) content data in a user-friendly WYSIWYM editor (WYMeditor) with use of XSLT stylesheets (already included)
- Transforming XML data of a certain format to another format with use of XSLT stylesheets (already included)
- Serving content in the user's language (i18n)
- Pluggability for using plugins that are not part of the core (those that are optional to install or uninstall)
- Interfacing: serving up web page content as HTML/XHTML/XML/XSLT/CSS files over HTTP like a normal web server (standard)
- Interfacing: serving up XML/XSLT content as a web application server over HTTP (through plugin interface gateways such as REST and SOAP)
W3Client
This is the default client GUI of the system, loaded into the end user's web browser. It could make use of a 5-layered DOM scripting compatibility model:
| Layers | Structure | Presentation | Behaviour |
| 1 | HTML 4 | Old Style Table based Layout (+ XSLT Server side) | (PHP Server side) |
| 2 | XHTML 1.0 Strict | CSS Level 1 (+ XSLT Server side) | (PHP Server side) |
| 3 | XHTML 1.0 Strict | CSS Level 2 (+ XSLT Server side) | Javascript (+ PHP Server side) |
| 4 | XHTML 1.0 Strict | CSS Level 2 (+ XSLT Server side) | Javascript + XMLHttpRequest + jQuery (+ PHP Server side) |
| 5 | XHTML 1.0 Strict + XML | CSS Level 3 + XSLT (Client side) | Javascript + XMLHttpRequest + jQuery |
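To show how the server could pick one of the five layers for a given client, here is a small sketch. It is in Python for brevity, and the capability labels ("xhtml", "css1", "client-xslt" and so on) as well as the way capabilities are detected are assumptions; the table above only says what each layer contains. The sketch assumes a client reports every capability it has, so a CSS Level 2 browser also reports "css1".

```python
# Requirements per layer, highest layer first (labels are hypothetical).
LAYER_REQUIREMENTS = [
    (5, {"xhtml", "css3", "client-xslt", "javascript", "xmlhttprequest"}),
    (4, {"xhtml", "css2", "javascript", "xmlhttprequest"}),
    (3, {"xhtml", "css2", "javascript"}),
    (2, {"xhtml", "css1"}),
]


def pick_layer(capabilities):
    """Return the highest layer whose requirements the client fully meets."""
    caps = set(capabilities)
    for layer, required in LAYER_REQUIREMENTS:
        if required <= caps:
            return layer
    # Layer 1 (HTML 4 with table layout) needs nothing beyond plain HTML.
    return 1
```

Because the list is checked from the top, a client always gets the richest layer it can handle and degrades one step at a time.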
On the client GUI end, what is really loaded into the end user's web browser client is this:
Using WYMeditor as the content editor gives us some opportunities:
- Applying classes to XHTML elements, so the end-user can give them a meaning: date, note, address, recipe, definition, and so on.
- Writing XHTML strict
- Using schema validation on the server side (e.g. RELAX NG) together with AJAX, so the end-user writes well-structured documents
Pluggability
To allow for plugins there needs to be a well-documented Plugin API. I suggest that plugins be distributed in a form similar to Java JAR files - basically a renamed zip file with a manifest - though with a unique, meaningful file extension (*.xapp - Xml Api for Php Plugins?), its own structure and a custom XML manifest file.
Example structure plugin.xapp:
- plugin.prop - manifest file with the same name as xapp-file and the prop(erties) file extension (This will be copied to the Server Data repository when installed.)
- controller - folder for the plugin's controller engine in PHP5 (This will be copied as the plugin's controller folder on the Web Application Server when installed.)
- model - folder for meta data, XML content data, XSLT style sheets, XLIFF translations and anything else considered as default data of the plugin (This will be copied to the Server Data repository when installed.)
- viewer - folder for the plugin's custom viewer engine in PHP5 - if other than the core engine one (This will be copied as the plugin's viewer folder on the Web Application Server when installed.)
The plugin xapp files are all put in a specific folder inside the WYMsite framework. The plugin manifest RDF file inside the zipped xapp file then contains information about where to put all the files and what data to add to which files on installation. The same information is used to uninstall the plugin if the user chooses to do so later. As far as possible these xapp files shouldn't depend on each other, but if they do, those dependencies must be declared in the manifest file and added to the system meta data.
The WYMsite distribution should come with a default set of plugins that both demonstrate how the plugin feature and the Plugin API work and provide some optional but frequently used additional features for the WYMsite CMS.
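The package idea can be sketched as follows: open the xapp file as a zip archive, read the manifest, check declared dependencies and work out where each file would go. It is in Python for brevity (the real installer would be PHP), and the plugin name gallery, the manifest elements (plugin, requires), the dependency core-images and the destination labels are all assumptions invented for the example. The example archive is built in memory so nothing touches the disk.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Install targets per top-level folder, following the structure listed above.
DESTINATIONS = {
    "controller": "web-application-server",
    "viewer": "web-application-server",
    "model": "server-data-repository",
}


def build_example_xapp():
    """Create a tiny, hypothetical gallery.xapp archive in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "gallery.prop",
            '<plugin name="gallery"><requires plugin="core-images"/></plugin>',
        )
        archive.writestr("controller/gallery.php", "<?php // controller code ?>")
        archive.writestr("model/gallery.xsl", "<!-- style sheet -->")
    return buffer.getvalue()


def inspect_xapp(data, plugin_name, installed):
    """Return (missing dependencies, install plan) for an xapp archive."""
    manifest_name = plugin_name + ".prop"
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        manifest = ET.fromstring(archive.read(manifest_name))
        names = archive.namelist()
    requires = [r.get("plugin") for r in manifest.findall("requires")]
    missing = [name for name in requires if name not in installed]
    plan = {}
    for name in names:
        folder = name.split("/", 1)[0]
        if name == manifest_name:
            plan[name] = "server-data-repository"
        elif folder in DESTINATIONS:
            plan[name] = DESTINATIONS[folder]
    return missing, plan
```

Since the plan is derived from the archive contents and the manifest alone, the same data could be replayed in reverse to uninstall the plugin.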
Interface Gateways
The communication between the server and the client application - the user's web browser loaded with a page from the CMS with ECMAScript behaviour control, or another external application - then takes the form of a web service, for which two main approaches have emerged: SOAP and REST.
I like REST (REpresentational State Transfer) the most since it is the simpler of the two, using only HTTP as it was originally designed. SOAP (Simple Object Access Protocol), however, has had much broader and earlier acceptance in the corporate world. Going back to the analysis, our users are probably not so much corporate users, which is why REST might be preferable, but OTOH they might very well be accustomed to corporate software tools that they want to keep using as long as possible. My suggestion is to focus on a REST server gateway and, secondarily, to support a SOAP server gateway, alongside just serving up regular (X)HTML pages of course.
The default web browser client-side viewer, by the way, might look and function in several different ways. Older browser versions should probably be served mostly static (old-style) HTML pages, while newer browsers might get an AJAX-enhanced version, where a client-side web application interface reloads just part of the page the AJAX or AHAH way. The latter could even communicate through the REST or SOAP server gateway interface.
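To illustrate what "only using HTTP as it was originally designed" means for a REST gateway, here is a minimal sketch that maps HTTP methods on a resource URL to actions on stored content. It is in Python for brevity, and the /articles/ URL layout, the in-memory store and the function name handle are assumptions for the example, not a proposed WYMsite API.

```python
# Hypothetical in-memory content store standing in for the XML Model Data.
ARTICLES = {"1": "<article><title>Hello</title></article>"}


def handle(method, path, body=None, store=ARTICLES):
    """Return (status code, response body) for a request to /articles/<id>."""
    parts = [p for p in path.split("/") if p]
    if len(parts) != 2 or parts[0] != "articles":
        return 404, ""
    article_id = parts[1]
    if method == "GET":
        if article_id in store:
            return 200, store[article_id]
        return 404, ""
    if method == "PUT":
        # PUT creates the resource if it is new, otherwise replaces it.
        created = article_id not in store
        store[article_id] = body or ""
        return (201 if created else 200), store[article_id]
    if method == "DELETE":
        if store.pop(article_id, None) is None:
            return 404, ""
        return 204, ""
    return 405, ""
```

The URL names the resource and the HTTP method names the action, so no extra envelope format is needed, which is the main simplification compared to SOAP.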