SEEMS THAT NO TOOLS AUTOMATICALLY RESOLVE THE NAMESPACES IN THE DOCUMENTS
BY LOOKING UP THE xmlns NAMESPACES SPECIFIED IN THE *.xml DOCUMENT.
Unlike with DTDs, you must explicitly tell validators which *.rng file to use
for validation.  Catalogs are completely ignored for this purpose.

Xerces is Apache's XML processor (superior to the one that ships with Saxon).
Saxon is a Java XSLT processor (but I'll use Xalan).  NOT DOCUMENTING SAXON!!
Xalan is Apache's XSLT processor, usually used with Apache's Xerces
(available for both Java and C++).

VALIDATION
    N.b.:  MUST VALIDATE WITH .../docbookxi.rng IF USING XInclude.

    XALAN
        The Xalan (or possibly Xerces) validator seems to be missing?
            "java Validate doc.xml".

    Sun MSV  (Much better diagnostics than Jing for syntax errors)
        Try the daily Sun relames.jar snapshot distro.
        IMPORTANT:  Must extract all the jars so relames.jar will pull in the
        other jars with -jar.
        Must throw resolver.jar into the same dir to get Catalog functionality.
            java -jar path/to/relames.jar -xerces /path/to/docbook.rng input.xml
        ("-catalog x" switch is useless for xmlns file resolution, since you
        must specify the "grammar file" explicitly)

    Jing.  Has an Ant RelaxNG interface!!!  (but diagnostics suck)
        http://www.thaiopensource.com/relaxng/jing.html
        Installation and usage very similar to MSV.
            java -jar path/to/jing.jar path/to/file.rng input.xml
        http://www.thaiopensource.com/relaxng/jing-ant.html

    xmllint
            xmllint --noout --relaxng /path/to/docbook.rng input.xml
        OR
            xmllint --noout --catalogs input.xml
        (Uses /etc/xml/catalog or env value $SGML_CATALOG_FILES)
        (Useless for xmlns resolution)

xsltproc is C and much faster than the Java XSLT processors.

DocBook style sheets
    "Most" processors will only use local sheets unless the specified sheet is
    a network URL instead of a file path.
    The Linux DocBook stylesheet installer writes the XML-style catalog
    /etc/xml/catalog.  I see that there is much indirection to other catalog
    files, both in the /etc/xml dir and elsewhere.
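A minimal sketch of the kind of indirection seen in /etc/xml/catalog,
assuming the OASIS XML catalog format.  The two local catalog file names are
hypothetical, not what any particular installer writes; the URL prefix is the
usual DocBook XSL release location, but check your own catalog for the real
entries.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- Stylesheet references starting with this prefix are handed off to a
       second catalog file (hypothetical path). -->
  <delegateSystem
      systemIdStartString="http://docbook.sourceforge.net/release/xsl/"
      catalog="file:///etc/xml/docbook-xsl-catalog.xml"/>
  <delegateURI
      uriStartString="http://docbook.sourceforge.net/release/xsl/"
      catalog="file:///etc/xml/docbook-xsl-catalog.xml"/>

  <!-- Consulted only if nothing above matches (hypothetical path). -->
  <nextCatalog catalog="file:///etc/xml/local-catalog.xml"/>
</catalog>
```

With entries like these, a stylesheet given as a network URL can still end up
being read from local disk, provided the delegated catalog maps it to a file.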
[FYI, DOCTYPE defs are of the format
    <!DOCTYPE rootelement PUBLIC "PUBLIC_ID" "SYSTEM_ID">
]
Don't use relative file SYSTEM_IDs, because SAX resolves them relative to the
doc's location.

CATALOG USE ALGORITHM
    Tries replacing the SYSTEM_ID with <system>/<rewriteSystem>s first.
    Failing that, IF prefer="public", THEN try mapping the PUBLIC_ID (if any).
    Failing that (or if prefer="system"), use the specified SYSTEM_ID.

CATALOGS
    <catalog> and <group> can set prefer.  See above.  Optional.
        prefer="public" means to try public mappings if system replacement
        fails.
        prefer="system" means to try system replacement, then just fall back
        to the coded SYSTEM_ID.
    The SYSTEM SYSTEM_ID (replaces SYSTEM_ID)

    XML STYLE  (Preferred!)
        Target "uri"s are relative to the group's xml:base, if any.
        DTD Mappings (no useful key):
            <public> elements map "publicId" -> real "uri"
            <system> elements replace "systemId" -> real "uri"
            <rewriteSystem> same, but substitutes only a prefix
        SCHEMA NAMESPACE VALUES (xmlns) AND (OTHER) FILES
            <uri> elements map "name" -> "uri"
            <rewriteURI> same, but substitutes only a prefix.
            GENERALLY only for files specified in the source, like in
            processing instructions, but xsltproc will process command-line
            file paths in this way (i.e. file paths to xsltproc are not
            necessarily literal).
        NESTING
            <nextCatalog>s are only followed if no direct mappings in the
            current catalog file work.
            <delegate...>s are similar, but redirect based on prefixes
            (at least).

    RUNTIME RESOLUTION (works with both XML and SGML catalog files).
        Add XML Commons Resolver's resolver.jar to CLASSPATH.
        Add CatalogManager.properties to CLASSPATH (or set Java Sys Props).
        (Most importantly, it specifies catalog file paths, relative to the
        CatalogManager.properties file or the target doc, via the "catalogs"
        setting).
        TEST:
            java -jar path/to/resolver.jar [-d 2] -c cat.xml \
                -{npsu} "-//Example//DTD Example V1.0//EN" public
            -d:  Debugging level
            -npsu:  sets the Name|Public id|System id|Uri
            public:  A keyword:
                doctype|document|entity|notation|public|system|uri
            -c is useless for xmlns resolution
        EXAMPLE:  (but this is useless, since the DocBook RNG must always be
        specified explicitly!)
            java -jar path/to/resolver.jar -c rng-catalog.xml \
                -u http://docbook.org/ns/docbook uri

    xsltproc
        Just uses /etc/xml/catalog, or env var XML_CATALOG_FILES.

extensions/xalan.* is the jar file to put in CLASSPATH for Xalan-specific
DocBook extensions.
    MUST SET STYLESHEET PARAM "use.extensions" to 1 TO ENABLE (all).
    After use.extensions is 1, can disable individual ext. feature params.
    The most important ext. feature is "textinsert.extension".

XALAN
    Need from Xalan distro only:  xalan.jar, xml-apis.jar, xercesImpl.jar
    (My old notes say xalan.jar + serializer.jar?)
    Java 1.4 ONLY!!!!!:  YOU MUST SET SYSTEM PROPERTY java.endorsed.dirs TO
    THE DIR CONTAINING THE JARS, IN ADDITION TO HAVING THE JARS IN THE
    CLASS PATH.

    java org.apache.xalan.xslt.EnvironmentCheck

    #########################
    java [-Dorg.apache.xerces.xni.parser.XMLParserConfiguration=org.apache.xerces.parsers.XIncludeParserConfiguration******] \
        org.apache.xalan.xslt.Process \
        [-ENTITYRESOLVER org.apache.xml.resolver.tools.CatalogResolver \
         -URIRESOLVER org.apache.xml.resolver.tools.CatalogResolver] \
        -out outputfile -in xml-document -xsl stylesheet-path -param name value
    #########################

    HTML:
        java org.apache.xalan.xslt.Process -out myfile.html \
            -in myfile.xml -xsl docbook-xsl/html/docbook.xsl \
            -param use.extensions 1
    FO:
        java org.apache.xalan.xslt.Process -out myfile.fo \
            -in myfile.xml -xsl docbook-xsl/fo/docbook.xsl \
            -param use.extensions 1 -param fop1.extensions 1

    !WORKS!!!:  add this file to the classpath:
        META-INF/services/org.apache.xerces.xni.parser.XMLParserConfiguration
    containing:
        org.apache.xerces.parsers.XIncludeParserConfiguration

FOP
    Just need in class path:
        fop.jar
        jai_core.jar + jai_codec.jar  [basic .png et al. support for fop]
            (Java distro from
            http://java.sun.com/products/java-media/jai/current.html)
        fop-hyph.jar  [fop for foreign languages]
            (fop-ver-specific binary from
            http://offo.sourceforge.net/hyphenation/index.html)
        batik*jar  IFF need svg graphics

    java org.apache.fop.cli.Main \
        [options] [-fo|-xml] infile [-xsl stylesheet-path] -pdf outfile.pdf
    .fo -> pdf:
        java org.apache.fop.cli.Main -fo myfile.fo -pdf myfile.pdf
    .xml -> pdf:
        java org.apache.fop.cli.Main \
            -xml myfile.xml -xsl docbook-xsl/fo/docbook.xsl -pdf myfile.pdf
    An example also adds the commons-io, commons-logging and
    xmlgraphics-commons jars to CLASSPATH.  I don't know if or when these are
    required.

GENERAL
    NAMESPACES
        <... xmlns="...">  Defines ns for no-prefix.
            Means that this tag AND ALL non-prefixed nested tags use this ns.
        <... xmlns:prefix="...">  Defines ns for prefix:
            Means that this tag AND ALL nested tags written with prefix: use
            this ns.
        Consequences
            The xmlns*= DOES NOT NEED TO APPLY TO THE CURRENT TAG
            (or any, really)
            Can set multiple namespaces and/or the default in one tag:
                <... xmlns="..." xmlns:prefix1="..." xmlns:prefix2="...">

    EVERY DOCBOOK SOURCE FILE MUST DEFINE THE NAMESPACE.
    The normative RelaxNG schema DOES NOT SUPPORT entity declarations
    AT ALL!!!?
        ? How to use the XML Schema instead, then ?
    Old "id=" att is now "xml:id=".

    xlinking:  (Generally no-ops for non-inline elements)
        <xref linkend="..."/>                   (generated link text)
        <link linkend="...">link text</link>
        <link xlink:href="...">link text</link>
        <anyelement xlink:href="...">... elementbody...</anyelement>
        N.b. DOCS say link needs content, but it does not.

    <title>Label text</title>
        New generalized <info> comes after <title>
        (or can put <title> inside <info>).

    <alt>explanatory text</alt>
        Annotates its PARENT ELEMENT with plain text.
        (Only for IMG elements, will render as ALT in HTML).
    <annotation> is like <alt>, except:
        Contents must be elements (tags), not plain text.
        In either direction:  annotations="x y" on the target element
        (listing annotation xml:ids) or annotates="x y" on the <annotation>
        (listing target xml:ids) associates annotations with target elements.

    SOME GOOD PARAMS
        admon.graphics 1
        html.stylesheet "path.css"   (e.g. corpstyle.css)
    Can put param settings in a customization layer file.
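A minimal customization layer sketch that carries the param settings above.
Assumptions:  the import path docbook-xsl/html/docbook.xsl (same relative
path as in the Xalan examples), html.stylesheet as the name of the CSS param,
and "path.css" as a placeholder file name.  Adjust all three to the local
install.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- Pull in the stock HTML stylesheet; xsl:import must come before any
       other top-level element.  The path is an assumption. -->
  <xsl:import href="docbook-xsl/html/docbook.xsl"/>

  <!-- Numeric params can use select directly. -->
  <xsl:param name="admon.graphics" select="1"/>
  <xsl:param name="use.extensions" select="1"/>

  <!-- String params need inner quotes, otherwise the value is read as an
       XPath expression.  "path.css" is a placeholder. -->
  <xsl:param name="html.stylesheet" select="'path.css'"/>

</xsl:stylesheet>
```

Pass this file to the processor with -xsl in place of the stock docbook.xsl;
the params then no longer need to be repeated with -param on every run.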
    <xi:include href="chapterX.xml" xpointer="targetid"
                xmlns:xi="http://www.w3.org/2001/XInclude" />
    <xi:include href="file.txt" parse="text" [encoding="non-UTF8-encoding"]
                xmlns:xi="http://www.w3.org/2001/XInclude" />
    More convenient is to define this namespace prefix in the root element.

    THERE ARE NO .dtd's in a DocBook5 RelaxNG doc.
        Can use an INTERNAL DTD subset:
            <!DOCTYPE book [
            <!ENTITY company "value">
            <!ENTITY second "2">
            ...
            ]>
        Or define & immediately deref an external SYSTEM_ID:
            <!DOCTYPE book [
            <!ENTITY % myent SYSTEM "/path/to/myentities.ent">
            %myent;
            ]>

    HTML vs. XHTML.
        XHTML doesn't output a charset/encoding specification, so the
        resulting encoding is screwed up when viewing with a browser.
    OUTPUT ENCODING.
        The chunker.output.encoding param has no effect, since the DocBook
        *.xsl files specify their output encoding literally.

    IMPORTANT PRACTICES:
        Give an xml:id to every chapter and section.  This will generate
        URLs which persist as the document is changed, and will also generate
        useful chunk section file name URLs (assuming the right property is
        set).

    Can have paragraphs immediately before <section>s, but not after them.
    Processing instructions aren't just for specifying the XSLT file:
        <?dbhtml filename="intro.html"?> specifies the chunk file name to
        generate.
    I think that "recto" = front-side page, and "verso" = back-side page.
    DocBook XML source is in UTF-8 if the <?xml ...?> declaration doesn't
    specify otherwise.
    Table <entry>s do not need to nest a div-like.
    <listitem>s must nest a div-like.

    ADMONITIONS (note/caution/important/warning/tip)
        Must nest a div-like (same as <listitem>s).
        Can give a title; defaults to the name of the element (like "Note").
        PDF and HTML have opposite behavior about surrounding whitespace, but
        neither has any dependency upon the source code in this respect.
        PDF always puts one blank line at the top of the box.  Otherwise, no
        extra white space, regardless of source code.
        PDF "shading" applies only to the text box (not the title), whereas
        the HTML box includes the title.
    <remark>s, on the contrary, may contain only span items (they are wrapped
    in <em>).

    Verbatims (screen/programlisting/literallayout)
        *Important:  To avoid extra whitespace above and below, in all output
        formats, make sure there is no extra whitespace inside of the
        screen/programlisting tags.  Outside of them (like in an <example>)
        it makes no difference.
        PDF output does no folding (line wrapping).  I wonder if there is
        some setting to tell it to wrap at 80 columns or something?  That may
        not be adequate, since the available space depends on the current
        sectioning level.
        When xincluding (non-parsed) a Java properties file, make sure to
        specify the encoding with the xinclude attribute:
            encoding="ISO-8859-1"

    INDEX
        The text of the index entries is not based on normally occurring
        document text, but on the content of your <primary>, <secondary>, and
        <tertiary> elements.  I.e., the contents of your <primary>, etc.
        appear only in the generated index, not in the generated main
        content.
        <index/> at end
        <indexterm [significance="preferred"]><primary>...