Plone text formats

A beginner's guide to the various ways you can enter text in Plone. I wrote this as part of the documentation for a project and have excerpted it here for the greater good.

Plone accepts text in several markup formats, including plain text, structured text, reStructured text and HTML. These formats differ in their complexity and in the degree of control they give over how text is laid out. While reStructured text is probably the most useful of these (being much simpler than HTML, while still capable of complex markup), it is a relatively recent development and so is not available everywhere. Forms and text fields will usually state which formats they accept. The following is a quick primer on the different formats.

Plain text

Unless otherwise stated, all input is in plain (literal) text. This is rendered as is, with line breaks kept and blank lines separating paragraphs. Thus the input:

Lorem ipsum dolor sit amet, consectetuer adipiscing elit.
Morbi suscipit tortor ac neque.

Cras nunc. Curabitur malesuada ligula nec justo. Mauris
eleifend. Pellentesque habitant morbi tristique senectus et netus
et malesuada fames ac turpis egestas.

is rendered as it appears above.
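As a rough illustration of this rule, the following Python sketch turns plain text into HTML by treating blank lines as paragraph breaks and keeping single line breaks. It is not Plone's actual code, and the function name `plain_to_html` is made up for this example.

```python
import html
import re


def plain_to_html(text):
    """Convert plain text to HTML, keeping line breaks (illustrative only)."""
    # Blank lines (possibly containing spaces) separate paragraphs.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    rendered = []
    for para in paragraphs:
        # Escape HTML-special characters, then keep each line break.
        lines = [html.escape(line) for line in para.split("\n")]
        rendered.append("<p>" + "<br />\n".join(lines) + "</p>")
    return "\n".join(rendered)


sample = "Lorem ipsum dolor sit amet.\nMorbi suscipit tortor.\n\nCras nunc."
print(plain_to_html(sample))
```

Escaping with `html.escape` is what keeps characters such as `<` and `&` literal, which is the point of a plain text field.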

Structured text

Structured text (or STX) is the most common text format in Zope and Plone, using indentation and punctuation symbols to suggest more complex layout, while still being readable in its raw form.

Words flanked with asterisks (e.g. *italic*) are rendered italic. Double asterisks (e.g. **bold**) are rendered bold. Underscores (e.g. _underlined_) make an underline. Single quotes (e.g. 'literal') show as monospaced literal text.

For example:

Lorem ipsum *dolor* sit amet, **consectetuer** adipiscing elit.
Morbi 'suscipit' tortor ac neque.

is rendered as:

Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Morbi suscipit tortor ac neque.
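To make the inline rules concrete, here is a minimal Python sketch that applies them with regular expressions. It is an approximation written for this guide, not the Zope STX implementation; the names `INLINE_RULES` and `stx_inline` are hypothetical, and real STX handles many more edge cases.

```python
import re

# Each rule wraps text that starts and ends with a non-space character.
# Order matters: double asterisks must be handled before single ones.
INLINE_RULES = [
    (re.compile(r"\*\*(\S(?:.*?\S)?)\*\*"), r"<strong>\1</strong>"),
    (re.compile(r"\*(\S(?:.*?\S)?)\*"), r"<em>\1</em>"),
    # Underscores and quotes inside words (my_pic.jpg, don't) are left alone.
    (re.compile(r"(?<!\w)_(\S(?:.*?\S)?)_(?!\w)"), r"<u>\1</u>"),
    (re.compile(r"(?<!\w)'(\S(?:.*?\S)?)'(?!\w)"), r"<code>\1</code>"),
]


def stx_inline(text):
    """Apply the four inline rules to one line of text (illustrative only)."""
    for pattern, replacement in INLINE_RULES:
        text = pattern.sub(replacement, text)
    return text


print(stx_inline("Lorem ipsum *dolor* sit amet, **consectetuer** adipiscing elit."))
print(stx_inline("Morbi 'suscipit' tortor ac neque."))
```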

Paragraphs are separated by blank lines. However, paragraph indentation is used to indicate titles and sections, which can be confusing. Basically, a paragraph of a single line, followed by a paragraph that is indented further to the right, is treated as the title of the indented section. If there is another paragraph below, indented even further, this is treated as a subsection. The indentation can apply to the first line or to all lines of the text; it is the depth of the first line that is used to work out titles and sections. An example may make this clearer:

Title

   First Section title

      A paragraph of some text to go within this section, and some
      filler text.

   Second Section title

      First subsection title

         A paragraph of some text to go within this section, and some
         filler text.

Which could also be written as:

Title

   First Section title

      A paragraph of some text to go within this section, and some
filler text to pad this paragraph out

   Second Section title

      First subsection title

         A paragraph of some text to go within this section, and some
filler text.

Sometimes it may be necessary to artificially break a short chunk of text across lines so that STX recognises it as a paragraph rather than a title:

I'm a paragraph,
not a title!
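The title rule described above can be sketched in Python as follows. This is only a model of the rule as stated in this guide (single-line paragraph followed by a more deeply indented one), not the real STX parser, and `classify_paragraphs` is a name invented for the example.

```python
import re


def classify_paragraphs(text):
    """Label each paragraph 'title' or 'paragraph' and report its depth."""
    # Split on blank lines, dropping any empty chunks.
    chunks = [c for c in re.split(r"\n\s*\n", text.strip("\n")) if c.strip()]
    # Only the indentation of each paragraph's first line counts.
    depths = [len(c) - len(c.lstrip(" ")) for c in chunks]
    result = []
    for i, chunk in enumerate(chunks):
        single_line = "\n" not in chunk.strip("\n")
        followed_by_deeper = i + 1 < len(chunks) and depths[i + 1] > depths[i]
        kind = "title" if single_line and followed_by_deeper else "paragraph"
        result.append((kind, depths[i], " ".join(chunk.split())))
    return result


sample = (
    "Title\n\n"
    "   First Section title\n\n"
    "      A paragraph of some text,\nand some filler text.\n\n"
    "I'm a paragraph,\nnot a title!\n\n"
    "   Trailing text"
)
for kind, depth, words in classify_paragraphs(sample):
    print(kind, depth, words)
```

Note how the two-line "I'm a paragraph" chunk stays a paragraph even though more deeply indented text follows it.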

STX allows three types of lists -- bulleted (unordered), enumerated (ordered) and definitions (like a dictionary or glossary). A paragraph that begins with a -, *, or o and then a space is treated as an item in a bulleted list. A paragraph that instead begins with a number (followed by a space, or by a period and a space) is treated as an item in an enumerated list. A paragraph whose first line contains some text followed by some space and then -- is treated as an item in a definition list. So:

* First bullet item

* Second bullet item

1. First enumerated item

2. Second enumerated item

One -- First definition

Two -- Second definition

appears as:

  • First bullet item
  • Second bullet item
  1. First enumerated item
  2. Second enumerated item
One
First definition
Two
Second definition
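The three list rules can be approximated with a few regular expressions. The Python below is a sketch under the assumptions stated in this guide, not the Zope implementation; `list_item_kind` and the pattern names are hypothetical.

```python
import re

# A bullet is -, * or a lowercase o, followed by whitespace.
BULLET = re.compile(r"^\s*[-*o]\s+")
# An enumerated item is a number, an optional period, then whitespace.
ENUMERATED = re.compile(r"^\s*\d+\.?\s+")
# A definition is some text, whitespace, a double dash, then whitespace.
DEFINITION = re.compile(r"^\s*\S.*?\s+--\s+")


def list_item_kind(paragraph):
    """Classify a paragraph by looking only at its first line."""
    first_line = paragraph.split("\n", 1)[0]
    if BULLET.match(first_line):
        return "bullet"
    if ENUMERATED.match(first_line):
        return "enumerated"
    if DEFINITION.match(first_line):
        return "definition"
    return "paragraph"


for item in ["* First bullet item", "1. First enumerated item",
             "One -- First definition", "Just a sentence."]:
    print(list_item_kind(item), "<-", item)
```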

Hyperlinks are given by text enclosed by double quotes followed by a colon, a URL, and concluded by punctuation plus space or just space. So "Pointer to Google":http://www.google.com makes a link like this Pointer to Google. Intra-site hyperlinks don't require the protocol prefix http://.

Images can be included in a similar way, with the protocol prefix being img:. For example "My image":img:my_pic.jpg will insert the picture my_pic.jpg into the current webpage.
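A small Python sketch of the link and image notation follows. It assumes the target runs up to the next whitespace and that trailing punctuation is not part of it; `stx_links` is an invented name and this is not how Zope itself parses links.

```python
import html
import re

# "label":target, where the target ends before trailing punctuation or space.
LINK = re.compile(r'"([^"]+)":(\S*[^\s.,;:!?])')


def stx_links(text):
    """Turn "label":url and "label":img:file into HTML (illustrative only)."""
    def replace(match):
        label, target = match.group(1), match.group(2)
        if target.startswith("img:"):
            src = html.escape(target[len("img:"):], quote=True)
            alt = html.escape(label, quote=True)
            return '<img src="%s" alt="%s" />' % (src, alt)
        href = html.escape(target, quote=True)
        return '<a href="%s">%s</a>' % (href, html.escape(label))
    return LINK.sub(replace, text)


print(stx_links('"Pointer to Google":http://www.google.com makes a link'))
print(stx_links('"My image":img:my_pic.jpg'))
```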

Text enclosed in brackets which consists only of letters, digits, underscores and dashes is treated as a hyperlink within the document, and can be used to provide footnotes. For example, [a12] appears as [a12]. The destination for these links can be written on a line by itself like .. [a12] "Effective Techniques" by Jane Doe.

For example:

This is "a pointer to another site":http://www.google.com while this
is a pointer to "somewhere on this site":otherdoc.txt. This is a
pointer to a footnote [a13]. This is a "picture of mine":img:my_pic.jpg

.. [a13] This is the actual footnote with a backreference to where the footnote
   came from

is rendered as:

This is a pointer to another site while this is a pointer to somewhere on this site. This is a pointer to a footnote [a13]. This is a picture of mine

mypic.jpg
[a13] This is the actual footnote with a backreference to where the footnote came from

If a paragraph ends in the word example or examples, or :: the following indented paragraph is treated as literal text and output as is, just like plain text.

An obscure but useful feature of structured text is that you can include HTML commands directly within it. For example, instead of using the primitive image include directives in STX (e.g. "My image":img:my_pic.jpg), you can use the more powerful HTML img command with all its options (e.g. <IMG SRC="my_pic.jpg" class="fancyframe" align="right">). This also proves useful if STX objects to foreign characters or symbols in the text.

Many STX tutorials exist on the web [1], although readers should be aware that there are a number of dialects and what applies in one may not work in Zope. It's a fine format ... as long as documents don't get too complicated. In practice, once you get text with 3 or 4 nested levels of headings, it becomes difficult to keep track of the correct depth of indentation. Add in lists, preformatted chunks of text and the vagueness of some STX documentation, and it can become impossible to render a document correctly. It also appears that there may be a few bugs in the Zope STX implementation [2] [3] [4]. reStructured text - if available - is far easier to write and troubleshoot, and is also more powerful. The moral: simple structured text for simple documents.

reStructured text

"reST" is a more recent evolution of structured text. Generally, it is easier to write, more standard, more powerful with more informative error messages. In fact, it's what this manual is written in. However, as said above, it is still spreading throughout the Zope world and not all Products have updated to use it.

Paragraphs and headings are simple. A paragraph, like STX, is a chunk of text flanked by blank lines. Unlike STX, paragraphs all have the same indentation and line up at their left edge. Anything that is indented is seen as a quote.

Titles (and as a result sections) are indicated by underlining and overlining. That is, a title is a single line of text adorned with an underline (and perhaps an overline) at least as long as the title, constructed from punctuation characters. The first style of underlining is treated as a top-level title, the second as a section title, the third as a sub-section title and so on. So:

==========
Main Title
==========

Section Title
-------------

Subsection Title
~~~~~~~~~~~~~~~~

A paragraph of some text to go within this subsection, and some
filler text to pad this paragraph out.

Section Title 2
---------------

A paragraph of some text to go within this subsection, and some
filler text to pad this paragraph out.

   An indented quote!

You can use basically any non-alphanumeric character to indicate titles; it's the order in which the styles are encountered that dictates the level they are interpreted as. Thus the following would be equivalent to the above layout:

Main Title
**********

Section Title
+++++++++++++

^^^^^^^^^^^^^^^^
Subsection Title
^^^^^^^^^^^^^^^^

A paragraph of some text to go within this subsection, and some
filler text to pad this paragraph out.

Section Title 2
+++++++++++++++

A paragraph of some text to go within this subsection, and some
filler text to pad this paragraph out.

   An indented quote!

Note that reST will complain if a section is empty (i.e. if two titles of the same level follow each other).
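The "order of first appearance" rule can be demonstrated with a short Python sketch. It is a simplified model written for this guide and not the Docutils parser: it ignores transitions, tables and other constructs, and the names `title_levels` and `is_adornment` are made up.

```python
import re

# A line made of one punctuation character repeated, e.g. "=====" or "~~~~".
ADORNMENT = re.compile(r"^([^\w\s])\1*$")


def is_adornment(line):
    return ADORNMENT.match(line) is not None


def title_levels(text):
    """Return (level, title) pairs; level 1 is the first style encountered."""
    lines = text.split("\n") + ["", ""]  # padding simplifies the look-ahead
    styles = []  # adornment styles, in order of first appearance
    titles = []
    i = 0
    while i < len(lines) - 2:
        line, nxt, after = lines[i], lines[i + 1], lines[i + 2]
        line_is_text = bool(line.strip()) and not is_adornment(line)
        nxt_is_text = bool(nxt.strip()) and not is_adornment(nxt)
        if is_adornment(line) and nxt_is_text and after == line:
            # Overline, title, matching underline.
            style, title, step = (line[0], "over+under"), nxt.strip(), 3
        elif line_is_text and is_adornment(nxt) and len(nxt) >= len(line):
            # Title followed by an underline at least as long as the title.
            style, title, step = (nxt[0], "under"), line, 2
        else:
            i += 1
            continue
        if style not in styles:
            styles.append(style)
        titles.append((styles.index(style) + 1, title))
        i += step
    return titles


first = ("=====\nMain\n=====\n\nSection\n-------\n\nSub\n~~~\n\nText.\n\n"
         "Section 2\n---------\n\nText.")
second = ("Main\n****\n\nSection\n+++++++\n\n^^^\nSub\n^^^\n\nText.\n\n"
          "Section 2\n+++++++++\n\nText.")
print(title_levels(first))
print(title_levels(second))
```

Both samples yield the same sequence of levels (1, 2, 3, 2) even though they use entirely different adornment characters.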

Much like STX, text can be marked as *italics* for italics, **bold** for bold and ``literal`` for literal (monospaced) text. Regrettably there is no markup for underlining. Of course, this may all seem to present a problem if you want to use some of these characters, but reST is fairly good at figuring out when something isn't markup (e.g. "3 * 4" or "don`t").

reST provides the same three list types as STX. Bulleted list items start their line with *, + or -. Enumerated list items start with a number or letter followed by a period ., a right bracket ) or surrounded by brackets ( ). Definition list items consist of a term followed immediately by an indented paragraph. Note that list items can be long paragraphs, as long as all text aligns with the first line of text. Lists can also be nested, with additional indentation, to make sublists. Lists must always start and end with blank lines to be recognised. So:

* First bullet item and some filler text to demonstrate how wrapping
  can take place.

* Second bullet item

   1. First enumerated item (nested)

   2. Second enumerated item (nested)

One
   First definition, and some filler text to show how a very long
   definition should be written.

Two
   Second definition

renders as:

  • First bullet item and some filler text to demonstrate how wrapping can take place.

  • Second bullet item

    1. First enumerated item (nested)
    2. Second enumerated item (nested)
One
First definition, and some filler text to show how a very long definition should be written.
Two
Second definition

As with STX, ending a paragraph with :: leads to the following indented text being presented literally. URLs written out plainly, with or without surrounding angle brackets, are turned into hyperlinks. Otherwise a URL can be written as `DESC <http://www.example.com/>`_ where DESC is the visible text for the given link:

http://www.python.org

`Python  <http://www.python.org/>`__

which renders as:

http://www.python.org

Python

Footnote-style internal hyperlinks are written with square brackets, like STX but with a trailing underscore:

Many studies have confirmed this point [CIT2002]_.

.. [CIT2002] Jane Doe. (2002) "An important study".

which renders as:

Many studies have confirmed this point [CIT2002].

[CIT2002] Jane Doe. (2002) "An important study".

Images are included with a footnote-like directive:

.. image:: images/ball1.gif

HTML

Hypertext Markup Language is the actual notation used for webpages. When a Plone document written in structured or reStructured text is visited, the text is translated into HTML for presentation. Unfortunately, HTML is verbose, tedious to write and easy to make mistakes in. For the casual user it is best avoided. However, it may be useful to know a few simple pieces of HTML, especially where structured text fails.

There are HTML tutorials on the web, so this is only a quick overview. HTML uses explicit markup called "tags" to identify the layout of text. Tags are surrounded by angle brackets and usually occur in pairs that delimit text:

<P> This is a paragraph. It has <B>bold</B> and <I>italic</I> text. </P>

which renders as:

This is a paragraph. It has bold and italic text.

Note that the closing tags start with a forward-slash /. Tags are case insensitive (e.g. <P> is the same as <p>). Extra whitespace and indentation are ignored.
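The case insensitivity of tags is easy to check with the HTML parser in Python's standard library, which reports every tag name in lower case. The class and function names below (`TagCollector`, `start_tags`) are invented for this sketch.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collect the names of opening tags in the order they appear."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # html.parser passes the tag name already converted to lower case.
        self.tags.append(tag)


def start_tags(markup):
    parser = TagCollector()
    parser.feed(markup)
    parser.close()
    return parser.tags


markup = "<P> This is a paragraph. It has <B>bold</B> and <i>italic</i> text. </P>"
print(start_tags(markup))  # prints ['p', 'b', 'i']
```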

Headings are given as <H#> where # is the level of heading from 1 to 6:

<H1>Top level heading</H1>

<h2>2nd level heading</h2>

<H3>3rd level heading</h3>

Hyperlinks are written like this:

visit <A HREF="http://www.python.org">here</A> for more info

to render like this:

visit here for more info

Images may be included like this:

Have a look at this! <IMG SRC="images/my_pic.jpg">

References

[1] Paul Everitt. An introduction to structured text. At Zope.org <http://www.zope.org/Documentation/Articles/STX>.
[2] Well understood problems in STX. At Zwiki.org <http://zwiki.org/WellUnderstoodProblems>.
[3] Text Formatting Surprises in STX. At Zwiki.org <http://zwiki.org/TextFormattingSurprises>.
[4] Tips & tricks: Structured text. At Agapow.net <http://www.agapow.net/programming/plone-zope/tips>.
[5] reStructured text. At the Docutils Sourceforge project <http://docutils.sourceforge.net/rst.html>.
[6] Quick reStructured text. At the Docutils Sourceforge project <http://docutils.sourceforge.net/docs/user/rst/quickref.html>.
[7] The reStructuredText cheat sheet. At the Docutils Sourceforge project <http://docutils.sourceforge.net/docs/user/rst/cheatsheet.txt>.

If there are any errors below this point, they have been generated by missing links in the examples above.
