1 files changed, 199 insertions, 0 deletions
diff --git a/conduits/docconduit/bmkSpecification.txt b/conduits/docconduit/bmkSpecification.txt
new file mode 100644
index 0000000..f8a68d9
--- /dev/null
+++ b/conduits/docconduit/bmkSpecification.txt
@@ -0,0 +1,199 @@
+KPilot PalmDoc Conduit bookmark Specification
+=============================================
+
+(c) 2003 Reinhold Kainhofer, reinhold@kainhofer.com
+
+This document is licensed under the FDL (Free Documentation License) 
+as published by the FSF. Any version of the FDL can be applied 
+at your convenience.
+
+
+
+
+The PalmDoc conduit has three ways to indicate bookmarks for a text:
+  -) Inline tags of the form <* bookmarkname *>
+	-) Endtags of the form <bookmarkname> at the end of the document
+	-) Regular expressions in a separate textname.bmk file 
+	   (textname.bmk ist the filename of the text with the .txt replaced by .bmk)
+		 
+
+In the design of the .bmk file, I tried to stay close to the 
+syntac of MakeDocJ bookmark files, but it turned out that I 
+needed to extend the syntax a little. Also, MakeDocJ uses Java 
+RegExps, while the PalmDoc conduit uses the QRegExp, which have
+some slight differences (especially concerning the ^ and $ 
+patterns as well as backreferences). So if you used MakeDocJ, 
+the .bmk file syntax will be quite familiar, but you will still
+have to adapt your bookmark files for Qt regular expressions 
+instead of Java regular expressions
+
+ 
+		 
+1) INLINE TAGS
+
+Whenever a tag of the form <* someText *> appears in the text, 
+this sequence is removed from the text, and a bookmark is set 
+there with the bookmark name "someText" (the part between the 
+<* and the *>).
+
+
+2) ENDTAGS
+
+If the text ends with tags of the form <someText>, the string 
+in braces is used as bookmark name, and wherever it appears in 
+the text, a bookmark is set. 
+After the > any number of whitespace is allowed, but no other 
+characters like letters, numbers, or punctuation. Also, inside 
+the braces no line break must occur. The conduit searches the 
+text from the end and if it finds a line break inside a <...> 
+sequence, the tag and everything before it is assumed to belong 
+to the text and doesn't form a bookmark tag. 
+Between endtags any number of whitespace (spaces, tabs, line 
+feeds etc.) is allowed.
+
+As an example, assume you have a text ending in:
+... the bad guy was punished, and they lived happily 
+ever after! 
+<Tag with
+line feed>
+       <bad guy> <princess>
+<married>
+
+The conduit starts at the end, ignores all whitespace between 
+the tags, so it finds the tags "married", "princess", and "bad guy". 
+The "Tag with line feed" has a line feed, so it is assumed to belong 
+to the text. 
+Assume now you have a text ending in:
+... the bad guy was punished, and they lived happily 
+ever after! 
+<bad guy> The End <princess>
+<married>
+
+Here, only "married" and "princess" are found as bookmarks. Because 
+of the letters before the "princess" tags, the search for the 
+bookmarks ends at the letter "d" of "The End" (the conduit starts 
+from the end and moves backward until it finds some text which 
+cannot be seen as a endtag.
+
+
+
+
+3) REGULAR EXPRESSIONS IN A SEPARATE FILE
+
+This is by far the most complex way to specify bookmarks, but 
+it is also the mose powerful. 
+If you have a text with filename "My fairy tale.txt", the 
+bookmarks will be specified in a file called "My fairy tale.bmk" 
+(just the text filename with the .txt replaced by .bmk). This 
+file contains the bookmark definitions, one in each line. Lines 
+starting with a # are seen as comments, and empty lines are also 
+ignored.
+
+
+In the .bmk file, each bookmark line has one of the following syntaces 
+(I will explain all fields later on). Fields in [..] are optional:
+
+bmkName
+bmkPosition, bmkName
++, bmkPatternRegExp[, bmkNameAsString[, firstIncludedBmk[, lastIncludedBmk]]]
++, bmkPatternRegExp[, bmkNameIndexOfSubexpression[, firstIncludedBmk[, lastIncludedBmk]]]
+-, bmkPatternRegExp[, bmkNameAsString]
+-, bmkPatternRegExp[, bmkNameIndexOfSubexpression]
+
+  If the first field is a string, it is used as the bookmark name 
+and pattern to search for. 
+  If the first field is a number, it means the position of the 
+bookmark, and the second field is the name of the bookmark.
+  If the first field is either + or -, the second field gives 
+a regular expression that is used to find the position of the 
+bookmark. If the first field is a -, the search is done only 
+once and only the first match will be added as bookmark. If 
+the first field is a +, the search is done until the regular 
+expression can no longer be found (the fourth and fifth fields 
+can be used to include only a certain range of hits). If there 
+is a third field, and it is a string, it gives the name of the 
+bookmark as a regular expression (i.e. \1 are replaced by the 
+first subexpression of the search, where subexpressions are 
+specified by round brackets in the regexp of the second field). 
+If there is a third field, and it is a number, it gives the index 
+of the subexpression of bmkPatternRegExp that is used as the 
+bookmark name.
+If there is no third field, the whole matched text will be used 
+as bookmark name.
+The optional fourth and fifth fields can be used to set bookmarks 
+only after the first few ocurrences of the regexp in the text, and 
+to stop the search after the expression has been found a certain 
+number of times.
+
+
+
+If the PDB->PC sync is set up to store the bookmarks in a bookmark file, 
+it will create a file "My fairy tale.bm" (no "k") with entries of the form
+position,bmkName
+The .bmk file will be used if it exists, but if no .bmk file exists, the .bm file
+will be used. This way you can override the bookmark settings, while 
+at the same time the PDB->TXT sync does not destroy your possibly 
+existing .bmk file.
+
+
+
+Examples:
+
+1) Imagine you have a line like:
+frog princess
+In this case, the text is searched for "frog princess", and a 
+bookmark is set whenever "frog princess" occurs in the text. 
+The name of each of these bookmarks will be "frog princess".
+
+2) A bookmark line:
+55, Bookmark at offset 55
+Here, a bookmark will be set at offset 55 (55th character of 
+the text), and it will have the name "Bookmark at offs" (truncated 
+to 16 characters)
+
+3) A bookmark line
+-,Chapter \d+
+causes a bookmark to be set at the first ocurrence of "Chapter XXX", 
+where XXX denotes one or more digits. The bookmark name will be 
+"Chapter XXX" (XXX replaced by the actual digits).
+
+4) A bookmark line
++,Chapter \d+
+causes bookmarks to be set wherever "Chapter XXX" (XXX being one 
+or more digits) appears in the text. The bookmark name will again 
+be "Chapter XXX", but the search does not stop after the first hit.
+
+5) A bookmark line
++,\n\s*(Chapter \d+)\D+, 1
+causes a bookmark to be set whenever a new line starts with 
+"Chapter XXX" (whitespace is allowed before the "Chapter"), and 
+uses the first subexpression in (..) as the bookmark name. If you 
+have a passage 
+     Chapter 15: here it starts
+The regular expression will match, so a bookmark will be set there 
+and the subexpression "Chapter 15" (which matches the (Chapter \d+) ) 
+will be used as bookmark text.
+
+6) A bookmark line
++,\n\s*Part (\d+),\1\. part
+sets a bookmark whenever a line starts with "Part XXX". The XXX 
+will be stored as the first matched subexpression. The third field 
+"\1\. part" is the regular expression for the bookmark name, where 
+\1 is replaced by the first matched subexpression of the search (XXX 
+in this case). So if a line starts with "   Part 17: ", the bookmark 
+name will be "17. part".
+
+7) A bookmark line
++,Table (\d+): ,\1\. Tabelle,5,25
+will match whenever "Table XXX: " appears in the text, and the bookmark 
+name will be "XXX. Tabelle". However, the fourth field means that the 
+first four hits are ignored (the 5th hit is the first hit to be included 
+as a bookmark), and the fifth field means that all further hits after the 
+25th will be ignored, too.
+
+8) In law texts, I use a regular expression
++,\n *(�\.? *\d+[a-z]?\.?) +, 1
+to search for all paragraphs starting like "�. 15. " or "  �23 ", and set
+a bookmark there using only the part from the � to the last digit or the 
+full stop after the last digit (the pattern between the (), in our two 
+cases the bookmark names will be "�. 15." and "�23" ).