Non-ASCII characters in the form or in the command file. - The
default alphabet is "'" (apostrophe), the lower- and uppercase letters "A a .. Z z" and the digits "0 .. 9". The
default punctuation is ", ; : . ! ?". Parentheses "( )" are declared as padding
characters, i.e. they are left in the contexts but stripped from the keywords. "-" (minus) is the
default diacritic: it forms part of a word only when joined to alphabet letters.
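As an illustration only, the defaults above can be sketched in Python. The names ALPHABET, PUNCTUATION, PADDING, DIACRITIC and keywords are hypothetical, not OCPR identifiers, and the sketch assumes that "joined to alphabet letters" means a letter on both sides of the "-".

```python
import string

# Default character classes as described in the text (illustrative names).
ALPHABET = set(string.ascii_letters + string.digits + "'")
PUNCTUATION = set(",;:.!?")
PADDING = set("()")
DIACRITIC = "-"


def keywords(text):
    """Split text into keywords using the default character classes."""
    words, current = [], []

    def flush():
        # Close the keyword being built, if any.
        if current:
            words.append("".join(current))
            current.clear()

    for i, ch in enumerate(text):
        if ch in ALPHABET:
            current.append(ch)
        elif ch in PADDING:
            # Padding is dropped from keywords without ending the word.
            continue
        elif (ch == DIACRITIC and 0 < i < len(text) - 1
              and text[i - 1] in ALPHABET and text[i + 1] in ALPHABET):
            # Assumption: "-" is word-forming only between two alphabet letters.
            current.append(ch)
        else:
            # Punctuation, blanks and anything else end the current keyword.
            flush()
    flush()
    return words


print(keywords("A well-known (old) story - isn't it?"))
```

Under these assumptions the isolated "-" is dropped, "well-known" stays one keyword, and "(old)" yields the keyword "old".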
For best results, the encoding of the data typed
in the string fields of the form or in the command file
should always be the same as the encoding of the source text to process
(Macintosh, Windows, UTF-8, UTF-16); the default is UTF-8.
Whichever way you choose to operate, if you are sending OCPR any
non-ASCII characters, e.g. characters with diacritics (anything beyond "A-Z a-z 0-9"), then
remember to set your browser's encoding to the same
encoding as the text.
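If a command file was saved in a different encoding from the source text, it can be converted before sending. The following is a minimal sketch, assuming that "Macintosh" corresponds to Python's "mac_roman" codec and "Windows" to "cp1252"; the text does not name the exact code pages, and the function name reencode is hypothetical.

```python
def reencode(data: bytes, from_encoding: str, to_encoding: str = "utf-8") -> bytes:
    """Decode bytes from one encoding and encode them in another."""
    return data.decode(from_encoding).encode(to_encoding)


# Assumed codec names; check them against the encoding of your own files.
windows_bytes = "café".encode("cp1252")
mac_bytes = "café".encode("mac_roman")

# Both convert to the same UTF-8 byte sequence.
print(reencode(windows_bytes, "cp1252") == reencode(mac_bytes, "mac_roman"))
```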
Escaping significant characters. - Some characters (
space " ' = . : <
) have a special meaning in the command language of the program.
To use any of them as a literal in the form or in
the command file, escape it with the escape character, e.g. by writing
:' or :: or :".
The default escape character is the colon ":", but you can set a different one of your choice
in the input field below.
The current escape character is
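The escaping rule can be sketched as a small helper. This is not part of OCPR; the name escape is hypothetical, and the sketch assumes that the escape character itself must also be escaped when a different one is chosen.

```python
# Significant characters listed in the text: space " ' = . : <
SIGNIFICANT = set(" \"'=.:<")


def escape(literal, escape_char=":"):
    """Prefix every significant character in literal with escape_char."""
    # Assumption: a custom escape character is itself significant.
    special = SIGNIFICANT | {escape_char}
    return "".join(escape_char + ch if ch in special else ch for ch in literal)


print(escape("it's"))        # it:'s
print(escape("a:b"))         # a::b
print(escape("x=1", "\\"))   # x\=1
```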
Tabs in the command file. - If your command file contains tabs,
e.g. to indent the statements, the program replaces each of them with 4 blanks,
so that error markers are positioned correctly when syntax errors are detected. If your
text processor uses a different tab setting, you can get a better display by
changing this value in the input field below. A tab has no meaning in the text
file to be processed, because it is read as a single blank.
The current tab value is
blanks.
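To see why the tab value matters for error markers, here is a minimal sketch. It assumes each tab is replaced by a fixed number of blanks, as the text describes; the functions expand_tabs and mark_error and the sample statement are made up for illustration.

```python
def expand_tabs(line, tab_value=4):
    """Replace every tab with tab_value blanks (no tab stops)."""
    return line.replace("\t", " " * tab_value)


def mark_error(line, column, tab_value=4):
    """Show line with a '^' under the character at 0-based index column."""
    shown = expand_tabs(line, tab_value)
    # The marker offset is the expanded width of everything before the error.
    offset = len(expand_tabs(line[:column], tab_value))
    return shown + "\n" + " " * offset + "^"


# A tab-indented sample statement with the marker under "=".
print(mark_error("\tSelect = x", 8))
```

If the editor displays tabs with a different width, the marker would appear shifted, which is why the value can be changed.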
Current limits. - Maximum text file size is 800 KB.
If needed, you may segment your data and send up to 3 text files,
naming them <filename>1, <filename>2, <filename>3. They will be
joined internally into a single stream to process. Be careful not to mix file
types and encodings in the stream.
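Segmenting a large text can be sketched as follows. The sketch assumes that "800 KB" means 800 * 1024 bytes and that the text is UTF-8 or a single-byte encoding (splitting UTF-16 at byte level would need extra care); the function name segment is hypothetical.

```python
MAX_BYTES = 800 * 1024  # assumption: 800 KB = 800 * 1024 bytes
MAX_FILES = 3


def segment(data: bytes, base: str, max_bytes=MAX_BYTES, max_files=MAX_FILES):
    """Split data at line boundaries into parts named base1, base2, ..."""
    parts, current, size = [], [], 0
    for line in data.splitlines(keepends=True):
        if len(line) > max_bytes:
            raise ValueError("a single line exceeds the size limit")
        if current and size + len(line) > max_bytes:
            # Current part is full: close it and start a new one.
            parts.append(b"".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        parts.append(b"".join(current))
    if len(parts) > max_files:
        raise ValueError("text does not fit in the allowed number of files")
    return {f"{base}{i}": part for i, part in enumerate(parts, start=1)}


# Tiny demonstration with an artificially small limit.
print(segment(b"aa\nbb\ncc\n", "text", max_bytes=6))
```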
RTF files will lose all their tags, and only the intrinsic L
category (i.e. line numbers) will be available to select or to print. XML-TEI files will also lose
all their tags, and only the intrinsic L category (i.e. line numbers) and the P category (i.e. page
numbers) will be available to select or to print.
Furthermore, your text file(s) should contain at most 500 different characters
and at most 290 different letters. Depending on the size of the text file and
on the tasks you have requested, some processing may take a very long time, up to
30 minutes, mainly in the sorting stage, when accessing the hard disk.
Be patient and you will be rewarded!
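The character limits can be checked before sending a file. This is a rough sketch only: it assumes that a "letter" is whatever Python's str.isalpha accepts, which may differ from how OCPR counts letters, and the name count_distinct is hypothetical.

```python
def count_distinct(text):
    """Return (number of different characters, number of different letters)."""
    distinct = set(text)
    # Assumption: str.isalpha approximates the program's notion of a letter.
    letters = {ch for ch in distinct if ch.isalpha()}
    return len(distinct), len(letters)


def within_limits(text, max_chars=500, max_letters=290):
    """Check the text against the limits stated above."""
    chars, letters = count_distinct(text)
    return chars <= max_chars and letters <= max_letters


print(count_distinct("abc, ABC 123"))
print(within_limits("abc, ABC 123"))
```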
Input
Comments
letter
[
To letter
]
Text
[
From n
]
To n
Newline letter
--
Hyphen letter
--
Continue letter
Starting
References
Fixed n
[
To n
]
= c
Starting string
= c
Cocoa
[
letter
[
To letter
]]
On c
Set c
= string
Select
Or ...
n
[
To n
]
c
= string
Between
string
[
To string
]
Starting
string
At end of c
Words
Alphabet
Base
string
Punctuation string
Diacritics string
Compress string
Ignore string
Padding string
Maximum Wordn
Letters
Action
Do
with Stats
Stop
Before Sort
After Sort
Keys sorted by...
+
+
Contexts sorted by...
Maximum context
n
[
[
n
]
Complete c
]
--
[
To alphabetstring
]
References
NB. assigned order is relevant here!
c
= n
[
With string
]
Pick
Or ...
Not as alphabet
wordstring
Headwords
word
= wordstring
Starting Letter Range
[
string
]
To string
Collocates
wordnword
[
]
n
[
To n
]
Sample of n
Words
Format
Context
Size n
Aligned
Indent n
Complete
Print newline string
With key marker left string
With key marker right string
Headwords
Print as
Headwords
Same line
Cycle
Frequency
[
With Relative
[
of
]
]
Layout
Columns n
--
Width n
--
Length n
Gap n
--
Depth n
--
No Pages
[
]
Margin n
Lines n
Below
Print
Except
alphabetstring
Unseen
alphabetstring
Use
string
As
alphabetstring
References
References
To n
[
string
]
GT n
As Index
Between string
Titles
Title string hor.
[
vert. n
]
Lines Above n
--
Lines Below n
Title string hor.
[
vert. n
]
Lines Above n
--
Lines Below n
Page
[
label
]
n hor.
[
vert. n
]
Lines Above n
--
Lines Below n
Headwords
hor.
[
vert. n
]
Lines Above n
--
Lines Below n