To be small, fast and easy to read.
How does it work?
A CodeColor language definition contains a number of states.
Each state may have a number of rules, with patterns written as regular expressions (RegExp).
CodeColor creates one combined RegExp for each state:
/(rule1_pattern)|(rule2_pattern)|...|(rulen_pattern)/m
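The parsing algorithm relies on the fact that a RegExp match returns the full match together with all partial matches (one per capturing group).
Since in this case only one partial match will be non-empty, I use its position to find the matching
rule and apply its style. This approach requires every group inside a pattern definition to be written in the non-capturing form "(?:..)";
a plain capturing group "(..)" would shift the group numbering, and the engine would not work.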
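The idea above can be sketched in a few lines of JavaScript. This is a minimal illustration rather than the actual CodeColor source: the rule names, the patterns, and the helpers buildStateRegExp and tokenize are assumptions made up for the example, and the "g" flag is added only so that exec() can step through the text.

```javascript
// Hypothetical rules for one state. Inner groups use (?:..) so that
// capturing group k always belongs to rule k.
const rules = [
  { name: "comment", pattern: /\/\/.*$/ },
  { name: "string", pattern: /"(?:\\.|[^"\\\n])*"/ },
  { name: "number", pattern: /\b\d+(?:\.\d+)?\b/ },
];

// Build /(rule1)|(rule2)|...|(rulen)/gm from the rule patterns.
function buildStateRegExp(rules) {
  const source = rules.map((r) => "(" + r.pattern.source + ")").join("|");
  return new RegExp(source, "gm");
}

// Return the matched fragments together with the name of the rule that matched.
function tokenize(text, rules) {
  const re = buildStateRegExp(rules);
  const tokens = [];
  let match;
  while ((match = re.exec(text)) !== null) {
    if (match[0] === "") {
      re.lastIndex++; // guard against zero-length matches looping forever
      continue;
    }
    // match[1..n] are the partial matches; exactly one of them is defined.
    const index = match.slice(1).findIndex((g) => g !== undefined);
    tokens.push({ rule: rules[index].name, text: match[0] });
  }
  return tokens;
}

const demo = tokenize('x = 12.5; // note "q"\ns = "a\\"b";', rules);
console.log(demo);
```

If the number pattern were written with a capturing group, as in \d+(\.\d+)?, the combined expression would contain four groups for three rules, and the index found above would no longer line up with the rule list.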
References
The project was inspired by Jonathan de Halleux's article.
I played with Espresso and RegExp Tester to develop the algorithm for identifying matching rules.
I also looked at the implementations of color highlighting rules in #develop and SyntaxBox.
I used a number of language specification documents: C# 1.2, VB.Net 8.0, JavaScript, XML, HTML.
For hyperlink processing I used info from Regex for URLs.
Samples
C# Sample
/* block comment */
# define test
#define test2
class Test<T>
{
}
/// <summary>This is doc comment http://notebar.com
/// This is doc comment</summary>
/// <param name="paramName">This a param</summary>
if (x == "test" && y < 2)
{
s = "string with \" escapes \\ more \n another \r tab \t";
s = "not complete should finish at the end of line
"this must not affect previous line";
s = @"special C# string
can be multiline http://notebar.com
use "" to include quot";
// this is linecomment
b = 3;
char c = 'c';
}
else
{
/* this is block
comment */
/* block comment may have * or / */
b = 4;
@int = 1; //Identifier
int[] hex = new int[] {0x12F, 0X23ADUL, 0xFFUL};
double d = new double[] {.12, 12.33, 1e-2, .1e+22, 1F};
}
VB.Net Sample
#Const TestMode = "Test"
#If TestMode = "Test" Then
'Some statements
#End If
''' <summary>
''' Test Doc comments <seealso cref="test">sss</seealso>
''' </summary>
''' <remarks></remarks>
Public Class Test
Sub TestSub()
Dim Text, Dim$, _test, [if]
Text = "aaa" & "aaa" & "string "" string "
Dim n As Integer, d As DateTime
n = 1
n = &HCCAAFF
n = &O1234
n = .234
n = 34E-12
n += 2
d = #12/12/1970#
d = # 12:30 #
d = # 12/12/1970 12:30 AM #
'comment
REM comment
If Test Then
End If
End Sub
End Class
JavaScript Sample
/* Block comment */
var str1 = "test's test";
var str2 = 'test for "test"';
function Text(s)
{
return s.replace(/(\-|\+|\*|\?|\(|\)|\[|\]|\\|\$|\^|\!)/g, "\\$1");
}
function ProcessDocument()
{
var preList = document.getElementsByTagName("pre");
for (var i = 0, len = preList.length; i < len; i++)
{
var pre = preList[i];
var lang = languages[(pre.lang) ? pre.lang.toLowerCase() : ""];
if (lang)
{
var coloredText = lang.ProcessText(pre.innerHTML + "");
if (pre.outerHTML) //HACK: IE does not preserve end of lines.
{
pre.outerHTML = "<pre lang=" + pre.lang + ">" + coloredText + "</pre>";
}
else
{
pre.innerHTML = coloredText;
}
}
}
}
XML Sample
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE document [
<!ELEMENT document (node*)>
<!ATTLIST document WMSNameSpaceVersion CDATA "2.0">
<!ELEMENT node (node*)>
<!ATTLIST node name CDATA #REQUIRED>
<!ATTLIST node opcode ( create | remove | setval | clearval | rename | movebefore ) #REQUIRED>
<!ATTLIST node secure ( true | false ) #IMPLIED>
<!ATTLIST node type ( string | boolean | int32 | binary | int64 ) #IMPLIED>
<!ATTLIST node value CDATA #IMPLIED>
]>
Text
<![CDATA[ cdata text ]]>
<!-- Comment
Comment -->
<?mso-application progid="Word.Document"?>
<test attr="value" attr="ssss">
<inner/>
</test>
Text text   text ዺ įGA;
HTML Sample
Text
<!-- Comment
Comment -->
Text
<table cellpadding=0 enabled cellspacing="0">
<tr>
<td> Sample text <br/><br>another line </td>
</tr>
</table>
License for the CodeColor.
(c) 2006 Vladimir Morozov
Last updated: 04/24/2006