not a beautiful or unique snowflake (nothings) wrote,

sean's intuitive macro programming language

Designing a macro language for HTML authoring.

Some time in 1997 or 1998 I wrote a macro processing language that allowed me to author HTML using a far less cumbersome syntax. I made a parameterless language, which was kind of limiting. I later expanded it for my old web journal; an analysis is here.

The basic idea is that I can write @italicized text@ and _bold text_ in a simpler way. Such syntaxes have issues of their own, but the central point is that they're a lot easier to read. Ideally, you'd make them easier to write as well, for example by getting rid of the verbose end tags of HTML.

A few days ago I discovered a similar language, Latte. It goes a good way towards making things cleaner, but it heavily overuses its \ syntax in a way that is fairly redundant. Latte has been supplanted by Blatte, which is written in Perl and has no web page.

So, I decided it was time to build a new macro processing system, one that was a bit more flexible: parameters are definitely required, and I'd like to do the _foo_ sort of thing, so I need macro substitution without any magic character triggering it.

So I follow my traditional rules: make the underlying system as generic as possible, and move as much logic as possible into macros, and out of the engine source. I've come up with the following specification:

The first line of the file looks like this: {[simple]}

The {[...]} characters tell the engine which characters you're using as meta characters. Here, {} are used to indicate definitions and functions; nested things inside {} will appear as [], nested inside those are {}, etc.

To use the characters { or } at the top level, you double them ({{ -> {, }} -> }); [ and ] can appear at top-level unescaped. (Inside {}, you have to do both {{ and [[ for { and [. It's never meaningful to have {{ as a real sequence, since it has to be {[ or [{ due to the alternation.) (NOTE: Some web browsers are eating some of the square brackets in this description.)
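As a rough illustration of the header and top-level escaping rules just described, here is a minimal Python sketch. It is not the engine, which hasn't been written. The function names are made up for this example, and it assumes the word between the brackets in the header ("simple") can be ignored and that the text handed to the unescape step contains no macro calls.

```python
def parse_header(line):
    """Read the meta characters from a first line such as '{[simple]}'.

    Assumption: the outer pair is the first/last character and the inner
    pair is the second/second-to-last; whatever sits between is ignored.
    """
    line = line.strip()
    if len(line) < 4:
        raise ValueError("header too short to hold two delimiter pairs")
    outer = (line[0], line[-1])   # e.g. ('{', '}')
    inner = (line[1], line[-2])   # e.g. ('[', ']')
    return outer, inner


def delimiters_at(depth, outer, inner):
    """Delimiter pair used at nesting depth 1, 2, 3, ...

    The pairs alternate: outer at odd depths, inner at even depths.
    """
    return outer if depth % 2 == 1 else inner


def unescape_top_level(text, outer):
    """Collapse doubled outer delimiters ('{{' -> '{', '}}' -> '}').

    Only valid for top-level text that contains no macro calls; the inner
    pair needs no escaping at top level, so it is left alone.
    """
    open_ch, close_ch = outer
    return text.replace(open_ch * 2, open_ch).replace(close_ch * 2, close_ch)
```

With the header {[simple]}, depth 1 uses {}, depth 2 uses [], depth 3 uses {} again, and the top-level text "a {{b}} [c]" comes out as "a {b} [c]".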

There are four main primitive operators, plus one more for multiple-input-file support, plus three more for multiple-output-file support and iteration.

  • {def NAME VALUE}
  • {fun NAME [PARAMETERS] VALUE}
  • {undef NAME}
  • {defcon NAME VALUES}


def defines a parameterless macro that operates at top-level, outside {}. fun defines a macro that takes parameters, and is triggered by {NAME ...parameters}. {undef} deletes either of these.

def takes several optional parameters. 'raw' means 'don't substitute further parameters'; 'left' and 'right' provide context information for what characters are legal to appear to the left and right of this macro's name. (For instance, I define _ so that _foo_ works, but foo_bar doesn't do anything: _ only begins processing if the left character is non-alphanumeric and the right character is non-whitespace; but this means un_fucking_believable won't work.) {defcon} allows defining new sets of context info; the only ones provided by default are 'newline' and 'white' (whitespace), plus the 'not' inversion operator.
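To make the left/right context rule concrete, here is a small Python sketch of the check a macro name would have to pass before it fires. It is only an illustration under assumptions: the 'alphanum' set is hypothetical (something you would add with {defcon}), the helper names are invented, and a missing neighbour at the start or end of the input is treated as a newline.

```python
# Context sets: 'white' and 'newline' are the defaults named in the text;
# 'alphanum' is a hypothetical set that would be added via {defcon}.
CONTEXTS = {
    "white": str.isspace,
    "newline": lambda ch: ch == "\n",
    "alphanum": str.isalnum,
}


def matches(context, ch):
    """True if character ch belongs to the context; 'not X' inverts X."""
    if context.startswith("not "):
        return not matches(context[4:], ch)
    return CONTEXTS[context](ch)


def triggers(text, pos, name, left=None, right=None):
    """True if macro `name` at text[pos:] satisfies its left/right context."""
    if not text.startswith(name, pos):
        return False
    end = pos + len(name)
    # Assumption: the edges of the input behave like a newline.
    before = text[pos - 1] if pos > 0 else "\n"
    after = text[end] if end < len(text) else "\n"
    left_ok = left is None or matches(left, before)
    right_ok = right is None or matches(right, after)
    return left_ok and right_ok
```

With left="not alphanum" and right="not white", the first _ of _foo_ triggers, while the _ in foo_bar and both of the ones in un_fucking_believable do not, because each has a letter on its left.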

fun takes a list of parameter names. To do a parameter substitution on the parameter name foo, you put [foo] in somewhere. (If you want to use the function [foo] somewhere in the macro body, then pick a different name for the parameter!) This does turn out to be clumsy--it might be nice if there were another character to introduce a macro parameter instead of overloading the existing nesting.

So, you can define a macro that deletes itself when it's used:
{def foo [undef foo]bar}

Then, define a function that does this automatically:

{fun defdel [n v] [def {n} {undef [n]}{v}]}

Now, let's define a pair of macros where the second one is defined temporarily after using the first one (they're a balanced pair).

We want:

{def a b}
{defdel c d}

But we want the definition of c -> d to occur as part of a:

{def a b[defdel c d]}
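The behaviour being built up here can be simulated in a few lines of Python. This is a sketch under simplifying assumptions, not the real engine: macros are matched on whole whitespace-separated words (no context rules), each definition is an (output, side effect) pair, and all the function names are invented for the example.

```python
def define(table, name, output, action=None):
    """Push a definition; `action` is an optional side effect run on use."""
    table.setdefault(name, []).append((output, action))


def undef(table, name):
    """Pop the most recent definition of `name`."""
    table[name].pop()


def defdel(table, name, output):
    """Like {defdel name output}: expands once, then removes itself."""
    define(table, name, output, lambda t: undef(t, name))


def expand_words(text, table):
    """Expand each whitespace-separated word that has a live definition."""
    out = []
    for word in text.split():
        stack = table.get(word)
        if stack:
            output, action = stack[-1]
            out.append(output)
            if action:
                action(table)      # e.g. define or delete another macro
        else:
            out.append(word)
    return " ".join(out)


# Equivalent in spirit to {def a b[defdel c d]}:
table = {}
define(table, "a", "b", lambda t: defdel(t, "c", "d"))
result = expand_words("c a c c", table)
```

Here result is "c b d c": the first c is untouched because nothing defines it yet, a expands to b and installs c -> d, the next c expands to d and deletes itself, and the last c is plain text again.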

Now make that a function:
{fun deftag [a b c d] [def {a} {b}{defdel [c] [d]}]}

Ok, now to make it practical, I want the same thing, but with options passed through to the defs.

{fun defdel [a b c] [def {[a]} {b} {c}]}
{fun deftag [a b c d e f]
[def {[a]} {b} {c}{defdel [d] [e] [f]}]}

Now, suppose we want to use the same token for the open and close tags; because macro definitions stack (you define a new one, then undef it, the old one is back), this just works:

{fun defdual [n a b c d] [deftag {a} {n} {b} {c} {n} {d}]}
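The stacking that makes this work can be pictured as one stack of values per macro name. The Python sketch below is only a model of that idea; the class and method names are hypothetical and nothing here comes from an actual implementation.

```python
from collections import defaultdict


class MacroTable:
    """Macro definitions that stack: undef reveals the previous definition."""

    def __init__(self):
        self._stacks = defaultdict(list)   # name -> list of values, newest last

    def define(self, name, value):
        self._stacks[name].append(value)

    def undef(self, name):
        stack = self._stacks.get(name)
        if not stack:
            raise KeyError(f"macro {name!r} is not defined")
        stack.pop()

    def lookup(self, name):
        stack = self._stacks.get(name)
        return stack[-1] if stack else None


macros = MacroTable()
macros.define("_", "<b>")          # the opening definition
macros.define("_", "</b>")         # temporarily shadowed by the closing one
during = macros.lookup("_")
macros.undef("_")                  # the closing macro deletes itself ...
after = macros.lookup("_")         # ... and the opening definition is back
```

So using the same token for open and close needs no special case: the close definition just sits on top of the open one until it is used.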

Now let's define a function that makes a doubled character an escape sequence:

{fun escape [n] [def {n}{n} {n}]}

And now let's define _:

   {escape _}
   {defdual _ [left nonalphanum right nonwhite] <b>
              [right nonalphanum left nonwhite] </b>}


In fact, we can define a new function to do that work, and just say:

{defchar _ <b> </b>}
{defchar * <i> </i>}
{defchar ` <tt> </tt>}

Of course, we never need to do ANY of this if we want to use Latte-style markup:

{fun b [v] <b>[v]</b>}
{fun i [v] <i>[v]</i>}

Now {b this is bold} and {i this is italicized} and {i [b this] is both}. Meanwhile, _this is bold_ and *this is italicized* and *_this_ is both*.
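To check that the character-style markup comes out as intended, here is a Python sketch that hard-codes what the _, * and ` definitions are meant to do: a doubled character is a literal, an opener needs a non-alphanumeric character on its left and non-whitespace on its right, and the matching closer needs the mirror image. It is a hand-written simulation with an invented function name, not the macro engine, and it assumes the ends of the input count as whitespace.

```python
TAGS = {"_": ("<b>", "</b>"), "*": ("<i>", "</i>"), "`": ("<tt>", "</tt>")}


def expand(text):
    """Expand _bold_, *italic* and `tt` marks using the context rules."""
    out = []
    open_marks = set()   # marks whose closing definition is currently live
    i = 0
    while i < len(text):
        ch = text[i]
        if ch not in TAGS:
            out.append(ch)
            i += 1
            continue
        # Escape rule: a doubled mark stands for one literal mark.
        if text.startswith(ch * 2, i):
            out.append(ch)
            i += 2
            continue
        # Assumption: the edges of the input behave like whitespace.
        before = text[i - 1] if i > 0 else " "
        after = text[i + 1] if i + 1 < len(text) else " "
        if ch not in open_marks:
            # Opening: left non-alphanumeric, right non-whitespace.
            if not before.isalnum() and not after.isspace():
                out.append(TAGS[ch][0])
                open_marks.add(ch)
            else:
                out.append(ch)
        else:
            # Closing: left non-whitespace, right non-alphanumeric.
            if not before.isspace() and not after.isalnum():
                out.append(TAGS[ch][1])
                open_marks.discard(ch)
            else:
                out.append(ch)
        i += 1
    return "".join(out)


html = expand("_this is bold_ and *this is italicized* and *_this_ is both*.")
```

For that input, html is "<b>this is bold</b> and <i>this is italicized</i> and <i><b>this</b> is both</i>.", while un_fucking_believable passes through unchanged, as predicted earlier.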

It seems like it should work; I'll probably code it up tomorrow.