How to add a custom lexer to camlp4

Adding a custom lexer to the old camlp4 (now camlp5) was relatively easy. The new camlp4 is quite different. The problem was discussed in two recent threads on the OCaml mailing list, here and here.

The main point is to provide a new Lexer module whose signature is compatible with that of the Camlp4 lexer.

Three camlp4 modules have to be defined, namely Loc, Token and Error, each matching the signature of the same name in Camlp4.Sig. The signature of a replacement camlp4 lexer is:

open Camlp4.Sig

type token =
  | KWD of string
  | CHAR of char
  | EOI

exception Error of int * int * string

module Loc   : Loc with type t = int * int
module Token : Token with module Loc = Loc and type t = token
module Error : Error

val mk : unit -> (Loc.t -> char Stream.t -> (Token.t * Loc.t) Stream.t)

I still don't understand enough about the camlp4 internals to comment on them. I've reused the CDuce lexer as a starting point and added a small lexer for regular expressions.
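To give a rough idea of what the core of such a lexer does, here is a minimal stand-alone sketch. It is not the code linked from this post and does not depend on Camlp4. It reuses the token type, the Error exception and the int * int location from the signature above, but it returns a list rather than a Stream. The names tokenize and is_operator, the choice of operator characters and the backslash escape rule are assumptions made for illustration only.

```ocaml
(* Stand-alone sketch: no Camlp4 dependency, a list instead of a Stream. *)
type token =
  | KWD of string
  | CHAR of char
  | EOI

(* Same shape as in the signature: start offset, end offset, message. *)
exception Error of int * int * string

(* A location is a pair of character offsets (start, end exclusive),
   mirroring Loc.t = int * int. *)
type loc = int * int

(* Assumption: these characters are treated as regular expression operators. *)
let is_operator c = String.contains "()|*+?" c

(* Hypothetical helper: turn a string into a list of located tokens. *)
let tokenize (s : string) : (token * loc) list =
  let n = String.length s in
  let rec go i acc =
    if i >= n then List.rev ((EOI, (n, n)) :: acc)
    else
      match s.[i] with
      | ' ' | '\t' | '\n' -> go (i + 1) acc (* skip whitespace *)
      | '\\' ->
          (* A backslash makes the next character a literal CHAR. *)
          if i + 1 >= n then raise (Error (i, i + 1, "dangling backslash"))
          else go (i + 2) ((CHAR s.[i + 1], (i, i + 2)) :: acc)
      | c when is_operator c ->
          go (i + 1) ((KWD (String.make 1 c), (i, i + 1)) :: acc)
      | c -> go (i + 1) ((CHAR c, (i, i + 1)) :: acc)
  in
  go 0 []
```

A real replacement lexer would have to wrap the same logic so that mk returns a function from a char Stream.t to a (Token.t * Loc.t) Stream.t, with Loc and Token satisfying the Camlp4 signatures.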

The complete code (lexer + parser) is here.
