
ULex - A lexical analyzer generator


Current version: 0.2

ULex is a Java application that generates Java code for lexical analyzers. The current version could be better, but it seems to work. To back that up, I have written a lexer file for a subset of the Java language - check it out here: java_lex.txt. It generates a lexer that, given a Java source file as its first parameter, prints a list of tokens to the console. The generated code can be found here: JavaLexer.java (200-250 kb - yes, no minimization ;-) ).

ULex only generates 7-bit lexers.
ULex does not support regular expression macros.
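
Since the generated lexers are 7-bit only, it can be handy to check an input file before lexing it. The following is a minimal sketch, not part of ULex: the class name SevenBitCheck and the method firstNonAscii are made up for illustration, and it assumes "7-bit" means character codes 0 to 127. How a ULex-generated lexer actually reacts to other characters is not covered here.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper, not shipped with ULex.
class SevenBitCheck {

    // Returns the position of the first character above 127,
    // or -1 if every character fits in 7 bits.
    static long firstNonAscii(Reader in) throws IOException {
        long pos = 0;
        int c;
        while ((c = in.read()) != -1) {
            if (c > 0x7F) {
                return pos;
            }
            pos++;
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        // Plain ASCII input: nothing to report.
        System.out.println(firstNonAscii(new StringReader("int x = 1;")));
        // The fourth character (index 3) is outside the 7-bit range.
        System.out.println(firstNonAscii(new StringReader("caf\u00e9")));
    }
}
```

For a real source file, a java.io.FileReader could be passed in place of the StringReader.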

To use ULex, download this file, unzip it and run install.bat (non-Windows users: you are on your own). ULex should run on any machine that has a Java virtual machine.

The generated lexer implements the java_cup.runtime.Scanner interface - i.e. ULex generates a class definition that includes the method java_cup.runtime.Symbol next_token(). I have NOT tested it with the javacup parser. You can find info on the javacup parser at http://www.cs.princeton.edu/~appel/modern/java/CUP/why.html. I have included the source and class files for Scanner and Symbol in the ULex distribution.
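
To show what calling next_token() in a loop looks like, here is a rough sketch. It is self-contained, so Symbol and Scanner below are simplified stand-ins for the java_cup.runtime classes, not the real ones. WordScanner is a made-up toy scanner, not ULex output, and the symbol numbers (EOF = 0, WORD = 1) are assumptions chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

class ScannerLoopSketch {

    // Stand-in for java_cup.runtime.Symbol; only the two fields used here.
    static class Symbol {
        final int sym;
        final Object value;
        Symbol(int sym, Object value) {
            this.sym = sym;
            this.value = value;
        }
    }

    // Stand-in for java_cup.runtime.Scanner.
    interface Scanner {
        Symbol next_token() throws Exception;
    }

    static final int EOF = 0;   // assumption: 0 marks end of input
    static final int WORD = 1;  // hypothetical token number

    // Toy scanner that splits its input on whitespace.
    static class WordScanner implements Scanner {
        private final String[] words;
        private int next = 0;

        WordScanner(String text) {
            String t = text.trim();
            words = t.isEmpty() ? new String[0] : t.split("\\s+");
        }

        public Symbol next_token() {
            if (next >= words.length) {
                return new Symbol(EOF, null);
            }
            return new Symbol(WORD, words[next++]);
        }
    }

    // Pulls tokens until EOF and collects them as "number:value" strings.
    static List<String> drain(Scanner s) throws Exception {
        List<String> out = new ArrayList<String>();
        for (Symbol t = s.next_token(); t.sym != EOF; t = s.next_token()) {
            out.add(t.sym + ":" + t.value);
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        for (String line : drain(new WordScanner("int x"))) {
            System.out.println(line);
        }
    }
}
```

With the real classes from the ULex distribution on the classpath, the two stand-ins would be dropped and the generated lexer used in place of WordScanner; how that lexer is constructed depends on the generated code.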

ulex_0_2.zip : The 0.2 version of ULex
manual.htm : A pretty lousy manual (so far) - included in ulex_0.2.zip
java_lex.txt : A lexer definition for the Java language - included in ulex_0.2.zip
JavaLexer.java : ULex generated lexer for Java - included in ulex_0.2.zip

You need a Java compiler (at least 1.1) and a Java VM (at least 1.2) to use ULex. You can find both here: http://java.sun.com/j2se

TODO:

Write a decent manual
Minimize DFAs


These pages are maintained by Ulrik Magnusson. Please contact me at
ulrikm@yahoo.com if you discover any bugs, misinformation etc.
