Cog
Ned Batchelder
Created 10 February 2004, last updated 18 March 2008.
An older version of this document is also available in Russian.
Cog is a code generation tool. It lets you use pieces of Python code as generators in your source files to generate whatever code you need.
What does it do?
Cog transforms files in a very simple way: it finds chunks of Python code embedded in them, executes the Python code, and inserts its output back into the original file. The file can contain whatever text you like around the Python code. It will usually be source code.
For example, if you run this file through cog:
// This is my C++ file.
it will come out like this:
// This is my C++ file.
Lines with triple square brackets are delimiter lines. The lines between [[[cog and ]]] are the generator Python code. The lines between ]]] and [[[end]]] are the output from the generator.
When cog runs, it discards the previously generated output, executes the generator Python code, and writes its generated output into the file. All text lines outside of the special markers are passed through unchanged.
The cog marker lines can contain any text in addition to the triple square bracket tokens. This makes it possible to hide the generator Python code from the source file. In the sample above, the entire chunk of Python code is a C++ comment, so the Python code can be left in place while the file is treated as C++ code.
Design
Cog is designed to be easy to run. It writes its results back into the original file while retaining the code it executed. This means cog can be run any number of times on the same file. Rather than have a source generator file, and a separate output file, typically cog is run with one file serving as both generator and output. Because the marker lines accommodate any language syntax, the markers can hide the cog Python code from the source file. This means cog files can be checked into source control without worrying about keeping the source files separate from the output files, without modifying build procedures, and so on.
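To make the mechanics described under "What does it do?" concrete, here is a minimal sketch of the transformation. It is not cog's actual implementation: `run_generators` and `emit` are hypothetical names (`emit` stands in for cog's own output functions), and the sketch assumes multi-line generators only, with no prefix stripping and no re-indentation.

```python
import textwrap


def run_generators(text):
    """Re-run every generator found in `text` and return the regenerated text.

    Illustrative only: a line containing "[[[cog" starts a generator, a line
    containing "]]]" ends it, and "[[[end]]]" closes the output area.
    """
    shared_globals = {}   # one namespace shared by all generators in the file
    result = []
    state = "text"        # cycles through "text" -> "code" -> "output"
    code_lines = []
    for line in text.splitlines():
        if state == "text":
            result.append(line)               # ordinary text passes through
            if "[[[cog" in line:
                state, code_lines = "code", []
        elif state == "code":
            result.append(line)               # the generator itself is kept
            if "]]]" in line:
                captured = []
                shared_globals["emit"] = captured.append  # hypothetical output hook
                exec(textwrap.dedent("\n".join(code_lines)), shared_globals)
                result.extend(captured)       # fresh output goes after the marker
                state = "output"
            else:
                code_lines.append(line)
        else:
            # state == "output": stale output is dropped until the end marker
            if "[[[end]]]" in line:
                result.append(line)
                state = "text"
    return "\n".join(result) + "\n"


SOURCE = """\
// This is my C++ file.
/*[[[cog
fnames = ['DoSomething', 'DoAnotherThing', 'DoLastThing']
for fn in fnames:
    emit("void %s();" % fn)
]]]*/
void StaleOutput();
//[[[end]]]
/*[[[cog
emit("// %d functions declared above" % len(fnames))
]]]*/
//[[[end]]]
"""

regenerated = run_generators(SOURCE)
print(regenerated)
```

Because the generator code stays in the file and the old output is discarded on every pass, feeding `regenerated` back into `run_generators` yields the same text again. The second generator can read `fnames` only because both chunks execute in the same globals dictionary.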
I experimented with using a templating engine for generating code, and found myself constantly struggling with white space in the generated output, and mentally converting from the Python code I could imagine, into its templating equivalent. The advantages of a templating system (that most of the code could be entered literally) were lost as the code generation tasks became more complex, and the generation process needed more logic.
Cog lets you use the full power of Python for code generation, without a templating system dumbing down your tools for you.
Installation
Cog requires Python 2.3 or later. Depending on the extent of your use, you may also need these packages:
Cog is installed with a standard Python distutils script:
You should now have cog.py in your Python scripts directory.
License
Cog is distributed under the MIT license. Use it to spread goodness through the world.
Writing the source files
Source files to be run through cog are mostly just plain text that will be passed through untouched. The Python code in your source file is standard Python code. Any way you want to use Python to generate text to go into your file is fine. Each chunk of Python code (between the [[[cog and ]]] lines) is called a generator and is executed in sequence.
The output area for each generator (between the ]]] and [[[end]]] lines) is deleted, and the output of running the Python code is inserted in its place. To accommodate all source file types, the format of the marker lines is irrelevant. If the line contains the special character sequence, the whole line is taken as a marker. Any of these lines mark the beginning of executable Python code:
//[[[cog
Cog can also be used in languages without multi-line comments. If the marker lines all have the same text before the triple brackets, and all the lines in the generator code also have this text as a prefix, then the prefixes are removed from all the generator lines before execution. For example, in a SQL file, this:
--[[[cog
will produce this:
--[[[cog
Finally, a compact form can be used for single-line generators. The begin-code marker and the end-code marker can appear on the same line, and all the text between them will be taken as a single Python line:
// blah blah
You can also use this form to simply import a module. The top-level statements in the module can generate the code.
If there are multiple generators in the same file, they are executed with the same globals dictionary, so it is as if they were all one Python module.
Cog tries to do the right thing with white space. Your Python code can be block-indented to match the surrounding text in the source file, and cog will re-indent the output to fit as well.
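As a rough illustration of that re-indentation, the sketch below strips the whitespace common to all output lines and then indents the block to match the begin-code marker line. The function name `reindent` and the sample C++ lines are hypothetical, and the standard library's `textwrap` is used here as a stand-in rather than as a description of cog's internals.

```python
import textwrap


def reindent(block, marker_line):
    """Indent `block` so its left-most text lines up with `marker_line`."""
    # Leading whitespace of the begin-code marker line.
    indent = marker_line[: len(marker_line) - len(marker_line.lstrip())]
    # Drop the whitespace shared by every line, keeping relative indentation,
    # then add the marker's indentation to each non-blank line.
    return textwrap.indent(textwrap.dedent(block), indent)


# Hypothetical generator output, indented deeper than the marker line.
generated = "        if (ready) {\n            run();\n        }\n"
reindented = reindent(generated, "    //[[[cog")
print(reindented)
```

The generator can therefore emit text at whatever depth is convenient in the Python code; only the relative indentation between output lines survives into the file.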
All of the output for a generator is collected as a block of text, a common whitespace prefix is removed, and then the block is indented to match the indentation of the cog generator. This means the left-most non-whitespace character in your output will have the same indentation as the begin-code marker line. Other lines in your output keep their relative indentation.
The cog module
A module called cog provides the functions you call to produce output into your file. The functions are:
cog.out("""
Because I use XML data files in my own code generation tasks, handyxml is included in the cog distribution.
Running cog
Cog is a command-line utility which takes arguments in standard form.
cog - generate code with inlined Python code.
Files on the command line are processed as input files. Files can also be listed in a text file named on the command line with an @:
$ cog @files_to_cog.txt
These @-files can be nested, and each line can contain switches as well as a file to process. For example, you can create a file cogfiles.txt:
then invoke cog like this:
cog -s " //**cogged**" @cogfiles.txt
Now cog will process four files, using C++ syntax for markers on all the C++ files, SQL syntax for the .sql file, and no markers at all on the readme.txt file.
As another example, cogfiles2.txt could be:
with cog invoked like this:
cog -D version=3.4.1 @cogfiles2.txt
Cog will process template.h twice, creating both data1.h and data2.h. Both executions would define the variable version as "3.4.1", but the first run would have thefile equal to "data1.xml" and the second run would have thefile equal to "data2.xml".
Overwriting files
The -r flag tells cog to write the output back to the input file. If the input file is not writable (for example, because it has not been checked out of a source control system), a command to make the file writable can be provided with -w:
$ cog -r -w "p4 edit %s" @files_to_cog.txt
Setting globals
Global values can be set from the command line with the -D flag. For example, invoking Cog like this:
cog -D thefile=fooey.xml mycode.txt
will run Cog over mycode.txt, but first define a global variable called thefile with a value of "fooey.xml". This variable can then be referenced in your generator code. You can provide multiple -D arguments on the command line, and all will be defined and available.
The value is always interpreted as a Python string, to simplify the problem of quoting. This means that:
cog -D NUM_TO_DO=12
will define NUM_TO_DO not as the integer 12, but as the string "12", which are different and not equal values in Python. Use int(NUM_TO_DO) to get the numeric value.
Checksummed output
If cog is run with the -c flag, then generated output is accompanied by a checksum:
--[[[cog
If the generated code is edited by a misguided developer, the next time cog is run, the checksum won't match, and cog will stop to avoid overwriting the edited code.
Output line suffixes
To make it easier to identify generated lines when grepping your source files, the -s switch provides a suffix which is appended to every non-blank text line generated by Cog. For example, with this input file (mycode.txt):
invoking cog like this:
cog -s " //(generated)" mycode.txt
will produce this output:
[[[cog
Miscellaneous
The -x flag tells cog to delete the old generated output without running the generators. This lets you remove all the generated output from a source file.
The -d flag tells cog to delete the generators from the output file. This lets you generate code in a public file but not have to show the generator to your customers.
The -I flag adds a directory to the path used to find Python modules, and to handyxml's search path.
The -z flag lets you omit the [[[end]]] marker line, and it will be assumed at the end of the file.
History
Cog's change log is on a separate change page.
Feedback
I'd love to hear about your successes or difficulties using cog. Comment here, or send me a note.
See Also
There are a handful of other implementations of the ideas in Cog:
You might like to read:
Comments
I'm using Cog! I use it to do code generation for a library that's implemented in three different languages (C#, c++, Java). Linked from my blog. Thanks for the distribution fix!
This is really nice - normally I don't like the idea of having files which are hand-edited AND machine generated, but this is simple enough to change my mind about this. I really like this.
Some minor issues - the 'cog.py' file in scripts has a cr-lf at the end of the first line, which makes it fail to run, and makes distutils fail to edit in the proper python bin location. Easily fixed.
Running the script with no command line parameters should cause it to print a help message.
How about an option which will cause cog to fail with an error if the file's actual autogenerated code doesn't match the live Python output? This might make more sense than '-r' for use in a makefile, to ensure that you don't edit the wrong code and have your changes discarded.
I notice that the 'import cog' is not actually required, this is good. It would be very nice if each python snippet were run in the same global dictionary, so you could define a variable (or import a module) at one point and use it throughout.
Finally, certain languages (VHDL, Makefiles, Python, for instance) do not have multi-line comments, so you can't use it with these, unless I am missing something. May I suggest: if the [[[cog is preceded by some text on its line, then all lines between that and the ]]] are checked to see if they start with the same text, and if so, that is discarded before further processing.
Thanks, Greg. I added the non-multi-line comment support, and a few other minor things. Thanks for the suggestions!
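The prefix rule proposed in this exchange, and described under "Writing the source files", might look roughly like the sketch below. It is an illustration under assumptions, not cog's code: `strip_marker_prefix` is a hypothetical name, only the begin-code marker line is inspected, and the sample generator lines are invented.

```python
def strip_marker_prefix(begin_marker_line, generator_lines):
    """Remove the text that precedes "[[[cog" from every generator line.

    The prefix is stripped only when all generator lines start with it;
    otherwise the lines are returned unchanged.
    """
    prefix = begin_marker_line.split("[[[cog", 1)[0]
    if prefix and all(line.startswith(prefix) for line in generator_lines):
        return [line[len(prefix):] for line in generator_lines]
    return list(generator_lines)


# Hypothetical generator written inside SQL-style line comments.
sql_generator = [
    "--for table in ['customers', 'orders']:",
    "--    cog.outl('drop table %s;' % table)",
]
print(strip_marker_prefix("--[[[cog", sql_generator))

# A C-style block comment marker: the generator lines carry no prefix,
# so they come back untouched.
print(strip_marker_prefix("/*[[[cog", ["import cog"]))
```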
It looks neat, but I generally like to have no codegen code in the generated code.
But a slightly different approach would fix that:
1. External Python or cog-CodeBehind file next to the source file.
2. You add naming to your cog-codegen slots.
This allows the CodeBehind codegen code to find the output slot for its generated code.
EXAMPLE:
// file:MyCxxFile.hpp
// SLOT-CONTE
/*[[[cog-slot:EnumerateMethods]]] */
void DoSomething();
void DoAnotherThing();
void DoLastThing();
//[[[end]]]
// file:MyCxxFile.hpp.cogCodeBehind or MyCxxFile.hpp.cog or ...
// NOTE: Comments etc. are optional now.
/*[[[cog-codeBehind:EnumerateMethods
import cog
fnames = ['DoSomething', 'DoAnotherThing', 'DoLastThing']
for fn in fnames:
    cog.outl("void %s();" % fn)
]]]*/
You can come close with Cog as it is. Remember that you can import any Python module you want into the cog code. By moving all of your Python code into another module, you can reduce the Cog code to a single import statement:
// file: MyCxxFile.hpp
/*[[[cog
import MyCxxFileGen
]]]*/
/*[[[end]]]*/
# file: MyCxxFileGen.py
import cog
fnames = ['DoSomething', 'DoAnotherThing', 'DoLastThing']
for fn in fnames:
    cog.outl("void %s();" % fn)
Cog is great, but what I'd really like to be able to do is the following:
// [[[cog import mycodgen as m]]]
// [[[end]]]
... lots of regular code ...
// [[[cog m.somestuff()]]]
// [[[end]]]
... other code ...
// [[[cog m.otherstuff()]]]
// [[[end]]]
As it is, each cog slot seems to "forget" the globals() dict for previous cog slots (i.e. line 86 in cogapp.py passes an empty globals() dict to eval).
You got it: 1.3 does what you want.
Just FYI, think I've finally settled on COG as a tool for PHP code generation. Had looked at empy (some notes here: http://www.sitepoint.com/blog-post-view.php?id=222590) but what's clinched it is the way you've implemented the output area.
The mission is to eliminate any work PHP might do that relates to application configuration or the environment it's running in - stuff that won't change once an app is deployed so eliminate the runtime overhead.
One particular thing I want to keep is the ability to execute the PHP scripts, while hacking, before they have been run through COG, e.g.
/**
[[[cog
import cog
cog.outl("require_once '/full/path/to/someClass.php';")
]]]*/
// While hacking, use this
require_once DEV_PATH . 'someClass.php';
//[[[end]]]
May even ditch the require_once completely and have COG embed the class code directly into the script.
Anyway - thanks.
Just what I'm looking for, thanks! However, am using it to generate multiple source files from each template file (ie. feed a different xml config file in to the template to generate a unique file). Is there any way to pass an argument (e.g. a file name) through the command line invocation?
That way I could provide the xml file to the template dynamically.
Thanks again!
Theo
Theo, it would be straightforward to add a -D name=value syntax to the command line. Each would create a variable in the global context. I think this would cover your needs.
I think this is a good idea. I'll add it into Cog soon.
Excellent, looking forwards to it,
Theo
You know what COG really needs? A "COG-recipes" site. I'm sure people have developed some interesting scripts.
One script I'm interested in (that I'm sure others would be too) is a script that will generate C++ functions that will translate enum values to and from string representations. Having a recipes site would allow me to share my code as well as allow others to give feedback about my script.
Can you use COG to make codes out of videos and pictures and sound files? For instance, on MySpace you can go to websites and get HTML codes for videos, which resemble COG in a way. Would you be able to e-mail me back on this subject?
I like the idea expressed by Kevin above about a "COG recipes" site. Because of that I started a COG Wiki site (http://www.bluwiki.org/go/COG_Code_Generator) to allow for discussions and code snippets for COG.
Anyone who has snippets to share, please post them. The Wiki will become better as more people share their code.
Hey man,
this is a really handy tool you wrote.
Keep up the great work.
And why not redirect stdout instead of using cog.outl?
Thanks
You should note that installing COG also requires the PATH module.
Hi, this tool certainly looks very promising, I'm going to give it a go to clean up some of the internals used in aqsis (www.aqsis.org). The other devs favour xsltproc at the moment - I'll see if I can convince them and myself...
One thing I noticed after just installing cog-2.0 from the tarball is that cog.out no longer seems to recognise the trimnewlines command. After grepping the source I see that it's apparently been changed to trimblanklines?
Is anyone willing to try and teach me this stuff? I find it really interesting, but I don't expect anyone to take me up on this; it seems really complicated. That's why I'm so interested.
Bug report.
/*[[[cog
import cog
cog.outl(" extern void simple1();")
cog.outl(" extern void simple2();")
]]]*/
//[[[end]]]
I think the result should be indented, since the dedent parameter isn't given and the default value is False.
However, the result is "dedented". (reindentBlock is called twice)
Jay: this code is behaving as intended. I've edited the description of indentation to try to make it clearer. The dedent parameter doesn't affect the indentation of the line in the output, just the interpretation of a multi-line string parameter. Cog collects output as a block of text, then indents the block to match the generator. The dedent parameter affects how cog adds the lines to the collected output, but not how that output is finally written.
To see what the dedent parameter does, try putting five spaces at the left of one of your lines, and try it with dedent=True and dedent=False to see the difference.
Sorry for the confusion. If you need the output block indented differently, indent your entire cog block to where you want the output to go.
Hi Ned,
This is some neat python app :-)
I would like to try it for some Java apps around here. Has anyone tried whether it works with jython? If so I could easily use it from ant and would not require people to install Python.
... I just checked, it seems that jython won't work since cog requires the compiler package (which isn't available on jython).
I've never used Jython, but yes, Cog depends on being able to execute the chunks of Python code it finds.