D. Richard Hipp's software universe
Ned Batchelder, January 2010
You may not have heard the name D. Richard Hipp, but you've used his software: SQLite is his creation, and it's everywhere.
SQLite is an impressive piece of work, but it's not alone. Along the way, Hipp also wrote LEMON, his own parser generator for parsing SQL in SQLite. And now he has his own distributed source control, Fossil, which hosts the SQLite development stream. Fossil is interesting because it's also a distributed wiki and bug tracker, kind of like Trac meets Mercurial or something. As with all of his work, the Fossil documentation is very clear about the design principles and internals, again, very impressive.
SQLite's documentation includes a detailed page about how it is tested, including how its coverage is measured. Needless to say, it is well measured: 100% condition coverage! The description there of the use of C macros to enhance measurement is a good example of how macros can be extremely useful in building complex software, and makes me wish for something with similar capabilities in Python.
I admire Hipp's output, but I worry that it might be somewhat insular. SQLite obviously has great acceptance, but what will happen to Fossil? It has a huge uphill climb to get users, what with Git and Mercurial slugging it out, and a dozen others competing for attention. It's the age-old dilemma about using the best technology or the most widely accepted. In this case, I don't even know if Fossil is better than the alternatives. At this point it doesn't have the critical mass that would even move it from the Curiosities category to the Look Into It list.
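To make the macro idea above concrete, here is a minimal sketch of how a C macro can help coverage measurement. It is not SQLite's actual code: the names COVERAGE_BUILD, COVER_CASE, cover_hits and clamp_percent are all made up for illustration. The idea is that a boundary case gets its own branch in coverage builds, so a coverage tool can report whether any test reached it, while normal builds compile the macro away.

```c
/* Hypothetical names throughout; this is an illustration, not SQLite's code. */
#ifndef COVERAGE_BUILD
#define COVERAGE_BUILD 1   /* assume a coverage build unless told otherwise */
#endif

#if COVERAGE_BUILD
static unsigned cover_hits = 0;
/* Give the boundary case its own branch so a coverage tool can see
   whether any test actually exercised it. */
#define COVER_CASE(cond) do { if (cond) { cover_hits++; } } while (0)
#else
/* In normal builds the macro disappears and costs nothing. */
#define COVER_CASE(cond) ((void)0)
#endif

/* Clamp a value into [0, 100]; the interesting inputs are the edges. */
static int clamp_percent(int v) {
    COVER_CASE(v == 0);
    COVER_CASE(v == 100);
    if (v < 0) return 0;
    if (v > 100) return 100;
    return v;
}
```

A test suite that never calls clamp_percent(0) or clamp_percent(100) would leave those COVER_CASE branches unexecuted, and that gap would show up in the coverage report even though every ordinary line of the function ran.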
tagged: source control, coverage
Comments
I think Fossil is a very cool idea, but believe that Richard's choice of C as a development language will severely limit its adoption: the bang-for-the-buck ratio is lower than it would be for Perl/Python/Ruby/whatever. I think a Fossil-like tool on top of Git or Mercurial has a much greater chance of being the replacement for Trac we've all been waiting for.
There are several problems with the "insular" model. Not only is Fossil unique, but there is no way to convert into or out of other VCSs. Providing a plugin for Tailor would at least allow that. Additionally, Fossil has very limited functionality. For example, you can't search in tickets; you can only get a report and use your browser to search titles. Similarly, there are no hooks to send email on ticket changes.
DRH also codes as though C has made no progress since 1989 and all the world is still a VAX, which results in several issues. For example, size_t and off_t are not used; plain ints are used instead. This is a bad thing to do on 64 bit platforms and it took quite a bit of effort to convince him just how bad a problem it was, and the fixes aren't exactly models of how code should be written. Similarly, enums are avoided in favor of manually picked types, which assumes the C author knows better than the compiler which sizes are most efficient.
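As a rough illustration of why int is a poor stand-in for size_t or off_t on 64 bit platforms (this is a sketch, not code from SQLite; fits_in_int and narrow_to_32 are hypothetical helper names), consider what happens to a 5 GiB byte count:

```c
#include <limits.h>
#include <stdint.h>

/* Returns 1 when a 64-bit byte count can be stored in an int without loss. */
static int fits_in_int(uint64_t nbytes) {
    return nbytes <= (uint64_t)INT_MAX;
}

/* What survives when a 64-bit count is squeezed into 32 bits.
   uint32_t is used so the wraparound is well defined; converting an
   out-of-range value to a signed int is implementation-defined. */
static uint32_t narrow_to_32(uint64_t nbytes) {
    return (uint32_t)nbytes;
}
```

On a typical platform with a 32-bit int, fits_in_int(5ULL << 30) is false, and narrow_to_32(5ULL << 30) yields 1 GiB: the high bits are silently dropped, so a length or file offset carried in an int can end up pointing at the wrong place.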
The claim of 100% coverage is true, but the code is not complete. For example the error return codes of several functions are ignored.
Apparently I look like a spammer so you have to look these bugs up manually as citations of the problems. Use http://www.sqlite.org/cvstrac/tktview?tn=9999 and replace 9999 with the bug numbers.
Type confusion: 2125, 3246
Ignoring error codes: 3946, 3507, 3394
Python is actually rather good at this sort of thing especially if you make a debug build. In my test suite I do return errors at all possible points and test for them with execution flowing through my code, Python C code and SQLite and often crossing those boundaries multiple times.
@Greg: Fossil isn't really written entirely in C. Large amounts are generated C code using TCL scripting (especially templating) and much logic is in SQL.
The final compiled result is fantastic for deployment. It is a single binary with no shared libraries and can run as a standalone web server, be invoked from CGI or inetd. There is no dependency hell.
Greg: I doubt Dr. Hipp is interested in converting Fossil to run on top of Git etc. That'd mean migrating away from the SQLite back-end, which I'm sure he sees as a compelling feature. Indeed, it's compelling in a lot of ways--sans one.
Monotone also uses SQLite to store its repository. The feeling I got from the Monotone camp was "and that's why we will never, ever, ever be competitive speed-wise with the rest of the SCMs". SQLite is an astonishing project in many ways, but there is no chance it will outperform an intelligently designed custom-written SCM back-end.
Ned: The upside of being insular is that you distance yourself from the turbulence of the real world. Dr. Hipp and the contributors to SQLite/Fossil/LEMON can all happily work away in blissful ignorance of the ongoing SCM superiority battles. And they won't have painful "please rebuild your repository" hairy changeovers and "wait, what version of the SCM are you using?" support issues unless they're self-inflicted. Only Mercurial has yet to ever change its repository format; while this is charmingly backwards compatible it's also costing them in the fastest-SCM arms race.
I doubt that Fossil will be a major impediment to people contributing patches; the UI looks basically sane. And I assume it's easy to incorporate universal diffs. So it's probably not a big deal.
Of course they'll have "please rebuild your repository" issues, why wouldn't they? Precisely because they're so insular, they have no incentive to maintain backwards compatibility, since anytime you change something all you need to do is tell the 5 people using it to update and recompile.
Yes, I concur with some of the other commenters.
I would rather see the features of Fossil achieved by improving the existing tools: work with Ditz etc. to get distributed bug tracking the way you want, work with Bazaar etc. to get the distributed VCS behaviour you want, work with Ikiwiki to get the wiki behaviour you want, etc.
Each of those is deliberately extensible (and I'm sure there are other existing tools that can be substituted), so the need for wheel re-invention is lost on me.
Thanks for mentioning Fossil, Ned! My server logs tell me that lots of people have seen Fossil for the first time as a result of your blog.
Some of the comments suggest that folks think I am trying to compete with Monotone, git, hg, and other "more established" DVCSes. This is not the case. I wrote Fossil to meet my own needs. If others find it useful, great. If not, I'll use it myself and be happy. It has never been my goal to create the Next Great DVCS.
That said, I think there are many original ideas in Fossil that other DVCSes would do well to consider and perhaps incorporate into their own designs. The built-in wiki and bug tracking are new. The "fossil ui" command that starts a local webserver and launches the user's web browser to view it has proven to be a very powerful idea. The "embedded documentation" has worked well for us. The "autosync" mode works better than the usual DVCS for my work practices. Bandwidth efficiency and the ability to penetrate restrictive firewalls are important features for many users. And many people like the fact that Fossil is a single stand-alone executable, making it drop-dead simple to install or uninstall or even run in a chroot jail. I think it would be great if, as others have suggested, some or all of these and other features of Fossil were added to other DVCSes. I promise that no one will hurt my feelings by stealing the ideas. Hack away. Just don't expect me to do the hacking for you, since I'm happy with Fossil.
Many folks seem put off by the idea that Fossil is written in C. They believe that a scripting language (ex: Python) would be a better choice. Actually, I did several early prototypes of Fossil using TCL but what I found is that the high level features of a scripting language did not really help. I'm the first to admit that languages like TCL or Python are usually a much better choice for implementing a big project like this and I was surprised to see that C worked as well or better in this application. I'm not exactly sure why that is, but I think the comment from Roger above (Roger Binns?) is probably closest to the truth when he points out that the scripting language used by Fossil is really SQL. Most of the work that isn't SQL tends to be low-level byte twiddling that is easier to do in C. Perhaps the take-away here is that the best language for an application might not always be what you expect.
Thanks to all for the criticism and feedback.
@Richard, I admire your "I'm not trying to take over the world, this is what works for me" attitude.
We seem to have gotten onto the "C is not a great implementation choice" theme. While I see the benefits of implementing in Python (or other higher-level languages), as I point out concerning C macros, C has some advantages that other languages do not. And I don't see anyone criticizing Git over Linus' decision to code in C.
I would like to add one more thing which I think is important to the success of an SCM these days. That is how easy it is to make an SCM's functionality available as a plugin to IDEs and editors.
As soon as converters from and to other SCMs are available I will give Fossil a try.
I am impressed by Fossil with my first quick run, especially after my struggles to get Trac & Mercurial working together (I am not saying this is true for all). Going forward a few things are definitely going to become important (just picking a few things from comments here):
* The ability to quickly search through issues & notifications
* Integration with IDEs
* Ability to migrate from/to other VCSs (what I'm looking for, to deal with a just-in-case scenario)