Version 2.1.0, 6/12/2008
New Features:
Bugs Fixed:

Version 2.0.1, 10/23/2007
Bugs Fixed:

Version 2.0.0, 10/12/2007
New Features:
Bugs Fixed:

Version 1.0.2, April 27, 2007
Bugs Fixed:
This is the version that shipped as part of Boost 1.34.

Version 1.0.1, October 2, 2006
Bugs Fixed:

Version 1.0.0, March 16, 2006
Version 1.0!

Version 0.9.6, August 19, 2005
The version reviewed for acceptance into Boost. The review began September 8, 2005. Xpressive was accepted into Boost on September 28, 2005.

Version 0.9.3, June 30, 2005
New Features:

Version 0.9.0, September 2, 2004
New Features:

Version 0.0.1, November 16, 2003
Announcement of xpressive: http://lists.boost.org/Archives/boost/2003/11/56312.php

The following features are planned for xpressive 2.X:
Here are some wish-list features. You or your company should consider hiring me to implement them!
Since many of xpressive's users are likely to be familiar with the Boost.Regex library, I would be remiss if
I failed to point out some important differences between xpressive and Boost.Regex. In particular:
Also, in the current implementation, the regex algorithms in xpressive will not detect pathological behavior and abort by throwing an exception. It is up to you to write efficient patterns that do not behave pathologically.

The performance of xpressive is competitive with Boost.Regex. I have run performance benchmarks comparing static xpressive, dynamic xpressive and Boost.Regex on two platforms: gcc (Cygwin) and Visual C++. The tests include short matches and long searches. For both platforms, xpressive comes off well on short matches and roughly on par with Boost.Regex on long searches.

<disclaimer> As with all benchmarks, the true test is how xpressive performs with your patterns, your input, and your platform, so if performance matters in your application, it's best to run your own tests. </disclaimer>

Below are the results of a performance comparison between:
Test Specifications
Comparison 1: Short Matches

The following tests evaluate the time taken to match the expression to the input string. For each result, the top number has been normalized relative to the fastest time, so 1.0 is as good as it gets. The bottom number (in parentheses) is the actual time in seconds. The best time has been marked in green.

Short Matches

Comparison 2: Long Searches

The next test measures the time to find all matches in a long English text. The text is the complete works of Mark Twain, from Project Gutenberg. The text is 19 MB long. As above, the top number is the normalized time and the bottom number is the actual time. The best time is in green.

Long Searches
Below are the results of a performance comparison between:
Test Specifications
Comparison 1: Short Matches

The following tests evaluate the time taken to match the expression to the input string. For each result, the top number has been normalized relative to the fastest time, so 1.0 is as good as it gets. The bottom number (in parentheses) is the actual time in seconds. The best time has been marked in green.

Short Matches

Comparison 2: Long Searches

The next test measures the time to find all matches in a long English text. The text is the complete works of Mark Twain, from Project Gutenberg. The text is 19 MB long. As above, the top number is the normalized time and the bottom number is the actual time. The best time is in green.

Long Searches
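Since the advice above is to run your own tests, here is a minimal sketch of a timing harness that reports results the same way as the comparisons above: each time divided by the fastest one, so the best entry is 1.0. It uses only the standard library. The names `time_seconds` and `normalize` are hypothetical helpers introduced for illustration and are not part of xpressive or Boost.Regex.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <vector>

// Runs `work` `iterations` times and returns the elapsed wall-clock seconds.
template<typename Work>
double time_seconds(Work work, int iterations)
{
    typedef std::chrono::steady_clock clock;
    clock::time_point const start = clock::now();
    for (int i = 0; i < iterations; ++i)
        work();
    return std::chrono::duration<double>(clock::now() - start).count();
}

// Divides every timing by the fastest one, so the best entry becomes 1.0.
// Assumes a non-empty vector of strictly positive timings.
inline std::vector<double> normalize(std::vector<double> const &seconds)
{
    assert(!seconds.empty());
    double const fastest = *std::min_element(seconds.begin(), seconds.end());
    std::vector<double> result;
    for (double s : seconds)
        result.push_back(s / fastest);
    return result;
}
```

To compare engines, one would pass `time_seconds` a callable per engine (for example one each for static xpressive, dynamic xpressive and Boost.Regex, all matching the same pattern against the same input), collect the results in a vector, and feed that vector to `normalize`.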
In xpressive, regex objects can refer to each other and themselves by value
or by reference. In addition, they ref-count their referenced regexes to
keep them alive. This creates the possibility for cyclic reference counts,
and raises the possibility of memory leaks. xpressive avoids leaks by using
a type called

Constraints

Our solution must meet the following design constraints:
Handle-Body Idiom
To use
The body type must inherit from
References and Dependencies
We refer to (1) above as the "references" and (2) as the "dependencies".
It is crucial to the understanding of

Why is this important? Because it means that when a body no longer has a handle referring to it, all its references can be released immediately without fear of creating dangling references.

References and dependencies cross-pollinate. Here's how it works:
Consider the following code:

    sregex expr;
    {
        sregex group = '(' >> by_ref(expr) >> ')';                 // (1)
        sregex fact  = +_d | group;                                // (2)
        sregex term  = fact >> *(('*' >> fact) | ('/' >> fact));   // (3)
        expr         = term >> *(('+' >> term) | ('-' >> term));   // (4)
    }                                                              // (5)

Here is how the references and dependencies propagate, line by line:
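As a rough way to trace that propagation, the sketch below replays lines (1) through (4) on a toy model in which each body is just an index with two index sets. The propagation rule in `acquire` is an assumption made for illustration, not xpressive's actual code: the referring body and everything that depends on it inherit the new references, and the referenced bodies gain the corresponding dependencies. The names `tracking_model`, `acquire` and `replay_example` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Toy model: each "body" is an index; refs[i] and deps[i] are index sets.
struct tracking_model
{
    std::vector<std::set<int> > refs; // bodies that body i keeps alive
    std::vector<std::set<int> > deps; // bodies that refer to body i

    explicit tracking_model(std::size_t n) : refs(n), deps(n) {}

    // Assumed rule: when `a` starts referring to `b`, `a` and everything
    // that depends on `a` inherit `b` plus all of b's references; `b` and
    // everything `b` references gain `a` plus all of a's dependencies.
    void acquire(int a, int b)
    {
        std::set<int> new_refs = refs[b];
        new_refs.insert(b);
        std::set<int> new_deps = deps[a];
        new_deps.insert(a);

        for (int holder : new_deps)
            refs[holder].insert(new_refs.begin(), new_refs.end());

        for (int target : new_refs)
            for (int d : new_deps)
                if (d != target) // assumed: a body is never its own dependency
                    deps[target].insert(d);
    }
};

enum { EXPR, GROUP, FACT, TERM, BODY_COUNT };

// Replays lines (1) through (4) of the example above.
inline tracking_model replay_example()
{
    tracking_model m(BODY_COUNT);
    m.acquire(GROUP, EXPR); // (1) group refers to expr
    m.acquire(FACT, GROUP); // (2) fact refers to group
    m.acquire(TERM, FACT);  // (3) term refers to fact
    m.acquire(EXPR, TERM);  // (4) expr refers to term, closing the cycle
    return m;
}
```

Under these assumed rules, once step (4) has run each of the four `refs` sets contains all four bodies, including the body itself.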
This shows how references and dependencies propagate when creating cycles of objects. After line (4), which closes the cycle, every object has a ref-count on every other object, and even on itself. So how does this not leak? Read on.

Cycle Breaking
Now that the bodies have their sets of references and dependencies, the
hard part is done. All that remains is to decide when and where to break
the cycle. That is the job of
This suggests that more than one handle can refer to a body. In fact,
What does the cycle-breaker do? Recall that the body has a set of references
of type

    // Custom deleter: empties the set of references rather than deleting it.
    template<typename DerivedT>
    struct reference_deleter
    {
        void operator ()(std::set<shared_ptr<DerivedT> > *refs) const
        {
            refs->clear();
        }
    };

The job of the cycle breaker is to ensure that when the last handle to a body goes away, the body's set of references is cleared. That's it.

We can clearly see how this guarantees that all bodies are cleaned up eventually. Once every handle has gone out of scope, all the bodies' sets of references will be cleared, leaving none with a non-zero ref-count. No leaks, guaranteed.

It's a bit harder to see how this guarantees no dangling references. Imagine that there are 3 bodies: A, B and C. A refers to B which refers to C. Now all the handles to B go out of scope, so B's set of references is cleared. Doesn't this mean that C gets deleted, even though it is being used (indirectly) by A? It doesn't. This situation can never occur because we propagated the references and dependencies above such that A will be holding a reference directly to C in addition to B. When B's set of references is cleared, no bodies get deleted, because they are all still in use by A.

Future Work
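To make the cycle-breaking scheme from the previous section concrete, here is a minimal, standard-library-only sketch. It is not xpressive's implementation. The `body` and `handle` types, the `live_bodies` counter and the two demo functions are hypothetical names introduced for illustration, `reference_deleter` is written as a non-template for brevity, and the references are inserted by hand rather than propagated automatically.

```cpp
#include <cassert>
#include <memory>
#include <set>

struct body;
typedef std::set<std::shared_ptr<body> > references_type;

static int live_bodies = 0; // instrumentation for the demo only

struct body
{
    references_type refs; // strong references, possibly cyclic
    body() { ++live_bodies; }
    ~body() { --live_bodies; }
};

// Same idea as the reference_deleter shown earlier: clear, do not delete.
struct reference_deleter
{
    void operator()(references_type *refs) const { refs->clear(); }
};

// Hypothetical handle: shares ownership of the body and of a cycle breaker.
// `impl` is declared first so that `breaker` is destroyed first, while the
// body it points into is still alive.
struct handle
{
    std::shared_ptr<body> impl;
    std::shared_ptr<references_type> breaker;

    handle()
      : impl(std::make_shared<body>())
      , breaker(&impl->refs, reference_deleter())
    {}
};

// Two bodies that refer to each other are both freed once their handles die.
inline int bodies_left_after_cycle()
{
    {
        handle a, b;
        a.impl->refs.insert(b.impl); // a refers to b
        b.impl->refs.insert(a.impl); // b refers to a, forming a cycle
        assert(live_bodies == 2);
    } // the last handles die here; each breaker clears its body's refs
    return live_bodies;
}

// A refers to B and, as after propagation, directly to C as well.
inline bool indirect_reference_survives()
{
    handle a;
    std::weak_ptr<body> watch_c;
    {
        handle b, c;
        watch_c = c.impl;
        b.impl->refs.insert(c.impl); // B refers to C
        a.impl->refs.insert(b.impl); // A refers to B ...
        a.impl->refs.insert(c.impl); // ... and also directly to C
    } // handles b and c are gone, so B's set of references is cleared
    return !watch_c.expired(); // C is still kept alive by A
}
```

Copies of a `handle` share both pointers, so the deleter only runs when the last copy goes away. The member order matters in this sketch: if `breaker` were declared before `impl`, the body could be destroyed before the deleter touched its set.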
All these

Also, some objects stick around longer than they need to. Consider:

    sregex b;
    {
        sregex a = _;
        b = by_ref(a);
        b = _;
    }
    // a is still alive here!
Due to the way references and dependencies are propagated, the