Boost.MultiIndex Examples
See source code.
Basic program showing the multi-indexing capabilities of Boost.MultiIndex
with an admittedly boring set of employee records.
See source code.
Usually keys assigned to an index are based on a member variable of the
element, but key extractors can be defined which take their value from
a member function or a global function. This has some similarity with the concept of
calculated keys supported by some relational database engines.
The example shows how to use the predefined const_mem_fun
and global_fun key extractors to deal with this situation.
Keys based on functions are usually not references to stored data,
but rather the temporary values returned by the function invoked.
This implies that modify_key cannot be applied to extractors of
this type, which is a perfectly logical constraint anyway: there is
no stored key object to modify in place.
See source code.
We show a practical example of usage of multi_index_container::ctor_args_list,
whose definition and purpose are explained in the
tutorial. The
program groups a sorted collection of numbers based on identification through
modular arithmetic, by which x and y are equivalent
if (x%n)==(y%n), for some fixed n.
See source code.
This example shows how to construct a bidirectional map with
multi_index_container. By a bidirectional map we mean
a container of elements of std::pair<const FromType,const ToType>
such that no two elements exist with the same first
or second value (std::map only
guarantees uniqueness of the first member). Fast lookup is provided
for both keys. The program features a tiny Spanish-English
dictionary with interactive lookup of words in both languages.
This bidirectional map can be considered as a primitive precursor
to the full-fledged container provided by
Boost.Bimap.
See source code.
The combination of a sequenced index with an index of type ordered_non_unique
yields a list-like structure with fast lookup capabilities. The
example performs some operations on a given text, like word counting and
selective deletion of some words.
See source code.
This program illustrates some advanced techniques that can be applied
for complex data structures using multi_index_container.
Consider a car_model class for storing information
about automobiles. As a first approach, car_model can
be defined as:
  struct car_model
  {
    std::string model;
    std::string manufacturer;
    int         price;
  };
This definition has a design flaw that any reader acquainted with
relational databases can easily spot: the manufacturer
member is duplicated among all cars having the same manufacturer.
This wastes space and poses difficulties when, for instance,
the name of a manufacturer has to be changed. Following the usual
principles of relational database design, the appropriate design
involves storing the manufacturers in a separate
multi_index_container and storing pointers to these in
car_model:
  struct car_manufacturer
  {
    std::string name;
  };

  struct car_model
  {
    std::string             model;
    const car_manufacturer* manufacturer; /* const, to match the extractor used later */
    int                     price;
  };
Although predefined Boost.MultiIndex key extractors can handle many
situations involving pointers (see advanced features of Boost.MultiIndex
key extractors in the tutorial), this case is complex enough that a
suitable key extractor has to be defined. The following
utility cascades two key extractors:
  template<class KeyExtractor1,class KeyExtractor2>
  struct key_from_key
  {
  public:
    typedef typename KeyExtractor1::result_type result_type;

    key_from_key(
      const KeyExtractor1& key1_=KeyExtractor1(),
      const KeyExtractor2& key2_=KeyExtractor2()):
      key1(key1_),key2(key2_)
    {}

    /* key2 is applied to the argument first, then key1 to its result */
    template<typename Arg>
    result_type operator()(Arg& arg)const
    {
      return key1(key2(arg));
    }

  private:
    KeyExtractor1 key1;
    KeyExtractor2 key2;
  };
so that access from a car_model to the name field
of its associated car_manufacturer can be accomplished with
  key_from_key<
    member<car_manufacturer,const std::string,&car_manufacturer::name>,
    member<car_model,const car_manufacturer *,&car_model::manufacturer>
  >
The program asks the user for a car manufacturer and a range of prices
and returns the car models satisfying these requirements. This is a complex
search that cannot be performed in a single operation. Broadly sketched,
one procedure for executing the selection is:
- Select the elements with the given manufacturer by means of equal_range,
- feed these elements into a multi_index_container sorted by price,
- select by price using lower_bound and upper_bound;
or alternatively:
- Select the elements within the price range with lower_bound and upper_bound,
- feed these elements into a multi_index_container sorted by manufacturer,
- locate the elements with given manufacturer using equal_range.
An interesting technique developed in the example lies in
the construction of the intermediate multi_index_container.
In order to avoid object copying, appropriate view types
are defined with multi_index_containers having as elements
pointers to car_models instead of actual objects.
These views have to be supplemented with appropriate
dereferencing key extractors.
See source code.
The composite_key construct of Boost.MultiIndex provides a flexible
tool for creating indices with non-trivial sorting criteria.
The program features a rudimentary simulation of a file system
along with an interactive Unix-like shell. A file entry is represented by
the following structure:
  struct file_entry
  {
    std::string       name;
    unsigned          size;
    bool              is_dir;
    const file_entry* dir;  /* directory this entry belongs to */
  };
Entries are kept in a multi_index_container maintaining two indices
with composite keys:
- A primary index ordered by directory and name,
- a secondary index ordered by directory and size.
The ordering is made first by the directory in which the files are
located because shell commands such as ls act locally, on a single
directory at a time. The shell simulation has only three
commands:
cd [.|..|<directory>]
ls [-s] (-s orders the output by size)
mkdir <directory>
The program exits when the user presses the Enter key at the command prompt.
The reader is challenged to add more functionality to the program; for
instance:
- Implement additional commands, like cp.
- Add handling of absolute paths.
- Use serialization to store and retrieve the filesystem state between
  program runs.
See source code.
Hashed indices can be used as an alternative to ordered indices when
fast lookup is needed and sorting information is of no interest. The
example features a word counter where duplicate entries are checked
by means of a hashed index. Compare the word counting algorithm with
that of example 5, which relies on an ordered index instead.
See source code.
A typical application of serialization capabilities allows a program to
restore the user context between executions. The example program asks
the user for words and keeps a record of the ten most recently entered
ones, in the current or in previous sessions. The serialized data structure,
sometimes called an MRU (most recently used) list, is of some interest
in its own right: an MRU list behaves as a regular FIFO queue, with the
exception that, when inserting a preexistent entry, the entry does not
appear twice but is instead moved to the front of the list. You can
observe this behavior in many programs featuring a "Recent files" menu
command. This data structure is implemented with multi_index_container by
combining a sequenced index and an index of type hashed_unique.
See source code.
The example revisits the text container introduced in
example 5 and shows how substituting a random
access index for a sequenced index allows for extra capabilities like
efficient access by position and calculation of the offset of a given
element into the container.
See source code.
There is a relatively common piece of urban lore claiming that
a deck of cards must be shuffled seven times in a row to be perfectly
mixed. The statement derives from the work of mathematician Persi
Diaconis on riffle shuffling: this shuffling
technique involves splitting the deck into two packets of roughly the same
size and then dropping the cards from both packets so that they become
interleaved. It has been shown that when repeating this procedure
seven times the statistical distribution of cards is reasonably
close to that associated with a truly random permutation. A measure
of "randomness" can be estimated by counting rising sequences:
given a permutation of the sequence 1,2, ... , n, a rising sequence
is a maximal chain of consecutive elements m, m+1, ... , m+r
that appear in ascending order within the permutation. For instance, the
permutation 125364789 is composed of the two rising sequences 1234 and 56789,
as becomes obvious by displaying the sequence on two levels:

  1 2   3   4
      5   6   7 8 9

The average number of rising sequences in a random permutation of
n elements is (n+1)/2; by contrast, after a single riffle
shuffle of an initially sorted deck of cards, there cannot be more than
two rising sequences. The average number of rising sequences approaches
(n+1)/2 as the number of consecutive riffle shuffles increases,
with seven shuffles yielding a close result for a 52-card poker deck.
Brad Mann's paper
"How
many times should you shuffle a deck of cards?" provides a
rigorous yet very accessible treatment of this subject.
The example program estimates the average number of rising sequences
in a 52-card deck after repeated riffle shuffling as well as after applying
a completely random permutation. The deck is modeled by the following
container:
  multi_index_container<
    int,
    indexed_by<
      random_access<>,
      random_access<>
    >
  >
where the first index stores the current arrangement of the deck, while
the second index is used to remember the start position. This representation
allows for an efficient implementation of a rising sequence counting
algorithm that runs in linear time.
rearrange
is used to apply to the deck a shuffle performed externally on an
auxiliary data structure.
See source code.
Boost.MultiIndex supports special allocators such as those provided by
Boost.Interprocess,
which allows for multi_index_containers to be placed in shared
memory. The example features a front-end to a small book database
implemented by means of a multi_index_container stored
in a Boost.Interprocess memory-mapped file. The reader can verify that several
instances of the program work correctly at the same time and immediately see
the changes to the database performed by any other instance.
Revised July 16th 2007
© Copyright 2003-2007 Joaquín M López Muñoz.
Distributed under the Boost Software
License, Version 1.0. (See accompanying file
LICENSE_1_0.txt or copy at
http://www.boost.org/LICENSE_1_0.txt)