
This work is licensed under a Creative Commons Attribution-Share Alike 2.0 France License.
Since I know I'm going to forget how to do it properly, here is a tiny recipe to parse options in bash. I use getopts (not getopt: mind the S!), which is a shell builtin. Run help getopts to read the documentation.
The snippet is pretty easy and self-explanatory:
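A minimal sketch of such a recipe. The options -v, -o FILE and -h, and the names parse_options, verbose, output and remaining, are hypothetical and only serve as illustration.

```shell
# Hypothetical options: -v (verbose flag), -o FILE (output file), -h (help).
usage() { echo "usage: script [-v] [-o file] [-h] args..."; }

parse_options() {
  OPTIND=1        # reset the getopts index so the function can be called again
  verbose=0
  output=""
  # The leading ':' selects silent error reporting; 'o:' means -o takes an argument.
  while getopts ":vo:h" opt; do
    case "$opt" in
      v) verbose=1 ;;
      o) output="$OPTARG" ;;
      h) usage; return 0 ;;
      :) echo "option -$OPTARG requires an argument" >&2; return 1 ;;
      \?) echo "unknown option -$OPTARG" >&2; return 1 ;;
    esac
  done
  # Drop the options that were parsed; what is left are the positional arguments.
  shift $((OPTIND - 1))
  remaining="$*"
}
```

Calling parse_options -v -o out.txt a b sets verbose to 1, output to out.txt and remaining to "a b". In a real script one would typically call it as parse_options "$@".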
Honestly, the OCaml Format module is a royal PITA to use. The only documentation apart from the reference manual is this document here. Don't get me wrong: I think it's a very nice piece of software and absolutely worth having in the stdlib, but it is simply not intuitive (at least for me) at first glance. I'll write down a couple of examples. Hopefully this will help me, and others, the next time I need to use it.
I'm going to use the Format.fprintf function quite a lot. This function uses formatting strings similar to those of the more widely used Printf.fprintf. In the Format module page you can find all the details. Let's start easy and print a string. We write a pretty printer function pp_cell that takes a formatter and an element. This is my favourite way of writing printing functions, as I can daisy-chain them together in a printf call using the "%a" conversion. If the formatter is Format.std_formatter, the string will be printed on stdout.
Let's start playing with the boxes. The formatting boxes are the main reason why I use the Format module, and they are very handy if you want to pretty print nested structures easily.
If we use the std_formatter and the list pretty printer without formatting box, we obtain this output.
If I want a horizontal list, I can use the hbox. If I want a vertical list, I can use the vbox. This gives respectively: The break hint @, is interpreted differently by the formatter, once as a newline, once as a space. Moreover, by adding an indentation, the formatter will take care of adding an offset to all strings printed within that box. And this is a winner when pretty printing nested structures.
Let's now delve a bit deeper and try to format a table... I didn't find any tutorial on the net about this, only bits and pieces of code buried in different projects... A table for me is a pair composed of a header (a string array) and a two-dimensional string array. The point here is to format the table so that each element is displayed in a column sized according to the longest element in that column. First we need two support pretty printers, one for the header and one for each row of the table. In order to set the tabulation margins of the table, we need to find, for each column, the longest string in the table. The result of this computation (the function is shown below in pp_table) is an array of integer widths. When we print the header of the table, we make sure to set the width of each column with the Format.pp_set_tab fmt function. The magic of the Format module will take care of the rest. The second function, which prints each row, is pretty straightforward to understand.
The pretty printer for the table is pretty easy now. First we compute the width of the table, then we open the table box, we print the headers, we iterate on each row and we close the box. tadaaaa :)
A long while ago I wrote about installing redmine as a normal user on debian. I've been using and administering a redmine installation for quite some time now and I'm very happy with it. Since the new stable release of redmine came out in June, I decided to give it another try.
The redmine installation manual is already very detailed. These are the steps I followed to get a simple redmine instance running on my system. First we need to install a specific version of rails and rack. I didn't check which version we have in debian (sid in this example), but since the gem system is very easy to use, I prefer to stay on the safe side and use the recommended versions. To install the ruby gems in an arbitrary directory it is sufficient to set the GEM_PATH environment variable. Everything else follows pretty easily.
Since I don't want to bother with a full fledged database, in this example I'll use the sqlite3 backend. In order to do this, I need to install another gem :
I'm ready to get the redmine code. I prefer to get it directly from the svn, so I can conveniently keep it up-to-date.
Now we need to configure the database backend. This is accomplished just by adding the following two lines to the file config/database.yml. Mind that the backend is called sqlite and not sqlite3!
Time to create and initialize the db with the default data. These two commands will create the db structure and add trackers, default users and take care of other details.
Since we are going to run the redmine instance either as a fcgi or with the standalone server, we just need to make sure that a few directories are writable by my user:
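A minimal sketch of this step. It assumes the directory list given in the redmine installation guide for the 1.x series (files, log, tmp and public/plugin_assets); the function name make_writable and its argument are hypothetical.

```shell
# Assumption: these are the directories redmine writes to, as listed in the
# installation guide for the 1.x series. Adjust the list for other versions.
make_writable() {
  root=$1   # path of the redmine checkout
  for d in files log tmp public/plugin_assets; do
    mkdir -p "$root/$d"          # create the directory if it is missing
    chmod -R u+rwX "$root/$d"    # give the owning user read/write access
  done
}
```

It would be called from anywhere as make_writable /path/to/redmine, run as the user that owns the checkout.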
Time to try it out !
DONE!
Last time I tried redmine I had to fight a bit to install the ldap authentication plugin. Just by looking around the new stable version, I'm pretty happy to notice that it is now part of the default bundle. This is nice! I still have to set up the fastcgi server and configure the SCM repositories. Maybe I'll write a part two later...
During the last two days I spent some time implementing part of the proposed features for distcheck/edos-distcheck. Since everybody is at debconf and talk is silver, but code is gold, I hope that a real implementation can get the ball rolling and get us closer to a stable release of the next generation of edos/mancoosi tools.
In particular this post is about the new YAML output format for distcheck. The rationale for using YAML is to have a data structure that is at the same time human and machine friendly. There are a lot of scripts in debian that rely on distcheck and we want to provide a grep friendly output that at the same time doesn't hurt your eyes. The other proposed solution was to use json, but it was ditched in favor of YAML. We also removed the xml output.
In order to provide a machine readable output and to minimize parsing mistakes, I used the schema language proposed here. This is the resulting data structure definition :
Distcheck now outputs a list of broken or installable packages depending on the given options (--failoures, --success, --explain and combinations thereof). Two quick examples:
python-gi-dbg is broken because there is a conflict between the packages python-gobject and python-gi. The reason why python-gi-dbg is affected by this conflict is explained by following the dependency chain from python-gi-dbg to the two offending packages. Note that for each package element of each path we specify the vpkg, that is the dependency (as it is written in the control file) that led to the conflict. Since a dependency can be a virtual package or a package with a version constraint, it can be expanded to a disjunction of packages (think of a dependency on mta-agent, which can be expanded as postfix, exim or sendmail...). All possible paths to an offending package are reported.
Likewise, if a package is broken because there is an unfulfilled dependency, distcheck will show the path leading to the problem. In the following example we show that the package gnash-tools is broken because there are two dependencies that depend on the missing package libboost-date-time1.40.0 (>= 1.40.0-1).
The code is still in flux and it is not ready for production yet (everything is in the mancoosi svn). I hope this is a good step in the right direction. Comments on the debian wiki are welcome.
If we compare the output of distcheck with the old edos-debcheck we get the following:
After reading this interesting blog post from Petter Reinholdtsen, I've decided to repeat his experiments and save the results with dudf-save. Using Petter's script, I've created a lenny schroot, installed mancoosi-contest and then run apt-get and aptitude in simulation mode to create and upload the dudf to mancoosi.debian.net.
For example :
from lenny to squeeze (2010-07-28).
I'll repeat these tests from time to time. The idea would be to find upgrade problems, but in particular to compare apt-get / aptitude results with other solvers.
apt-get and aptitude were two missing competitors of the MISC competition. However, it is important and interesting to see how these two tools compete against the other solvers submitted to MISC. In this post I want to present two simple tools to convert cudf documents to something that apt-get based tools can handle. Cudf and debian share many characteristics but also have important semantic differences. One important difference is about installing multiple versions of the same package. Since this is allowed in cudf, but not in debian, we can use apt-get and aptitude only to solve cudf problems that respect this constraint, ruling out, for example, all cudfs from the rpm world. Another difference to take care of is the request semantics. In cudf, a request can contain version constraints. For example, one can ask to upgrade the package wheel to a version greater than 2. Since it is not possible to translate this request directly into an apt-get request, we are forced to add a dummy package encoding the disjunction of all packages that respect this constraint. This problem does not arise with remove requests, as they always refer to the currently installed package.
Apt-get needs two files: the Packages file, which contains the list of all packages known to the meta-installer, and the status file, which contains the list of packages that are currently installed. To generate these files I wrote a small utility using the dose3 framework, imaginatively called cudftodeb. This tool takes a cudf and produces three files: Packages, status and Request, with the Request file containing the list of packages to install or remove in a syntax compatible with apt-get.
In order to run apt-get/aptitude with these files, you would need a simple bash script. You can find details here for apt-get and here for aptitude. The most important option is -s, used to simulate an installation.
With the -v option of apt-get we can generate a parsable solution. This output is then piped through another tool called aptgetsolutions in order to produce a cudf solution, closing the circle.
For example, this is the trace produced by aptitude when trying to solve the legacy.cudf problem :
Note the package dummy_wheel used to encode the upgrade request of wheel>>2. This dummy package encodes the request as a dependency:
One last remark about apt-get. I just ran into this bug today using an old version of apt-get that is shipped with lenny. For our experiments we are using only the latest version of apt-get in debian testing.
One of the goals of the Mancoosi project is to bring together researchers from various disciplines to advance the state of the art of package managers. To this end, we organized a sat solving competition specifically tailored to upgrade/installation problems. The winner of the competition was announced during the workshop lococo hosted at the international conference FLOC on the 10th of July 2010. I spent several hours preparing the infrastructure for the competition and here I'd like to give a brief account of my experience. This work was done together with Ralf Treinen, Roberto Di Cosmo and Stefano Zacchiroli.
Other than these *real* problems, we generated a number of artificial problems built from debian repositories. The utility we used is called randcudf and it is freely available on the mancoosi website, as part of the dose3 framework. We took a number of variables into consideration in order to generate problems that are difficult but not too far away from the reality of everyday use.
Among these parameters are
keep and the type of keep (version or package)
Playing around with these variables we were able to produce problems of different size and different degree of complexity. During the competition, for example, the three categories had respectively a universe with 30K, 50K and 100K packages. Moreover we discarded all problems that do not have a solution at all.
From our experience during the problem selection, when considering over 30K packages it is extremely easy to generate cudf problems that do not have a solution at all. For example in debian lenny there are 17K packages connected by a kernel of 80 conflicts. This configuration produces around 5K strong conflicts. This means that if we pick two packages among these 17K there is a high probability that these two packages are in conflict. This is because of the high level of inter-dependencies of open source distributions. With bigger remove/install requests this probability grows even bigger. Since the goal was to provide random problems as close as possible to reality, our documents have a request to install at most 10 packages and remove 10 packages at the same time.
The five categories used in the competition :
Starting from a base system (as generated by debootstrap) we added :
The second mistake was not specifying the exact java version. open-java has subtle differences from sun-java and it seems these differences created a few problems for one of the participants. This problem was quickly rectified.
To run the competition I wrote a few simple bash scripts and test cases. The test cases were meant to test the execution environment and to be sure that all constraints were correctly enforced. The execution environment is available in the mancoosi svn. In practice, we ran the competition in four phases.
In the first one we deployed all solvers in the execution environment. In order to clean up the solver directory and "start fresh" after every invocation, I created an empty git repository for every solver. After each invocation, the repository was cleaned up using
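One way to get this effect, as a sketch only: it assumes that the solver files were committed to the repository after deployment, and the helper name reset_solver_dir is hypothetical. It relies on the standard git reset --hard and git clean commands.

```shell
# Hypothetical helper: bring a solver's git working tree back to its
# committed state. Assumes the solver files were committed beforehand.
reset_solver_dir() {
  dir=$1
  # Discard any modification made to tracked files.
  git -C "$dir" reset --hard --quiet &&
  # Delete untracked files (-f), untracked directories (-d) and ignored files (-x).
  git -C "$dir" clean -f -d -x --quiet
}
```

It would be called after every solver invocation, for example reset_solver_dir /path/to/solver, so that each run starts from the same files.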
In the second phase, we actually ran the competition. The script used is runcomp.sh. It takes 3 arguments: the list of solvers, the list of problems and a timeout in seconds. Since we used the same set of problems for the trendy and paranoid tracks, we ran the competition only once for both tracks. The output of the runcomp.sh script is a directory (e.g. tmp/201007060918) with all the raw results of the competition. All raw data is publicly available here.
In the third phase we compute the aggregate results by track using the script recompute.sh. This script takes 4 arguments: the list of all solvers in one track, the list of problems (the same used before), the timestamp of the last run (e.g. 201007060918) and the name of the track. The output of this script is a file containing all the aggregate results, one per line, of the form category, problem, solver, time, results. For example a snippet from this file looks like:
The last step is the classification of the solutions. The misc-classifier takes as input the aggregate results and outputs the html tables that will then be published on the web.
Running a solver competition is not as easy as it seems. To get it right we ran an internal competition in January 2009 that helped us to highlight, understand and solve different problems. It is mostly a matter of writing down the rules, specifying a clear and understandable protocol for the solver submission (for example, asking participants to version their solver and to provide an md5 hash of the binary is a very good idea in order to avoid mix-ups) and spending some time debugging the scripts. The runsolver utility from Olivier Roussel (available here) is a very nice tool that can take care of many delicate details in process accounting and resource management. I added a small patch to be able to specify a specific signal as *warning* signal. The code is in my private git repository: git clone http://mancoosi.org/~abate/repos/runsolver.git/ . This is the actual code we used for the competition. The 32 bit binary is available in the svn. All in all it was a great experience.
The results of the competition are published here.
I use rssh to allow restricted shell access to my servers. A few weeks ago I noticed a lot of errors in my log of the form
It turned out to be a problem with a recent security update that removed the setuid bit from /usr/lib/rssh/rssh_chroot_helper. Dpkg has a nice way to make such changes permanent, that is dpkg-statoverride. It simply boils down to:
Every now and then I have to re-learn xslt to transform xml documents. My mantra is to use the right tool for the right job... so I'm here again struggling with xslt. There is a bit of a love/hate relation between me and this language. I used it a lot during my master thesis, transforming an enormous - industry size - software specification (in SDL) to a formal language. This was a very painful way of learning a new language. After that I just used it to perform small tasks, but I always manage to forget part of the specification...
Anyway, just to avoid repeating this error again, the lesson today is about xml namespaces. Here you can find a lengthy explanation about namespaces and all the possible related problems with xslt. I stumbled on a very simple case. Why does my xslt stylesheet not work in the presence of a namespace?
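A minimal sketch of such an override. The owner root:root and the mode 4755 (setuid root) are assumptions about what the helper needs, and the run and fix_rssh_helper helpers are hypothetical; run only adds a dry-run mode so the command can be inspected before executing it as root.

```shell
# Assumption: the helper should be owned by root:root with mode 4755 (setuid root).
helper=/usr/lib/rssh/rssh_chroot_helper

# Print the command when DRY_RUN is set, otherwise execute it (needs root).
run() {
  if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

fix_rssh_helper() {
  # --add records the override in dpkg's database so that package upgrades keep it;
  # --update also applies the owner and mode to the file right away.
  run dpkg-statoverride --update --add root root 4755 "$helper"
}
```

With DRY_RUN=1 set, fix_rssh_helper only prints the dpkg-statoverride command line; without it, the command is executed.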
A small motivating example. Suppose you have a very simple xml document
and you want to transform it in a different xml document as :
Therefore you want to:
- change the document root (and namespace)
- copy all the content of the element <a> but the element <b>
- move the element <b> below <a>
- embed the text content of <b> in a CDATA section.
We go one step at a time. First we transform the root element and we change the default namespace. The header of the xsl file is standard except for the declaration of the attribute xmlns:ns="http://mynamespace.org". Since our source xml document has a namespace, we have to match elements in this namespace. In order to do so, we associate a label ns to the namespace http://mynamespace.org and then we use it to match the element <doc>.
The second part is the standard way of copying nodes and contents...
In the third part we match the root element, we create a new element and we copy everything. Since <xsl:apply-templates/> does not have a select it applies by default to all nodes inside the root element.
The result :
Now we want to move the element <b>. We add a new template to match <a> and copy all its content except the node <b>
This template, being more specific than the default template we specified at the beginning of the document, will be applied to <a>. Then we have to modify the template for the root element in order to copy the content of <a> and then the content of <b> below.
This will give something like this :
Last, we want to embed the content of <b> in a CDATA section. This is easy as we just need to add an output directive at the beginning of the file. The complete example:
A long time ago I wrote about how to handle compressed files in ocaml using extlib : http://mancoosi.org/~abate/transparently-open-compressed-files-ocaml
Today I got back to it and added bz2 support. The code is trivial. The only small problem to notice is that since the bz2 interface does not support a char input function, I have to simulate it using Bz2.read. A bit of a hack. I want to look at the bz2 bindings to fix this small shortfall. This is the code: