dependency graphs with debtree for drupal

I've generated a few graphs with debtree for the drupal modules (10 Feb 2010).

The list of direct dependencies is not very interesting. Dependencies in drupal are mostly flat, or they go one or two levels deep... You can still spot redundant dependencies, like for the module get-node, where the dependency on contents could be left implicit by virtue of the fact that imagefield depends indirectly on it.

The list of reverse dependencies (actually, both direct and reverse dependencies are shown in this graph) is equally unexciting. Since all dependencies are conjunctive (meaning that all dependencies must be satisfied for a package to work), these graphs can give a measure of the importance of a package with respect to the other modules in the repository.

debtree with a local repository

debtree is a fantastic tool to create colorful graphs of package dependencies. One small shortcoming is that the user cannot directly provide a Packages.gz file to be used as repository. Since debtree is based on the excellent apt_pkg library, it is actually not that difficult to convince apt to look in a different location. To change the default behaviour of debtree (and apt-get) you just need to create a new apt-get repository and then set the environment variable APT_CONFIG appropriately.

So ... imagine you create a repository as :

mkdir -p /tmp/apt/{archives,lists}/partial
cp $yourPackagesfile /tmp/apt/lists
cp $yourstatusfile /tmp/apt/

now you need to create a new apt.conf file that looks like :

APT::Get::List-Cleanup "false";
Dir::Cache /tmp/apt;
Dir::State /tmp/apt;
Dir::State::status /tmp/apt/status;
Dir::Etc::SourceList /tmp/apt/sources.list;

and now you're ready to play with your new repo :

APT_CONFIG=apt.conf apt-get update
APT_CONFIG=apt.conf debtree dpkg

done.

gammu x301

The other day I played a bit with gammu on my laptop. The integrated modem is an Ericsson F3507g. There is a lot of useful information about this modem around. This is just a summary of the various bits I found on the net.

First you need to install gammu, which is packaged for debian, so no prob. Then, to connect to the modem, you need a file .gammurc in your home directory (or in /etc) and the correct permissions to talk to the modem. This is what my conf file looks like:

[gammu]
port = /dev/ttyACM0
connection = at19200

To enable the modem, if it is not already enabled at boot, you will have to switch it on and then initialize it.

echo 1 > /sys/devices/platform/thinkpad_acpi/wwan_enable
/usr/sbin/chat -v "" "AT+CFUN=1" "+PACSP0" "AT" "OK" > /dev/ttyACM2 <  /dev/ttyACM2

And that's it. Gammu is ready to talk to the modem and tell me a few things :

zed:~# gammu --identify
Device               : /dev/ttyACM0
Manufacturer         : Ericsson
Model                : unknown (F3507g)
Firmware             : R1B/1
IMEI                 : xxxxxxxxxxxxxxxxxxxx
SIM IMSI             : xxxxxxxxxxxxxxxxxxx

The gammu reference manual is nice to have at hand ...

Unfortunately I've realized that the SIM card I'm using is not able to register on the network...

zed:~# gammu --networkinfo
Network state        : registration to network denied
GPRS                 : detached

Digging a bit more, this can be seen by talking AT directly with the modem:

AT+CREG=2
OK
AT+CREG?
+CREG: 2,3
This page details the list of AT commands you can use. The 3 part in +CREG: 2,3 means "Registration denied" ... sigh, time to find a new SIM...
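To read the +CREG answer without going back to the command list every time, here is a minimal Python sketch. The status table follows the standard AT+CREG definition (3GPP TS 27.007); the function name parse_creg is mine, and it only handles the reply to AT+CREG? (two leading fields), not the unsolicited one-field form.

```python
# Registration status codes of the standard AT+CREG command (3GPP TS 27.007).
CREG_STATUS = {
    0: "not registered, not searching",
    1: "registered, home network",
    2: "not registered, searching",
    3: "registration denied",
    4: "unknown",
    5: "registered, roaming",
}

def parse_creg(line):
    """Decode a '+CREG: <n>,<stat>[,...]' reply into (n, stat, description).

    parse_creg is a hypothetical helper, not part of gammu.
    """
    prefix = "+CREG:"
    if not line.startswith(prefix):
        raise ValueError("not a +CREG reply: %r" % line)
    fields = [f.strip() for f in line[len(prefix):].split(",")]
    n, stat = int(fields[0]), int(fields[1])
    return n, stat, CREG_STATUS.get(stat, "unknown status code")

print(parse_creg("+CREG: 2,3"))
```

Fed with the reply I got from the modem, it reports status 3, that is "registration denied".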

analyzing drupal dependencies for fun and profit

After a few inspiring talks in the drupal room at fosdem I decided to spend a few hours to figure out the module dependency system in drupal.

Drupal has a highly modular design. The core is composed of a set of required modules (dependencies) and a set of optional modules (suggests). All contrib modules declare similar dependencies between each other. All dependencies are conjunctive, that is, in order to install a component all its dependencies must be satisfied. There are no conflicts between components, and this implies that a module is always installable as long as its dependencies are available. The only implicit conflict is that two versions of the same module cannot be installed at the same time. This makes the module installation algorithm trivial, as it is equivalent to a simple visit of the dependency graph (that might have cycles).
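To make the "simple visit" concrete, this is a minimal Python sketch of it. It is not drupal's installer code: the function install_set and the dictionary layout are my own assumptions, and the modules a and b are a made-up cycle.

```python
def install_set(module, depends):
    """Return every module that must be present to install `module`.

    `depends` maps a module name to the list of its direct dependencies.
    This is a plain depth-first visit; install_set is a hypothetical name.
    """
    needed = set()
    stack = [module]
    while stack:
        current = stack.pop()
        if current in needed:
            continue  # already visited, which also makes cycles harmless
        needed.add(current)
        stack.extend(depends.get(current, []))
    return needed

depends = {
    "tables": ["filter"],
    "filter": [],
    # hypothetical modules that depend on each other
    "a": ["b"],
    "b": ["a"],
}

print(sorted(install_set("tables", depends)))
print(sorted(install_set("a", depends)))
```

Since every dependency is a plain conjunction, there is no choice to make during the visit, and each module is looked at once.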

There is a nice page on the drupal website explaining the format of the metadata for the next version of drupal .

For example :

name = Tables Filter
description = Provides a filter that converts a [table  ] macro into HTML encoded table.
dependencies[] = filter
package = Input filters
core = 6.x

; Information added by drupal.org packaging script on 2009-09-10
version = "6.x-1.0"
core = "6.x"
project = "tables"
datestamp = "1252563652"
Note the conversion into an intermediate aggregate format below.

In order to analyze all modules' dependencies I downloaded all available modules for release 6 of drupal (15th Feb 2010), extracted all the metadata and transformed it into something that the tools in dose3 can handle. While downloading all the project archives I also found that a significant number of archives cannot be downloaded (403 / 404) and that there are a few mistakes in the metadata ... I'll blog about this in the future maybe.

Numbers and intermediate aggregate modules list

From the .info file in each module archive, I extracted all the relevant data and transformed it into an 822-style format similar to the one used in debian. There are about 4800 modules in the drupal repository for drupal 6.x.

This is a small snippet representing a few drupal core modules and a meta package (that I created from the metadata to express the core's dependencies) :

[...]
package: tables
version: 6.x-1.0
depends: filter

package: blogapi
version: 6.15

package: profile
version: 6.15

package: filter
version: 6.15

package: drupal
version: 6.15
depends: system , user , block , node , filter
provides: core = 6.15
suggests: translation , comment , menu , openid , contact , tracker , forum , ping , syslog , help , dblog , search , trigger , poll , update , locale , php , path , taxonomy , color , aggregator , upload , throttle , statistics , blog , book , blogapi , profile
[...]
Since I'm considering only modules for drupal version 6.x, all dependencies for core >= 6.0 , core < 7.0 are left implicit.
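The conversion itself is not much more than this. What follows is a minimal Python sketch and not the script I actually used: the function info_to_stanza is a hypothetical name, and the choice of mapping the project field to package: is an assumption that happens to match the tables example above.

```python
EXAMPLE = """\
name = Tables Filter
description = Provides a filter that converts a [table  ] macro into HTML encoded table.
dependencies[] = filter
package = Input filters
core = 6.x

; Information added by drupal.org packaging script on 2009-09-10
version = "6.x-1.0"
core = "6.x"
project = "tables"
datestamp = "1252563652"
"""

def info_to_stanza(text):
    """Turn the text of a drupal .info file into an 822-style stanza."""
    fields = {}
    depends = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # blank line or comment
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip().strip('"')
        if key == "dependencies[]":
            depends.append(value)
        else:
            fields[key] = value  # a later assignment overrides an earlier one
    # Assumption: the project name becomes the package name.
    lines = ["package: %s" % fields["project"],
             "version: %s" % fields["version"]]
    if depends:
        lines.append("depends: %s" % " , ".join(depends))
    return "\n".join(lines)

print(info_to_stanza(EXAMPLE))
```

A real converter also has to cope with the broken metadata mentioned earlier (missing project or version fields, for a start), which this sketch simply does not handle.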

Dependency graphs

The result is a set of nice graphs showing, for each package, its (deep) dependencies. From the global dependency graph, I've extracted the "connected" components, that is, all modules that are related with each other in some way. This generates 375 sub-graphs. This is the top 10 (WARNING: some of the biggest pdfs systematically manage to thrash my workstation... handle with care) ... and circo didn't manage to create the pdf for views and taxonomy:

The complete list is here

From these graphs, it seems that apart from a couple of dozen packages, the rest of the drupal components are loosely connected. I don't think this is a matter of code sharing. More likely it is because the drupal repository has a plethora of small components with a very specific functionality that only depend on the drupal core.
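For the curious, extracting the "connected" components mentioned above boils down to ignoring the direction of the dependency arrows and grouping the modules that can reach each other. A minimal Python sketch, assuming the implicit dependency on core is left out (otherwise everything would presumably end up in one big component); connected_components and the module names views_ui and views are only illustrative.

```python
from collections import defaultdict

def connected_components(depends):
    """Group modules that are linked by dependencies, in either direction."""
    # Build an undirected adjacency map from the directed dependencies.
    neighbours = defaultdict(set)
    for module, deps in depends.items():
        neighbours[module]  # make sure isolated modules show up too
        for dep in deps:
            neighbours[module].add(dep)
            neighbours[dep].add(module)

    seen = set()
    components = []
    for start in sorted(neighbours):
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        components.append(component)
    return components

depends = {
    "tables": ["filter"],
    "views_ui": ["views"],  # illustrative names
    "blogapi": [],
}

for component in connected_components(depends):
    print(sorted(component))
```

Each resulting set can then be handed to the graph tool of your choice as a separate sub-graph.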

Dist check

Distcheck is a small utility that transforms package dependencies into a propositional logic problem and then uses a SAT solver to simulate their installation. Since there are no conflicts, it should always be possible to install a package. The only reason for a package to be broken is a missing dependency in the repository. Periodically performing this analysis could prevent the distribution of broken packages.
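Because there are no conflicts, the same check can be approximated without a SAT solver at all. The following Python sketch is not how distcheck works internally: it swaps the SAT encoding for a plain graph traversal, which is only good enough as long as dependencies stay conjunctive and conflict-free. The function broken_packages and the modules gallery and album are made up for the example.

```python
def broken_packages(depends):
    """Map each uninstallable package to the missing names in its closure.

    `depends` maps every package available in the repository to its direct
    dependencies. A name that is depended upon but is not a key of
    `depends` counts as missing.
    """
    broken = {}
    for package in depends:
        missing = set()
        seen = set()
        stack = [package]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            if current not in depends:
                missing.add(current)  # not in the repository
                continue
            stack.extend(depends[current])
        if missing:
            broken[package] = missing
    return broken

repository = {
    "tables": ["filter"],
    "filter": [],
    "gallery": ["imagefield"],  # hypothetical: imagefield is not in the repo
    "album": ["gallery"],       # hypothetical: broken only indirectly
}

print(broken_packages(repository))
```

Note that album is reported as broken even though its direct dependency exists, which is exactly the kind of problem that is easy to miss by eye.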

Conclusions

  1. Periodic generation of aggregate module metadata information.
  2. Dist Check the module repository to avoid releasing a module that is not installable due to a missing dependency.
  3. Integrate a developer tool to display all the dependencies of a module (like debtree, or directly using debtree).
  4. As the system grows it might be necessary to review the dependency system to include disjunctive dependencies and conflicts between modules. At present this might not be necessary. Adding more expressivity to the dependency system will of course significantly increase the complexity of the installation problem (from polynomial to NP-complete).

I think it is important to spend a few words on this last point. It is clear that not all 4800 packages can be installed at the same time. Just think about the filter modules that manipulate users' submissions. At the moment the only way a site developer has to discover a conflict is to try out the module and check that it did not break anything else on the site. Given the complexity of many drupal sites this can be a painful and costly task to perform.

Adding conflicts to the metadata will make module integration much easier for site developers, and move the burden of finding potential problems to the module developers and to the module installer. As I said before, if we include conflicts (that is negation, in logical terms) the problem of installing a new module suddenly becomes NP-complete. Running an NP-complete algorithm on a webserver is of course a bad idea, but using drush offline to run complex install operations should be completely acceptable, just as it is acceptable to wait for apt-get to install the latest program on debian.

If conflicts are indeed needed it would be fun to have a mod_php_minisat and to implement a small dependency solver in php !

create rpm packages on a debian machine

This gave me a bit of a headache ... Why on earth does rpmbuild not simply respect environment variables, or have a command line option to specify the TEMPDIR used to build the package, or simply look for a configuration file in the local directory ? From the command line you can only specify --buildroot . For everything else you must use a global file called .rpmmacros (not .rpmrc !) and write the new defaults there. This must be either in your home, or in /etc (or other rpm specific paths). Anyway ... enough of this rant. This is the hack to create a local build environment :

#!/bin/bash

# the build tree lives under the current directory
RPMBUILD="$(pwd)/rpmbuild"
RPMMACROS="$HOME/.rpmmacros"

# keep a backup of any existing ~/.rpmmacros before overwriting it
if [ -f "$RPMMACROS" ] ; then
  cp "$RPMMACROS" "$RPMMACROS~"
fi

echo "%_topdir  $RPMBUILD" > "$RPMMACROS"
echo "%_tmppath $RPMBUILD/tmp" >> "$RPMMACROS"
echo "%_rpmdir $(pwd)/RPMS" >> "$RPMMACROS"
echo "%mkrel(c:) %(echo mdv20010)" >> "$RPMMACROS"

mkdir -p "$RPMBUILD"/{SRPMS,BUILD,RPMS,SPECS,tmp}

The mkrel part is something else that is needed if you want to build a package that resembles a mandriva package ... The rpm package in debian does not include this macro by default.

I'm using the rpmbuild part of the rpm package on debian sid (RPM version 4.7.2).

generate hdlist/synthesis files on a debian system

Imagine you want to create hdlist/synthesis files from a bunch of rpms on a debian system. If you know what you need, it is not actually that difficult. The mandriva svn contains the tool genhdlist2. To use it on a debian machine we need to download and install two perl modules that are kept in the same svn tree, namely perl-URPM and MDV-Packdrakeng. The tool we want to use is in the rpmtools directory. You can svn checkout everything from http://svn.mandriva.com/svn/soft/rpm .

Once you have downloaded everything, you need either to install these perl libraries system-wide (bad idea...) or to install them locally and fiddle a bit with the genhdlist2 script in order to specify the location of these two libraries. Personally I added these two lines at the beginning of the file :

use lib '/tmp/perl-URPM/trunk/blib/lib';
use lib '/tmp/MDV-Packdrakeng/trunk/lib';

Also make sure that you compile the URPM.so component of the perl library and copy it to the lib directory.

Now you are ready to generate your hdlist/synthesis files just by giving the tool a directory with all the rpms, as in perl genhdlist2 rpms

et voila.

Update

I've committed the script and supporting modules here if you are interested. To access the forge you need a username/password: you can use mancoosi/mancoosi if you want anonymous access.

expose your command line application on the web with python and fcgi

So one day you're too lazy to write a fcgi library for your favorite language, but you nonetheless want to expose an application on the web... Then use python ! There are quite a few frameworks to run fcgi with python, but if you want something easy, I think that flup is for you.

The code below takes care of a few aspects for you. First, flup spawns a server listening on port 5555 on localhost. You can configure it to be multi-threaded if you want to. Then, using the cgi module, we make sure that the input is clean and ready to use. Finally we run your fantastic application as DOSOMETHING. If your application is a simple program, of course there is no reason to write a fcgi. A plain cgi will fit the bill. However, if your application can benefit from some form of caching, then writing the web related stuff in python and using the application as a black box can be a nice idea.

You might want to check out [flup http://trac.saddi.com/flup] and [werkzeug http://werkzeug.pocoo.org/]. I've not used the latter, but it seems more complete than flup.

#!/usr/bin/python
# -*- coding: UTF-8 -*-

from cgi import escape
import sys, os
from flup.server.fcgi import WSGIServer
import subprocess
import urlparse
import cgi
# expose python errors on the web
import cgitb
cgitb.enable()

def app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/html')])
    yield '<html><head></head>'
    yield '<body>\n'

    # parse the query string / POST body of the request
    form = cgi.FieldStorage(fp=environ['wsgi.input'], environ=environ, keep_blank_values=1)

    # form.list can be None when the request body is not form data
    fields = form.list or []

    yield '<table border="1">'
    if fields:
        yield '<tr><th colspan="2">Form data</th></tr>'

    for field in fields:
        # escape the user input before echoing it back in the page
        yield '<tr><td>%s</td><td>%s</td></tr>' % (escape(field.name), escape(field.value))

    yield '</table>'

    if fields and 'c' in form and 'l' in form:
        cat = form.getvalue('c')
        pl = form.getlist('l')
        # DOSOMETHING is a placeholder: it must return the argument list
        # (program name first) used to run your application
        command = DOSOMETHING(cat,pl)
        results = subprocess.Popen(command,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
        (out,err) = results.communicate()
        yield '%s\n' % out
        yield '%s\n' % err

    yield '</body>\n'
    yield '</html>\n'

WSGIServer(app,bindAddress=('localhost',5555)).run()

quote of the day

You may well ask: "Why direct action? Why sit-ins, marches and so forth? Isn't negotiation a better path?" You are quite right in calling for negotiation. Indeed, this is the very purpose of direct action. Nonviolent direct action seeks to create such a crisis and foster such a tension that a community which has constantly refused to negotiate is forced to confront the issue. It seeks so to dramatize the issue that it can no longer be ignored. My citing the creation of tension as part of the work of the nonviolent-resister may sound rather shocking. But I must confess that I am not afraid of the word "tension." I have earnestly opposed violent tension, but there is a type of constructive, nonviolent tension which is necessary for growth. Just as Socrates felt that it was necessary to create a tension in the mind so that individuals could rise from the bondage of myths and half-truths to the unfettered realm of creative analysis and objective appraisal, so must we see the need for nonviolent gadflies to create the kind of tension in society that will help men rise from the dark depths of prejudice and racism to the majestic heights of understanding and brotherhood. - Martin Luther King: "Letter from Birmingham"

to read :

fosdem cool stuff

Well... I went to fosdem 2010:) Very nice indeed as every year. Kudos to the organizers. Even though this year I didn't manage to grab a t-shirt as I usually do ...

identi.ca

I attended a presentation about identi.ca. I'm not much of a micro-blogging person. I like social networks, but just to get in touch with friends and nothing else. What I like about identi.ca (as many other people at fosdem) is the FOSS side of it and its implementation of open standards. The talk was well delivered and informative. Thanks !

gwibber

So, during a presentation I snooped on my neighbor's laptop and he was using twhirl, which is a nice "freeware" piece of software. Closed source ? No way ! There is a nice FOSS alternative though. Gwibber. I tried it out yesterday and it does a nice job. From its homepage, this is the description.

Gwibber is an open source microblogging framework and desktop client for GNOME developed with Python and GTK+. The Gwibber backend is a stand-alone daemon that manages updates and retrieves stream data from social networks. The Gwibber backend can be accessed through D-Bus and currently uses GConf to store account configuration info.

homepage: http://live.gnome.org/Gwibber

Does it work ? uhmmm playing with it I would say that it is still a bit young. The interface does not always work (how do I reply to a message, or how do I subscribe to a specific tag on identi.ca ?) and it is not very well polished. A piece of software to be considered, but not ready for prime time, at least for me.

Nmap Scripting Engine (NSE)

I attended a very nice presentation about NSE, the nmap scripting engine, which I didn't know at all. It's a very powerful tool to scan and analyze networks. There is a free chapter of the nmap book available and I think it's well worth reading.

This is the material from the presentation that includes a very nice handout about nse.

lua

And NSE scripts are written in Lua, which is a powerful, fast, lightweight, embeddable scripting language. I've already come across it and I think I'll spend some time to see if it can be useful for my projects.

homepage : http://www.lua.org/about.html

Drupal

As every year I spent a few hours in the Drupal room. The drupal community is extremely active. I failed to attend the talk about the upcoming drupal 7 release. Fortunately I've found this keynote online, which is worth watching if you are interested. Drupal 7 is going to be a super release both for site designers and developers. I'm really looking forward to it.

During a presentation about installing and developing with Drupal, the discussion moved on to the dependency system of the modules and plugins in Drupal. Needless to say, this might be a nice application of the work we are doing with mancoosi. And it should also be reasonably easy to integrate it with drush. Now I just need a php binding of a sat solver that understands CUDF. AH!

guake

Guake is a quake-console-style unix terminal for gnome. I'm addicted. I've configured guake to slide down with alt+space, in the same way I have ubiquity on firefox. I feel at home. The console is there when I need it, it's fast, and with the correct key-bindings it is just like gnome-terminal. I'm definitely happy I've discovered it. Next step is going to be a tiling window manager, but this is material for another post.

this is the wikipedia page about it : http://en.wikipedia.org/wiki/Guake Its homepage seems down at the moment.

I attended many other presentations, but this is enough for one post. See you at fosdem 2011

ExtLib OptParse (part 2)

I already wrote something about OptParse last month. Today I discovered how to create a new option (that is not a string, int or bool) and validate it within the arg parser.

So suppose we want to write an application that can output both txt and html and we want the user to specify the format with a command line option. One way would be to use a StdOpt.str_option - possibly with a default value - and to retrieve it in the application code with OptParse.Opt.get.

However this is not satisfactory as we are mixing the application code with command line parsing. A better way is to create a new type of option with Opt.value_option .

This is the concept :

type out_t = Txt | Html
module Options = struct
  open OptParse
  exception Format
  let out_option ?default ?(metavar = "<txt|html>") () =
    let corce = function
      |"txt" -> Txt
      |"html" -> Html
      | _ -> raise Format
    in
    let error _ s = Printf.sprintf "%s format not supported" s in
    Opt.value_option metavar default corce error

  let output = out_option ~default:Txt ()

  let description = "This is an example"
  let options = OptParser.make ~description:description ()

  open OptParser
  (* a bare expression is not allowed here: bind the unit result instead *)
  let () = add options ~short_name:'o' ~long_name:"out" ~help:"Output type" output
end

Note that the function Opt.value_option takes a metavar - that is the string associated with the option in the help (  -o<txt|html>, --out=<txt|html> Output type ), a default value, a coercion function (corce above), that is, a function that transforms a string into the desired type, and an error function that is used by the parser to give a meaningful error if the option is not correctly validated.

For example :

$./test.native -oooo
usage: test.native [options]

test.native: option '-o': ooo format not supported

Now when we use this new option in the application code with OptParse.Opt.get we can be certain that it was correctly validated.
