Re: [Tails-l10n] language statistics

Author: intrigeri
Date:  
To: Tails localization discussion
Subject: Re: [Tails-l10n] language statistics
hi,

sajolida@??? wrote (19 Aug 2012 15:22:35 GMT) :
> I wrote a little script that calculates the total percentage of
> messages on the website that have been translated in each language.


Nice :)

> I'm also sending you the script in attachment for review.


Here's an improved version:

* move to POSIX /bin/sh
* add vim/Emacs modelines
* separately account for fuzzy translations
* ignore obsolete strings
* allow specifying a list of languages as command line arguments
* minor factorization
* stricter msgid matching regexp

How about:

  * adding this script somewhere in our Git repository?
  * adding support for non-ikiwiki .po files, so that we can run this
    script from the root of our Git repository? ... and then it would
    take into account config/chroot_local-includes/usr/share/locale/
  * having cron email this script's output to tails-l10n once every
    6 weeks (e.g. at the beginning of each release cycle, or at freeze
    time)?


(I'm not volunteering, merely suggesting.)

#!/bin/sh
# -*- mode: sh; sh-basic-offset: 4; indent-tabs-mode: nil; -*-
# vim: set filetype=sh sw=4 sts=4 expandtab autoindent:

set -e

LANGUAGES=${@:-de es fr pt}

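# Count the msgid lines read from standard input.
# [[:space:]] is the POSIX spelling of the GNU-only \s.
count_msgids () {
    grep -E '^msgid[[:space:]]+' | wc -l
}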


for lang in $LANGUAGES ; do
    # List every .po file for this language, then merge them into one catalog.
    PO_FILES="$(mktemp -t "XXXXXX.$lang")"
    find . -iname "*.$lang.po" > "$PO_FILES"
    PO_MESSAGES="$(mktemp -t "XXXXXX.$lang.po")"
    msgcat --files-from="$PO_FILES" --output="$PO_MESSAGES"
    TOTAL=$(msgattrib --no-obsolete "$PO_MESSAGES" | count_msgids)
    FUZZY=$(msgattrib --only-fuzzy --no-obsolete "$PO_MESSAGES" | count_msgids)
    TRANSLATED=$(
        msgattrib --translated --no-fuzzy --no-obsolete "$PO_MESSAGES" \
            | count_msgids
    )
    # Avoid dividing by zero when no message was found for this language.
    if [ "$TOTAL" -gt 0 ] ; then
        echo "$lang: $(($TRANSLATED*100/$TOTAL))% translated, $(($FUZZY*100/$TOTAL))% fuzzy"
    else
        echo "$lang: no messages found"
    fi
    rm -f "$PO_FILES" "$PO_MESSAGES"
done
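
As a quick check of what the msgid counting picks up, here is a minimal
sketch run against a made-up PO fragment. The sample entries and the
sample_po helper are hypothetical, not taken from the Tails website, and
the pattern uses [[:space:]] in place of \s on the assumption that
portability across grep implementations matters. It only illustrates the
anchored pattern, it is not a replacement for the msgattrib filtering.

```shell
#!/bin/sh
# Hypothetical PO fragment: a header, two plain entries (one fuzzy),
# one plural entry and one obsolete entry.
sample_po () {
    cat <<'EOF'
msgid ""
msgstr "Content-Type: text/plain; charset=UTF-8\n"

msgid "Hello"
msgstr "Bonjour"

#, fuzzy
msgid "Good bye"
msgstr "Au revoir"

msgid "One file"
msgid_plural "%d files"
msgstr[0] "Un fichier"
msgstr[1] "%d fichiers"

#~ msgid "Obsolete"
#~ msgstr "Obsolète"
EOF
}

# Count lines that start with "msgid" followed by whitespace.
count_msgids () {
    grep -E '^msgid[[:space:]]+' | wc -l
}

sample_po | count_msgids
```

The anchored pattern counts four lines here: the header entry (msgid ""),
"Hello", "Good bye" and "One file". The msgid_plural line and the
commented-out obsolete entry are not matched. Note that the header entry
is counted like any other message.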