[Tails-project] some stats on the website

Author: sajolida
Date:  
To: Public mailing list about the Tails project
Subject: [Tails-project] some stats on the website
At the request of the Internationalization Lab, who helped us translate
the website into Farsi, I compiled some stats on the hits we see on the
website. These are aggregate numbers from April 24 to May 22, so I
thought I might as well publish them here, as they might be of interest
to other people. I'm also documenting my scripts for future reference
and in case I made errors (I often do with these things).


Translation stats
=================

These are the stats we publish in our monthly reports. They have nothing
to do with website hits but, since I'm writing this for the
Internationalization Lab, I thought I'd copy them here as well.

Overall translation of the website
----------------------------------

- de: 50% (2615) strings translated, 43% words translated
- fa: 47% (2502) strings translated, 54% words translated
- fr: 63% (3278) strings translated, 63% words translated
- it: 17% ( 949) strings translated, 18% words translated
- pt: 31% (1661) strings translated, 29% words translated

Total original words: 53520

Core pages of the website
-------------------------

See https://tails.boum.org/contribute/l10n_tricks/core_po_files.txt

- de: 79% (1432) strings translated, 79% words translated
- fa: 40% ( 726) strings translated, 42% words translated
- fr: 73% (1330) strings translated, 77% words translated
- it: 49% ( 886) strings translated, 56% words translated
- pt: 55% (1001) strings translated, 55% words translated

Total original words: 14006


Hits per language
=================

for lang in en fa fr de ; do
    echo -n "${lang} "
    grep -E "GET .+\.${lang}\.html HTTP/1\..\" 200" access.log* | wc -l
done

en 1501323 (83.1%)
fa 11468 ( 0.6%)
fr 124823 ( 6.9%)
de 170007 ( 9.4%)
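
The percentages above can be recomputed from the raw counts. Here is a
minimal sketch, assuming the counts are fed in as "lang count" pairs;
the function name hit_shares is hypothetical and not part of the
original scripts:

```shell
#!/bin/sh
# Hypothetical helper: read "lang count" pairs on stdin and print each
# language's share of the total, rounded to one decimal place.
hit_shares() {
    awk '
        { lang[NR] = $1; count[NR] = $2; total += $2 }
        END {
            for (i = 1; i <= NR; i++)
                printf "%s %d (%4.1f%%)\n", lang[i], count[i], 100 * count[i] / total
        }
    '
}

# Counts copied from the results above.
hit_shares <<'EOF'
en 1501323
fa 11468
fr 124823
de 170007
EOF
```

The shares are relative to the four languages counted here, not to all
hits on the website.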


Top 50 pages in Farsi and their hits
====================================

Note that a hit on a page's Farsi URL doesn't mean that the page is
actually translated into Farsi. For example, the pages ranked 2, 3, 8,
10, and 12 are not translated into Farsi.

grep -E "GET .+\.fa\.html HTTP/1\..\" 200" /tmp/access.log \
    | sed -n -re 's/.* ([^ ]+)\.fa\.html HTTP.*/\1/p' \
    | sort | uniq -c | sort -rn | head -n 50

    686 /index
    312 /install
    189 /install/os
    183 /news
    162 /about
    128 /support/faq
    128 /getting_started
    127 /install/win
    126 /doc/anonymous_internet/Tor_Browser
    120 /install/win/usb
    106 /news/version_2.3
    106 /install/win/usb/overview
    105 /doc
    103 /support/known_issues
    103 /doc/first_steps/startup_options/bridge_mode
     85 /doc/about/license
     84 /support
     83 /contribute
     82 /press
     74 /doc/about/warning
     74 /contribute/how/donate
     67 /security
     66 /doc/first_steps/introduction_to_gnome_and_the_tails_desktop
     66 /doc/about/trust
     64 /install/vm
     63 /doc/anonymous_internet/claws_mail_to_icedove
     59 /doc/encryption_and_privacy/secure_deletion
     59 /doc/anonymous_internet/tor_status
     57 /doc/anonymous_internet/icedove
     56 /doc/anonymous_internet/electrum
     55 /doc/first_steps/bug_reporting
     55 /doc/anonymous_internet/pidgin
     55 /doc/about/features
     54 /doc/anonymous_internet/i2p
     53 /doc/first_steps/startup_options/network_configuration
     52 /install/dvd
     51 /security/Numerous_security_holes_in_2.2.1
     51 /install/debian
     50 /doc/introduction
     50 /doc/anonymous_internet/index
     49 /news/version_1.7
     49 /doc/first_steps/upgrade
     49 /doc/first_steps/startup_options/mac_spoofing
     48 /doc/about/openpgp_keys
     48 /doc/about/acknowledgments_and_similar_projects
     47 /news/version_2.2.1
     46 /news/version_2.2
     46 /doc/advanced_topics/virtualization
     45 /doc/first_steps/installation/manual/linux
     44 /install/win/clone/overview
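
To illustrate how the pipeline above turns log lines into a ranking,
here is a small sketch that wraps it in a function and runs it on a few
made-up log lines. The function name top_pages and the sample log
format are assumptions for illustration; the real access.log format is
not shown here.

```shell
#!/bin/sh
# Hypothetical helper: rank the pages of one language by successful hits.
# $1 is a two-letter language code; log lines are read from stdin.
top_pages() {
    lang=$1
    grep -E "GET .+\.${lang}\.html HTTP/1\..\" 200" |
        sed -n -E "s/.* ([^ ]+)\.${lang}\.html HTTP.*/\1/p" |
        sort | uniq -c | sort -rn
}

# Made-up sample lines, for illustration only.
top_pages fa <<'EOF'
- - [date] "GET /index.fa.html HTTP/1.1" 200 1234
- - [date] "GET /install.fa.html HTTP/1.1" 200 2345
- - [date] "GET /index.fa.html HTTP/1.1" 200 1234
- - [date] "GET /index.fr.html HTTP/1.1" 200 1234
- - [date] "GET /news.fa.html HTTP/1.1" 404 512
EOF
```

Only requests answered with status 200 for the chosen language are
counted, so the French hit and the 404 in the sample are ignored.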



Top 50 pages across all languages
=================================

grep -E "GET .+\...\.html HTTP/1\..\" 200" /tmp/access.log \
    | sed -n -re 's/.* ([^ ]+)\...\.html HTTP.*/\1/p' \
    | sort | uniq -c | sort -rn | head -n 50

 554957 /news
 154156 /install
 146827 /install/os
  99661 /install/win
  70154 /index
  65685 /install/win/usb/overview
  62222 /install/win/usb
  40132 /about
  33440 /install/debian
  28471 /getting_started
  22188 /doc/about/warning
  20882 /install/debian/usb
  20486 /news/version_2.3
  20428 /install/dvd
  20370 /install/debian/usb/overview
  19971 /install/linux
  19889 /doc
  19305 /install/vm
  14709 /install/mac
  12914 /install/win/clone/overview
  12627 /doc/about/features
  11638 /install/clone
  10947 /install/download
  10279 /support/faq
   9732 /doc/first_steps/installation
   9239 /install/linux/usb/overview
   9072 /security/Numerous_security_holes_in_2.2.1
   8242 /doc/first_steps/reset/windows
   8204 /support/known_issues
   8024 /install/linux/usb
   7871 /support
   6729 /doc/first_steps/installation/manual
   6351 /doc/first_steps/startup_options/bridge_mode
   6243 /doc/first_steps/startup_options/administration_password
   6136 /doc/advanced_topics/virtualization/virtualbox
   6088 /install/mac/usb/overview
   5808 /doc/about/license
   5746 /doc/get/verify
   5654 /doc/first_steps/persistence/configure
   5301 /doc/first_steps/startup_options
   5123 /install/expert/usb/overview
   4986 /doc/first_steps/startup_options/mac_spoofing
   4968 /doc/first_steps/start_tails
   4772 /install/mac/usb
   4704 /doc/advanced_topics/virtualization
   4698 /doc/about/fingerprint
   4677 /install/expert/usb
   4386 /doc/first_steps/upgrade
   4333 /doc/first_steps/reset/linux
   4289 /doc/about/requirements


Top 20 user agents on Farsi pages
=================================

grep -E "GET .+\.fa\.html HTTP/1\..\" 200" /tmp/access.log \
    | sed -e 's/ / /g' | cut -d ' ' -f 17 \
    | sort | uniq -c | sort -rn | head -n 20

   7722 "Mozilla/5.0
    731 "Domain
    701 "Mozilla/4.5
    576 "Wget/1.15
    490 "Mozilla/4.0
    274 "Googlebot/2.1
    252 "Riddler
    232 "GigablastOpenSource/1.0"
     99 "ltx71
     61 "PrivateSearch/0.1.0
     61 "eilisabot/1.0.0-beta"
     54 "yacybot
     49 "ResearchBot;
     27 "UserAgent"
     21 "DoCoMo/2.0
     15 "SAMSUNG-SGH-E250/1.0
     14 "Opera/9.80
     11 "-"
     10 "Ruby"
      7 "UCWEB/2.0
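
Because the command above cuts on spaces, it keeps only the first token
of each user agent, which is why entries such as "Mozilla/5.0 appear
truncated. Here is a minimal sketch that counts full user-agent strings
instead. It assumes the user agent is the last double-quoted field on
each log line, as in Apache's "combined" format; the log format of this
server may differ, and the function name top_user_agents is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count full user-agent strings, assuming the user
# agent is the last double-quoted field on each log line (as in Apache's
# "combined" format). The actual log format here may differ.
top_user_agents() {
    # Splitting on double quotes leaves an empty last field, so the
    # user agent is the field before it.
    awk -F'"' 'NF >= 3 { print $(NF - 1) }' | sort | uniq -c | sort -rn
}

# Made-up sample lines, for illustration only.
top_user_agents <<'EOF'
- - [date] "GET /index.fa.html HTTP/1.1" 200 1234 "-" "Mozilla/5.0 (X11; Linux x86_64)"
- - [date] "GET /doc.fa.html HTTP/1.1" 200 2345 "-" "Wget/1.15 (linux-gnu)"
- - [date] "GET /news.fa.html HTTP/1.1" 200 3456 "-" "Mozilla/5.0 (X11; Linux x86_64)"
EOF
```

Counting full strings spreads the hits over many more distinct entries,
so the first-token view above remains handy for a coarse overview.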