Re: [T(A)ILS-dev] Metadata Anonymizing Toolkit for file Publ…

Author: intrigeri
Date:  
To: Antonio Davoli
CC: Damian Johnson, Tor Assistants, The T(A)ILS public development discussion list, Peter Eckersley, tech
Subject: Re: [T(A)ILS-dev] Metadata Anonymizing Toolkit for file Publication - GSoC'11 Proposal
Hi,

Antonio Davoli wrote (03 Apr 2011 16:45:53 GMT) :
> I am sorry for the late answer but I had to prepare a talk for this
> Monday.


No problem, as long as you give us enough time to read your next
proposal version, so that you can fix the last details before the
(now approaching) deadline.

> I had checked them before sending the proposal. It seems that all
> the libraries (with the exception of librtf) are supported without
> problems on Windows and Mac OS X.


Sounds good.

> However I am thinking of changing the proposal to not include the
> multi-platform capability. It could always be added in a later
> version.


Ok.

> I am planning to add support for ZIP archives, as I did for tar
> ones.


Great.

> I think that the idea of also anonymizing the files inside the
> archive is quite interesting and I want to add it to the new version
> of the proposal.


Ok, as long as it does not take too much time away from working on
more important features / integration.
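
To make the idea concrete, here is a minimal sketch of what "anonymizing
the files inside the archive" could look like for ZIP, using only the
Python standard library. The names clean_member, anonymize_zip and
FIXED_DATE are hypothetical, not part of any existing tool, and
clean_member is a placeholder where a real per-format metadata stripper
would be called.

```python
import io
import zipfile

# Earliest timestamp the ZIP format can store; used as a neutral value.
FIXED_DATE = (1980, 1, 1, 0, 0, 0)


def clean_member(name, data):
    """Hypothetical hook: a real tool would strip embedded metadata
    according to the member's file type. Here the data is unchanged."""
    return data


def anonymize_zip(src_bytes):
    """Return the bytes of a new ZIP archive whose members went through
    clean_member() and whose headers carry neutral timestamps."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(src_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = clean_member(info.filename, src.read(info))
            # A fresh ZipInfo drops the per-member comment and extra
            # fields; the archive-level comment is not copied either.
            fresh = zipfile.ZipInfo(info.filename, date_time=FIXED_DATE)
            fresh.compress_type = zipfile.ZIP_DEFLATED
            dst.writestr(fresh, data)
    return out.getvalue()
```

Rewriting into a new archive, instead of patching the old one in place,
keeps the sketch short; whether that is acceptable for large archives
is an open design question.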

>> - What about video files? I expect you to at least make it clear in
>> your proposal why you put these aside, if you really plan to do so.
>>


> I found a Python project that supports XMP metadata
> (http://code.google.com/p/python-xmp-toolkit/). It is based on the
> Adobe SDK and supports a long list of file types, including several
> video formats.


XMP support sounds great, but can't video files have embedded metadata
in other forms too?
Also, python-xmp does not seem to be available in Debian.

>> > As for the command-line utilities, one tool will be created for
>> > each file type.
>>
>> What is the rationale for exposing such details to the user, instead
>> of a common command-line frontend?
>>


> Maybe it is better to remove this capability and keep only a single,
> classic front end for user interaction.


If you have good reasons to prefer the "one command-line tool per file
type" way, don't hesitate to state them.
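
For comparison, a common front end can hide the per-format details
behind a single command. The sketch below assumes a hypothetical
mapping from file extension to handler; clean_image, clean_archive and
HANDLERS are illustrative names only, and the handlers just report what
they would do.

```python
import argparse
import os


# Hypothetical per-format handlers; real ones would strip metadata.
def clean_image(path):
    return "image cleaner: " + path


def clean_archive(path):
    return "archive cleaner: " + path


# One table maps file extensions to handlers, so the user never has to
# pick a format-specific tool.
HANDLERS = {
    ".jpg": clean_image,
    ".png": clean_image,
    ".tar": clean_archive,
    ".zip": clean_archive,
}


def dispatch(path):
    """Pick the handler from the file extension and run it."""
    ext = os.path.splitext(path)[1].lower()
    handler = HANDLERS.get(ext)
    if handler is None:
        raise ValueError("unsupported file type: " + path)
    return handler(path)


def main(argv=None):
    parser = argparse.ArgumentParser(
        description="Strip metadata from files (sketch).")
    parser.add_argument("files", nargs="+", help="files to clean")
    args = parser.parse_args(argv)
    return [dispatch(path) for path in args.files]
```

Calling main(["photo.jpg", "backup.zip"]) runs the image handler on the
first file and the archive handler on the second. Dispatching on the
extension is the simplest choice for a sketch; inspecting the file
contents would be more robust.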

>> srm is quite a common name. Could you please point us to the
>> implementation you are talking about?
>>


> I was thinking of using the wipe implementation
> (http://wipe.sourceforge.net/) or the secure-delete package. Both of
> them are available as Debian packages. However I would like to
> spend some time comparing these two implementations.


shred is another alternative you might want to consider.
It ships in the GNU coreutils package and is thus installed by
default on every GNU/Linux system, I think.
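
As an illustration, calling shred from Python could look like the
sketch below. The function names shred_command and secure_delete are
hypothetical; the flags used (-n, -z, -u) are standard GNU shred
options. Note that the shred documentation itself warns that
overwriting in place is not reliable on every file system.

```python
import shutil
import subprocess


def shred_command(path, passes=3):
    """Build the shred command line.

    -n sets the number of overwrite passes, -z adds a final pass of
    zeros, -u removes the file afterwards, and -- ends option parsing
    so that odd file names are not mistaken for options.
    """
    return ["shred", "-n", str(passes), "-z", "-u", "--", path]


def secure_delete(path, passes=3):
    """Overwrite and remove a file with shred, if it is installed."""
    if shutil.which("shred") is None:
        raise RuntimeError("shred not found in PATH")
    subprocess.run(shred_command(path, passes), check=True)
```

Keeping the command construction separate from its execution makes it
possible to unit-test the former without touching any file.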

>> It's not clear if you plan to rely on manual testing or rather on
>> writing a test-suite {before,while,after} implementing the code.
>> What are your plans for unit-testing? GUI testing? Test-suite
>> framework?


> After the change of language, I plan to implement the whole test
> suite with the PyUnit test framework.


Great. That is the one now shipped with Python under the unittest
name, right?
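
For reference, a test written with the standard unittest module could
look like this minimal sketch. strip_metadata is a hypothetical
function standing in for the real library code; it works on a plain
dict only to keep the example self-contained.

```python
import unittest


def strip_metadata(fields, sensitive=("author", "creator", "gps")):
    """Hypothetical function under test: drop sensitive keys
    (compared case-insensitively) from a metadata dict."""
    return {key: value for key, value in fields.items()
            if key.lower() not in sensitive}


class StripMetadataTest(unittest.TestCase):
    def test_removes_sensitive_fields(self):
        cleaned = strip_metadata({"Author": "someone", "GPS": "0,0"})
        self.assertEqual(cleaned, {})

    def test_keeps_harmless_fields(self):
        cleaned = strip_metadata({"title": "report", "author": "me"})
        self.assertEqual(cleaned, {"title": "report"})


def run_suite():
    """Load and run the test case, returning the unittest result."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(StripMetadataTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Tests like these can be written before the corresponding code exists,
in which case they double as a specification of the expected behaviour.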

> As for when the tests will be written, I plan to develop them
> {after,during} the coding phase.


Ok.

> The tests can even be useful for defining the guidelines of the
> project.


Sure. This is an approach I do like.

> I hope to have enough time to study .deb in the next few days, in
> order to include that capability in the new version of the proposal.


Ok. If not, don't worry, as long as you tell us clearly in advance.

> Unfortunately the manual of this project is written in Italian (it
> was a project that I created for a university exam). I have not had
> time to upload it to sourceforge or to write extensive
> documentation. I am considering removing it from the proposal if I
> do not have time to write a guide/tutorial.


Please don't remove references to your previous work from your
proposal. I was merely asking to understand a bit better how you deal
with code once it's published.

>> > Because of my studies I cannot dedicate all my time to this
>> > project, but I can commit to at least four hours a day spent
>> > entirely on it.
>>
>> Do your studies happen to go on until the end of the GSoC coding time?
>>


> I am going to be busier until June 15th. However I am going to
> be completely free from July 15th.


If I understand correctly, this means:

May 23 - June 15th: very busy
June 16th - July 14th: quite busy, 4 hours a day
July 11th - July 15th: Mid-term evaluations
July 15th - August 22nd: Firm 'pencils down' date

Feel free to correct me if needed.

> I am going to improve the specification and also the timeline with
> more details about every step.


Great. Do you feel you can send us (and the google-melange webapp)
your updated proposal before Wednesday, 06:00pm GMT?

> I think the hardest part will be the starting point and the design
> of the structure of the main library. The production of code should
> not be so hard, especially if the design is well done.


Agreed.

>> Will the GUI you plan to create have drag'n'drop support?


> I think I am going to drop this feature for now, unless you think
> otherwise.


In case you think this can be implemented in, say, <=2 days, and this
does not mean overcommitting, I'd be happy to see this feature make it
into your proposal.

> Do you want to read the newer version when I finish writing it?


I do!

> Naturally, I don't want to waste your time.


There's a better way to waste my time: submitting a proposal I would
not like as much as I could :)

Bye,
--
intrigeri <intrigeri@???>
| GnuPG key @ https://gaffer.ptitcanardnoir.org/intrigeri/intrigeri.asc
| OTR fingerprint @ https://gaffer.ptitcanardnoir.org/intrigeri/otr.asc
| Did you exchange a walk on part in the war
| for a lead role in the cage?