SvnDumpTool =========== SvnDumpTool is a tool for processing Subversion dump files (Subversion is a version control system available from http://subversion.tigris.org/). It's written in Python (tested with Python 2.4.4 on Linux, though 2.3 should work fine too). It has the following commands: * apply-autoprops apply the auto-props defined in SVN's config * check check a dump file * copy copy a dump file * cvs2svnfix fix a cvs2svn created dump file * diff show differences between two dump files * edit edit files in a dump file * eolfix fix EOL of text files in a dump * export export files from a dump file * join join dump files * log show the log of a dump file * ls list files of a given revision * merge merge dump files * sanitize sanitize dump files * split split dump files * transform-revprop transform a revision property * transform-prop transform a node property Its homepage is: http://svn.borg.ch/svndumptool/ Installing ========== Unpack the tarball, cd into the directory and run the following command: ./setup.py install Usage ===== svndumptool.py command [options] [dumpfiles...] Only Version 2 dump files can be processed with this tool! (Version 2 dumps are those created without the --deltas option) Apply-Autoprops --------------- Applies the automatic properties defined in Subversion's config file. The value of "enable-auto-props" is ignored and the properties are always applied. usage: svndumptool.py apply-autoprops [options] inputdump outputdump options: --version show program's version number and exit -h, --help show this help message and exit --config-file=CONFIGFILE Subversion config file [default: $HOME/.subversion/config]. Check ----- Checks a dump file. svndumptool.py check [options] dumpfiles... options: --version show program's version number and exit -h, --help show this help message and exit -a, --check-actions check actions like add/change/delete -d, --check-dates check that svn:date increases -m, --check-md5 check md5 sums of the files -A, --all-checks do all checks -v, --verbose verbose output Known bugs: * cvs2svn created dumps may cause false negatives. Copy ---- Copies a dump file. Doesn't sound like that makes sense but it's a useful test of svndump classes (and sometimes it's able to fix broken dump files). svndumptool.py copy [options] source destination options: --version show program's version number and exit -h, --help show this help message and exit Known bugs: * None Cvs2svnfix ---------- Fixes a cvs2svn created dump file. Some (all?) versions of cvs2svn do not create 100% valid dump files according to Subversion's specification of the dump file format. It omits the node kind for copied nodes. This command repairs those nodes. svndumptool.py cvs2svnfix [options] inputfile outputfile options: --version show program's version number and exit -h, --help show this help message and exit Known bugs: * None Diff ---- Shows differences between two Subversion dump files. svndumptool.py diff [options] dumpfile1 dumpfile2 Options: --version show program's version number and exit -h, --help show this help message and exit -e, --check-eol check for EOL differences -q, --quiet quiet output -v, --verbose verbose output -IIGNORES, --ignore=IGNORES Ignore types of differences. This option can be specified more than once. Valid types are 'UUID', 'RevNr', 'RevDate', 'RevDateStr', 'NodeCount', 'Path', 'Action', 'Kind', 'CopyFromPath', 'CopyFromRev', 'HasText', 'TextLen', 'TextMD5', 'EOL', 'Text', 'PropDiff', 'PropMissing', 'RevPropDiff' and 'RevPropMissing' --ignore-revprop=IGNOREREVPROP ignore a differing/missing revision property --ignore-property=IGNOREPROPERTY ignore a differing/missing property Known bugs: * cvs2svn created dumps may cause false negatives. Edit ---- Edits the content of files in a dump by replacing them with the contents of another file. Also allows properties to be deleted from files. svndumptool.py edit [options] sourcedump [destinationdump] options: --version show program's version number and exit -h, --help show this help message and exit -r REVNR, --revision=REVNR revision number to edit. -f FILENAME, --file=FILENAME filename of repos path to edit. --replace=REPLACE path of a replacement file. --propdel=PROPDEL prop name to delete. -q, --quiet quiet output. -v, --verbose verbose output. --dry-run just show what would be done but don't do it. Known bugs: * None Eolfix ------ Textfiles normally contain only one kind of 'end-of-line' characters (LF, CRLF or CR). This command fixes the EOL's in text files, it converts all EOL's to LF. svndumptool.py eolfix [options] sourcedump destinationdump options: --version show program's version number and exit -h, --help show this help message and exit -EEOLSTYLE, --eol-style=EOLSTYLE add svn:eol-style property to text files, the value can be 'native', 'LF', 'CRLF' or 'CR' -fFIX, --fix=FIX a comma separated list of what (and how) to fix, can be a combination of 'CRLF', 'CR' and 'RemCR'. If 'CR' and 'RemCR' both specified 'RemCR' is ignored. 'CRLF' and 'CR' mean replace them by LF, 'RemCR' means remove CR's. -FFIXREVPATH, --fix-rev-path=FIXREVPATH a colon separated list of fix option, revision number and path of a file. -rREGEXP, --regexp=REGEXP regexp for matching text file names -tTMPDIR, --temp-dir=TMPDIR directory for temporary files (default: './'). -wWARNFILE, --warn-file=WARNFILE file for storing the warnings. --dry-run just show what would be done but don't do it Known bugs: * EOL's aren't fixed in a file which was copied and the old file was not a text file. * Diff shows a few Text changes after eolfix !?! Export ------ Exports files from a dump file. svndumptool.py export [options] dumpfile options: --version show program's version number and exit -h, --help show this help message and exit -e REV REPOSPATH FILENAME, --export= REV REPOSPATH FILENAME adds a file to export. -dDIR, --directory=DIR set the directory for the exported files. Known bugs: * None Join ---- Concatenates two or more dump files. svndumptool.py join -o outputfile dumpfiles... options: --version show program's version number and exit -h, --help show this help message and exit -o OUTFILE, --output-file=OUTFILE set the name of the output dump file. Known bugs: * None Log --- Shows the log of a dump file in (almost) the same format as "svn log". svndumptool.py log [options] dumpfiles... options: --version show program's version number and exit -h, --help show this help message and exit -r REVISION, --revision=REVISION revision number or range (X:Y) -v, --verbose verbose output Known bugs: * None Ls -- Lists all files and dirs in the given revision or HEAD. svndumptool.py ls [options] dumpfiles... options: --version show program's version number and exit -h, --help show this help message and exit -r REVNR, --revision=REVNR revision number Known bugs: * None Merge ----- Merges multiple dump files into one. It does this by reading all dump files at the same time and always adding the revision with the oldest revision date to the output dump file. Use 'svndumptool.py check -a -d dumpfile' to check that actions and dates in the merged file make sense. svndumptool.py merge [options] Options: --version show program's version number and exit -h, --help show this help message and exit -i INFILE, --input-file=INFILE adds an input dump filename. -r FROM TO, --rename= FROM TO adds a rename to the previously added file. -s SEARCH REPLACE, --regex-substitute= SEARCH REPLACE performs regular expression search and replace -x DIR, --mkdir-exclude=DIR exclude mkdir from the previously added file. -o OUTFILE, --output-file=OUTFILE sets the output filename. -d DIR, --mkdir=DIR create an additional directory. -m MSG, --message=MSG logmessage for the directory creating revision. --example show a little usage example. Known bugs: * There's no warning when a dump file does not have monotonic increasing revision dates. Use 'svndumptool.py check -d dumpfile' to check the revision dates of a dump file. * mkdir-exclude may fail in some cases in cvs2svn created dumps. * using -r and -s options on the same path will have unknown results Sanitize -------- Replaces data and/or metadata of a dump file with md5 hashes. svndumptool.py sanitize [options] source destination options: --version show program's version number and exit -h, --help show this help message and exit -f, --no-file-data Do not sanitize file data. (Equivalent to --file- data=none.) -m FILE_DATA_METHOD, --file-data=FILE_DATA_METHOD Method to sanitize file data: whole, line, none. Default is whole. -n, --no-filenames Do not sanitize filenames -e FILENAME_EXCLUDES, --exclude-filename=FILENAME_EXCLUDES Do not sanitize this filename. May be used multiple times. -u, --no-usernames Do not sanitize usernames -l, --no-logs Do not sanitize log messages -s SALT, --salt=SALT Specify the salt to use in hex Known bugs: * None Split ----- Splits a dump file into multiple smaller dump files. svndumptool.py split inputfile [startrev endrev filename]... options: --version show program's version number and exit -h, --help show this help message and exit Known bugs: * None Transform-revprop ----------------- Transforms a revision property using a regular expression and a replacement string. svndumptool.py transform-revprop propname regex replace source destination options: --version show program's version number and exit -h, --help show this help message and exit Known bugs: * None Transform-prop -------------- Transforms a property using a regular expression and a replacement string. svndumptool.py transform-prop propname regex replace source destination options: --version show program's version number and exit -h, --help show this help message and exit Known bugs: * None Some tips ========= Moving things around -------------------- The merge command has a rename option so merge can be 'abused' to just rename parts of a repository. The following example shows how to add trunk, tags and branches directories and move everything from the repository root into trunk: svndumptool.py merge -i source.svndmp -r "" trunk \ -d trunk -d tags -d branches -m "Create trunk, tags and branches." \ -o destination.svndmp Fixing end-of-line ------------------ While testing with real data I found some weird stuff in these files. The safest way to convert EOL's is: 1. Do a test conversion (without --dry-run) converting CRLF only and generate a warnings file. svndumptool.py eolfix -Enative -fCRLF -wwarnings.log \ -r '\.txt$' input.svndmp output.svndmp 2. Check all files mentioned in warnings.log and decide how to convert each of them. Also choose a set of default fix options to minimize the list of special fix options. Maybe it's possible to choose a better set of regular expressions. 2.1 Binary files: set empty fix option for that rev/file pair. svndumptool.py ... -F :12:binary_file.txt ... 2.2 Textfiles containing CR EOL's: convert these svndumptool.py ... -F CRLF,CR:12:cr_file.txt ... 2.3 Textfiles containing CR which aren't EOL's: remove these svndumptool.py ... -F CRLF,RemCR:12:weird_file.txt ... 3. Do the real conversion. 4. Use 'svndumptool.py diff [options] input.svndmp output.svndmp' to compare the dump files. svnadmin load will do another check. Python classes ============== The Python classes are documented using epydoc (http://epydoc.sourceforge.net/). To generate the HTML docs just enter the following commands: mkdir doc epydoc --html -o doc -n "SvnDump 0.5.0" svndump Packaging ========= Creating a source tar.gz ------------------------ ./setup.py sdist --formats gztar This will create: ./dist/svndumptool-0.5.0.tar.gz Creating an rpm --------------- mkdir workarea cd workarea mkdir SPECS RPMS SRPMS BUILD SOURCES buildroot cp ../svndumptool.spec SPECS/ cp ../dist/svndumptool-0.5.0.tar.gz SOURCES/ rpmbuild --define="_topdir `pwd`" \ --define="buildroot `pwd`/buildroot" \ -ba SPECS/svndumptool.spec This will generate 2 files, a binary rpm and a source rpm: $ find -name \*.rpm ./SRPMS/svndumptool-0.5.0-1.src.rpm ./RPMS/noarch/svndumptool-0.5.0-1.noarch.rpm Creating a .exe --------------- Needs py2exe from http://www.py2exe.org/ .\setup.py py2exe