squarepusher, this has been a useful script for me. Thanks.
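Since this script was written, C-SPAN has changed its format slightly, so the scripts no longer work as originally written. One more level of indirection fixes the problem: the program page is fetched first to find the internal programid, and that programid is then used to request the XML file.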
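To illustrate the extra step, here is a minimal sketch of pulling the programid out of a page. The function name extract_programid and the sample HTML line are made up for illustration; the real page layout may differ, so treat the pattern as an assumption rather than a description of what C-SPAN serves.

```shell
#!/bin/bash
# Sketch: read HTML on stdin and print the first numeric value that
# follows "programid=". The function name is hypothetical.
extract_programid() {
    sed -n 's/.*programid=\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Invented sample line standing in for the real program page.
sample='<param name="flashvars" value="system=1&programid=123456&style=full" />'
printf '%s\n' "$sample" | extract_programid
```

In real use the input would come from something like curl -s http://www.c-spanvideo.org/program/17753-1 piped into the function.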
A second problem was that the original scripts ignored the offset information contained in the XML file, so much of the video was downloaded twice.
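The idea is simply to keep each stream path together with its own offset. A rough sketch of walking the two lists in step, assuming one path per line in one file and one offset per line in the other (the function name pair_segments, the sample paths, and the offsets are all invented for illustration):

```shell
#!/bin/bash
# Sketch: pair line N of a path list with line N of an offset list.
# pair_segments is a hypothetical helper, not part of the scripts in this post.
pair_segments() {
    tab=$(printf '\t')
    # paste joins matching lines with a tab; read splits them apart again.
    paste "$1" "$2" | while IFS="$tab" read -r path offset; do
        echo "would fetch $path starting at offset $offset"
    done
}

# Invented sample data written to a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 'mp4:sample/part1.mp4' 'mp4:sample/part2.mp4' > "$workdir/files.txt"
printf '%s\n' '0' '1800' > "$workdir/offsets.txt"

pair_segments "$workdir/files.txt" "$workdir/offsets.txt"
rm -r "$workdir"
```

In an actual download the echo would be replaced by the call that fetches the segment, with the offset handed over as the start position.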
I hacked the original scripts to fix these two problems; the changed scripts are shown below. I have been using them for a couple of weeks on Linux without problems. I have not tested them on Windows.
--------Here is the modified cspandownloader script------------
#!/bin/bash
title="C-SPAN Archive Downloader v0.0.2"
if [ -z "$1" ]; then
echo -e "$title\n\nERROR: No program ID specified as first parameter. The program could not continue.\n\nUSAGE: cspandownloader <programid> <directory_name>\n\nHINT: The program ID is the last part of the URL - for instance, in the URL:\nhttp://www.c-spanvideo.org/program/17753-1\n\nThe number '17753-1' (without quotes) is the program ID that you would specify in place of <programid>\n\nThe optional parameter directory_name specifies the name of the directory which will contain the downloaded data. \nIf no directory name is specified, the downloaded data will be placed in a subdirectory named after the programID\n"
else
if [ -z "$2" ]; then
mkdir -p "$1" && cd "$1" || exit 1
else
mkdir -p "$2" && cd "$2" || exit 1
fi
#get full web page and extract program id
programid=`curl -s "http://www.c-spanvideo.org/program/$1" | sed -n '/programid=[0-9]/p' | sed 's/^.*programid=//' | sed 's/&.*$//' | sed '2,$d'`
#extract program info using programid
curl -s "http://www.c-spanvideo.org/common/services/flashXml.php?programid=$programid" | tidy -xml -indent -quiet > cspantemp.xml
sed -n '/<string name=\"path\">/,/<\/'string'>/ {
s/^.*<string name=\"path\">//
s/<\/string>.*$//
p
}' cspantemp.xml | sed '/^$/d' | sed 's/^[ \t]*//' > cspanfilelist.txt
sed -n '/<number name=\"offset\">/,/<\/'number'>/ {
s/^.*<number name=\"offset\">//
s/<\/number>.*$//
p
}' cspantemp.xml | sed '/^$/d' | sed 's/^[ \t]*//' | sed 's/<number name=\"length.*$//' | sed '/^$/d' > cspanoffsetlist.txt
i=1
total=`wc -l < cspanfilelist.txt`
while [ $i -le $total ]; do
    line=`head -n $i cspanfilelist.txt | tail -n 1`
    offset=`head -n $i cspanoffsetlist.txt | tail -n 1`
    cspandumper $line $offset
    i=`expr $i + 1`
done
#Clean up all the temporary files now
echo "Cleaning up temporary files..."
rm cspantemp.xml cspanfilelist.txt cspanoffsetlist.txt
echo "Done"
exit
fi
----------Here is the modified cspandumper script------------
#!/bin/bash
#Download the player SWF so that its checksum and size can be passed to rtmpdump
wget -q http://www.c-spanarchives.org/flash/cspanPlayer.swf
cspanchecksum=`sha256sum cspanPlayer.swf | cut -f 1 -d ' '`
echo $cspanchecksum
cspansize=`wc -c < cspanPlayer.swf`
filename=`echo $1 | sed 's|.*/||'`
#We don't need cspanPlayer.swf anymore (we've calculated the checksum and the filesize, which were the only reasons we downloaded it), so delete it
rm cspanPlayer.swf
#Here comes the rtmpdump command
echo "rtmpdump -r rtmp://video.c-spanarchives.org:1935/fastplay/../ -y $1 -s http://www.c-spanvideo.org/videoLibrary/assets/swf/CSPANPlayer.swf -A $2 -w $cspanchecksum -x $cspansize -o $filename"
rtmpdump -r rtmp://video.c-spanarchives.org:1935/fastplay/../ -y $1 -s http://www.c-spanvideo.org/videoLibrary/assets/swf/CSPANPlayer.swf -A $2 -w $cspanchecksum -x $cspansize -o $filename