PrisonPlanet Forum
Author Topic: Download C-SPAN Videos Using This Script  (Read 13514 times)
squarepusher
Member
*****
Offline Offline

Posts: 2,013



« on: April 15, 2010, 07:27:02 PM »

Credit should go to the Ron Paul forums for giving me the necessary info that enabled me to write this script:

http://www.ronpaulforums.com/showthread.php?p=2602502

For Windows users (XP/Vista/7)
You will need to download this PDF, in which I explain how to get it working. Linux users will have an easier time than Windows users, but because I know most people on here are using Windows, I've made a really easy-to-set-up procedure, which is shown in this PDF tutorial.

http://uploadbud.com/files/QAEVE9YB/tutorial-cspandownloader.pdf

Alternatively, just download the PDF here:

http://popularsymbolism.com/mitaphane/tutorial-cspandownloader.pdf

For Linux users

You can just download the following two scripts, put them in your 'bin' folder, and make sure you have at least the following dependencies:
* Tidy
* Wget
* Rtmpdump

If you don't have them, you can install them using the package manager of your specific Linux distribution (Synaptic Package Manager for Ubuntu, YAST for SUSE, Pacman for Arch, and so on)
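
A quick way to confirm that all three tools are present before running the scripts might look like the sketch below. The function name missing_deps is made up for this example and is not part of the original scripts; it relies only on the standard `command -v` shell builtin.

```shell
#!/bin/sh
# Print the name of every listed program that is not found on the PATH.
# "missing_deps" is a hypothetical helper, not part of the original scripts.
missing_deps() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Check the three dependencies named above
missing=$(missing_deps tidy wget rtmpdump)
if [ -n "$missing" ]; then
    echo "Install these first:" $missing
else
    echo "All dependencies found."
fi
```

If anything is reported missing, install it with your distribution's package manager and run the check again.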

Here are the two scripts:

http://popularsymbolism.com/mitaphane/cspandownloader

http://popularsymbolism.com/mitaphane/cspandumper

Make sure you make them executable - the following command will suffice:

chmod +x cspandownloader cspandumper

Usage
The script must be run from the command line (Cygwin, if you're using Windows). The syntax is shown below:

cspandownloader <videoid>

The video ID is the last part of the URL. In the example below, the video ID is 167196-1:

http://www.c-spanvideo.org/program/167196-1

Every video page on the C-SPAN site has a video ID at the end of its URL. Pass this video ID to the script as its parameter to download the file.

For instance, if you wanted to download the above video, you would type:

cspandownloader 167196-1

That’s all there is to it - the script will now proceed to download all the video files. In case the script starts downloading another file after the previous one is done, don’t worry - this is common behavior since most of the files on C-SPAN are split into separate parts.
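
If you would rather paste the whole URL than pick out the ID by hand, the ID can be cut from the URL with plain shell parameter expansion. This is only a sketch; the helper name video_id_from_url is invented for the example.

```shell
#!/bin/sh
# Strip everything up to and including the last "/" of a program URL.
# "video_id_from_url" is a hypothetical helper name.
video_id_from_url() {
    url=${1%/}          # drop one trailing slash, if present
    echo "${url##*/}"   # keep only the text after the last slash
}

video_id_from_url "http://www.c-spanvideo.org/program/167196-1"   # prints 167196-1
```

The result could then be handed to the script, for example: cspandownloader "$(video_id_from_url <url>)"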
Logged

Infowars Wiki - Help make this become the official wiki of Infowars.com - contribute!
jofortruth
Member
*****
Offline Offline

Posts: 10,113



WWW
« Reply #1 on: April 15, 2010, 07:44:33 PM »

Been wondering how to do this. Thx squarepusher!
Logged

Don't believe me. Look it up yourself!

The Great Deception - Forum/Library - My Research
http://z4.invisionfree.com/The_Great_Deception/index.php?showforum=110
blissentia
Member
**
Offline Offline

Posts: 75


« Reply #2 on: May 05, 2010, 01:06:34 AM »

do you know of any scripts for mac osx?
Logged
zdux0012
Member
*****
Offline Offline

Posts: 876



« Reply #3 on: May 06, 2010, 12:12:14 AM »

If anyone wants a script like this just ask, someone here can provide it.

Mac user, get off your OS. In every hacking contest ever your OS is the first hacked. You basically have FreeBSD with big brother installed.
Windows users too, but at least there are more than 20 Windows users who can determine how they are getting hacked from time to time.
Logged

Get off of Windows / Mac!! You are not safe.
Get an OS you can trust. Linux, Free BSD. Ask for help!
H0llyw00d
Guest
« Reply #4 on: May 06, 2010, 12:25:01 AM »

Thank you squarepusher, use DJGPP compiler here, gonna try something diff (basically compiled w/ DJ), w/ your idea if that's ok?
great tool!!
Logged
squarepusher
Member
*****
Offline Offline

Posts: 2,013



« Reply #5 on: May 06, 2010, 01:53:50 AM »

Thank you squarepusher, use DJGPP compiler here, gonna try something diff (basically compiled w/ DJ), w/ your idea if that's ok?
great tool!!

Yeah man be my guest - you don't have to ask for my permission to use this script as a base for your own program/script - it's a very basic and sloppy hack job and could be improved in lots of easy ways - but I'm too lazy ATM to make a better script in something like Python.

I'd actually like to see someone creating a Python script like this with a good usable frontend and something that can also 'extract' the transcript - I haven't found a way to get at the entire transcript - might have to run it through Wireshark again and see what it brings up when it's loading the page, the video and the transcript.
Logged

Infowars Wiki - Help make this become the official wiki of Infowars.com - contribute!
gusty_wind_1
Member
*
Offline Offline

Posts: 2


« Reply #6 on: December 13, 2010, 07:46:55 AM »

squarepusher,  this has been a useful script for me.  Thanks.

Since this script was written, c-span has changed their format slightly so that the scripts do not work as originally written.  One more level of indirection fixes the problem. 

One problem with the original scripts was that they did not use the offset information contained in the XML file, so much of the video data was downloaded twice.

I hacked the original scripts to fix these two problems and the changed scripts are shown below.   I have been using these modified scripts for a couple of weeks on Linux without problems.  I have not tested them on Windows.

--------Here is the modified cspandownloader script------------

#!/bin/bash
title="C-SPAN Archive Downloader v0.0.2"
if [ -z "$1" ]; then
   echo -e "$title\n\nERROR: No program ID specified as first parameter. The program could not continue.\n\nUSAGE: cspandownloader <programid> <directory_name>\n\nHINT: The program ID is the last part of the URL - for instance, in the URL:\nhttp://www.c-spanvideo.org/program/17753-1\n\nThe number '17753-1' (without quotes) is the program ID that you would specify in place of <programid>\n\nThe optional parameter directory_name specifies the name of the directory which will contain the downloaded data.  \nIf no directory name is specified, the downloaded data will be placed in a subdirectory named after the programID\n"
   exit 1
else
   # Work in the directory given as $2, or in one named after the program ID
   if [ -z "$2" ]; then
      mkdir -p "$1" && cd "$1" || exit 1
   else
      mkdir -p "$2" && cd "$2" || exit 1
   fi

   # Get the full web page and extract the internal program ID
   programid=`curl -s "http://www.c-spanvideo.org/program/$1" | sed -n '/programid=[0-9]/p' | sed 's/^.*programid=//' | sed 's/&.*$//' | sed '2,$d'`

   # Fetch the program info for that ID and re-indent it with tidy
   curl -s "http://www.c-spanvideo.org/common/services/flashXml.php?programid=$programid" | tidy -xml -indent -quiet > cspantemp.xml

   # Collect the stream paths, one per line
   sed -n '/<string name="path">/,/<\/string>/ {
                  s/^.*<string name="path">//
                  s/<\/string>.*$//
                  p
             }' cspantemp.xml | sed '/^$/d' | sed 's/^[ \t]*//' > cspanfilelist.txt

   # Collect the matching start offsets, one per line
   sed -n '/<number name="offset">/,/<\/number>/ {
                  s/^.*<number name="offset">//
                  s/<\/number>.*$//
                  p
             }' cspantemp.xml | sed '/^$/d' | sed 's/^[ \t]*//' | sed 's/<number name="length.*$//' | sed '/^$/d' > cspanoffsetlist.txt

   # Download each file, starting at its offset
   i=1
   while [ $i -le `wc -l < cspanfilelist.txt` ]; do
      line=`head -$i cspanfilelist.txt | tail -1`
      offset=`head -$i cspanoffsetlist.txt | tail -1`
      cspandumper "$line" "$offset"
      i=`expr $i + 1`
   done

   # Clean up all the temporary files now
   echo "Cleaning up temporary files..."
   rm cspantemp.xml cspanfilelist.txt cspanoffsetlist.txt
   echo "Done"
   exit
fi

----------Here is the modified cspandumper script------------

#!/bin/bash
# Usage: cspandumper <playpath> <offset>
# Download the player SWF; its checksum and size are passed to rtmpdump
wget -q http://www.c-spanarchives.org/flash/cspanPlayer.swf
cspanchecksum=`sha256sum cspanPlayer.swf | cut -d ' ' -f 1`
echo "$cspanchecksum"
cspansize=`wc -c < cspanPlayer.swf`
# The output file name is the last part of the playpath
filename=`echo "$1" | sed 's|.*/||'`
# We don't need cspanPlayer.swf anymore (we've calculated the checksum and the filesize which were the only reasons why we needed to download it), so delete it
rm cspanPlayer.swf
# Here comes the rtmpdump command
echo "rtmpdump -r rtmp://video.c-spanarchives.org:1935/fastplay/../ -y $1 -s http://www.c-spanvideo.org/videoLibrary/assets/swf/CSPANPlayer.swf -A $2  -w $cspanchecksum -x $cspansize -o $filename"
rtmpdump -r rtmp://video.c-spanarchives.org:1935/fastplay/../ -y "$1" -s http://www.c-spanvideo.org/videoLibrary/assets/swf/CSPANPlayer.swf -A "$2" -w "$cspanchecksum" -x $cspansize -o "$filename"
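
The download loop in cspandownloader re-reads both list files with head and tail on every pass. One possible alternative, shown here only as a sketch, reads the two files side by side with paste. The helper name pair_paths_offsets and the second sample entry are made up; the helper prints the cspandumper commands instead of running them, so it can be tried without a network connection.

```shell
#!/bin/sh
# Pair each line of a path list with the same-numbered line of an offset list.
# "pair_paths_offsets" is a hypothetical helper; it prints one command per pair
# instead of running cspandumper.
pair_paths_offsets() {
    paste "$1" "$2" | while IFS="$(printf '\t')" read -r path offset; do
        echo "cspandumper $path $offset"
    done
}

# Demo with sample data in a temporary directory (second entry is invented)
tmp=$(mktemp -d)
printf '%s\n' 'mp4:full/2001/05/05/20010505195800001.mp4' 'mp4:full/sample/second.mp4' > "$tmp/files.txt"
printf '%s\n' '223' '0' > "$tmp/offsets.txt"
pair_paths_offsets "$tmp/files.txt" "$tmp/offsets.txt"
rm -r "$tmp"
```

This assumes the stream paths never contain a tab character, since paste joins the two columns with a tab.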


Logged
squarepusher
Member
*****
Offline Offline

Posts: 2,013



« Reply #7 on: December 16, 2010, 07:42:40 AM »

Thanks for sharing this with everyone. I will update the main page and links.

I'm also thinking of putting up a package in AUR for Arch Linux users out there.
Logged

Infowars Wiki - Help make this become the official wiki of Infowars.com - contribute!
gusty_wind_1
Member
*
Offline Offline

Posts: 2


« Reply #8 on: December 17, 2010, 10:48:26 AM »

I have rewritten this as a Perl script that I can send to you if you are interested.

Also,  if you are thinking of doing any further modification, you can simplify the script somewhat.  The checksum part of cspandumper is broken and can be replaced.  The -x and -w rtmpdump parameters can be replaced with a "-W  http://www.c-spanarchives.org/flash/cspanPlayer.swf" parameter.
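
As a rough sketch of that simplification, the function below prints the rtmpdump command with -W in place of -w and -x, so rtmpdump fetches the player SWF and works out the hash and size on its own. The helper name dumper_command is made up for illustration, the sample playpath and offset are taken from elsewhere in this thread, and the command is only printed, not run.

```shell
#!/bin/sh
# Print the rtmpdump command for one stream, using -W (--swfVfy) instead of
# the separate -w (hash) and -x (size) parameters.
# "dumper_command" is a hypothetical helper; $1 is the playpath, $2 the offset.
dumper_command() {
    filename=${1##*/}   # output file name = last part of the playpath
    echo "rtmpdump -r rtmp://video.c-spanarchives.org:1935/fastplay/../" \
         "-y $1" \
         "-s http://www.c-spanvideo.org/videoLibrary/assets/swf/CSPANPlayer.swf" \
         "-A $2" \
         "-W http://www.c-spanarchives.org/flash/cspanPlayer.swf" \
         "-o $filename"
}

dumper_command "mp4:full/2001/05/05/20010505195800001.mp4" 223
```

With this change the wget, sha256sum and rm lines of cspandumper are no longer needed.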
Logged
bio44
Member
*
Offline Offline

Posts: 1


« Reply #9 on: February 28, 2011, 02:48:02 PM »

I'm not sure the script works in its current form. It's hard to tell, but rtmpdump freezes after about 1%.
Logged
Oreally
Member
*
Offline Offline

Posts: 4


« Reply #10 on: April 08, 2011, 06:00:21 AM »

Thanks for the work, guys. I am using the first Windows version posted and I am still having the problem of the same parts being downloaded twice, which doubles the file size and makes the video impossible to watch properly. I would appreciate a fix for the Windows version of the script. You guys deserve a lot of credit for this work, thanks again!
Logged
Overcast
Member
*****
Offline Offline

Posts: 4,120



« Reply #11 on: April 08, 2011, 09:05:00 AM »

Here's an app for FLV video too..

http://www.nirsoft.net/utils/web_video_capture.html

Nirsoft has MANY useful utilities for Winders as well.. :)

http://www.nirsoft.net/
Logged

It is when a people forget God, that tyrants forge their chains. ~ Patrick Henry

Our founding fathers, if they met the current politicians in office; would either kick their asses good or just shoot them dead. ~Me
Oreally
Member
*
Offline Offline

Posts: 4


« Reply #12 on: April 08, 2011, 11:32:47 AM »

Thanks for the links. I have noticed though that a lot of the videos on cspan don't even let you stream them on the website or offer the download option. But when I use cspandownloader script it will actually find the file and capture it, that is why I am steering away from flv capture software and trying to tinker with the current script which is nearly perfect.
Logged
Oreally
Member
*
Offline Offline

Posts: 4


« Reply #13 on: April 20, 2011, 07:55:55 AM »

I'm still trying to get this script working. I am using the modified version, all the files are getting created, the "cspantemp.xml" and the "cspanoffsetlist.txt". However when I looked in the XML it shows

        <string name="path">
        mp4:full/2001/05/05/20010505195800001.mp4</string>
        <number name="offset">223</number>
        <number name="length">4180</number>

cspanoffsetlist.txt is only displaying the offset, not the length, is this a problem?

Basically the script will download say 6% then when it gets the next file it will reset back to 5% and start climbing again to say 7%, then it goes back to 6%. While all this is happening the file size still grows, so data seems to still be downloaded twice?? Can anyone give me some pointers to get this working? Thanks
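
For anyone who wants to look at both values, the sketch below pulls the offset and the length out of XML shaped like the excerpt above. The helper name number_values is made up, and it assumes each number element sits on a single line. Whether the length is needed to stop the duplicate downloads is not established in this thread.

```shell
#!/bin/sh
# Print the value of every <number name="NAME"> element found on stdin.
# "number_values" is a hypothetical helper; it assumes each element sits on
# one line, as in the XML excerpt quoted above.
number_values() {
    sed -n "s/.*<number name=\"$1\">\([0-9]*\)<\/number>.*/\1/p"
}

xml='<string name="path">
mp4:full/2001/05/05/20010505195800001.mp4</string>
<number name="offset">223</number>
<number name="length">4180</number>'

echo "offset: $(printf '%s\n' "$xml" | number_values offset)"   # prints offset: 223
echo "length: $(printf '%s\n' "$xml" | number_values length)"   # prints length: 4180
```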
Logged
Oreally
Member
*
Offline Offline

Posts: 4


« Reply #14 on: May 25, 2011, 12:02:44 PM »

I am still looking to get this work, has anyone been able to figure out how to get the script to not download the same parts twice??
Logged
uswgo
Guest
« Reply #15 on: November 25, 2011, 03:42:44 PM »

I have found a good stream capturing software that can download videos from CSPAN for http://www.c-span.org/Events/Lawmakers-Question-Holder-on-Operation-Fast-and-Furious/10737425323/

The problem is that however far the download gets (up to 1.3 GB), the video always stays at 1 hour, which leads me to believe that it is downloading the same 1-hour video over and over, whereas the whole HD hearing is 2 hours and 40 minutes.

This is how the streaming capture software downloads it from RTMP detection.

Code:
rtmp://video.c-spanarchives.org:1935/fastplay<playpath>mp4:full/2011/11/08/20111108100000001_hd.mp4 <swfUrl>http://www.c-spanvideo.org/videoLibrary/assets/swf/CSPANPlayer.swf?pid=302569-1 <pageUrl>http://www.c-span.org/Events/Lawmakers-Question-Holder-on-Operation-Fast-and-Furious/10737425323/ <objectEncoding>

That only gets me one hour download from the CSPAN Archives MPEG-4 streaming server.

How can I get the other 1 hour and 40 mins?

I know it was debated before in this forum and I like to use this for educating people on operation fast and furious. Can any of you help me get the other each hour pieces of the whole video stream?
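
For reference, the fields in that capture line correspond to rtmpdump options roughly as follows: the rtmp address goes to -r, the playpath to -y, the swfUrl to -s and the pageUrl to -p. The sketch below only prints the resulting command. The helper name capture_to_rtmpdump is invented, and nothing here shows that this command retrieves more than the first hour.

```shell
#!/bin/sh
# Print an rtmpdump command built from the fields the capture tool reported.
# "capture_to_rtmpdump" is a hypothetical helper; arguments are
# <rtmp url> <playpath> <swfUrl> <pageUrl>.
capture_to_rtmpdump() {
    # The output file is named after the last part of the playpath
    echo "rtmpdump -r '$1' -y '$2' -s '$3' -p '$4' -o '${2##*/}'"
}

capture_to_rtmpdump \
    "rtmp://video.c-spanarchives.org:1935/fastplay" \
    "mp4:full/2011/11/08/20111108100000001_hd.mp4" \
    "http://www.c-spanvideo.org/videoLibrary/assets/swf/CSPANPlayer.swf?pid=302569-1" \
    "http://www.c-span.org/Events/Lawmakers-Question-Holder-on-Operation-Fast-and-Furious/10737425323/"
```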
Logged
uswgo
Guest
« Reply #16 on: November 25, 2011, 05:23:46 PM »

I finally found out where part 2 and part 3 of the CSPAN Videos were.

Apparently the entire hearing has three pieces.

mp4:full/2011/11/08/20111108100000001_hd.mp4
mp4:full/2011/11/08/20111108110000001_hd.mp4
mp4:full/2011/11/08/20111108120000001_hd.mp4



In these paths, 2011 is the year of the hearing, 11 is the month, and 08 is the day. The next digit, 1, is a section of the CSPAN video, and the digit after it (0, 1 or 2) selects the part of the CSPAN streaming video. The rest should not be changed.


Now I can upload the three parts of the infamous hearing in which Eric Holder is grilled over the Operation Fast and Furious gunrunning.
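
The three playpaths listed above follow a regular pattern, so they can be generated rather than typed. The sketch below treats the two digits after the date (10, 11, 12) as a counter that goes up by one per piece. That reading is an assumption drawn from these three examples only, and the helper name hourly_playpaths is made up.

```shell
#!/bin/sh
# Print one playpath per piece, following the file-name pattern of the three
# paths listed above. "hourly_playpaths" is a hypothetical helper; arguments
# are <year> <month> <day> <first> <last>, where first and last are the
# two-digit counter values, given without a leading zero.
hourly_playpaths() {
    hour=$4
    while [ "$hour" -le "$5" ]; do
        printf 'mp4:full/%s/%s/%s/%s%s%s%02d0000001_hd.mp4\n' \
            "$1" "$2" "$3" "$1" "$2" "$3" "$hour"
        hour=$((hour + 1))
    done
}

# Reproduces the three paths of the hearing on 2011-11-08
hourly_playpaths 2011 11 08 10 12
```

Each printed path can then be passed to the capture software or to rtmpdump as the playpath.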
Logged
Effie Trinket
member
Member
*
Offline Offline

Posts: 1,200



« Reply #17 on: April 19, 2012, 03:56:13 PM »

Does anyone have any updates to this, or have found any alternatives?
Logged