Downloading the textual content of Facebook URLs

Eduardo Ochs

Aug 15, 2014, 12:48:11 AM

to fb...@googlegroups.com

Hi Fbcmd people,

is there a way to use fbcmd to download the (textual) content of Facebook URLs like these?

https://www.facebook.com/sergio.martins.984991/posts/10152616093738086

https://www.facebook.com/jornalanovademocracia/photos/a.288492381220437.66632.187051701364506/679809862088685/

https://www.facebook.com/permalink.php?story_fbid=921476867869306&id=347772661906399

https://www.facebook.com/photo.php?fbid=10201336092313990&set=a.1569106477271.73917.1523735650

Something that would output what it got from FB in a raw-ish form (JSON?) would be ideal. Any hints are welcome: I am trying to write a set of scripts for caching texts posted to FB that I may want to access quickly later. The code that converts URLs to local file names is ready; the URLs above are associated with files with these names,

posts_sergio.martins.984991_10152616093738086

photos_jornalanovademocracia_a.288492381220437.66632.187051701364506_679809862088685

pesfi_921476867869306_347772661906399

photofs_10201336092313990_a.1569106477271.73917.1523735650

but right now the only way I have to put contents into these files, for playing with a prototype, is cut-and-paste between a browser and Emacs... which is not fun.
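
For readers who want to reproduce the URL-to-file-name step, here is a minimal sketch in Python. It is reverse-engineered from the four URL/file-name pairs listed above and is not the original conversion code; the function name `facebook_url_to_filename` is hypothetical, and any URL shape other than those four is rejected rather than guessed at.

```python
from urllib.parse import urlparse, parse_qs


def facebook_url_to_filename(url):
    """Map a Facebook URL to a flat local file name.

    Hypothetical helper: the rules below are inferred only from the
    four example URL/file-name pairs in this message.
    """
    parsed = urlparse(url)
    parts = [p for p in parsed.path.split("/") if p]  # drop empty segments
    query = parse_qs(parsed.query)

    # /permalink.php?story_fbid=S&id=I  ->  pesfi_S_I
    if parts == ["permalink.php"]:
        return "pesfi_{}_{}".format(query["story_fbid"][0], query["id"][0])

    # /photo.php?fbid=F&set=A  ->  photofs_F_A
    if parts == ["photo.php"]:
        return "photofs_{}_{}".format(query["fbid"][0], query["set"][0])

    # /NAME/posts/ID         ->  posts_NAME_ID
    # /NAME/photos/SET/ID/   ->  photos_NAME_SET_ID
    if len(parts) >= 3 and parts[1] in ("posts", "photos"):
        return "_".join([parts[1], parts[0]] + parts[2:])

    raise ValueError("unrecognized Facebook URL shape: " + url)
```

Calling it on the first URL above yields `posts_sergio.martins.984991_10152616093738086`; the other three URLs map to the remaining file names in the same way.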

Thanks in advance, cheers =),

Eduardo Ochs

eduar...@gmail.com

https://www.facebook.com/eduardo.ochs

http://angg.twu.net/

P.S.: here is a similar project that I am working on, which is keeping local copies of videos: http://angg.twu.net/youtube-db/README.html

P.P.S.: I wrote "downloading" but what I really meant was "reading from Facebook and outputting to stdout"...

B. Henry

Aug 21, 2014, 3:58:27 PM

to fb...@googlegroups.com

I have not tried this, but will experiment when I have a chance.
Why not just use wget or something similar to download this content? I think you can filter out images; if not,
just delete the files you do not want from the created folder(s).
--
B.H.

> --
> ---
> You received this message because you are subscribed to the Google
> Groups "fbcmd" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to [1]fbcmd+un...@googlegroups.com.
> For more options, visit [2]https://groups.google.com/d/optout.
>
> References
>
> 1. mailto:fbcmd+un...@googlegroups.com
> 2. https://groups.google.com/d/optout

Eduardo Ochs

Aug 21, 2014, 6:27:45 PM

to fb...@googlegroups.com

A call to wget with my username, password, and a "-U
$MOZILLA_USER_AGENT" could work in theory, but even if the page did
not need any JavaScript to render its initial contents it would be
hard to parse, some comments would be truncated, etc...
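
To make the "-U" part concrete, here is a minimal Python sketch of the same idea: requesting a page while sending a browser-like User-Agent header. The names `build_request` and `fetch_html` and the placeholder agent string are assumptions made for illustration. The sketch does not log in, so it would only ever see what Facebook serves to anonymous clients, and it does nothing about the parsing problems mentioned above.

```python
import urllib.request

# Placeholder browser-like string (an assumption); substitute whatever
# value you would otherwise pass to wget as $MOZILLA_USER_AGENT.
MOZILLA_USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64)"


def build_request(url, user_agent=MOZILLA_USER_AGENT):
    """Build a GET request that carries a custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_html(url, user_agent=MOZILLA_USER_AGENT):
    """Fetch the page and return its body as text (no login, no JavaScript)."""
    with urllib.request.urlopen(build_request(url, user_agent)) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```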

I did try to parse the HTML for a while, months ago, and also the
text obtained by selecting the whole page displayed in the browser
with ctrl-A and then copying-and-pasting it into an Emacs buffer...
both approaches were very frustrating, and I kept stumbling on
corner cases and on rules that I had to guess. Having the contents of
posts as JSON will probably make things much easier.

By the way: is it possible to load fbcmd's functions from "php5 -a"
and call its functions directly? I am using Debian stable, with php5.5
and php5-readline from http://packages.dotdeb.org/ , and I normally
run interactive programs from Emacs, using this trick here (the demo
starts at 0:16):

http://www.youtube.com/watch?v=Lj_zKC5BR64

but the interactive mode of "php5 -a" is a bit limited - for example,
autoloads don't work... it would certainly be much easier to just
run these things,

// See: https://graph.facebook.com/754664537913868
$p_url = 'https://www.facebook.com/vera.rodrigues.944023/posts/605622119554057';
$p_id = '754664537913868';
echo get_json_of_page_from_id($p_id);
echo get_json_of_page_from_url($p_url);

than to also have to implement command-line options for calling these
functions...
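
As a rough illustration of what the by-id function might do, here is a Python sketch that builds the graph.facebook.com URL shown in the comment above and decodes the response as JSON. It reuses the wished-for name `get_json_of_page_from_id` purely as a hypothetical; `graph_url` and the `opener` parameter are also made up for this sketch. Whether a given object needs an `access_token`, and which permissions it would require, is an assumption to check against the Graph API documentation. The by-URL variant is left out on purpose: in the snippet above the number in the post URL differs from the Graph id, so the mapping from URL to id is not something this sketch can guess.

```python
import json
import urllib.parse
import urllib.request

GRAPH_BASE = "https://graph.facebook.com/"


def graph_url(object_id, access_token=None):
    """Build the Graph URL for an object id, optionally with a token."""
    url = GRAPH_BASE + urllib.parse.quote(str(object_id), safe="")
    if access_token:
        # Assumption: the token is passed as a query parameter.
        url += "?" + urllib.parse.urlencode({"access_token": access_token})
    return url


def get_json_of_page_from_id(object_id, access_token=None,
                             opener=urllib.request.urlopen):
    """Fetch the object and return the decoded JSON.

    `opener` is injectable so the function can be exercised offline
    with a fake that returns a file-like object.
    """
    with opener(graph_url(object_id, access_token)) as response:
        return json.loads(response.read().decode("utf-8"))
```

Printing the result with `json.dumps(data, indent=2)` would give the "reading from Facebook and outputting to stdout" behaviour, and the output could then be written to the cache file for that URL.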

Cheers, TIA &c =),

Eduardo Ochs

B. Henry

Aug 22, 2014, 3:03:03 PM

to fb...@googlegroups.com

Of course, that makes sense... I was just replying off the top of my head, but I do not think fbcmd will be helpful for this.

