First of all, I salute Mark for what he's done so far. This message board would be next to worthless to me, and probably quite a few others, without the ability to search it.
Since the cat is out of the bag, I'll go ahead and post what I would have added to my last private email to you.
I'm having no trouble at all backing up posts with message numbers and URLs using iMacros for Firefox (an add-on I shamelessly recommend for all deficiencies in the GTR1 backtester's user interface). Here is a code sample:
SET !EXTRACT_TEST_POPUP NO
SET extURL {{!URLCURRENT}}
TAG POS=1 TYPE=A ATTR=ID:ctl01_ctl00_BaseContentPlaceHolder_BoardsBaseContentPlaceHolder_browseHeader_lnkBoardName EXTRACT=TXT
SET extBoardName {{!EXTRACT}}
SET !EXTRACT NULL
TAG POS=1 TYPE=A ATTR=TITLE:View*this*fool*profile* EXTRACT=TXT
SET extAuthor {{!EXTRACT}}
SET !EXTRACT NULL
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:aspnetForm ATTR=ID:ctl01_ctl00_BaseContentPlaceHolder_BoardsBaseContentPlaceHolder_ctlMessageHeader_txtMessageNumber EXTRACT=TXT
SET extMsgNum {{!EXTRACT}}
SET !EXTRACT NULL
TAG POS=R1 TYPE=A ATTR=CLASS:pnvalink EXTRACT=TXT
SET extSubject {{!EXTRACT}}
SET !EXTRACT NULL
TAG POS=R1 TYPE=TD ATTR=CLASS:pbnav EXTRACT=TXT
SET extDate {{!EXTRACT}}
SET !EXTRACT NULL
ADD !EXTRACT {{extMsgNum}};{{extURL}}
SAVEAS TYPE=EXTRACT FOLDER=* FILE=msg_urls.txt
SAVEAS TYPE=HTM FOLDER=* FILE={{extMsgNum}}
TAG POS=1 TYPE=B ATTR=TXT:Next
To run it after installing iMacros, go to any MI message board post you want to start archiving from. Open the iMacros widget, find the macro called #Current.iim, open it for editing and replace its code with the above. Then set the "Repeat Macro" setting to whatever you want and click "Play (Loop)". For the sake of demonstration, the macro extracts more information than it saves; some of the tags might need tweaking, or can be removed if they stop working.
When I start this macro with Firefox sitting at the top message in this thread (and with the upper loop limit set to 20), I get a file in my iMacros "Downloads" folder called msg_urls.txt containing the following:
"245926;http://boards.fool.com/unless-anyone-knows-better-i-think-th...
"245927;http://boards.fool.com/incidentally-it-is-a-real-blast-to-lo...
"245928;http://boards.fool.com/that-would-be-a-shame-i-find-the-sear...
"245929;http://boards.fool.com/actually-it-gave-me-an-idea-that-you-...
"245930;http://boards.fool.com/i-saw-whafas-reply-it-seems-like-some...
"245931;http://boards.fool.com/im-not-following-you-i-just-tried-to-...
"245932;http://boards.fool.com/thats-a-big-compliment-to-be-mentione...
"245933;http://boards.fool.com/it-looks-like-theyve-started-using-in...
"245934;http://boards.fool.com/you-realize-you-did-not-even-implemen...
"245935;http://boards.fool.com/let-me-define-terms-a-little-more-clo...
"245936;http://boards.fool.com/would-you-mind-if-i-complained-to-the...
"245937;http://boards.fool.com/please-do-and-add-every-member-of-the...
"245938;http://boards.fool.com/thanks-again-robbie-i-got-this-to-wor...
"245939;http://boards.fool.com/sorry-cannot-make-out-what-13-will-do...
"245940;http://boards.fool.com/how-about-writing-a-polite-and-well-r...
"245941;http://boards.fool.com/maybe-hell-even-pay-mark-for-the-righ...
"245942;http://boards.fool.com/i-suggested-this-over-at-quotimprove-...
I also get the following files:
245926.htm
245927.htm
245928.htm
245929.htm
245930.htm
245931.htm
245932.htm
245933.htm
245934.htm
245935.htm
245936.htm
245937.htm
245938.htm
245939.htm
245940.htm
245941.htm
245942.htm
These HTML files could presumably then be parsed and indexed by your existing PHP program, hopefully without much modification.
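As a rough illustration of how the msg_urls.txt lines could be read back into a message number and a URL, here is a minimal C++ sketch. The names MsgEntry and parse_msg_line are made up for this example, and it assumes each line has the shape shown above: an optional leading double quote, the message number, a semicolon, then the URL.

```cpp
#include <string>

// One record from msg_urls.txt (hypothetical struct, for illustration only).
struct MsgEntry {
    std::string number;  // message number, e.g. "245926"
    std::string url;     // whatever follows the first ';'
};

// Splits a line such as "245926;http://... at the first ';'.
// A leading or trailing double quote, if present, is dropped.
// Returns false when the line has no ';' or no message number.
bool parse_msg_line(std::string line, MsgEntry& out) {
    if (!line.empty() && line[0] == '"') line.erase(0, 1);
    if (!line.empty() && line[line.size() - 1] == '"') line.erase(line.size() - 1);
    const std::string::size_type sep = line.find(';');
    if (sep == std::string::npos || sep == 0) return false;
    out.number = line.substr(0, sep);
    out.url = line.substr(sep + 1);
    return true;
}
```

Calling parse_msg_line on each line read with std::getline would give the number to match against the saved .htm file of the same name.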
This was actually the first iMacro I've ever written that does its own parsing. Normally I launch Firefox from within other code and run macros that simply save raw pages for parsing later within my own code. But I've written this example because it's portable and demonstrates the main features of iMacros.
For example, I launch iMacros from within C++ with two lines:
sprintf(cmd, "\"%s\" -new-window imacros://run/?m=%s%d.iim", FirefoxPath, Progname, i);
system(cmd);
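The same command can be built with std::string, which avoids having to size the cmd buffer by hand. This is only a sketch: build_imacros_command is a hypothetical helper name, and its parameters simply mirror FirefoxPath, Progname and i from the two lines above.

```cpp
#include <string>

// Builds the command line that launches Firefox on macro <progname><index>.iim.
// The Firefox path is wrapped in double quotes so that spaces in it survive.
std::string build_imacros_command(const std::string& firefox_path,
                                  const std::string& progname,
                                  int index) {
    return "\"" + firefox_path + "\" -new-window imacros://run/?m=" +
           progname + std::to_string(index) + ".iim";
}
```

The result would then be passed to the shell with std::system(build_imacros_command(FirefoxPath, Progname, i).c_str()), with <cstdlib> included.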
Since my ping time to US websites from Australia is often pathetic, I make use of my CPU's idle time by running lots of iMacros simultaneously. I signal to my C++ programs that an iMacro has completed by making the last line of my macro point the browser to a local HTML file with a distinctive title. When my program detects that a window is open with that title, it kills it and moves on (using FindWindow and SendMessage in windows.h).
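The wait-and-close step might look roughly like the sketch below. It is not my actual code: wait_for_completion and close_window_titled are hypothetical names, sending WM_CLOSE is an assumption about how the window gets closed, and FindWindowA only matches the full window title exactly (the browser may append its own name to the page title). The polling loop is kept separate from the Windows calls so it also compiles elsewhere.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Calls find_and_close up to max_polls times, sleeping for interval after
// each failed attempt. Returns true as soon as one call reports success.
bool wait_for_completion(const std::function<bool()>& find_and_close,
                         int max_polls,
                         std::chrono::milliseconds interval) {
    for (int attempt = 0; attempt < max_polls; ++attempt) {
        if (find_and_close()) return true;
        std::this_thread::sleep_for(interval);
    }
    return false;
}

#ifdef _WIN32
#include <windows.h>

// Looks for a top-level window whose title is exactly `title` and, if one
// exists, asks it to close. WM_CLOSE is an assumption, see the note above.
bool close_window_titled(const char* title) {
    HWND hwnd = FindWindowA(nullptr, title);
    if (hwnd == nullptr) return false;
    SendMessageA(hwnd, WM_CLOSE, 0, 0);
    return true;
}
#endif
```

On Windows the two would be combined as wait_for_completion([&] { return close_window_titled(title); }, 600, std::chrono::milliseconds(500)), where the title string and the timing values are placeholders.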
Note to Fool.com: If you are paying programmers to read this thread and find ways to thwart our efforts at making the message board usable, then I humbly suggest that your funds would be better spent on upgrading to a 21st century forum platform instead. Posting plain text messages (without even the ability to fix typos) in threads was novel in the 1980s and even 1990s, but in 2013 it's a joke, and not in the good TMF jester kind of way.
Robbie Geary