Afternoon all,
I am in need of an engine to index my whole site and about >2GB of docs(PDF/HTML/DOC/XLS...etc). Have tried htdig/asearch which both take the machine down and just can't scale to the size of file-store to be indexed.
Would appreciate some pointers. Thanks in advance.
Trevor
===== ( >- -< ) /~\ __ Scaling FLOSS in the __ /~\ | ) / Enterprise : trevor.w@pvision.biz \ (/ | |_|_ \ Call Now: 9820349221 / _|_| ____________________________________/
Htdig should be fine, dunno why it crashes for you. Tried out swish-e?
Afternoon SP,
--- Sthitaprajna sp.jena@irisindia.net wrote:
Htdig should be fine, dunno why it crashes for you. Tried out swish-e?
[snip]
It takes forever to index the stuff. The disks had been churning for close to 5 days without any sign of completion. It drove me nuts and I had to terminate it. htdig is good enough for plain old HTML, but it can't scale to larger sites/content.
Trevor
Trevor Warren wrote:
Afternoon all,
I am in need of an engine to index my whole site and about >2GB of docs(PDF/HTML/DOC/XLS...etc). Have tried htdig/asearch which both take the machine down and just can't scale to the size of file-store to be indexed.
Why not use google site search?
-Krishna.
On Tue, 27 Apr 2004, Trevor Warren wrote:
~ I am in need of an engine to index my whole site and
~ about >2GB of docs(PDF/HTML/DOC/XLS...etc). Have tried
Forward everything to ilug-bom. Google will index them for you.
--- Amit Upadhyay upadhyay@me.iitb.ac.in wrote:
On Tue, 27 Apr 2004, Trevor Warren wrote: Forward everything to ilug-bom. Google will index them for you.
[snip]
Very smart... it's internal data.
Trevor
Why not upload to a server and ask Google to index it? Alternatively, tell Google that you have 3 GB of mail with sensible content. You will get a Gmail box and bingo!
--- Harsh R Busa mailme@backendguru.com wrote:
Why not upload to a server and ask Google to index it? Alternatively, tell Google that you have 3 GB of mail with sensible content. You will get a Gmail box and bingo!
[snip]
grrrrrrrrrrrrrrr.....!!!!
Trevor
On Tuesday 27 Apr 2004 7:41 pm, Trevor Warren wrote:
grrrrrrrrrrrrrrr.....!!!!
;-)
You can see how powerful Google is. ;-)
OK Trevor... I think you may find this extremely fast and light. It's called KBnow.
Website: http://kbnow.com
However, I've tested it only on about 100 MB of data. Since the data is going to be much larger than that, you should use MySQL instead of the built-in DB provided. The author Sean Nolan will tell you how to do that...
Downside: It's non-free...
But setting it up and populating data is a breeze and can be done in 5-15 minutes. No kidding... I've been hoping to find someone that would be happy to create something like this using open source software and license it as GPL. I'd be happy to receive quotes and pay for this if it can be done.
Regards
Rishi
Morning Rishi,
--- Rishi rishi@gangfam.com wrote:
On Tuesday 27 Apr 2004 7:41 pm, Trevor Warren wrote:
GPL. I'd be happy to receive quotes and pay for this if it can be done.
[snip]
Thanks dear. Lemme try this one out, but as you say, my heart would be at home with a GPL'ed app.
I'll try out swish-e as well, and will write a small paper on it when done.
Trevor