Choosing between GistLite and GistPro.

GistLite
GistLite is the original algorithm GistWeb uses to create its multi-document summaries.  It works well on a wide variety of keywords.
Choose GistLite as the algorithm for your summary if your keywords are not likely to return much textual content (e.g., highly commercial keywords).  GistLite will do a better job with more text […]

Summaries algo update yields superior results.

I was looking at the results from GistWeb’s new search summaries today, and I noticed that the first paragraph was very often spot on target, and that the rest of the paragraphs, while generally very good, didn’t match the quality of the introductory paragraph.
‘Well,’ I said, ‘I’ll just apply the […]

GistWeb now creates web search summaries!

Okay guys, this is BIG news.  I’ve added a multi-document summarizer to GistWeb.  It takes the top 10, 20 or 30 pages returned by a search query, extracts only the most important information from each page, and displays it in a remarkably readable summary.
Here’s the link:
http://gistweb.com/web.php
It will create summaries from Google, Google News […]

Update ensures quotes are kept intact.

Dr. Wellman said that the virus was “very dangerous to the preservation of the human race.  It has the potential to wipe out the world!”
Before this update, the above sentence would get split at “human race”, with GistWeb believing that to be the end of the sentence.  This new update allows the algorithm to recognize […]
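To make the idea concrete, here is a rough Python sketch of quote-aware sentence splitting.  It is only an illustration under my own assumptions, not GistWeb’s actual code: the function name split_sentences, the small abbreviation list, and the rule that a sentence can only end outside a pair of double quotes are all made up for this example.

```python
# Illustrative only: not GistWeb's real implementation.
# Assumption: a sentence may end only outside double quotes, or at a closing
# quote that directly follows a terminator (as in: world!”).
ABBREVIATIONS = {"dr.", "mr.", "mrs.", "ms.", "prof."}  # hypothetical short list
TERMINATORS = ".!?"


def split_sentences(text):
    sentences, start, in_quote = [], 0, False
    for i, ch in enumerate(text):
        # Track whether we are inside curly or straight double quotes.
        if ch == "“":
            in_quote = True
        elif ch == "”":
            in_quote = False
        elif ch == '"':
            in_quote = not in_quote

        # A boundary must be followed by whitespace or the end of the text,
        # and must never fall inside an open quotation.
        at_end = i + 1 == len(text) or text[i + 1].isspace()
        if not at_end or in_quote:
            continue

        closes_quote = ch in '”"' and i > 0 and text[i - 1] in TERMINATORS
        ends_plain = ch in TERMINATORS
        if ends_plain:
            # Skip titles such as "Dr." so they do not end the sentence.
            last_word = text[start:i + 1].split()[-1].lower()
            if last_word in ABBREVIATIONS:
                continue

        if ends_plain or closes_quote:
            sentences.append(text[start:i + 1].strip())
            start = i + 1

    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences
```

Run on the Dr. Wellman example, this sketch keeps the whole quotation in one sentence instead of breaking at “human race”.  A real splitter would need a much longer abbreviation list and handling for single quotes and nested quotes.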

Modified text-culling algo for paragraph-by-paragraph analysis.

Until now, GistWeb analyzed the text of an entire page to decide how much it should cut out.  While this obviously worked quite well, I’ve made it even better.
Now GistWeb looks at each paragraph in the text and makes a decision on a paragraph-by-paragraph basis.  This has resulted in, I feel, a significant improvement […]
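Here is a minimal Python sketch of what deciding paragraph by paragraph can look like.  Everything in it is an assumption for illustration: the names keep_count and cull_by_paragraph, the keep_ratio parameter, and the stand-in rule of keeping the leading sentences are not how GistWeb actually ranks or cuts text.

```python
import math
import re


def keep_count(n_sentences, keep_ratio=0.5):
    # Assumed rule: keep a fixed fraction of the sentences, but never fewer
    # than one, so no paragraph disappears entirely.
    return max(1, math.ceil(n_sentences * keep_ratio))


def cull_by_paragraph(text, keep_ratio=0.5):
    # Paragraphs are assumed to be separated by blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    culled = []
    for paragraph in paragraphs:
        # Naive sentence split on ., ! or ? followed by whitespace.
        sentences = re.split(r"(?<=[.!?])\s+", paragraph)
        n = keep_count(len(sentences), keep_ratio)
        # Stand-in for a real importance ranking: keep the leading sentences.
        culled.append(" ".join(sentences[:n]))
    return "\n\n".join(culled)
```

The point of the sketch is only that the cut is computed separately for each paragraph, so a long paragraph is trimmed heavily while a one-sentence paragraph survives whole, rather than one page-wide budget being applied to everything.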

GistWeb now shows significant breaks in the text.

I noticed that when gisting blogs, GistWeb was lumping the comments in with the rest of the text.  This could be a bad thing, because a comment might be written in such a way that — when viewed in GistWeb — it appears to actually be part of the article.  That would be misleading, so […]

Major improvement in text-parsing algorithm.

Okay guys, your comments on my first post got me focused on a way to improve the text-parsing algorithm that finds the “meat” on any page.
A number of folks commented that, for blogs they run through GistWeb, they were seeing parts of the navigation, footer, etc. come out in the content.
That got me to sit […]

Why GistWeb?

My wife is pregnant, so lately I have been studying up on the subject of pregnancy, reading a book written by a doctor.  The doctor was very wordy, and I started noticing that most of what he said really wasn’t the “meat” that I was looking for.
I remembered taking a speed-reading course as […]