| Class Summary | |
|---|---|
| BlockProximityFusion | Fuses adjacent blocks if their distance (in blocks) does not exceed a certain limit. |
| DocumentTitleMatchClassifier | Marks TextBlocks which contain parts of the HTML <TITLE> tag, using some heuristics which are quite specific to the news domain. |
| ExpandTitleToContentFilter | Marks as "content" all TextBlocks that lie between the headline and the part that has already been marked content, provided they are labeled DefaultLabels.MIGHT_BE_CONTENT. |
| KeepLargestBlockFilter | Keeps the largest TextBlock only (by the number of words). |
| SimpleBlockFusionProcessor | Merges two subsequent blocks if their text densities are equal. |
The BoilerpipeFilters in this package are pure heuristics.
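
To illustrate what two of the heuristics summarized above do, here is a minimal, self-contained sketch. It does not use the library's own classes: `HeuristicsSketch`, `Block`, `keepLargest` and `fuseEqualDensity` are hypothetical names introduced only for this example. Two details are assumptions, since the summaries do not state them: ties on word count go to the earliest block, and a merged block keeps the text density shared by its two parts.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class HeuristicsSketch {

    /** Hypothetical stand-in for a text block (not the library's TextBlock). */
    static final class Block {
        final String text;
        final int numWords;
        final double textDensity;

        Block(String text, int numWords, double textDensity) {
            this.text = text;
            this.numWords = numWords;
            this.textDensity = textDensity;
        }
    }

    /** Keeps only the block with the most words (assumption: ties go to the earliest block). */
    static List<Block> keepLargest(List<Block> blocks) {
        Block largest = null;
        for (Block b : blocks) {
            if (largest == null || b.numWords > largest.numWords) {
                largest = b;
            }
        }
        if (largest == null) {
            return Collections.emptyList();
        }
        return Collections.singletonList(largest);
    }

    /** Merges a block into its predecessor when both have exactly the same text density. */
    static List<Block> fuseEqualDensity(List<Block> blocks) {
        List<Block> out = new ArrayList<>();
        for (Block b : blocks) {
            if (!out.isEmpty()) {
                Block prev = out.get(out.size() - 1);
                if (prev.textDensity == b.textDensity) {
                    // Assumption: the merged block keeps the shared density value.
                    out.set(out.size() - 1, new Block(
                            prev.text + "\n" + b.text,
                            prev.numWords + b.numWords,
                            prev.textDensity));
                    continue;
                }
            }
            out.add(b);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Block> blocks = new ArrayList<>();
        blocks.add(new Block("Home | About", 3, 1.5));
        blocks.add(new Block("First paragraph of the article", 40, 9.0));
        blocks.add(new Block("Second paragraph of the article", 35, 9.0));
        blocks.add(new Block("Copyright notice", 2, 1.0));

        List<Block> fused = fuseEqualDensity(blocks);
        System.out.println("Blocks after fusion: " + fused.size());

        List<Block> kept = keepLargest(fused);
        System.out.println("Words in largest block: " + kept.get(0).numWords);
    }
}
```

In this sketch the two article paragraphs share a density and are fused first, so the largest-block step then retains the merged article text rather than a single paragraph. Running the fusion before the largest-block selection is a choice made for the example, not something the package description prescribes.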