- All Implemented Interfaces:
- org.apache.tika.detect.EncodingDetector
public class HtmlEncodingDetector
extends Object
implements org.apache.tika.detect.EncodingDetector
Character encoding detector that determines the character encoding of an
HTML document from the charset parameter, if present, of a Content-Type
http-equiv meta tag near the beginning of the document. Especially
useful for distinguishing among closely related encodings (ISO-8859-*),
for which other types of encoding detection are unreliable.
- Since:
- Apache Tika 1.2
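
The approach described above can be illustrated with a minimal, self-contained sketch. This is not Tika's implementation: the class name `MetaCharsetSketch`, the 8192-byte prefix limit, and the regular expressions are all assumptions made for illustration. The sketch inspects only the start of the document, looks for a meta tag whose http-equiv attribute is Content-Type, and returns the charset named in that tag if the running JVM supports it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; names and limits here are hypothetical, not Tika's.
final class MetaCharsetSketch {

    // Assumed limit on how many leading bytes are inspected.
    private static final int PREFIX_LENGTH = 8192;

    // Any <meta ...> tag.
    private static final Pattern META_TAG =
            Pattern.compile("(?i)<meta\\s[^>]*>");

    // An http-equiv attribute whose value is Content-Type.
    private static final Pattern HTTP_EQUIV =
            Pattern.compile("(?i)http-equiv\\s*=\\s*[\"']?content-type");

    // A charset parameter, capturing the charset name.
    private static final Pattern CHARSET =
            Pattern.compile("(?i)charset\\s*=\\s*[\"']?([a-z0-9_:.\\-]+)");

    private MetaCharsetSketch() {
    }

    /**
     * Returns the charset declared in a Content-Type http-equiv meta tag
     * within the first PREFIX_LENGTH bytes, or null if none is found or
     * the declared name is not supported by this JVM.
     */
    static Charset detect(byte[] html) {
        int length = Math.min(html.length, PREFIX_LENGTH);
        // ISO-8859-1 maps every byte to exactly one char, so ASCII markup
        // stays readable whatever the real encoding of the document is.
        String head = new String(html, 0, length, StandardCharsets.ISO_8859_1);

        Matcher tags = META_TAG.matcher(head);
        while (tags.find()) {
            String tag = tags.group();
            if (!HTTP_EQUIV.matcher(tag).find()) {
                continue; // not a Content-Type http-equiv meta tag
            }
            Matcher charset = CHARSET.matcher(tag);
            if (!charset.find()) {
                continue; // no charset parameter in this tag
            }
            try {
                String name = charset.group(1);
                if (Charset.isSupported(name)) {
                    return Charset.forName(name);
                }
            } catch (IllegalArgumentException e) {
                // Syntactically illegal charset name: ignore and keep looking.
            }
        }
        return null;
    }
}
```

For a document beginning with `<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">`, `MetaCharsetSketch.detect` returns the ISO-8859-1 charset; for a document with no such tag it returns `null`, leaving the decision to other detection methods. The detector documented here is invoked through the `org.apache.tika.detect.EncodingDetector` interface rather than through a static helper like this one.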