NEWS
tm.plugin.factiva 1.8 (2019-10-19)
- Stop truncating article IDs, since truncation could produce duplicates
(thanks to Tom Nicholls).
- Handle non-numeric 'page' metadata entries
(thanks to Tom Nicholls).
tm.plugin.factiva 1.7 (2017-11-20)
- Port from XML to xml2 package to support tm 0.8.
tm.plugin.factiva 1.6 (2017-02-08)
- Avoid importing each article twice with new Factiva HTML format.
- Add screencast showing how to export correct HTML files in ?FactivaSource.
tm.plugin.factiva 1.5 (2014-07-05)
- Fix encoding issues on non-UTF-8 systems by adding back the
'encoding' argument to work around a bug in package XML.
tm.plugin.factiva 1.4 (2014-06-11)
- Adapt to tm 0.6.
- Remove the 'encoding' argument to FactivaSource(),
as it is not supported by tm 0.6 (it is normally not needed).
- Change all tags to lowercase (for consistency with tm).
- Ensure meta-data variables which are supposed to contain
only one value always do so.
tm.plugin.factiva 1.3 (2014-01-11)
- Extract Company, Industry, Information Provider Code (IPC) and
Information Provider Description (IPD) meta-data (based on a
patch by Grigorij Ljubownikow).
- Remove inconsistent line breaks in HTML format.
- Update to support tm 0.5-10 and clean the code a bit.
tm.plugin.factiva 1.2 (2013-01-29)
- Extract Subject and Coverage meta-data.
- Add Reuters21578 example.
- Fix handling of articles with no header or body.
- Split lead paragraphs into separate lines.
- Fix package help page to mention HTML.
tm.plugin.factiva 1.1 (2012-07-01)
- Add support for HTML files since Factiva no longer allows exporting to XML.
- Work around encoding issues on Windows (for HTML only).
- Preserve paragraph information so that e.g. makeChunks() from tm can be used
to split documents into smaller pieces.
tm.plugin.factiva 1.0 (2012-05-14)
- Initial release with support for XML files.