• raffaele
    @https://digipres.club/@raffaele

    starting from https://www.reddit.com/r/DataHoarder/comments/7v6f0y/kiwix_zims_for_subreddits_blogs_forums_instikis/ i ended up on http://www.openzim.org/wiki/OpenZIM , which i had heard of many times but never tried. the format ( http://www.openzim.org/wiki/ZIM_file_format ) is very complex compared to warc.

    but one interesting thing is the embedded full-text search index (xapian, as used in https://github.com/openzim/zimwriterfs )

    the question: has embedding a full-text search index into a warc ever been considered?

    2018-02-05T11:17:50Z
  • raffaele
    @https://digipres.club/@raffaele

    @despens you're right, that's a good point. also, i misread: openzim does not embed the index together with the content. if the index is not provided with the download, it is created by the client reader application.

    2018-02-05T21:31:26Z

...